© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Running a High Performance
Kubernetes Cluster with Amazon EKS
Nathan Peck
Developer Advocate
Amazon Web Services
CON318
Yekesa Kosuru
Managing Director
State Street
Breakout repeats
Tuesday, November 27th
Running a high performance Kubernetes cluster with Amazon EKS
6:15 PM | Venetian, Level 3, Murano 3205
Wednesday, November 28th
Running a high performance Kubernetes cluster with Amazon EKS
1:45 PM | Venetian, Level 4, Delfino 4002
Agenda
Best practices of designing for performance
How do I test performance?
State Street: Database at scale demo
Basic components of Kubernetes
[Diagram labels: You; Your Container; Worker Nodes; Amazon EKS; Control Plane; etcd]
Optimizing your container
Optimize for smaller size: use a multistage Docker build to reduce the size of the runtime container.
Use a minimalist operating system such as Alpine Linux. Or use no operating system at all: a statically linked Go binary.
Not all runtimes are equal. Does your app
have a cold start that requires an initial burst
of resources?
Popular base images span a huge range of sizes
REPOSITORY SIZE
node:latest 674MB
java:latest 643MB
node:slim 184MB
ubuntu:latest 85.8MB
alpine:latest 4.41MB
busybox:latest 1.15MB
Optimize pods
How many sidecar containers do
you have in each pod?
Admission controllers make it
easy to add a lot of sidecars but
don’t underestimate the overhead
cost.
Keep pods as lightweight as you
can.
Optimize pod placement
Make sure you use resource constraints:
- Request the baseline average resource
needs of the app
- Put a limit on the max resources to be
made available to the pod to prevent one
pod from interfering with the
performance of another pod
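A minimal sketch of the two constraints above, written as a Python dict that mirrors the `resources` field of a Kubernetes container spec. The container name, image, and all numbers are hypothetical and chosen only for illustration; the `requests` and `limits` keys are the standard Kubernetes fields.

```python
import json

# Hypothetical container spec fragment. The name, image, and values are
# illustrative assumptions, not figures from the talk.
container = {
    "name": "app",
    "image": "example/app:1.0",
    "resources": {
        # Baseline the scheduler reserves when placing the pod.
        "requests": {"cpu": "500m", "memory": "256Mi"},
        # Ceiling enforced at runtime so one pod cannot starve its neighbors.
        "limits": {"cpu": "1", "memory": "512Mi"},
    },
}


def cpu_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as '500m' or '1' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def request_within_limit(spec: dict) -> bool:
    """Return True when the CPU request does not exceed the CPU limit."""
    res = spec["resources"]
    return cpu_millicores(res["requests"]["cpu"]) <= cpu_millicores(res["limits"]["cpu"])


if __name__ == "__main__":
    print(json.dumps(container["resources"], indent=2))
    print("request <= limit:", request_within_limit(container))
```

Serializing the dict to YAML or JSON gives the fragment that would sit under a container entry in a pod template.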
Optimize density vs. size of pods
4 x pod, each .5 CPU and 256 memory
2 x pod, each 1 CPU and 512 memory
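The two layouts on this slide request the same total resources. A rough sketch of the arithmetic follows, assuming a hypothetical node with 2 CPUs and 1024 units of memory (the slide does not state the memory unit) and ignoring system reservations and per-pod overhead such as sidecars.

```python
def pods_that_fit(node_cpu_m: int, node_mem: int, pod_cpu_m: int, pod_mem: int) -> int:
    """Number of identical pods that fit on a node, limited by CPU or memory.

    CPU is expressed in millicores so the arithmetic stays in integers.
    """
    return min(node_cpu_m // pod_cpu_m, node_mem // pod_mem)


# Hypothetical node size, chosen so both layouts from the slide fill it exactly.
NODE_CPU_M = 2000
NODE_MEM = 1024

small = pods_that_fit(NODE_CPU_M, NODE_MEM, pod_cpu_m=500, pod_mem=256)
large = pods_that_fit(NODE_CPU_M, NODE_MEM, pod_cpu_m=1000, pod_mem=512)

if __name__ == "__main__":
    print("small pods per node:", small)
    print("large pods per node:", large)
```

Because the totals match, the choice comes down to factors this sketch leaves out: fixed per-pod overhead favors fewer, larger pods, while finer-grained scaling favors more, smaller ones.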
Anti-affinity
Anti-affinity constraints can keep CPU-heavy pods away from each other, on different hosts
Warning: anti-affinity is a beta feature prior to Kubernetes 1.12, which improves anti-affinity performance 100x in large clusters
Tradeoff: a heavier scheduling burden on the control plane in exchange for better application pod performance
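As a sketch of what such a constraint looks like, the dict below mirrors the `affinity` field of a pod spec, and the helper models, in a much simplified way, the check the scheduler performs per node. The `workload: cpu-heavy` label is a hypothetical example, and only the `In` operator is modeled.

```python
from typing import Dict, List

# Mirrors pod.spec.affinity; the label key and value are illustrative assumptions.
anti_affinity = {
    "podAntiAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": [
            {
                "labelSelector": {
                    "matchExpressions": [
                        {"key": "workload", "operator": "In", "values": ["cpu-heavy"]}
                    ]
                },
                # One topology domain per host, so matching pods land on different nodes.
                "topologyKey": "kubernetes.io/hostname",
            }
        ]
    }
}


def selector_matches(labels: Dict[str, str], selector: dict) -> bool:
    """True when the labels satisfy every expression in the selector (ANDed)."""
    for expr in selector["matchExpressions"]:
        if expr["operator"] != "In":
            raise ValueError("this sketch only models the In operator")
        if labels.get(expr["key"]) not in expr["values"]:
            return False
    return True


def node_is_allowed(pods_on_node: List[Dict[str, str]], affinity: dict) -> bool:
    """True when no pod already on the node matches an anti-affinity term."""
    terms = affinity["podAntiAffinity"]["requiredDuringSchedulingIgnoredDuringExecution"]
    for term in terms:
        if any(selector_matches(labels, term["labelSelector"]) for labels in pods_on_node):
            return False
    return True


if __name__ == "__main__":
    print(node_is_allowed([{"workload": "cpu-heavy"}], anti_affinity))
    print(node_is_allowed([{"workload": "web"}], anti_affinity))
```

The real scheduler evaluates this for every candidate node and every existing pod, which is where the extra control plane load mentioned above comes from.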
Observability
for pod
performance
Optimizing your worker nodes
Use the latest generation of Amazon Elastic
Compute Cloud (Amazon EC2) instances.
The c5 instance generation has up to 25%
better price/performance than c4 instances.
Optimizing your worker nodes
Choose the instance class that matches your primary
pod resource needs:
• “c” instances are optimized for CPU-heavy work
• “r” instances are optimized for memory-heavy work
• “m” instances are general purpose
• “p” instances are optimized for GPU-powered machine learning
Kubernetes control plane on tap, optimized by
Amazon EKS
[Diagram labels: mycluster.eks.amazonaws.com; Kubectl; EKS Workers; AZ 1, AZ 2, AZ 3; Your AWS account]
Networking performance: AWS VPC CNI plugin
[Diagram: Kubelet, VPC CNI plugin, an EC2 instance with several ENIs and pods, and the VPC network. Steps: 0. Create ENI; 1. CNI Add/Delete; 2. Set up veth]
Thin layer, no overhead
Gives K8s pods native IP addresses in the VPC
Multiple ENIs per Amazon EC2 instance, multiple pods per ENI, all configurable
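Because each pod takes a VPC IP address from an ENI, the instance type caps pod density. The sketch below uses the commonly documented rule of thumb for the VPC CNI plugin's default mode: one address per ENI is reserved for the ENI itself, and two is added for pods that use host networking. The ENI and address counts passed in are example inputs, not values from the talk; check the EC2 documentation for the limits of a given instance type.

```python
def max_pods(enis: int, ipv4_per_eni: int) -> int:
    """Estimate the pod capacity of a node under the VPC CNI plugin.

    enis: number of network interfaces the instance type supports.
    ipv4_per_eni: number of IPv4 addresses each interface supports.
    """
    if enis < 1 or ipv4_per_eni < 1:
        raise ValueError("an instance needs at least one ENI with one address")
    return enis * (ipv4_per_eni - 1) + 2


if __name__ == "__main__":
    # Example inputs: 3 interfaces with 10 addresses each.
    print(max_pods(enis=3, ipv4_per_eni=10))
```

When pods are small, this limit can be reached before CPU or memory runs out, so it belongs in the density calculation alongside resource requests.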
Pod to pod networking
[Diagram: two EC2 instances. On each, a pod namespace connects through a veth pair to the default namespace, which holds a main route table and per-ENI route tables (ENI RT). Traffic between the instances crosses the VPC fabric.]
Kubernetes performance envelope
[Radar chart axes: Number of Nodes, Pod Churn, Pod Density, Networking, Secrets, Anti-affinity, Active Namespaces]
Heavy monolithic pods in a very large cluster
[Radar chart axes: Number of Nodes, Pod Churn, Pod Density, Networking, Secrets, Anti-affinity, Active Namespaces]
Numerous densely bin packed microservice pods
[Radar chart axes: Number of Nodes, Pod Churn, Pod Density, Networking, Secrets, Anti-affinity, Active Namespaces]
The EKS team is here to help!
We are constantly learning from the varying
use cases of the many large deployments
orchestrated using an EKS control plane
Reach out to the team via a support ticket, and we
will help you optimize your control plane for
your exact performance needs
State Street Disclaimer
• Views and opinions expressed herein are those of PRESENTER as of 11/27/18 and they are
subject to change based on market and other conditions and in any event may not reflect the
views of State Street Corporation and its subsidiaries and affiliates (“State Street”).
• This information herein is for informational purposes only and it does not constitute investment
research or investment, legal, or tax advice, and it is not an offer or solicitation to buy or sell
any product, service, or securities or any financial instrument, and it does not constitute any
binding contractual arrangement or commitment of any kind.
• This information is provided “as-is” and State Street makes no guarantee, representation, or
warranty of any kind regarding such information.
• This information is not intended to be relied upon by any person or entity. State Street
disclaims all liability, whether arising in contract, tort or otherwise, for any losses, liabilities,
damages, expenses or costs arising, either direct, indirect, consequential, special or punitive,
from or in connection with the use of the information herein.
• No permission is granted to reprint, sell, copy, distribute, or modify any material herein, in any
form or by any means without the prior written consent of State Street.
Agenda
• What and why
• How: Leverage Kubernetes primitives to build a high-performance
system
• Design Considerations and Best Practices
• Scaling Factors and Bottlenecks
• Live Demo: Demonstrate scale-out database on Amazon EKS
• 1 million+ queries per second
• Measure latencies
What and why
Transactional database w/ unlimited scale concurrency
High Concurrency
Low Latencies
Open Source
Cloud Native Architecture
Custom Database Features
Replication architecture
• MySQL Master/Slave
• RocksDB Engine
• LSM Data Structure optimized for
writes
• Write intensive workloads
• Low memory demands
• MariaDB or Percona
• RocksDB
• Standard MySQL features
• Semi Sync Replication w/ GTID
• Failover
• Cloud Native
• SST files and WALs synchronized
[Diagram: Percona Server (binlogs) with the MyRocks engine; RocksDB WAL and SST files on local-attached storage]
Resilient scale-out architecture
• Scale-out
• Vitess - Sharded cluster for scale
• Each shard is one master + multiple slaves
• Custom sharding key
• Read Scale-out: add more slaves
• Write Scale-out: add more shards
• ACID compliance across cluster
• Connection pool, restrain bad queries
• Amazon S3 and Amazon EKS
• Backups stored to Amazon Simple Storage
Service (Amazon S3)
• Cluster hosted on Amazon EKS
Storage
• Persistent Volume (PV)
• Persistent storage survives pod
restarts
• HostPath PV
• Local storage SSD/NVMe devices
• PVs are attached via PV Claims
• PV Claims
• Dynamic
• Abstraction over the underlying
storage
• ReadWriteOnce
• Tradeoff between
resiliency and performance
[Diagram: three storage options (a pod with a data volume, and two pods bound to persistent volumes through PVCs), labeled with their tradeoffs: best resiliency with low performance; no resiliency with best performance; medium resiliency with best performance]
Getting the most out of K8S
• Taints & tolerations
• Place one master per worker
• Taint the node to maximize
performance
• Affinity & anti-affinity
• Affinity: schedule pods on nodes with
SSD/NVMe
• Anti-affinity: ensure masters aren’t
scheduled with replicas
• Services
• Resource requests and limits
• SYSCTL
• StatefulSet
• Replicated group of pods with unique
properties
• StatefulSet restarts Pod on same
node
• Requires the use of PV Claims
• Operator + ETCD
• DaemonSet
• Background processes per node on
all nodes
• Monitoring & upgrades
• e.g., Metrics agent, Local Volume
Provisioner
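To make the taints and tolerations item above concrete, here is a simplified model of the matching rule the scheduler applies: a pod can land on a tainted node only if it tolerates every blocking taint. The `dedicated=db-master` taint is a hypothetical example of reserving a worker for one master; the field names follow the Kubernetes taint and toleration schema.

```python
from typing import Dict, List


def tolerates(toleration: Dict[str, str], taint: Dict[str, str]) -> bool:
    """True when a single toleration matches a single taint."""
    # An empty or missing effect on the toleration matches any taint effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    if operator == "Exists":
        # An empty key with Exists tolerates every taint key.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    return (
        toleration.get("key") == taint["key"]
        and toleration.get("value", "") == taint.get("value", "")
    )


def can_schedule(tolerations: List[Dict[str, str]], taints: List[Dict[str, str]]) -> bool:
    """True when every blocking taint on the node is tolerated by the pod."""
    blocking = [t for t in taints if t["effect"] != "PreferNoSchedule"]
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in blocking)


# Hypothetical taint that reserves a node for a database master.
node_taints = [{"key": "dedicated", "value": "db-master", "effect": "NoSchedule"}]
master_tolerations = [
    {"key": "dedicated", "operator": "Equal", "value": "db-master", "effect": "NoSchedule"}
]

if __name__ == "__main__":
    print("master pod:", can_schedule(master_tolerations, node_taints))
    print("other pod:", can_schedule([], node_taints))
```

A toleration only permits placement; pairing it with node affinity is what actually steers the master onto the reserved node.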
Pod networking
Best practices for high-performance clusters
• Lean and mean container
• Amazon EKS optimized AMI
• Image pull policy
• Master on SSD/NVMe
• Slave:
• SSD/NVMe
• EBS for increased resiliency
• Monitor key metrics
• Watch overcommitted state
• Cluster Autoscaler, HPA, VPA
• Placement groups
• 30% higher packets per second
• Cross AZ deployment
• Place writes closer to master
• Choose right size nodes
• Good network performance
• More CPU better than more RAM
• Use EKS CNI plugin
• Upgrade your CNI plug-in
• aws-k8s-cni-g74ecf61.yaml or later
• Bottleneck: Packets per
second, not bandwidth
• Jumbo frames increased QPS by 90%
Bottleneck
Bottleneck @400K QPS
Scaling factors
Query Throughput  Query Latency               Network PPS  Configuration
300K              P95: 300 ms, P50: 8 ms      N/A          2 shards, 4 workers, overloaded
400K              P95: 4 ms, P50: 500 nano    6.8M         Single VPC, no placement group, one replica, 3 subnets
600K              P95: 4 ms, P50: 500 nano    9.5M         Single AZ, placement group, one replica, 1.5K MTU
948K              P95: 3 ms, P50: 500 nano    9.5M         Single AZ, placement group, one replica, jumbo frames
1.36M             P95: 3 ms, P50: 500 nano    9.5M         Single AZ, placement group, jumbo frames, 4 replicas
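One way to read these figures is as packets spent per query. The rough calculation below divides the reported network PPS by the reported query throughput for each row that lists both. It is a back-of-the-envelope reading of the slide's numbers, not a measurement method from the talk.

```python
def packets_per_query(network_pps: float, queries_per_second: float) -> float:
    """Average number of network packets observed per query."""
    return network_pps / queries_per_second


# (configuration summary, queries per second, network packets per second)
rows = [
    ("no placement group", 400_000, 6_800_000),
    ("placement group, 1.5K MTU", 600_000, 9_500_000),
    ("placement group, jumbo frames", 948_000, 9_500_000),
    ("jumbo frames, 4 replicas", 1_360_000, 9_500_000),
]

if __name__ == "__main__":
    for label, qps, pps in rows:
        print(f"{label}: {packets_per_query(pps, qps):.1f} packets per query")
```

Network PPS stays at 9.5M across the last three rows while throughput more than doubles, which fits the point that packets per second, not bandwidth, was the limiting factor: the later configurations serve each query with fewer packets.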
Thank you!
Yekesa Kosuru
ykosuru@statestreet.com

Contenu connexe

Tendances

(DAT407) Amazon ElastiCache: Deep Dive
(DAT407) Amazon ElastiCache: Deep Dive(DAT407) Amazon ElastiCache: Deep Dive
(DAT407) Amazon ElastiCache: Deep DiveAmazon Web Services
 
Deep Dive on Amazon EC2 Instances & Performance Optimization Best Practices (...
Deep Dive on Amazon EC2 Instances & Performance Optimization Best Practices (...Deep Dive on Amazon EC2 Instances & Performance Optimization Best Practices (...
Deep Dive on Amazon EC2 Instances & Performance Optimization Best Practices (...Amazon Web Services
 
IaC로 AWS인프라 관리하기 - 이진성 (AUSG) :: AWS Community Day Online 2021
IaC로 AWS인프라 관리하기 - 이진성 (AUSG) :: AWS Community Day Online 2021IaC로 AWS인프라 관리하기 - 이진성 (AUSG) :: AWS Community Day Online 2021
IaC로 AWS인프라 관리하기 - 이진성 (AUSG) :: AWS Community Day Online 2021AWSKRUG - AWS한국사용자모임
 
Kubernetes on AWS with Amazon EKS
Kubernetes on AWS with Amazon EKSKubernetes on AWS with Amazon EKS
Kubernetes on AWS with Amazon EKSAmazon Web Services
 
Analyzing Streaming Data in Real-time with Amazon Kinesis
Analyzing Streaming Data in Real-time with Amazon KinesisAnalyzing Streaming Data in Real-time with Amazon Kinesis
Analyzing Streaming Data in Real-time with Amazon KinesisAmazon Web Services
 
Aws intro to cloud_economics
Aws intro to cloud_economicsAws intro to cloud_economics
Aws intro to cloud_economicsjtaylor707
 
Elastic Observability
Elastic Observability Elastic Observability
Elastic Observability FaithWestdorp
 
Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018
Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018
Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018Amazon Web Services Korea
 
Amazon EKS - Elastic Container Service for Kubernetes
Amazon EKS - Elastic Container Service for KubernetesAmazon EKS - Elastic Container Service for Kubernetes
Amazon EKS - Elastic Container Service for KubernetesAmazon Web Services
 
Performance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State StoresPerformance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State Storesconfluent
 
Elastic Stack Introduction
Elastic Stack IntroductionElastic Stack Introduction
Elastic Stack IntroductionVikram Shinde
 
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...Kai Wähner
 

Tendances (20)

(DAT407) Amazon ElastiCache: Deep Dive
(DAT407) Amazon ElastiCache: Deep Dive(DAT407) Amazon ElastiCache: Deep Dive
(DAT407) Amazon ElastiCache: Deep Dive
 
Amazon Aurora: Under the Hood
Amazon Aurora: Under the HoodAmazon Aurora: Under the Hood
Amazon Aurora: Under the Hood
 
Apache Kafka Best Practices
Apache Kafka Best PracticesApache Kafka Best Practices
Apache Kafka Best Practices
 
Introduction to Amazon EKS
Introduction to Amazon EKSIntroduction to Amazon EKS
Introduction to Amazon EKS
 
Deep Dive on Amazon EC2 Instances & Performance Optimization Best Practices (...
Deep Dive on Amazon EC2 Instances & Performance Optimization Best Practices (...Deep Dive on Amazon EC2 Instances & Performance Optimization Best Practices (...
Deep Dive on Amazon EC2 Instances & Performance Optimization Best Practices (...
 
infrastructure as code
infrastructure as codeinfrastructure as code
infrastructure as code
 
IaC로 AWS인프라 관리하기 - 이진성 (AUSG) :: AWS Community Day Online 2021
IaC로 AWS인프라 관리하기 - 이진성 (AUSG) :: AWS Community Day Online 2021IaC로 AWS인프라 관리하기 - 이진성 (AUSG) :: AWS Community Day Online 2021
IaC로 AWS인프라 관리하기 - 이진성 (AUSG) :: AWS Community Day Online 2021
 
Kubernetes on AWS with Amazon EKS
Kubernetes on AWS with Amazon EKSKubernetes on AWS with Amazon EKS
Kubernetes on AWS with Amazon EKS
 
Amazon EKS Deep Dive
Amazon EKS Deep DiveAmazon EKS Deep Dive
Amazon EKS Deep Dive
 
Analyzing Streaming Data in Real-time with Amazon Kinesis
Analyzing Streaming Data in Real-time with Amazon KinesisAnalyzing Streaming Data in Real-time with Amazon Kinesis
Analyzing Streaming Data in Real-time with Amazon Kinesis
 
Aws intro to cloud_economics
Aws intro to cloud_economicsAws intro to cloud_economics
Aws intro to cloud_economics
 
AWS Fargate on EKS 실전 사용하기
AWS Fargate on EKS 실전 사용하기AWS Fargate on EKS 실전 사용하기
AWS Fargate on EKS 실전 사용하기
 
Elastic Observability
Elastic Observability Elastic Observability
Elastic Observability
 
Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018
Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018
Amazon EKS 그리고 Service Mesh (김세호 솔루션즈 아키텍트, AWS) :: Gaming on AWS 2018
 
Amazon EKS - Elastic Container Service for Kubernetes
Amazon EKS - Elastic Container Service for KubernetesAmazon EKS - Elastic Container Service for Kubernetes
Amazon EKS - Elastic Container Service for Kubernetes
 
Performance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State StoresPerformance Tuning RocksDB for Kafka Streams’ State Stores
Performance Tuning RocksDB for Kafka Streams’ State Stores
 
DevOps on AWS
DevOps on AWSDevOps on AWS
DevOps on AWS
 
Elastic Stack Introduction
Elastic Stack IntroductionElastic Stack Introduction
Elastic Stack Introduction
 
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
Architecture patterns for distributed, hybrid, edge and global Apache Kafka d...
 
ElastiCache & Redis
ElastiCache & RedisElastiCache & Redis
ElastiCache & Redis
 

Similaire à Running a High-Performance Kubernetes Cluster with Amazon EKS (CON318-R1) - AWS re:Invent 2018

Best practices for optimizing your EC2 costs with Spot Instances | AWS Floor28
Best practices for optimizing your EC2 costs with Spot Instances | AWS Floor28Best practices for optimizing your EC2 costs with Spot Instances | AWS Floor28
Best practices for optimizing your EC2 costs with Spot Instances | AWS Floor28Amazon Web Services
 
Expert Tips for Successful Kubernetes Deployment - AWS Summit Sydney 2018
Expert Tips for Successful Kubernetes Deployment - AWS Summit Sydney 2018Expert Tips for Successful Kubernetes Deployment - AWS Summit Sydney 2018
Expert Tips for Successful Kubernetes Deployment - AWS Summit Sydney 2018Amazon Web Services
 
Expert Tips for Successful Kubernetes Deployment on AWS
Expert Tips for Successful Kubernetes Deployment on AWSExpert Tips for Successful Kubernetes Deployment on AWS
Expert Tips for Successful Kubernetes Deployment on AWSAmazon Web Services
 
Achieving Global Consistency Using AWS CloudFormation StackSets - AWS Online ...
Achieving Global Consistency Using AWS CloudFormation StackSets - AWS Online ...Achieving Global Consistency Using AWS CloudFormation StackSets - AWS Online ...
Achieving Global Consistency Using AWS CloudFormation StackSets - AWS Online ...Amazon Web Services
 
Expert Tips for Successful Kubernetes Deployments on AWS
Expert Tips for Successful Kubernetes Deployments on AWSExpert Tips for Successful Kubernetes Deployments on AWS
Expert Tips for Successful Kubernetes Deployments on AWSAmazon Web Services
 
Run Production Workloads on Spot, Save up to 90%
Run Production Workloads on Spot, Save up to 90%Run Production Workloads on Spot, Save up to 90%
Run Production Workloads on Spot, Save up to 90%Amazon Web Services
 
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...Amazon Web Services
 
CMP376 - Another Week, Another Million Containers on Amazon EC2
CMP376 - Another Week, Another Million Containers on Amazon EC2CMP376 - Another Week, Another Million Containers on Amazon EC2
CMP376 - Another Week, Another Million Containers on Amazon EC2aspyker
 
Module 2: Core AWS Compute and Storage Services - Virtual AWSome Day June 2018
Module 2: Core AWS Compute and Storage Services - Virtual AWSome Day June 2018Module 2: Core AWS Compute and Storage Services - Virtual AWSome Day June 2018
Module 2: Core AWS Compute and Storage Services - Virtual AWSome Day June 2018Amazon Web Services
 
Another Week, Another Million Containers on Amazon EC2 (CMP376) - AWS re:Inve...
Another Week, Another Million Containers on Amazon EC2 (CMP376) - AWS re:Inve...Another Week, Another Million Containers on Amazon EC2 (CMP376) - AWS re:Inve...
Another Week, Another Million Containers on Amazon EC2 (CMP376) - AWS re:Inve...Amazon Web Services
 
A Deep Dive into What's New with Amazon EMR (ANT340-R1) - AWS re:Invent 2018
A Deep Dive into What's New with Amazon EMR (ANT340-R1) - AWS re:Invent 2018A Deep Dive into What's New with Amazon EMR (ANT340-R1) - AWS re:Invent 2018
A Deep Dive into What's New with Amazon EMR (ANT340-R1) - AWS re:Invent 2018Amazon Web Services
 
ElastiCache: Deep Dive Best Practices and Usage Patterns - AWS Online Tech Talks
ElastiCache: Deep Dive Best Practices and Usage Patterns - AWS Online Tech TalksElastiCache: Deep Dive Best Practices and Usage Patterns - AWS Online Tech Talks
ElastiCache: Deep Dive Best Practices and Usage Patterns - AWS Online Tech TalksAmazon Web Services
 
Build Your Own Log Analytics Solutions on AWS (ANT323-R) - AWS re:Invent 2018
Build Your Own Log Analytics Solutions on AWS (ANT323-R) - AWS re:Invent 2018Build Your Own Log Analytics Solutions on AWS (ANT323-R) - AWS re:Invent 2018
Build Your Own Log Analytics Solutions on AWS (ANT323-R) - AWS re:Invent 2018Amazon Web Services
 
Getting Started with Containers in the Cloud: AWS Developer Workshop at Web S...
Getting Started with Containers in the Cloud: AWS Developer Workshop at Web S...Getting Started with Containers in the Cloud: AWS Developer Workshop at Web S...
Getting Started with Containers in the Cloud: AWS Developer Workshop at Web S...Amazon Web Services
 
Better, Faster, Cheaper – Cost Optimizing Compute with Amazon EC2 Fleet #savi...
Better, Faster, Cheaper – Cost Optimizing Compute with Amazon EC2 Fleet #savi...Better, Faster, Cheaper – Cost Optimizing Compute with Amazon EC2 Fleet #savi...
Better, Faster, Cheaper – Cost Optimizing Compute with Amazon EC2 Fleet #savi...Amazon Web Services
 
Getting-started-with-containers on AWS
Getting-started-with-containers on AWSGetting-started-with-containers on AWS
Getting-started-with-containers on AWSAmazon Web Services
 
Running Amazon EKS Workloads on Amazon EC2 Spot Instances (CMP403-R1) - AWS r...
Running Amazon EKS Workloads on Amazon EC2 Spot Instances (CMP403-R1) - AWS r...Running Amazon EKS Workloads on Amazon EC2 Spot Instances (CMP403-R1) - AWS r...
Running Amazon EKS Workloads on Amazon EC2 Spot Instances (CMP403-R1) - AWS r...Amazon Web Services
 
Optimize Amazon EC2 for Fun and Profit - SRV203 - Chicago AWS Summit
Optimize Amazon EC2 for Fun and Profit - SRV203 - Chicago AWS SummitOptimize Amazon EC2 for Fun and Profit - SRV203 - Chicago AWS Summit
Optimize Amazon EC2 for Fun and Profit - SRV203 - Chicago AWS SummitAmazon Web Services
 

Similaire à Running a High-Performance Kubernetes Cluster with Amazon EKS (CON318-R1) - AWS re:Invent 2018 (20)

Best practices for optimizing your EC2 costs with Spot Instances | AWS Floor28
Best practices for optimizing your EC2 costs with Spot Instances | AWS Floor28Best practices for optimizing your EC2 costs with Spot Instances | AWS Floor28
Best practices for optimizing your EC2 costs with Spot Instances | AWS Floor28
 
Expert Tips for Successful Kubernetes Deployment - AWS Summit Sydney 2018
Expert Tips for Successful Kubernetes Deployment - AWS Summit Sydney 2018Expert Tips for Successful Kubernetes Deployment - AWS Summit Sydney 2018
Expert Tips for Successful Kubernetes Deployment - AWS Summit Sydney 2018
 
Expert Tips for Successful Kubernetes Deployment on AWS
Expert Tips for Successful Kubernetes Deployment on AWSExpert Tips for Successful Kubernetes Deployment on AWS
Expert Tips for Successful Kubernetes Deployment on AWS
 
Achieving Global Consistency Using AWS CloudFormation StackSets - AWS Online ...
Achieving Global Consistency Using AWS CloudFormation StackSets - AWS Online ...Achieving Global Consistency Using AWS CloudFormation StackSets - AWS Online ...
Achieving Global Consistency Using AWS CloudFormation StackSets - AWS Online ...
 
Expert Tips for Successful Kubernetes Deployments on AWS
Expert Tips for Successful Kubernetes Deployments on AWSExpert Tips for Successful Kubernetes Deployments on AWS
Expert Tips for Successful Kubernetes Deployments on AWS
 
Run Production Workloads on Spot, Save up to 90%
Run Production Workloads on Spot, Save up to 90%Run Production Workloads on Spot, Save up to 90%
Run Production Workloads on Spot, Save up to 90%
 
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
 
CMP376 - Another Week, Another Million Containers on Amazon EC2
CMP376 - Another Week, Another Million Containers on Amazon EC2CMP376 - Another Week, Another Million Containers on Amazon EC2
CMP376 - Another Week, Another Million Containers on Amazon EC2
 
Module 2: Core AWS Compute and Storage Services - Virtual AWSome Day June 2018
Module 2: Core AWS Compute and Storage Services - Virtual AWSome Day June 2018Module 2: Core AWS Compute and Storage Services - Virtual AWSome Day June 2018
Module 2: Core AWS Compute and Storage Services - Virtual AWSome Day June 2018
 
Another Week, Another Million Containers on Amazon EC2 (CMP376) - AWS re:Inve...
Another Week, Another Million Containers on Amazon EC2 (CMP376) - AWS re:Inve...Another Week, Another Million Containers on Amazon EC2 (CMP376) - AWS re:Inve...
Another Week, Another Million Containers on Amazon EC2 (CMP376) - AWS re:Inve...
 
A Deep Dive into What's New with Amazon EMR (ANT340-R1) - AWS re:Invent 2018
A Deep Dive into What's New with Amazon EMR (ANT340-R1) - AWS re:Invent 2018A Deep Dive into What's New with Amazon EMR (ANT340-R1) - AWS re:Invent 2018
A Deep Dive into What's New with Amazon EMR (ANT340-R1) - AWS re:Invent 2018
 
ElastiCache: Deep Dive Best Practices and Usage Patterns - AWS Online Tech Talks
ElastiCache: Deep Dive Best Practices and Usage Patterns - AWS Online Tech TalksElastiCache: Deep Dive Best Practices and Usage Patterns - AWS Online Tech Talks
ElastiCache: Deep Dive Best Practices and Usage Patterns - AWS Online Tech Talks
 
Build Your Own Log Analytics Solutions on AWS (ANT323-R) - AWS re:Invent 2018
Build Your Own Log Analytics Solutions on AWS (ANT323-R) - AWS re:Invent 2018Build Your Own Log Analytics Solutions on AWS (ANT323-R) - AWS re:Invent 2018
Build Your Own Log Analytics Solutions on AWS (ANT323-R) - AWS re:Invent 2018
 
Getting Started with Containers in the Cloud: AWS Developer Workshop at Web S...
Getting Started with Containers in the Cloud: AWS Developer Workshop at Web S...Getting Started with Containers in the Cloud: AWS Developer Workshop at Web S...
Getting Started with Containers in the Cloud: AWS Developer Workshop at Web S...
 
Amazon EC2 Spot Instances
Amazon EC2 Spot InstancesAmazon EC2 Spot Instances
Amazon EC2 Spot Instances
 
Better, Faster, Cheaper – Cost Optimizing Compute with Amazon EC2 Fleet #savi...
Better, Faster, Cheaper – Cost Optimizing Compute with Amazon EC2 Fleet #savi...Better, Faster, Cheaper – Cost Optimizing Compute with Amazon EC2 Fleet #savi...
Better, Faster, Cheaper – Cost Optimizing Compute with Amazon EC2 Fleet #savi...
 
Getting-started-with-containers on AWS
Getting-started-with-containers on AWSGetting-started-with-containers on AWS
Getting-started-with-containers on AWS
 
Amazon EC2 Spot Instances Workshop
Amazon EC2 Spot Instances WorkshopAmazon EC2 Spot Instances Workshop
Amazon EC2 Spot Instances Workshop
 
Running Amazon EKS Workloads on Amazon EC2 Spot Instances (CMP403-R1) - AWS r...
Running Amazon EKS Workloads on Amazon EC2 Spot Instances (CMP403-R1) - AWS r...Running Amazon EKS Workloads on Amazon EC2 Spot Instances (CMP403-R1) - AWS r...
Running Amazon EKS Workloads on Amazon EC2 Spot Instances (CMP403-R1) - AWS r...
 
Optimize Amazon EC2 for Fun and Profit - SRV203 - Chicago AWS Summit
Optimize Amazon EC2 for Fun and Profit - SRV203 - Chicago AWS SummitOptimize Amazon EC2 for Fun and Profit - SRV203 - Chicago AWS Summit
Optimize Amazon EC2 for Fun and Profit - SRV203 - Chicago AWS Summit
 

Plus de Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Running a High-Performance Kubernetes Cluster with Amazon EKS (CON318-R1) - AWS re:Invent 2018

  • 2. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Running a High Performance Kubernetes Cluster with Amazon EKS (CON318). Nathan Peck, Developer Advocate, Amazon Web Services; Yekesa Kosuru, Managing Director, State Street.
  • 3. Breakout repeats of "Running a high performance Kubernetes cluster with Amazon EKS": Tuesday, November 27th, 6:15 PM | Venetian, Level 3, Murano 3205; Wednesday, November 28th, 1:45 PM | Venetian, Level 4, Delfino 4002.
  • 4. Agenda: Best practices of designing for performance; How do I test performance?; State Street: Database at scale demo.
  • 6. Basic components of Kubernetes (diagram labels: You, Worker Nodes, Your Container, Amazon EKS, Control Plane, etcd).
  • 7. Optimizing your container: Optimize for smaller size by using a multistage Docker build to reduce the size of the runtime container. Use a minimalist operating system such as Alpine Linux, or no operating system at all (for example, a statically linked Go binary). Not all runtimes are equal: does your app have a cold start that requires an initial burst of resources?
  • 8. Popular base images have a huge range by size
      REPOSITORY        SIZE
      node:latest       674MB
      java:latest       643MB
      node:slim         184MB
      ubuntu:latest     85.8MB
      alpine:latest     4.41MB
      busybox:latest    1.15MB
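To make the size gap on this slide concrete, here is a rough Python sketch that turns the listed sizes into an estimated download time on a node with a cold image cache. The throughput constant is a made-up assumption, and the slide's sizes look like on-disk sizes, so real (compressed) transfers would likely be smaller; treat the output as relative, not absolute.

```python
# Image sizes copied from the slide's table, in megabytes.
BASE_IMAGE_SIZES_MB = {
    "node:latest": 674.0,
    "java:latest": 643.0,
    "node:slim": 184.0,
    "ubuntu:latest": 85.8,
    "alpine:latest": 4.41,
    "busybox:latest": 1.15,
}

# ASSUMPTION: hypothetical effective pull throughput; measure your own registry.
ASSUMED_PULL_MB_PER_SEC = 50.0


def estimated_pull_seconds(image: str, mb_per_sec: float = ASSUMED_PULL_MB_PER_SEC) -> float:
    """Rough time to download an uncached base image at a fixed throughput."""
    return BASE_IMAGE_SIZES_MB[image] / mb_per_sec


def size_ratio(larger: str, smaller: str) -> float:
    """How many times bigger one listed image is than another."""
    return BASE_IMAGE_SIZES_MB[larger] / BASE_IMAGE_SIZES_MB[smaller]


if __name__ == "__main__":
    for name in BASE_IMAGE_SIZES_MB:
        print(f"{name:<16} {estimated_pull_seconds(name):7.2f} s")
    ratio = size_ratio("node:latest", "alpine:latest")
    print(f"node:latest is roughly {ratio:.0f}x the size of alpine:latest")
```

The absolute numbers depend entirely on the assumed throughput; the ratio between images is the part that carries over to a real cluster.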
  • 9. Optimize pods: How many sidecar containers do you have in each pod? Admission controllers make it easy to add a lot of sidecars, but don’t underestimate the overhead cost. Keep pods as lightweight as you can.
  • 10. Optimize pod placement: Make sure you use resource constraints. Request the baseline average resource needs of the app, and put a limit on the maximum resources available to the pod so that one pod cannot interfere with the performance of another.
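As a minimal sketch of what "request the baseline, limit the maximum" looks like in a pod spec, the Python below builds a manifest as a plain dictionary and prints it as JSON. The pod name, image, and all quantities are hypothetical placeholders, not values from the talk.

```python
import json

# ASSUMPTION: name, image, and every quantity below are hypothetical placeholders.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-app"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "example.com/app:1.0",
                "resources": {
                    # Baseline average need: what the scheduler reserves on a node.
                    "requests": {"cpu": "500m", "memory": "256Mi"},
                    # Hard ceiling: keeps a busy pod from starving its neighbours.
                    "limits": {"cpu": "1", "memory": "512Mi"},
                },
            }
        ]
    },
}


def cpu_millicores(quantity: str) -> int:
    """Parse the two CPU forms used above: '500m' or a plain core count."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


if __name__ == "__main__":
    print(json.dumps(pod_manifest, indent=2))
```

The printed JSON can be saved to a file and applied like any other manifest; how far apart to set requests and limits is a tuning decision the slide leaves open.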
  • 11. Optimize density vs. size of pods: 4 x pod (.5 CPU, 256 memory) vs. 2 x pod (1 CPU, 512 memory).
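The two layouts on this slide add up to the same total, which a few lines of Python can confirm. The sketch also shows one possible reason to prefer smaller pods: they fit into leftover node capacity more easily. The free-capacity figures are a hypothetical example, and memory is kept in the slide's unspecified unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodShape:
    """One way of slicing the same workload into pods."""
    replicas: int
    cpu_millicores: int
    memory: int  # same unspecified unit as on the slide (256 / 512)

    @property
    def total_cpu_millicores(self) -> int:
        return self.replicas * self.cpu_millicores

    @property
    def total_memory(self) -> int:
        return self.replicas * self.memory


def pods_that_fit(shape: PodShape, free_cpu_millicores: int, free_memory: int) -> int:
    """How many pods of this shape fit into a node's remaining capacity."""
    return min(free_cpu_millicores // shape.cpu_millicores, free_memory // shape.memory)


many_small = PodShape(replicas=4, cpu_millicores=500, memory=256)
few_large = PodShape(replicas=2, cpu_millicores=1000, memory=512)

# ASSUMPTION: a hypothetical node with 1.5 CPU and 768 memory units still free.
FREE_CPU_MILLICORES, FREE_MEMORY = 1500, 768

if __name__ == "__main__":
    for label, shape in (("4 small pods", many_small), ("2 large pods", few_large)):
        fit = pods_that_fit(shape, FREE_CPU_MILLICORES, FREE_MEMORY)
        print(f"{label}: total {shape.total_cpu_millicores}m CPU, "
              f"{shape.total_memory} memory; {fit} more fit in the free space")
```

Smaller pods pack tighter but add per-pod overhead and scheduling work, which is the tradeoff the slide title points at.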
  • 12. Anti-affinity: Anti-affinity constraints can keep CPU-heavy pods away from each other, on different hosts. Warning: anti-affinity is a beta feature before Kubernetes 1.12; version 1.12 improves anti-affinity performance 100x in large clusters. Tradeoff: heavier control plane scheduling burden in exchange for an application pod performance bonus.
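A sketch of the constraint this slide describes, written as the `affinity` fragment of a pod spec in Python, followed by a deliberately simplified check of what the rule means for one node. The label `workload: cpu-heavy` is a hypothetical example, and the checker only handles `matchLabels`; it is not the scheduler's real algorithm.

```python
from typing import Dict, List

# ASSUMPTION: the label key/value is hypothetical; use your own pod labels.
anti_affinity = {
    "podAntiAffinity": {
        "requiredDuringSchedulingIgnoredDuringExecution": [
            {
                "labelSelector": {"matchLabels": {"workload": "cpu-heavy"}},
                # One matching pod per host: the topology domain is the node name.
                "topologyKey": "kubernetes.io/hostname",
            }
        ]
    }
}


def violates_anti_affinity(pods_on_node: List[Dict[str, str]], term: dict) -> bool:
    """True if any pod already on the node matches the term's matchLabels."""
    wanted = term["labelSelector"]["matchLabels"]
    return any(
        all(labels.get(key) == value for key, value in wanted.items())
        for labels in pods_on_node
    )


term = anti_affinity["podAntiAffinity"]["requiredDuringSchedulingIgnoredDuringExecution"][0]

if __name__ == "__main__":
    busy_node = [{"workload": "cpu-heavy"}]
    quiet_node = [{"workload": "web"}]
    print("busy node rejected:", violates_anti_affinity(busy_node, term))
    print("quiet node rejected:", violates_anti_affinity(quiet_node, term))
```

Every such check runs for each candidate node at scheduling time, which is where the control plane cost mentioned on the slide comes from.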
  • 13. Observability for pod performance
  • 14. Optimizing your worker nodes: Use the latest generation of Amazon Elastic Compute Cloud (Amazon EC2) instances. The c5 instance generation has up to 25% better price/performance than c4 instances.
  • 15. Optimizing your worker nodes: Choose the instance class that matches your primary pod resource needs: “c” instances are optimized for CPU-heavy work; “r” instances are optimized for memory-heavy work; “m” instances are general purpose; “p” instances are optimized for GPU-powered machine learning.
  • 16. Kubernetes control plane on tap, optimized by Amazon EKS (diagram labels: kubectl, mycluster.eks.amazonaws.com, EKS workers, AZ 1, AZ 2, AZ 3, your AWS account).
  • 17. Networking performance: AWS VPC CNI plugin (diagram labels: Kubelet, VPC CNI plugin, EC2, ENI, Pod, VPC Network; steps: 0. Create ENI, 1. CNI Add/Delete, 2. Setup veth). Thin layer, no overhead. Gives K8s pods native IP addresses in the VPC. Multiple ENIs per Amazon EC2 instance, multiple pods per ENI, all configurable.
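Because each pod takes a native VPC IP address from an ENI, the ENI and address limits of the instance type cap pod density. The sketch below uses the per-instance formula published in the Amazon EKS documentation for the default (secondary IP) mode. The slide does not state it, and the two instance figures are quoted from memory of AWS's instance limits, so verify them before relying on the result.

```python
def max_pods(enis: int, ipv4_per_eni: int) -> int:
    """Pod ceiling for a node using secondary-IP mode of the VPC CNI plugin.

    One address on each ENI is the ENI's own primary IP, so it is not handed
    to a pod; the trailing +2 covers pods that use host networking.
    """
    if enis < 1 or ipv4_per_eni < 1:
        raise ValueError("instance must have at least one ENI and one address")
    return enis * (ipv4_per_eni - 1) + 2


# ASSUMPTION: ENI and address limits recalled from AWS documentation; verify them.
INSTANCE_LIMITS = {
    "m5.large": (3, 10),
    "c5.4xlarge": (8, 30),
}

if __name__ == "__main__":
    for instance_type, (enis, ips) in INSTANCE_LIMITS.items():
        print(f"{instance_type}: up to {max_pods(enis, ips)} pods")
```

If many small pods are planned, this ceiling can bind before CPU or memory does.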
  • 18. Pod to pod networking (diagram labels, per EC2 instance: Default namespace, Pod namespace, veth, Main Route Table, Route Table, ENI RT; between instances: VPC fabric).
  • 19. Kubernetes performance envelope (chart axes: Number of Nodes, Pod Churn, Pod Density, Networking, Secrets, Anti-affinity, Active Namespaces).
  • 20. Heavy monolithic pods in a very large cluster (chart axes: Number of Nodes, Pod Churn, Pod Density, Networking, Secrets, Anti-affinity, Active Namespaces).
  • 21. Numerous densely bin packed microservice pods (chart axes: Number of Nodes, Pod Churn, Pod Density, Networking, Secrets, Anti-affinity, Active Namespaces).
  • 22. The EKS team is here to help! We are constantly learning from the varying use cases of the many large deployments orchestrated using an EKS control plane. Reach out to the team via support ticket, and we will help you optimize your control plane to your exact performance needs.
  • 24. State Street Disclaimer: Views and opinions expressed herein are those of PRESENTER as of 11/27/18 and they are subject to change based on market and other conditions and in any event may not reflect the views of State Street Corporation and its subsidiaries and affiliates (“State Street”). This information herein is for informational purposes only and it does not constitute investment research or investment, legal, or tax advice, and it is not an offer or solicitation to buy or sell any product, service, or securities or any financial instrument, and it does not constitute any binding contractual arrangement or commitment of any kind. This information is provided “as-is” and State Street makes no guarantee, representation, or warranty of any kind regarding such information. This information is not intended to be relied upon by any person or entity. State Street disclaims all liability, whether arising in contract, tort or otherwise, for any losses, liabilities, damages, expenses or costs arising, either direct, indirect, consequential, special or punitive, from or in connection with the use of the information herein. No permission is granted to reprint, sell, copy, distribute, or modify any material herein, in any form or by any means without the prior written consent of State Street.
  • 25. Agenda: What and why; How: leverage Kubernetes primitives to build a high-performance system; Design considerations and best practices; Scaling factors and bottlenecks; Live demo: demonstrate a scale-out database on Amazon EKS (1 million+ queries per second, measure latencies).
  • 26. What and why: Transactional database with unlimited scale concurrency; High concurrency; Low latencies; Open source; Cloud native architecture; Custom database features.
  • 27. Replication architecture: MySQL master/slave. RocksDB engine: LSM data structure optimized for writes; write-intensive workloads; low memory demands. MariaDB or Percona: RocksDB; standard MySQL features; semi-sync replication with GTID; failover. Cloud native: SST files and WALs synchronized. (Diagram labels: Percona Server (binlogs), MyRocks, WAL, SST, rocksdb, local-attached storage, bin.)
  • 28. Resilient scale-out architecture: Scale-out with Vitess, a sharded cluster for scale: each shard is one master plus multiple slaves; custom sharding key; read scale-out: add more slaves; write scale-out: add more shards; ACID compliance across the cluster; connection pool, restrain bad queries. Amazon S3 and Amazon EKS: backups stored to Amazon Simple Storage Service (Amazon S3); cluster hosted on Amazon EKS.
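The scale-out rules on this slide (writes go to a shard's master, reads can fan out to its slaves, and a sharding key picks the shard) can be illustrated with a toy router. This is only a sketch of the idea: the hash, the equal-range split, and the function names are assumptions made for illustration, not Vitess's actual sharding scheme or API.

```python
import hashlib
from typing import Tuple


def keyspace_id(sharding_key: str) -> int:
    """Map a sharding key to a 64-bit integer (illustrative hash only)."""
    digest = hashlib.sha256(sharding_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_for(sharding_key: str, shard_count: int) -> int:
    """Split the 64-bit space into equal contiguous ranges, one per shard."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    return (keyspace_id(sharding_key) * shard_count) >> 64


def route(sharding_key: str, shard_count: int, replicas_per_shard: int,
          is_write: bool) -> Tuple[int, str]:
    """Writes go to the shard's master; reads are spread over its replicas."""
    shard = shard_for(sharding_key, shard_count)
    if is_write or replicas_per_shard < 1:
        return shard, "master"
    replica = keyspace_id(sharding_key) % replicas_per_shard
    return shard, f"replica-{replica}"


if __name__ == "__main__":
    # ASSUMPTION: hypothetical key and cluster shape.
    print(route("customer-42", shard_count=2, replicas_per_shard=4, is_write=True))
    print(route("customer-42", shard_count=2, replicas_per_shard=4, is_write=False))
```

Adding shards raises write capacity because each master owns a smaller slice of the key range, while adding replicas only raises read capacity.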
  • 30. Storage: Persistent Volume (PV): persistent storage survives pod restarts; HostPath PV; local storage on SSD/NVMe devices; PVs are attached via PV Claims. PV Claims: dynamic; abstraction over the underlying storage; ReadWriteOnce. Tradeoff between resiliency and performance. (Diagram labels: Pod, Data Volume, Pod (pvc), Persistent Vol; the three options are captioned "Best resiliency, Low performance", "No resiliency, Best performance", and "Medium resiliency, Best performance".)
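A minimal sketch of how a pod attaches storage through a PV Claim with the ReadWriteOnce access mode named on this slide, written as Python dictionaries that print as JSON manifests. The claim name, storage class, size, image, and mount path are all hypothetical placeholders.

```python
import json

# ASSUMPTION: every name, the storage class, and the size are hypothetical.
CLAIM_NAME = "db-data"

pvc_manifest = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": CLAIM_NAME},
    "spec": {
        # ReadWriteOnce: the volume is mounted read-write by a single node.
        "accessModes": ["ReadWriteOnce"],
        "storageClassName": "local-nvme",
        "resources": {"requests": {"storage": "100Gi"}},
    },
}

pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "db-0"},
    "spec": {
        "containers": [
            {
                "name": "db",
                "image": "example.com/db:1.0",
                "volumeMounts": [{"name": "data", "mountPath": "/var/lib/mysql"}],
            }
        ],
        # The pod names the claim, never the underlying volume directly.
        "volumes": [{"name": "data", "persistentVolumeClaim": {"claimName": CLAIM_NAME}}],
    },
}

if __name__ == "__main__":
    print(json.dumps(pvc_manifest, indent=2))
    print(json.dumps(pod_manifest, indent=2))
```

Because the pod only references the claim, the backing storage can be local NVMe or a network volume without changing the pod spec, which is the abstraction the slide mentions.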
  • 31. Getting the most out of K8S: Taints & tolerations: place one master per worker; taint the node to maximize performance. Affinity & anti-affinity: affinity: pods scheduled on SSD/NVMe; anti-affinity: ensure masters aren’t scheduled with replicas. Services. Resource requests and limits. SYSCTL. StatefulSet: replicated group of pods with unique properties; StatefulSet restarts a Pod on the same node; requires PV Claims. Operator + ETCD. DaemonSet: background processes per node on all nodes; monitoring & upgrades; e.g., metrics agent, Local Volume Provisioner.
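To illustrate the "one master per worker" idea, the sketch below pairs a node taint with a matching pod toleration and checks them with a simplified version of the matching rule. The taint key and value are hypothetical, and the checker ignores details of the real scheduler (for example `tolerationSeconds` and the `PreferNoSchedule` effect).

```python
from typing import Dict, List

# ASSUMPTION: the key/value pair is a hypothetical naming choice.
master_node_taint = {"key": "dedicated", "value": "db-master", "effect": "NoSchedule"}

master_pod_toleration = {
    "key": "dedicated",
    "operator": "Equal",
    "value": "db-master",
    "effect": "NoSchedule",
}


def tolerates(toleration: Dict[str, str], taint: Dict[str, str]) -> bool:
    """Simplified match of one toleration against one taint."""
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if toleration.get("key") and toleration["key"] != taint["key"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        return True
    return toleration.get("value", "") == taint.get("value", "")


def schedulable(tolerations: List[Dict[str, str]], taints: List[Dict[str, str]]) -> bool:
    """A pod may land on a node only if every taint is tolerated."""
    return all(any(tolerates(t, taint) for t in tolerations) for taint in taints)


if __name__ == "__main__":
    print("master pod:", schedulable([master_pod_toleration], [master_node_taint]))
    print("other pod: ", schedulable([], [master_node_taint]))
```

A taint only keeps other pods off the node; steering the master onto that node still needs affinity or a node selector, which is why the slide lists both mechanisms.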
  • 32. Pod networking
  • 33. Best practices for high-performance clusters: Lean and mean container; Amazon EKS optimized AMI; Image pull policy; Master on SSD/NVMe; Slave: SSD/NVMe, or EBS for increased resiliency; Monitor key metrics; Watch overcommitted state; Cluster autoscaler, HPA, VPA; Placement groups: 30% higher packets per second; Cross-AZ deployment: place writes closer to master; Choose right-size nodes: good n/w performance, more CPU is better than more RAM; Use EKS CNI plugin: upgrade your CNI plug-in (aws-k8s-cni-g74ecf61.yaml or later); Bottleneck: packets per second, not bandwidth; Jumbo packets increased 90% QPS.
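The claim that the bottleneck is packets per second rather than bandwidth can be made concrete with simple arithmetic: at a fixed byte rate, a larger MTU means fewer packets. The sketch below assumes full-size packets, 40 bytes of IPv4 and TCP headers without options, and a jumbo MTU of 9001 bytes; all three are assumptions for illustration, and query traffic made of small packets will benefit less.

```python
# ASSUMPTION: IPv4 (20 bytes) + TCP (20 bytes) headers, no options.
HEADER_BYTES = 40
STANDARD_MTU = 1500   # the "1.5K MTU" configuration
JUMBO_MTU = 9001      # ASSUMPTION: jumbo frame size used for illustration


def packets_per_second(payload_bytes_per_second: float, mtu_bytes: int,
                       header_bytes: int = HEADER_BYTES) -> float:
    """Packets needed per second to carry a payload rate with full-size packets."""
    payload_per_packet = mtu_bytes - header_bytes
    if payload_per_packet <= 0:
        raise ValueError("MTU must be larger than the header size")
    return payload_bytes_per_second / payload_per_packet


if __name__ == "__main__":
    rate = 1e9  # hypothetical payload rate: 1 GB/s
    standard = packets_per_second(rate, STANDARD_MTU)
    jumbo = packets_per_second(rate, JUMBO_MTU)
    print(f"standard MTU: {standard:,.0f} packets/s")
    print(f"jumbo MTU:    {jumbo:,.0f} packets/s")
    print(f"reduction:    {standard / jumbo:.1f}x fewer packets")
```

When the instance's packet rate is the hard cap, every reduction in packets per query translates into headroom for more queries.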
  • 34. Bottleneck: bottleneck at 400K QPS.
  • 35. Scaling factors
      Query throughput | Query latency               | Network PPS | Configuration
      300K             | P95: 300 ms, P50: 8 ms      | N/A         | 2 shards, 4 workers, overloaded
      400K             | P95: 4 ms, P50: 500 nano    | 6.8M        | Single VPC, no placement group (PG), one replica, 3 subnets
      600K             | P95: 4 ms, P50: 500 nano    | 9.5M        | Single AZ, one replica, 1.5K MTU, placement group
      948K             | P95: 3 ms, P50: 500 nano    | 9.5M        | Single AZ, placement group, one replica, jumbo frames
      1.36M            | P95: 3 ms, P50: 500 nano    | 9.5M        | Single AZ, placement group, jumbo frames, 4 replicas
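One way to read this table is to divide the network packet rate by the query throughput. The sketch below does that for the four rows that report a PPS figure. It assumes the PPS column is the aggregate packet rate observed at that throughput, which the slide does not state explicitly.

```python
from typing import List, Tuple

# (configuration summary, queries per second, network packets per second),
# copied from the table rows that report a PPS value.
ROWS: List[Tuple[str, float, float]] = [
    ("single VPC, no placement group", 400_000, 6_800_000),
    ("placement group, 1.5K MTU", 600_000, 9_500_000),
    ("placement group, jumbo frames", 948_000, 9_500_000),
    ("jumbo frames, 4 replicas", 1_360_000, 9_500_000),
]


def packets_per_query(qps: float, pps: float) -> float:
    """Average number of network packets spent on each query."""
    return pps / qps


PACKETS_PER_QUERY = [packets_per_query(qps, pps) for _, qps, pps in ROWS]

if __name__ == "__main__":
    for (label, qps, pps), ratio in zip(ROWS, PACKETS_PER_QUERY):
        print(f"{label:<32} {qps:>11,.0f} QPS  {ratio:5.1f} packets/query")
```

Under that reading, throughput keeps rising after PPS flattens at 9.5M because each configuration change lowers the packets needed per query.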
  • 38. Thank you! Yekesa Kosuru, ykosuru@statestreet.com