Neutron Deployment at Scale
Igor Bolotin, Cloud Architecture
Vinay Bannai, SDN Architecture
eBay Inc. enables commerce by delivering flexible and
scalable solutions that foster merchant growth.
About eBay Inc.
With 145 million active buyers globally, eBay is one of the world's
largest online marketplaces, where practically anyone can buy and
sell practically anything.
With 148 million registered accounts in 193 markets and 26
currencies around the world, PayPal enables global
commerce, processing almost 8 million payments every day.
eBay Enterprise is a leading provider of commerce
technologies, omnichannel operations and marketing solutions.
It serves 1000 retailers and brands.
• Business case
• Cloud at eBay Inc
• Deployment Patterns
• Problem Areas
• How we addressed them
• Future Direction
• Summary
• Q & A
Outline of the Presentation
• Agility
− Reduce time to market
− Enable innovation
• Efficiency
− Elastic scale
− Reduce overall cost
• Multi-Tenancy
• Availability
• Security & Compliance
• Software Enabled Data Centers
What do our businesses need?
Cloud at eBay Inc
[Diagram: Global Orchestration spanning the eBay Inc cloud; four Region/DC blocks; Availability Zones (AZ) containing OpenStack controllers and Nova cells; Identity & Image Management.]
• Private Cloud for all eBay Inc properties
• Global Orchestration with traffic and load balancing
• Identity Management
− Region level (eventually global)
• Image Management
− Region level
• Nova/Cinder/Neutron
− Availability Zones
− Active/Active servers
• Trove
• Zabbix for monitoring
• All services run behind a load balancer VIP
Deployment Patterns
• Shared Cloud between tenants
• Different types of tenants
− eBay production, PayPal (PP) production, StubHub, GSI Enterprise, etc.
− Dev/QA
− Sandbox environment, internal tenants (IT, VPCs)
• Production Traffic
− All bridged and no overlays
− No DHCP
• Dev/QA and some of the internal tenants
− Overlays
− DHCP
Deployment Patterns (contd.)
[Diagram: gateway nodes and physical racks.]
• Hypervisor Scale Out
• Overlay Networks
• Bridged Networks
• Neutron Services (DHCP, Metadata, API server)
• SDN Controllers
• Network Gateway Nodes
• Upgrade
Areas With Scale Issues
Hypervisor Scale Out
[Diagram: multiple instances of Nova API, Nova Cells, Nova Scheduler, Neutron API, DHCP Agent, and SDN Controller.]
• Several hundred hypervisors in a cell
• Multiple cells in an AZ
• Several thousand hypervisors in an AZ
• Nova cells mitigate hypervisor scale
• Neutron scaling
− Majority of the hypervisors support bridged VMs
− Hybrid mode with both overlay and bridged VMs
Hypervisor Scale Out
[Diagram: a network virtualization layer over L2 and L3 networks; one tenant's VMs on an overlay network, another tenant's VMs on a bridged network.]
Bridged and Overlay
• Overlay technology
− VXLAN
− STT
• Handling BUM traffic
− ARP
− Unknown unicast
− Multicast
• Logical switches and routers
• Distributed L3 routers
− Direct tunnels from hypervisor to hypervisor
• Scale out deployment of Gateway nodes
Overlay Networks
• Keystone tokens
• Single-threaded Quantum/Neutron server
• DHCP servers
• Heal instance info cache interval
Neutron Services
Keystone Token Generation and Authentication
[Diagram: clients calling the Nova, Neutron, Cinder, and Image servers, with a Keystone server handling tokens.]
• UUID-based token
− Needs to be authenticated by the Keystone server for every call
• PKI-based token
− Authenticated by the servers using Keystone certs
• Token caching
− Prevents unnecessary token creation
• Applies to UUID-based tokens
− 98% of tokens are generated by inter-API services
− 92% are Quantum/Neutron related
− Average of 25 to 30 tokens/sec created by Quantum/Neutron alone
− RPC call overhead, bloated token table
• Fix
− Use token caching (1 hour)
− Use PKI for the service tenant
− Reduces network chatter and improves performance
• OpenStack bugs
− Bug ID: 1191159
− Bug ID: 1250580
Token Caching
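The one-hour token caching fix can be pictured with a small sketch. This is not the code used at eBay or in any OpenStack client; `CachedToken` and the `fetch` callable are hypothetical names, and `fetch` stands in for whatever call actually asks Keystone for a new token.

```python
import time
from typing import Callable, Optional


class CachedToken:
    """Reuse one service token until its time-to-live runs out."""

    def __init__(self, fetch: Callable[[], str], ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch        # hypothetical: requests a new token from Keystone
        self._ttl = ttl_seconds    # 1 hour, the value given on the slide
        self._clock = clock        # injectable so the cache is easy to test
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return the cached token, fetching a new one only after expiry."""
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            self._token = self._fetch()
            self._expires_at = now + self._ttl
        return self._token
```

With a cache like this in front of every service-to-service call, a caller that previously created a token per request would create roughly one per hour, which is the reduction in Keystone traffic and token-table growth that the slide describes.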
• Prior to Havana, one API thread handled both REST calls and RPC calls
• We split the API into two threads
− One handles REST API calls
− The other handles RPC calls
• Havana fixes
− DHCP renewals are no longer handled by the Neutron server; dhcp_release is used instead
− Multi-worker support
Neutron Multi Worker
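The two-thread split can be illustrated with a minimal sketch. It uses plain OS threads and standard-library queues purely for illustration; it is not Neutron's server code, and `worker` and `run_split_server` are hypothetical names. The point is only that REST and RPC work drain from separate queues, so a backlog on one side does not hold up the other.

```python
import queue
import threading
from typing import List, Tuple


def worker(inbox: queue.Queue, handled: List[str]) -> None:
    """Drain one queue until a None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:
            break
        handled.append(item)  # stand-in for actually processing the call


def run_split_server(rest_calls: List[str],
                     rpc_calls: List[str]) -> Tuple[List[str], List[str]]:
    """Handle REST and RPC work on separate threads; return what each handled."""
    rest_q: queue.Queue = queue.Queue()
    rpc_q: queue.Queue = queue.Queue()
    rest_done: List[str] = []
    rpc_done: List[str] = []
    threads = [
        threading.Thread(target=worker, args=(rest_q, rest_done)),
        threading.Thread(target=worker, args=(rpc_q, rpc_done)),
    ]
    for t in threads:
        t.start()
    for call in rest_calls:
        rest_q.put(call)
    for call in rpc_calls:
        rpc_q.put(call)
    rest_q.put(None)  # sentinels tell each worker to stop
    rpc_q.put(None)
    for t in threads:
        t.join()
    return rest_done, rpc_done
```

Multi-worker support goes a step further than this sketch by running several such servers side by side rather than one thread per traffic type.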
• All Nova compute nodes regularly poll the Neutron server to get network info for the instances running on that node
• Default is 10 seconds
• Hundreds of hypervisors and tens of thousands of VMs add up
− Even though only one instance is checked per interval
• We adjusted the interval to 600 seconds
Heal Instance Info Cache Interval
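A rough back-of-the-envelope sketch shows why the interval matters. It assumes, as the slide states, that each compute node refreshes one instance per interval. The hypervisor and VM counts used below are illustrative, not figures from the talk. In Nova this setting corresponds to the `heal_instance_info_cache_interval` option.

```python
def poll_rate(hypervisors: int, interval_seconds: float) -> float:
    """Requests per second reaching the Neutron server when every
    compute node refreshes one instance per interval."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return hypervisors / interval_seconds


def full_refresh_seconds(vms_per_hypervisor: int, interval_seconds: float) -> float:
    """Time for one compute node to cycle through all of its instances."""
    return vms_per_hypervisor * interval_seconds


# Illustrative numbers only: 3000 hypervisors with 20 VMs each.
before = poll_rate(3000, 10)    # 300.0 requests/sec at a 10-second interval
after = poll_rate(3000, 600)    # 5.0 requests/sec at a 600-second interval
```

The trade-off is staleness: under the same assumptions, `full_refresh_seconds(20, 600)` is 12000 seconds, so a longer interval means each instance's cached network info is re-verified far less often.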
• The most common source of problems
• We employ multiple strategies
− DHCP active/standby
− Planning to support DHCP active/active
− No DHCP/Config Drive option
• Production Environment
− No DHCP
− Config drive management
− Requires cloud-init-aware images
DHCP Scaling
SDN Controllers
[Diagram: three SDN Controller nodes, three Neutron API servers, a Nova API server, and three OS Ctrl nodes.]
• Only with overlay networks
• Scale out architecture
• Problems with high CPU utilization
• Number of flows in the gateway node
• East-west traffic also hitting VIPs
− Load balancer running as an appliance on a hypervisor
− Using SNAT
• OVS Enhancements
− Use megaflows in openvswitch
− Multi-core version of ovs-vswitchd
Network Gateway Nodes
• Prior to OVS 1.11, the kernel module supported only exact-match flows
• Megaflows introduce wildcarding in the kernel module
• Fewer misses and punts to user space
• Reduced number of flows in the kernel
• Requires OVS 1.11 or greater
• Cons
− Using security groups nullifies the effects of megaflows
OVS Improvements
Megaflows
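A toy model can make the megaflow effect concrete. It is not Open vSwitch code: `count_upcalls` and the three header fields are hypothetical, and each cache miss stands for one punt to user space. The last case shows one plausible reading of the security-group drawback, namely that rules matching on L4 ports force the cache to key on those ports again.

```python
from typing import Dict, Iterable, Tuple

Packet = Dict[str, int]  # header field name -> value


def count_upcalls(packets: Iterable[Packet], cache_fields: Tuple[str, ...]) -> int:
    """Count cache misses when the kernel flow cache keys its entries
    only on the header fields listed in cache_fields."""
    cache = set()
    misses = 0
    for pkt in packets:
        key = tuple(pkt[f] for f in cache_fields)
        if key not in cache:
            misses += 1      # miss: user space decides and installs an entry
            cache.add(key)
    return misses


ALL_FIELDS = ("dst_ip", "src_port", "dst_port")

# 1000 connections to one destination, each from a different source port.
traffic = [{"dst_ip": 10, "src_port": 40000 + i, "dst_port": 80}
           for i in range(1000)]

exact_match = count_upcalls(traffic, ALL_FIELDS)     # exact-match cache
megaflow_l3 = count_upcalls(traffic, ("dst_ip",))    # rules consult only dst_ip
megaflow_l4 = count_upcalls(traffic, ALL_FIELDS)     # rules also consult L4 ports
```

In this model the exact-match cache takes one miss per connection, the wildcarded cache takes a single miss for the whole burst, and once L4 ports must be matched the wildcarded cache behaves like the exact-match one.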
OVS Improvements
Multi-Core
[Diagram: vswitchd in user space above the kernel module in kernel space, compared for OVS < 2.0 and OVS >= 2.0, with vswitchd spread across CPU cores.]
• VPC model
• Neutron Tagging Blueprint
− Network assignment for bridged VMs
− Network selection for VPC tenants
− Network scheduling
− Additional metadata information
• Blueprint
− https://blueprints.launchpad.net/neutron/+spec/network-tagging
Future Work
• There are two primary ways to plumb a VM into a network
− Pass the net-id to nova boot
− Create a port and pass it to nova boot
• Nova schedules the instance without much knowledge about the underlying network
• The blueprint (BP) proposes to address this issue as one of its use cases
Network Assignment in Bridged VMs
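One way to picture tag-based network assignment is the sketch below. It does not reflect the blueprint's actual API, which the slides do not show; `Network`, `pick_network`, and the use of a zone label as a tag are all hypothetical. The idea is that once the scheduler has chosen a host, the network whose tag matches the host's zone is selected for the VM.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Network:
    """A network with free-form tags (hypothetical model)."""
    net_id: str
    tags: List[str] = field(default_factory=list)


def pick_network(host_zone: str, networks: List[Network]) -> Optional[str]:
    """Return the id of the first network tagged with the host's zone,
    or None when no network carries that tag."""
    for net in networks:
        if host_zone in net.tags:
            return net.net_id
    return None


# Four networks, each tagged with the zone of the rack it serves.
networks = [Network("N%d" % i, ["FZ%d" % i]) for i in range(1, 5)]
chosen = pick_network("FZ3", networks)   # "N3"
```

Returning None for an unknown zone leaves the caller free to fail the boot request or fall back to another policy.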
[Diagram: four racks (Rack 1 to Rack 4), each with its own network (N1 to N4) and zone (FZ1 to FZ4), and a VM being placed.]
Network Tagging
• Know your requirements
• Understand your size and scale
• Pick the SDN controller based on your needs
• Design with multiple failure domains
• Overlay, Bridged or Hybrid
• Monitor your cloud for performance degradation
Summary
Thank you.
Yes. We are hiring!
dl-ebay-cloud-hiring@corp.ebay.com
