1. Open Cloud Networking Vision
The state of OpenStack networking and a vision of things to come...
Dan Sneddon
Member Technical Staff
Twitter: @dxs
OCS 2.0
Public Cloud Benefits | Private Cloud Control | Open Cloud Economics
Bay Area Network Virtualization Meetup – Dec 2012
CCA - NoDerivs 3.0 Unported License - Usage OK, no modifications, full attribution 1
Thursday, December 13, 12
2. Our Journey Today
1. Cloudscaling Introduction
2. Elastic Clouds and Private Hybrid Clouds
3. Advanced Networking
4. Future Vision
5. Network-Based Service Resiliency
2
Thursday, December 13, 12
3. Elastic Clouds
Bay Area Network Virtualization Meetup – Dec 2012
CCA - NoDerivs 3.0 Unported License - Usage OK, no modifications, full attribution
3
Thursday, December 13, 12
5. Two Cloud Infrastructure Models
1
Enterprise
Virtualization
Legacy Apps
4
Thursday, December 13, 12
6. Two Cloud Infrastructure Models
1 2
Enterprise Elastic
Virtualization Infrastructure
New
Legacy Apps Dynamic Apps
4
Thursday, December 13, 12
7. Elastic Cloud vs Enterprise
Virtualization
Enterprise Virtualization Elastic Cloud
Applications Traditional & Legacy Dynamic
Scaling Architecture Managed Silos Horizontal
Technology Stack Heavy & Proprietary Distributed & Open
Price/Performance Low High (4-7x better)
Failure Domains Large Small
Provisioning Slower & Manual Faster & 100% API
Server consolidation and lower On-demand, scale-out
Best For:
datacenter mgmt costs infrastructure for new apps
5
Thursday, December 13, 12
8. Public Elastic Clouds Are Not
Enough
Why?
• Loss of control/governance
• Security concerns
• Integration w/ existing systems
• Data privacy/patriation issues
• Expensive at scale
• Limited performance options
• Geographic reach
Thursday, December 13, 12
9. Wanted: Private Elastic Clouds
Dynamic
Applications
Public Benefits Private Control
7
Thursday, December 13, 12
11. Wanted: Open Economics/Choice
• No vendor lock-in
• Open source software
• Standard interfaces
• Commodity hardware
• Community/rapid innovation
9
Thursday, December 13, 12
12. Can I Build A Private Elastic
Cloud Myself?
Yes, but painfully
• Highly complex
• Costly to build and maintain
• Not where the value is
• No innovation path
• Not production-grade
• Requires special skills
10
Thursday, December 13, 12
13. Our Solution:
Open Cloud System
Bay Area Network Virtualization Meetup – Dec 2012
CCA - NoDerivs 3.0 Unported License - Usage OK, no modifications, full attribution
11
Thursday, December 13, 12
14. Our Product
Open Cloud System 2.0
The most reliable, scalable and production-grade
Powered by solution for private elastic clouds powered by
OpenStack technology.
12
Thursday, December 13, 12
15. OpenStack Elastic Cloud
Block Object Advanced
Compute
Storage Storage Networking
Nova Cinder Swift / Hadoop Quantum
13
Thursday, December 13, 12
16. Production-Grade Focus
Performance Can the system guarantee quality of service,
responsiveness, scalability, and economic performance?
Availability Does the system offer redundancy, resiliency, fault
isolation, graceful degradation, and scale-out
engineering?
Security Are best practices of default deny, least privilege, a
minimal attack surface, and encryption/data privacy
followed?
Maintainability Does the system provide a reliable upgrade path for
new enhancements & system updates, transparency &
measurability of performance, comprehensive lifecycle
management, and predictable behavior?
14
Thursday, December 13, 12
17. OCS is Production OpenStack
OCS is more than just software ... it’s OpenStack release synchronicity, Cloudscaling
innovations, a compelling roadmap, community involvement & production support
from a team with deep operational experience.
OCS is open core software that
tracks the OpenStack release
Virtual Machines (Nova)
cycle closely.
Object Storage (Swift)
VM Image Management (Glance)
Identity Service (Keystone)
Future
Hardware Lifecycle Mgmt
Topology & Data Model
Block-based Architecture Folsom
Security Hardening
Essex
Scale-Out Networking
Service Redundancy
OpenStack ecosystem contributions
aws-compat, zeromq-rpc-driver, nova-gerrit-monitor,
Public Cloud Compatibility
tarkin, cs-nova-simplescheduler, sheep
15
Thursday, December 13, 12
18. Advanced Networking
Bay Area Network Virtualization Meetup – Dec 2012
CCA - NoDerivs 3.0 Unported License - Usage OK, no modifications, full attribution
16
Thursday, December 13, 12
19. OCS Advanced Networking
Layer 3 Networking Plugin
Distributed NAT Service
Network Definition API
Tenant Isolation
Distributed Load Balancing
17
Thursday, December 13, 12
21. Brokerless Messaging With ZeroMQ
Avoiding RabbitMQ’s Single Point Of Failure
Nova-Compute
Single Point
Of Failure
RabbitMQ
Broker
Nova-Scheduler Nova-API
RabbitMQ
(Brokered)
19
Thursday, December 13, 12
22. Brokerless Messaging With ZeroMQ
Avoiding RabbitMQ’s Single Point Of Failure
Nova-Compute Nova-Compute
Single Point
Of Failure
RabbitMQ
Broker
Nova-Scheduler Nova-API Nova-Scheduler Nova-API
RabbitMQ vs. ZeroMQ
(Brokered) (Peer To Peer)
20
Thursday, December 13, 12
23. Quantum Development
Bay Area Network Virtualization Meetup – Dec 2012
CCA - NoDerivs 3.0 Unported License - Usage OK, no modifications, full attribution
21
Thursday, December 13, 12
24. OpenStack Networking
Nova Manages VMs, Quantum Manages Network
22
Thursday, December 13, 12
25. Quantum Architecture
Unified Open APIs For Advanced Networking
* Image From OpenStack Documentation
23
Thursday, December 13, 12
26. Quantum in Folsom (2012)
• Layer 3 plugin to Nova Networking
• DHCP service plugin
• Network Address Translation (NAT)
• No overlapping IP assignments
• Very limited multi-tenant security
24
Thursday, December 13, 12
27. Quantum in Grizzly (2013)
• Complete Rewrite of Quantum API (v3)
• Security Group Enhancements
• Distributed Firewall Configuration API
• Distributed Load Balancer Services API
• VPN Services API
• Better integration with OpenFlow Controllers
25
Thursday, December 13, 12
28. Quantum Compatibility
Lots Of Choices For Virtual Network/SDN Providers
•Open vSwitch. http://www.openvswitch.org/openstack/documentation
•Nicira NVP. quantum/plugins/nicira/nicira_nvp_plugin/README and http://
www.nicira.com/support.
•Midokura. http://www.midokura.com/midonet/openstack/
•BigSwitch. http://www.bigswitch.com/sites/default/files/sdn_resources/
openstack_aag.pdf
•Cisco. quantum/plugins/cisco/README and http://wiki.openstack.org/cisco-
quantum
•Linux Bridge. quantum/plugins/linuxbridge/README and http://
wiki.openstack.org/Quantum-Linux-Bridge-Plugin
•Ryu. quantum/plugins/ryu/README and http://www.osrg.net/ryu/
using_with_openstack.html
•NEC OpenFlow. http://wiki.openstack.org/Quantum-NEC-OpenFlow-Plugin
26
Thursday, December 13, 12
29. Network Virtualization
Bay Area Network Virtualization Meetup – Dec 2012
CCA - NoDerivs 3.0 Unported License - Usage OK, no modifications, full attribution
27
Thursday, December 13, 12
30. Network Virtualization Use Case
Isolation and Network Tunnels in Multi-Tenant VM Host
28
Thursday, December 13, 12
31. Origin Of Virtual Networking
It makes more sense if you know where it came from
• Despite the appearance of new technologies, virtual
networking and SDN have been around a long time
• Prior to the latest protocols, virtual networking was
achieved through creative use of VLANs, proxies, dynamic
routing, GRE tunnels, and expect scripts.
• MPLS was an attempt by Cisco and the telcom providers to
do managed virtual networking
• SDN isn’t new either, e.g. Ciscoworks, NetConf, Opnet
29
Thursday, December 13, 12
32. Future Of Virtual Networking
A few bold predictions
• Increased adoption of OpenFlow is an obvious inevitability
• The biggest change on the horizon: virtualized hardware
• OSS soon to be a viable option for elastic cloud networking
• Networking will be about programming, not configuring
• It is already easier to do virtual networking at scale, but
soon it will also be cheaper, leading to massive disruption
30
Thursday, December 13, 12
33. The Path To The Future
• Virtualizing network hardware: VRFs, VM support on routers,
VXLANs, virtual instances
• SDN needs to be a software and hardware approach, device
configuration should be done through common APIs
• A dramatic shift away from HA and failover, and toward distributed
services with smaller failure domains and expectation of failure
31
Thursday, December 13, 12
34. Service Resiliency
Bay Area Network Virtualization Meetup – Dec 2012
CCA - NoDerivs 3.0 Unported License - Usage OK, no modifications, full attribution
32
Thursday, December 13, 12
35. What Do We Mean By “HA”?
We mean what most people mean ...
Two servers or network devices that look like one
33
Thursday, December 13, 12
36. “HA HA”?
HA pairs come in a couple flavors
Active / Passive
34
Thursday, December 13, 12
37. “HA HA”?
People like this flavor best, but it’s not always possible...
Active / Active
35
Thursday, December 13, 12
38. “HA HA HA HA HA”??
Many people wish they could get it more like this ...
HA cluster aka ‘massive operational nightmare’
36
Thursday, December 13, 12
39. What is Scale-out?
A B
A B C D N
A B
Scale-up - Make boxes Scale-out - Make moar
bigger (usually an HA pair) boxes
37
Thursday, December 13, 12
40. Scaling out is a mindset
Scaling up is like treating your servers as pets
bowzer.company.com web001.company.com
Servers *are* cattle
38
Thursday, December 13, 12
41. “HA” Pairs Are an All-in Move
They better not fail ...
39
Thursday, December 13, 12
42. Risk Reduction
Many small failure domains is usually better
40
Thursday, December 13, 12
43. Big failure domains vs. small
Would you rather have the whole cloud down or just a
small bit for a short period of time?
Still a scale-up pattern ...
wouldn’t you rather scale-out?
41
Thursday, December 13, 12
44. What’s Usually an “HA” Pair in OpenStack?
Everything ...
Service Endpoints Messaging System
(APIs) (RPC)
Worker Threads
Database
(e.g. Scheduler,
(MySQL)
Networking)
42
Thursday, December 13, 12
45. What needs to be an HA pair?
Not much needs state synchronization
Service Endpoints Messaging System
(APIs) (RPC)
Worker Threads
Database
(e.g. Scheduler,
Networking) (MySQL)
43
Thursday, December 13, 12
50. Service Distribution
High Availability Without Compromise
Resilient Stateless Scale-out
48
Thursday, December 13, 12
51. Service Distribution
Combines Standard Networking Technologies
router ospf
OSPF /etc/quagga/ospfd.conf ospf router-id 10.1.1.1
network 10.1.255.1 area 0.0.0.0
interface lo:2
Anycast /etc/quagga/zebra.conf description Pound listening address
ip address 10.1.255.1/32
ListenHTTP
Address 10.1.255.1
Port 8774
Load- xHTTP
Service
BackEnd
1
Balancing /etc/pound/pound.conf
End
Address 10.1.1.1
Port 8774
Proxy BackEnd
Address 10.1.1.2
Port 8774
End
End
End
49
Thursday, December 13, 12
52. Resilient OpenStack
Horizontally Scalable, No Single Point Of Failure
Service Distribution ZeroMQ
Service Endpoints Messaging System
(APIs) (RPC)
Service Distribution MMR + HA
Worker Threads Database
(e.g. Scheduler,
Networking) (MySQL)
Thursday, December 13, 12
53. Service Distribution Advantages
What Makes This a Superior Solution?
• True horizontal scalability with no centralized controller
• Services are always running, failover is nearly instant
• Reduced complexity, fewer idle resources
• No need for separate load balancers
Server Server Server Server Server Server Server
...
Failover vs. Distributed Services
51
Thursday, December 13, 12
54. Site Failover and Global LB
Service Distribution Works With Multiple Sites
• Traditional HA pairs do not support cross-site resiliency
• Service Distribution fail across sites without DNS redirections
52
Thursday, December 13, 12
55. Service Distribution in Action
Example: Distributed Load Balancing
1) OSPF
OSPF Router(s)
OSPF OSPF
advertisement advertisement
V
Quagga Quagga
HTTP Proxy HTTP Proxy
53
Thursday, December 13, 12
56. Service Distribution in Action
Example: Distributed Load Balancing
1) OSPF
OSPF Router(s)
2) ECMP Per-flow
Load Balancing
OSPF OSPF
advertisement advertisement
Per-Flow
Load
3) Load-balancing V
Balancing
Quagga Quagga
HTTP Proxy
HTTP Proxy HTTP Proxy
54
Thursday, December 13, 12
57. Service Distribution in Action
Example: Distributed Load Balancing
1) OSPF
OSPF Router(s)
2) ECMP Per-flow
Load Balancing
OSPF OSPF
advertisement advertisement
Per-Flow
Load
3) Load-balancing V
Balancing
Quagga Quagga
HTTP Proxy
HTTP Proxy HTTP Proxy
4) Unlimited #
of Back-End
Servers
Server Server Server Server
55
Thursday, December 13, 12
58. Failure Resiliency
Client Client Client Client
1 2 3 4
1 2 3 4
Load Balancer/
Load Balancer/ Load Balancer/ Load Balancer/ Load Balancer/
Proxy
Proxy Proxy Proxy Proxy
10%
Server Server Server Server Server Load
Each
Server Server Server Server Server Server
56
Thursday, December 13, 12
59. Failure Resiliency
Client Client Client Client
1 2 3 4
1 12 3 4
X
Load Balancer/
Load Balancer/ Load Balancer/ Load Balancer/ Load Balancer/
Proxy
Proxy Proxy Proxy Proxy
10%
Server Server Server Server Server Load
Each
Server Server Server Server Server Server
57
Thursday, December 13, 12
60. Failure Resiliency
Client Client Client Client
1 2 3 4
1 2 3 4
Load Balancer/
Load Balancer/ Load Balancer/ Load Balancer/ Load Balancer/
Proxy
Proxy Proxy Proxy Proxy
10%
X
Server
Server
Server
Server
Server
Server
Server
Server
Server
Server
Increased
Server
Load
58
Thursday, December 13, 12
61. CCA - NoDerivs 3.0 Unported License - Usage OK, no modifications, full attribution
59
Thursday, December 13, 12