Hybrid clouds provide a good balance between the privacy offered by private clouds and the elasticity and reliability of public clouds. The presentation offers an introduction to the decision criteria when switching from a private to a hybrid cloud architecture and where to start from.
2. THE CASE FOR HYBRID CLOUD
• Large resource allocation variance. E.g. Netflix reaches
peak traffic in evenings, weekends & Christmas.
• Disaster Recovery: a replica environment in stand-by
mode in the public cloud.
• Added-value services expensive to implement in-house
at scale.
3. WHICH SERVICES BELONG TO THE PUBLIC CLOUD?
• Services that exhibit large usage variance:
1. Provision entirely in the public
cloud.
2. Overflow to the public cloud
(not practical with frequent
updates and strict synchronization
requirements unless collocated).
[Diagram labels: Client, LB, Fixed Resource Cloud, Elastic Cloud]
• Services that are expensive to implement in-house and difficult to scale or to keep at 99.9% uptime.
• Services with heavy interaction should stick together whether in public or private cloud.
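The overflow option listed on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the `Pool` class, the `route` function and the capacity numbers are hypothetical, and a real load balancer would also track request completion and health.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pool:
    """A group of servers; capacity=None models an elastic (public) pool."""
    name: str
    capacity: Optional[int]
    in_flight: int = 0

    def has_room(self) -> bool:
        return self.capacity is None or self.in_flight < self.capacity


def route(private: Pool, public: Pool) -> str:
    """Prefer the private pool; overflow to the public pool when it is full."""
    target = private if private.has_room() else public
    target.in_flight += 1
    return target.name


private = Pool("private", capacity=2)   # hypothetical fixed-resource cloud
public = Pool("public", capacity=None)  # hypothetical elastic cloud

# The first two requests stay private, the rest overflow.
print([route(private, public) for _ in range(4)])
```

The sketch deliberately ignores data synchronization, which is the reason the slide calls overflow impractical for services with frequent updates.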
4. WHICH SERVICES BELONG TO THE PRIVATE CLOUD?
• Services with unique security requirements.
• Services with unique privacy requirements.
• Cost: very complicated to evaluate upfront, since cloud prices
drop about 20% / year.
5. SERVICES LIKELY TO EXHIBIT USAGE VARIANCE
• Media streaming.
• Analytics, which usually run at the end of the day (EoD).
• All services during a corporate event such as the
launch of a new product line.
6. SERVICES EXPENSIVE TO MAINTAIN IN-HOUSE
• Geographically distributed private clouds
• Content Delivery Networks (CDNs)
• Media Transcoding
• GPU cloud: very high cost upfront
7. DISASTER RECOVERY
• Define the recovery time objective (RTO: how much
downtime is acceptable) and the recovery point objective
(RPO: how much data loss is acceptable).
• Replicate VMs to cloud storage.
• Replicate DBs to cloud DBs.
• Create a deployment configuration ready to
launch when disaster hits (cold-standby) or a small
set of VMs that are always live (warm-standby).
• Avoid early fail-over to the DR environment, as it will
aggravate the damage. Fail over only when:
RPO + RTO < Recovery Time of Master.
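The fail-over rule on this slide can be written as a one-line check. This is a minimal sketch: the function name and the example durations are assumptions for illustration, and the estimate of the master's recovery time has to come from whoever is handling the incident.

```python
from datetime import timedelta


def should_fail_over(rpo: timedelta, rto: timedelta,
                     master_recovery_estimate: timedelta) -> bool:
    """True when failing over costs less than waiting for the master.

    rpo + rto approximates the total cost of switching to the DR
    environment (data lost plus downtime); compare it to the estimated
    time needed to bring the master back.
    """
    return rpo + rto < master_recovery_estimate


rpo = timedelta(minutes=15)  # hypothetical objective
rto = timedelta(hours=1)     # hypothetical objective

# Master expected back in 30 minutes: waiting is cheaper.
print(should_fail_over(rpo, rto, timedelta(minutes=30)))
# Master expected back in 4 hours: fail over.
print(should_fail_over(rpo, rto, timedelta(hours=4)))
```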
8. DESIGN FOR THE CLOUD
• Loose coupling:
• Use publish/subscribe for service interactions.
• Adopt share-nothing architectures, they scale better.
• Fail fast: show an error immediately rather than a spinning wheel for 10 mins and then
an error.
• Favor a network of interconnected micro-services rather than monolithic app designs.
• Favor automatic recovery rather than focus on diagnostics and logging.
• Get familiar with cloud automation tools (Puppet, Chef, Pallet, AWS CloudFormation,
etc.).
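To make the publish/subscribe bullet concrete, here is a minimal in-process sketch. The `Broker` class, the topic name and the handlers are hypothetical; a real deployment would put a message broker between the services, but the decoupling idea is the same: the publisher never references its consumers.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[dict], None]


class Broker:
    """Toy message broker: topics map to lists of subscriber callbacks."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> int:
        """Deliver message to every subscriber; return how many received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = Broker()
billed: List[int] = []
emailed: List[int] = []

# Two independent services react to the same event.
broker.subscribe("order.created", lambda msg: billed.append(msg["order_id"]))
broker.subscribe("order.created", lambda msg: emailed.append(msg["order_id"]))

delivered = broker.publish("order.created", {"order_id": 42})
print(delivered, billed, emailed)
```

Adding a third consumer needs only another `subscribe` call; the publishing service stays untouched.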
9. DEPLOY FOR THE CLOUD
• The Docker revolution: no more “works
on my PC” statements.
• Clear release path:
• QA certifies Docker containers and
pushes them to the repository.
• DevOps perform rolling updates of the
published containers.
• Amazon, Google & Red Hat have all adopted it. It will probably
affect everybody by mid next year.
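The rolling update mentioned on this slide can be sketched as a loop over small batches. This is an illustration only: `rolling_update`, the instance names and the image tags are hypothetical, and the health check is passed in as a plain function rather than calling any real container tooling.

```python
from typing import Callable, Dict, List


def rolling_update(instances: Dict[str, str], new_image: str,
                   batch_size: int,
                   is_healthy: Callable[[str], bool]) -> List[str]:
    """Switch instances to new_image one batch at a time.

    If any instance in a batch fails its health check, that batch is
    rolled back and the update stops, so the remaining instances keep
    serving the old image. Returns the names that were updated.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    names = list(instances)
    updated: List[str] = []
    for start in range(0, len(names), batch_size):
        batch = names[start:start + batch_size]
        previous = {name: instances[name] for name in batch}
        for name in batch:
            instances[name] = new_image
        if not all(is_healthy(name) for name in batch):
            instances.update(previous)  # roll the failed batch back
            break
        updated.extend(batch)
    return updated


fleet = {"web-1": "app:1.0", "web-2": "app:1.0", "web-3": "app:1.0"}
done = rolling_update(fleet, "app:1.1", batch_size=1,
                      is_healthy=lambda name: name != "web-3")
print(done, fleet)
```

In the usage above the hypothetical `web-3` fails its check, so it stays on the old image while the first two instances run the new one.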
10. HISTORIC PRICES
• Storage cost dropped 84% in 5 years.
[Chart: storage price in $ / GB (axis 0 to 0.3) for AWS, Google and Microsoft; years 2006, 2012, 2014]
• VM cost dropped 56% in 3 years.
[Chart: price in $ / hour (axis 0 to 0.7) for AWS m3.xlarge (15GB, 4 cores); 2012, 2013, 2014, Sep 2014]
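The two headline figures on this slide cover different periods, so it helps to convert them to a yearly rate. The sketch below assumes a constant compound drop each year, which is a simplification (real price cuts arrive in steps); the function name is made up for illustration.

```python
def annual_drop(total_drop: float, years: float) -> float:
    """Constant yearly drop that compounds to total_drop over years.

    If a fraction (1 - total_drop) of the price remains after `years`,
    the yearly remaining fraction is its years-th root.
    """
    remaining = 1.0 - total_drop
    return 1.0 - remaining ** (1.0 / years)


storage = annual_drop(0.84, 5)  # storage: 84% in 5 years
vm = annual_drop(0.56, 3)       # VM: 56% in 3 years

print(f"storage: {storage:.1%} per year")
print(f"VM: {vm:.1%} per year")
```

Under that assumption the figures work out to roughly 31% per year for storage and 24% per year for VMs.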
11. CASE STUDY: MAPMYFITNESS.COM
• Allows users to map, record and share their exercise routes and workouts online
(400,000 activities logged daily).
• 17 million users. Peak traffic during weekends and sport events (Tour de France, etc.).
• Both private & public cloud hosted with Rackspace.
• Hybrid cloud used for:
• Overflow traffic.
• Testing and development.
• Hosting event-type websites that are shorter lived.
12. NEXT STEPS
• Move your public content to a CDN.
• Build a DR environment in Amazon or Google and maintain
it.
• Use AWS Route 53 (DNS) to redirect traffic to geo-local
data centers.
• Deploy to more than one cloud provider: Amazon suffered an outage on
Christmas Eve of 2012 that caused downtime for Netflix.
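The multi-provider advice can be illustrated with a simple preference-ordered fallback. This is a sketch under assumptions: the endpoint names are placeholders, and the health probe is injected as a function, so no real DNS or provider API is involved.

```python
from typing import Callable, Optional, Sequence


def pick_endpoint(endpoints: Sequence[str],
                  is_up: Callable[[str], bool]) -> Optional[str]:
    """Return the first endpoint, in order of preference, that is up."""
    for endpoint in endpoints:
        if is_up(endpoint):
            return endpoint
    return None  # every provider is down


# Hypothetical deployments of the same service on two providers.
endpoints = ["app.provider-a.example.com", "app.provider-b.example.com"]
down = {"app.provider-a.example.com"}  # simulate an outage at provider A

print(pick_endpoint(endpoints, lambda e: e not in down))
```

With provider A marked as down, traffic goes to provider B; if both were down the function returns `None` and the caller has to surface the error.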