6. Google Confidential and Proprietary
Google Compute Engine - VMs in Google Datacenters
● Public Preview - May 2013
● General Availability - December 2013
11.
Hadoop on GCE + Cloud Storage (GCS) Connector
Accenture:
Cloud vs. Bare-Metal
● Cloud-based Hadoop deployments offer better price-performance ratios than bare-metal
● Cloud's virtualization expands performance-tuning opportunities
● Using remote storage outperforms local-disk HDFS
12.
Data in GCS, Lipstick DB in Cloud SQL
[Architecture diagram: Google Cloud Platform running a Hadoop Master (MapReduce JobTracker), several Hadoop Workers (MapReduce TaskTrackers), and a Lipstick Server; Input Data and Output Data live in GCS, the Lipstick Database in Cloud SQL.]
13.
● Netflix Lipstick on Google Compute Engine
https://cloud.google.com/developers/articles/netflix-lipstick-on-google-compute-engine
● GCS Connector for Hadoop
https://developers.google.com/hadoop/google-cloud-storage-connector
● Cloud-based Hadoop Deployments: Benefits and Considerations
http://www.accenture.com/SiteCollectionDocuments/PDF/Accenture-Cloud-Based-Hadoop-Deployments-Benefits-and-Considerations.pdf
● Apache Hadoop, Hive, and Pig on Google Compute Engine
https://cloud.google.com/developers/articles/apache-hadoop-hive-and-pig-on-google-compute-engine
Resources
24. @Answers4AWS
AMIs
• Initially built using my own scripts based on Eric Hammond's (@esh) work
• Then using Aminator
• Created Ubuntu Foundation AMIs
• Added the Ansible Provisioner for Aminator
• Put a couple of them on the AWS Marketplace for free
25.
CloudFormation
• One-click deploy
• Well, about 10 clicks going through the AWS Web Console wizard
• Designed to get you up and running quickly
• Test it out, see if you like it
• NOT production quality
• No real security
• No HA
30.
Beta users
• From a successful CI build
• To a Fully Baked AMI
• Use in Testing and Production
• Without you doing anything
• ZERO clicks
• Signups are open
33. History and Future
• 2012: SPECjEnterprise; AcmeAir cloud/mobile sample/benchmark born
• 2013: AcmeAir run on IBM Cloud at "Web Scale"; sample-application and portability cloud prize work
• 2014: Scalable Services Fabric internally for IBM Services; Scalable Services Fabric SaaS and on-prem? Codename: BlueMix
34. Scalable Service Fabric Work
Netflix OSS IBM port/enablement

Netflix "Zen" of Cloud
• Worked with initial services to enable cloud-native architecture
• Worked with initial services to enable NetflixOSS usages
• Created scorecard and tests for "cloud native" readiness

Highly Available IaaS and Cloud Services
• Deployment across multiple IBM SoftLayer IaaS datacenters and global and local load balancers
• Complete automation via IBM SoftLayer IaaS APIs
• Ensured facilities for automatic failure recovery

Micro-service Runtimes (Karyon, Eureka Client, Ribbon, Hystrix, Archaius)
• Ported to work with IBM SoftLayer IaaS and the WebSphere Liberty Profile application server
• Created "eureka-sidecar" for non-Java runtimes and ElasticSearch discovery

Netflix OSS Servers (Asgard, Eureka Server, Turbine)
• Ported to work with IBM SoftLayer IaaS + RightScale
• Operationalized HA and secure deployments for multiple service tenants

Adopted Chaos Testing
• Ported Chaos Monkey to IBM SoftLayer IaaS
• Performed manual Chaos Gorilla validation on services

Worked through devops tool chain
• Worked with initial services to enable continuous delivery with devops (and image baking via an Aminator-like tool)
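The Chaos Monkey port mentioned above boils down to a simple loop: pick a random instance from a group and terminate it through the cloud provider's API. A minimal Python sketch of that idea, where `terminate` is a hypothetical stand-in for the real API call (e.g. a SoftLayer client method), not an actual Chaos Monkey or SoftLayer interface:

```python
import random

def chaos_monkey(instances, terminate, rng=random):
    """Pick one random instance from a group and terminate it.

    `terminate` stands in for the real cloud API call; this is
    only a sketch of the technique, not the ported tool itself.
    """
    if not instances:
        return None  # nothing to kill in an empty group
    victim = rng.choice(instances)
    terminate(victim)
    return victim

# Example: record which instance would be terminated.
killed = []
victim = chaos_monkey(["i-1", "i-2", "i-3"], killed.append,
                      rng=random.Random(42))
```

Running this regularly against production groups is what forces services to survive instance loss, which is the "cloud native readiness" the scorecard above measures.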
35. Come meet the team!
Name (Twitter): talks about…
• Adolfo (@adolforod): API Management and Cloud Integration, user of NetflixOSS platform; appliances in the cloud
• Brian (@bkmartin): IBM BlueMix (PaaS), enabling composable apps in PaaS
• Darrell: IBM Research focusing on NetflixOSS devops and on-premise deployments
• David (@dcurrie): WebSphere Liberty Profile application server NetflixOSS development and PaaS integration
• Jonathan (@ma4jpb): NetflixOSS portability across many aspects; cloud messaging (in relation to Suro)
• Matt (@matrober): API Management, user of NetflixOSS platform; converted a service to be cloud native
• Rachel (@rreinitz): IBM Services, interested in helping you get to cloud native in SaaS and on-premise
• Ricky (@rickymoorhouse): API Management, user of NetflixOSS platform; creator of Imaginator
• Will (@auwilli98): API Management operations, user of NetflixOSS platform
45. Zeno
● In-memory data distribution platform
● Contains tools for:
○ data quality management
○ data serialization
● We use it to distribute and keep up to date
gigabytes of video metadata on tens of
thousands of servers across the globe
46. Zeno
Why in-memory data?
- Netflix serves billions of requests
per day
- Each request requires metadata
about many movies to answer
47. Zeno
Netflix Use Case:
● Gigabytes of in-memory data
● Hundreds of thousands of in-memory cache
requests per second, per application
instance
● Tens of thousands of application instances
48. Distribution
FastBlob:
Binary serialization of a complete
state of data, and/or the changes
in data over time.
Serialization format designed to
propagate, and keep up to date, a
large amount of in-memory data
across many servers.
Optimized for: memory GC effects,
memory footprint, data transfer
size, deserialization CPU usage
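FastBlob can ship either a complete state snapshot or just the changes in data over time. A toy Python illustration of that snapshot-plus-delta idea on a keyed data set (this is not Zeno's actual wire format, only the concept):

```python
def delta(old, new):
    """Compute the changes needed to move a keyed data set from
    `old` to `new`: keys to remove, and keys to add or update."""
    removed = [k for k in old if k not in new]
    updated = {k: v for k, v in new.items() if old.get(k) != v}
    return removed, updated

def apply_delta(state, removed, updated):
    """Apply a delta in place, producing the new state."""
    for k in removed:
        state.pop(k, None)
    state.update(updated)
    return state

# Two snapshots of video metadata; only the delta needs to travel.
v1 = {"movie1": {"title": "Up"}, "movie2": {"title": "Brave"}}
v2 = {"movie1": {"title": "Up", "year": 2009},
      "movie3": {"title": "Frozen"}}
removed, updated = delta(v1, v2)
state = apply_delta(dict(v1), removed, updated)
```

Servers that already hold yesterday's snapshot only download the (much smaller) delta, which is what keeps gigabytes of metadata current on tens of thousands of instances without re-shipping everything.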
52. Zeno Framework
Data Schema (Serializers)
Operation (SerializationFramework)
Input Data (POJOs)
Output
JsonSerializationFramework
HashSerializationFramework
DiffSerializationFramework
FastBlobStateEngine
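The layering above separates the data schema (Serializers) from the operation performed over it (SerializationFrameworks), so one schema can drive JSON output, hashing, diffing, or blob generation. A minimal Python sketch of that split, with illustrative names rather than Zeno's real classes:

```python
import hashlib
import json

# The "serializer" side: describe the data's fields once.
SCHEMA = ["id", "title", "year"]

def to_json(record):
    """A JsonSerializationFramework-style operation over the schema."""
    return json.dumps({f: record[f] for f in SCHEMA}, sort_keys=True)

def to_hash(record):
    """A HashSerializationFramework-style operation: a stable digest
    of the record, reusing the same schema with no extra wiring."""
    return hashlib.sha1(to_json(record).encode()).hexdigest()

movie = {"id": 1, "title": "Up", "year": 2009}
doc = to_json(movie)
digest = to_hash(movie)
```

Adding a new operation (say, a diff) touches no per-type code, and adding a field to the schema requires no change to any operation, which is the agility claim on the next slide.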
53. Zeno Benefits
Development Agility:
● Easy to evolve data model, no need to change serialization formats or
operation logic
● Easy to create new functionality, no need to think about data model
structure or semantics
● Included “Diff” tools support high data quality across releases without too
much effort
Resource Efficiency:
● Included “FastBlob” optimized for Netflix scale
● Ask about in-development functionality!
69. Dynomite
● Cross AZ & Region replication to existing Key Value
stores
○ memcached
○ Redis
● Thin Dynamo implementation provides the
replication
● Keep existing native KV protocol
○ No code refactoring
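"Keep existing native KV protocol" means a client sends Dynomite the exact same bytes it would send a plain Redis server; Dynomite handles replication behind that wire format. A small sketch of encoding a command in the Redis wire protocol (RESP) to make the point concrete:

```python
def resp_command(*parts):
    """Encode a command in the Redis wire protocol (RESP).

    A client talking to Dynomite emits exactly these bytes, the
    same as against a plain Redis server - hence no refactoring.
    """
    out = [b"*%d\r\n" % len(parts)]          # array header: arg count
    for p in parts:
        data = p.encode() if isinstance(p, str) else p
        out.append(b"$%d\r\n%b\r\n" % (len(data), data))  # bulk string
    return b"".join(out)

wire = resp_command("SET", "user:1", "alice")
```

Because the protocol is unchanged, existing Redis (or memcached) client libraries work against Dynomite as-is; only the endpoint changes.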
81. ● A splittable input format for SSTables
○ Needs fewer files from the cluster.
○ Faster: just deserializing/serializing the files.
● An input format for JSON
○ Allows incremental processing of backups
● A reducer that can compact SSTables.
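A "splittable" input format lets several mappers read disjoint byte ranges of one large file. The standard trick (as in Hadoop's line-based record readers): a split that starts mid-record skips ahead to the next record boundary, while the previous split reads one record past its end to compensate. A hedged Python sketch of that idea on newline-delimited JSON, not the deck's actual SSTable format:

```python
import io
import json

def read_split(data: bytes, start: int, end: int):
    """Read newline-delimited JSON records from one byte-range split.

    Convention (borrowed from Hadoop's line record readers): a split
    with start > 0 discards its partial first line, and every split
    reads one record past `end`, so each record is read exactly once.
    """
    buf = io.BytesIO(data)
    buf.seek(start)
    if start > 0:
        buf.readline()  # partial record: owned by the previous split
    records = []
    while buf.tell() <= end:
        line = buf.readline()
        if not line:
            break  # end of file
        records.append(json.loads(line))
    return records

data = b'{"k": 1}\n{"k": 2}\n{"k": 3}\n'
part1 = read_split(data, 0, 13)    # split boundary lands mid-record
part2 = read_split(data, 13, 27)
```

Each split is processed independently, yet the union of all splits yields every record exactly once, which is what lets MapReduce parallelize over a single large backup file.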