1. Architectural Considerations for Big Data Workloads on OpenStack
OpenStack Summit, Barcelona
October 27, 2016
Jonathan Chiang - Cloud Architect, Comcast
James Saint-Rossy - Principal Engineer, Comcast
3. Agenda
•What we do
•Comcast’s journey with OpenStack
•Big data use cases at Comcast
•Our application profiles
•Key Objectives of Modern workloads
•Disaggregated vs Hyper-Converged
•Recommended Approaches for the different use cases
•HDFS and S3 working together
4. A Fortune 50 Company Uniquely Positioned at the Intersection of Media and Technology
TV, Internet, Voice and Home
Cable Networks
Film
Broadcast Television
Theme Parks
5. Stretching the Comcast Elastic Cloud | Our Journey with OpenStack
•Petabyte of Memory and One Million vCPU Cores in 2016
•Multi-Petabyte Ceph Block and Object Storage
•Multi-Terabyte SSD Block Storage
•Deployed across 34 Regions
• National and Regional Data Centers
•Icehouse Release Today, Moving Directly to Mitaka
6. Community Contributions
•Lines of code: 95,000
•Commits: 1200
•Core Developers and Reviewers on Multiple Projects
•Since the Vancouver Summit (Kilo), Comcast has doubled its upstream contributions
7. Big Data Use Cases at Comcast
•Real-time Telemetry Data Streaming
•Image Recognition
•Statistical Data Analysis
•Machine Learning
•NoSQL Databases (Pulsar)
8. Application Profile – Kafka
• Designed for 100% sequential writes, with reads served from the OS page cache
• Writes: relatively low IOPS, high throughput, large block sizes, sequential
• Reads from disk are intermittent, occurring only when latent (lagging) consumers exist; when they do occur, they are typically random, small-block, high-IOPS reads
• Kafka is somewhat latency sensitive, but more tolerant than a NoSQL database, for example
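One way to approximate this write profile when qualifying candidate storage is an fio job file; the following is a sketch (job names and parameter values are illustrative, not the test plan from the talk):

```ini
; Approximates a Kafka broker's log-segment appends:
; sequential, large-block, throughput-bound, low IOPS.
[kafka-log-append]
rw=write            ; sequential writes, like log appends
bs=1m               ; large block size
ioengine=libaio
iodepth=4           ; modest queue depth: low IOPS, high throughput
direct=1            ; bypass the page cache to measure the device itself
size=10g
runtime=120
time_based
```

A second job with `rw=randread` and `bs=4k` would approximate the lagging-consumer read pattern described above.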
9. Application Profile – Pulsar
• Internal cloud NoSQL database
• Medium/high IOPS, small block sizes, random reads and writes
• Designed to support low-latency read and write use cases, and therefore latency sensitive
• The read/write mix and block size are use-case dependent; the typical observed distribution in a standard key-value cluster is 70% reads / 30% writes
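The 70/30 key-value mix above maps naturally onto an fio mixed-workload job; again a sketch with assumed parameter values, useful mainly for comparing latency percentiles across backends:

```ini
; Approximates the key-value cluster's mix:
; random, small-block, medium/high IOPS, 70% reads / 30% writes.
[pulsar-kv]
rw=randrw
rwmixread=70        ; the 70r/30w distribution noted above
bs=4k               ; small block size
ioengine=libaio
iodepth=32          ; medium/high IOPS
direct=1
size=10g
runtime=120
time_based
```

For a latency-sensitive database, fio's completion-latency percentiles (p99, p99.9) matter more than raw throughput.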
10. Application Profile – Hadoop
• HDFS DataNode: low IOPS, very large blocks, sequential reads and writes, not extremely latency sensitive
• YARN NodeManager temp space: medium IOPS, higher throughput, tends toward more random write patterns and slightly more sequential read patterns
• High-performance admin nodes (NameNodes, JournalNodes, ZooKeeper nodes): high IOPS, small block sizes, random reads and writes; these nodes typically perform better, and improve overall cluster performance, when backed by high-performance storage
11. Key Objectives for Modern Workloads
• Performance
• Availability, Reliability, Resiliency
• Manageability, APIs, Integrations
• Workload Isolation
• Data Intensive Applications
14. Recommended Approach for Kafka
Divide and Conquer
• Use HDDs for Collectors
• Use SSDs for Aggregates
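On an OpenStack cloud, one way to express this split is with Cinder volume types, one per media tier; the type names and sizes below are hypothetical, and assume separate HDD and SSD Cinder backends already exist:

```shell
# Hypothetical volume-type names; assumes one Cinder backend per media tier.
openstack volume type create hdd
openstack volume type create ssd

# Collector brokers: large, cheap sequential-write capacity on HDD.
openstack volume create --type hdd --size 2000 kafka-collector-data-0

# Aggregate brokers: SSD for the more latency-sensitive aggregate tier.
openstack volume create --type ssd --size 500 kafka-aggregate-data-0
```

Attaching the right volume type per broker role keeps the two tiers on appropriate media without separate clusters of hypervisors.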
15. Recommended Approach for Pulsar
Disaggregated, if:
• The backend can handle a high number of IOPS
• It meets the capacity requirements
• Network latency issues can be mitigated
Hyper-Converged, if:
• Compute nodes have local SSDs/NVMe drives
• There is enough local capacity
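The decision criteria above can be written down as a small helper; this is purely illustrative (the function and parameter names are assumptions, not anything from the talk), but it makes the precedence explicit:

```python
def choose_pulsar_deployment(
    remote_iops_ok: bool,       # disaggregated backend can absorb the IOPS
    remote_capacity_ok: bool,   # disaggregated backend meets capacity needs
    latency_mitigated: bool,    # network latency issues can be mitigated
    local_flash: bool,          # compute nodes have local SSDs/NVMe
    local_capacity_ok: bool,    # enough local capacity for the dataset
) -> str:
    """Illustrative encoding of the slide's criteria, not a real policy engine."""
    if remote_iops_ok and remote_capacity_ok and latency_mitigated:
        return "disaggregated"
    if local_flash and local_capacity_ok:
        return "hyper-converged"
    return "revisit requirements"

# Example: remote storage can't hide its latency, but compute has local NVMe.
print(choose_pulsar_deployment(True, True, False, True, True))  # hyper-converged
```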
21. Testing and Validation
Approach
The test plans for each application platform are designed to represent typical use cases for those
applications and test their performance, latency, and storage capacity.
Hadoop Big Data Platform
• Benchmark Tools
• Application Testing
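The standard Hadoop benchmark tools can be invoked along these lines; the jar paths and flag spellings vary by distribution and version (older TestDFSIO releases use `-fileSize` in MB instead of `-size`), so treat this as a sketch:

```shell
# TestDFSIO: sequential HDFS write/read throughput (jar path varies by distro).
hadoop jar hadoop-mapreduce-client-jobclient-*-tests.jar TestDFSIO \
    -write -nrFiles 16 -size 1GB
hadoop jar hadoop-mapreduce-client-jobclient-*-tests.jar TestDFSIO \
    -read -nrFiles 16 -size 1GB

# TeraGen/TeraSort: end-to-end MapReduce sort of ~100 GB (10^9 rows x 100 bytes).
hadoop jar hadoop-mapreduce-examples.jar teragen 1000000000 /tmp/terasort-in
hadoop jar hadoop-mapreduce-examples.jar terasort /tmp/terasort-in /tmp/terasort-out
```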
Kafka Stream Data Platform
• Use internally developed automation to deploy and test Kafka clusters.
• Test Configuration and Scenarios
• ZooKeepers
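Kafka ships its own load generators, which fit the test scenarios above; the broker address and topic name here are placeholders:

```shell
# Produce-side throughput/latency against a test topic
# (broker1:9092 and perf-test are placeholder names).
kafka-producer-perf-test.sh \
    --topic perf-test \
    --num-records 10000000 \
    --record-size 1024 \
    --throughput -1 \
    --producer-props bootstrap.servers=broker1:9092 acks=all

# Consume-side throughput for the same topic.
kafka-consumer-perf-test.sh \
    --bootstrap-server broker1:9092 \
    --topic perf-test \
    --messages 10000000
```

`--throughput -1` removes the producer's rate limit, so the run measures what the cluster and its storage can actually sustain.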
23. Operations and Support at Scale
• Noisy neighbor: which one is it?
• Where is the handoff between Ops and Engineering?
• Do you have DevOps?
• When things start to break
• Synthetic workloads