5. Kubernetes provides a common API and a
self-healing framework that automatically
handles machine failures, application
deployments, logging, and monitoring.
10. Clusters - a set of compute, storage, and network resources
Pods - a colocated group of application containers that share volumes and a networking stack
Replication Controllers - ensure that a specific number of pods is running; manage pods and status updates
Services - cluster-wide service discovery
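The replication controller idea above can be sketched as a small reconcile loop. This is a toy model, not the Kubernetes API: the class `ToyReplicationController`, its fields, and the `pod-N` names are all made up for illustration.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyReplicationController:
    """Toy stand-in for a replication controller; not the Kubernetes API."""
    desired_replicas: int
    pods: set = field(default_factory=set)
    _ids: count = field(default_factory=lambda: count(1), repr=False)

    def reconcile(self) -> None:
        """Create or delete pods until the running count matches the target."""
        while len(self.pods) < self.desired_replicas:
            self.pods.add(f"pod-{next(self._ids)}")
        while len(self.pods) > self.desired_replicas:
            # sorted() makes the choice of pod to remove deterministic.
            self.pods.remove(sorted(self.pods)[-1])


rc = ToyReplicationController(desired_replicas=3)
rc.reconcile()            # starts pod-1, pod-2, pod-3
rc.pods.discard("pod-2")  # simulate a pod lost to a machine failure
rc.reconcile()            # a replacement pod is started
```

The point of the sketch is that the controller compares desired state with observed state and corrects the difference, which is what "self-healing" refers to.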
11. [Diagram: Node #1 (192.168.0.2) and Node #5 (192.168.0.6) host Pod #1 (10.0.0.2), Pod #2 (10.0.0.3), and Pod #8 (10.0.0.9). Each pod has its own IP address, volume, and network stack; port 8080 is shown for each pod.]
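A minimal sketch of why every pod in the diagram can use port 8080: each pod has its own IP address, so the (IP, port) pairs never collide. The pod names and addresses come from the slide; the dictionary layout and the `endpoints` name are illustrative only.

```python
from ipaddress import ip_address

# Pod names and IPs are taken from the slide; the structure is illustrative.
pods = {
    "Pod #1": ip_address("10.0.0.2"),
    "Pod #2": ip_address("10.0.0.3"),
    "Pod #8": ip_address("10.0.0.9"),
}
PORT = 8080  # the same port is shown for every pod

# Build one (ip, port) endpoint per pod.
endpoints = {name: (str(ip), PORT) for name, ip in pods.items()}

# Every endpoint is distinct even though the port number is shared.
assert len(set(endpoints.values())) == len(endpoints)
```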
12. Support for Event Stream Processing
Fast Data Queries in Real Time
Improved Programmer Productivity
Fast Batch Processing of Large Data Sets
13. Driver - process that contains the SparkContext
Executor - process that executes one or more Spark tasks
Master - process that manages applications across the cluster
Worker - process that manages executors on a particular node
http://spark.apache.org/docs/latest/cluster-overview.html
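The driver and executor roles can be illustrated with a toy model in plain Python. This is not Spark code: the function `run_job` is hypothetical, and a thread pool stands in for executors, which in Spark are separate processes launched on worker nodes. The master and worker roles are not modeled.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(partitions, task, num_executors=2):
    """Driver side: turn each partition into a task and hand it to executors.

    The thread pool is only a stand-in for executor processes.
    """
    with ThreadPoolExecutor(max_workers=num_executors) as executors:
        # map() returns results in the same order as the partitions.
        partial_results = list(executors.map(task, partitions))
    # The driver combines the per-partition results.
    return sum(partial_results)


partitions = [[1, 2, 3], [4, 5], [6]]
total = run_job(partitions, task=sum)
```

Here the caller plays the driver: it owns the job, splits it into one task per partition, and merges what the executors return.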
30. External Shuffle
[Diagram: Node Manager # 1…N, each with an external shuffle plugin, Executors, and RDD (intermediate file) blocks.]
Long-running ETL jobs
Interactive applications or servers
Any application with large shuffles
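A hedged sketch of how the external shuffle service is typically switched on when running Spark on YARN. The property names are the documented Spark and YARN settings; the helper `to_conf_args`, the variable names, and the application file `app.py` are hypothetical. On a real cluster, `spark_shuffle` is normally added alongside any existing `yarn.nodemanager.aux-services` entries in `yarn-site.xml`, not set in place of them.

```python
# Node Manager side (yarn-site.xml), shown as a dict for reference only.
YARN_SITE_PROPERTIES = {
    "yarn.nodemanager.aux-services": "spark_shuffle",
    "yarn.nodemanager.aux-services.spark_shuffle.class":
        "org.apache.spark.network.yarn.YarnShuffleService",
}

# Application side: tell executors to use the external shuffle service.
SPARK_PROPERTIES = {
    "spark.shuffle.service.enabled": "true",
}


def to_conf_args(properties):
    """Render Spark properties as spark-submit --conf arguments."""
    args = []
    for key, value in properties.items():
        args += ["--conf", f"{key}={value}"]
    return args


# "app.py" is a placeholder application name.
command = ["spark-submit", *to_conf_args(SPARK_PROPERTIES), "app.py"]
```

With the service enabled, shuffle files are served by the Node Manager, so they remain available after the executor that wrote them has exited.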