10. [Chart comparing Cosmos, SparkSQL, and SparkSQL with Cache. Axis scale: 0 to 200 in steps of 20 (unit not given). Legend: Write and Compile Query; Submit and Wait in Job Queue; Job Run Time.]
11.
12.
13.
14. [Architecture diagram. Labeled components: Mesos Cluster/HDFS; Job Manager; Zookeeper; Job Frontend Web API; Spark Driver Host Pool; Spark Hive Thrift Server; Zeppelin Server; Avocado (Hive Query + Schedule Task); Rover (Drag & Drop BI tool with Hive Code Gen); Zeppelin Web UI; MetastoreDB Hive Loader; Cosmos Storage.]
15. [Data-flow diagram. Labels, in source order: Partition 1, Partition 2, ..., Partition n; Export Cosmos Partition; Partition 1, Partition 2, ..., Partition n; Task 2 (HDFS.copyFromLocalFile), ..., Task n; Partition 1, Partition 2, ..., Partition n; saveAsParquetFile; Task 2, ..., Task n.]
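The labels on slide 15 suggest a one-task-per-partition load: each partition is exported, then copied into shared storage, then saved as Parquet. The sketch below is only a rough local illustration of that shape, not the presenters' implementation. All function names are hypothetical, plain local files stand in for Cosmos and HDFS, the export writes made-up rows, and the final saveAsParquetFile stage is left out because it needs a Spark runtime.

```python
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def export_partition(index: int, staging_dir: Path) -> Path:
    """Hypothetical stand-in for 'Export Cosmos Partition'.

    Writes three made-up tab-separated rows for the partition to a local file.
    """
    local_file = staging_dir / f"partition-{index}.tsv"
    rows = [f"{index}\t{row}" for row in range(3)]
    local_file.write_text("\n".join(rows) + "\n")
    return local_file


def copy_to_store(local_file: Path, store_dir: Path) -> Path:
    """Hypothetical stand-in for the HDFS.copyFromLocalFile step.

    Here it is just a local file copy into a second directory.
    """
    target = store_dir / local_file.name
    shutil.copyfile(local_file, target)
    return target


def run_task(index: int, staging_dir: Path, store_dir: Path) -> Path:
    """One task handles one partition: export it, then copy it to the store."""
    return copy_to_store(export_partition(index, staging_dir), store_dir)


def load_partitions(n: int, staging_dir: Path, store_dir: Path) -> list:
    """Run tasks 1..n concurrently; results come back in partition order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(
            pool.map(lambda i: run_task(i, staging_dir, store_dir), range(1, n + 1))
        )
```

Calling `load_partitions(3, staging, store)` with two existing directories leaves one `.tsv` file per partition in each of them. In a real deployment the thread pool would be replaced by Spark tasks and the copy by a Hadoop `FileSystem` call.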