Pedal to the Metal: Accelerating Spark* with Silicon Innovation describes how Intel is optimizing the Spark analytics framework for performance. According to Intel's own tests, hardware such as Intel Xeon CPUs, Intel Optane DC persistent memory, and Intel Omni-Path Architecture can accelerate Spark workloads by up to 7X. Intel also contributes open source code to Spark projects and uses the Intel Math Kernel Library to optimize machine learning libraries such as MLlib. The deck argues that these hardware and software optimizations make Spark easier to deploy for analytics, machine learning, and IoT applications.
4. Intel® Omni-Path Architecture: Leadership vs. InfiniBand EDR
10% performance advantage
60% lower system power
20% system cost savings
2X lower latency
2X higher bandwidth
Intel® Xeon® + FPGA: Integrated custom acceleration
3D XPoint™ DIMMs: New class of non-volatile memory
4X memory capacity
½ the cost of DDR
Accelerate Spark* with Silicon innovation
*Other names and brands may be claimed as the property of others
-Intel Internal estimates
5. Hardware Optimization for Performance Improvement
7X¹ for Spark* through hardware migration
2X² for MLlib* through Intel Math Kernel Library
1.7X³ for Spark through hierarchical storage
1.35X⁴ for Spark shuffle RPC encryption on TeraSort, and 1.12X on Big Bench*
1. Tested by Intel on real-life cloud service provider workloads such as NWeight
2. Tested by Intel using a micro workload and Intel MKL
3. Tested by Intel and a partner: Spark performance with hierarchical storage
4. Tested by Intel: Spark shuffle encryption performance for TeraSort and Big Bench
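As a reading aid for the figures above, the following minimal sketch converts each quoted speedup factor into the share of runtime it would save. It assumes that "NX" means the old runtime divided by the new runtime; the slide does not state the metric. The helper names `runtime_reduction` and `summarize` are hypothetical and used only for illustration.

```python
def runtime_reduction(speedup: float) -> float:
    """Fraction of the original runtime saved by a given speedup factor.

    Assumes speedup = old_runtime / new_runtime (an assumption; the
    slide does not say how the factors were measured).
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


# Speedup factors as quoted on the slide (Intel-reported figures).
CLAIMS = {
    "Spark via hardware migration": 7.0,
    "MLlib via Intel MKL": 2.0,
    "Spark via hierarchical storage": 1.7,
    "Shuffle RPC encryption, TeraSort": 1.35,
    "Shuffle RPC encryption, Big Bench": 1.12,
}


def summarize(claims):
    """Map each claim to its runtime saving in percent, rounded to 0.1."""
    return {name: round(100 * runtime_reduction(s), 1) for name, s in claims.items()}


if __name__ == "__main__":
    for name, pct in summarize(CLAIMS).items():
        print(f"{name}: {CLAIMS[name]}X is about {pct}% less runtime")
```

Under that assumption, a 2X factor corresponds to halving the runtime, while the 1.12X Big Bench figure corresponds to a saving of roughly one tenth.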
6. Open Source Commitment
SPARK CONTRIBUTIONS
• Spark Core/Tungsten*
• Spark SQL*
• Spark Streaming*
• MLlib*
• SparkR*
• GraphX*
SPARK COMMUNITY SUPPORT
• Performance Benchmark
• Big Bench*
SPARK-OPTIMIZED SOFTWARE
• Trusted Analytics Platform (TAP)
• Intel Math Kernel Library
• Intel Data Analytics Library
• Machine Learning – 10X~70 scalability, 8X training cycle reduction
-Intel Internal estimates
• Spark Meet-up
7. Spark Use Cases
Making Solutions Easier to Deploy
8. Moving Forward
• Optimize Machine and Deep Learning Using Hardware and Math Kernels
• Scale out Machine and Deep Learning
• Enable Internet-of-Things & Cloud Analytics