SlideShare a Scribd company logo
1 of 22
Download to read offline
WIFI SSID:Spark+AISummit | Password: UnifiedDataAnalytics
Xie Qi, Intel
Accelerating Apache
Spark with Intel
QuickAssist Technology
#UnifiedDataAnalytics #SparkAISummit
LEGAL NOTICES
3
No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.
Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a
particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in
trade.
This document contains information on products, services and/or processes in development. All information provided here is subject to
change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps.
The products and services described may contain defects or errors known as errata which may cause deviations from published
specifications. Current characterized errata are available on request.
Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by
visiting www.intel.com/design/literature.htm.
Intel, the Intel logo, Intel® are trademarks of Intel Corporation in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others
Copyright © 2018 Intel Corporation.
Agenda
• QAT Overview
• QAT Acceleration Opportunity in Big Data
• High Level Architecture for QATCodec in Big Data
• Performance Evaluation
• QAT Impacts On TPCX-BB Queries Analysis
4#UnifiedDataAnalytics #SparkAISummit
Intel QuickAssist Technology Overview
• QAT provides security(encryption) HW acceleration and
compression HW acceleration
• QAT makes use of a set of APIs to abstract out the
hardware, so the same application can run on multiple
generations of QAT hardware
• Customers can also make use of patches that we have
provided to popular open source software, so they can
minimize or eliminate their effort to learn the API
• Get QAT resources from
https://01.org/intel-quickassist-technology
5
Why QAT is important to BIG Data
6
Data Compression Pipeline in
BIG Data Framework
7
Data Compression Pipeline in
BIG Data - I/O Characteristics
8
QAT Acceleration Opportunity
9
About QATCodec
10
Benchmark Configuration
11
Performance Evaluation – Spark Sort
12
Performance estimates were obtained prior to implementation of recent software patches and firmware updates intended to address exploits referred to as "Spectre" and "Meltdown." Implementation of these updates may make these results inapplicable to your device or system. Software and workloads
used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the
results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to www.intel.com/benchmarks. Tests performed by Intel®
company. Configurations: see slides 11
Performance Evaluation – Map Reduce
13
Performance estimates were obtained prior to implementation of recent software patches and firmware updates intended to address exploits referred to as "Spectre" and "Meltdown." Implementation of these updates may make these results inapplicable to your device or system. Software and workloads used
in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to
vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to www.intel.com/benchmarks. Tests performed by Intel® company.
Configurations: see slides 11
Performance Evaluation – Hive on Map Reduce
14
Performance estimates were obtained prior to implementation of recent software patches and firmware updates intended to address exploits referred to as "Spectre" and "Meltdown." Implementation of these updates may make these results inapplicable to your device or system. Software and workloads used
in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to
vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to www.intel.com/benchmarks. Tests performed by Intel® company.
Configurations: see slides 11
QAT VS. Snappy – Compression Ratio
(1TB data scale)
15
Impact Analysis On Queries -
Compression Ratio
16
Improved Query - Q22 (Map Join
conversion)
17
Degrade Query - Q12 (GC issue
caused by Map Join)
18
Degrade Query - Q12 (GC Issue
Caused By Map Join) – Cont’d
19
Micro-view Query Comparison - IO wait
20
Micro-view Query Comparison – No IO wait
21
DON’T FORGET TO RATE
AND REVIEW THE SESSIONS
SEARCH SPARK + AI SUMMIT

More Related Content

What's hot

Clusters and Wharehouse Scale Computers
Clusters and Wharehouse Scale ComputersClusters and Wharehouse Scale Computers
Clusters and Wharehouse Scale Computersbabuece
 
IBM i client partitions concepts and implementation
IBM i client partitions concepts and implementationIBM i client partitions concepts and implementation
IBM i client partitions concepts and implementationCOMMON Europe
 
Chapter 8 : Memory
Chapter 8 : MemoryChapter 8 : Memory
Chapter 8 : MemoryAmin Omi
 
Kernel security Concepts
Kernel security ConceptsKernel security Concepts
Kernel security ConceptsMohit Saxena
 
Unit II Arm7 Thumb Instruction
Unit II Arm7 Thumb InstructionUnit II Arm7 Thumb Instruction
Unit II Arm7 Thumb InstructionDr. Pankaj Zope
 
Cs423 raw sockets_bw
Cs423 raw sockets_bwCs423 raw sockets_bw
Cs423 raw sockets_bwjktjpc
 
Distributed & parallel system
Distributed & parallel systemDistributed & parallel system
Distributed & parallel systemManish Singh
 
Exokernel operating systems
Exokernel operating systemsExokernel operating systems
Exokernel operating systemsGaurav Dalvi
 
Linux & Unix Operating System's
Linux & Unix Operating System'sLinux & Unix Operating System's
Linux & Unix Operating System'sRiaz Ahmed Channa
 
ARM® Cortex M Boot & CMSIS Part 1-3
ARM® Cortex M Boot & CMSIS Part 1-3ARM® Cortex M Boot & CMSIS Part 1-3
ARM® Cortex M Boot & CMSIS Part 1-3Raahul Raghavan
 
Advanced Troublesshooting Nexus 7K.pdf
Advanced Troublesshooting Nexus 7K.pdfAdvanced Troublesshooting Nexus 7K.pdf
Advanced Troublesshooting Nexus 7K.pdfJeanChristian12
 
Multiprocessor architecture
Multiprocessor architectureMultiprocessor architecture
Multiprocessor architectureArpan Baishya
 
"Accelerating Deep Learning Using Altera FPGAs," a Presentation from Intel
"Accelerating Deep Learning Using Altera FPGAs," a Presentation from Intel"Accelerating Deep Learning Using Altera FPGAs," a Presentation from Intel
"Accelerating Deep Learning Using Altera FPGAs," a Presentation from IntelEdge AI and Vision Alliance
 
RTOS- Real Time Operating Systems
RTOS- Real Time Operating Systems RTOS- Real Time Operating Systems
RTOS- Real Time Operating Systems Bayar shahab
 

What's hot (20)

RAID
RAIDRAID
RAID
 
Clusters and Wharehouse Scale Computers
Clusters and Wharehouse Scale ComputersClusters and Wharehouse Scale Computers
Clusters and Wharehouse Scale Computers
 
IBM i client partitions concepts and implementation
IBM i client partitions concepts and implementationIBM i client partitions concepts and implementation
IBM i client partitions concepts and implementation
 
Chapter 8 : Memory
Chapter 8 : MemoryChapter 8 : Memory
Chapter 8 : Memory
 
Kernel security Concepts
Kernel security ConceptsKernel security Concepts
Kernel security Concepts
 
Unit II Arm7 Thumb Instruction
Unit II Arm7 Thumb InstructionUnit II Arm7 Thumb Instruction
Unit II Arm7 Thumb Instruction
 
Cs423 raw sockets_bw
Cs423 raw sockets_bwCs423 raw sockets_bw
Cs423 raw sockets_bw
 
Distributed & parallel system
Distributed & parallel systemDistributed & parallel system
Distributed & parallel system
 
Exokernel operating systems
Exokernel operating systemsExokernel operating systems
Exokernel operating systems
 
Linux & Unix Operating System's
Linux & Unix Operating System'sLinux & Unix Operating System's
Linux & Unix Operating System's
 
ARM® Cortex M Boot & CMSIS Part 1-3
ARM® Cortex M Boot & CMSIS Part 1-3ARM® Cortex M Boot & CMSIS Part 1-3
ARM® Cortex M Boot & CMSIS Part 1-3
 
Advanced Troublesshooting Nexus 7K.pdf
Advanced Troublesshooting Nexus 7K.pdfAdvanced Troublesshooting Nexus 7K.pdf
Advanced Troublesshooting Nexus 7K.pdf
 
Multiprocessor architecture
Multiprocessor architectureMultiprocessor architecture
Multiprocessor architecture
 
"Accelerating Deep Learning Using Altera FPGAs," a Presentation from Intel
"Accelerating Deep Learning Using Altera FPGAs," a Presentation from Intel"Accelerating Deep Learning Using Altera FPGAs," a Presentation from Intel
"Accelerating Deep Learning Using Altera FPGAs," a Presentation from Intel
 
Microkernel
MicrokernelMicrokernel
Microkernel
 
Unit vi (2)
Unit vi (2)Unit vi (2)
Unit vi (2)
 
ARM CORTEX M3 PPT
ARM CORTEX M3 PPTARM CORTEX M3 PPT
ARM CORTEX M3 PPT
 
Notes on NUMA architecture
Notes on NUMA architectureNotes on NUMA architecture
Notes on NUMA architecture
 
Interrupts.ppt
Interrupts.pptInterrupts.ppt
Interrupts.ppt
 
RTOS- Real Time Operating Systems
RTOS- Real Time Operating Systems RTOS- Real Time Operating Systems
RTOS- Real Time Operating Systems
 

Similar to Accelerating Apache Spark with Intel QuickAssist Technology

Crooke CWF Keynote FINAL final platinum
Crooke CWF Keynote FINAL final platinumCrooke CWF Keynote FINAL final platinum
Crooke CWF Keynote FINAL final platinumAlan Frost
 
Lynn Comp - Intel Big Data & Cloud Summit 2013 (2)
Lynn Comp - Intel Big Data & Cloud Summit 2013 (2)Lynn Comp - Intel Big Data & Cloud Summit 2013 (2)
Lynn Comp - Intel Big Data & Cloud Summit 2013 (2)IntelAPAC
 
AI & Computer Vision (OpenVINO) - CPBR12
AI & Computer Vision (OpenVINO) - CPBR12AI & Computer Vision (OpenVINO) - CPBR12
AI & Computer Vision (OpenVINO) - CPBR12Jomar Silva
 
2 new hw_features_cat_cod_etc
2 new hw_features_cat_cod_etc2 new hw_features_cat_cod_etc
2 new hw_features_cat_cod_etcvideos
 
Driving Industrial InnovationOn the Path to Exascale
Driving Industrial InnovationOn the Path to ExascaleDriving Industrial InnovationOn the Path to Exascale
Driving Industrial InnovationOn the Path to ExascaleIntel IT Center
 
High Performance Computing: The Essential tool for a Knowledge Economy
High Performance Computing: The Essential tool for a Knowledge EconomyHigh Performance Computing: The Essential tool for a Knowledge Economy
High Performance Computing: The Essential tool for a Knowledge EconomyIntel IT Center
 
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel Architecture
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel ArchitectureDPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel Architecture
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel ArchitectureJim St. Leger
 
Microsoft Build 2019- Intel AI Workshop
Microsoft Build 2019- Intel AI Workshop Microsoft Build 2019- Intel AI Workshop
Microsoft Build 2019- Intel AI Workshop Intel® Software
 
Accelerating AI Adoption with Partners
Accelerating AI Adoption with PartnersAccelerating AI Adoption with Partners
Accelerating AI Adoption with PartnersSri Ambati
 
3 additional dpdk_theory(1)
3 additional dpdk_theory(1)3 additional dpdk_theory(1)
3 additional dpdk_theory(1)videos
 
Intel® QuickAssist Technology (Intel® QAT) and OpenSSL-1.1.0: Performance
Intel® QuickAssist Technology (Intel® QAT) and OpenSSL-1.1.0: PerformanceIntel® QuickAssist Technology (Intel® QAT) and OpenSSL-1.1.0: Performance
Intel® QuickAssist Technology (Intel® QAT) and OpenSSL-1.1.0: PerformanceDESMOND YUEN
 
TDC2019 Intel Software Day - Inferencia de IA em edge devices
TDC2019 Intel Software Day - Inferencia de IA em edge devicesTDC2019 Intel Software Day - Inferencia de IA em edge devices
TDC2019 Intel Software Day - Inferencia de IA em edge devicestdc-globalcode
 
Performance out of the box developers
Performance   out of the box developersPerformance   out of the box developers
Performance out of the box developersMichelle Holley
 
Como criar um mundo autônomo e conectado - Jomar Silva
Como criar um mundo autônomo e conectado - Jomar SilvaComo criar um mundo autônomo e conectado - Jomar Silva
Como criar um mundo autônomo e conectado - Jomar SilvaiMasters
 
Xeon E5 Making the Business Case PowerPoint
Xeon E5 Making the Business Case PowerPointXeon E5 Making the Business Case PowerPoint
Xeon E5 Making the Business Case PowerPointIntel IT Center
 
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...HPC DAY
 
Introduction to container networking in K8s - SDN/NFV London meetup
Introduction to container networking in K8s - SDN/NFV  London meetupIntroduction to container networking in K8s - SDN/NFV  London meetup
Introduction to container networking in K8s - SDN/NFV London meetupHaidee McMahon
 
Accelerate Ceph performance via SPDK related techniques
Accelerate Ceph performance via SPDK related techniques Accelerate Ceph performance via SPDK related techniques
Accelerate Ceph performance via SPDK related techniques Ceph Community
 
Extend HPC Workloads to Amazon EC2 Instances with Intel and Rescale (CMP373-S...
Extend HPC Workloads to Amazon EC2 Instances with Intel and Rescale (CMP373-S...Extend HPC Workloads to Amazon EC2 Instances with Intel and Rescale (CMP373-S...
Extend HPC Workloads to Amazon EC2 Instances with Intel and Rescale (CMP373-S...Amazon Web Services
 
1 intro to_dpdk_and_hw
1 intro to_dpdk_and_hw1 intro to_dpdk_and_hw
1 intro to_dpdk_and_hwvideos
 

Similar to Accelerating Apache Spark with Intel QuickAssist Technology (20)

Crooke CWF Keynote FINAL final platinum
Crooke CWF Keynote FINAL final platinumCrooke CWF Keynote FINAL final platinum
Crooke CWF Keynote FINAL final platinum
 
Lynn Comp - Intel Big Data & Cloud Summit 2013 (2)
Lynn Comp - Intel Big Data & Cloud Summit 2013 (2)Lynn Comp - Intel Big Data & Cloud Summit 2013 (2)
Lynn Comp - Intel Big Data & Cloud Summit 2013 (2)
 
AI & Computer Vision (OpenVINO) - CPBR12
AI & Computer Vision (OpenVINO) - CPBR12AI & Computer Vision (OpenVINO) - CPBR12
AI & Computer Vision (OpenVINO) - CPBR12
 
2 new hw_features_cat_cod_etc
2 new hw_features_cat_cod_etc2 new hw_features_cat_cod_etc
2 new hw_features_cat_cod_etc
 
Driving Industrial InnovationOn the Path to Exascale
Driving Industrial InnovationOn the Path to ExascaleDriving Industrial InnovationOn the Path to Exascale
Driving Industrial InnovationOn the Path to Exascale
 
High Performance Computing: The Essential tool for a Knowledge Economy
High Performance Computing: The Essential tool for a Knowledge EconomyHigh Performance Computing: The Essential tool for a Knowledge Economy
High Performance Computing: The Essential tool for a Knowledge Economy
 
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel Architecture
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel ArchitectureDPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel Architecture
DPDK Summit - 08 Sept 2014 - Intel - Networking Workloads on Intel Architecture
 
Microsoft Build 2019- Intel AI Workshop
Microsoft Build 2019- Intel AI Workshop Microsoft Build 2019- Intel AI Workshop
Microsoft Build 2019- Intel AI Workshop
 
Accelerating AI Adoption with Partners
Accelerating AI Adoption with PartnersAccelerating AI Adoption with Partners
Accelerating AI Adoption with Partners
 
3 additional dpdk_theory(1)
3 additional dpdk_theory(1)3 additional dpdk_theory(1)
3 additional dpdk_theory(1)
 
Intel® QuickAssist Technology (Intel® QAT) and OpenSSL-1.1.0: Performance
Intel® QuickAssist Technology (Intel® QAT) and OpenSSL-1.1.0: PerformanceIntel® QuickAssist Technology (Intel® QAT) and OpenSSL-1.1.0: Performance
Intel® QuickAssist Technology (Intel® QAT) and OpenSSL-1.1.0: Performance
 
TDC2019 Intel Software Day - Inferencia de IA em edge devices
TDC2019 Intel Software Day - Inferencia de IA em edge devicesTDC2019 Intel Software Day - Inferencia de IA em edge devices
TDC2019 Intel Software Day - Inferencia de IA em edge devices
 
Performance out of the box developers
Performance   out of the box developersPerformance   out of the box developers
Performance out of the box developers
 
Como criar um mundo autônomo e conectado - Jomar Silva
Como criar um mundo autônomo e conectado - Jomar SilvaComo criar um mundo autônomo e conectado - Jomar Silva
Como criar um mundo autônomo e conectado - Jomar Silva
 
Xeon E5 Making the Business Case PowerPoint
Xeon E5 Making the Business Case PowerPointXeon E5 Making the Business Case PowerPoint
Xeon E5 Making the Business Case PowerPoint
 
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
HPC DAY 2017 | Accelerating tomorrow's HPC and AI workflows with Intel Archit...
 
Introduction to container networking in K8s - SDN/NFV London meetup
Introduction to container networking in K8s - SDN/NFV  London meetupIntroduction to container networking in K8s - SDN/NFV  London meetup
Introduction to container networking in K8s - SDN/NFV London meetup
 
Accelerate Ceph performance via SPDK related techniques
Accelerate Ceph performance via SPDK related techniques Accelerate Ceph performance via SPDK related techniques
Accelerate Ceph performance via SPDK related techniques
 
Extend HPC Workloads to Amazon EC2 Instances with Intel and Rescale (CMP373-S...
Extend HPC Workloads to Amazon EC2 Instances with Intel and Rescale (CMP373-S...Extend HPC Workloads to Amazon EC2 Instances with Intel and Rescale (CMP373-S...
Extend HPC Workloads to Amazon EC2 Instances with Intel and Rescale (CMP373-S...
 
1 intro to_dpdk_and_hw
1 intro to_dpdk_and_hw1 intro to_dpdk_and_hw
1 intro to_dpdk_and_hw
 

More from Databricks

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Databricks
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Databricks
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Databricks
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of HadoopDatabricks
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDatabricks
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceDatabricks
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringDatabricks
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixDatabricks
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationDatabricks
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchDatabricks
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesDatabricks
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesDatabricks
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsDatabricks
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkDatabricks
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkDatabricks
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesDatabricks
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkDatabricks
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeDatabricks
 

More from Databricks (20)

DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1Data Lakehouse Symposium | Day 1 | Part 1
Data Lakehouse Symposium | Day 1 | Part 1
 
Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2Data Lakehouse Symposium | Day 1 | Part 2
Data Lakehouse Symposium | Day 1 | Part 2
 
Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2Data Lakehouse Symposium | Day 2
Data Lakehouse Symposium | Day 2
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
 
Democratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized PlatformDemocratizing Data Quality Through a Centralized Platform
Democratizing Data Quality Through a Centralized Platform
 
Learn to Use Databricks for Data Science
Learn to Use Databricks for Data ScienceLearn to Use Databricks for Data Science
Learn to Use Databricks for Data Science
 
Why APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML MonitoringWhy APM Is Not the Same As ML Monitoring
Why APM Is Not the Same As ML Monitoring
 
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch FixThe Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
The Function, the Context, and the Data—Enabling ML Ops at Stitch Fix
 
Stage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI IntegrationStage Level Scheduling Improving Big Data and AI Integration
Stage Level Scheduling Improving Big Data and AI Integration
 
Simplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorchSimplify Data Conversion from Spark to TensorFlow and PyTorch
Simplify Data Conversion from Spark to TensorFlow and PyTorch
 
Scaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on KubernetesScaling your Data Pipelines with Apache Spark on Kubernetes
Scaling your Data Pipelines with Apache Spark on Kubernetes
 
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark PipelinesScaling and Unifying SciKit Learn and Apache Spark Pipelines
Scaling and Unifying SciKit Learn and Apache Spark Pipelines
 
Sawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature AggregationsSawtooth Windows for Feature Aggregations
Sawtooth Windows for Feature Aggregations
 
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen SinkRedis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
Redis + Apache Spark = Swiss Army Knife Meets Kitchen Sink
 
Re-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and SparkRe-imagine Data Monitoring with whylogs and Spark
Re-imagine Data Monitoring with whylogs and Spark
 
Raven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction QueriesRaven: End-to-end Optimization of ML Prediction Queries
Raven: End-to-end Optimization of ML Prediction Queries
 
Processing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache SparkProcessing Large Datasets for ADAS Applications using Apache Spark
Processing Large Datasets for ADAS Applications using Apache Spark
 
Massive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta LakeMassive Data Processing in Adobe Using Delta Lake
Massive Data Processing in Adobe Using Delta Lake
 

Recently uploaded

Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Onlineanilsa9823
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramMoniSankarHazra
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 

Recently uploaded (20)

Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 

Accelerating Apache Spark with Intel QuickAssist Technology

  • 1. WIFI SSID:Spark+AISummit | Password: UnifiedDataAnalytics
  • 2. Xie Qi, Intel Accelerating Apache Spark with Intel QuickAssist Technology #UnifiedDataAnalytics #SparkAISummit
  • 3. LEGAL NOTICES 3 No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade. This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps. The products and services described may contain defects or errors known as errata which may cause deviations from published specifications. Current characterized errata are available on request. Copies of documents which have an order number and are referenced in this document may be obtained by calling 1-800-548-4725 or by visiting www.intel.com/design/literature.htm. Intel, the Intel logo, Intel® are trademarks of Intel Corporation in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others Copyright © 2018 Intel Corporation.
  • 4. Agenda • QAT Overview • QAT Acceleration Opportunity in Big Data • High Level Architecture for QATCodec in Big Data • Performance Evaluation • QAT Impacts On TPCX-BB Queries Analysis 4#UnifiedDataAnalytics #SparkAISummit
  • 5. Intel QuickAssist Technology Overview • QAT provides security(encryption) HW acceleration and compression HW acceleration • QAT makes use of a set of APIs to abstract out the hardware, so the same application can run on multiple generations of QAT hardware • Customers can also make use of patches that we have provided to popular open source software, so they can minimize or eliminate their effort to learn the API • Get QAT resources from https://01.org/intel-quickassist-technology 5
  • 6. Why QAT is important to BIG Data 6
  • 7. Data Compression Pipeline in BIG Data Framework 7
  • 8. Data Compression Pipeline in BIG Data - I/O Characteristics 8
  • 12. Performance Evaluation – Spark Sort 12 Performance estimates were obtained prior to implementation of recent software patches and firmware updates intended to address exploits referred to as "Spectre" and "Meltdown." Implementation of these updates may make these results inapplicable to your device or system. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to www.intel.com/benchmarks. Tests performed by Intel® company. Configurations: see slides 11
  • 13. Performance Evaluation – Map Reduce 13 Performance estimates were obtained prior to implementation of recent software patches and firmware updates intended to address exploits referred to as "Spectre" and "Meltdown." Implementation of these updates may make these results inapplicable to your device or system. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to www.intel.com/benchmarks. Tests performed by Intel® company. Configurations: see slides 11
  • 14. Performance Evaluation – Hive on Map Reduce 14 Performance estimates were obtained prior to implementation of recent software patches and firmware updates intended to address exploits referred to as "Spectre" and "Meltdown." Implementation of these updates may make these results inapplicable to your device or system. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to www.intel.com/benchmarks. Tests performed by Intel® company. Configurations: see slides 11
  • 15. QAT VS. Snappy – Compression Ratio (1TB data scale) 15
  • 16. Impact Analysis On Queries - Compression Ratio 16
  • 17. Improved Query - Q22 (Map Join conversion) 17
  • 18. Degrade Query - Q12 (GC issue caused by Map Join) 18
  • 19. Degrade Query - Q12 (GC Issue Caused By Map Join) – Cont’d 19
  • 21. Micro-view Query Comparison – No IO wait 21
  • 22. DON’T FORGET TO RATE AND REVIEW THE SESSIONS SEARCH SPARK + AI SUMMIT