SlideShare une entreprise Scribd logo
1  sur  23
Télécharger pour lire hors ligne
Facebook Presto 
Interactive and Distributed SQL Query Engine for 
Big Data 
liangguorong@baidu.com, 2014. 11.20
Presto’s Brief History 
• 2012 fall started at Facebook (6 developers) 
✦ Designed for interactive SQL query on PB data 
✦ Hive is for reliable and large scale batch processing 
• 2013 spring rolled out to entire company 
• 2013 Nov. open sourced (https://github.com/facebook/presto ) 
• 2014 Nov., 88 releases, 41 contributors, 3943commits 
• current version 0.85 (http://prestodb.io/ ) 
• java, fast development , java ecosystem, easy integration
Advantages 
• High Performance: 10x faster than Hive 
✦ 2013 Nov. Facebook 1000 nodes, 1000 employees run 30,000 queries on 1PB per day 
• Extensibility 
✦ Pluggable backends: Cassandra, Hive, JMX, Kafka, MySQL, PostgreSQL, MySQL, 
SystemSchema, TPCH 
✦ JDBC, ODBC(in future) for commercial BI tools or Dashboards, like data visualization 
✦ Client Protocol: HTTP+JSON, support various languages(Python, Ruby, PHP, Node.js 
Java(JDBC)…) 
• ANSI SQL 
• complex queries, joins, aggregations, various functions(Window 
functions)
• http://blog.cloudera.com/blog/2014/09/new-benchmarks- 
for-sql-on-hadoop-impala-1-4-widens-the- 
performance-gap/
Architecture
Why Presto Fast? 
1. In memory parallel computing 
2. Pipeline task execution 
3. Data local computation with multi-threads 
4. Cache hot queries and data 
5. JIT compile operator to byte code 
6. SQL optimization 
7. Other optimization
1. In memory parallel computing 
• Custom query engine, not MapReduce
SQL compile process 
antlr3
• select name, count(*) as count from orders as t1 join customer as t2 on 
t1.custkey = t2.custkey group by name order by count desc limit 100;
Sink! 
TopN! 
Exchange! 
Sink! 
TopN! 
Final Aggregation! 
Exchange! 
Sink! 
Partial Aggregation! 
Table Scan! 
orders! 
Exchange! 
Sink! 
Table Scan! 
customers! 
Project! 
Join! 
Sink! 
TopN! 
Exchange! 
Sink! 
TopN! 
Final Aggregation! 
Exchange! 
Sink! 
Partial Aggregation! 
1 thread! 
1 thread! 
Table Scan! 
orders! 
Project! 
Join! 
Table Scan! 
customers! 
Worker2! 
Sink! 
TopN! 
Final Aggregation! 
Sink! 
Partial Aggregation! 
Project! 
Join! 
Table Scan! 
Sink! 
Exchange! 
Exchange! 
Worker1! 
2 workers! 
All tasks in parallel! 
many splits ! 
many threads! 
1 thread! 
Sink! 
orders! 
Table Scan! 
customers! 
Exchange! 
many splits ! 
many threads!
Prioritized 
SplitRunner 
• SQL->Stages, Tasks, Splits 
• One task fail, query must rerun 
• Aggregation memory limit
2.Pipeline task execution 
• In worker, TaskExecutor, split pipeline 
1s by default
• Operator Pipeline 
• Page: smallest data processing unit(like 
RowBatch) 
• max page size 1MB, max rows: 
16*1024 
Page 
Exchange Operator: 
each client for each 
split
3. Data local computation with 
multi-threads 
• NodeSelector select available nodes(10 nodes 
default) 
• Nodes has the same address 
• If not enough, add nodes in the same rack 
• If not enough, randomly select nodes in other racks 
• Select the node with the smallest number of 
assignments (pending tasks)
• 4. Cache hot queries and data 
✦ Google Guava loading cache byte code 
✦ Cache Objects: Hive database/table/partition, JIT byte code 
class, functions 
• 5. JIT compile operator to byte code 
✦ Compile ScanFilterAndProjectOperator , 
FilterAndProjectOperator
6. SQL Optimization 
• PredicatePushDown 
• PruneRedundantProjections 
• PruneUnreferencedOutputs 
• MergeProjections 
• LimitPushDown 
• CanonicalizeExpressions 
• CountConstantOptimizer 
• ImplementSampleAsFilter 
• MetadataQueryOptimizer 
• SetFlatteningOptimizer 
• SimplifyExpressions 
• UnaliasSymbolReferences 
• WindowFilterPushDown
7. Other Optimization 
• BlinkDB liked approximate queries 
• JVM GC Control 
✦ JDK1.7 
✦ forcing the code cache evictor make room before the cache fills up 
• Careful use mem  data structure 
✦ Airlift slice for efficient heap and off-heap memory(https://github.com/airlift/slice ) 
✦ Java future async callback
Presto Extensibility 
• Connectors(Catalogs): Hive, Cassandra, Hive, JMX, Kafka, 
MySQL, PostgreSQL, System, TPCH 
• Custom connectors 
(http://prestodb.io/docs/current/spi/overview.html ): 
• Service Provider Interface(SPI): 
• ConnectorMetadata 
• ConnectorSplitManager 
• ConnectorRecordSetProvider
Presto’s Limitations 
• No fault tolerance, Unstable 
• Memory Limitations for aggregations, huge joins 
• SQL features like: 
• only CTAS 
• no support UDF
Presto’s Future 
Presto, Past, Present, and Future by Dain Sundstrom at Facebook, 2014.May 
• Basic Task Recovery 
• Huge joins and Group by 
• Spill to Disk(Implemented), Insert 
• Create View(Implemented), not compatible with hive 
• Native Store, Cache Hot data(Implemented) 
• Security : Authentication, Authorization, Permissions 
• ODBC Driver 
• Improve DDL DML
References 
• http://prestodb.io/ 
• https://github.com/facebook/presto 
• https://www.facebook.com/notes/facebook-engineering/ 
presto-interacting-with-petabytes-of-data- 
at-facebook/10151786197628920

Contenu connexe

Tendances

Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
Cloudera, Inc.
 
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroThe Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
Databricks
 

Tendances (20)

Building Robust ETL Pipelines with Apache Spark
Building Robust ETL Pipelines with Apache SparkBuilding Robust ETL Pipelines with Apache Spark
Building Robust ETL Pipelines with Apache Spark
 
Parquet performance tuning: the missing guide
Parquet performance tuning: the missing guideParquet performance tuning: the missing guide
Parquet performance tuning: the missing guide
 
Apache Spark Architecture
Apache Spark ArchitectureApache Spark Architecture
Apache Spark Architecture
 
Log Structured Merge Tree
Log Structured Merge TreeLog Structured Merge Tree
Log Structured Merge Tree
 
Apache Arrow Workshop at VLDB 2019 / BOSS Session
Apache Arrow Workshop at VLDB 2019 / BOSS SessionApache Arrow Workshop at VLDB 2019 / BOSS Session
Apache Arrow Workshop at VLDB 2019 / BOSS Session
 
[245] presto 내부구조 파헤치기
[245] presto 내부구조 파헤치기[245] presto 내부구조 파헤치기
[245] presto 내부구조 파헤치기
 
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
Deep Dive into Spark SQL with Advanced Performance Tuning with Xiao Li & Wenc...
 
RedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ TwitterRedisConf17- Using Redis at scale @ Twitter
RedisConf17- Using Redis at scale @ Twitter
 
Apache Tez – Present and Future
Apache Tez – Present and FutureApache Tez – Present and Future
Apache Tez – Present and Future
 
Apache Spark overview
Apache Spark overviewApache Spark overview
Apache Spark overview
 
Designing Structured Streaming Pipelines—How to Architect Things Right
Designing Structured Streaming Pipelines—How to Architect Things RightDesigning Structured Streaming Pipelines—How to Architect Things Right
Designing Structured Streaming Pipelines—How to Architect Things Right
 
Apache Ranger
Apache RangerApache Ranger
Apache Ranger
 
InnoDB Internal
InnoDB InternalInnoDB Internal
InnoDB Internal
 
Presto overview
Presto overviewPresto overview
Presto overview
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Apache Arrow: In Theory, In Practice
Apache Arrow: In Theory, In PracticeApache Arrow: In Theory, In Practice
Apache Arrow: In Theory, In Practice
 
Performance Optimizations in Apache Impala
Performance Optimizations in Apache ImpalaPerformance Optimizations in Apache Impala
Performance Optimizations in Apache Impala
 
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation BuffersHBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
HBase HUG Presentation: Avoiding Full GCs with MemStore-Local Allocation Buffers
 
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/AvroThe Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
The Rise of ZStandard: Apache Spark/Parquet/ORC/Avro
 
Apache Tez: Accelerating Hadoop Query Processing
Apache Tez: Accelerating Hadoop Query Processing Apache Tez: Accelerating Hadoop Query Processing
Apache Tez: Accelerating Hadoop Query Processing
 

En vedette

Presto @ Facebook: Past, Present and Future
Presto @ Facebook: Past, Present and FuturePresto @ Facebook: Past, Present and Future
Presto @ Facebook: Past, Present and Future
DataWorks Summit
 

En vedette (8)

Presto: Distributed sql query engine
Presto: Distributed sql query engine Presto: Distributed sql query engine
Presto: Distributed sql query engine
 
Presto - SQL on anything
Presto  - SQL on anythingPresto  - SQL on anything
Presto - SQL on anything
 
Presto at Hadoop Summit 2016
Presto at Hadoop Summit 2016Presto at Hadoop Summit 2016
Presto at Hadoop Summit 2016
 
Presto @ Facebook: Past, Present and Future
Presto @ Facebook: Past, Present and FuturePresto @ Facebook: Past, Present and Future
Presto @ Facebook: Past, Present and Future
 
Presto: Distributed SQL on Anything - Strata Hadoop 2017 San Jose, CA
Presto: Distributed SQL on Anything -  Strata Hadoop 2017 San Jose, CAPresto: Distributed SQL on Anything -  Strata Hadoop 2017 San Jose, CA
Presto: Distributed SQL on Anything - Strata Hadoop 2017 San Jose, CA
 
How to ensure Presto scalability 
in multi use case
How to ensure Presto scalability 
in multi use case How to ensure Presto scalability 
in multi use case
How to ensure Presto scalability 
in multi use case
 
Optimizing Presto Connector on Cloud Storage
Optimizing Presto Connector on Cloud StorageOptimizing Presto Connector on Cloud Storage
Optimizing Presto Connector on Cloud Storage
 
Hive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmarkHive, Presto, and Spark on TPC-DS benchmark
Hive, Presto, and Spark on TPC-DS benchmark
 

Similaire à Facebook Presto presentation

Buildingsocialanalyticstoolwithmongodb
BuildingsocialanalyticstoolwithmongodbBuildingsocialanalyticstoolwithmongodb
Buildingsocialanalyticstoolwithmongodb
MongoDB APAC
 

Similaire à Facebook Presto presentation (20)

Workflow Engines for Hadoop
Workflow Engines for HadoopWorkflow Engines for Hadoop
Workflow Engines for Hadoop
 
Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...
Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...
Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
 
Buildingsocialanalyticstoolwithmongodb
BuildingsocialanalyticstoolwithmongodbBuildingsocialanalyticstoolwithmongodb
Buildingsocialanalyticstoolwithmongodb
 
Webinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case StudyWebinar - DreamObjects/Ceph Case Study
Webinar - DreamObjects/Ceph Case Study
 
Middleware in Golang: InVision's Rye
Middleware in Golang: InVision's RyeMiddleware in Golang: InVision's Rye
Middleware in Golang: InVision's Rye
 
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...
Ceph Day Beijing - Our journey to high performance large scale Ceph cluster a...
 
Ceph Day Beijing - Our Journey to High Performance Large Scale Ceph Cluster a...
Ceph Day Beijing - Our Journey to High Performance Large Scale Ceph Cluster a...Ceph Day Beijing - Our Journey to High Performance Large Scale Ceph Cluster a...
Ceph Day Beijing - Our Journey to High Performance Large Scale Ceph Cluster a...
 
Big Data Developers Moscow Meetup 1 - sql on hadoop
Big Data Developers Moscow Meetup 1  - sql on hadoopBig Data Developers Moscow Meetup 1  - sql on hadoop
Big Data Developers Moscow Meetup 1 - sql on hadoop
 
SQL on Hadoop
SQL on HadoopSQL on Hadoop
SQL on Hadoop
 
Be faster then rabbits
Be faster then rabbitsBe faster then rabbits
Be faster then rabbits
 
Top ten-list
Top ten-listTop ten-list
Top ten-list
 
SharePoint Saturday San Antonio: SharePoint 2010 Performance
SharePoint Saturday San Antonio: SharePoint 2010 PerformanceSharePoint Saturday San Antonio: SharePoint 2010 Performance
SharePoint Saturday San Antonio: SharePoint 2010 Performance
 
DOTNET8.pptx
DOTNET8.pptxDOTNET8.pptx
DOTNET8.pptx
 
DrupalCampLA 2014 - Drupal backend performance and scalability
DrupalCampLA 2014 - Drupal backend performance and scalabilityDrupalCampLA 2014 - Drupal backend performance and scalability
DrupalCampLA 2014 - Drupal backend performance and scalability
 
Intro to CakePHP
Intro to CakePHPIntro to CakePHP
Intro to CakePHP
 
Michael stack -the state of apache h base
Michael stack -the state of apache h baseMichael stack -the state of apache h base
Michael stack -the state of apache h base
 
Scaling with swagger
Scaling with swaggerScaling with swagger
Scaling with swagger
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
DrupalSouth 2015 - Performance: Not an Afterthought
DrupalSouth 2015 - Performance: Not an AfterthoughtDrupalSouth 2015 - Performance: Not an Afterthought
DrupalSouth 2015 - Performance: Not an Afterthought
 

Dernier

+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 

Dernier (20)

Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...
 
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
Shapes for Sharing between Graph Data Spaces - and Epistemic Querying of RDF-...
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 

Facebook Presto presentation

  • 1. Facebook Presto Interactive and Distributed SQL Query Engine for Big Data liangguorong@baidu.com, 2014. 11.20
  • 2. Presto’s Brief History • 2012 fall started at Facebook (6 developers) ✦ Designed for interactive SQL query on PB data ✦ Hive is for reliable and large scale batch processing • 2013 spring rolled out to entire company • 2013 Nov. open sourced (https://github.com/facebook/presto ) • 2014 Nov., 88 releases, 41 contributors, 3943commits • current version 0.85 (http://prestodb.io/ ) • java, fast development , java ecosystem, easy integration
  • 3. Advantages • High Performance: 10x faster than Hive ✦ 2013 Nov. Facebook 1000 nodes, 1000 employees run 30,000 queries on 1PB per day • Extensibility ✦ Pluggable backends: Cassandra, Hive, JMX, Kafka, MySQL, PostgreSQL, MySQL, SystemSchema, TPCH ✦ JDBC, ODBC(in future) for commercial BI tools or Dashboards, like data visualization ✦ Client Protocol: HTTP+JSON, support various languages(Python, Ruby, PHP, Node.js Java(JDBC)…) • ANSI SQL • complex queries, joins, aggregations, various functions(Window functions)
  • 5.
  • 6.
  • 8. Why Presto Fast? 1. In memory parallel computing 2. Pipeline task execution 3. Data local computation with multi-threads 4. Cache hot queries and data 5. JIT compile operator to byte code 6. SQL optimization 7. Other optimization
  • 9. 1. In memory parallel computing • Custom query engine, not MapReduce
  • 11. • select name, count(*) as count from orders as t1 join customer as t2 on t1.custkey = t2.custkey group by name order by count desc limit 100;
  • 12. Sink! TopN! Exchange! Sink! TopN! Final Aggregation! Exchange! Sink! Partial Aggregation! Table Scan! orders! Exchange! Sink! Table Scan! customers! Project! Join! Sink! TopN! Exchange! Sink! TopN! Final Aggregation! Exchange! Sink! Partial Aggregation! 1 thread! 1 thread! Table Scan! orders! Project! Join! Table Scan! customers! Worker2! Sink! TopN! Final Aggregation! Sink! Partial Aggregation! Project! Join! Table Scan! Sink! Exchange! Exchange! Worker1! 2 workers! All tasks in parallel! many splits ! many threads! 1 thread! Sink! orders! Table Scan! customers! Exchange! many splits ! many threads!
  • 13. Prioritized SplitRunner • SQL->Stages, Tasks, Splits • One task fail, query must rerun • Aggregation memory limit
  • 14. 2.Pipeline task execution • In worker, TaskExecutor, split pipeline 1s by default
  • 15. • Operator Pipeline • Page: smallest data processing unit(like RowBatch) • max page size 1MB, max rows: 16*1024 Page Exchange Operator: each client for each split
  • 16. 3. Data local computation with multi-threads • NodeSelector select available nodes(10 nodes default) • Nodes has the same address • If not enough, add nodes in the same rack • If not enough, randomly select nodes in other racks • Select the node with the smallest number of assignments (pending tasks)
  • 17. • 4. Cache hot queries and data ✦ Google Guava loading cache byte code ✦ Cache Objects: Hive database/table/partition, JIT byte code class, functions • 5. JIT compile operator to byte code ✦ Compile ScanFilterAndProjectOperator , FilterAndProjectOperator
  • 18. 6. SQL Optimization • PredicatePushDown • PruneRedundantProjections • PruneUnreferencedOutputs • MergeProjections • LimitPushDown • CanonicalizeExpressions • CountConstantOptimizer • ImplementSampleAsFilter • MetadataQueryOptimizer • SetFlatteningOptimizer • SimplifyExpressions • UnaliasSymbolReferences • WindowFilterPushDown
  • 19. 7. Other Optimization • BlinkDB liked approximate queries • JVM GC Control ✦ JDK1.7 ✦ forcing the code cache evictor make room before the cache fills up • Careful use mem data structure ✦ Airlift slice for efficient heap and off-heap memory(https://github.com/airlift/slice ) ✦ Java future async callback
  • 20. Presto Extensibility • Connectors(Catalogs): Hive, Cassandra, Hive, JMX, Kafka, MySQL, PostgreSQL, System, TPCH • Custom connectors (http://prestodb.io/docs/current/spi/overview.html ): • Service Provider Interface(SPI): • ConnectorMetadata • ConnectorSplitManager • ConnectorRecordSetProvider
  • 21. Presto’s Limitations • No fault tolerance, Unstable • Memory Limitations for aggregations, huge joins • SQL features like: • only CTAS • no support UDF
  • 22. Presto’s Future Presto, Past, Present, and Future by Dain Sundstrom at Facebook, 2014.May • Basic Task Recovery • Huge joins and Group by • Spill to Disk(Implemented), Insert • Create View(Implemented), not compatible with hive • Native Store, Cache Hot data(Implemented) • Security : Authentication, Authorization, Permissions • ODBC Driver • Improve DDL DML
  • 23. References • http://prestodb.io/ • https://github.com/facebook/presto • https://www.facebook.com/notes/facebook-engineering/ presto-interacting-with-petabytes-of-data- at-facebook/10151786197628920