Query Compilation in Impala

Alexander Behm | Software Engineer
May 2014 @ Impala User Group

Flow of a SQL Query
Client → SQL Text → Impala Frontend (Java): Compile Query → Executable Plan → Impala Backend (C++): Execute Query → Query Results → Client
Focus of this talk: Compile Query (Impala Frontend)

Query Compilation
Client → SQL Text → Query Compiler → Executable Plan
Inside the Query Compiler: SQL Parsing → Parse Tree → Semantic Analysis → Parse Tree + Analyzer → Query Planning

Query Parsing
SELECT c1, SUM(c2)
FROM t1 JOIN t2 USING(id)
WHERE c3 > 10 GROUP BY c1
SelectStmt
  SelectList
    ColRef
    AggExpr
      ColRef
  TableRefs
    TableRef
    TableRef
    UsingClause
      ColRef
  WhereClause
    BinaryPredicate
      ColRef
      IntLiteral
  GroupByClause
    ColRef
• Applies SQL grammar, reports syntax errors
• Produces parse tree capturing syntactic structure of query

Semantic Analysis…
• Precondition: Query is syntactically valid. Analysis operates on parse tree.
• Consults table metadata
  • Do t1 and t2 exist? Does c1 exist in t1 or t2 (or both → error)? Does id exist in t1 and t2?
  • Does the user have privileges to SELECT from t1?
• Checks type compatibility of expressions, adds implicit casts
  • c3 > 10 → c3 > cast(10 as bigint)
• SQL rules (semantic, not syntactic)
  • Does c1 appear in the GROUP BY clause?
SELECT c1, SUM(c2)
FROM t1 JOIN t2 USING(id)
WHERE c3 > 10 GROUP BY c1
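
The checks listed on this slide can be sketched in a few lines of Python. This is only an illustration, not Impala's analyzer (which is part of the Java frontend): the `CATALOG` contents, `AnalysisError`, `resolve_column`, and `check_group_by` are hypothetical names and schemas made up for the example.

```python
from dataclasses import dataclass

# Hypothetical catalog (assumption): table name -> set of column names.
CATALOG = {"t1": {"id", "c1", "c3"}, "t2": {"id", "c2"}}


class AnalysisError(Exception):
    """Raised when a query is syntactically valid but semantically wrong."""


def resolve_column(column, tables, catalog=CATALOG):
    """Return the single table that owns an unqualified column reference."""
    for table in tables:
        if table not in catalog:
            raise AnalysisError(f"table does not exist: {table}")
    owners = [t for t in tables if column in catalog[t]]
    if not owners:
        raise AnalysisError(f"unknown column: {column}")
    if len(owners) > 1:
        raise AnalysisError(f"ambiguous column: {column}")
    return owners[0]


@dataclass(frozen=True)
class SelectItem:
    column: str         # column referenced by the select-list item
    is_aggregate: bool  # True for items such as SUM(c2)


def check_group_by(select_list, group_by_cols):
    """Return the non-aggregated select-list columns missing from GROUP BY."""
    grouped = set(group_by_cols)
    return [item.column for item in select_list
            if not item.is_aggregate and item.column not in grouped]


select_list = [SelectItem("c1", False), SelectItem("c2", True)]
print(resolve_column("c1", ["t1", "t2"]))  # t1
print(check_group_by(select_list, ["c1"]))  # []
print(check_group_by(select_list, []))      # ['c1']
```

An empty list from `check_group_by` means the SQL rule is satisfied; any column it returns would be reported as an analysis error.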

… Semantic Analysis
• Expression substitution for views
  • Resolve column references against base tables
• Preparation for Planning
  • Register state in analyzer for correct predicate assignment during planning
    • Register predicates (WHERE, HAVING, ON, USING, etc.)
    • Register outer-joined tables
    • Compute value-transfer graph and equivalence classes for predicate inference
    • (…)
• Postcondition: Query is valid. An executable plan can be produced.
Example: the query over the inline view

SELECT c1, SUM(c2)
FROM (SELECT dept AS c1, revenue AS c2,
month AS c3 FROM t1) AS v
WHERE c3 > 10
GROUP BY c1

is resolved against the base table as

SELECT dept, SUM(revenue)
FROM t1
WHERE month > 10
GROUP BY dept
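
A minimal sketch of the substitution step, assuming the view's alias-to-column mapping is already known. It rewrites expression strings for brevity, whereas a real analyzer would substitute on expression trees; `substitute_view_columns` and `VIEW_ALIASES` are hypothetical names.

```python
import re

# Alias -> base-table column, taken from the inline view in the example.
VIEW_ALIASES = {"c1": "dept", "c2": "revenue", "c3": "month"}


def substitute_view_columns(expr, alias_to_base):
    """Replace every identifier that is a view alias with its base column."""
    def replace(match):
        name = match.group(0)
        # Identifiers that are not aliases (e.g. SUM) are left unchanged.
        return alias_to_base.get(name, name)

    # Match whole identifiers only, so that 'c1' never rewrites part of 'c10'.
    return re.sub(r"[A-Za-z_]\w*", replace, expr)


print(substitute_view_columns("SUM(c2)", VIEW_ALIASES))  # SUM(revenue)
print(substitute_view_columns("c3 > 10", VIEW_ALIASES))  # month > 10
```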

Query Planning: Goals
• Generate executable plan (“tree” of operators)
• Maximize scan locality using HDFS DataNode (DN) block metadata
• Minimize data movement
• Full distribution of operators
• Query operators
  • Scan, HashJoin, HashAggregation, Union, TopN, Exchange

Query Planning: Overview
Semantic Analysis → Parse Tree + Analyzer → Query Planner → Executable Plan
Inside the Query Planner: Walk Parse Tree → Single-node Plan → Parallelize & Fragment

Query Planning: Single-Node Plan
• Four major functions:
1. Parse Tree → Plan Tree
2. Assigns predicates to lowest plan node
3. Optimizes join order
4. Prunes irrelevant columns

Parse Tree → Single-Node Plan Tree
SELECT t1.dept, SUM(t2.revenue) AS revenue
FROM LargeHdfsTable t1
JOIN HugeHdfsTable t2 ON (t1.id1 = t2.id)
JOIN SmallHbaseTable t3 ON (t1.id2 = t3.id)
WHERE t3.category = 'Online' AND t1.id1 > 10
GROUP BY t1.dept
HAVING COUNT(t2.revenue) > 10
ORDER BY revenue LIMIT 10

TopN
  Agg
    HashJoin
      HashJoin
        Scan: t1
        Scan: t2
      Scan: t3

Predicate Assignment & Inference
TopN
  Agg [COUNT(t2.revenue) > 10]
    HashJoin [t1.id2 = t3.id]
      HashJoin [t1.id1 = t2.id]
        Scan: t1 [id1 > 10]
        Scan: t2 [id > 10] (inferred predicate)
      Scan: t3 [category = 'Online']
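
The inference of `id > 10` on t2 can be sketched with equivalence classes built from the equi-join predicates. This is a simplified illustration that assumes inner joins only; it does not model the value-transfer graph that the analyzer uses for outer joins. The function names and the tuple encoding of predicates are hypothetical.

```python
def equivalence_classes(equalities):
    """Group columns connected by equality predicates (union-find)."""
    parent = {}

    def find(col):
        parent.setdefault(col, col)
        while parent[col] != col:
            parent[col] = parent[parent[col]]  # path compression
            col = parent[col]
        return col

    for left, right in equalities:
        parent[find(left)] = find(right)

    classes = {}
    for col in list(parent):
        classes.setdefault(find(col), set()).add(col)
    return list(classes.values())


def infer_predicates(equalities, predicates):
    """Copy each (column, op, literal) predicate to all equivalent columns."""
    known = set(predicates)
    inferred = []
    for cls in equivalence_classes(equalities):
        for col, op, literal in predicates:
            if col not in cls:
                continue
            for other in sorted(cls - {col}):
                candidate = (other, op, literal)
                if candidate not in known:
                    known.add(candidate)
                    inferred.append(candidate)
    return inferred


equalities = [("t1.id1", "t2.id"), ("t1.id2", "t3.id")]
predicates = [("t1.id1", ">", 10), ("t3.category", "=", "Online")]
print(infer_predicates(equalities, predicates))  # [('t2.id', '>', 10)]
```

The inferred predicate can then be assigned to the scan of t2, the lowest plan node that can evaluate it.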

Join-Order Optimization
• Inner joins are commutative and associative
  • Query results correct independent of execution order
  • Query execution costs vary dramatically!
    • Hash table sizes, network transfers, #hash lookups
• Join-order optimization
  • Impala only considers left-deep join trees
    • (Right join input is a table, not another join)
  • Find cheapest valid join order
  • Relies heavily on table and column statistics
  • Limitation: Choice of join order independent of join strategy

Invalid Join Orders
Left-deep join orders for t1, t2, t3 (each plan: HashJoin(HashJoin(first, second), third)):
• Order: t1, t2, t3
• Order: t1, t3, t2
• Order: t2, t1, t3
• Order: t2, t3, t1 (invalid)
• Order: t3, t1, t2
• Order: t3, t2, t1 (invalid)
Invalid orders: no explicit or implicit predicate between t2 and t3

• Impala’s Implementation:
  1. Heuristic
    • Order tables descending by size
    • Best plan typically has largest table on the left (if valid)
  2. Plan enumeration & costing
    • Generate all possible join orders starting from a given left-most table (starting with largest one)
    • Ignore invalid join orders
    • Estimate intermediate result sizes (key!)
    • Choose plan that minimizes intermediate result sizes
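
The two steps above can be sketched as a small enumeration over left-deep orders. This is a toy model, not Impala's planner: the row counts, the per-predicate selectivities, and the cardinality estimate (product of input sizes times selectivity) are assumptions chosen for illustration, and `plan_cost` and `best_left_deep_order` are hypothetical names.

```python
from itertools import permutations


def plan_cost(order, rows, selectivity):
    """Sum of estimated intermediate result sizes; None if the order is invalid."""
    joined = [order[0]]
    cardinality = rows[order[0]]
    total = 0.0
    for table in order[1:]:
        # Join predicates connecting the new table to any table joined so far.
        sels = [selectivity[frozenset((t, table))] for t in joined
                if frozenset((t, table)) in selectivity]
        if not sels:
            return None  # no predicate available: this order is invalid
        cardinality *= rows[table]
        for sel in sels:
            cardinality *= sel
        total += cardinality
        joined.append(table)
    return total


def best_left_deep_order(rows, selectivity):
    """Try left-most tables from largest to smallest; keep the cheapest valid order."""
    for leftmost in sorted(rows, key=rows.get, reverse=True):
        rest = [t for t in rows if t != leftmost]
        candidates = []
        for perm in permutations(rest):
            order = (leftmost,) + perm
            cost = plan_cost(order, rows, selectivity)
            if cost is not None:
                candidates.append((cost, order))
        if candidates:
            return min(candidates)[1]
    return None


# Hypothetical statistics for the three-table example.
rows = {"t1": 1_000_000, "t2": 100_000_000, "t3": 1_000}
selectivity = {frozenset(("t1", "t2")): 1e-6, frozenset(("t1", "t3")): 1e-4}
print(best_left_deep_order(rows, selectivity))  # ('t2', 't1', 't3')
```

With these made-up statistics the largest table t2 ends up left-most, and the order t2, t3, t1 is skipped because no predicate connects t2 and t3.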

Query Planning: Distributed Plans
• Distributed Aggregation
  • Pre-aggregation where data is first materialized
  • Merge-aggregation partitioned by grouping columns
  • Distinct aggregation: additional level of pre- and merge aggregation
• Distributed Top-N
  • Initial Top-N where data is first materialized
  • Final Top-N at coordinator
• Distributed Union
  • Pre-aggregation/top-n placed into plans of each union operand
  • Union-operand plans executed in parallel, merged via exchange
• Above strategies are currently fixed in Impala
  • Independent of column/table stats
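
The two-phase aggregation and Top-N strategies can be simulated in a single process. The sketch below stands in for a cluster with plain lists (one list of rows per node); the function names and sample data are hypothetical, and SUM is the only aggregate modeled.

```python
import heapq
from collections import defaultdict


def pre_aggregate(rows):
    """Phase 1: each node sums its local (key, value) rows."""
    partial = defaultdict(int)
    for key, value in rows:
        partial[key] += value
    return dict(partial)


def merge_aggregate(partials, num_partitions):
    """Phase 2: partial sums are hash-partitioned by grouping key and merged."""
    partitions = [defaultdict(int) for _ in range(num_partitions)]
    for partial in partials:
        for key, value in partial.items():
            # All partial results for one key land in the same partition.
            partitions[hash(key) % num_partitions][key] += value
    merged = {}
    for partition in partitions:
        merged.update(partition)
    return merged


def distributed_top_n(per_node_rows, n):
    """Initial Top-N on every node, final Top-N over the union at the coordinator."""
    initial = [heapq.nlargest(n, rows) for rows in per_node_rows]
    return heapq.nlargest(n, [row for part in initial for row in part])


node_rows = [[("a", 1), ("b", 2), ("a", 3)], [("b", 4), ("c", 5)]]
partials = [pre_aggregate(rows) for rows in node_rows]
print(merge_aggregate(partials, num_partitions=2) == {"a": 4, "b": 6, "c": 5})  # True
print(distributed_top_n([[5, 1, 9], [7, 3], [8, 2]], n=2))  # [9, 8]
```

Pre-aggregation shrinks the data before it crosses the network, which is why it is placed where the data is first materialized.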

Query Planning: Distributed Joins
• Broadcast Join
  • Join is co-located with left input
  • Broadcast right input to all nodes executing join
  • Build hash table on right input, streaming probe from left input
  → Preferred for small right side (relative to left side)
• Partitioned Join
  • Both tables hash-partitioned on join columns
  • Same build/probe procedure as above
  → Preferred for joins where both left and right side are large
• Cost-based decision based on table/column stats
  • Minimize required network transfer
Query Planning: Distributed Plans
Single-Node Plan:
TopN
  Agg
    HashJoin
      HashJoin
        Scan: t2
        Scan: t1
      Scan: t3

Distributed Plan:
TopN (at coordinator)
  Exchange: Merge
    TopN
      MergeAgg
        Exchange: hash t1.custid
          Pre-Agg
            HashJoin
              HashJoin
                Exchange: hash t2.id
                  Scan: t2 (at HDFS DN)
                Exchange: hash t1.id1
                  Scan: t1 (at HDFS DN)
              Exchange: Broadcast
                Scan: t3 (at HBase RS)

Explain Example: TPCDS Q42
SELECT d.d_year, i.i_category_id, i.i_category, SUM(ss_ext_sales_price) AS total_sales
FROM store_sales ss
JOIN date_dim d
ON (ss.ss_sold_date_sk = d.d_date_sk)
JOIN item i
ON (ss.ss_item_sk = i.i_item_sk)
WHERE i.i_manager_id = 1 AND d.d_moy = 12 AND d.d_year = 1998
GROUP BY d.d_year, i.i_category_id, i.i_category
ORDER BY total_sales DESC, d_year, i_category_id, i_category
LIMIT 100

+---------------------------------------------------------------------+
| Explain String |
+---------------------------------------------------------------------+
| Estimated Per-Host Requirements: Memory=3.76GB VCores=3 |
| |
| 12:TOP-N [LIMIT=100] |
| 11:EXCHANGE [PARTITION=UNPARTITIONED] |
| 06:TOP-N [LIMIT=100] |
| 10:AGGREGATE [MERGE FINALIZE] |
| 09:EXCHANGE [PARTITION=HASH(d.d_year,i.i_category_id,i.i_category)] |
| 05:AGGREGATE |
| 04:HASH JOIN [INNER JOIN, BROADCAST] |
| |--08:EXCHANGE [BROADCAST] |
| | 02:SCAN HDFS [tpcds1000gb.item i] |
| | 01:SCAN HDFS [tpcds1000gb.date_dim d] |
| 00:SCAN HDFS [tpcds1000gb.store_sales ss] |
+---------------------------------------------------------------------+
set num_nodes=0;

Conclusion
• Cost-based choice of join order and strategy
  • Critical for performance
  • Relies on table and column stats
• Other plan optimizations currently independent of stats
  • Likely to expand plan choices in the future
  • Likely to increase reliance on stats
• Helpful Impala commands
  • compute stats
  • show table/column stats
  • explain query/insert stmt
  • set explain_level=[0-3]
  • set num_nodes=1 → show single-node plan

Try It Out!
• Questions/comments?
• Download: cloudera.com/impala
• Email: impala-user@cloudera.org
• Join: groups.cloudera.org
