SlideShare une entreprise Scribd logo
Debugging Planning Issues
Using Calcite's Built-in Loggers
Stamatis Zampetakis
Alessandro Solimando
Virtual Meetup
March 15, 2023
Alessandro Solimando @asolimando
Senior Software Engineer @ Rakuten*
Committer of Apache Calcite & Apache Hive
PhD in Data Management, University of Genoa
(* the work presented here was done while at Cloudera)
Stamatis Zampetakis @szampetak
Staff Software Engineer @ Cloudera, Hive query optimizer team
PMC member of Apache Calcite & Apache Hive
PhD in Data Management, INRIA & Paris-Sud University
About us
Outline
● Motivation / Common planning issues
● Calcite’s built-in logging (RuleEventLogger)
○ Configuration (XML/Properties)
○ Output explained
● Hive case studies:
○ Hanging TPC-DS queries
○ OutOfMemoryError (HIVE-25758)
○ Wrong results (HIVE-26722)
● Conclusion
Common Planning Issues
Issue Diagnostic Information
OutOfMemoryError Heap dumps
StackOverflowError Stack traces
Unresponsive server Stack traces
Very slow (or hanging) queries Stack traces
Wrong results Query plans
Query crashes (NPE, AssertionError, ClassCastException) Stack traces
Common Planning Issues
● Many times the root cause lies in rule based transformations
Issue Diagnostic Information
OutOfMemoryError Heap dumps
StackOverflowError Stack traces
Unresponsive server Stack traces
Very slow (or hanging) queries Stack traces
Wrong results Query plans
Query crashes (NPE, AssertionError, ClassCastException) Stack traces
Common Planning Issues
● Many times the root cause lies in rule based transformations
● Good logs are key for finding this root cause
Issue Diagnostic Information
OutOfMemoryError Heap dumps
StackOverflowError Stack traces
Unresponsive server Stack traces
Very slow (or hanging) queries Stack traces
Wrong results Query plans
Query crashes (NPE, AssertionError, ClassCastException) Stack traces
Common Planning Issues
● Interactive debugging not always feasible:
○ CI/CU environment with restricted access
○ Data specific problem
● Debugging planning issues interactively (without logs) is difficult:
○ Large number of transformations
○ Complex transformation logic
○ Lots of intermediate objects
○ Repeated occurrences of transformations
Built-in logging: RuleEventLogger
RuleEventLogger - XML Configuration
https://logging.apache.org/log4j/2.x/manual/configuration.html#automatic-configuration
Calcite unit tests: Add/Modify core/src/test/resources/log4j2-test.xml file
<Configuration>
<Appenders>
<Console name="A1" target="SYSTEM_OUT">
<PatternLayout
pattern="%m%n"/>
</Console>
</Appenders>
<Loggers>
<Root level="INFO">
<AppenderRef ref="A1"/>
</Root>
<logger name="org.apache.calcite.plan.RelOptPlanner"level="DEBUG">
<MarkerFilter marker="FULL_PLAN" onMatch="DENY" onMismatch="NEUTRAL"/>
</logger>
</Loggers>
</Configuration>
RuleEventLogger - XML Configuration
https://logging.apache.org/log4j/2.x/manual/configuration.html#automatic-configuration
Calcite unit tests: Add/Modify core/src/test/resources/log4j2-test.xml file
<Configuration>
<Appenders>
<Console name="A1" target="SYSTEM_OUT">
<PatternLayout
pattern="%m%n"/>
</Console>
</Appenders>
<Loggers>
<Root level="INFO">
<AppenderRef ref="A1"/>
</Root>
<logger name="org.apache.calcite.plan.RelOptPlanner"level="DEBUG">
<MarkerFilter marker="FULL_PLAN" onMatch="DENY" onMismatch="NEUTRAL"/>
</logger>
</Loggers>
</Configuration>
RuleEventLogger - XML Configuration
https://logging.apache.org/log4j/2.x/manual/configuration.html#automatic-configuration
Calcite unit tests: Add/Modify core/src/test/resources/log4j2-test.xml file
<Configuration>
<Appenders>
<Console name="A1" target="SYSTEM_OUT">
<PatternLayout
pattern="%m%n"/>
</Console>
</Appenders>
<Loggers>
<Root level="INFO">
<AppenderRef ref="A1"/>
</Root>
<logger name="org.apache.calcite.plan.RelOptPlanner"level="DEBUG">
<MarkerFilter marker="FULL_PLAN" onMatch="DENY" onMismatch="NEUTRAL"/>
</logger>
</Loggers>
</Configuration>
CALCITE-4704 (1.29.0)
CALCITE-4991 (1.30.0)
RuleEventLogger - Properties Configuration
appenders = A1
appender.A1.type = Console
appender.A1.name = A1
appender.A1.target = SYSTEM_OUT
appender.A1.layout.type = PatternLayout
appender.A1.layout.pattern = %m%n
loggers = CBORuleLogger
rootLogger.level = INFO
rootLogger.appenderRefs = A1
rootLogger.appenderRef.A1.ref = A1
logger.CBORuleLogger.name = org.apache.calcite.plan.RelOptPlanner
logger.CBORuleLogger.level = DEBUG
logger.CBORuleLogger.filter.marker.type= MarkerFilter
logger.CBORuleLogger.filter.marker.marker= FULL_PLAN
logger.CBORuleLogger.filter.marker.onMatch= DENY
logger.CBORuleLogger.filter.marker.onMismatch= NEUTRAL
https://logging.apache.org/log4j/2.x/manual/configuration.html#automatic-configuration
Calcite unit tests: Add/Modify core/src/test/resources/log4j2-test.properties file
CALCITE-4704 (1.29.0)
CALCITE-4991 (1.30.0)
RuleEventLogger - Properties Configuration (HIVE)
appenders = A1
appender.A1.type = Console
appender.A1.name = A1
appender.A1.target = SYSTEM_OUT
appender.A1.layout.type = PatternLayout
appender.A1.layout.pattern = %m%n
loggers = …,CBORuleLogger
rootLogger.level = INFO
rootLogger.appenderRefs = A1
rootLogger.appenderRef.A1.ref = A1
logger.CBORuleLogger.name = org.apache.hadoop.hive.ql.optimizer.calcite.RuleEventLogger
logger.CBORuleLogger.level = DEBUG
logger.CBORuleLogger.filter.marker.type= MarkerFilter
logger.CBORuleLogger.filter.marker.marker= FULL_PLAN
logger.CBORuleLogger.filter.marker.onMatch= DENY
logger.CBORuleLogger.filter.marker.onMismatch= NEUTRAL
Hive tests: Modify data/conf/hive-log4j2.properties file
HIVE-25816
RuleEventLogger - Example
./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3
logger.CBORuleLogger.filter.marker.onMatch= DENY
SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
RuleEventLogger - Output Explained
./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3
RelOptRulesTest > testAggregateRemove3() STANDARD_OUT
call#0: Apply rule [AggregateRemoveRule] to [rel#11:LogicalAggregate]
call#0: Rule [AggregateRemoveRule] produced [rel#15:LogicalProject]
call#1: Apply rule [ProjectMergeRule] to [rel#13:LogicalProject,rel#15:LogicalProject]
call#1: Rule [ProjectMergeRule] produced [rel#17:LogicalProject]
call#2: Apply rule [ProjectMergeRule] to [rel#17:LogicalProject,rel#9:LogicalProject]
call#2: Rule [ProjectMergeRule] produced [rel#19:LogicalProject]
logger.CBORuleLogger.filter.marker.onMatch= DENY
SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
RuleEventLogger - Output Explained
./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3
RelOptRulesTest > testAggregateRemove3() STANDARD_OUT
call#0: Apply rule [AggregateRemoveRule] to [rel#11:LogicalAggregate]
call#0: Rule [AggregateRemoveRule] produced [rel#15:LogicalProject]
call#1: Apply rule [ProjectMergeRule] to [rel#13:LogicalProject,rel#15:LogicalProject]
call#1: Rule [ProjectMergeRule] produced [rel#17:LogicalProject]
call#2: Apply rule [ProjectMergeRule] to [rel#17:LogicalProject,rel#9:LogicalProject]
call#2: Rule [ProjectMergeRule] produced [rel#19:LogicalProject]
logger.CBORuleLogger.filter.marker.onMatch= DENY
SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
RuleEventLogger - Output Explained
./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3
RelOptRulesTest > testAggregateRemove3() STANDARD_OUT
call#0: Apply rule [AggregateRemoveRule] to [rel#11:LogicalAggregate]
call#0: Rule [AggregateRemoveRule] produced [rel#15:LogicalProject]
call#1: Apply rule [ProjectMergeRule] to [rel#13:LogicalProject,rel#15:LogicalProject]
call#1: Rule [ProjectMergeRule] produced [rel#17:LogicalProject]
call#2: Apply rule [ProjectMergeRule] to [rel#17:LogicalProject,rel#9:LogicalProject]
call#2: Rule [ProjectMergeRule] produced [rel#19:LogicalProject]
logger.CBORuleLogger.filter.marker.onMatch= DENY
SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
RuleEventLogger - Output Explained
./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3
RelOptRulesTest > testAggregateRemove3() STANDARD_OUT
call#0: Apply rule [AggregateRemoveRule] to [rel#11:LogicalAggregate]
call#0: Rule [AggregateRemoveRule] produced [rel#15:LogicalProject]
call#1: Apply rule [ProjectMergeRule] to [rel#13:LogicalProject,rel#15:LogicalProject]
call#1: Rule [ProjectMergeRule] produced [rel#17:LogicalProject]
call#2: Apply rule [ProjectMergeRule] to [rel#17:LogicalProject,rel#9:LogicalProject]
call#2: Rule [ProjectMergeRule] produced [rel#19:LogicalProject]
logger.CBORuleLogger.filter.marker.onMatch= DENY
SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
RuleEventLogger - Output Explained
./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3
RelOptRulesTest > testAggregateRemove3() STANDARD_OUT
call#0: Apply rule [AggregateRemoveRule] to [rel#11:LogicalAggregate]
call#0: Rule [AggregateRemoveRule] produced [rel#15:LogicalProject]
call#1: Apply rule [ProjectMergeRule] to [rel#13:LogicalProject,rel#15:LogicalProject]
call#1: Rule [ProjectMergeRule] produced [rel#17:LogicalProject]
call#2: Apply rule [ProjectMergeRule] to [rel#17:LogicalProject,rel#9:LogicalProject]
call#2: Rule [ProjectMergeRule] produced [rel#19:LogicalProject]
logger.CBORuleLogger.filter.marker.onMatch= DENY
SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
RuleEventLogger - Output Explained
./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3
RelOptRulesTest > testAggregateRemove3() STANDARD_OUT
call#0: Apply rule [AggregateRemoveRule] to [rel#11:LogicalAggregate]
call#0: Full plan for rule input [rel#11:LogicalAggregate]:
LogicalAggregate(group=[{0, 1}], EXPR$1=[COUNT($2)])
LogicalProject(EMPNO=[$0], DEPTNO=[$7], MGR=[$3])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
call#0: Rule [AggregateRemoveRule] produced [rel#15:LogicalProject]
call#0: Full plan for [rel#15:LogicalProject]:
LogicalProject(EMPNO=[$0], DEPTNO=[$1], $f2=[CASE(IS NOT NULL($2), 1:BIGINT, 0:BIGINT)])
LogicalProject(EMPNO=[$0], DEPTNO=[$7], MGR=[$3])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
logger.CBORuleLogger.filter.marker.onMatch= ACCEPT
SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
RuleEventLogger - Output Explained
./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3
RelOptRulesTest > testAggregateRemove3() STANDARD_OUT
call#0: Apply rule [AggregateRemoveRule] to [rel#11:LogicalAggregate]
call#0: Full plan for rule input [rel#11:LogicalAggregate]:
LogicalAggregate(group=[{0, 1}], EXPR$1=[COUNT($2)])
LogicalProject(EMPNO=[$0], DEPTNO=[$7], MGR=[$3])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
call#0: Rule [AggregateRemoveRule] produced [rel#15:LogicalProject]
call#0: Full plan for [rel#15:LogicalProject]:
LogicalProject(EMPNO=[$0], DEPTNO=[$1], $f2=[CASE(IS NOT NULL($2), 1:BIGINT, 0:BIGINT)])
LogicalProject(EMPNO=[$0], DEPTNO=[$7], MGR=[$3])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
logger.CBORuleLogger.filter.marker.onMatch= ACCEPT
SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
RuleEventLogger - Output Explained
./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3
RelOptRulesTest > testAggregateRemove3() STANDARD_OUT
call#1: Apply rule [ProjectMergeRule] to [rel#13:LogicalProject,rel#15:LogicalProject]
call#1: Full plan for rule input [rel#13:LogicalProject]:
LogicalProject(EMPNO=[$0], EXPR$1=[$2])
LogicalProject(EMPNO=[$0], DEPTNO=[$1], $f2=[CASE(IS NOT NULL($2), 1:BIGINT, 0:BIGINT)])
LogicalProject(EMPNO=[$0], DEPTNO=[$7], MGR=[$3])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
call#1: Full plan for rule input [rel#15:LogicalProject]:
...
call#1: Rule [ProjectMergeRule] produced [rel#17:LogicalProject]
call#1: Full plan for [rel#17:LogicalProject]:
LogicalProject(EMPNO=[$0], EXPR$1=[CASE(IS NOT NULL($2), 1:BIGINT, 0:BIGINT)])
LogicalProject(EMPNO=[$0], DEPTNO=[$7], MGR=[$3])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
logger.CBORuleLogger.filter.marker.onMatch= ACCEPT
SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
RuleEventLogger - Output Explained
./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3
RelOptRulesTest > testAggregateRemove3() STANDARD_OUT
call#2: Apply rule [ProjectMergeRule] to [rel#17:LogicalProject,rel#9:LogicalProject]
call#2: Full plan for rule input [rel#17:LogicalProject]:
LogicalProject(EMPNO=[$0], EXPR$1=[CASE(IS NOT NULL($2), 1:BIGINT, 0:BIGINT)])
LogicalProject(EMPNO=[$0], DEPTNO=[$7], MGR=[$3])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
call#2: Full plan for rule input [rel#9:LogicalProject]:
LogicalProject(EMPNO=[$0], DEPTNO=[$7], MGR=[$3])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
call#2: Rule [ProjectMergeRule] produced [rel#19:LogicalProject]
call#2: Full plan for [rel#19:LogicalProject]:
LogicalProject(EMPNO=[$0], EXPR$1=[CASE(IS NOT NULL($3), 1:BIGINT, 0:BIGINT)])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
logger.CBORuleLogger.filter.marker.onMatch= ACCEPT
SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
Hive Case Studies
Hanging TPC-DS queries
● Context: Upgrade Calcite version from 1.25.0 to 1.33.0
● Symptom: TPCDS queries hanging
Hanging TPC-DS queries
● Context: Upgrade Calcite version from 1.25.0 to 1.33.0
● Symptom: TPCDS queries hanging (e.g., query13)
select avg(ss_quantity),
avg(ss_ext_sales_price),
avg(ss_ext_wholesale_cost),
sum(ss_ext_wholesale_cost)
from store_sales
,store
,customer_demographics
,household_demographics
,customer_address
,date_dim
where s_store_sk = ss_store_sk and ss_sold_date_sk = d_date_sk and d_year = 2001 and (
(ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = 'M'
and cd_education_status = '4 yr Degree' and ss_sales_price between 100.00 and 150.00 and hd_dep_count = 3) or
(ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = 'D'
and cd_education_status = 'Primary' and ss_sales_price between 50.00 and 100.00 and hd_dep_count = 1) or
(ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = 'U'
and cd_education_status = 'Advanced Degree' and ss_sales_price between 150.00 and 200.00 and hd_dep_count = 1))
and(
(ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('KY', 'GA', 'NM')
and ss_net_profit between 100 and 200) or
(ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('MT', 'OR', 'IN')
and ss_net_profit between 150 and 300) or
(ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('WI', 'MO', 'WV')
and ss_net_profit between 50 and 250));
Hanging TPC-DS queries
● Context: Upgrade Calcite version from 1.25.0 to 1.33.0
● Symptom: TPCDS queries hanging (e.g., query13)
select avg(ss_quantity),
avg(ss_ext_sales_price),
avg(ss_ext_wholesale_cost),
sum(ss_ext_wholesale_cost)
from store_sales
,store
,customer_demographics
,household_demographics
,customer_address
,date_dim
where s_store_sk = ss_store_sk and ss_sold_date_sk = d_date_sk and d_year = 2001 and (
(ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = 'M'
and cd_education_status = '4 yr Degree' and ss_sales_price between 100.00 and 150.00 and hd_dep_count = 3) or
(ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = 'D'
and cd_education_status = 'Primary' and ss_sales_price between 50.00 and 100.00 and hd_dep_count = 1) or
(ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = 'U'
and cd_education_status = 'Advanced Degree' and ss_sales_price between 150.00 and 200.00 and hd_dep_count = 1))
and(
(ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('KY', 'GA', 'NM')
and ss_net_profit between 100 and 200) or
(ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('MT', 'OR', 'IN')
and ss_net_profit between 150 and 300) or
(ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('WI', 'MO', 'WV')
and ss_net_profit between 50 and 250));
Hanging TPC-DS queries
● Why does it take so long?
● Where is it stuck?
● What does it do?
● Gather information:
○ Profile the application (async-profiler)
○ Collect stack traces (jstack)
○ Check the logs
Hanging TPC-DS queries
● Why does it take so long?
● Where is it stuck?
● What does it do?
● Gather information:
○ Profile the application (async-profiler)
○ Collect stack traces (jstack)
○ Check the logs
java.lang.Thread.State: RUNNABLE
at java.util.TreeMap.compare(TreeMap.java:1294)
at java.util.TreeMap.put(TreeMap.java:538)
at org.apache.hive.com.google.common.collect.TreeRangeSet.replaceRangeWithSameLowerBound(TreeRangeSet.java:272)
at org.apache.hive.com.google.common.collect.TreeRangeSet.add(TreeRangeSet.java:222)
at org.apache.hive.com.google.common.collect.RangeSet.addAll(RangeSet.java:225)
at org.apache.hive.com.google.common.collect.AbstractRangeSet.addAll(AbstractRangeSet.java:64)
at org.apache.hive.com.google.common.collect.TreeRangeSet.addAll(TreeRangeSet.java:41)
at org.apache.calcite.rex.RexSimplify$RexSargBuilder.addSarg(RexSimplify.java:3056)
at org.apache.calcite.rex.RexSimplify$SargCollector.accept2b(RexSimplify.java:2894)
at org.apache.calcite.rex.RexSimplify$SargCollector.accept2(RexSimplify.java:2812)
at org.apache.calcite.rex.RexSimplify$SargCollector.accept_(RexSimplify.java:2793)
at org.apache.calcite.rex.RexSimplify$SargCollector.accept(RexSimplify.java:2778)
at org.apache.calcite.rex.RexSimplify$SargCollector.access$400(RexSimplify.java:2761)
at org.apache.calcite.rex.RexSimplify.lambda$simplifyAnd$3(RexSimplify.java:1488)
at org.apache.calcite.rex.RexSimplify$$Lambda$1099/1247334493.accept(Unknown Source)
at java.util.ArrayList.forEach(ArrayList.java:1259)
at org.apache.calcite.rex.RexSimplify.simplifyAnd(RexSimplify.java:1488)
at org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:279)
at org.apache.calcite.rex.RexSimplify.simplifyUnknownAs(RexSimplify.java:248)
at org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:223)
at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:299)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841)
at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:292)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841)
at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:292)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841)
java.lang.Thread.State: RUNNABLE
at java.util.TreeMap.compare(TreeMap.java:1294)
at java.util.TreeMap.put(TreeMap.java:538)
at org.apache.hive.com.google.common.collect.TreeRangeSet.replaceRangeWithSameLowerBound(TreeRangeSet.java:272)
at org.apache.hive.com.google.common.collect.TreeRangeSet.add(TreeRangeSet.java:222)
at org.apache.hive.com.google.common.collect.RangeSet.addAll(RangeSet.java:225)
at org.apache.hive.com.google.common.collect.AbstractRangeSet.addAll(AbstractRangeSet.java:64)
at org.apache.hive.com.google.common.collect.TreeRangeSet.addAll(TreeRangeSet.java:41)
at org.apache.calcite.rex.RexSimplify$RexSargBuilder.addSarg(RexSimplify.java:3056)
at org.apache.calcite.rex.RexSimplify$SargCollector.accept2b(RexSimplify.java:2894)
at org.apache.calcite.rex.RexSimplify$SargCollector.accept2(RexSimplify.java:2812)
at org.apache.calcite.rex.RexSimplify$SargCollector.accept_(RexSimplify.java:2793)
at org.apache.calcite.rex.RexSimplify$SargCollector.accept(RexSimplify.java:2778)
at org.apache.calcite.rex.RexSimplify$SargCollector.access$400(RexSimplify.java:2761)
at org.apache.calcite.rex.RexSimplify.lambda$simplifyAnd$3(RexSimplify.java:1488)
at org.apache.calcite.rex.RexSimplify$$Lambda$1099/1247334493.accept(Unknown Source)
at java.util.ArrayList.forEach(ArrayList.java:1259)
at org.apache.calcite.rex.RexSimplify.simplifyAnd(RexSimplify.java:1488)
at org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:279)
at org.apache.calcite.rex.RexSimplify.simplifyUnknownAs(RexSimplify.java:248)
at org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:223)
at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:299)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841)
at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:292)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841)
at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:292)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841)
Depth:~ 2K
Lines
java.lang.Thread.State: RUNNABLE
at java.util.TreeMap.compare(TreeMap.java:1294)
at java.util.TreeMap.put(TreeMap.java:538)
at org.apache.hive.com.google.common.collect.TreeRangeSet.replaceRangeWithSameLowerBound(TreeRangeSet.java:272)
at org.apache.hive.com.google.common.collect.TreeRangeSet.add(TreeRangeSet.java:222)
at org.apache.hive.com.google.common.collect.RangeSet.addAll(RangeSet.java:225)
at org.apache.hive.com.google.common.collect.AbstractRangeSet.addAll(AbstractRangeSet.java:64)
at org.apache.hive.com.google.common.collect.TreeRangeSet.addAll(TreeRangeSet.java:41)
at org.apache.calcite.rex.RexSimplify$RexSargBuilder.addSarg(RexSimplify.java:3056)
at org.apache.calcite.rex.RexSimplify$SargCollector.accept2b(RexSimplify.java:2894)
at org.apache.calcite.rex.RexSimplify$SargCollector.accept2(RexSimplify.java:2812)
at org.apache.calcite.rex.RexSimplify$SargCollector.accept_(RexSimplify.java:2793)
at org.apache.calcite.rex.RexSimplify$SargCollector.accept(RexSimplify.java:2778)
at org.apache.calcite.rex.RexSimplify$SargCollector.access$400(RexSimplify.java:2761)
at org.apache.calcite.rex.RexSimplify.lambda$simplifyAnd$3(RexSimplify.java:1488)
at org.apache.calcite.rex.RexSimplify$$Lambda$1099/1247334493.accept(Unknown Source)
at java.util.ArrayList.forEach(ArrayList.java:1259)
at org.apache.calcite.rex.RexSimplify.simplifyAnd(RexSimplify.java:1488)
at org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:279)
at org.apache.calcite.rex.RexSimplify.simplifyUnknownAs(RexSimplify.java:248)
at org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:223)
at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:299)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841)
at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:292)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841)
at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:292)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841)
Depth:~ 2K
Lines
From stack + code we can infer that the plan has more than 1K nested Filter operators
java.lang.Thread.State: RUNNABLE
at java.util.TreeMap.compare(TreeMap.java:1294)
at java.util.TreeMap.put(TreeMap.java:538)
at org.apache.hive.com.google.common.collect.TreeRangeSet.replaceRangeWithSameLowerBound(TreeRangeSet.java:272)
at org.apache.hive.com.google.common.collect.TreeRangeSet.add(TreeRangeSet.java:222)
at org.apache.hive.com.google.common.collect.RangeSet.addAll(RangeSet.java:225)
at org.apache.hive.com.google.common.collect.AbstractRangeSet.addAll(AbstractRangeSet.java:64)
at org.apache.hive.com.google.common.collect.TreeRangeSet.addAll(TreeRangeSet.java:41)
at org.apache.calcite.rex.RexSimplify$RexSargBuilder.addSarg(RexSimplify.java:3056)
at org.apache.calcite.rex.RexSimplify$SargCollector.accept2b(RexSimplify.java:2894)
at org.apache.calcite.rex.RexSimplify$SargCollector.accept2(RexSimplify.java:2812)
at org.apache.calcite.rex.RexSimplify$SargCollector.accept_(RexSimplify.java:2793)
at org.apache.calcite.rex.RexSimplify$SargCollector.accept(RexSimplify.java:2778)
at org.apache.calcite.rex.RexSimplify$SargCollector.access$400(RexSimplify.java:2761)
at org.apache.calcite.rex.RexSimplify.lambda$simplifyAnd$3(RexSimplify.java:1488)
at org.apache.calcite.rex.RexSimplify$$Lambda$1099/1247334493.accept(Unknown Source)
at java.util.ArrayList.forEach(ArrayList.java:1259)
at org.apache.calcite.rex.RexSimplify.simplifyAnd(RexSimplify.java:1488)
at org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:279)
at org.apache.calcite.rex.RexSimplify.simplifyUnknownAs(RexSimplify.java:248)
at org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:223)
at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:299)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841)
at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:292)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841)
at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:292)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source)
at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source)
at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841)
Depth:~ 2K
Lines
From stack + code we can infer that the plan has more than 1K nested Filter operators
What creates the operators? Why?
Hanging TPC-DS queries - Logs to the rescue
grep -A 10 "Rule.*produced" hive.log
Hanging TPC-DS queries - Logs to the rescue
2022-10-07T05:52:23,575 DEBUG calcite.RuleEventLogger: call#1: Rule [HivePreFilteringRule] produced [rel#84:HiveFilter]
2022-10-07T05:52:23,576 DEBUG calcite.RuleEventLogger: call#1: Full plan for [rel#84:HiveFilter]:
HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveTableScan(table=[[default, store_sales]], table:alias=[store_sales])
HiveTableScan(table=[[default, store]], table:alias=[store])
--
2022-10-07T05:52:23,601 DEBUG calcite.RuleEventLogger: call#2: Rule [HivePreFilteringRule] produced [rel#89:HiveFilter]
2022-10-07T05:52:23,601 DEBUG calcite.RuleEventLogger: call#2: Full plan for [rel#89:HiveFilter]:
HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveTableScan(table=[[default, store_sales]], table:alias=[store_sales])
--
2022-10-07T05:52:23,610 DEBUG calcite.RuleEventLogger: call#3: Rule [HivePreFilteringRule] produced [rel#94:HiveFilter]
2022-10-07T05:52:23,610 DEBUG calcite.RuleEventLogger: call#3: Full plan for [rel#94:HiveFilter]:
HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
Hanging TPC-DS queries - Logs to the rescue
2022-10-07T05:52:23,575 DEBUG calcite.RuleEventLogger: call#1: Rule [HivePreFilteringRule] produced [rel#84:HiveFilter]
2022-10-07T05:52:23,576 DEBUG calcite.RuleEventLogger: call#1: Full plan for [rel#84:HiveFilter]:
HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveTableScan(table=[[default, store_sales]], table:alias=[store_sales])
HiveTableScan(table=[[default, store]], table:alias=[store])
--
2022-10-07T05:52:23,601 DEBUG calcite.RuleEventLogger: call#2: Rule [HivePreFilteringRule] produced [rel#89:HiveFilter]
2022-10-07T05:52:23,601 DEBUG calcite.RuleEventLogger: call#2: Full plan for [rel#89:HiveFilter]:
HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveTableScan(table=[[default, store_sales]], table:alias=[store_sales])
--
2022-10-07T05:52:23,610 DEBUG calcite.RuleEventLogger: call#3: Rule [HivePreFilteringRule] produced [rel#94:HiveFilter]
2022-10-07T05:52:23,610 DEBUG calcite.RuleEventLogger: call#3: Full plan for [rel#94:HiveFilter]:
HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
Hanging TPC-DS queries - Root cause
2022-10-07T05:52:23,575 DEBUG calcite.RuleEventLogger: call#1: Rule [HivePreFilteringRule] produced [rel#84:HiveFilter]
2022-10-07T05:52:23,576 DEBUG calcite.RuleEventLogger: call#1: Full plan for [rel#84:HiveFilter]:
HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveTableScan(table=[[default, store_sales]], table:alias=[store_sales])
HiveTableScan(table=[[default, store]], table:alias=[store])
--
2022-10-07T05:52:23,601 DEBUG calcite.RuleEventLogger: call#2: Rule [HivePreFilteringRule] produced [rel#89:HiveFilter]
2022-10-07T05:52:23,601 DEBUG calcite.RuleEventLogger: call#2: Full plan for [rel#89:HiveFilter]:
HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveTableScan(table=[[default, store_sales]], table:alias=[store_sales])
--
2022-10-07T05:52:23,610 DEBUG calcite.RuleEventLogger: call#3: Rule [HivePreFilteringRule] produced [rel#94:HiveFilter]
2022-10-07T05:52:23,610 DEBUG calcite.RuleEventLogger: call#3: Full plan for [rel#94:HiveFilter]:
HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '),
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
HivePreFilteringRule matches infinitely and creates identical filters multiple times
Hanging TPC-DS queries - Interactive debugging
OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U'))
Hanging TPC-DS queries - Interactive debugging
OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U'))
// 3. If the new conjuncts are already present in the plan, we bail out
final List<RexNode> newConjuncts =
HiveCalciteUtil.getPredsNotPushedAlready(filter.getInput(),operandsToPushDown);
Hanging TPC-DS queries - Interactive debugging
OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U'))
SEARCH($62, Sarg[_UTF-16LE'D', _UTF-16LE'M', _UTF-16LE'U']:CHAR(1))
// 3. If the new conjuncts are already present in the plan, we bail out
final List<RexNode> newConjuncts =
HiveCalciteUtil.getPredsNotPushedAlready(filter.getInput(),operandsToPushDown);
mq.getPulledUpPredicates(inp).pulledUpPredicates
Hanging TPC-DS queries - Interactive debugging
OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U'))
SEARCH($62, Sarg[_UTF-16LE'D', _UTF-16LE'M', _UTF-16LE'U']:CHAR(1))
Hanging TPC-DS queries - Interactive debugging
OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U'))
SEARCH($62, Sarg[_UTF-16LE'D', _UTF-16LE'M', _UTF-16LE'U']:CHAR(1))
CALCITE-4173 Add internal SEARCH operator and Sarg literal, replacing use of IN in RexCall (1.26.0)
CALCITE-5036 RelMetadataQuery#getPulledUpPredicates support to analyze constant key for the operator of
IS_NOT_DISTINCT_FROM (1.31.0)
Check What CBO Rules Are Applied
Command: grep --no-filename "produced" hive.log | cut -c 100-
Output format: call#$X: Rule [$RULE_DESCRIPTION] produced [rel#$Y:$REL_KIND]
Sample output:
call#1: Rule [HivePreFilteringRule] produced [rel#73:HiveFilter]
call#12: Rule [ReduceExpressionsRule(Filter)] produced [rel#76:HiveFilter]
call#14: Rule [FilterCondition] produced [rel#78:HiveFilter]
call#17: Rule [ReduceExpressionsRule(Filter)] produced [rel#80:HiveFilter]
call#21: Rule [FilterCondition] produced [rel#82:HiveFilter]
call#29: Rule [HiveProjectFilterPullUpConstantsRule] produced [rel#84:HiveProject]
call#46: Rule [HiveJoinAddNotNullRule] produced [rel#89:HiveJoin]
Once you have the rel number ($Y), you can look it up in the logs to see it printed in full
If you are interested in a particular rule application, search for the call ($X), input and output rels are printed close to the
corresponding “produced” line
OutOfMemory at Planning Time
● We look for a sequence of rule applications which loops from a certain point
of the query planning process
● Run “EXPLAIN” against the offending query and look at the rules which are
invoked in the logs to identify the loop
● Once identified, find the first occurrence of applications of this list, and start
analyzing the plans generated right before/after it
● Note: the rel triggering the loop might also be created outside this sequence
OutOfMemory at Planning Time - Example (HIVE-25758)
SELECT c.month, d.con_usd
FROM
(SELECT
cast(regexp_replace(substr(add_months(from_unixtime(unix_timestamp(),
'yyyy-MM-dd'), -1), 1, 7), '-', '') AS int) AS month
FROM test1
UNION ALL
SELECT month
FROM test2
WHERE month = 202110
) c
JOIN test3 d ON c.month = d.mth;
● Symptoms: OOM due to the creation of ever increasingly complex filters
OutOfMemory at Planning Time - Example (HIVE-25758)
HiveJoin(condition=[=($0, $1)], joinType=[inner], algorithm=[none], cost=[not
available])
HiveUnion(all=[true])
HiveProject(month=[CAST(regexp_replace(...):INTEGER**])
HiveFilter(condition=[IS NOT NULL(CAST(regexp_replace(...):INTEGER)])
HiveTableScan(table=[[default, test1]], table:alias=[test1])
HiveProject(month=[CAST(202110):INTEGER])
HiveFilter(condition=[=($0, 202110)])
HiveTableScan(table=[[default, test2]], table:alias=[test2])
HiveFilter(condition=[IS NOT NULL($0)])
HiveTableScan(table=[[default, test3]], table:alias=[d])
** Abbreviation of “CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP,
_UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647)
CHARACTER SET "UTF-16LE"), -1), 1, 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET
"UTF-16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET
"UTF-16LE")):INTEGER”
OutOfMemory at Planning Time - Example (HIVE-25758)
call#198: Rule [HiveJoinPushTransitivePredicatesRule] produced [rel#136:HiveJoin]
call#212: Rule [HiveFilterSetOpTransposeRule] produced [rel#142:HiveUnion]
call#223: Rule [HiveFilterProjectTransposeRule] produced [rel#148:HiveProject]
call#228: Rule [HiveFilterMergeRule] produced [rel#152:HiveFilter]
call#239: Rule [HiveFilterProjectTransposeRule] produced [rel#153:HiveProject]
call#258: Rule [HiveJoinPushTransitivePredicatesRule] produced [rel#155:HiveJoin]
call#273: Rule [HiveFilterProjectTransposeRule] produced [rel#160:HiveProject]
call#278: Rule [HiveFilterMergeRule] produced [rel#164:HiveFilter]
call#283: Rule [ReduceExpressionsRule(Filter)] produced [rel#166:HiveFilter]
RHS → LHS
LHS → RHS
OutOfMemory at Planning Time - Example (HIVE-25758)
HiveJoin(condition=[=($0, $1)], joinType=[inner], [...])
HiveUnion(all=[true])
HiveProject(month=[CAST(regexp_replace(...)):INTEGER**])
HiveFilter(condition=[IS NOT NULL(CAST(regexp_replace(...):INTEGER)])
HiveTableScan(table=[[default, test1]], table:alias=[test1])
HiveProject(month=[CAST(202110):INTEGER])
HiveFilter(condition=[=($0, 202110)])
HiveTableScan(table=[[default, test2]], table:alias=[test2])
HiveFilter(condition=[IS NOT NULL($0)])
HiveTableScan(table=[[default, test3]], table:alias=[d])
** Abbreviation of “CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP,
_UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647)
CHARACTER SET "UTF-16LE"), -1), 1, 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET
"UTF-16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET
"UTF-16LE")):INTEGER”
OutOfMemory at Planning Time - Example (HIVE-25758)
HiveJoin(condition=[=($0, $1)], joinType=[inner], [...])
HiveUnion(all=[true])
HiveProject(month=[CAST(regexp_replace(...)):INTEGER])
HiveFilter(condition=[IS NOT NULL(CAST(regexp_replace(...)):INTEGER)])
HiveTableScan(table=[[default, test1]], table:alias=[test1])
HiveProject(month=[CAST(202110):INTEGER])
HiveFilter(condition=[=($0, 202110)])
HiveTableScan(table=[[default, test2]], table:alias=[test2])
HiveFilter(condition=[OR(=($0, CAST(regexp_replace(...)):INTEGER), =($0,
202110))])
HiveFilter(condition=[IS NOT NULL($0)])
HiveTableScan(table=[[default, test3]], table:alias=[d])
call#258: Rule [HiveJoinPushTransitivePredicatesRule] produced [rel#155:HiveJoin]
OutOfMemory at Planning Time - Example (HIVE-25758)
HiveJoin(condition=[=($0, $1)], joinType=[inner], [...])
HiveUnion(all=[true])
HiveProject(month=[CAST(regexp_replace(...)):INTEGER])
HiveFilter(condition=[IS NOT NULL(CAST(regexp_replace(...)):INTEGER)])
HiveTableScan(table=[[default, test1]], table:alias=[test1])
HiveProject(month=[CAST(202110):INTEGER])
HiveFilter(condition=[=($0, 202110)])
HiveTableScan(table=[[default, test2]], table:alias=[test2])
HiveFilter(condition=[AND(OR(=($0, CAST(regexp_replace(...)):INTEGER), =($0,
202110)), IS NOT NULL($0))])
HiveTableScan(table=[[default, test3]], table:alias=[d])
call#278: Rule [HiveFilterMergeRule] produced [rel#164:HiveFilter]
OutOfMemory at Planning Time - Example (HIVE-25758)
HiveJoin(condition=[=($0, $1)], joinType=[inner], [...])
HiveUnion(all=[true])
HiveProject(month=[CAST(regexp_replace(...)):INTEGER])
HiveFilter(condition=[IS NOT NULL(CAST(regexp_replace(...)):INTEGER)])
HiveTableScan(table=[[default, test1]], table:alias=[test1])
HiveProject(month=[CAST(202110):INTEGER])
HiveFilter(condition=[=($0, 202110)])
HiveTableScan(table=[[default, test2]], table:alias=[test2])
HiveFilter(condition=[AND(OR(=($0, CAST(regexp_replace(...)):INTEGER), =($0,
202110)), IS NOT NULL($0))])
HiveTableScan(table=[[default, test3]], table:alias=[d])
ReduceExpressionsRule(Filter) can’t simplify the predicate
OutOfMemory at Planning Time - Example (HIVE-25758)
HiveFilter(condition=[
AND(
IN($0, CAST(regexp_replace(...)):INTEGER, 202110),
OR(
AND(
OR(
IS NOT NULL(CAST(regexp_replace(...)):INTEGER),
=(CAST(regexp_replace(...)):INTEGER, 202110)
),
=($0, CAST(regexp_replace(...)):INTEGER)
),
=($0, 202110)
)
)
])
Incomplete / Incorrect Plan
● Usually boils down to identifying the first rel that looks “bad”, and trace it back
by looking for the rel#, which is unique, and should appear in the preceding
part of the logs
● Resolution highly depends on the specific issue, but with a precise call# or
rel# it’s possible to set a conditional breakpoint (rules can be invoked multiple
times before producing the issue)
Incomplete / Incorrect Plan Hive - Example (HIVE-26722)
● OK:
SELECT * FROM (
SELECT a, b FROM t
UNION ALL
SELECT a, NULL FROM t
) AS t2 WHERE a = 1;
● KO: (missing results)
SELECT * FROM (
SELECT a, b FROM t
UNION ALL
SELECT a, cast(NULL as STRING) FROM t
) AS t2 WHERE a = 1;
HiveUnion(all=[true])
HiveProject(a=[$0], b=[$1])
HiveFilter(condition=[=(CAST($0):DOUBLE, 1)])
HiveTableScan(table=[[default, t]], table:alias=[t])
HiveProject(a=[$0], _o__c1=[null:VARCHAR(2147483647)
CHARACTER SET "UTF-16LE"])
HiveFilter(condition=[=(CAST($0):DOUBLE, 1)])
HiveTableScan(table=[[default, t]], table:alias=[t])
HiveProject(a=[$0], b=[$1])
HiveFilter(condition=[=(CAST($0):DOUBLE, 1)])
HiveTableScan(table=[[default, t]], table:alias=[t])
Incomplete / Incorrect Plan Hive - Example (HIVE-26722)
call#5: Rule [HiveFilterProjectTransposeRule]
produced [rel#41:HiveFilter]
call#6: Rule [HiveFilterSetOpTransposeRule]
produced [rel#43:HiveFilter
]
call#7: Rule [HiveFilterProjectTransposeRule]
produced [rel#47:HiveProject]
call#24: Rule [HivePartitionPruneRule(Filter)]
produced [rel#52:HiveFilter]
call#32: Rule [HiveFieldTrimmerRule] produced
[rel#70:HiveFilter]
call#41: Rule [HiveFilterProjectTSTransposeRule]
produced [rel#77:HiveProject]
call#43: Rule [HiveCardinalityPreservingJoinRule]
produced [rel#87:HiveProject]
call#5: Rule [HiveFilterProjectTransposeRule]
produced [rel#47:HiveFilter]
call#6: Rule [HiveFilterSetOpTransposeRule]
produced [rel#51:HiveUnion
]
call#7: Rule [HiveFilterProjectTransposeRule]
produced [rel#57:HiveProject]
call#13: Rule [HiveFilterProjectTransposeRule]
produced [rel#62:HiveProject]
call#15: Rule [ReduceExpressionsRule(Project)]
produced [rel#66:HiveProject]
call#19: Rule [HiveFilterProjectTransposeRule]
produced [rel#69:HiveProject]
call#49: Rule [HivePartitionPruneRule(Filter)]
produced [rel#73:HiveFilter]
call#64: Rule [HiveFieldTrimmerRule] produced
...
Disabling Rules via Configuration
● AbstractRelOptPlanner#setRuleDescExclusionFilter allows to exclude rules
based on a regex over their description, even if registered in the planner
● Benefits:
○ It avoids recompiling to exclude some rules from planning while troubleshooting
○ Can be a quick workaround to customers once the faulty rule(s) is identified
○ Can be activated “per-query”
○ Less invasive than disabling CBO altogether (e.g., “hive.cbo.enable=false”)
● HIVE-25880: “Add property to exclude CBO rules by a regex on their
description”
● Example:
set hive.cbo.rule.exclusion.regex=HiveJoinPushTransitivePredicatesRule|HivePreFilteringRule;
Conclusion
● Many planning issues boil down to rule transformations making plans bigger
and bigger
● Built-in loggers indispensable for fast troubleshooting
● Easy configuration via XML/Property files (Log4j2)
● Configurable verbosity via FULL_PLAN marker
● Common pain points:
○ Using multiple equivalent operators (OR, SEARCH, IN, etc.)
○ Push/Pull predicate logic in various rules
○ Inconsistent simplifications during planning
● Exploit AbstractRelOptPlanner#setRuleDescExclusionFilter for quick
workarounds and hypotheses verification
Thank you

Contenu connexe

Similaire à Debugging Planning Issues Using Calcite's Built-in Loggers

OSMC 2021 | inspectIT Ocelot: Dynamic OpenTelemetry Instrumentation at Runtime
OSMC 2021 | inspectIT Ocelot: Dynamic OpenTelemetry Instrumentation at RuntimeOSMC 2021 | inspectIT Ocelot: Dynamic OpenTelemetry Instrumentation at Runtime
OSMC 2021 | inspectIT Ocelot: Dynamic OpenTelemetry Instrumentation at Runtime
NETWAYS
 
PyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web ApplicationsPyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web Applications
Graham Dumpleton
 
Regain Control Thanks To Prometheus
Regain Control Thanks To PrometheusRegain Control Thanks To Prometheus
Regain Control Thanks To Prometheus
Etienne Coutaud
 
手把手教你如何串接 Log 到各種網路服務
手把手教你如何串接 Log 到各種網路服務手把手教你如何串接 Log 到各種網路服務
手把手教你如何串接 Log 到各種網路服務
Mu Chun Wang
 
How Many Ways Can I Manage Oracle GoldenGate?
How Many Ways Can I Manage Oracle GoldenGate?How Many Ways Can I Manage Oracle GoldenGate?
How Many Ways Can I Manage Oracle GoldenGate?
Enkitec
 
Dot Net Application Monitoring
Dot Net Application MonitoringDot Net Application Monitoring
Dot Net Application Monitoring
Ravi Okade
 
Garelic: Google Analytics as App Performance monitoring
Garelic: Google Analytics as App Performance monitoringGarelic: Google Analytics as App Performance monitoring
Garelic: Google Analytics as App Performance monitoring
Jano Suchal
 
Hadoop cluster performance profiler
Hadoop cluster performance profilerHadoop cluster performance profiler
Hadoop cluster performance profiler
Ihor Bobak
 
6 tips for improving ruby performance
6 tips for improving ruby performance6 tips for improving ruby performance
6 tips for improving ruby performance
Engine Yard
 
Sprint 71
Sprint 71Sprint 71
Sprint 71
ManageIQ
 
Native Java with GraalVM
Native Java with GraalVMNative Java with GraalVM
Native Java with GraalVM
Sylvain Wallez
 
RxJava applied [JavaDay Kyiv 2016]
RxJava applied [JavaDay Kyiv 2016]RxJava applied [JavaDay Kyiv 2016]
RxJava applied [JavaDay Kyiv 2016]
Igor Lozynskyi
 
Better Monitoring for Python: Inclusive Monitoring with Prometheus (Pycon Ire...
Better Monitoring for Python: Inclusive Monitoring with Prometheus (Pycon Ire...Better Monitoring for Python: Inclusive Monitoring with Prometheus (Pycon Ire...
Better Monitoring for Python: Inclusive Monitoring with Prometheus (Pycon Ire...
Brian Brazil
 
How to monitor your micro-service with Prometheus?
How to monitor your micro-service with Prometheus?How to monitor your micro-service with Prometheus?
How to monitor your micro-service with Prometheus?
Wojciech Barczyński
 
DBA Commands and Concepts That Every Developer Should Know - Part 2
DBA Commands and Concepts That Every Developer Should Know - Part 2DBA Commands and Concepts That Every Developer Should Know - Part 2
DBA Commands and Concepts That Every Developer Should Know - Part 2
Alex Zaballa
 
DBA Commands and Concepts That Every Developer Should Know - Part 2
DBA Commands and Concepts That Every Developer Should Know - Part 2DBA Commands and Concepts That Every Developer Should Know - Part 2
DBA Commands and Concepts That Every Developer Should Know - Part 2
Alex Zaballa
 
Docker Logging and analysing with Elastic Stack
Docker Logging and analysing with Elastic StackDocker Logging and analysing with Elastic Stack
Docker Logging and analysing with Elastic Stack
Jakub Hajek
 
Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek
PROIDEA
 
Squeezing Blood From a Stone V1.2
Squeezing Blood From a Stone V1.2Squeezing Blood From a Stone V1.2
Squeezing Blood From a Stone V1.2
Jen Costillo
 
.NET @ apache.org
 .NET @ apache.org .NET @ apache.org
.NET @ apache.org
Ted Husted
 

Similaire à Debugging Planning Issues Using Calcite's Built-in Loggers (20)

OSMC 2021 | inspectIT Ocelot: Dynamic OpenTelemetry Instrumentation at Runtime
OSMC 2021 | inspectIT Ocelot: Dynamic OpenTelemetry Instrumentation at RuntimeOSMC 2021 | inspectIT Ocelot: Dynamic OpenTelemetry Instrumentation at Runtime
OSMC 2021 | inspectIT Ocelot: Dynamic OpenTelemetry Instrumentation at Runtime
 
PyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web ApplicationsPyCon AU 2012 - Debugging Live Python Web Applications
PyCon AU 2012 - Debugging Live Python Web Applications
 
Regain Control Thanks To Prometheus
Regain Control Thanks To PrometheusRegain Control Thanks To Prometheus
Regain Control Thanks To Prometheus
 
手把手教你如何串接 Log 到各種網路服務
手把手教你如何串接 Log 到各種網路服務手把手教你如何串接 Log 到各種網路服務
手把手教你如何串接 Log 到各種網路服務
 
How Many Ways Can I Manage Oracle GoldenGate?
How Many Ways Can I Manage Oracle GoldenGate?How Many Ways Can I Manage Oracle GoldenGate?
How Many Ways Can I Manage Oracle GoldenGate?
 
Dot Net Application Monitoring
Dot Net Application MonitoringDot Net Application Monitoring
Dot Net Application Monitoring
 
Garelic: Google Analytics as App Performance monitoring
Garelic: Google Analytics as App Performance monitoringGarelic: Google Analytics as App Performance monitoring
Garelic: Google Analytics as App Performance monitoring
 
Hadoop cluster performance profiler
Hadoop cluster performance profilerHadoop cluster performance profiler
Hadoop cluster performance profiler
 
6 tips for improving ruby performance
6 tips for improving ruby performance6 tips for improving ruby performance
6 tips for improving ruby performance
 
Sprint 71
Sprint 71Sprint 71
Sprint 71
 
Native Java with GraalVM
Native Java with GraalVMNative Java with GraalVM
Native Java with GraalVM
 
RxJava applied [JavaDay Kyiv 2016]
RxJava applied [JavaDay Kyiv 2016]RxJava applied [JavaDay Kyiv 2016]
RxJava applied [JavaDay Kyiv 2016]
 
Better Monitoring for Python: Inclusive Monitoring with Prometheus (Pycon Ire...
Better Monitoring for Python: Inclusive Monitoring with Prometheus (Pycon Ire...Better Monitoring for Python: Inclusive Monitoring with Prometheus (Pycon Ire...
Better Monitoring for Python: Inclusive Monitoring with Prometheus (Pycon Ire...
 
How to monitor your micro-service with Prometheus?
How to monitor your micro-service with Prometheus?How to monitor your micro-service with Prometheus?
How to monitor your micro-service with Prometheus?
 
DBA Commands and Concepts That Every Developer Should Know - Part 2
DBA Commands and Concepts That Every Developer Should Know - Part 2DBA Commands and Concepts That Every Developer Should Know - Part 2
DBA Commands and Concepts That Every Developer Should Know - Part 2
 
DBA Commands and Concepts That Every Developer Should Know - Part 2
DBA Commands and Concepts That Every Developer Should Know - Part 2DBA Commands and Concepts That Every Developer Should Know - Part 2
DBA Commands and Concepts That Every Developer Should Know - Part 2
 
Docker Logging and analysing with Elastic Stack
Docker Logging and analysing with Elastic StackDocker Logging and analysing with Elastic Stack
Docker Logging and analysing with Elastic Stack
 
Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek Docker Logging and analysing with Elastic Stack - Jakub Hajek
Docker Logging and analysing with Elastic Stack - Jakub Hajek
 
Squeezing Blood From a Stone V1.2
Squeezing Blood From a Stone V1.2Squeezing Blood From a Stone V1.2
Squeezing Blood From a Stone V1.2
 
.NET @ apache.org
 .NET @ apache.org .NET @ apache.org
.NET @ apache.org
 

Dernier

Malibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed RoundMalibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed Round
sjcobrien
 
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
mz5nrf0n
 
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
XfilesPro
 
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
Bert Jan Schrijver
 
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Łukasz Chruściel
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
Rakesh Kumar R
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
Green Software Development
 
Modelling Up - DDDEurope 2024 - Amsterdam
Modelling Up - DDDEurope 2024 - AmsterdamModelling Up - DDDEurope 2024 - Amsterdam
Modelling Up - DDDEurope 2024 - Amsterdam
Alberto Brandolini
 
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s EcosystemUI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
Peter Muessig
 
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdfTop Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
VALiNTRY360
 
Unveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdfUnveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdf
brainerhub1
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
Quickdice ERP
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
Green Software Development
 
Using Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional SafetyUsing Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional Safety
Ayan Halder
 
Using Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query PerformanceUsing Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query Performance
Grant Fritchey
 
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling ExtensionsUI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
Peter Muessig
 
Top 9 Trends in Cybersecurity for 2024.pptx
Top 9 Trends in Cybersecurity for 2024.pptxTop 9 Trends in Cybersecurity for 2024.pptx
Top 9 Trends in Cybersecurity for 2024.pptx
devvsandy
 
SMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API ServiceSMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API Service
Yara Milbes
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
TheSMSPoint
 
Webinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for EmbeddedWebinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for Embedded
ICS
 

Dernier (20)

Malibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed RoundMalibou Pitch Deck For Its €3M Seed Round
Malibou Pitch Deck For Its €3M Seed Round
 
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
在线购买加拿大英属哥伦比亚大学毕业证本科学位证书原版一模一样
 
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...
 
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
J-Spring 2024 - Going serverless with Quarkus, GraalVM native images and AWS ...
 
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Need for Speed: Removing speed bumps from your Symfony projects ⚡️
Need for Speed: Removing speed bumps from your Symfony projects ⚡️
 
Fundamentals of Programming and Language Processors
Fundamentals of Programming and Language ProcessorsFundamentals of Programming and Language Processors
Fundamentals of Programming and Language Processors
 
GreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-JurisicGreenCode-A-VSCode-Plugin--Dario-Jurisic
GreenCode-A-VSCode-Plugin--Dario-Jurisic
 
Modelling Up - DDDEurope 2024 - Amsterdam
Modelling Up - DDDEurope 2024 - AmsterdamModelling Up - DDDEurope 2024 - Amsterdam
Modelling Up - DDDEurope 2024 - Amsterdam
 
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s EcosystemUI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
UI5con 2024 - Keynote: Latest News about UI5 and it’s Ecosystem
 
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdfTop Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
 
Unveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdfUnveiling the Advantages of Agile Software Development.pdf
Unveiling the Advantages of Agile Software Development.pdf
 
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesE-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian Companies
 
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, FactsALGIT - Assembly Line for Green IT - Numbers, Data, Facts
ALGIT - Assembly Line for Green IT - Numbers, Data, Facts
 
Using Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional SafetyUsing Xen Hypervisor for Functional Safety
Using Xen Hypervisor for Functional Safety
 
Using Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query PerformanceUsing Query Store in Azure PostgreSQL to Understand Query Performance
Using Query Store in Azure PostgreSQL to Understand Query Performance
 
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling ExtensionsUI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
UI5con 2024 - Boost Your Development Experience with UI5 Tooling Extensions
 
Top 9 Trends in Cybersecurity for 2024.pptx
Top 9 Trends in Cybersecurity for 2024.pptxTop 9 Trends in Cybersecurity for 2024.pptx
Top 9 Trends in Cybersecurity for 2024.pptx
 
SMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API ServiceSMS API Integration in Saudi Arabia| Best SMS API Service
SMS API Integration in Saudi Arabia| Best SMS API Service
 
Transform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR SolutionsTransform Your Communication with Cloud-Based IVR Solutions
Transform Your Communication with Cloud-Based IVR Solutions
 
Webinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for EmbeddedWebinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for Embedded
 

Debugging Planning Issues Using Calcite's Built-in Loggers

  • 1. Debugging Planning Issues Using Calcite's Built-in Loggers Stamatis Zampetakis Alessandro Solimando Virtual Meetup March 15, 2023
  • 2. Alessandro Solimando @asolimando Senior Software Engineer @ Rakuten* Committer of Apache Calcite & Apache Hive PhD in Data Management, University of Genoa (* the work presented here was done while at Cloudera) Stamatis Zampetakis @szampetak Staff Software Engineer @ Cloudera, Hive query optimizer team PMC member of Apache Calcite & Apache Hive PhD in Data Management, INRIA & Paris-Sud University About us
  • 3. Outline ● Motivation / Common planning issues ● Calcite’s built-in logging (RuleEventLogger) ○ Configuration (XML/Properties) ○ Output explained ● Hive case studies: ○ Hanging TPC-DS queries ○ OutOfMemoryError (HIVE-25758) ○ Wrong results (HIVE-26722) ● Conclusion
  • 4. Common Planning Issues Issue Diagnostic Information OutOfMemoryError Heap dumps StackOverflowError Stack traces Unresponsive server Stack traces Very slow (or hanging) queries Stack traces Wrong results Query plans Query crashes (NPE, AssertionError, ClassCastException) Stack traces
  • 5. Common Planning Issues ● Many times the root cause lies in rule based transformations Issue Diagnostic Information OutOfMemoryError Heap dumps StackOverflowError Stack traces Unresponsive server Stack traces Very slow (or hanging) queries Stack traces Wrong results Query plans Query crashes (NPE, AssertionError, ClassCastException) Stack traces
  • 6. Common Planning Issues ● Many times the root cause lies in rule based transformations ● Good logs are key for finding this root cause Issue Diagnostic Information OutOfMemoryError Heap dumps StackOverflowError Stack traces Unresponsive server Stack traces Very slow (or hanging) queries Stack traces Wrong results Query plans Query crashes (NPE, AssertionError, ClassCastException) Stack traces
  • 7. Common Planning Issues ● Interactive debugging not always feasible: ○ CI/CU environment with restricted access ○ Data specific problem ● Debugging planning issues interactively (without logs) is difficult: ○ Large number of transformations ○ Complex transformation logic ○ Lots of intermediate objects ○ Repeated occurrences of transformations
  • 9. RuleEventLogger - XML Configuration https://logging.apache.org/log4j/2.x/manual/configuration.html#automatic-configuration Calcite unit tests: Add/Modify core/src/test/resources/log4j2-test.xml file <Configuration> <Appenders> <Console name="A1" target="SYSTEM_OUT"> <PatternLayout pattern="%m%n"/> </Console> </Appenders> <Loggers> <Root level="INFO"> <AppenderRef ref="A1"/> </Root> <logger name="org.apache.calcite.plan.RelOptPlanner"level="DEBUG"> <MarkerFilter marker="FULL_PLAN" onMatch="DENY" onMismatch="NEUTRAL"/> </logger> </Loggers> </Configuration>
  • 10. RuleEventLogger - XML Configuration https://logging.apache.org/log4j/2.x/manual/configuration.html#automatic-configuration Calcite unit tests: Add/Modify core/src/test/resources/log4j2-test.xml file <Configuration> <Appenders> <Console name="A1" target="SYSTEM_OUT"> <PatternLayout pattern="%m%n"/> </Console> </Appenders> <Loggers> <Root level="INFO"> <AppenderRef ref="A1"/> </Root> <logger name="org.apache.calcite.plan.RelOptPlanner"level="DEBUG"> <MarkerFilter marker="FULL_PLAN" onMatch="DENY" onMismatch="NEUTRAL"/> </logger> </Loggers> </Configuration>
  • 11. RuleEventLogger - XML Configuration https://logging.apache.org/log4j/2.x/manual/configuration.html#automatic-configuration Calcite unit tests: Add/Modify core/src/test/resources/log4j2-test.xml file <Configuration> <Appenders> <Console name="A1" target="SYSTEM_OUT"> <PatternLayout pattern="%m%n"/> </Console> </Appenders> <Loggers> <Root level="INFO"> <AppenderRef ref="A1"/> </Root> <logger name="org.apache.calcite.plan.RelOptPlanner"level="DEBUG"> <MarkerFilter marker="FULL_PLAN" onMatch="DENY" onMismatch="NEUTRAL"/> </logger> </Loggers> </Configuration> CALCITE-4704 (1.29.0) CALCITE-4991 (1.30.0)
  • 12. RuleEventLogger - Properties Configuration appenders = A1 appender.A1.type = Console appender.A1.name = A1 appender.A1.target = SYSTEM_OUT appender.A1.layout.type = PatternLayout appender.A1.layout.pattern = %m%n loggers = CBORuleLogger rootLogger.level = INFO rootLogger.appenderRefs = A1 rootLogger.appenderRef.A1.ref = A1 logger.CBORuleLogger.name = org.apache.calcite.plan.RelOptPlanner logger.CBORuleLogger.level = DEBUG logger.CBORuleLogger.filter.marker.type= MarkerFilter logger.CBORuleLogger.filter.marker.marker= FULL_PLAN logger.CBORuleLogger.filter.marker.onMatch= DENY logger.CBORuleLogger.filter.marker.onMismatch= NEUTRAL https://logging.apache.org/log4j/2.x/manual/configuration.html#automatic-configuration Calcite unit tests: Add/Modify core/src/test/resources/log4j2-test.properties file CALCITE-4704 (1.29.0) CALCITE-4991 (1.30.0)
  • 13. RuleEventLogger - Properties Configuration (HIVE) appenders = A1 appender.A1.type = Console appender.A1.name = A1 appender.A1.target = SYSTEM_OUT appender.A1.layout.type = PatternLayout appender.A1.layout.pattern = %m%n loggers = …,CBORuleLogger rootLogger.level = INFO rootLogger.appenderRefs = A1 rootLogger.appenderRef.A1.ref = A1 logger.CBORuleLogger.name = org.apache.hadoop.hive.ql.optimizer.calcite.RuleEventLogger logger.CBORuleLogger.level = DEBUG logger.CBORuleLogger.filter.marker.type= MarkerFilter logger.CBORuleLogger.filter.marker.marker= FULL_PLAN logger.CBORuleLogger.filter.marker.onMatch= DENY logger.CBORuleLogger.filter.marker.onMismatch= NEUTRAL Hive tests: Modify data/conf/hive-log4j2.properties file HIVE-25816
  • 14. RuleEventLogger - Example ./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3 logger.CBORuleLogger.filter.marker.onMatch= DENY SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
  • 15. RuleEventLogger - Output Explained ./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3 RelOptRulesTest > testAggregateRemove3() STANDARD_OUT call#0: Apply rule [AggregateRemoveRule] to [rel#11:LogicalAggregate] call#0: Rule [AggregateRemoveRule] produced [rel#15:LogicalProject] call#1: Apply rule [ProjectMergeRule] to [rel#13:LogicalProject,rel#15:LogicalProject] call#1: Rule [ProjectMergeRule] produced [rel#17:LogicalProject] call#2: Apply rule [ProjectMergeRule] to [rel#17:LogicalProject,rel#9:LogicalProject] call#2: Rule [ProjectMergeRule] produced [rel#19:LogicalProject] logger.CBORuleLogger.filter.marker.onMatch= DENY SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
  • 16. RuleEventLogger - Output Explained ./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3 RelOptRulesTest > testAggregateRemove3() STANDARD_OUT call#0: Apply rule [AggregateRemoveRule] to [rel#11:LogicalAggregate] call#0: Rule [AggregateRemoveRule] produced [rel#15:LogicalProject] call#1: Apply rule [ProjectMergeRule] to [rel#13:LogicalProject,rel#15:LogicalProject] call#1: Rule [ProjectMergeRule] produced [rel#17:LogicalProject] call#2: Apply rule [ProjectMergeRule] to [rel#17:LogicalProject,rel#9:LogicalProject] call#2: Rule [ProjectMergeRule] produced [rel#19:LogicalProject] logger.CBORuleLogger.filter.marker.onMatch= DENY SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
  • 17. RuleEventLogger - Output Explained ./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3 RelOptRulesTest > testAggregateRemove3() STANDARD_OUT call#0: Apply rule [AggregateRemoveRule] to [rel#11:LogicalAggregate] call#0: Rule [AggregateRemoveRule] produced [rel#15:LogicalProject] call#1: Apply rule [ProjectMergeRule] to [rel#13:LogicalProject,rel#15:LogicalProject] call#1: Rule [ProjectMergeRule] produced [rel#17:LogicalProject] call#2: Apply rule [ProjectMergeRule] to [rel#17:LogicalProject,rel#9:LogicalProject] call#2: Rule [ProjectMergeRule] produced [rel#19:LogicalProject] logger.CBORuleLogger.filter.marker.onMatch= DENY SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
  • 18. RuleEventLogger - Output Explained ./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3 RelOptRulesTest > testAggregateRemove3() STANDARD_OUT call#0: Apply rule [AggregateRemoveRule] to [rel#11:LogicalAggregate] call#0: Rule [AggregateRemoveRule] produced [rel#15:LogicalProject] call#1: Apply rule [ProjectMergeRule] to [rel#13:LogicalProject,rel#15:LogicalProject] call#1: Rule [ProjectMergeRule] produced [rel#17:LogicalProject] call#2: Apply rule [ProjectMergeRule] to [rel#17:LogicalProject,rel#9:LogicalProject] call#2: Rule [ProjectMergeRule] produced [rel#19:LogicalProject] logger.CBORuleLogger.filter.marker.onMatch= DENY SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
  • 19. RuleEventLogger - Output Explained ./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3 RelOptRulesTest > testAggregateRemove3() STANDARD_OUT call#0: Apply rule [AggregateRemoveRule] to [rel#11:LogicalAggregate] call#0: Rule [AggregateRemoveRule] produced [rel#15:LogicalProject] call#1: Apply rule [ProjectMergeRule] to [rel#13:LogicalProject,rel#15:LogicalProject] call#1: Rule [ProjectMergeRule] produced [rel#17:LogicalProject] call#2: Apply rule [ProjectMergeRule] to [rel#17:LogicalProject,rel#9:LogicalProject] call#2: Rule [ProjectMergeRule] produced [rel#19:LogicalProject] logger.CBORuleLogger.filter.marker.onMatch= DENY SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
  • 20. RuleEventLogger - Output Explained ./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3 RelOptRulesTest > testAggregateRemove3() STANDARD_OUT call#0: Apply rule [AggregateRemoveRule] to [rel#11:LogicalAggregate] call#0: Full plan for rule input [rel#11:LogicalAggregate]: LogicalAggregate(group=[{0, 1}], EXPR$1=[COUNT($2)]) LogicalProject(EMPNO=[$0], DEPTNO=[$7], MGR=[$3]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) call#0: Rule [AggregateRemoveRule] produced [rel#15:LogicalProject] call#0: Full plan for [rel#15:LogicalProject]: LogicalProject(EMPNO=[$0], DEPTNO=[$1], $f2=[CASE(IS NOT NULL($2), 1:BIGINT, 0:BIGINT)]) LogicalProject(EMPNO=[$0], DEPTNO=[$7], MGR=[$3]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) logger.CBORuleLogger.filter.marker.onMatch= ACCEPT SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
  • 21. RuleEventLogger - Output Explained ./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3 RelOptRulesTest > testAggregateRemove3() STANDARD_OUT call#0: Apply rule [AggregateRemoveRule] to [rel#11:LogicalAggregate] call#0: Full plan for rule input [rel#11:LogicalAggregate]: LogicalAggregate(group=[{0, 1}], EXPR$1=[COUNT($2)]) LogicalProject(EMPNO=[$0], DEPTNO=[$7], MGR=[$3]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) call#0: Rule [AggregateRemoveRule] produced [rel#15:LogicalProject] call#0: Full plan for [rel#15:LogicalProject]: LogicalProject(EMPNO=[$0], DEPTNO=[$1], $f2=[CASE(IS NOT NULL($2), 1:BIGINT, 0:BIGINT)]) LogicalProject(EMPNO=[$0], DEPTNO=[$7], MGR=[$3]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) logger.CBORuleLogger.filter.marker.onMatch= ACCEPT SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
  • 22. RuleEventLogger - Output Explained ./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3 RelOptRulesTest > testAggregateRemove3() STANDARD_OUT call#1: Apply rule [ProjectMergeRule] to [rel#13:LogicalProject,rel#15:LogicalProject] call#1: Full plan for rule input [rel#13:LogicalProject]: LogicalProject(EMPNO=[$0], EXPR$1=[$2]) LogicalProject(EMPNO=[$0], DEPTNO=[$1], $f2=[CASE(IS NOT NULL($2), 1:BIGINT, 0:BIGINT)]) LogicalProject(EMPNO=[$0], DEPTNO=[$7], MGR=[$3]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) call#1: Full plan for rule input [rel#15:LogicalProject]: ... call#1: Rule [ProjectMergeRule] produced [rel#17:LogicalProject] call#1: Full plan for [rel#17:LogicalProject]: LogicalProject(EMPNO=[$0], EXPR$1=[CASE(IS NOT NULL($2), 1:BIGINT, 0:BIGINT)]) LogicalProject(EMPNO=[$0], DEPTNO=[$7], MGR=[$3]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) logger.CBORuleLogger.filter.marker.onMatch= ACCEPT SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
  • 23. RuleEventLogger - Output Explained ./gradlew :core:test --tests RelOptRulesTest.testAggregateRemove3 RelOptRulesTest > testAggregateRemove3() STANDARD_OUT call#2: Apply rule [ProjectMergeRule] to [rel#17:LogicalProject,rel#9:LogicalProject] call#2: Full plan for rule input [rel#17:LogicalProject]: LogicalProject(EMPNO=[$0], EXPR$1=[CASE(IS NOT NULL($2), 1:BIGINT, 0:BIGINT)]) LogicalProject(EMPNO=[$0], DEPTNO=[$7], MGR=[$3]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) call#2: Full plan for rule input [rel#9:LogicalProject]: LogicalProject(EMPNO=[$0], DEPTNO=[$7], MGR=[$3]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) call#2: Rule [ProjectMergeRule] produced [rel#19:LogicalProject] call#2: Full plan for [rel#19:LogicalProject]: LogicalProject(EMPNO=[$0], EXPR$1=[CASE(IS NOT NULL($3), 1:BIGINT, 0:BIGINT)]) LogicalTableScan(table=[[CATALOG, SALES, EMP]]) logger.CBORuleLogger.filter.marker.onMatch= ACCEPT SELECT empno, count(mgr) FROM sales.emp GROUP BY empno, deptno
  • 25. Hanging TPC-DS queries ● Context: Upgrade Calcite version from 1.25.0 to 1.33.0 ● Symptom: TPCDS queries hanging
  • 26. Hanging TPC-DS queries ● Context: Upgrade Calcite version from 1.25.0 to 1.33.0 ● Symptom: TPCDS queries hanging (e.g., query13) select avg(ss_quantity), avg(ss_ext_sales_price), avg(ss_ext_wholesale_cost), sum(ss_ext_wholesale_cost) from store_sales ,store ,customer_demographics ,household_demographics ,customer_address ,date_dim where s_store_sk = ss_store_sk and ss_sold_date_sk = d_date_sk and d_year = 2001 and ( (ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = 'M' and cd_education_status = '4 yr Degree' and ss_sales_price between 100.00 and 150.00 and hd_dep_count = 3) or (ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = 'D' and cd_education_status = 'Primary' and ss_sales_price between 50.00 and 100.00 and hd_dep_count = 1) or (ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = 'U' and cd_education_status = 'Advanced Degree' and ss_sales_price between 150.00 and 200.00 and hd_dep_count = 1)) and( (ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('KY', 'GA', 'NM') and ss_net_profit between 100 and 200) or (ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('MT', 'OR', 'IN') and ss_net_profit between 150 and 300) or (ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('WI', 'MO', 'WV') and ss_net_profit between 50 and 250));
  • 27. Hanging TPC-DS queries ● Context: Upgrade Calcite version from 1.25.0 to 1.33.0 ● Symptom: TPCDS queries hanging (e.g., query13) select avg(ss_quantity), avg(ss_ext_sales_price), avg(ss_ext_wholesale_cost), sum(ss_ext_wholesale_cost) from store_sales ,store ,customer_demographics ,household_demographics ,customer_address ,date_dim where s_store_sk = ss_store_sk and ss_sold_date_sk = d_date_sk and d_year = 2001 and ( (ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = 'M' and cd_education_status = '4 yr Degree' and ss_sales_price between 100.00 and 150.00 and hd_dep_count = 3) or (ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = 'D' and cd_education_status = 'Primary' and ss_sales_price between 50.00 and 100.00 and hd_dep_count = 1) or (ss_hdemo_sk=hd_demo_sk and cd_demo_sk = ss_cdemo_sk and cd_marital_status = 'U' and cd_education_status = 'Advanced Degree' and ss_sales_price between 150.00 and 200.00 and hd_dep_count = 1)) and( (ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('KY', 'GA', 'NM') and ss_net_profit between 100 and 200) or (ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('MT', 'OR', 'IN') and ss_net_profit between 150 and 300) or (ss_addr_sk = ca_address_sk and ca_country = 'United States' and ca_state in ('WI', 'MO', 'WV') and ss_net_profit between 50 and 250));
  • 28. Hanging TPC-DS queries ● Why does it take so long? ● Where is it stuck? ● What does it do? ● Gather information: ○ Profile the application (async-profiler) ○ Collect stack traces (jstack) ○ Check the logs
  • 29. Hanging TPC-DS queries ● Why does it take so long? ● Where is it stuck? ● What does it do? ● Gather information: ○ Profile the application (async-profiler) ○ Collect stack traces (jstack) ○ Check the logs
  • 30. java.lang.Thread.State: RUNNABLE at java.util.TreeMap.compare(TreeMap.java:1294) at java.util.TreeMap.put(TreeMap.java:538) at org.apache.hive.com.google.common.collect.TreeRangeSet.replaceRangeWithSameLowerBound(TreeRangeSet.java:272) at org.apache.hive.com.google.common.collect.TreeRangeSet.add(TreeRangeSet.java:222) at org.apache.hive.com.google.common.collect.RangeSet.addAll(RangeSet.java:225) at org.apache.hive.com.google.common.collect.AbstractRangeSet.addAll(AbstractRangeSet.java:64) at org.apache.hive.com.google.common.collect.TreeRangeSet.addAll(TreeRangeSet.java:41) at org.apache.calcite.rex.RexSimplify$RexSargBuilder.addSarg(RexSimplify.java:3056) at org.apache.calcite.rex.RexSimplify$SargCollector.accept2b(RexSimplify.java:2894) at org.apache.calcite.rex.RexSimplify$SargCollector.accept2(RexSimplify.java:2812) at org.apache.calcite.rex.RexSimplify$SargCollector.accept_(RexSimplify.java:2793) at org.apache.calcite.rex.RexSimplify$SargCollector.accept(RexSimplify.java:2778) at org.apache.calcite.rex.RexSimplify$SargCollector.access$400(RexSimplify.java:2761) at org.apache.calcite.rex.RexSimplify.lambda$simplifyAnd$3(RexSimplify.java:1488) at org.apache.calcite.rex.RexSimplify$$Lambda$1099/1247334493.accept(Unknown Source) at java.util.ArrayList.forEach(ArrayList.java:1259) at org.apache.calcite.rex.RexSimplify.simplifyAnd(RexSimplify.java:1488) at org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:279) at org.apache.calcite.rex.RexSimplify.simplifyUnknownAs(RexSimplify.java:248) at org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:223) at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:299) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841) at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:292) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841) at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:292) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841)
  • 31. java.lang.Thread.State: RUNNABLE at java.util.TreeMap.compare(TreeMap.java:1294) at java.util.TreeMap.put(TreeMap.java:538) at org.apache.hive.com.google.common.collect.TreeRangeSet.replaceRangeWithSameLowerBound(TreeRangeSet.java:272) at org.apache.hive.com.google.common.collect.TreeRangeSet.add(TreeRangeSet.java:222) at org.apache.hive.com.google.common.collect.RangeSet.addAll(RangeSet.java:225) at org.apache.hive.com.google.common.collect.AbstractRangeSet.addAll(AbstractRangeSet.java:64) at org.apache.hive.com.google.common.collect.TreeRangeSet.addAll(TreeRangeSet.java:41) at org.apache.calcite.rex.RexSimplify$RexSargBuilder.addSarg(RexSimplify.java:3056) at org.apache.calcite.rex.RexSimplify$SargCollector.accept2b(RexSimplify.java:2894) at org.apache.calcite.rex.RexSimplify$SargCollector.accept2(RexSimplify.java:2812) at org.apache.calcite.rex.RexSimplify$SargCollector.accept_(RexSimplify.java:2793) at org.apache.calcite.rex.RexSimplify$SargCollector.accept(RexSimplify.java:2778) at org.apache.calcite.rex.RexSimplify$SargCollector.access$400(RexSimplify.java:2761) at org.apache.calcite.rex.RexSimplify.lambda$simplifyAnd$3(RexSimplify.java:1488) at org.apache.calcite.rex.RexSimplify$$Lambda$1099/1247334493.accept(Unknown Source) at java.util.ArrayList.forEach(ArrayList.java:1259) at org.apache.calcite.rex.RexSimplify.simplifyAnd(RexSimplify.java:1488) at org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:279) at org.apache.calcite.rex.RexSimplify.simplifyUnknownAs(RexSimplify.java:248) at org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:223) at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:299) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841) at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:292) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841) at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:292) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841) Depth:~ 2K Lines
  • 32. java.lang.Thread.State: RUNNABLE at java.util.TreeMap.compare(TreeMap.java:1294) at java.util.TreeMap.put(TreeMap.java:538) at org.apache.hive.com.google.common.collect.TreeRangeSet.replaceRangeWithSameLowerBound(TreeRangeSet.java:272) at org.apache.hive.com.google.common.collect.TreeRangeSet.add(TreeRangeSet.java:222) at org.apache.hive.com.google.common.collect.RangeSet.addAll(RangeSet.java:225) at org.apache.hive.com.google.common.collect.AbstractRangeSet.addAll(AbstractRangeSet.java:64) at org.apache.hive.com.google.common.collect.TreeRangeSet.addAll(TreeRangeSet.java:41) at org.apache.calcite.rex.RexSimplify$RexSargBuilder.addSarg(RexSimplify.java:3056) at org.apache.calcite.rex.RexSimplify$SargCollector.accept2b(RexSimplify.java:2894) at org.apache.calcite.rex.RexSimplify$SargCollector.accept2(RexSimplify.java:2812) at org.apache.calcite.rex.RexSimplify$SargCollector.accept_(RexSimplify.java:2793) at org.apache.calcite.rex.RexSimplify$SargCollector.accept(RexSimplify.java:2778) at org.apache.calcite.rex.RexSimplify$SargCollector.access$400(RexSimplify.java:2761) at org.apache.calcite.rex.RexSimplify.lambda$simplifyAnd$3(RexSimplify.java:1488) at org.apache.calcite.rex.RexSimplify$$Lambda$1099/1247334493.accept(Unknown Source) at java.util.ArrayList.forEach(ArrayList.java:1259) at org.apache.calcite.rex.RexSimplify.simplifyAnd(RexSimplify.java:1488) at org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:279) at org.apache.calcite.rex.RexSimplify.simplifyUnknownAs(RexSimplify.java:248) at org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:223) at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:299) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841) at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:292) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841) at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:292) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841) Depth:~ 2K Lines From stack + code we can infer that the plan has more than 1K nested Filter operators
  • 33. java.lang.Thread.State: RUNNABLE at java.util.TreeMap.compare(TreeMap.java:1294) at java.util.TreeMap.put(TreeMap.java:538) at org.apache.hive.com.google.common.collect.TreeRangeSet.replaceRangeWithSameLowerBound(TreeRangeSet.java:272) at org.apache.hive.com.google.common.collect.TreeRangeSet.add(TreeRangeSet.java:222) at org.apache.hive.com.google.common.collect.RangeSet.addAll(RangeSet.java:225) at org.apache.hive.com.google.common.collect.AbstractRangeSet.addAll(AbstractRangeSet.java:64) at org.apache.hive.com.google.common.collect.TreeRangeSet.addAll(TreeRangeSet.java:41) at org.apache.calcite.rex.RexSimplify$RexSargBuilder.addSarg(RexSimplify.java:3056) at org.apache.calcite.rex.RexSimplify$SargCollector.accept2b(RexSimplify.java:2894) at org.apache.calcite.rex.RexSimplify$SargCollector.accept2(RexSimplify.java:2812) at org.apache.calcite.rex.RexSimplify$SargCollector.accept_(RexSimplify.java:2793) at org.apache.calcite.rex.RexSimplify$SargCollector.accept(RexSimplify.java:2778) at org.apache.calcite.rex.RexSimplify$SargCollector.access$400(RexSimplify.java:2761) at org.apache.calcite.rex.RexSimplify.lambda$simplifyAnd$3(RexSimplify.java:1488) at org.apache.calcite.rex.RexSimplify$$Lambda$1099/1247334493.accept(Unknown Source) at java.util.ArrayList.forEach(ArrayList.java:1259) at org.apache.calcite.rex.RexSimplify.simplifyAnd(RexSimplify.java:1488) at org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:279) at org.apache.calcite.rex.RexSimplify.simplifyUnknownAs(RexSimplify.java:248) at org.apache.calcite.rex.RexSimplify.simplify(RexSimplify.java:223) at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:299) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841) at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:292) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841) at org.apache.calcite.rel.metadata.RelMdPredicates.getPredicates(RelMdPredicates.java:292) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates_$(Unknown Source) at org.apache.calcite.rel.metadata.janino.GeneratedMetadata_PredicatesHandler.getPredicates(Unknown Source) at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:841) Depth:~ 2K Lines From stack + code we can infer that the plan has more than 1K nested Filter operators What creates the operators? Why?
  • 34. Hanging TPC-DS queries - Logs to the rescue grep -A 10 "Rule.*produced" hive.log
  • 35. Hanging TPC-DS queries - Logs to the rescue 2022-10-07T05:52:23,575 DEBUG calcite.RuleEventLogger: call#1: Rule [HivePreFilteringRule] produced [rel#84:HiveFilter] 2022-10-07T05:52:23,576 DEBUG calcite.RuleEventLogger: call#1: Full plan for [rel#84:HiveFilter]: HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveTableScan(table=[[default, store_sales]], table:alias=[store_sales]) HiveTableScan(table=[[default, store]], table:alias=[store]) -- 2022-10-07T05:52:23,601 DEBUG calcite.RuleEventLogger: call#2: Rule [HivePreFilteringRule] produced [rel#89:HiveFilter] 2022-10-07T05:52:23,601 DEBUG calcite.RuleEventLogger: call#2: Full plan for [rel#89:HiveFilter]: HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveTableScan(table=[[default, store_sales]], table:alias=[store_sales]) -- 2022-10-07T05:52:23,610 DEBUG calcite.RuleEventLogger: call#3: Rule [HivePreFilteringRule] produced [rel#94:HiveFilter] 2022-10-07T05:52:23,610 DEBUG calcite.RuleEventLogger: call#3: Full plan for [rel#94:HiveFilter]: HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
  • 36. Hanging TPC-DS queries - Logs to the rescue 2022-10-07T05:52:23,575 DEBUG calcite.RuleEventLogger: call#1: Rule [HivePreFilteringRule] produced [rel#84:HiveFilter] 2022-10-07T05:52:23,576 DEBUG calcite.RuleEventLogger: call#1: Full plan for [rel#84:HiveFilter]: HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveTableScan(table=[[default, store_sales]], table:alias=[store_sales]) HiveTableScan(table=[[default, store]], table:alias=[store]) -- 2022-10-07T05:52:23,601 DEBUG calcite.RuleEventLogger: call#2: Rule [HivePreFilteringRule] produced [rel#89:HiveFilter] 2022-10-07T05:52:23,601 DEBUG calcite.RuleEventLogger: call#2: Full plan for [rel#89:HiveFilter]: HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveTableScan(table=[[default, store_sales]], table:alias=[store_sales]) -- 2022-10-07T05:52:23,610 DEBUG calcite.RuleEventLogger: call#3: Rule [HivePreFilteringRule] produced [rel#94:HiveFilter] 2022-10-07T05:52:23,610 DEBUG calcite.RuleEventLogger: call#3: Full plan for [rel#94:HiveFilter]: HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available])
  • 37. Hanging TPC-DS queries - Root cause 2022-10-07T05:52:23,575 DEBUG calcite.RuleEventLogger: call#1: Rule [HivePreFilteringRule] produced [rel#84:HiveFilter] 2022-10-07T05:52:23,576 DEBUG calcite.RuleEventLogger: call#1: Full plan for [rel#84:HiveFilter]: HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveTableScan(table=[[default, store_sales]], table:alias=[store_sales]) HiveTableScan(table=[[default, store]], table:alias=[store]) -- 2022-10-07T05:52:23,601 DEBUG calcite.RuleEventLogger: call#2: Rule [HivePreFilteringRule] produced [rel#89:HiveFilter] 2022-10-07T05:52:23,601 DEBUG calcite.RuleEventLogger: call#2: Full plan for [rel#89:HiveFilter]: HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveTableScan(table=[[default, store_sales]], table:alias=[store_sales]) -- 2022-10-07T05:52:23,610 DEBUG calcite.RuleEventLogger: call#3: Rule [HivePreFilteringRule] produced [rel#94:HiveFilter] 2022-10-07T05:52:23,610 DEBUG calcite.RuleEventLogger: call#3: Full plan for [rel#94:HiveFilter]: HiveFilter(condition=[AND(=($27, $6), =($22, $99), =($105, 2001), =($4, $73), =($60, $3), OR(AND(=($62, _UTF-16LE'M'), =($63, _UTF-16LE'4 yr Degree HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveFilter(condition=[AND(OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')), OR(=($63, _UTF-16LE'4 yr Degree '), HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HiveJoin(condition=[true], joinType=[inner], algorithm=[none], cost=[not available]) HivePreFilteringRule matches infinitely and creates identical filters multiple times
  • 38. Hanging TPC-DS queries - Interactive debugging OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U'))
  • 39. Hanging TPC-DS queries - Interactive debugging OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')) // 3. If the new conjuncts are already present in the plan, we bail out final List<RexNode> newConjuncts = HiveCalciteUtil.getPredsNotPushedAlready(filter.getInput(),operandsToPushDown);
  • 40. Hanging TPC-DS queries - Interactive debugging OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')) SEARCH($62, Sarg[_UTF-16LE'D', _UTF-16LE'M', _UTF-16LE'U']:CHAR(1)) // 3. If the new conjuncts are already present in the plan, we bail out final List<RexNode> newConjuncts = HiveCalciteUtil.getPredsNotPushedAlready(filter.getInput(),operandsToPushDown); mq.getPulledUpPredicates(inp).pulledUpPredicates
  • 41. Hanging TPC-DS queries - Interactive debugging OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')) SEARCH($62, Sarg[_UTF-16LE'D', _UTF-16LE'M', _UTF-16LE'U']:CHAR(1))
  • 42. Hanging TPC-DS queries - Interactive debugging OR(=($62, _UTF-16LE'M'), =($62, _UTF-16LE'D'), =($62, _UTF-16LE'U')) SEARCH($62, Sarg[_UTF-16LE'D', _UTF-16LE'M', _UTF-16LE'U']:CHAR(1)) CALCITE-4173 Add internal SEARCH operator and Sarg literal, replacing use of IN in RexCall (1.26.0) CALCITE-5036 RelMetadataQuery#getPulledUpPredicates support to analyze constant key for the operator of IS_NOT_DISTINCT_FROM (1.31.0)
  • 43. Check What CBO Rules Are Applied Command: grep --no-filename "produced" hive.log | cut -c 100- Output format: call#$X: Rule [$RULE_DESCRIPTION] produced [rel#$Y:$REL_KIND] Sample output: call#1: Rule [HivePreFilteringRule] produced [rel#73:HiveFilter] call#12: Rule [ReduceExpressionsRule(Filter)] produced [rel#76:HiveFilter] call#14: Rule [FilterCondition] produced [rel#78:HiveFilter] call#17: Rule [ReduceExpressionsRule(Filter)] produced [rel#80:HiveFilter] call#21: Rule [FilterCondition] produced [rel#82:HiveFilter] call#29: Rule [HiveProjectFilterPullUpConstantsRule] produced [rel#84:HiveProject] call#46: Rule [HiveJoinAddNotNullRule] produced [rel#89:HiveJoin] Once you have the rel number ($Y), you can look it up in the logs to see it printed in full If you are interested in a particular rule application, search for the call ($X), input and output rels are printed close to the corresponding “produced” line
  • 44. OutOfMemory at Planning Time ● We look for a sequence of rule applications which loops from a certain point of the query planning process ● Run “EXPLAIN” against the offending query and look at the rules which are invoked in the logs to identify the loop ● Once identified, find the first occurrence of applications of this list, and start analyzing the plans generated right before/after it ● Note: the rel triggering the loop might also be created outside this sequence
  • 45. OutOfMemory at Planning Time - Example (HIVE-25758) SELECT c.month, d.con_usd FROM (SELECT cast(regexp_replace(substr(add_months(from_unixtime(unix_timestamp(), 'yyyy-MM-dd'), -1), 1, 7), '-', '') AS int) AS month FROM test1 UNION ALL SELECT month FROM test2 WHERE month = 202110 ) c JOIN test3 d ON c.month = d.mth; ● Symptoms: OOM due to the creation of ever increasingly complex filters
  • 46. OutOfMemory at Planning Time - Example (HIVE-25758) HiveJoin(condition=[=($0, $1)], joinType=[inner], algorithm=[none], cost=[not available]) HiveUnion(all=[true]) HiveProject(month=[CAST(regexp_replace(...):INTEGER**]) HiveFilter(condition=[IS NOT NULL(CAST(regexp_replace(...):INTEGER)]) HiveTableScan(table=[[default, test1]], table:alias=[test1]) HiveProject(month=[CAST(202110):INTEGER]) HiveFilter(condition=[=($0, 202110)]) HiveTableScan(table=[[default, test2]], table:alias=[test2]) HiveFilter(condition=[IS NOT NULL($0)]) HiveTableScan(table=[[default, test3]], table:alias=[d]) ** Abbreviation of “CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER”
  • 47. OutOfMemory at Planning Time - Example (HIVE-25758) call#198: Rule [HiveJoinPushTransitivePredicatesRule] produced [rel#136:HiveJoin] call#212: Rule [HiveFilterSetOpTransposeRule] produced [rel#142:HiveUnion] call#223: Rule [HiveFilterProjectTransposeRule] produced [rel#148:HiveProject] call#228: Rule [HiveFilterMergeRule] produced [rel#152:HiveFilter] call#239: Rule [HiveFilterProjectTransposeRule] produced [rel#153:HiveProject] call#258: Rule [HiveJoinPushTransitivePredicatesRule] produced [rel#155:HiveJoin] call#273: Rule [HiveFilterProjectTransposeRule] produced [rel#160:HiveProject] call#278: Rule [HiveFilterMergeRule] produced [rel#164:HiveFilter] call#283: Rule [ReduceExpressionsRule(Filter)] produced [rel#166:HiveFilter] RHS → LHS LHS → RHS
  • 48. OutOfMemory at Planning Time - Example (HIVE-25758) HiveJoin(condition=[=($0, $1)], joinType=[inner], [...]) HiveUnion(all=[true]) HiveProject(month=[CAST(regexp_replace(...)):INTEGER**]) HiveFilter(condition=[IS NOT NULL(CAST(regexp_replace(...):INTEGER)]) HiveTableScan(table=[[default, test1]], table:alias=[test1]) HiveProject(month=[CAST(202110):INTEGER]) HiveFilter(condition=[=($0, 202110)]) HiveTableScan(table=[[default, test2]], table:alias=[test2]) HiveFilter(condition=[IS NOT NULL($0)]) HiveTableScan(table=[[default, test3]], table:alias=[d]) ** Abbreviation of “CAST(regexp_replace(substr(add_months(FROM_UNIXTIME(UNIX_TIMESTAMP, _UTF-16LE'yyyy-MM-dd':VARCHAR(2147483647) CHARACTER SET "UTF-16LE"), -1), 1, 7), _UTF-16LE'-':VARCHAR(2147483647) CHARACTER SET "UTF-16LE", _UTF-16LE'':VARCHAR(2147483647) CHARACTER SET "UTF-16LE")):INTEGER”
  • 49. OutOfMemory at Planning Time - Example (HIVE-25758) HiveJoin(condition=[=($0, $1)], joinType=[inner], [...]) HiveUnion(all=[true]) HiveProject(month=[CAST(regexp_replace(...)):INTEGER]) HiveFilter(condition=[IS NOT NULL(CAST(regexp_replace(...)):INTEGER)]) HiveTableScan(table=[[default, test1]], table:alias=[test1]) HiveProject(month=[CAST(202110):INTEGER]) HiveFilter(condition=[=($0, 202110)]) HiveTableScan(table=[[default, test2]], table:alias=[test2]) HiveFilter(condition=[OR(=($0, CAST(regexp_replace(...)):INTEGER), =($0, 202110))]) HiveFilter(condition=[IS NOT NULL($0)]) HiveTableScan(table=[[default, test3]], table:alias=[d]) call#258: Rule [HiveJoinPushTransitivePredicatesRule] produced [rel#155:HiveJoin]
  • 50. OutOfMemory at Planning Time - Example (HIVE-25758) HiveJoin(condition=[=($0, $1)], joinType=[inner], [...]) HiveUnion(all=[true]) HiveProject(month=[CAST(regexp_replace(...)):INTEGER]) HiveFilter(condition=[IS NOT NULL(CAST(regexp_replace(...)):INTEGER)]) HiveTableScan(table=[[default, test1]], table:alias=[test1]) HiveProject(month=[CAST(202110):INTEGER]) HiveFilter(condition=[=($0, 202110)]) HiveTableScan(table=[[default, test2]], table:alias=[test2]) HiveFilter(condition=[AND(OR(=($0, CAST(regexp_replace(...)):INTEGER), =($0, 202110)), IS NOT NULL($0))]) HiveTableScan(table=[[default, test3]], table:alias=[d]) call#278: Rule [HiveFilterMergeRule] produced [rel#164:HiveFilter]
  • 51. OutOfMemory at Planning Time - Example (HIVE-25758) HiveJoin(condition=[=($0, $1)], joinType=[inner], [...]) HiveUnion(all=[true]) HiveProject(month=[CAST(regexp_replace(...)):INTEGER]) HiveFilter(condition=[IS NOT NULL(CAST(regexp_replace(...)):INTEGER)]) HiveTableScan(table=[[default, test1]], table:alias=[test1]) HiveProject(month=[CAST(202110):INTEGER]) HiveFilter(condition=[=($0, 202110)]) HiveTableScan(table=[[default, test2]], table:alias=[test2]) HiveFilter(condition=[AND(OR(=($0, CAST(regexp_replace(...)):INTEGER), =($0, 202110)), IS NOT NULL($0))]) HiveTableScan(table=[[default, test3]], table:alias=[d]) ReduceExpressionsRule(Filter) can’t simplify the predicate
  • 52. OutOfMemory at Planning Time - Example (HIVE-25758) HiveFilter(condition=[ AND( IN($0, CAST(regexp_replace(...)):INTEGER, 202110), OR( AND( OR( IS NOT NULL(CAST(regexp_replace(...)):INTEGER), =(CAST(regexp_replace(...)):INTEGER, 202110) ), =($0, CAST(regexp_replace(...)):INTEGER) ), =($0, 202110) ) ) ])
  • 53. Incomplete / Incorrect Plan ● Usually boils down to identifying the first rel that looks “bad”, and trace it back by looking for the rel#, which is unique, and should appear in the preceding part of the logs ● Resolution highly depends on the specific issue, but with a precise call# or rel# it’s possible to set a conditional breakpoint (rules can be invoked multiple times before producing the issue)
  • 54. Incomplete / Incorrect Plan Hive - Example (HIVE-26722) ● OK: SELECT * FROM ( SELECT a, b FROM t UNION ALL SELECT a, NULL FROM t ) AS t2 WHERE a = 1; ● KO: (missing results) SELECT * FROM ( SELECT a, b FROM t UNION ALL SELECT a, cast(NULL as STRING) FROM t ) AS t2 WHERE a = 1; HiveUnion(all=[true]) HiveProject(a=[$0], b=[$1]) HiveFilter(condition=[=(CAST($0):DOUBLE, 1)]) HiveTableScan(table=[[default, t]], table:alias=[t]) HiveProject(a=[$0], _o__c1=[null:VARCHAR(2147483647) CHARACTER SET "UTF-16LE"]) HiveFilter(condition=[=(CAST($0):DOUBLE, 1)]) HiveTableScan(table=[[default, t]], table:alias=[t]) HiveProject(a=[$0], b=[$1]) HiveFilter(condition=[=(CAST($0):DOUBLE, 1)]) HiveTableScan(table=[[default, t]], table:alias=[t])
  • 55. Incomplete / Incorrect Plan Hive - Example (HIVE-26722) call#5: Rule [HiveFilterProjectTransposeRule] produced [rel#41:HiveFilter] call#6: Rule [HiveFilterSetOpTransposeRule] produced [rel#43:HiveFilter ] call#7: Rule [HiveFilterProjectTransposeRule] produced [rel#47:HiveProject] call#24: Rule [HivePartitionPruneRule(Filter)] produced [rel#52:HiveFilter] call#32: Rule [HiveFieldTrimmerRule] produced [rel#70:HiveFilter] call#41: Rule [HiveFilterProjectTSTransposeRule] produced [rel#77:HiveProject] call#43: Rule [HiveCardinalityPreservingJoinRule] produced [rel#87:HiveProject] call#5: Rule [HiveFilterProjectTransposeRule] produced [rel#47:HiveFilter] call#6: Rule [HiveFilterSetOpTransposeRule] produced [rel#51:HiveUnion ] call#7: Rule [HiveFilterProjectTransposeRule] produced [rel#57:HiveProject] call#13: Rule [HiveFilterProjectTransposeRule] produced [rel#62:HiveProject] call#15: Rule [ReduceExpressionsRule(Project)] produced [rel#66:HiveProject] call#19: Rule [HiveFilterProjectTransposeRule] produced [rel#69:HiveProject] call#49: Rule [HivePartitionPruneRule(Filter)] produced [rel#73:HiveFilter] call#64: Rule [HiveFieldTrimmerRule] produced ...
  • 56. Disabling Rules via Configuration ● AbstractRelOptPlanner#setRuleDescExclusionFilter allows to exclude rules based on a regex over their description, even if registered in the planner ● Benefits: ○ It avoids recompiling to exclude some rules from planning while troubleshooting ○ Can be a quick workaround to customers once the faulty rule(s) is identified ○ Can be activated “per-query” ○ Less invasive than disabling CBO altogether (e.g., “hive.cbo.enable=false”) ● HIVE-25880: “Add property to exclude CBO rules by a regex on their description” ● Example: set hive.cbo.rule.exclusion.regex=HiveJoinPushTransitivePredicatesRule|HivePreFilteringRule;
  • 57. Conclusion ● Many planning issues boil down to rule transformations making plans bigger and bigger ● Built-in loggers indispensable for fast troubleshooting ● Easy configuration via XML/Property files (Log4j2) ● Configurable verbosity via FULL_PLAN marker ● Common pain points: ○ Using multiple equivalent operators (OR, SEARCH, IN, etc.) ○ Push/Pull predicate logic in various rules ○ Inconsistent simplifications during planning ● Exploit AbstractRelOptPlanner#setRuleDescExclusionFilter for quick workarounds and hypotheses verification