Stinger.next 
Alan F Gates 
@alanfgates 
December 2014 
Page 1 © Hortonworks Inc. 2014
Disclaimer 
This document may contain product features and technology directions that are under 
development or may be under development in the future. 
Technical feasibility, market demand, user feedback, and the Apache Software 
Foundation community development process can all affect timing and final delivery.
This document’s description of these features and technology directions does not 
represent a contractual commitment from Hortonworks to deliver these features in any 
generally available product. 
Product features and technology directions are subject to change, and must not be 
included in contracts, purchase orders, or sales agreements of any kind. 
Hadoop Summit EU Call For Abstracts Open 
Open until December 5, 2014 
Share your Hadoop knowledge and experience with the wider community 
Summit is April 15-16, 2015 in Brussels, Belgium
Tracks: 
• Committer Track 
• Data Science & Hadoop 
• Hadoop Governance, Security & Operations 
• Hadoop Access Engines 
• Applications of Hadoop and the Data Driven Business 
• The Future of Apache Hadoop 
Interactive SQL-IN-Hadoop Delivered 
Stinger Initiative – DELIVERED 
Next generation SQL based interactive query in Hadoop
Speed
Hive query performance has increased by 100X to allow for interactive query times (seconds)
Scale
The only SQL interface to Hadoop designed for queries that scale from TB to PB
SQL
Support broadest range of SQL semantics for analytic applications running against Hadoop
[Diagram labels: Business Analytics, Custom SQL Apps, Window Functions, Apache Hive, Apache MapReduce, Apache Tez, Apache YARN, nodes 1 to N, HDFS (Hadoop Distributed File System)]
Apache Hive Contribution… an Open Community at its finest
• 1,672 Jira Tickets Closed
• 145 Developers
• 44 Companies
• ~390,000 Lines Of Code Added… (2x)
Stinger Project 
Stinger Phase 1: 
• Base Optimizations 
• SQL Types 
• SQL Analytic Functions 
• ORCFile Modern File Format 
Stinger Phase 2: 
HDP 2.1 
• SQL Types 
• SQL Analytic Functions 
• Advanced Optimizations 
• Performance Boosts via YARN 
Stinger Phase 3 
• Hive on Apache Tez 
• Query Service (always on) 
• Buffer Cache 
• Cost Based Optimizer (Optiq) 
13 Months
[Diagram labels: Governance & Integration, Security, Operations, Data Access, Data Management, ORC File]
Hive – Single tool for all SQL use cases 
Data sources:
• OLTP, ERP, CRM Systems
• Unstructured documents, emails
• Server logs
• Clickstream
• Sentiment, Web Data
• Sensor, Machine Data
• Geolocation
Hive - SQL use cases:
• Interactive Analytics
• Batch Reports / Deep Analytics
• ETL / ELT
Stinger.next - Delivery Themes 
Beyond Read-Only (2nd Half 2014)
• Transactions with ACID allowing insert, update and delete
• Temporary Tables
• Cost Based Optimizer optimizes star and bushy join queries
Sub-Second (1st Half 2015)
• Sub-Second queries with LLAP
• Hive-Spark Machine Learning integration
• Operational reporting with Hive Streaming Ingest and Transactions
• Replication and SQL/CBO improvements
Richer Analytics (2nd Half 2015)
• Toward SQL:2011 Analytics
• Materialized Views
• Cross-Geo Queries
• Workload Management via YARN and LLAP integration
Deep Dive: Cost Based Optimizer 
• Phase 1 
• CBO Introduced 
• CBO does join re-ordering 
• Initial collection of statistics 
• Phase 2 
• Handle queries with more joins 
• Better plans for star and bushy (multi-star) join schemas 
• Opportunistic improvements based on sample queries 
• Better integration of Calcite into Hive infrastructure 
• More statistics with better usability 
• Better predicate handling 
• Phase 3 
• Move existing simple optimizations into cost based optimizer 
• Build more complex optimization into Calcite 
Phase status labels, in slide order: [Done], [Hive 0.14], [2015]
[Diagram labels: SQL; CBO (Based on Calcite); Hive Rule Based Optimizations; Query Plan]
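To make "CBO does join re-ordering" concrete, here is a minimal sketch in Python. It is not Calcite's or Hive's algorithm: the table names, row counts and the single fixed join selectivity are hypothetical, and the cost model (total rows in the intermediate results of a left-deep plan) is a deliberately simple assumption.

```python
from itertools import permutations

# Hypothetical table statistics (row counts); not taken from the slides.
ROW_COUNTS = {"store_sales": 1_000_000, "item": 10_000, "date_dim": 365}

# Assumed fraction of the cross product that survives each join.
SELECTIVITY = 1e-4


def plan_cost(order, row_counts, selectivity=SELECTIVITY):
    """Cost of a left-deep join plan: total rows in all intermediate results."""
    rows = row_counts[order[0]]
    cost = 0.0
    for table in order[1:]:
        rows = rows * row_counts[table] * selectivity
        cost += rows
    return cost


def best_join_order(row_counts):
    """Exhaustively pick the cheapest left-deep order (fine for a few tables)."""
    return min(permutations(row_counts), key=lambda o: plan_cost(o, row_counts))


if __name__ == "__main__":
    order = best_join_order(ROW_COUNTS)
    print(order, plan_cost(order, ROW_COUNTS))
```

With these assumed statistics the cheapest plan joins the two small tables first and the large table last. Exhaustive search does not scale to many joins, so a real optimizer needs statistics plus a smarter search strategy.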
Performance Improvement – Query 17 
Scale = 30TB 
Input records ~186mil 
CBO   Elapsed Time (sec)   Elapsed Time   Intermediate data (GB)   Output and Intermediate Records
OFF   10,683               ~3 hrs         5,017                    135,647,792,123
ON    1,284                ~20 mins       275                      8,543,232,360
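The size of the improvement is easier to read as ratios. The short Python sketch below computes them from the figures in the Query 17 table; the dictionary keys and the function name are illustrative choices, not anything from the slides.

```python
# Figures copied from the Query 17 table (CBO OFF vs. CBO ON).
OFF = {"elapsed_sec": 10_683, "intermediate_gb": 5_017, "records": 135_647_792_123}
ON = {"elapsed_sec": 1_284, "intermediate_gb": 275, "records": 8_543_232_360}


def reduction_factors(before, after):
    """Return the before/after ratio for every metric present in both dicts."""
    return {key: before[key] / after[key] for key in before if key in after}


if __name__ == "__main__":
    for metric, factor in reduction_factors(OFF, ON).items():
        print(f"{metric}: {factor:.1f}x lower with CBO on")
```

The ratios come out at roughly 8x for elapsed time, 18x for intermediate data and 16x for record counts.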
Transaction Use Cases 
• Reporting with Analytics (YES) 
• Reporting on data with occasional updates 
• Corrections to the fact tables, evolving dimension tables 
• Low concurrency updates, low TPS 
• Operational Reporting (YES) 
• High throughput ingest from operational (OLTP) database 
• Periodic inserts every 5-30 minutes 
• Requires tool support 
• Operational (OLTP) Database (NO) 
• Small Transactions, each doing single line inserts 
• High Concurrency - Hundreds to thousands of connections 
[Diagram labels: Analytics, Modifications, Hive; Replication, OLTP, Hive; High Concurrency, OLTP, Hive]
Deep Dive: Transactions 
Transaction Support in Hive with ACID semantics 
• Hive native support for INSERT, UPDATE, DELETE. 
• Split Into Phases: 
• Phase 1: Hive Streaming Ingest (append) [Hive 0.13]
• Phase 2: INSERT / UPDATE / DELETE Support [Hive 0.14]
• Phase 3: BEGIN / COMMIT / ROLLBACK Txn
[Diagram: read-optimized ORCFile, delta file, and merged read-optimized ORCFile, in three stages]
1. Original File: Task reads the latest ORCFile
2. Edits Made: Task reads the ORCFile and merges the delta file with the edits
3. Edits Merged: Task reads the updated ORCFile
Hive ACID Compactor periodically merges the delta files in the background
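The read path on this slide (a base file plus delta files, merged at read time and compacted in the background) can be sketched as a toy model in Python. This is only an illustration under simplifying assumptions: rows are keyed by a hypothetical integer row id, each delta is a list of `(op, row_id, row)` tuples, and Hive's real file layout and transaction bookkeeping are not modelled.

```python
def apply_deltas(base, deltas):
    """Merge-on-read: overlay delta files (oldest first) on the base rows."""
    rows = dict(base)  # never mutate the read-optimized base file
    for delta in deltas:
        for op, row_id, row in delta:
            if op == "delete":
                rows.pop(row_id, None)
            else:  # "insert" and "update" both write the latest row image
                rows[row_id] = row
    return rows


class ToyAcidTable:
    """Hypothetical stand-in for a table stored as one base file plus deltas."""

    def __init__(self, base):
        self.base = dict(base)
        self.deltas = []

    def write(self, events):
        # Each write appends a new delta; the base file is left untouched.
        self.deltas.append(list(events))

    def read(self):
        # Readers see the base with all deltas merged in.
        return apply_deltas(self.base, self.deltas)

    def compact(self):
        # Background compaction: fold all deltas into a new base file.
        self.base = apply_deltas(self.base, self.deltas)
        self.deltas = []


if __name__ == "__main__":
    table = ToyAcidTable({1: "a", 2: "b"})
    table.write([("update", 2, "B"), ("insert", 3, "c")])
    table.write([("delete", 1, None)])
    print(table.read())
    table.compact()
    print(table.read(), len(table.deltas))
```

Reads return the same rows before and after compaction. Compaction only changes how much merging each reader has to do.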
Sub-Second: Tez with LLAP 
• LLAP is a node resident daemon process
• Low latency by reducing setup cost
• Multi-threaded engine that runs smaller tasks for query including reads, filter and some joins
• Use regular Tez tasks for larger shuffle and other operators
• LLAP has In-memory columnar data cache
• Low latency by providing data from in-memory cache instead of going to HDFS
• Store data in columnar format for vectorization irrespective of underlying file type
• Security enforced across queries and users
• Uses YARN for resource management
LLAP = Live Long And Process
[Diagram labels: Node, Query Fragment, LLAP Process, LLAP process running a task for a query, LLAP In-Memory columnar cache, HDFS]
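The idea of serving column data from memory instead of going back to HDFS can be illustrated with a small read-through cache. This Python sketch is not the LLAP API: the class name, the `(path, column)` cache key, the fake storage dictionary and the file path are all hypothetical, and eviction, compression and security are left out.

```python
class ColumnCache:
    """Toy read-through cache keyed by (file, column); not the LLAP API."""

    def __init__(self, load_column):
        self._load_column = load_column  # slow path, e.g. a read from HDFS
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def get(self, path, column):
        key = (path, column)
        if key in self._cache:
            self.hits += 1
        else:
            self.misses += 1
            self._cache[key] = self._load_column(path, column)
        return self._cache[key]


# Hypothetical "storage": one file holding two columns.
FAKE_STORAGE = {"/warehouse/sales/part-0": {"price": [10, 20, 30], "qty": [1, 2, 3]}}


def load_from_storage(path, column):
    """Stand-in for reading a single column of a file from slow storage."""
    return list(FAKE_STORAGE[path][column])


if __name__ == "__main__":
    cache = ColumnCache(load_from_storage)
    first = sum(cache.get("/warehouse/sales/part-0", "price"))   # loads from storage
    second = sum(cache.get("/warehouse/sales/part-0", "price"))  # served from memory
    print(first, second, cache.hits, cache.misses)
```

Caching per column rather than per file means a query that touches only `price` never pays to load `qty`.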
Deeper Dive: Tez with LLAP engine 
LLAP is an optional daemon process running on multiple nodes, that provides the following: 
• Caching and data reuse across queries with compressed columnar data in-memory (off-heap) 
• Multi-threaded execution including reads with predicate pushdown and hash joins 
• High throughput IO using Async IO Elevator with dedicated thread and core per disk 
• Granular column level security across applications 
• YARN will provide workload management in LLAP by using delegation 
LLAP process runs on multiple nodes, accelerating Tez tasks
[Diagram labels: Hive Query, Query Fragment, Node, LLAP Process, LLAP process running read task for a query, LLAP In-Memory columnar cache, HDFS]
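Two of the operations named above, reads with predicate pushdown and hash joins, can be shown together in a few lines of Python. This is a simplified sketch rather than LLAP's implementation: rows are plain dicts, the function and field names are hypothetical, and "pushdown" here just means the filter is applied while reading the probe side, before any join work is done.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, key, probe_filter=None):
    """Join two lists of dicts on `key`, filtering probe rows before the join."""
    # Build phase: index the (ideally smaller) side by join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[key]].append(row)

    # Probe phase: the filter runs first, so rejected rows never reach the join.
    joined = []
    for row in probe_rows:
        if probe_filter is not None and not probe_filter(row):
            continue
        for match in table.get(row[key], []):
            joined.append({**match, **row})
    return joined


if __name__ == "__main__":
    items = [{"item_id": 1, "name": "pen"}, {"item_id": 2, "name": "ink"}]
    sales = [
        {"item_id": 1, "qty": 5},
        {"item_id": 2, "qty": 1},
        {"item_id": 1, "qty": 9},
        {"item_id": 3, "qty": 7},
    ]
    for row in hash_join(items, sales, "item_id", lambda r: r["qty"] >= 5):
        print(row)
```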
Deep Dive: Engines 
• Tez 
• Phase 1 
• Pipelined, Vectorized Execution 
• Low latency startup 
– Hold on to sessions 
– Hold on to pre-warmed containers 
• Phase 2 
• Dynamic Partition Pruning 
• Improved Tez Shuffle 
– Compression / Vectorization 
• Tez + LLAP for Sub-Second Queries 
• Phase 3 
• LLAP Processes with: 
• Multi-threaded Execution Engine 
• In-Memory Columnar Cache 
• Phase 4 
• YARN workload management for LLAP
Phase status labels, in slide order: [Done], [Champlain], [1H, 2015], [2H, 2015]
[Diagram: three execution models side by side, with task nodes labelled M, R and T]
• Map – Reduce: Intermediate results in HDFS
• Tez: Optimized Pipeline
• Tez with LLAP: Resident process on Nodes
[Diagram labels: Hive, HDFS, LLAP process running read task, LLAP In-Memory columnar cache, Map tasks read HDFS]
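Dynamic Partition Pruning, listed under Phase 2, is the least self-explanatory item on this slide. The general technique is to evaluate the filtered dimension side of a join first, then scan only the fact-table partitions whose key survived the filter. The Python sketch below shows that idea under stated assumptions; the partition names, tables and function are hypothetical and it does not reflect how Hive or Tez implement it.

```python
# Hypothetical fact table partitioned by day; values are the rows per partition.
FACT_PARTITIONS = {
    "2014-12-01": [{"day": "2014-12-01", "amount": 10}],
    "2014-12-02": [{"day": "2014-12-02", "amount": 20}],
    "2014-12-03": [{"day": "2014-12-03", "amount": 30}],
}

# Hypothetical dimension table.
DATE_DIM = [
    {"day": "2014-12-01", "is_holiday": False},
    {"day": "2014-12-02", "is_holiday": True},
    {"day": "2014-12-03", "is_holiday": True},
]


def pruned_scan(partitions, dim_rows, dim_predicate, key):
    """Scan only the fact partitions whose key survives the dimension filter."""
    # Step 1: evaluate the dimension side first to learn which keys are needed.
    wanted = {row[key] for row in dim_rows if dim_predicate(row)}
    # Step 2: skip every partition that cannot contribute to the join.
    scanned = [name for name in partitions if name in wanted]
    rows = [row for name in scanned for row in partitions[name]]
    return scanned, rows


if __name__ == "__main__":
    scanned, rows = pruned_scan(
        FACT_PARTITIONS, DATE_DIM, lambda r: r["is_holiday"], "day"
    )
    print(scanned, sum(r["amount"] for r in rows))
```

The pruning is "dynamic" because the set of wanted partitions is only known at run time, after the dimension filter has been evaluated.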
SQL Support 
SQL Datatypes                 SQL Semantics
INT/TINYINT/SMALLINT/BIGINT   SELECT, INSERT
FLOAT/DOUBLE                  GROUP BY, ORDER BY, HAVING
BOOLEAN                       Inner, outer, cross and semi joins
ARRAY, MAP, STRUCT, UNION     Sub-queries in the FROM clause
STRING                        ROLLUP and CUBE
BINARY                        UNION
TIMESTAMP                     Standard aggregations (sum, avg, etc.)
DECIMAL                       Custom Java UDFs
DATE                          Windowing functions (OVER, RANK, etc.)
VARCHAR                       Advanced UDFs (ngram, XPath, URL)
CHAR                          Sub-queries for IN/NOT IN, HAVING
Interval Types                JOINs in WHERE Clause
                              Common Table Expressions (WITH Clause)
                              INSERT / UPDATE / DELETE
                              Non-equi joins
                              Set functions - Union, Except, Intersect
                              All sub-queries
                              Minor syntax differences resolved – rollup, case
Goal: SQL 2011 Analytic Functions
Legend: Available Now, HDP Champlain, Stinger.next
Questions? 

Contenu connexe

Tendances

Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Hortonworks
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez Hortonworks
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopHortonworks
 
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache HadoopHortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache HadoopHortonworks
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceHortonworks
 
Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'Hortonworks
 
Discover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchDiscover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchHortonworks
 
Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture Hortonworks
 
State of the Union with Shaun Connolly
State of the Union with Shaun ConnollyState of the Union with Shaun Connolly
State of the Union with Shaun ConnollyHortonworks
 
Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Hortonworks
 
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...Hortonworks
 
Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramHortonworks
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks
 
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...Hortonworks
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataHortonworks
 
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksPowering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksHortonworks
 
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopEnrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopHortonworks
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Cécile Poyet
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks
 

Tendances (20)

Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
 
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache HadoopHortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
Hortonworks Technical Workshop: Real Time Monitoring with Apache Hadoop
 
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data GovernanceDiscover HDP 2.2: Apache Falcon for Hadoop Data Governance
Discover HDP 2.2: Apache Falcon for Hadoop Data Governance
 
Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'
 
Discover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchDiscover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop Search
 
Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture Delivering Apache Hadoop for the Modern Data Architecture
Delivering Apache Hadoop for the Modern Data Architecture
 
State of the Union with Shaun Connolly
State of the Union with Shaun ConnollyState of the Union with Shaun Connolly
State of the Union with Shaun Connolly
 
Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014
 
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
 
Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready Program
 
Hortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - WebinarHortonworks and Platfora in Financial Services - Webinar
Hortonworks and Platfora in Financial Services - Webinar
 
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
Starting Small and Scaling Big with Hadoop (Talend and Hortonworks webinar)) ...
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big Data
 
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and HortonworksPowering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
Powering Fast Data and the Hadoop Ecosystem with VoltDB and Hortonworks
 
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache HadoopEnrich a 360-degree Customer View with Splunk and Apache Hadoop
Enrich a 360-degree Customer View with Splunk and Apache Hadoop
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It!
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
 

En vedette

Calcite meetup-2016-04-20
Calcite meetup-2016-04-20Calcite meetup-2016-04-20
Calcite meetup-2016-04-20Josh Elser
 
Hybrid architecture integrateduserviewdata-peyman_mohajerian
Hybrid architecture integrateduserviewdata-peyman_mohajerianHybrid architecture integrateduserviewdata-peyman_mohajerian
Hybrid architecture integrateduserviewdata-peyman_mohajerianData Con LA
 
Impala presentation ahad rana
Impala presentation ahad ranaImpala presentation ahad rana
Impala presentation ahad ranaData Con LA
 
VoltDB Big Data Camp LA 2014 - Scott Jar
VoltDB  Big Data Camp LA 2014 - Scott JarVoltDB  Big Data Camp LA 2014 - Scott Jar
VoltDB Big Data Camp LA 2014 - Scott JarData Con LA
 
Aziksa hadoop architecture santosh jha
Aziksa hadoop architecture santosh jhaAziksa hadoop architecture santosh jha
Aziksa hadoop architecture santosh jhaData Con LA
 
Big Data Day LA 2015 - Brainwashed: Building an IDE for Feature Engineering b...
Big Data Day LA 2015 - Brainwashed: Building an IDE for Feature Engineering b...Big Data Day LA 2015 - Brainwashed: Building an IDE for Feature Engineering b...
Big Data Day LA 2015 - Brainwashed: Building an IDE for Feature Engineering b...Data Con LA
 
Big Data Day LA 2016/ Big Data Track - Real Time Analytics with Druid - Guill...
Big Data Day LA 2016/ Big Data Track - Real Time Analytics with Druid - Guill...Big Data Day LA 2016/ Big Data Track - Real Time Analytics with Druid - Guill...
Big Data Day LA 2016/ Big Data Track - Real Time Analytics with Druid - Guill...Data Con LA
 
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...Data Con LA
 
Big Data Day LA 2016/ Big Data Track - Apply R in Enterprise Applications, Lo...
Big Data Day LA 2016/ Big Data Track - Apply R in Enterprise Applications, Lo...Big Data Day LA 2016/ Big Data Track - Apply R in Enterprise Applications, Lo...
Big Data Day LA 2016/ Big Data Track - Apply R in Enterprise Applications, Lo...Data Con LA
 
Stream your Operational Data with Apache Spark & Kafka into Hadoop using Couc...
Stream your Operational Data with Apache Spark & Kafka into Hadoop using Couc...Stream your Operational Data with Apache Spark & Kafka into Hadoop using Couc...
Stream your Operational Data with Apache Spark & Kafka into Hadoop using Couc...Data Con LA
 
Real time big data analytics with Storm by Ron Bodkin of Think Big Analytics
Real time big data analytics with Storm by Ron Bodkin of Think Big AnalyticsReal time big data analytics with Storm by Ron Bodkin of Think Big Analytics
Real time big data analytics with Storm by Ron Bodkin of Think Big AnalyticsData Con LA
 
HUG_Ireland_Apache_Arrow_Tomer_Shiran
HUG_Ireland_Apache_Arrow_Tomer_Shiran HUG_Ireland_Apache_Arrow_Tomer_Shiran
HUG_Ireland_Apache_Arrow_Tomer_Shiran John Mulhall
 
Big Data Day LA 2016/ Big Data Track - Fluentd and Embulk: Collect More Data,...
Big Data Day LA 2016/ Big Data Track - Fluentd and Embulk: Collect More Data,...Big Data Day LA 2016/ Big Data Track - Fluentd and Embulk: Collect More Data,...
Big Data Day LA 2016/ Big Data Track - Fluentd and Embulk: Collect More Data,...Data Con LA
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Panel - Interactive Applic...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Panel - Interactive Applic...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Panel - Interactive Applic...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Panel - Interactive Applic...Data Con LA
 
Big Data Day LA 2016/ NoSQL track - Analytics at the Speed of Light with Redi...
Big Data Day LA 2016/ NoSQL track - Analytics at the Speed of Light with Redi...Big Data Day LA 2016/ NoSQL track - Analytics at the Speed of Light with Redi...
Big Data Day LA 2016/ NoSQL track - Analytics at the Speed of Light with Redi...Data Con LA
 
Big Data Day LA 2016/ Data Science Track - Affinity Marketing Leveraging Crow...
Big Data Day LA 2016/ Data Science Track - Affinity Marketing Leveraging Crow...Big Data Day LA 2016/ Data Science Track - Affinity Marketing Leveraging Crow...
Big Data Day LA 2016/ Data Science Track - Affinity Marketing Leveraging Crow...Data Con LA
 
Big Data Day LA 2015 - Building a Big Data Culture in the Entertainment Indus...
Big Data Day LA 2015 - Building a Big Data Culture in the Entertainment Indus...Big Data Day LA 2015 - Building a Big Data Culture in the Entertainment Indus...
Big Data Day LA 2015 - Building a Big Data Culture in the Entertainment Indus...Data Con LA
 
Big Data Day LA 2016/ Big Data Track - Rapid Analytics @ Netflix LA (Updated ...
Big Data Day LA 2016/ Big Data Track - Rapid Analytics @ Netflix LA (Updated ...Big Data Day LA 2016/ Big Data Track - Rapid Analytics @ Netflix LA (Updated ...
Big Data Day LA 2016/ Big Data Track - Rapid Analytics @ Netflix LA (Updated ...Data Con LA
 
Big Data Day LA 2016 Keynote - Reynold Xin/ Databricks
Big Data Day LA 2016 Keynote - Reynold Xin/ DatabricksBig Data Day LA 2016 Keynote - Reynold Xin/ Databricks
Big Data Day LA 2016 Keynote - Reynold Xin/ DatabricksData Con LA
 
Apache HBase - Introduction & Use Cases
Apache HBase - Introduction & Use CasesApache HBase - Introduction & Use Cases
Apache HBase - Introduction & Use CasesData Con LA
 

En vedette (20)

Calcite meetup-2016-04-20
Calcite meetup-2016-04-20Calcite meetup-2016-04-20
Calcite meetup-2016-04-20
 
Hybrid architecture integrateduserviewdata-peyman_mohajerian
Hybrid architecture integrateduserviewdata-peyman_mohajerianHybrid architecture integrateduserviewdata-peyman_mohajerian
Hybrid architecture integrateduserviewdata-peyman_mohajerian
 
Impala presentation ahad rana
Impala presentation ahad ranaImpala presentation ahad rana
Impala presentation ahad rana
 
VoltDB Big Data Camp LA 2014 - Scott Jar
VoltDB  Big Data Camp LA 2014 - Scott JarVoltDB  Big Data Camp LA 2014 - Scott Jar
VoltDB Big Data Camp LA 2014 - Scott Jar
 
Aziksa hadoop architecture santosh jha
Aziksa hadoop architecture santosh jhaAziksa hadoop architecture santosh jha
Aziksa hadoop architecture santosh jha
 
Big Data Day LA 2015 - Brainwashed: Building an IDE for Feature Engineering b...
Big Data Day LA 2015 - Brainwashed: Building an IDE for Feature Engineering b...Big Data Day LA 2015 - Brainwashed: Building an IDE for Feature Engineering b...
Big Data Day LA 2015 - Brainwashed: Building an IDE for Feature Engineering b...
 
Big Data Day LA 2016/ Big Data Track - Real Time Analytics with Druid - Guill...
Big Data Day LA 2016/ Big Data Track - Real Time Analytics with Druid - Guill...Big Data Day LA 2016/ Big Data Track - Real Time Analytics with Druid - Guill...
Big Data Day LA 2016/ Big Data Track - Real Time Analytics with Druid - Guill...
 
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...
Big Data Day LA 2016/ NoSQL track - Apache Kudu: Fast Analytics on Fast Data,...
 
Big Data Day LA 2016/ Big Data Track - Apply R in Enterprise Applications, Lo...
Big Data Day LA 2016/ Big Data Track - Apply R in Enterprise Applications, Lo...Big Data Day LA 2016/ Big Data Track - Apply R in Enterprise Applications, Lo...
Big Data Day LA 2016/ Big Data Track - Apply R in Enterprise Applications, Lo...
 
Stream your Operational Data with Apache Spark & Kafka into Hadoop using Couc...
Stream your Operational Data with Apache Spark & Kafka into Hadoop using Couc...Stream your Operational Data with Apache Spark & Kafka into Hadoop using Couc...
Stream your Operational Data with Apache Spark & Kafka into Hadoop using Couc...
 
Real time big data analytics with Storm by Ron Bodkin of Think Big Analytics
Real time big data analytics with Storm by Ron Bodkin of Think Big AnalyticsReal time big data analytics with Storm by Ron Bodkin of Think Big Analytics
Real time big data analytics with Storm by Ron Bodkin of Think Big Analytics
 
HUG_Ireland_Apache_Arrow_Tomer_Shiran
HUG_Ireland_Apache_Arrow_Tomer_Shiran HUG_Ireland_Apache_Arrow_Tomer_Shiran
HUG_Ireland_Apache_Arrow_Tomer_Shiran
 
Big Data Day LA 2016/ Big Data Track - Fluentd and Embulk: Collect More Data,...
Big Data Day LA 2016/ Big Data Track - Fluentd and Embulk: Collect More Data,...Big Data Day LA 2016/ Big Data Track - Fluentd and Embulk: Collect More Data,...
Big Data Day LA 2016/ Big Data Track - Fluentd and Embulk: Collect More Data,...
 
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Panel - Interactive Applic...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Panel - Interactive Applic...Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Panel - Interactive Applic...
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Panel - Interactive Applic...
 
Big Data Day LA 2016/ NoSQL track - Analytics at the Speed of Light with Redi...
Big Data Day LA 2016/ NoSQL track - Analytics at the Speed of Light with Redi...Big Data Day LA 2016/ NoSQL track - Analytics at the Speed of Light with Redi...
Big Data Day LA 2016/ NoSQL track - Analytics at the Speed of Light with Redi...
 
Big Data Day LA 2016/ Data Science Track - Affinity Marketing Leveraging Crow...
Big Data Day LA 2016/ Data Science Track - Affinity Marketing Leveraging Crow...Big Data Day LA 2016/ Data Science Track - Affinity Marketing Leveraging Crow...
Big Data Day LA 2016/ Data Science Track - Affinity Marketing Leveraging Crow...
 
Big Data Day LA 2015 - Building a Big Data Culture in the Entertainment Indus...
Big Data Day LA 2015 - Building a Big Data Culture in the Entertainment Indus...Big Data Day LA 2015 - Building a Big Data Culture in the Entertainment Indus...
Big Data Day LA 2015 - Building a Big Data Culture in the Entertainment Indus...
 
Big Data Day LA 2016/ Big Data Track - Rapid Analytics @ Netflix LA (Updated ...
Big Data Day LA 2016/ Big Data Track - Rapid Analytics @ Netflix LA (Updated ...Big Data Day LA 2016/ Big Data Track - Rapid Analytics @ Netflix LA (Updated ...
Big Data Day LA 2016/ Big Data Track - Rapid Analytics @ Netflix LA (Updated ...
 
Big Data Day LA 2016 Keynote - Reynold Xin/ Databricks
Big Data Day LA 2016 Keynote - Reynold Xin/ DatabricksBig Data Day LA 2016 Keynote - Reynold Xin/ Databricks
Big Data Day LA 2016 Keynote - Reynold Xin/ Databricks
 
Apache HBase - Introduction & Use Cases
Apache HBase - Introduction & Use CasesApache HBase - Introduction & Use Cases
Apache HBase - Introduction & Use Cases
 

Similaire à Stinger.Next by Alan Gates of Hortonworks

LLAP: long-lived execution in Hive
LLAP: long-lived execution in HiveLLAP: long-lived execution in Hive
LLAP: long-lived execution in HiveDataWorks Summit
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureDataWorks Summit
 
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Building Scalable Big Data Infrastructure Using Open Source Software Presenta...Building Scalable Big Data Infrastructure Using Open Source Software Presenta...
Stinger.Next by Alan Gates of Hortonworks

  • 1. Stinger.next Alan F Gates @alanfgates December 2014 Page 1 © Hortonworks Inc. 2014
  • 2. Disclaimer
      This document may contain product features and technology directions that are under development or may be under development in the future.
      Technical feasibility, market demand, user feedback, and the Apache Software Foundation community development process can all affect timing and final delivery.
      This document’s description of these features and technology directions does not represent a contractual commitment from Hortonworks to deliver these features in any generally available product.
      Product features and technology directions are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.
  • 3. Hadoop Summit EU Call For Abstracts Open
      Open until December 5, 2014
      Share your Hadoop knowledge and experience with the wider community
      Summit is April 15-16, 2015 in Brussels, Belgium
      Tracks:
      • Committer Track
      • Data Science & Hadoop
      • Hadoop Governance, Security & Operations
      • Hadoop Access Engines
      • Applications of Hadoop and the Data Driven Business
      • The Future of Apache Hadoop
  • 4. Interactive SQL-IN-Hadoop Delivered
      Stinger Initiative – DELIVERED: next generation SQL based interactive query in Hadoop
      • Speed: Hive query performance has increased by 100X to allow for interactive query times (seconds)
      • Scale: The only SQL interface to Hadoop designed for queries that scale from TB to PB
      • SQL: Support broadest range of SQL semantics for analytic applications running against Hadoop
      Apache Hive Contribution… an Open Community at its finest
      • 1,672 Jira Tickets Closed
      • 145 Developers
      • 44 Companies
      • ~390,000 Lines Of Code Added… (2x)
      Stinger Project
      • Stinger Phase 1: Base Optimizations; SQL Types; SQL Analytic Functions; ORCFile Modern File Format
      • Stinger Phase 2: SQL Types; SQL Analytic Functions; Advanced Optimizations; Performance Boosts via YARN
      • Stinger Phase 3: Hive on Apache Tez; Query Service (always on); Buffer Cache; Cost Based Optimizer (Optiq)
      [Diagram labels: Business Analytics; Custom Apps; SQL; Window Functions; Apache Hive; Apache MapReduce; Apache Tez; Apache YARN; HDFS (Hadoop Distributed File System); ORC File; HDP 2.1; 13 Months; Governance & Integration; Security; Operations; Data Access; Data Management]
  • 5. Hive – Single tool for all SQL use cases
      Data sources:
      • OLTP, ERP, CRM Systems
      • Unstructured documents, emails
      • Server logs
      • Clickstream
      • Sentiment, Web Data
      • Sensor, Machine Data
      • Geolocation
      Hive - SQL use cases:
      • Interactive Analytics
      • Batch Reports / Deep Analytics
      • ETL / ELT
  • 6. Stinger.next - Delivery Themes
      • Beyond Read-Only (2nd Half 2014)
          • Transactions with ACID allowing insert, update and delete
          • Temporary Tables
          • Cost Based Optimizer optimizes star and bushy join queries
      • Sub-Second (1st Half 2015)
          • Sub-Second queries with LLAP
          • Hive-Spark Machine Learning integration
          • Operational reporting with Hive Streaming Ingest and Transactions
          • Replication and SQL/CBO improvements
      • Richer Analytics (2nd Half 2015)
          • Toward SQL:2011 Analytics
          • Materialized Views
          • Cross-Geo Queries
          • Workload Management via YARN and LLAP integration
  • 7. Deep Dive: Cost Based Optimizer
      • Phase 1
          • CBO Introduced
          • CBO does join re-ordering
          • Initial collection of statistics
      • Phase 2
          • Handle queries with more joins
          • Better plans for star and bushy (multi-star) join schemas
          • Opportunistic improvements based on sample queries
          • Better integration of Calcite into Hive infrastructure
          • More statistics with better usability
          • Better predicate handling
      • Phase 3
          • Move existing simple optimizations into cost based optimizer
          • Build more complex optimization into Calcite
      Status labels shown on the slide, in order: [Done] [Hive 0.14] [2015]
      [Diagram labels: SQL; CBO Based on Calcite; Hive Rule Based Optimizations; Query Plan]
  • 8. Performance Improvement – Query 17
      Scale = 30TB, input records ~186mil

      CBO   Elapsed Time (sec)   Elapsed Time   Intermediate data (GB)   Output and Intermediate Records
      OFF   10,683               ~3 hrs         5,017                    135,647,792,123
      ON    1,284                ~20 mins       275                      8,543,232,360
  • 9. Transaction Use Cases
      • Reporting with Analytics (YES)
          • Reporting on data with occasional updates
          • Corrections to the fact tables, evolving dimension tables
          • Low concurrency updates, low TPS
      • Operational Reporting (YES)
          • High throughput ingest from operational (OLTP) database
          • Periodic inserts every 5-30 minutes
          • Requires tool support
      • Operational (OLTP) Database (NO)
          • Small Transactions, each doing single line inserts
          • High Concurrency - Hundreds to thousands of connections
      [Diagram labels: Analytics Modifications; Hive Replication; OLTP; Hive; High Concurrency OLTP]
  • 10. Deep Dive: Transactions
      Transaction Support in Hive with ACID semantics
      • Hive native support for INSERT, UPDATE, DELETE.
      • Split Into Phases:
          • Phase 1: Hive Streaming Ingest (append)
          • Phase 2: INSERT / UPDATE / DELETE Support
          • Phase 3: BEGIN / COMMIT / ROLLBACK Txn
      Release labels shown on the slide, in order: [Hive 0.13] [Hive 0.14]
      Diagram (Read-Optimized ORCFile, Delta File, Merged Read-Optimized ORCFile):
      1. Original File: Task reads the latest ORCFile
      2. Edits Made: Task reads the ORCFile and merges the delta file with the edits
      3. Edits Merged: Task reads the updated ORCFile
      Hive ACID Compactor periodically merges the delta files in the background
  • 11. Sub-Second: Tez with LLAP
      LLAP = Live Long And Process
      • LLAP is a node resident daemon process
          • Low latency by reducing setup cost
          • Multi-threaded engine that runs smaller tasks for query including reads, filter and some joins
          • Use regular Tez tasks for larger shuffle and other operators
      • LLAP has In-memory columnar data cache
          • Low latency by providing data from in-memory cache instead of going to HDFS
          • Store data in columnar format for vectorization irrespective of underlying file type
          • Security enforced across queries and users
      • Uses YARN for resource management
      [Diagram labels: Node; Query Fragment; LLAP Process; LLAP process running a task for a query; LLAP In-Memory columnar cache; HDFS]
  • 12. Deeper Dive: Tez with LLAP engine
      LLAP is an optional daemon process running on multiple nodes that provides the following:
      • Caching and data reuse across queries with compressed columnar data in-memory (off-heap)
      • Multi-threaded execution including reads with predicate pushdown and hash joins
      • High throughput IO using Async IO Elevator with dedicated thread and core per disk
      • Granular column level security across applications
      • YARN will provide workload management in LLAP by using delegation
      LLAP process runs on multiple nodes, accelerating Tez tasks
      [Diagram labels: Hive Query; Node; LLAP Process; Query Fragment; LLAP process running read task for a query; LLAP In-Memory columnar cache; HDFS]
  • 13. Deep Dive: Engines
      • Tez
          • Phase 1
              • Pipelined, Vectorized Execution
              • Low latency startup
                  – Hold on to sessions
                  – Hold on to pre-warmed containers
          • Phase 2
              • Dynamic Partition Pruning
              • Improved Tez Shuffle
                  – Compression / Vectorization
      • Tez + LLAP for Sub-Second Queries
          • Phase 3
              • LLAP Processes with:
                  • Multi-threaded Execution Engine
                  • In-Memory Columnar Cache
          • Phase 4
              • YARN workload management for LLAP
      Status labels shown on the slide, in order: [Done] [Champlain] [1H, 2015] [2H, 2015]
      [Diagram panels: Map – Reduce (intermediate results in HDFS); Tez (optimized pipeline); Tez with LLAP (resident process on nodes). Other labels: Map tasks read HDFS; LLAP process running read task; LLAP In-Memory columnar cache; Hive; HDFS]
  • 14. SQL Support
      SQL Datatypes:
      • INT/TINYINT/SMALLINT/BIGINT
      • FLOAT/DOUBLE
      • BOOLEAN
      • ARRAY, MAP, STRUCT, UNION
      • STRING
      • BINARY
      • TIMESTAMP
      • DECIMAL
      • DATE
      • VARCHAR
      • CHAR
      • Interval Types
      SQL Semantics:
      • SELECT, INSERT
      • GROUP BY, ORDER BY, HAVING
      • Inner, outer, cross and semi joins
      • Sub-queries in the FROM clause
      • ROLLUP and CUBE
      • UNION
      • Standard aggregations (sum, avg, etc.)
      • Custom Java UDFs
      • Windowing functions (OVER, RANK, etc.)
      • Advanced UDFs (ngram, XPath, URL)
      • Sub-queries for IN/NOT IN, HAVING
      • JOINs in WHERE Clause
      • Common Table Expressions (WITH Clause)
      • INSERT / UPDATE / DELETE
      • Non-equi joins
      • Set functions - Union, Except, Intersect
      • All sub-queries
      • Minor syntax differences resolved – rollup, case
      Goal: SQL 2011 Analytic Functions
      Legend (color coding not preserved in this transcript): Available Now; HDP Champlain; Stinger.next
  • 15. Questions?
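
Slide 7 says the cost based optimizer does join re-ordering using collected statistics. The idea can be sketched in Python as a toy model. This is not Calcite's or Hive's algorithm: the table names, row counts, distinct-value counts (NDV), the cost formula (sum of estimated intermediate result sizes) and the rule of rejecting cross products are all assumptions made for illustration.

```python
from itertools import permutations

# Hypothetical post-filter row counts for a small star schema (illustrative only).
ROWS = {"sales": 1_000_000, "dates": 73, "items": 10_000, "stores": 50}

# Hypothetical join predicates, each with the number of distinct join-key
# values (NDV). Assumed size estimate for a join: |L| * |R| / NDV.
JOIN_NDV = {
    frozenset({"sales", "dates"}): 365,
    frozenset({"sales", "items"}): 10_000,
    frozenset({"sales", "stores"}): 50,
}


def plan_cost(order):
    """Sum of estimated intermediate result sizes for a left-deep join order.

    Returns None when the order would require a cross product.
    """
    joined = {order[0]}
    rows = ROWS[order[0]]
    cost = 0
    for table in order[1:]:
        ndvs = [
            JOIN_NDV[frozenset({table, other})]
            for other in joined
            if frozenset({table, other}) in JOIN_NDV
        ]
        if not ndvs:
            return None  # no predicate links this table to the current result
        rows *= ROWS[table]
        for ndv in ndvs:
            rows //= ndv
        cost += rows
        joined.add(table)
    return cost


def best_order(tables):
    """Enumerate every left-deep order and keep the cheapest valid one."""
    candidates = []
    for order in permutations(tables):
        cost = plan_cost(order)
        if cost is not None:
            candidates.append((cost, order))
    return min(candidates)


written_order = ("sales", "items", "stores", "dates")
print(plan_cost(written_order))  # 2200000
print(best_order(ROWS)[0])       # 600000
```

In this toy model, joining the fact table to the most selective dimension first shrinks every later intermediate result. That is the kind of effect the Query 17 numbers on slide 8 illustrate at much larger scale.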
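
Slide 10 describes readers merging a base ORCFile with delta files, and a compactor that periodically folds the deltas into a new base. Below is a minimal Python sketch of that merge-on-read idea, assuming a toy in-memory model: rows are keyed by a hypothetical row id, and each delta is a list of (operation, row id, value) events. It does not reflect Hive's actual ORC or delta file layout.

```python
def read_merged(base, deltas):
    """Merge-on-read: apply delta events, oldest delta first, on top of the base."""
    rows = dict(base)  # copy so the base "file" is never modified in place
    for delta in deltas:
        for op, row_id, value in delta:
            if op == "delete":
                rows.pop(row_id, None)
            else:  # "insert" or "update"
                rows[row_id] = value
    return rows


def compact(base, deltas):
    """Compaction: fold all deltas into a new base, leaving no deltas behind."""
    return read_merged(base, deltas), []


# 1. Original file: readers see the base only.
base = {1: "alice", 2: "bob", 3: "carol"}

# 2. Edits made: each transaction writes a delta; readers merge base + deltas.
deltas = [
    [("update", 2, "robert")],
    [("delete", 3, None), ("insert", 4, "dave")],
]
snapshot = read_merged(base, deltas)
print(snapshot)  # {1: 'alice', 2: 'robert', 4: 'dave'}

# 3. Edits merged: after compaction, readers see the same rows from the new base.
new_base, new_deltas = compact(base, deltas)
print(read_merged(new_base, new_deltas) == snapshot)  # True
```

Because writers only append deltas and never rewrite the base, the base file stays read-optimized. The cost is that readers do extra merge work until the compactor runs.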

Editor's notes

  1. explain select i_item_id
       ,i_item_desc
       ,s_state
       ,count(ss_quantity) as store_sales_quantitycount
       ,avg(ss_quantity) as store_sales_quantityave
       ,stddev_samp(ss_quantity) as store_sales_quantitystdev
       ,stddev_samp(ss_quantity)/avg(ss_quantity) as store_sales_quantitycov
       ,count(sr_return_quantity) as_store_returns_quantitycount
       ,avg(sr_return_quantity) as_store_returns_quantityave
       ,stddev_samp(sr_return_quantity) as_store_returns_quantitystdev
       ,stddev_samp(sr_return_quantity)/avg(sr_return_quantity) as store_returns_quantitycov
       ,count(cs_quantity) as catalog_sales_quantitycount
       ,avg(cs_quantity) as catalog_sales_quantityave
       ,stddev_samp(cs_quantity)/avg(cs_quantity) as catalog_sales_quantitystdev
       ,stddev_samp(cs_quantity)/avg(cs_quantity) as catalog_sales_quantitycov
     from store_sales
       ,store_returns
       ,catalog_sales
       ,date_dim d1
       ,date_dim d2
       ,date_dim d3
       ,store
       ,item
     where d1.d_quarter_name = '2000Q1'
       and d1.d_date_sk = store_sales.ss_sold_date_sk
       and ss_sold_date between '2000-01-01' and '2000-03-31'
       and item.i_item_sk = store_sales.ss_item_sk
       and store.s_store_sk = store_sales.ss_store_sk
       and store_sales.ss_customer_sk = store_returns.sr_customer_sk
       and store_sales.ss_item_sk = store_returns.sr_item_sk
       and store_sales.ss_ticket_number = store_returns.sr_ticket_number
       and store_returns.sr_returned_date_sk = d2.d_date_sk
       and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3')
       and sr_returned_date between '2000-01-01' and '2000-09-01'
       and store_returns.sr_customer_sk = catalog_sales.cs_bill_customer_sk
       and store_returns.sr_item_sk = catalog_sales.cs_item_sk
       and catalog_sales.cs_sold_date_sk = d3.d_date_sk
       and d3.d_quarter_name in ('2000Q1','2000Q2','2000Q3')
       and cs_sold_date between '2000-01-01' and '2000-09-31'
     group by i_item_id
       ,i_item_desc
       ,s_state
     order by i_item_id
       ,i_item_desc
       ,s_state
     limit 100;
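
The Query 17 figures reported on slide 8 can be turned into improvement ratios with a few lines of Python. The numbers below are copied from that slide. The dictionary keys are names chosen here for illustration, and the ratios are simple arithmetic, not additional measurements.

```python
# Slide 8 figures for Query 17 at 30TB scale, with the CBO off and on.
cbo_off = {"elapsed_sec": 10_683, "intermediate_gb": 5_017, "records": 135_647_792_123}
cbo_on = {"elapsed_sec": 1_284, "intermediate_gb": 275, "records": 8_543_232_360}

# Ratio of the CBO-off value to the CBO-on value for each metric.
ratios = {metric: cbo_off[metric] / cbo_on[metric] for metric in cbo_off}

for metric, ratio in ratios.items():
    print(f"{metric}: {ratio:.1f}x")
# elapsed_sec: 8.3x
# intermediate_gb: 18.2x
# records: 15.9x

# Cross-check the rounded durations quoted on the slide.
print(f"{cbo_off['elapsed_sec'] / 3600:.2f} hours")  # 2.97 hours, quoted as ~3 hrs
print(f"{cbo_on['elapsed_sec'] / 60:.1f} minutes")   # 21.4 minutes, quoted as ~20 mins
```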