© 2017 Bloomberg Finance L.P. All rights reserved.
HBaseCon West 2017
June 12, 2017
Anirudha Jadhav
ajadhav2@bloomberg.net
Biju Nair
bnair10@bloomberg.net
Cursors in Apache Phoenix
Leading data and analytics provider for the financial industry
Bloomberg
Bloomberg is a data company
Reality of working with data
• The data model changes over time
• Users querying the data model don’t necessarily change
• Alternate query patterns for the same dataset
• Data infrastructure usage needs to be simple
Apache Phoenix
• Recipes of best practices for using HBase over a familiar SQL’ish grammar
• It is so much more than SQL
o User defined functions for push-down
o Secondary indices
o Statistics collections, optimizations based on heuristics
o ORM libraries
o JDBC, ODBC support with Query servers
o Integrations: Spark, Kafka, MR and others
Extending Apache Phoenix
• A very active and helpful community
• Our ongoing work
o Apache Calcite
o Distributed tests and nightly performance build
o Multi-DC replication
o Deep paging with cursor implementation
HBase
[Architecture diagram: an Application with an HBase Client; a ZooKeeper Quorum; the HBase Master; three RegionServers, each on top of an HDFS DataNode]
Phoenix
[Architecture diagram: an Application with a Phoenix Client and an HBase Client; a ZooKeeper Quorum; the HBase Master; three RegionServers, each on top of an HDFS DataNode, with Phoenix RPC endpoints and Phoenix Coprocessors; the SYSTEM.CATALOG and SYSTEM.STATS tables]
Sources:
https://www.slideshare.net/enissoz/apache-phoenix-past-present-and-future-of-sql-over-hbase
http://phoenix.apache.org/presentations/OC-HUG-2014-10-4x3.pdf
Phoenix Client
[Diagram: Phoenix Client layers: Authentication; SQL Parsing; Query rewrite/Optimization; Query Plan Generation; Transaction Management; Connection Management; HBase Client. Annotations on the layers: ANTLR4, Hints/Rules, Rules, Tephra, HBase Client]
Phoenix query execution
Connection con = DriverManager.getConnection(
    "jdbc:phoenix:zkquorum:2181:/hbase:principal:keytabfile");
…
PreparedStatement statement = con.prepareStatement("select * from TBL");
…
ResultSet rset = statement.executeQuery();
…
while (rset.next())   // next() returns a boolean, so no comparison with null
…
rset.close();
…
Phoenix query execution
[Flow diagram: the JDBC calls getConnection, prepareStatement, executeQuery, and close() drive these steps: Connect to HBase; Parse SQL Statement; Read/Cache Metadata; Validate SQL statement; Create query plan; Optimize query plan; Create Result Iterator; Create Phoenix Result Set; Close ResultSet]
Phoenix Server
[Diagram: the Application (Phoenix Client over HBase Client) sends metadata, read, write, and index requests to the RegionServers. Server-side components shown: MetaDataEndPointImpl and MetaDataRegionObserver with SYSTEM.CATALOG; UngroupedAggregateRO, GroupedAggregateRO, ScanRegionObserver, Indexer, and ServerCachingEndpointImpl with USER_TABLE]
Cursors
• To support row pagination
o Should support forward and backward traversal
• Support required for select queries only
• Data needs to be consistent during traversal
Cursors
• DECLARE tCursor CURSOR FOR SELECT * FROM TBL
• OPEN tCursor
• FETCH NEXT 10 ROWS FROM tCursor
• FETCH PRIOR 5 ROWS FROM tCursor
• CLOSE tCursor
Implementation options
• PHOENIX-2606
• Use row value constructors
o Requires query rewriting and is complex
• Wrapper over the existing query ResultSets
o Can leverage ResultSets, so it is relatively simple
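To make the row value constructor option concrete, here is a minimal sketch of the kind of query rewrite it implies: each page is fetched with a query that resumes after the last primary-key values already seen. The class RvcPaging, its method, and the column names PK1 and PK2 are hypothetical and are not part of Phoenix; only the row value constructor comparison itself is Phoenix SQL.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Phoenix: builds a "next page" query that
// resumes after the last primary-key values the client has already seen.
final class RvcPaging {

    static String nextPageQuery(String table, List<String> pkColumns, int pageSize) {
        if (pkColumns.isEmpty() || pageSize <= 0) {
            throw new IllegalArgumentException(
                "need at least one PK column and a positive page size");
        }
        String cols = String.join(", ", pkColumns);
        // One bind marker per primary-key column; the caller binds the last seen key.
        String binds = pkColumns.stream().map(c -> "?").collect(Collectors.joining(", "));
        return "SELECT * FROM " + table
             + " WHERE (" + cols + ") > (" + binds + ")"
             + " ORDER BY " + cols
             + " LIMIT " + pageSize;
    }

    public static void main(String[] args) {
        // PK1 and PK2 are assumed column names used only for illustration.
        System.out.println(nextPageQuery("TBL", Arrays.asList("PK1", "PK2"), 10));
    }
}
```

Every page needs a rewritten statement plus client-side bookkeeping of the last key, which is the complexity the slide refers to.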
Cursor Lifecycle
PreparedStatement statement = con.prepareStatement(
    "DECLARE tCursor CURSOR FOR SELECT * FROM TBL");
statement.execute();
…
statement = con.prepareStatement("OPEN tCursor");
statement.execute();
statement = con.prepareStatement("FETCH NEXT FROM tCursor");
ResultSet rset = statement.executeQuery();   // execute() returns a boolean, not a ResultSet
while (rset.next())
…
statement = con.prepareStatement("CLOSE tCursor");
statement.execute();
Cursor lifecycle
[Flow diagram: DECLARE CURSOR: Parse SQL Statement, Create/Optimize QueryPlan, Create CursorWrapper. OPEN CURSOR: Set Cursor Status to Open. FETCH: Execute CursorFetchPlan, Create CursorResultIterator, Create Phoenix ResultSet. CLOSE: Close Cursor]
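The lifecycle above can be pictured as a small state machine wrapped around a set of rows. The sketch below is an in-memory stand-in, not Phoenix's CursorWrapper: the class CursorSketch and its methods are hypothetical, a List stands in for the rows the query plan would produce, and fetchPrior is simplified to step back over rows already visited and return them in table order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory stand-in for a cursor wrapper; names are illustrative only.
final class CursorSketch<T> {
    private final List<T> rows;   // stands in for the rows the query plan would produce
    private boolean open = false;
    private int position = 0;     // index of the next row a forward fetch returns

    // DECLARE: capture the query (here, its rows) without opening the cursor.
    CursorSketch(List<T> rows) {
        this.rows = new ArrayList<>(rows);
    }

    // OPEN: mark the cursor open and start before the first row.
    void open() {
        open = true;
        position = 0;
    }

    // FETCH NEXT n ROWS: return up to n rows and advance.
    List<T> fetchNext(int n) {
        requireOpen(n);
        int end = Math.min(position + n, rows.size());
        List<T> page = new ArrayList<>(rows.subList(position, end));
        position = end;
        return page;
    }

    // FETCH PRIOR n ROWS (simplified): step back up to n rows and return them.
    List<T> fetchPrior(int n) {
        requireOpen(n);
        int start = Math.max(position - n, 0);
        List<T> page = new ArrayList<>(rows.subList(start, position));
        position = start;
        return page;
    }

    // CLOSE: further fetches are rejected.
    void close() {
        open = false;
    }

    private void requireOpen(int n) {
        if (!open) throw new IllegalStateException("cursor is not open");
        if (n < 0) throw new IllegalArgumentException("row count must not be negative");
    }

    public static void main(String[] args) {
        CursorSketch<Integer> cursor = new CursorSketch<>(Arrays.asList(1, 2, 3, 4, 5));
        cursor.open();
        System.out.println(cursor.fetchNext(3));
        System.out.println(cursor.fetchPrior(2));
        cursor.close();
    }
}
```

Keeping the open flag and the position inside the wrapper is what lets each FETCH statement stay stateless on the client side.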
Cursor Challenges
• Data Consistency
o Query start timestamp provides snapshot consistency
• Optimization
o Use the Scan object for non-aggregate queries
• Cache sizing
o Dynamic sizing
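As a rough illustration of timestamp-based snapshot reads, the sketch below builds connection properties that pin reads to a fixed timestamp through Phoenix's CurrentSCN connection property. The slides do not say whether the cursor implementation uses this property internally, so treat this as an assumption about how a client could get similar behavior; the class SnapshotProps and its method are hypothetical.

```java
import java.util.Properties;

// Hypothetical helper: builds connection properties that pin reads to one timestamp.
final class SnapshotProps {

    // "CurrentSCN" is the Phoenix connection property name for the read timestamp;
    // using it for cursor traversal is an assumption, not something the slides state.
    static Properties pinnedAt(long timestampMillis) {
        Properties props = new Properties();
        props.setProperty("CurrentSCN", Long.toString(timestampMillis));
        return props;
    }

    public static void main(String[] args) {
        Properties props = pinnedAt(System.currentTimeMillis());
        // A client would pass these to DriverManager.getConnection(url, props).
        System.out.println(props);
    }
}
```

Every page fetched over a connection opened this way would see data as of the same timestamp, which matches the consistency requirement listed for traversal.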
Contributors
• Gabriel Jimenez (MIT)
• Anirudha Jadhav (Bloomberg)
• Biju Nair (Bloomberg)
• Ankit Singhal (Hortonworks)
Thank You
Hardening Apache Phoenix @PhoenixCon tomorrow 2 PM PT
Q&A
