Cursor Implementation in Apache Phoenix
- 1. © 2017 Bloomberg Finance L.P. All rights reserved.
HBaseCon West 2017
June 12, 2017
Anirudha Jadhav
ajadhav2@bloomberg.net
Biju Nair
bnair10@bloomberg.net
Cursors in Apache Phoenix
- 2.
Leading data and analytics provider for the financial industry
Bloomberg
- 4.
Reality of working with data
• The data model changes over time
• Users querying the data model don’t necessarily change
• Alternate query patterns for the same dataset
• Data infrastructure usage needs to be simple
- 5.
Apache Phoenix
• Recipes of best practices for using HBase over a familiar SQL’ish grammar
• It is so much more than SQL
o User defined functions for push-down
o Secondary indices
o Statistics collections, optimizations based on heuristics
o ORM libraries
o JDBC, ODBC support with Query servers
o Integrations: Spark, Kafka, MR and others
- 6.
Extending Apache Phoenix
• A very active and helpful community
• Our ongoing work
o Apache Calcite
o Distributed tests and nightly performance build
o Multi-DC replication
o Deep paging with cursor implementation
- 7.
HBase
[Architecture diagram. Components shown: Application, HBase Client, ZooKeeper Quorum, HBase Master, three RegionServers, three HDFS DataNodes.]
- 8.
Phoenix
https://www.slideshare.net/enissoz/apache-phoenix-past-present-and-future-of-sql-over-hbase
http://phoenix.apache.org/presentations/OC-HUG-2014-10-4x3.pdf
[Architecture diagram. HBase components shown: Application, HBase Client, ZooKeeper Quorum, HBase Master, three RegionServers, three HDFS DataNodes. Phoenix components shown: Phoenix Client, Phoenix RPC endpoints, Phoenix Coprocessors, and the SYSTEM.CATALOG and SYSTEM.STATS tables.]
- 9.
Phoenix Client
[Diagram of the Phoenix Client stack. Layers shown, with the annotation beside each in parentheses:]
• Connection Management
• Authentication
• SQL Parsing (ANTLR4)
• Query rewrite/Optimization (Hints/Rules)
• Query Plan Generation (Rules)
• Transaction Management (Tephra)
• HBase Client
- 10.
Phoenix query execution
Connection con =
    DriverManager.getConnection("jdbc:phoenix:zkquorum:2181:/hbase:principal:keytabfile");
…
PreparedStatement statement = con.prepareStatement("select * from TBL");
…
ResultSet rset = statement.executeQuery();
…
while (rset.next())  // next() returns a boolean: true while a row is available
…
rset.close();
…
- 11.
Phoenix query execution
[Flow diagram. Steps shown: Connect to HBase, Parse SQL Statement, Read/Cache Metadata, Validate SQL statement, Create query plan, Optimize query plan, Create Result Iterator, Create Phoenix Result Set, Close ResultSet.]
[JDBC calls labelling the flow: getConnection, prepareStatement, executeQuery, close().]
- 12.
Phoenix Server
[Diagram of Phoenix server-side components. Labels shown:]
• Client side: Application, Phoenix Client, HBase Client
• Request types: Meta Data Request, Read Request, Index Write Request
• RegionServer with SYSTEM.CATALOG: MetaDataEndPointImpl, MetaDataRegionObserver
• RegionServers with USER_TABLE: UngroupedAggregateRO, GroupedAggregateRO, ScanRegionObserver, Indexer, ServerCachingEndpointImpl
- 13.
Cursors
• To support row pagination
o Should support forward and backward traversal
• Support required for select queries only
• Data needs to be consistent during traversal
- 14.
Cursors
• DECLARE tCursor CURSOR FOR SELECT * FROM TBL
• OPEN tCursor
• FETCH NEXT 10 ROWS FROM tCursor
• FETCH PRIOR 5 ROWS FROM tCursor
• CLOSE tCursor
- 15.
Implementation options
• PHOENIX-2606
• Use row value constructors
o Requires query rewriting and is complex
• Wrapper over the available query ResultSets
o Can reuse the existing ResultSets, so relatively simple
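To make the row value constructor option concrete, here is a minimal sketch of the kind of query rewrite it implies: each page is fetched by comparing the primary key columns against the last row of the previous page. The helper class `RvcPaging`, its method, and the key column names `K1` and `K2` are hypothetical and only for illustration; they are not part of Phoenix.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper (not a Phoenix class): builds the SQL for one page
// of results using a row value constructor over the key columns.
final class RvcPaging {

    // Produces: SELECT * FROM <table> WHERE (k1, k2) > (?, ?) ORDER BY k1, k2 LIMIT <pageSize>
    // The bind parameters are meant to carry the key of the last row already returned.
    static String nextPageQuery(String table, List<String> keyColumns, int pageSize) {
        if (keyColumns.isEmpty()) {
            throw new IllegalArgumentException("at least one key column is required");
        }
        String cols = String.join(", ", keyColumns);
        String binds = keyColumns.stream().map(c -> "?").collect(Collectors.joining(", "));
        return "SELECT * FROM " + table
                + " WHERE (" + cols + ") > (" + binds + ")"
                + " ORDER BY " + cols
                + " LIMIT " + pageSize;
    }

    public static void main(String[] args) {
        // K1 and K2 are assumed key columns of TBL.
        System.out.println(nextPageQuery("TBL", Arrays.asList("K1", "K2"), 10));
    }
}
```

Every user query would have to be rewritten this way, and backward traversal needs a second, mirrored form of the query, which is one way to read the "complex" label on this option.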
- 16.
Cursor Lifecycle
PreparedStatement statement = con.prepareStatement(
    "DECLARE tCursor CURSOR FOR SELECT * FROM TBL");
statement.execute();
…
statement = con.prepareStatement("OPEN tCursor");
statement.execute();  // the cursor must be opened before it can be fetched from
statement = con.prepareStatement("FETCH NEXT FROM tCursor");
ResultSet rset = statement.executeQuery();
while (rset.next())
…
statement = con.prepareStatement("CLOSE tCursor");
statement.execute();
- 17.
Cursor lifecycle
[Flow diagram mapping cursor statements to steps:]
• DECLARE CURSOR
o Parse SQL Statement
o Create/Optimize QueryPlan
o Create CursorWrapper
• OPEN CURSOR
o Set Cursor Status to Open
• FETCH
o Execute CursorFetchPlan
o Create CursorResultIterator
o Create Phoenix ResultSet
• CLOSE
o Close Cursor
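The lifecycle can be pictured as a small state machine wrapped around an ordinary result iterator. The sketch below is an in-memory illustration only: `CursorSketch`, its states, and its methods are hypothetical names, and it makes no claim about how Phoenix's CursorWrapper or CursorResultIterator are actually written. It assumes that rows already fetched are cached so that FETCH PRIOR can step backwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of a cursor wrapped around a forward-only iterator.
final class CursorSketch<T> {
    enum State { DECLARED, OPEN, CLOSED }

    private final Iterator<T> source;                 // stands in for the query's result iterator
    private final List<T> cache = new ArrayList<>();  // rows pulled from the source so far
    private int current = -1;                         // index of the last row returned; -1 = before first
    private State state = State.DECLARED;

    CursorSketch(Iterator<T> source) { this.source = source; }

    // OPEN: only a declared cursor can be opened.
    void open() {
        if (state != State.DECLARED) throw new IllegalStateException("cursor is " + state);
        state = State.OPEN;
    }

    // FETCH NEXT n ROWS: serve from the cache when possible, otherwise pull from the source.
    List<T> fetchNext(int n) {
        requireOpen();
        List<T> out = new ArrayList<>();
        while (out.size() < n) {
            if (current + 1 == cache.size()) {
                if (!source.hasNext()) break;         // source exhausted
                cache.add(source.next());
            }
            out.add(cache.get(++current));
        }
        return out;
    }

    // FETCH PRIOR n ROWS: step backwards over cached rows, nearest row first.
    List<T> fetchPrior(int n) {
        requireOpen();
        List<T> out = new ArrayList<>();
        while (out.size() < n && current > 0) {
            out.add(cache.get(--current));
        }
        return out;
    }

    // CLOSE: release cached rows and reject further fetches.
    void close() {
        state = State.CLOSED;
        cache.clear();
    }

    private void requireOpen() {
        if (state != State.OPEN) throw new IllegalStateException("cursor is " + state);
    }

    public static void main(String[] args) {
        CursorSketch<Integer> cursor = new CursorSketch<>(Arrays.asList(1, 2, 3, 4, 5).iterator());
        cursor.open();
        System.out.println(cursor.fetchNext(3));   // [1, 2, 3]
        System.out.println(cursor.fetchPrior(2));  // [2, 1]
        System.out.println(cursor.fetchNext(1));   // [2]
        cursor.close();
    }
}
```

The cache in this sketch grows without bound, which is why cache sizing matters for any wrapper-style cursor.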
- 18.
Cursor Challenges
• Data Consistency
o Query start timestamp provides snapshot consistency
• Optimization
o Use a Scan object for non-aggregate queries
• Cache sizing
o Dynamic sizing
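The snapshot consistency point can be illustrated with a toy versioned store: every write carries a timestamp, and a reader that pins its start timestamp ignores anything written at or after it. This is a sketch of the general idea under stated assumptions, not of Phoenix or HBase internals; `SnapshotSketch` and its methods are hypothetical names, and the exclusive upper bound is an assumption chosen to mirror how a time-range read is usually bounded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory store used only to illustrate timestamp-based snapshots.
final class SnapshotSketch {

    // One versioned write: key, value, and the timestamp at which it was written.
    private static final class Cell {
        final String key;
        final String value;
        final long ts;
        Cell(String key, String value, long ts) { this.key = key; this.value = value; this.ts = ts; }
    }

    private final List<Cell> cells = new ArrayList<>();

    void put(String key, String value, long ts) {
        cells.add(new Cell(key, value, ts));
    }

    // Returns "key=value" for the newest version of each key written strictly
    // before snapshotTs, in key order. Later writes are invisible to this read.
    List<String> scanAsOf(long snapshotTs) {
        Map<String, Cell> latest = new TreeMap<>();
        for (Cell c : cells) {
            if (c.ts >= snapshotTs) continue;          // written after the snapshot was taken
            Cell seen = latest.get(c.key);
            if (seen == null || seen.ts < c.ts) latest.put(c.key, c);
        }
        List<String> out = new ArrayList<>();
        for (Cell c : latest.values()) out.add(c.key + "=" + c.value);
        return out;
    }

    public static void main(String[] args) {
        SnapshotSketch store = new SnapshotSketch();
        store.put("a", "1", 10);
        store.put("b", "2", 20);
        long queryStart = 25;                          // pinned when the cursor's query starts
        store.put("a", "9", 30);                       // concurrent update during traversal
        store.put("c", "3", 40);                       // concurrent insert during traversal
        System.out.println(store.scanAsOf(queryStart)); // [a=1, b=2]
    }
}
```

Every fetch issued with the same pinned timestamp sees the same rows, so paging forward and backward stays consistent even while the table is being written to.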
- 20.
Thank You
Hardening Apache Phoenix @PhoenixCon tomorrow 2 PM PT
Q&A