HiveServer2 for Apache Hive
- 2. Hive Background: What is it?
An ETL/Data Warehouse system for Hadoop:
• SQL->MR Compiler and Execution Engine
• SerDes: Pluggable Data Format Handlers
• MetaStore: Persistent Metadata Storage
©2012 Cloudera, Inc. All Rights Reserved.
- 3. Hive Evolution
• Original Vision:
– Let users express their queries in a high-level
language without having to write MR
programs
• Now more and more:
– A parallel SQL DBMS that happens to use
Hadoop for its storage and execution layer.
- 4. What do users expect from a DBMS?
• Sessions/Concurrency
– Persistent client state on the server-side
– Ability to run multiple clients concurrently
• ODBC/JDBC
– SQL IDEs, BI, ETL, …
• Authentication/Authorization
• Auditing/Logging
- 5. What’s Missing?
• Sessions/Concurrency
– Current Thrift API can’t support concurrency
• ODBC/JDBC
– Thrift API doesn’t support common ODBC/JDBC calls
• Authentication/Authorization
– Incomplete implementations
• Auditing/Logging
– Multiple plugin interfaces in need of consolidation
- 6. What’s Missing
Concurrency/Sessions
• Current Thrift API can’t support multiple
connections or client sessions.
• User/Global Configuration and Session
Info
• Query compiler memory leaks
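To make the "User/Global Configuration and Session Info" point concrete, here is a minimal sketch of per-session state layered over a shared global configuration. It is a toy in-memory model, not HiveServer2 code: the `SessionManager` class and its method names are hypothetical, and `hive.exec.parallel` is used only as an example key.

```python
import threading
import uuid


class SessionManager:
    """Toy session registry (hypothetical; not the HiveServer2 implementation)."""

    def __init__(self, global_conf):
        self._global_conf = dict(global_conf)  # defaults shared by every session
        self._sessions = {}                    # session id -> per-session state
        self._lock = threading.Lock()          # lets several client threads share the registry

    def open_session(self, user):
        session_id = uuid.uuid4().hex
        with self._lock:
            self._sessions[session_id] = {"user": user, "overrides": {}}
        return session_id

    def set_conf(self, session_id, key, value):
        # A setting changed by one client must not leak into other sessions.
        with self._lock:
            self._sessions[session_id]["overrides"][key] = value

    def get_conf(self, session_id, key):
        # Session overrides win; otherwise fall back to the global value.
        with self._lock:
            overrides = self._sessions[session_id]["overrides"]
            return overrides.get(key, self._global_conf.get(key))

    def close_session(self, session_id):
        with self._lock:
            self._sessions.pop(session_id, None)


mgr = SessionManager({"hive.exec.parallel": "false"})
alice = mgr.open_session("alice")
bob = mgr.open_session("bob")
mgr.set_conf(alice, "hive.exec.parallel", "true")
```

After the last line, `alice` sees her override while `bob` still sees the global default, which is the isolation a single shared connection cannot provide.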
- 7. What’s Missing
ODBC/JDBC
• Thrift API can’t support common ODBC/JDBC calls:
– SQLGetInfo
– SQLGetTypeInfo
– SQLCancel
– SQLGetFunctions
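As a rough illustration of what a server needs to offer before a driver can implement calls such as SQLGetInfo, SQLGetTypeInfo and SQLCancel, the sketch below models a server that hands out an operation handle per statement. Everything here is an assumption made for illustration: `ToyQueryServer`, its method names, and the info values are hypothetical and are not taken from the HiveServer2 Thrift API.

```python
import itertools


class ToyQueryServer:
    """Hypothetical server surface; names and values are illustrative only."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._operations = {}  # operation handle -> state string

    def get_info(self, key):
        # Rough counterpart of SQLGetInfo: static facts about the server.
        info = {"server_name": "toy-hive", "server_version": "0.0-toy"}
        return info[key]

    def get_type_info(self):
        # Rough counterpart of SQLGetTypeInfo: the data types the server understands.
        return ["INT", "BIGINT", "DOUBLE", "STRING", "BOOLEAN"]

    def execute(self, statement):
        # Returning a handle (instead of blocking until done) is what makes
        # a later cancel request addressable.
        handle = next(self._ids)
        self._operations[handle] = "RUNNING"
        return handle

    def cancel(self, handle):
        # Rough counterpart of SQLCancel: only a running operation can be canceled.
        if self._operations.get(handle) == "RUNNING":
            self._operations[handle] = "CANCELED"
            return True
        return False

    def status(self, handle):
        return self._operations[handle]


server = ToyQueryServer()
op = server.execute("SELECT 1")
server.cancel(op)
```

The design point is the handle: without a per-operation identifier, a driver has nothing to pass when the application asks to cancel a running statement.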
- 8. What’s Missing
Authentication/Authorization
• SASL Authentication for HiveServer
• Hive supports GRANT/ROLE based
authorization, but implementation is
incomplete.
• Code injection vectors: ADD JAR,
TRANSFORM, SET x, …
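The injection vectors listed above can be pictured with a small gatekeeper that flags statements using ADD JAR, TRANSFORM or SET before they run. This is only a string-level sketch under stated assumptions: `risky_features` and `is_allowed` are hypothetical helpers, and a real check would have to work on the parsed statement rather than on regular expressions.

```python
import re

# Patterns for the statement kinds named on the slide (illustrative, not exhaustive).
RISKY_PATTERNS = {
    "ADD": re.compile(r"^\s*ADD\s+(JAR|FILE|ARCHIVE)\b", re.IGNORECASE),
    "SET": re.compile(r"^\s*SET\s+\S", re.IGNORECASE),
    "TRANSFORM": re.compile(r"\bTRANSFORM\s*\(", re.IGNORECASE),
}


def risky_features(statement):
    """Return the sorted names of risky features that the statement appears to use."""
    return sorted(
        name for name, pattern in RISKY_PATTERNS.items() if pattern.search(statement)
    )


def is_allowed(statement, granted):
    """Allow the statement only if every risky feature it uses has been granted."""
    return set(risky_features(statement)) <= set(granted)


print(risky_features("ADD JAR /tmp/udf.jar"))
print(is_allowed("SELECT TRANSFORM(a) USING 'cat' FROM t", {"SET"}))
```

A plain `SELECT a FROM t` uses none of the flagged features, so it passes with an empty grant set.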
- 9. Project Milestones
• HiveServer2 Thrift API Spec
• JDBC/ODBC HiveServer2 Drivers
• Concurrent Thrift clients
– Fix query compiler memory leaks
– User/Global session/configuration information
• Authentication (Kerberos)
• Authorization
– Extend to configuration, ADD x,
TRANSFORM, …
- 10. Who’s working on it?
• Carl Steinbach
– carl@cloudera
• Prasad Mujumdar
– prasadm@cloudera
- 11. Resources
• HIVE-2935: Implement HiveServer2
• HiveServer API Proposal:
– https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API
- 12. Questions?
• Questions?