2. Current State of Authorization in Hive
• Advisory Authorization
- Facilitates self-regulation to safeguard against accidental changes
- Users can grant themselves privileges as necessary
- Problem: Insufficient to guard against malicious users
• Impersonation
- Data is protected at the file level by HDFS permissions
- Problem: File-level access is not granular enough
- Problem: Not role-based
3. Authorization Requirements
• Secure Authorization
Ability to control access to data and/or privileges on data for authenticated users
• Fine-Grained Authorization
Ability to give users access to a subset of data in files
• Role-Based Authorization
Ability to create/apply templatized privileges based on functional roles
• Multi-Tenant Administration
Ability for central admin group to empower lower-level admins to manage security for each
database/schema
4. Introducing Sentry
Authorization module for Hadoop ecosystem
• Unlocks Key RBAC Requirements
ᵒ Secure, fine-grained, role-based authorization
ᵒ Multi-tenant administration
ᵒ Open Source via Apache Incubator
ᵒ Modular RBAC Framework
ᵒ Multiple users in production for months
6. Sentry: Fine-Grained Authorization
• Ability to specify privileges on
ᵒ SERVER, DATABASE, TABLE, VIEW, URI
• Privilege Granularity
ᵒ SELECT
ᵒ INSERT
ᵒ ALL
• Multi-Tenant Administration
ᵒ Administration per database
7. Granting Privileges
• Example: Grant SELECT on table CUSTOMER in database SALES:
server=server1->db=sales->table=customer->action=SELECT
• Objects are represented by a containment hierarchy
• Privilege is granted for the leaf object and its contents
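The containment rule above can be illustrated with a minimal Python sketch. This is not Sentry's implementation. The function names `parse_privilege` and `implies` are hypothetical, and the sketch assumes that names compare case-insensitively and that a grant with no explicit action behaves like ALL.

```python
def parse_privilege(text):
    """Split 'server=server1->db=sales->...' into a list of (key, value) pairs."""
    pairs = []
    for part in text.strip().split("->"):
        key, _, value = part.partition("=")
        pairs.append((key.strip().lower(), value.strip().lower()))
    return pairs


def implies(granted, requested):
    """Return True if the granted privilege covers the requested one.

    Assumption (not from the slides): a grant without an action means ALL.
    """
    g = parse_privilege(granted)
    r = parse_privilege(requested)

    # Separate the trailing action, if present, from the object path.
    g_action = "all"
    if g and g[-1][0] == "action":
        g_action = g.pop()[1]
    r_action = None
    if r and r[-1][0] == "action":
        r_action = r.pop()[1]

    # Containment: the granted object path must be a prefix of the requested path.
    if len(g) > len(r) or r[:len(g)] != g:
        return False

    # ALL covers every action; otherwise the actions must match exactly.
    return g_action == "all" or g_action == r_action
```

Under these assumptions, a grant on `server=server1->db=sales` covers a SELECT request on `table=customer` inside that database, while a table-level SELECT grant does not cover INSERT on the same table.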
8. Specifying Roles
• A role is a collection of privileges
• Example: A role Seller that allows SELECT on table CUSTOMER and INSERT on
table ITEMS
seller_role = server=server1->db=sales->table=customer->action=Select,
    server=server1->db=sales->table=items->action=Insert
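As a rough illustration of how a role definition like the one above breaks down into individual privileges, here is a small Python sketch. The helper `parse_role_line` is hypothetical and assumes the whole definition has already been joined onto one line; it is not how Sentry itself reads its policy.

```python
def parse_role_line(line):
    """Split 'role = privilege, privilege, ...' into (role_name, [privileges]).

    Only the first '=' separates the role name from its privileges;
    later '=' characters belong to the privilege strings themselves.
    """
    name, _, body = line.partition("=")
    privileges = [p.strip() for p in body.split(",") if p.strip()]
    return name.strip(), privileges


role, privileges = parse_role_line(
    "seller_role = server=server1->db=sales->table=customer->action=Select, "
    "server=server1->db=sales->table=items->action=Insert"
)
for privilege in privileges:
    print(role, "grants", privilege)
```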
9. Users and Groups
• Works with existing Authentication Mechanisms
• Groups connect the authentication system to the authorization system.
ᵒ A set of roles can be assigned to a group
analyst = sales_reporting, data_export, audit_report
• User to Group Mapping:
ᵒ Using Hadoop groups
ᵒ Or Specify Locally in sentry-site.xml file
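The user, group, and role chain described on this slide can be sketched in Python as two lookups. The group-to-role mapping reuses the `analyst` example from the slide; the user name `alice`, the dictionaries, and the function `roles_for_user` are hypothetical stand-ins for whatever the group mapping (Hadoop groups or local configuration) actually returns.

```python
# Group -> roles, mirroring the slide's example line.
GROUP_ROLES = {
    "analyst": ["sales_reporting", "data_export", "audit_report"],
}

# User -> groups. Hypothetical data: in practice this comes from
# Hadoop groups or a local mapping, not from a hard-coded dictionary.
USER_GROUPS = {
    "alice": ["analyst"],
}


def roles_for_user(user, user_groups, group_roles):
    """Collect the roles a user holds through group membership, in order, without duplicates."""
    roles = []
    for group in user_groups.get(user, []):
        for role in group_roles.get(group, []):
            if role not in roles:
                roles.append(role)
    return roles


print(roles_for_user("alice", USER_GROUPS, GROUP_ROLES))
```

Users never receive roles directly in this sketch; a user with no group membership resolves to an empty role list.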
10. User Feedback
I have implemented Hiveserver2 Authentication (openLDAP) and Authorization (using
Cloudera Sentry). I am super-excited because we know can open our Hive Data
Platform in "read only" mode to remote clients in the company and SAS clients.
Source:
• Apache user@hive.apache.org
• Tue, 17 Sep 2013 19:10:43 GMT
• http://s.apache.org/hive-sentry-user
12. Hive Requirements
• Sentry plugs into existing hooks such as the Semantic Analyzer hook interface
• Changes required are minor, estimated at ~600 LOC including unit tests
13. Hive Requirements
Follow Hive integration via SENTRY-67
• HIVE-4670 - Authentication module should pass the instance part of the
Kerberos principal
• HIVE-4390 - Enable capturing input URI entities for DML statements
• HIVE-4741 - Add Hive config API to modify the restrict list
• HIVE-4641 - Support post execution/fetch hook for HiveServer2