SlideShare une entreprise Scribd logo
1  sur  29
Télécharger pour lire hors ligne
1
Apache Sentry: Enterprise-grade
Security for Hadoop
Xuefu Zhang, Srayva Tirukkovalur | Cloudera
June 25, 2014
Outline
• Introduction
• Hadoop security primer
• Authentication
• Authorization
• Data Protection
• Governance and Auditing
• Introducing Apache Sentry
• What's Sentry
• Sentry Architecture
• Sentry Internal
• Recent Development
• Q&A
2
Introduction
●
Hadoop gets bigger ...
●
Hadoop has been enjoying an increasing adoption rate
●
More and more data on Hadoop Cluster
●
More and more access to the data
●
Data warehouse offload is the most common use case
●
Apache Hive, Apache Drill, Cloudera Impala
●
SQL on Hadoop is phenomenon
3
Introduction (cont'd)
●
But more encumbrance ...
●
Enterprises wants to protect sensitive data
●
Government regulations, compliance, like HIPPA, PII,
FISMA
●
Existing security problems with Hadoop has hindered
the adoption
●
Security has become the top priority
4
Introduction (cont'd)
●
Reality is ...
●
Different components, different security mechanisms
●
Multiple components may access the same data set
●
Hadoop was born out of trust, not security
●
Thinking of Windows
5
Outline
• Introduction
• Hadoop security primer
• Authentication
• Authorization
• Data Protection
• Governance and Auditing
• Introducing Apache Sentry
• What's Sentry
• Sentry Architecture
• Sentry Internal
• Recent Development
• Q&A
6
Hadoop Security Primer
• Authentication
●
Identify who you are
●
Untrusted users has no access to the cluster network
●
In a trusted network, every one is good citizen
●
Who you are is determined by client host
7
Hadoop Security Primer
• Strong Authentication
●
Kerberos
●
LDAP, ActiveDirectory
●
LDAP, AD integrated with Kerberos, establishing a
single point of truth
●
Single Sign On
8
Hadoop Security Primer (cont'd)
• Kerberos
●
Strong authentication
●
Provides mutual authentication
●
Protects against eavesdropping and replay attacks
●
Every user and service has a Kerberos “principal”
●
Credentials: keytabs (service), password (user)
9
Hadoop Security Primer (cont'd)
• Authorization
●
Determine if you can access
●
HDFS Posix style permission R/W/X for U/G/O, coarse-
grained
●
Other components have authorization
●
MR job queue
●
HBase ACLs on table and column family.
●
Accumulo provides cell-level access control
●
Impersonation
10
Hadoop Security Primer (cont'd)
• Data Protection
●
Data at rest and in transit
●
Hadoop provides encryption on data in transit: DTP,
HTTP, RPC, JDBC/ODBC
●
Hadoop has no native encryption on data at rest (HDFS-
6134)
●
Relying on OS-level encryption
11
Hadoop Security Primer (cont'd)
• Governance and auditing
●
Again, component to component
●
DFS and MapReduce provide base audit support
●
Apache Hive metastore records audit (who/when)
information for Hive interactions.
●
Apache Oozie provides audit trail for services
12
Outline
• Introduction
• Hadoop security primer
• Authentication
• Authorization
• Data Protection
• Governance and Auditing
• Introducing Apache Sentry
• What's Sentry
• Sentry Architecture
• Sentry Internal
• Recent Development
• Q&A
13
Introducing Apache Sentry
14
●
Hadoop Authorization
●
Existing authorization is fragmented, coarse-grained,
and manual
●
A lot of times data is just unprotected for simplicity
●
Enterprises need a centralized authorization component
that work across components with ease of use, fine-
grained, role based
Introducing Apache Sentry (cont'd)
15
●
What's Sentry
●
Sentry is an authorization module for Hive, Search,
Impala, and beyond
●
It unlocks Key RBAC Requirements: secure, fine-
grained, role-based authorization, multi-tenant
administration
●
Open Source, Apache Incubator project
●
Ecosystem Support: Apache SOLR, HiveServer2, &
Impala 1.1+
Introducing Apache Sentry (cont'd)
16
●
Key Benefits
●
Store Sensitive Data in Hadoop
●
Extend Hadoop to More Users
●
Comply with Regulations
Introducing Apache Sentry (cont'd)
17
●
Key Capabilities
●
Fine-Grained: SERVERS, DATABASES, TABLES &
VIEWS; INDEXES, COLLECTIONS
●
Role-Based: role including privileges such as SELECT,
INSERT, ALL; UPDATE, QUERY
●
Multi-Tenant administration
●
Separate policies for each database/schema
●
Can be maintained by separate admins
Introducing Apache Sentry (cont'd)
18
Binding
Layer
Impala
Impala Hive
Policy Engine
Policy Provider
File Database
HiveServer
2
Authorization
Provider
Local
FS/HDFS
Search
SOLR
Pig …
Sentry Architecture
Introducing Apache Sentry (cont'd)
19
QueryMR
SQL
Parse
Build
Check
Plan
Sentry
Validate SQL grammar
Construct statement tree
Validate statement objects
•
First check: Authorization
Forward to execution planner
Introducing Apache Sentry (cont'd)
• Actors
●
User
●
User group membership
●
Resources
●
Privilege
●
Role
20
Introducing Apache Sentry (cont'd)
• User
●
User authenticated
●
User identity obtained from session context
21
Introducing Apache Sentry (cont'd)
• User group membership
●
Defined outside sentry policy
●
Obtained from user directory (LDAP, AD, HDFS)
●
Maybe available from session context
22
Introducing Apache Sentry (cont'd)
• Resources
●
Data to be protected
●
File or directory on HDFS
●
Table or views in Hive
●
URI
●
Resource can be hierarchical
23
Introducing Apache Sentry (cont'd)
• Privilege
●
Action or operation associated with a resource
●
Exists in a role only
●
SELECT on a given TABLE or VIEW
●
CREATE a TABLE or VIEW
●
QUERY on a search COLLECTION
●
DELETE a FILE or DIRECTORY
●
Example
collection=customerCol->action=query
24
Introducing Apache Sentry (cont'd)
• Roles
●
A collection of privileges
●
Defined in Sentry policy
●
Example
[roles]
ana_query_role = collection=sentryColl->action=query
ana_update_role = collection=sentryColl->action=update
test_role = collection=testColl->action=update
full_admin_role = collection=*
25
Introducing Apache Sentry (cont'd)
• (Group, Role) mapping
●
Defined in policy
●
One-to-Many
●
Example
[groups]
analyts = ana_query_role, ana_update_role
admins = full_admin_role
testgroup = test_role
hbase = full_admin_role
26
Introducing Apache Sentry (cont'd)
• Rule evaluation
●
Who's the user?
●
Which group(s) does the user belong to?
●
What resource to be accessed?
●
How the resource is accessed (READ, SELECT, etc.)?
●
Does any of the user's groups have a role, which has
the right privilege?
●
Yes – great! Go head!
●
No – sorry! No sufficient privilege!
27
Outline
• Introduction
• Hadoop security primer
• Authentication
• Authorization
• Data Protection
• Governance and Auditing
• Introducing Apache Sentry
• What's Sentry
• Sentry Architecture
• Sentry Internal
• Recent Development
• Q&A
28
Click to edit Master title style
29

Contenu connexe

Tendances

Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Ryan Blue
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake OverviewJames Serra
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architectureAdam Doyle
 
Building a Virtual Data Lake with Apache Arrow
Building a Virtual Data Lake with Apache ArrowBuilding a Virtual Data Lake with Apache Arrow
Building a Virtual Data Lake with Apache ArrowDremio Corporation
 
Amazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best PracticesAmazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best PracticesAmazon Web Services
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudNoritaka Sekiyama
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsAlluxio, Inc.
 
Some Iceberg Basics for Beginners (CDP).pdf
Some Iceberg Basics for Beginners (CDP).pdfSome Iceberg Basics for Beginners (CDP).pdf
Some Iceberg Basics for Beginners (CDP).pdfMichael Kogan
 
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeSimplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeDatabricks
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Cloudera, Inc.
 
Apache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAApache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAAdam Doyle
 
Spark with Delta Lake
Spark with Delta LakeSpark with Delta Lake
Spark with Delta LakeKnoldus Inc.
 
Centralized logging for (java) applications with the elastic stack made easy
Centralized logging for (java) applications with the elastic stack   made easyCentralized logging for (java) applications with the elastic stack   made easy
Centralized logging for (java) applications with the elastic stack made easyfelixbarny
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversScyllaDB
 
Apache Hadoop Security - Ranger
Apache Hadoop Security - RangerApache Hadoop Security - Ranger
Apache Hadoop Security - RangerIsheeta Sanghi
 
High-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLHigh-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLScyllaDB
 
Iceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsIceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsAlluxio, Inc.
 
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin AmbardDelta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin AmbardParis Data Engineers !
 

Tendances (20)

Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)
 
Data Lake Overview
Data Lake OverviewData Lake Overview
Data Lake Overview
 
Delta lake and the delta architecture
Delta lake and the delta architectureDelta lake and the delta architecture
Delta lake and the delta architecture
 
Building a Virtual Data Lake with Apache Arrow
Building a Virtual Data Lake with Apache ArrowBuilding a Virtual Data Lake with Apache Arrow
Building a Virtual Data Lake with Apache Arrow
 
Amazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best PracticesAmazon EMR Deep Dive & Best Practices
Amazon EMR Deep Dive & Best Practices
 
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the CloudAmazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
Amazon S3 Best Practice and Tuning for Hadoop/Spark in the Cloud
 
Databricks Delta Lake and Its Benefits
Databricks Delta Lake and Its BenefitsDatabricks Delta Lake and Its Benefits
Databricks Delta Lake and Its Benefits
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
 
Some Iceberg Basics for Beginners (CDP).pdf
Some Iceberg Basics for Beginners (CDP).pdfSome Iceberg Basics for Beginners (CDP).pdf
Some Iceberg Basics for Beginners (CDP).pdf
 
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta LakeSimplify CDC Pipeline with Spark Streaming SQL and Delta Lake
Simplify CDC Pipeline with Spark Streaming SQL and Delta Lake
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive


 
Apache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEAApache Iceberg Presentation for the St. Louis Big Data IDEA
Apache Iceberg Presentation for the St. Louis Big Data IDEA
 
Snowflake Datawarehouse Architecturing
Snowflake Datawarehouse ArchitecturingSnowflake Datawarehouse Architecturing
Snowflake Datawarehouse Architecturing
 
Spark with Delta Lake
Spark with Delta LakeSpark with Delta Lake
Spark with Delta Lake
 
Centralized logging for (java) applications with the elastic stack made easy
Centralized logging for (java) applications with the elastic stack   made easyCentralized logging for (java) applications with the elastic stack   made easy
Centralized logging for (java) applications with the elastic stack made easy
 
Apache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the CoversApache Iceberg: An Architectural Look Under the Covers
Apache Iceberg: An Architectural Look Under the Covers
 
Apache Hadoop Security - Ranger
Apache Hadoop Security - RangerApache Hadoop Security - Ranger
Apache Hadoop Security - Ranger
 
High-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQLHigh-speed Database Throughput Using Apache Arrow Flight SQL
High-speed Database Throughput Using Apache Arrow Flight SQL
 
Iceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data AnalyticsIceberg + Alluxio for Fast Data Analytics
Iceberg + Alluxio for Fast Data Analytics
 
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin AmbardDelta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
Delta Lake OSS: Create reliable and performant Data Lake by Quentin Ambard
 

En vedette

Hadoop Security: Overview
Hadoop Security: OverviewHadoop Security: Overview
Hadoop Security: OverviewCloudera, Inc.
 
Hadoop and Data Access Security
Hadoop and Data Access SecurityHadoop and Data Access Security
Hadoop and Data Access SecurityCloudera, Inc.
 
Sentry - An Introduction
Sentry - An Introduction Sentry - An Introduction
Sentry - An Introduction Alexander Alten
 
Introduction to sentry
Introduction to sentryIntroduction to sentry
Introduction to sentrymozillazg
 
Hadoop Security Architecture
Hadoop Security ArchitectureHadoop Security Architecture
Hadoop Security ArchitectureOwen O'Malley
 
Hadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxHadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxVinay Shukla
 
Big Data Security with Hadoop
Big Data Security with HadoopBig Data Security with Hadoop
Big Data Security with HadoopCloudera, Inc.
 
Hadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateHadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateSteve Loughran
 
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 editionHadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 editionSteve Loughran
 
Administer Hadoop Cluster
Administer Hadoop ClusterAdminister Hadoop Cluster
Administer Hadoop ClusterEdureka!
 
Dive into sentry
Dive into sentryDive into sentry
Dive into sentryLeo Zhou
 
Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Shravan (Sean) Pabba
 
Embeddable data transformation for real time streams
Embeddable data transformation for real time streamsEmbeddable data transformation for real time streams
Embeddable data transformation for real time streamsJoey Echeverria
 
Hadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster AccessHadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster AccessCloudera, Inc.
 
Hadoop security landscape
Hadoop security landscapeHadoop security landscape
Hadoop security landscapeSujee Maniyam
 
Secure Hadoop Cluster With Kerberos
Secure Hadoop Cluster With KerberosSecure Hadoop Cluster With Kerberos
Secure Hadoop Cluster With KerberosEdureka!
 
Secure Search - Using Apache Sentry to Add Authentication and Authorization S...
Secure Search - Using Apache Sentry to Add Authentication and Authorization S...Secure Search - Using Apache Sentry to Add Authentication and Authorization S...
Secure Search - Using Apache Sentry to Add Authentication and Authorization S...Lucidworks
 

En vedette (20)

Hadoop Security: Overview
Hadoop Security: OverviewHadoop Security: Overview
Hadoop Security: Overview
 
Hadoop and Data Access Security
Hadoop and Data Access SecurityHadoop and Data Access Security
Hadoop and Data Access Security
 
Sentry - An Introduction
Sentry - An Introduction Sentry - An Introduction
Sentry - An Introduction
 
Introduction to sentry
Introduction to sentryIntroduction to sentry
Introduction to sentry
 
Hadoop Security Architecture
Hadoop Security ArchitectureHadoop Security Architecture
Hadoop Security Architecture
 
Big Data Security and Governance
Big Data Security and GovernanceBig Data Security and Governance
Big Data Security and Governance
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
 
Hadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache KnoxHadoop Security Today & Tomorrow with Apache Knox
Hadoop Security Today & Tomorrow with Apache Knox
 
Big Data Security with Hadoop
Big Data Security with HadoopBig Data Security with Hadoop
Big Data Security with Hadoop
 
Hadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the GateHadoop and Kerberos: the Madness Beyond the Gate
Hadoop and Kerberos: the Madness Beyond the Gate
 
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 editionHadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
Hadoop and Kerberos: the Madness Beyond the Gate: January 2016 edition
 
Administer Hadoop Cluster
Administer Hadoop ClusterAdminister Hadoop Cluster
Administer Hadoop Cluster
 
Dive into sentry
Dive into sentryDive into sentry
Dive into sentry
 
Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015
 
Embeddable data transformation for real time streams
Embeddable data transformation for real time streamsEmbeddable data transformation for real time streams
Embeddable data transformation for real time streams
 
Hadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster AccessHadoop Operations: How to Secure and Control Cluster Access
Hadoop Operations: How to Secure and Control Cluster Access
 
Hadoop security landscape
Hadoop security landscapeHadoop security landscape
Hadoop security landscape
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr Workshop
 
Secure Hadoop Cluster With Kerberos
Secure Hadoop Cluster With KerberosSecure Hadoop Cluster With Kerberos
Secure Hadoop Cluster With Kerberos
 
Secure Search - Using Apache Sentry to Add Authentication and Authorization S...
Secure Search - Using Apache Sentry to Add Authentication and Authorization S...Secure Search - Using Apache Sentry to Add Authentication and Authorization S...
Secure Search - Using Apache Sentry to Add Authentication and Authorization S...
 

Similaire à Apache Sentry for Hadoop security

BigData Security - A Point of View
BigData Security - A Point of ViewBigData Security - A Point of View
BigData Security - A Point of ViewKaran Alang
 
Webinar: Productionizing Hadoop: Lessons Learned - 20101208
Webinar: Productionizing Hadoop: Lessons Learned - 20101208Webinar: Productionizing Hadoop: Lessons Learned - 20101208
Webinar: Productionizing Hadoop: Lessons Learned - 20101208Cloudera, Inc.
 
Securing the Hadoop Ecosystem
Securing the Hadoop EcosystemSecuring the Hadoop Ecosystem
Securing the Hadoop EcosystemDataWorks Summit
 
Fighting cyber fraud with hadoop
Fighting cyber fraud with hadoopFighting cyber fraud with hadoop
Fighting cyber fraud with hadoopNiel Dunnage
 
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv larsgeorge
 
Securing Data in Hadoop at Uber
Securing Data in Hadoop at UberSecuring Data in Hadoop at Uber
Securing Data in Hadoop at UberDataWorks Summit
 
大数据数据治理及数据安全
大数据数据治理及数据安全大数据数据治理及数据安全
大数据数据治理及数据安全Jianwei Li
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop SecurityDataWorks Summit
 
Combat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache HadoopCombat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache HadoopCloudera, Inc.
 
Hadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureHadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureUwe Printz
 
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by ClouderaBig Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by ClouderaCaserta
 
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Cloudera, Inc.
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop SecurityChris Nauroth
 
Hadoop Security Today and Tomorrow
Hadoop Security Today and TomorrowHadoop Security Today and Tomorrow
Hadoop Security Today and TomorrowDataWorks Summit
 
Hadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happyHadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happyDataWorks Summit
 
Hadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happyHadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happyAnurag Shrivastava
 
Securing Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextSecuring Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextHellmar Becker
 
Security Threats to Hadoop: Data Leakage Attacks and Investigation
Security Threats to Hadoop: Data Leakage Attacks  and InvestigationSecurity Threats to Hadoop: Data Leakage Attacks  and Investigation
Security Threats to Hadoop: Data Leakage Attacks and Investigation Kiran Gajbhiye
 

Similaire à Apache Sentry for Hadoop security (20)

April 2014 HUG : Apache Sentry
April 2014 HUG : Apache SentryApril 2014 HUG : Apache Sentry
April 2014 HUG : Apache Sentry
 
BigData Security - A Point of View
BigData Security - A Point of ViewBigData Security - A Point of View
BigData Security - A Point of View
 
Webinar: Productionizing Hadoop: Lessons Learned - 20101208
Webinar: Productionizing Hadoop: Lessons Learned - 20101208Webinar: Productionizing Hadoop: Lessons Learned - 20101208
Webinar: Productionizing Hadoop: Lessons Learned - 20101208
 
Securing the Hadoop Ecosystem
Securing the Hadoop EcosystemSecuring the Hadoop Ecosystem
Securing the Hadoop Ecosystem
 
Fighting cyber fraud with hadoop
Fighting cyber fraud with hadoopFighting cyber fraud with hadoop
Fighting cyber fraud with hadoop
 
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
Data Pipelines in Hadoop - SAP Meetup in Tel Aviv
 
Securing Data in Hadoop at Uber
Securing Data in Hadoop at UberSecuring Data in Hadoop at Uber
Securing Data in Hadoop at Uber
 
大数据数据治理及数据安全
大数据数据治理及数据安全大数据数据治理及数据安全
大数据数据治理及数据安全
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop Security
 
Combat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache HadoopCombat Cyber Threats with Cloudera Impala & Apache Hadoop
Combat Cyber Threats with Cloudera Impala & Apache Hadoop
 
Hadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureHadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, Future
 
August 2014 HUG : Hive 13 Security
August 2014 HUG : Hive 13 SecurityAugust 2014 HUG : Hive 13 Security
August 2014 HUG : Hive 13 Security
 
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by ClouderaBig Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
Big Data Warehousing Meetup: Securing the Hadoop Ecosystem by Cloudera
 
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
Comprehensive Security for the Enterprise II: Guarding the Perimeter and Cont...
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop Security
 
Hadoop Security Today and Tomorrow
Hadoop Security Today and TomorrowHadoop Security Today and Tomorrow
Hadoop Security Today and Tomorrow
 
Hadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happyHadoop Security Features That make your risk officer happy
Hadoop Security Features That make your risk officer happy
 
Hadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happyHadoop Security Features that make your risk officer happy
Hadoop Security Features that make your risk officer happy
 
Securing Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextSecuring Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise Context
 
Security Threats to Hadoop: Data Leakage Attacks and Investigation
Security Threats to Hadoop: Data Leakage Attacks  and InvestigationSecurity Threats to Hadoop: Data Leakage Attacks  and Investigation
Security Threats to Hadoop: Data Leakage Attacks and Investigation
 

Plus de bigdatagurus_meetup

Hypertable - massively scalable nosql database
Hypertable - massively scalable nosql databaseHypertable - massively scalable nosql database
Hypertable - massively scalable nosql databasebigdatagurus_meetup
 
Big data beyond the hype may 2014
Big data beyond the hype may 2014Big data beyond the hype may 2014
Big data beyond the hype may 2014bigdatagurus_meetup
 
What enterprises can learn from Real Time Bidding (RTB)
What enterprises can learn from Real Time Bidding (RTB)What enterprises can learn from Real Time Bidding (RTB)
What enterprises can learn from Real Time Bidding (RTB)bigdatagurus_meetup
 
Quantcast File System (QFS) - Alternative to HDFS
Quantcast File System (QFS) - Alternative to HDFSQuantcast File System (QFS) - Alternative to HDFS
Quantcast File System (QFS) - Alternative to HDFSbigdatagurus_meetup
 
Apache Tez -- A modern processing engine
Apache Tez -- A modern processing engineApache Tez -- A modern processing engine
Apache Tez -- A modern processing enginebigdatagurus_meetup
 

Plus de bigdatagurus_meetup (11)

Hypertable - massively scalable nosql database
Hypertable - massively scalable nosql databaseHypertable - massively scalable nosql database
Hypertable - massively scalable nosql database
 
Big data beyond the hype may 2014
Big data beyond the hype may 2014Big data beyond the hype may 2014
Big data beyond the hype may 2014
 
What enterprises can learn from Real Time Bidding (RTB)
What enterprises can learn from Real Time Bidding (RTB)What enterprises can learn from Real Time Bidding (RTB)
What enterprises can learn from Real Time Bidding (RTB)
 
Quantcast File System (QFS) - Alternative to HDFS
Quantcast File System (QFS) - Alternative to HDFSQuantcast File System (QFS) - Alternative to HDFS
Quantcast File System (QFS) - Alternative to HDFS
 
Scaling HBase at Pinterest
Scaling HBase at PinterestScaling HBase at Pinterest
Scaling HBase at Pinterest
 
Continuuity Weave
Continuuity WeaveContinuuity Weave
Continuuity Weave
 
Cassandra 2.0 (Introduction)
Cassandra 2.0 (Introduction)Cassandra 2.0 (Introduction)
Cassandra 2.0 (Introduction)
 
Search On Hadoop
Search On HadoopSearch On Hadoop
Search On Hadoop
 
Apache Tez -- A modern processing engine
Apache Tez -- A modern processing engineApache Tez -- A modern processing engine
Apache Tez -- A modern processing engine
 
Cloudera Developer Kit (CDK)
Cloudera Developer Kit (CDK)Cloudera Developer Kit (CDK)
Cloudera Developer Kit (CDK)
 
Lipstick On Pig
Lipstick On Pig Lipstick On Pig
Lipstick On Pig
 

Dernier

Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Klinik kandungan
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxchadhar227
 
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制vexqp
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...nirzagarg
 
Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........EfruzAsilolu
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...gajnagarg
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjurptikerjasaptiker
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制vexqp
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...gajnagarg
 
Data Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdfData Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdftheeltifs
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制vexqp
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...Health
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...nirzagarg
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraGovindSinghDasila
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制vexqp
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteedamy56318795
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...nirzagarg
 

Dernier (20)

Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
Jual obat aborsi Bandung ( 085657271886 ) Cytote pil telat bulan penggugur ka...
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
怎样办理伦敦大学城市学院毕业证(CITY毕业证书)成绩单学校原版复制
 
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In Begusarai [ 7014168258 ] Call Me For Genuine Models...
 
Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
 
Sequential and reinforcement learning for demand side management by Margaux B...
Sequential and reinforcement learning for demand side management by Margaux B...Sequential and reinforcement learning for demand side management by Margaux B...
Sequential and reinforcement learning for demand side management by Margaux B...
 
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling ManjurJual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
Jual Cytotec Asli Obat Aborsi No. 1 Paling Manjur
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
Data Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdfData Analyst Tasks to do the internship.pdf
Data Analyst Tasks to do the internship.pdf
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
怎样办理圣路易斯大学毕业证(SLU毕业证书)成绩单学校原版复制
 
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Satna [ 7014168258 ] Call Me For Genuine Models We ...
 

Apache Sentry for Hadoop security

  • 1. 1 Apache Sentry: Enterprise-grade Security for Hadoop Xuefu Zhang, Srayva Tirukkovalur | Cloudera June 25, 2014
  • 2. Outline • Introduction • Hadoop security primer • Authentication • Authorization • Data Protection • Governance and Auditing • Introducing Apache Sentry • What's Sentry • Sentry Architecture • Sentry Internal • Recent Development • Q&A 2
  • 3. Introduction ● Hadoop gets bigger ... ● Hadoop has been enjoying an increasing adoption rate ● More and more data on Hadoop Cluster ● More and more access to the data ● Data warehouse offload is the most common use case ● Apache Hive, Apache Drill, Cloudera Impala ● SQL on Hadoop is phenomenon 3
  • 4. Introduction (cont'd) ● But more encumbrance ... ● Enterprises wants to protect sensitive data ● Government regulations, compliance, like HIPPA, PII, FISMA ● Existing security problems with Hadoop has hindered the adoption ● Security has become the top priority 4
  • 5. Introduction (cont'd) ● Reality is ... ● Different components, different security mechanisms ● Multiple components may access the same data set ● Hadoop was born out of trust, not security ● Thinking of Windows 5
  • 6. Outline • Introduction • Hadoop security primer • Authentication • Authorization • Data Protection • Governance and Auditing • Introducing Apache Sentry • What's Sentry • Sentry Architecture • Sentry Internal • Recent Development • Q&A 6
  • 7. Hadoop Security Primer • Authentication ● Identify who you are ● Untrusted users has no access to the cluster network ● In a trusted network, every one is good citizen ● Who you are is determined by client host 7
  • 8. Hadoop Security Primer • Strong Authentication ● Kerberos ● LDAP, ActiveDirectory ● LDAP, AD integrated with Kerberos, establishing a single point of truth ● Single Sign On 8
  • 9. Hadoop Security Primer (cont'd) • Kerberos ● Strong authentication ● Provides mutual authentication ● Protects against eavesdropping and replay attacks ● Every user and service has a Kerberos “principal” ● Credentials: keytabs (service), password (user) 9
  • 10. Hadoop Security Primer (cont'd) • Authorization ● Determine if you can access ● HDFS Posix style permission R/W/X for U/G/O, coarse- grained ● Other components have authorization ● MR job queue ● HBase ACLs on table and column family. ● Accumulo provides cell-level access control ● Impersonation 10
  • 11. Hadoop Security Primer (cont'd) • Data Protection ● Data at rest and in transit ● Hadoop provides encryption on data in transit: DTP, HTTP, RPC, JDBC/ODBC ● Hadoop has no native encryption on data at rest (HDFS- 6134) ● Relying on OS-level encryption 11
  • 12. Hadoop Security Primer (cont'd) • Governance and auditing ● Again, component to component ● DFS and MapReduce provide base audit support ● Apache Hive metastore records audit (who/when) information for Hive interactions. ● Apache Oozie provides audit trail for services 12
  • 13. Outline • Introduction • Hadoop security primer • Authentication • Authorization • Data Protection • Governance and Auditing • Introducing Apache Sentry • What's Sentry • Sentry Architecture • Sentry Internal • Recent Development • Q&A 13
  • 14. Introducing Apache Sentry 14 ● Hadoop Authorization ● Existing authorization is fragmented, coarse-grained, and manual ● A lot of times data is just unprotected for simplicity ● Enterprises need a centralized authorization component that work across components with ease of use, fine- grained, role based
  • 15. Introducing Apache Sentry (cont'd) 15 ● What's Sentry ● Sentry is an authorization module for Hive, Search, Impala, and beyond ● It unlocks Key RBAC Requirements: secure, fine- grained, role-based authorization, multi-tenant administration ● Open Source, Apache Incubator project ● Ecosystem Support: Apache SOLR, HiveServer2, & Impala 1.1+
  • 16. Introducing Apache Sentry (cont'd) 16 ● Key Benefits ● Store Sensitive Data in Hadoop ● Extend Hadoop to More Users ● Comply with Regulations
  • 17. Introducing Apache Sentry (cont'd) 17 ● Key Capabilities ● Fine-Grained: SERVERS, DATABASES, TABLES & VIEWS; INDEXES, COLLECTIONS ● Role-Based: role including privileges such as SELECT, INSERT, ALL; UPDATE, QUERY ● Multi-Tenant administration ● Separate policies for each database/schema ● Can be maintained by separate admins
  • 18. Introducing Apache Sentry (cont'd) 18 Binding Layer Impala Impala Hive Policy Engine Policy Provider File Database HiveServer 2 Authorization Provider Local FS/HDFS Search SOLR Pig … Sentry Architecture
  • 19. Introducing Apache Sentry (cont'd) 19 QueryMR SQL Parse Build Check Plan Sentry Validate SQL grammar Construct statement tree Validate statement objects • First check: Authorization Forward to execution planner
  • 20. Introducing Apache Sentry (cont'd) • Actors ● User ● User group membership ● Resources ● Privilege ● Role 20
  • 21. Introducing Apache Sentry (cont'd) • User ● User authenticated ● User identity obtained from session context 21
  • 22. Introducing Apache Sentry (cont'd) • User group membership ● Defined outside sentry policy ● Obtained from user directory (LDAP, AD, HDFS) ● Maybe available from session context 22
  • 23. Introducing Apache Sentry (cont'd) • Resources ● Data to be protected ● File or directory on HDFS ● Table or views in Hive ● URI ● Resource can be hierarchical 23
  • 24. Introducing Apache Sentry (cont'd) • Privilege ● Action or operation associated with a resource ● Exists in a role only ● SELECT on a given TABLE or VIEW ● CREATE a TABLE or VIEW ● QUERY on a search COLLECTION ● DELETE a FILE or DIRECTORY ● Example collection=customerCol->action=query 24
  • 25. Introducing Apache Sentry (cont'd) • Roles ● A collection of privileges ● Defined in Sentry policy ● Example [roles] ana_query_role = collection=sentryColl->action=query ana_update_role = collection=sentryColl->action=update test_role = collection=testColl->action=update full_admin_role = collection=* 25
  • 26. Introducing Apache Sentry (cont'd) • (Group, Role) mapping ● Defined in policy ● One-to-Many ● Example [groups] analyts = ana_query_role, ana_update_role admins = full_admin_role testgroup = test_role hbase = full_admin_role 26
  • 27. Introducing Apache Sentry (cont'd) • Rule evaluation ● Who's the user? ● Which group(s) does the user belong to? ● What resource to be accessed? ● How the resource is accessed (READ, SELECT, etc.)? ● Does any of the user's groups have a role, which has the right privilege? ● Yes – great! Go head! ● No – sorry! No sufficient privilege! 27
  • 28. Outline • Introduction • Hadoop security primer • Authentication • Authorization • Data Protection • Governance and Auditing • Introducing Apache Sentry • What's Sentry • Sentry Architecture • Sentry Internal • Recent Development • Q&A 28
  • 29. Click to edit Master title style 29