SlideShare une entreprise Scribd logo
1  sur  36
© Cloudera, Inc. All rights reserved.
SECURING DATA IN HYBRID ENVIRONMENTS
USING APACHE RANGER
Don Bosco Durai, Privacera
Apache Ranger PMC
Madhan Neethiraj, Cloudera
Apache Ranger PMC, Apache Atlas PMC
© Cloudera, Inc. All rights reserved. 2© Cloudera, Inc. All rights reserved.
DISCLAIMER
• This document may contain product features and technology directions that are under
development, may be under development in the future or may ultimately not be developed.
• Project capabilities are based on information that is publicly available within the Apache
Software Foundation project websites ("Apache"). Progress of the project capabilities can be
tracked from inception to release through Apache, however, technical feasibility, market
demand, user feedback and the overarching Apache Software Foundation community
development process can all effect timing and final delivery.
• This document’s description of these features and technology directions does not represent a
contractual commitment, promise or obligation from Cloudera and Privacera to deliver these
features in any generally available product.
• Product features and technology directions are subject to change, and must not be included in
contracts, purchase orders, or sales agreements of any kind.
• Since this document contains an outline of general product development plans, customers
should not rely upon it when making purchasing decisions.
© Cloudera, Inc. All rights reserved.
ABOUT PRIVACERA
Privacera Confidential
CLOUDACCESS MANAGER CLOUD
DISCOVERY
Storage
SQL
No SQL
Streaming,
Serverless,
ML
CLOUD
ANONYMIZATION
© Cloudera, Inc. All rights reserved. 4© Cloudera, Inc. All rights reserved.
AGENDA
Apache Ranger overview
Security Challenges Hybrid Deployment
Implementing Hybrid Security using Ranger
New Features: Security Zones, Role Based Access Control, Conditions at Policy Scope
Demo
Questions
© Cloudera, Inc. All rights reserved. 5© Cloudera, Inc. All rights reserved.
APACHE RANGER: OVERVIEW - HISTORY
Jul 2014
Enters Incubation
Nov 2014
Ranger 0.4.0
Jun 2015
Ranger 0.5.0
x
Jul 2016
Ranger 0.6.0
Nov 2016
Ranger 0.6.2x
Jan 2017
Ranger TLP
graduation!
Jun 2017
Ranger 0.7.1
Mar 2018
Ranger 1.0.0
• Committers: 29
• Contributors
from:
eBay, MSFT,
Huawei, Pandora,
Accenture, ING,
Talend, ZTE
Ranger 1.1.0Ranger 0.7.x
• Tag based Masking
• Export/import of Policies
• $User and macros
• User Sync Nested LDAP
Support
• Plugin status tab
• “Show columns” and
“describe extended support”
• Incremental LDAP Sync
• Time based policies
• Metadata security
• Audit only (compliance) role
• Hive UDF usage authorization
• Show Hive query in audits
• Policy labels
• Audit enhancements
Feb 2017
Ranger 0.7.0
Jul 2018
Ranger 1.1.0
May 2014
XASecure
Acquisition
Ranger 2.0.0
~May 2019
Ranger 2.0.0
Oct 2018
Ranger 1.2.0
Jan 2016
Ranger 0.5.1
Aug 2016
Ranger 0.6.1
• Hadoop3 version updates
• Security zones
• Policy level custom
conditions
• Role based authorization
• DB Schema optimization
for faster policy CRUD
• Hadoop Trusted-proxy
authentication
© Cloudera, Inc. All rights reserved. 6© Cloudera, Inc. All rights reserved.
APACHE RANGER: OVERVIEW – FEATURES
• Centralized policy administration
• Centralized auditing
• Dynamic row filtering
• Dynamic data masking
• Tag based authorization and data-masking policies
• Rich & extendable policy enforcement engine
• Key Management System (KMS)
• New Feature: Security Zones
• New Feature: Support for Roles Based Access Control
• New Feature: Conditions at policy scope
© Cloudera, Inc. All rights reserved. 7© Cloudera, Inc. All rights reserved.
APACHE RANGER: OVERVIEW – CENTRALIZED AUTHORIZATION
© Cloudera, Inc. All rights reserved. 8© Cloudera, Inc. All rights reserved.
SECURITY IN HYBRID ENVIRONMENT
© Cloudera, Inc. All rights reserved. 9© Cloudera, Inc. All rights reserved.
HYBRID DEPLOYMENT: OVERVIEW
On Premise
HDFS Hive Kafka Spark
Hive
Ranger
HDInsight
HiveSpark
EMR
Ranger
Ranger DB
Presto
Security
Admins
Data
Stewards
© Cloudera, Inc. All rights reserved. 10© Cloudera, Inc. All rights reserved.
HYBRID DEPLOYMENT: SECURITY CHALLENGES
• Every environment has different security model
• Access policies needs to be set in each environment
• Policies needs to be consistent
• The granularity of access control are not the same
• Policies can go out of sync very soon
• Regulation and compliance requirements on what data
can be copied to cloud and whether it should be
encrypted or deidentified
© Cloudera, Inc. All rights reserved. 11© Cloudera, Inc. All rights reserved.
HDInsight
Option #1
Restrict Data from On-premise
Option #2
Centralized Ranger
© Cloudera, Inc. All rights reserved. 12© Cloudera, Inc. All rights reserved.
HYBRID DEPLOYMENT: OPTION #1
• Filter & Redact data copied to cloud
• Use Hive to export data to S3
• Apply Ranger Row Level Filtering and Column Masking on ETL user (e.g.
s3etl)
• Setup cloud native access policies for copied data
© Cloudera, Inc. All rights reserved. 13© Cloudera, Inc. All rights reserved.
APACHE RANGER: ROW-FILTER, COLUMN-MASKING POLICIES
ID CONSENT TAX_ID NAME EMAIL
1 Y 123456789 John john@acme.com
2 Y 987654321 Jane jane@acme.com
3 N 789654123 Mary mary@acme.com
4 Y 321789654 David david@acme.com
5 N 456321789 Max max@acme.com
ID CONSENT TAX_ID NAME EMAIL
1 Y xxxxxxxxxx John dkrx@acme.com
2 Y xxxxxxxxxx Jane yafe@acme.com
4 Y xxxxxxxxxx David aumd2@acme.com
© Cloudera, Inc. All rights reserved. 14© Cloudera, Inc. All rights reserved.
APACHE RANGER: ROW-FILTER, COLUMN-MASKING POLICIES
© Cloudera, Inc. All rights reserved. 15© Cloudera, Inc. All rights reserved.
HYBRID DEPLOYMENT: OPTION #1 – PROS AND CONS
• Advantages
• Simple to implement
• Fine grained policies enforced on premise using Filtering, Redaction and Transformation
• Use cloud security policy for coarse grain policies
• Make data accessible to non-Ranger supported services like AWS Redshift, AWS Athena,
SageMaker, etc.
• Limitation
• Not real-time
• If policies changes, then data need to be recopied to cloud
• Need to manage policies on both the sides
© Cloudera, Inc. All rights reserved. 16© Cloudera, Inc. All rights reserved.
HYBRID DEPLOYMENT: OPTION #2 - CENTRALIZED SECURITY
On Premise
HDFS Hive Kafka Spark
Hive
Ranger
HDInsight
HiveSpark
EMR
Ranger
Ranger DB
Presto
Security
Admins
Data
Stewards
© Cloudera, Inc. All rights reserved. 17© Cloudera, Inc. All rights reserved.
HYBRID DEPLOYMENT: OPTION #2
• Common Ranger Admin or Ranger Database for all environments
• Single Ranger to manage the policies for all environments
• If you are using the same name for resources, e.g. Database, Table and
Column name, then a same policy would be used by all the environments
• Tag-based policies can be used to authorize access to cloud-specific data as
well
• Use new Ranger features under development to support central policy
management
• Security Zone
• Scoped Policy
• Roles in Ranger
© Cloudera, Inc. All rights reserved. 18© Cloudera, Inc. All rights reserved.
HYBRID DEPLOYMENT: OPTION #2 – PROS AND CONS
• Advantages
• Centrally Manage security policies for all environments
• Policy changes applied in real-time in all environments
• Leverage Tag Based policies for consistent behavior
• Increasing support for Ranger by 3rd party vendors. Privacera, StarBurst, Dremio, Microsoft,
EMC Isilon, etc.
• Limitation
• Need reliable and secure network connectivity between premise and cloud (site to site VPN)
• All cloud components might be not supported by Open Source Ranger.
• Ranger integration for cloud environment is not supported by the community and will require
additional setup in the cloud services/deployments
© Cloudera, Inc. All rights reserved. 19© Cloudera, Inc. All rights reserved.
PRIVACERA EXTENSION TO APACHE RANGER
© Cloudera, Inc. All rights reserved. 20© Cloudera, Inc. All rights reserved.
DEMO
© Cloudera, Inc. All rights reserved. 21© Cloudera, Inc. All rights reserved.
SECURITY ZONES
© Cloudera, Inc. All rights reserved. 22© Cloudera, Inc. All rights reserved.
APACHE RANGER: SECURITY ZONES - INTRODUCTION
• Partition resources for easier administration of security policies
• Policies in a zone are applied only for resources included in the
zone. For example:
• a landing zone policy for db=* applies only for the resources of landing
zone. It will not impact other resources, like db=marketing
• Policy administration for each zone can be delegated to specific
users/groups
Zone HDFS Hive HBase Kafka
landing /landing/ db=*landing
staging /staging/ db=*staging table=*staging
marketing /marketing db=marketing table=marketing topic=mktg_campaign
© Cloudera, Inc. All rights reserved. 23© Cloudera, Inc. All rights reserved.
APACHE RANGER: SECURITY ZONES - INTRODUCTION
• Audit log includes zone name, allows to quickly filter accesses to
resources of a zone
• REST API for Security Zone administration
• Example use cases:
• ‘on-prem’ zone for resources that should only be accessible from on-prem
clusters
• ‘test-data’ zone for resources that can be used for test purposes by wider
set of users/groups, without impacting production data
© Cloudera, Inc. All rights reserved. 24© Cloudera, Inc. All rights reserved.
APACHE RANGER: SECURITY ZONES - ADMINISTRATION
© Cloudera, Inc. All rights reserved. 25© Cloudera, Inc. All rights reserved.
APACHE RANGER: SECURITY ZONES - ADMINISTRATION
© Cloudera, Inc. All rights reserved. 26© Cloudera, Inc. All rights reserved.
APACHE RANGER: SECURITY ZONES - POLICY ADMINISTRATION
• Users see only zones in
which they have admin
privileges
• Zone support extends to
access, data-masking,
row-filter and tag-based
policies
© Cloudera, Inc. All rights reserved. 27© Cloudera, Inc. All rights reserved.
APACHE RANGER: SECURITY ZONES – AUDIT LOGS
• Shows zone of the
accessed resource
• Audits can be filtered by
zone
• Only policies in zone of
the accessed resource
are used to authorize
© Cloudera, Inc. All rights reserved. 28© Cloudera, Inc. All rights reserved.
ROLE BASED ACCESS CONTROL
© Cloudera, Inc. All rights reserved. 29© Cloudera, Inc. All rights reserved.
APACHE RANGER: ROLE BASED ACCESS CONTROL - INTRODUCTION
• Ranger policy model extended to support roles
• RBAC is widely used in enterprise applications & cloud environments
• Roles can be used in
• resource-based authorization policies
• tag-based authorization policies
• data-masking policies
• row-filtering policies
• Role management REST API
© Cloudera, Inc. All rights reserved. 30© Cloudera, Inc. All rights reserved.
APACHE RANGER: ROLE BASED ACCESS CONTROL – ROLE ADMIN
© Cloudera, Inc. All rights reserved. 31© Cloudera, Inc. All rights reserved.
APACHE RANGER: ROLE BASED ACCESS CONTROL - POLICY
© Cloudera, Inc. All rights reserved. 32© Cloudera, Inc. All rights reserved.
CONDITIONS AT POLICY SCOPE
© Cloudera, Inc. All rights reserved. 33© Cloudera, Inc. All rights reserved.
APACHE RANGER: CONDITIONS AT POLICY SCOPE - INTRODUCTION
• Conditions can now be set at policy scope, in addition to policy-item scope
• Simplifies use of conditions in policies
• Example use cases:
• Policies specific to access cluster i.e. on-prem, cloud
• Multiple policies for a given tag, for different tag-attribute values
i.e. PII type=email, PII: type=ccn
© Cloudera, Inc. All rights reserved. 34© Cloudera, Inc. All rights reserved.
APACHE RANGER: CONDITIONS AT POLICY SCOPE - SAMPLE
Access cluster type: cloud
© Cloudera, Inc. All rights reserved. 35© Cloudera, Inc. All rights reserved.
APACHE RANGER: CONDITIONS AT POLICY SCOPE - SAMPLE
tagAttr.type == ‘ccn’
tagAttr.type == ‘email’
© Cloudera, Inc. All rights reserved.
THANK YOU

Contenu connexe

Tendances

Ozone: Evolution of HDFS scalability & built-in GDPR compliance
Ozone: Evolution of HDFS scalability & built-in GDPR complianceOzone: Evolution of HDFS scalability & built-in GDPR compliance
Ozone: Evolution of HDFS scalability & built-in GDPR compliance
Dinesh Chitlangia
 
Enabling Modern Application Architecture using Data.gov open government data
Enabling Modern Application Architecture using Data.gov open government dataEnabling Modern Application Architecture using Data.gov open government data
Enabling Modern Application Architecture using Data.gov open government data
DataWorks Summit
 

Tendances (20)

Big Data Platform Industrialization
Big Data Platform Industrialization Big Data Platform Industrialization
Big Data Platform Industrialization
 
Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4Hdfs 2016-hadoop-summit-san-jose-v4
Hdfs 2016-hadoop-summit-san-jose-v4
 
Curb your insecurity with HDP
Curb your insecurity with HDPCurb your insecurity with HDP
Curb your insecurity with HDP
 
Ozone: Evolution of HDFS scalability & built-in GDPR compliance
Ozone: Evolution of HDFS scalability & built-in GDPR complianceOzone: Evolution of HDFS scalability & built-in GDPR compliance
Ozone: Evolution of HDFS scalability & built-in GDPR compliance
 
Hadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsHadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the experts
 
Hadoop Security and Compliance - StampedeCon 2016
Hadoop Security and Compliance - StampedeCon 2016Hadoop Security and Compliance - StampedeCon 2016
Hadoop Security and Compliance - StampedeCon 2016
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...
 
Scaling HDFS at Xiaomi
Scaling HDFS at XiaomiScaling HDFS at Xiaomi
Scaling HDFS at Xiaomi
 
Securing Spark Applications by Kostas Sakellis and Marcelo Vanzin
Securing Spark Applications by Kostas Sakellis and Marcelo VanzinSecuring Spark Applications by Kostas Sakellis and Marcelo Vanzin
Securing Spark Applications by Kostas Sakellis and Marcelo Vanzin
 
Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...Dancing elephants - efficiently working with object stores from Apache Spark ...
Dancing elephants - efficiently working with object stores from Apache Spark ...
 
Multi-Tenant Operations with Cloudera 5.7 & BT
Multi-Tenant Operations with Cloudera 5.7 & BTMulti-Tenant Operations with Cloudera 5.7 & BT
Multi-Tenant Operations with Cloudera 5.7 & BT
 
Hybrid Data Platform
Hybrid Data Platform Hybrid Data Platform
Hybrid Data Platform
 
Data protection for hadoop environments
Data protection for hadoop environmentsData protection for hadoop environments
Data protection for hadoop environments
 
Storage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesStorage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on Kubernetes
 
Accelerating Big Data Insights
Accelerating Big Data InsightsAccelerating Big Data Insights
Accelerating Big Data Insights
 
Managing Hadoop, HBase and Storm Clusters at Yahoo Scale
Managing Hadoop, HBase and Storm Clusters at Yahoo ScaleManaging Hadoop, HBase and Storm Clusters at Yahoo Scale
Managing Hadoop, HBase and Storm Clusters at Yahoo Scale
 
Enabling Modern Application Architecture using Data.gov open government data
Enabling Modern Application Architecture using Data.gov open government dataEnabling Modern Application Architecture using Data.gov open government data
Enabling Modern Application Architecture using Data.gov open government data
 
Leveraging docker for hadoop build automation and big data stack provisioning
Leveraging docker for hadoop build automation and big data stack provisioningLeveraging docker for hadoop build automation and big data stack provisioning
Leveraging docker for hadoop build automation and big data stack provisioning
 
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for productionFaster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
 
Securing data in hybrid environments using Apache Ranger
Securing data in hybrid environments using Apache RangerSecuring data in hybrid environments using Apache Ranger
Securing data in hybrid environments using Apache Ranger
 

Similaire à Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger

Saving the elephant—now, not later
Saving the elephant—now, not laterSaving the elephant—now, not later
Saving the elephant—now, not later
DataWorks Summit
 

Similaire à Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger (20)

大数据数据治理及数据安全
大数据数据治理及数据安全大数据数据治理及数据安全
大数据数据治理及数据安全
 
Five Tips for Running Cloudera on AWS
Five Tips for Running Cloudera on AWSFive Tips for Running Cloudera on AWS
Five Tips for Running Cloudera on AWS
 
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
Optimized Data Management with Cloudera 5.7: Understanding data value with Cl...
 
Big Data Fundamentals 6.6.18
Big Data Fundamentals 6.6.18Big Data Fundamentals 6.6.18
Big Data Fundamentals 6.6.18
 
Big Data Fundamentals
Big Data FundamentalsBig Data Fundamentals
Big Data Fundamentals
 
Hadoop security implementationon 20171003
Hadoop security implementationon 20171003Hadoop security implementationon 20171003
Hadoop security implementationon 20171003
 
Security implementation on hadoop
Security implementation on hadoopSecurity implementation on hadoop
Security implementation on hadoop
 
Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015Hadoop security @ Philly Hadoop Meetup May 2015
Hadoop security @ Philly Hadoop Meetup May 2015
 
Saving the elephant—now, not later
Saving the elephant—now, not laterSaving the elephant—now, not later
Saving the elephant—now, not later
 
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadCloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
 
Spark One Platform Webinar
Spark One Platform WebinarSpark One Platform Webinar
Spark One Platform Webinar
 
BigData Security - A Point of View
BigData Security - A Point of ViewBigData Security - A Point of View
BigData Security - A Point of View
 
Presentacion de solucion cloud de navegacion segura
Presentacion de solucion cloud de navegacion seguraPresentacion de solucion cloud de navegacion segura
Presentacion de solucion cloud de navegacion segura
 
Cloud's Hidden Impact on IT Shops
Cloud's Hidden Impact on IT ShopsCloud's Hidden Impact on IT Shops
Cloud's Hidden Impact on IT Shops
 
Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19Leveraging the cloud for analytics and machine learning 1.29.19
Leveraging the cloud for analytics and machine learning 1.29.19
 
Get started with Cloudera's cyber solution
Get started with Cloudera's cyber solutionGet started with Cloudera's cyber solution
Get started with Cloudera's cyber solution
 
Big data journey to the cloud 5.30.18 asher bartch
Big data journey to the cloud 5.30.18   asher bartchBig data journey to the cloud 5.30.18   asher bartch
Big data journey to the cloud 5.30.18 asher bartch
 
A Community Approach to Fighting Cyber Threats
A Community Approach to Fighting Cyber ThreatsA Community Approach to Fighting Cyber Threats
A Community Approach to Fighting Cyber Threats
 
Get Started with Cloudera’s Cyber Solution
Get Started with Cloudera’s Cyber SolutionGet Started with Cloudera’s Cyber Solution
Get Started with Cloudera’s Cyber Solution
 
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
 

Plus de DataWorks Summit

HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
DataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
DataWorks Summit
 

Plus de DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
Transforming and Scaling Large Scale Data Analytics: Moving to a Cloud-based ...
 
Applying Noisy Knowledge Graphs to Real Problems
Applying Noisy Knowledge Graphs to Real ProblemsApplying Noisy Knowledge Graphs to Real Problems
Applying Noisy Knowledge Graphs to Real Problems
 

Dernier

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
giselly40
 

Dernier (20)

Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 

Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger

  • 1. © Cloudera, Inc. All rights reserved. SECURING DATA IN HYBRID ENVIRONMENTS USING APACHE RANGER Don Bosco Durai, Privacera Apache Ranger PMC Madhan Neethiraj, Cloudera Apache Ranger PMC, Apache Atlas PMC
  • 2. © Cloudera, Inc. All rights reserved. 2© Cloudera, Inc. All rights reserved. DISCLAIMER • This document may contain product features and technology directions that are under development, may be under development in the future or may ultimately not be developed. • Project capabilities are based on information that is publicly available within the Apache Software Foundation project websites ("Apache"). Progress of the project capabilities can be tracked from inception to release through Apache, however, technical feasibility, market demand, user feedback and the overarching Apache Software Foundation community development process can all effect timing and final delivery. • This document’s description of these features and technology directions does not represent a contractual commitment, promise or obligation from Cloudera and Privacera to deliver these features in any generally available product. • Product features and technology directions are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind. • Since this document contains an outline of general product development plans, customers should not rely upon it when making purchasing decisions.
  • 3. © Cloudera, Inc. All rights reserved. ABOUT PRIVACERA Privacera Confidential CLOUDACCESS MANAGER CLOUD DISCOVERY Storage SQL No SQL Streaming, Serverless, ML CLOUD ANONYMIZATION
  • 4. © Cloudera, Inc. All rights reserved. 4© Cloudera, Inc. All rights reserved. AGENDA Apache Ranger overview Security Challenges Hybrid Deployment Implementing Hybrid Security using Ranger New Features: Security Zones, Role Based Access Control, Conditions at Policy Scope Demo Questions
  • 5. © Cloudera, Inc. All rights reserved. 5© Cloudera, Inc. All rights reserved. APACHE RANGER: OVERVIEW - HISTORY Jul 2014 Enters Incubation Nov 2014 Ranger 0.4.0 Jun 2015 Ranger 0.5.0 x Jul 2016 Ranger 0.6.0 Nov 2016 Ranger 0.6.2x Jan 2017 Ranger TLP graduation! Jun 2017 Ranger 0.7.1 Mar 2018 Ranger 1.0.0 • Committers: 29 • Contributors from: eBay, MSFT, Huawei, Pandora, Accenture, ING, Talend, ZTE Ranger 1.1.0Ranger 0.7.x • Tag based Masking • Export/import of Policies • $User and macros • User Sync Nested LDAP Support • Plugin status tab • “Show columns” and “describe extended support” • Incremental LDAP Sync • Time based policies • Metadata security • Audit only (compliance) role • Hive UDF usage authorization • Show Hive query in audits • Policy labels • Audit enhancements Feb 2017 Ranger 0.7.0 Jul 2018 Ranger 1.1.0 May 2014 XASecure Acquisition Ranger 2.0.0 ~May 2019 Ranger 2.0.0 Oct 2018 Ranger 1.2.0 Jan 2016 Ranger 0.5.1 Aug 2016 Ranger 0.6.1 • Hadoop3 version updates • Security zones • Policy level custom conditions • Role based authorization • DB Schema optimization for faster policy CRUD • Hadoop Trusted-proxy authentication
  • 6. © Cloudera, Inc. All rights reserved. 6© Cloudera, Inc. All rights reserved. APACHE RANGER: OVERVIEW – FEATURES • Centralized policy administration • Centralized auditing • Dynamic row filtering • Dynamic data masking • Tag based authorization and data-masking policies • Rich & extendable policy enforcement engine • Key Management System (KMS) • New Feature: Security Zones • New Feature: Support for Roles Based Access Control • New Feature: Conditions at policy scope
  • 7. © Cloudera, Inc. All rights reserved. 7© Cloudera, Inc. All rights reserved. APACHE RANGER: OVERVIEW – CENTRALIZED AUTHORIZATION
  • 8. © Cloudera, Inc. All rights reserved. 8© Cloudera, Inc. All rights reserved. SECURITY IN HYBRID ENVIRONMENT
  • 9. © Cloudera, Inc. All rights reserved. 9© Cloudera, Inc. All rights reserved. HYBRID DEPLOYMENT: OVERVIEW On Premise HDFS Hive Kafka Spark Hive Ranger HDInsight HiveSpark EMR Ranger Ranger DB Presto Security Admins Data Stewards
  • 10. © Cloudera, Inc. All rights reserved. 10© Cloudera, Inc. All rights reserved. HYBRID DEPLOYMENT: SECURITY CHALLENGES • Every environment has different security model • Access policies needs to be set in each environment • Policies needs to be consistent • The granularity of access control are not the same • Policies can go out of sync very soon • Regulation and compliance requirements on what data can be copied to cloud and whether it should be encrypted or deidentified
  • 11. © Cloudera, Inc. All rights reserved. 11© Cloudera, Inc. All rights reserved. HDInsight Option #1 Restrict Data from On-premise Option #2 Centralized Ranger
  • 12. © Cloudera, Inc. All rights reserved. 12© Cloudera, Inc. All rights reserved. HYBRID DEPLOYMENT: OPTION #1 • Filter & Redact data copied to cloud • Use Hive to export data to S3 • Apply Ranger Row Level Filtering and Column Masking on ETL user (e.g. s3etl) • Setup cloud native access policies for copied data
  • 13. © Cloudera, Inc. All rights reserved. 13© Cloudera, Inc. All rights reserved. APACHE RANGER: ROW-FILTER, COLUMN-MASKING POLICIES ID CONSENT TAX_ID NAME EMAIL 1 Y 123456789 John john@acme.com 2 Y 987654321 Jane jane@acme.com 3 N 789654123 Mary mary@acme.com 4 Y 321789654 David david@acme.com 5 N 456321789 Max max@acme.com ID CONSENT TAX_ID NAME EMAIL 1 Y xxxxxxxxxx John dkrx@acme.com 2 Y xxxxxxxxxx Jane yafe@acme.com 4 Y xxxxxxxxxx David aumd2@acme.com
  • 14. © Cloudera, Inc. All rights reserved. 14© Cloudera, Inc. All rights reserved. APACHE RANGER: ROW-FILTER, COLUMN-MASKING POLICIES
  • 15. © Cloudera, Inc. All rights reserved. 15© Cloudera, Inc. All rights reserved. HYBRID DEPLOYMENT: OPTION #1 – PROS AND CONS • Advantages • Simple to implement • Fine grained policies enforced on premise using Filtering, Redaction and Transformation • Use cloud security policy for coarse grain policies • Make data accessible to non-Ranger supported services like AWS Redshift, AWS Athena, SageMaker, etc. • Limitation • Not real-time • If policies changes, then data need to be recopied to cloud • Need to manage policies on both the sides
  • 16. © Cloudera, Inc. All rights reserved. 16© Cloudera, Inc. All rights reserved. HYBRID DEPLOYMENT: OPTION #2 - CENTRALIZED SECURITY On Premise HDFS Hive Kafka Spark Hive Ranger HDInsight HiveSpark EMR Ranger Ranger DB Presto Security Admins Data Stewards
  • 17. © Cloudera, Inc. All rights reserved. 17© Cloudera, Inc. All rights reserved. HYBRID DEPLOYMENT: OPTION #2 • Common Ranger Admin or Ranger Database for all environments • Single Ranger to manage the policies for all environments • If you are using the same name for resources, e.g. Database, Table and Column name, then a same policy would be used by all the environments • Tag-based policies can be used to authorize access to cloud-specific data as well • Use new Ranger features under development to support central policy management • Security Zone • Scoped Policy • Roles in Ranger
  • 18. © Cloudera, Inc. All rights reserved. 18© Cloudera, Inc. All rights reserved. HYBRID DEPLOYMENT: OPTION #2 – PROS AND CONS • Advantages • Centrally Manage security policies for all environments • Policy changes applied in real-time in all environments • Leverage Tag Based policies for consistent behavior • Increasing support for Ranger by 3rd party vendors. Privacera, StarBurst, Dremio, Microsoft, EMC Isilon, etc. • Limitation • Need reliable and secure network connectivity between premise and cloud (site to site VPN) • All cloud components might be not supported by Open Source Ranger. • Ranger integration for cloud environment is not supported by the community and will require additional setup in the cloud services/deployments
  • 19. © Cloudera, Inc. All rights reserved. 19© Cloudera, Inc. All rights reserved. PRIVACERA EXTENSION TO APACHE RANGER
  • 20. © Cloudera, Inc. All rights reserved. 20© Cloudera, Inc. All rights reserved. DEMO
  • 21. © Cloudera, Inc. All rights reserved. 21© Cloudera, Inc. All rights reserved. SECURITY ZONES
  • 22. © Cloudera, Inc. All rights reserved. 22© Cloudera, Inc. All rights reserved. APACHE RANGER: SECURITY ZONES - INTRODUCTION • Partition resources for easier administration of security policies • Policies in a zone are applied only for resources included in the zone. For example: • a landing zone policy for db=* applies only for the resources of landing zone. It will not impact other resources, like db=marketing • Policy administration for each zone can be delegated to specific users/groups Zone HDFS Hive HBase Kafka landing /landing/ db=*landing staging /staging/ db=*staging table=*staging marketing /marketing db=marketing table=marketing topic=mktg_campaign
  • 23. © Cloudera, Inc. All rights reserved. 23© Cloudera, Inc. All rights reserved. APACHE RANGER: SECURITY ZONES - INTRODUCTION • Audit log includes zone name, allows to quickly filter accesses to resources of a zone • REST API for Security Zone administration • Example use cases: • ‘on-prem’ zone for resources that should only be accessible from on-prem clusters • ‘test-data’ zone for resources that can be used for test purposes by wider set of users/groups, without impacting production data
  • 24. © Cloudera, Inc. All rights reserved. 24© Cloudera, Inc. All rights reserved. APACHE RANGER: SECURITY ZONES - ADMINISTRATION
  • 25. © Cloudera, Inc. All rights reserved. 25© Cloudera, Inc. All rights reserved. APACHE RANGER: SECURITY ZONES - ADMINISTRATION
  • 26. © Cloudera, Inc. All rights reserved. 26© Cloudera, Inc. All rights reserved. APACHE RANGER: SECURITY ZONES - POLICY ADMINISTRATION • Users see only zones in which they have admin privileges • Zone support extends to access, data-masking, row-filter and tag-based policies
  • 27. © Cloudera, Inc. All rights reserved. 27© Cloudera, Inc. All rights reserved. APACHE RANGER: SECURITY ZONES – AUDIT LOGS • Shows zone of the accessed resource • Audits can be filtered by zone • Only policies in zone of the accessed resource are used to authorize
  • 28. © Cloudera, Inc. All rights reserved. 28© Cloudera, Inc. All rights reserved. ROLE BASED ACCESS CONTROL
  • 29. © Cloudera, Inc. All rights reserved. 29© Cloudera, Inc. All rights reserved. APACHE RANGER: ROLE BASED ACCESS CONTROL - INTRODUCTION • Ranger policy model extended to support roles • RBAC is widely used in enterprise applications & cloud environments • Roles can be used in • resource-based authorization policies • tag-based authorization policies • data-masking policies • row-filtering policies • Role management REST API
  • 30. © Cloudera, Inc. All rights reserved. 30© Cloudera, Inc. All rights reserved. APACHE RANGER: ROLE BASED ACCESS CONTROL – ROLE ADMIN
  • 31. © Cloudera, Inc. All rights reserved. 31© Cloudera, Inc. All rights reserved. APACHE RANGER: ROLE BASED ACCESS CONTROL - POLICY
  • 32. © Cloudera, Inc. All rights reserved. 32© Cloudera, Inc. All rights reserved. CONDITIONS AT POLICY SCOPE
  • 33. © Cloudera, Inc. All rights reserved. 33© Cloudera, Inc. All rights reserved. APACHE RANGER: CONDITIONS AT POLICY SCOPE - INTRODUCTION • Conditions can now be set at policy scope, in addition to policy-item scope • Simplifies use of conditions in policies • Example use cases: • Policies specific to access cluster i.e. on-prem, cloud • Multiple policies for a given tag, for different tag-attribute values i.e. PII type=email, PII: type=ccn
  • 34. © Cloudera, Inc. All rights reserved. 34© Cloudera, Inc. All rights reserved. APACHE RANGER: CONDITIONS AT POLICY SCOPE - SAMPLE Access cluster type: cloud
  • 35. © Cloudera, Inc. All rights reserved. 35© Cloudera, Inc. All rights reserved. APACHE RANGER: CONDITIONS AT POLICY SCOPE - SAMPLE tagAttr.type == ‘ccn’ tagAttr.type == ‘email’
  • 36. © Cloudera, Inc. All rights reserved. THANK YOU