SlideShare une entreprise Scribd logo
1  sur  20
1 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Apache Atlas: Governance for your data
Madhan Neethiraj
Director - Engineering, Atlas PMC
Suma Shivaprasad
Staff Engineer, Atlas PMC
2 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Disclaimer
 This document may contain product features and technology directions that are under development, may be
under development in the future or may ultimately not be developed.
 Project capabilities are based on information that is publicly available within the Apache Software
Foundation project websites ("Apache"). Progress of the project capabilities can be tracked from inception
to release through Apache, however, technical feasibility, market demand, user feedback and the
overarching Apache Software Foundation community development process can all effect timing and final
delivery.
 This document’s description of these features and technology directions does not represent a contractual
commitment, promise or obligation from Hortonworks to deliver these features in any generally available
product.
 Product features and technology directions are subject to change, and must not be included in contracts,
purchase orders, or sales agreements of any kind.
 Since this document contains an outline of general product development plans, customers should not rely
upon it when making purchasing decisions.
3 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Agenda
• Introduction
• Apache Atlas
• Overview
• Data Provenance/Lineage
• Classification
• Metadata Catalog Search
• Architecture
• Demo
• Roadmap
• Q & A
4 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Apache Atlas: Introduction
Provides metadata-driven core foundational governance services for Hadoop and
enterprise data ecosystem
Data Lineage/Provenance
• Captures data lineage across components
Data Classification
• Supports classification of data assets using tags – PII, PHI, PCI, EXPIRES_ON, CLAIMS, LIFE_INSURANCE
Metadata Catalog Search
• Free text search on metadata
• Advanced search using DSL
Integrations
• OOB real-time metadata and lineage ingestion with Hive, Sqoop, Storm/Kafka
• APIs for custom metadata ingestion
• Apache Ranger integration for classification based security
Metadata Repository
• Flexible metamodel to capture technical, business, operational metadata
• Out-of-box models for Hive, Storm, Sqoop, HDFS, Kafka, HBase
• APIs to register custom models
5 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Background: DGI Community becomes Apache Atlas
May
2015
Apache
Atlas
Incubation
DGI group
Kickoff
Dec
2014
Apr
2017
HDP 2.6/
Apache 0.8
Release
Global Financial
Company
* DGI: Data Governance Initiative
Aug
2016
HDP 2.5/
Apache 0.7
Foundation
Release
Apache 0.8/HDP 2.6
• Simplified Search UI
• Simplified APIs
• Classification-based security for HDFS,
Kafka, HBase
• Knox SSO
• Performance/scalability improvements
Apache 0.7.1/HDP 2.5.3
• High availability support
• LDAP Authentication/Authorization
• Classification based security for Hive
• UI Redesign
• #Committers – 35
• Code contributors from
- Hortonworks, IBM, Aetna,
Merck, Target
6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Lineage
• Where does this data originate from (source/provenance)?
• Upstream path: Path through all data assets and processes leading up to current data asset
Impact
• How is this data being used ?
• What other data assets (derivative/dependent) does this impact?
• Downstream path: Path through all data assets and processes leading out of current data asset
Used for forensics
• Impact analysis
• Auditing and Compliance
Apache Atlas : Lineage
7 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Apache Atlas: Lineage and Impact
8 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Apache Atlas: Connectors and Ecosystem
Custom
Integration
PartnerPartner
Apache Atlas
RDBMS
Apache
Kafka
9 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Apache Atlas: Classification
• Categorize and curate data assets for easier discovery
• Associate context with data assets – Governance, Security, Business, …
GOVERNANCE
SECURITY
BUSINESS
10 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Apache Atlas Classification : usecase – REFERENCE_DATA
Security policy enforcement for denying updates on immutable data
assets
• REFERENCE_DATA classification associated with immutable hive_table eu_countries
• Apache Ranger policies block updates on the table for all users except admins
11 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Apache Atlas Classification : usecase – access expiry
Data expiration
• EXPIRES_ON classification with attribute expiry_date
• tax_2009 table tagged with EXPIRES_ON(expiry_date=2016/12/31)
• tax_2010 table tagged with EXPIRES_ON(expiry_date=2017/12/31)
• Apache Ranger policies use the attribute to block access after expiry date
12 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Apache Atlas Classification: usecase – attribute based authorization
Data quality
• Deny access to analysts group based on data quality threshold
13 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Apache Atlas Classification: usecase – cross component
Classification based security on cross-component data assets
14 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Apache Atlas Classification: usecase - hierarchy
Security policy enforcement based on classification hierarchy
• Data assets classified as PII will be denied for all contractors
• Data assets classified as FinancePII will be denied for anyone not in
Finance group
• Data assets classified as VendorPII will be denied for anyone not in
PartnerManager group
• ..hence contractors will be denied access to data assets classified as PII,
FinancePII, and VendorPII
PII
Finance
PII
Vendor
PII
Deny: contractors
Deny: public
Except: Finance
Deny: public
Except: PartnerManager
15 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Metadata Catalog Search : Free Text
Search for a hive_table classified as ‘PII’ and name starting with ‘emp’
Filter by
Data Asset type
Filter by
Classification
Search text
Wildcards: emp*, *dept*
Logical expressions: emp* AND *dept*
16 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Metadata Catalog Search : Advanced
Filter by
Data asset type
Search for a hive_table named ‘employees’ and owner ‘hive’
DSL search with SQL like syntax
Select columns from impressions table in raw database
hive_column where table.name=‘impressions’ and table.db.name = ‘raw’
DSL query string
Integration
Core
Apps
Type System
Graph Abstraction/Engine
API
<HTTP/REST>
Titan
Metadata Store
<HBase>
Index Store
<Solr>
UI
Business Glossary
(Roadmap)
Metadata Sources
Messaging
<Kafka>
Hive Sqoop Storm Custom Ranger Tag Based
Policies
Ingest / Export Search
Apache Atlas: Architecture
18 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Business Taxonomy/Glossary
• Vocabulary for business users
• Categorized and curated by data stewards
• Association with physical assets like database, tables for effective data classification
• Discover data assets more naturally with glossary terms
Connectors
• Spark, NiFi, HBase
Hive Column-Level Lineage
Search/Filtering on Lineage
Export/Import of Atlas lineage and metadata
Apache Atlas : Roadmap
19 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Questions
20 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
References
• Apache Atlas
• http://atlas.apache.org
• http://hortonworks.com/apache/atlas
• Apache Atlas community
• https://community.hortonworks.com/spaces/64/governance-lifecycle-
track.html?topics=Atlas&type=question
• Apache Ranger
• http://ranger.apache.org
• http://hortonworks.com/apache/ranger
• https://cwiki.apache.org/confluence/display/RANGER/Tag+Based+Policies

Contenu connexe

Tendances

Practical introduction to hadoop
Practical introduction to hadoopPractical introduction to hadoop
Practical introduction to hadoopinside-BigData.com
 
Designing the Next Generation of Data Pipelines at Zillow with Apache Spark
Designing the Next Generation of Data Pipelines at Zillow with Apache SparkDesigning the Next Generation of Data Pipelines at Zillow with Apache Spark
Designing the Next Generation of Data Pipelines at Zillow with Apache SparkDatabricks
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)James Serra
 
Data Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshData Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshJeffrey T. Pollock
 
Databricks for Dummies
Databricks for DummiesDatabricks for Dummies
Databricks for DummiesRodney Joyce
 
Apache Hadoop Security - Ranger
Apache Hadoop Security - RangerApache Hadoop Security - Ranger
Apache Hadoop Security - RangerIsheeta Sanghi
 
Azure Data Lake Intro (SQLBits 2016)
Azure Data Lake Intro (SQLBits 2016)Azure Data Lake Intro (SQLBits 2016)
Azure Data Lake Intro (SQLBits 2016)Michael Rys
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDatabricks
 
Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action MapR Technologies
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...DataScienceConferenc1
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Databricks
 
Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Databricks
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouseJames Serra
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsKhalid Salama
 

Tendances (20)

Practical introduction to hadoop
Practical introduction to hadoopPractical introduction to hadoop
Practical introduction to hadoop
 
Designing the Next Generation of Data Pipelines at Zillow with Apache Spark
Designing the Next Generation of Data Pipelines at Zillow with Apache SparkDesigning the Next Generation of Data Pipelines at Zillow with Apache Spark
Designing the Next Generation of Data Pipelines at Zillow with Apache Spark
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
Data Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to MeshData Mesh Part 4 Monolith to Mesh
Data Mesh Part 4 Monolith to Mesh
 
Databricks for Dummies
Databricks for DummiesDatabricks for Dummies
Databricks for Dummies
 
Apache Hadoop Security - Ranger
Apache Hadoop Security - RangerApache Hadoop Security - Ranger
Apache Hadoop Security - Ranger
 
Architecting a datalake
Architecting a datalakeArchitecting a datalake
Architecting a datalake
 
Webinar Data Mesh - Part 3
Webinar Data Mesh - Part 3Webinar Data Mesh - Part 3
Webinar Data Mesh - Part 3
 
Azure Data Lake Intro (SQLBits 2016)
Azure Data Lake Intro (SQLBits 2016)Azure Data Lake Intro (SQLBits 2016)
Azure Data Lake Intro (SQLBits 2016)
 
DW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptxDW Migration Webinar-March 2022.pptx
DW Migration Webinar-March 2022.pptx
 
Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action Data Warehouse Modernization: Accelerating Time-To-Action
Data Warehouse Modernization: Accelerating Time-To-Action
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
 
Data Mesh 101
Data Mesh 101Data Mesh 101
Data Mesh 101
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Introduction to AWS Glue
Introduction to AWS GlueIntroduction to AWS Glue
Introduction to AWS Glue
 
Introduction to HBase
Introduction to HBaseIntroduction to HBase
Introduction to HBase
 
Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0Achieving Lakehouse Models with Spark 3.0
Achieving Lakehouse Models with Spark 3.0
 
Building a modern data warehouse
Building a modern data warehouseBuilding a modern data warehouse
Building a modern data warehouse
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Building the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake AnalyticsBuilding the Data Lake with Azure Data Factory and Data Lake Analytics
Building the Data Lake with Azure Data Factory and Data Lake Analytics
 

Similaire à Apache Atlas: Governance for your Data

Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?DataWorks Summit/Hadoop Summit
 
Classification based security in Hadoop
Classification based security in HadoopClassification based security in Hadoop
Classification based security in HadoopMadhan Neethiraj
 
Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...
Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...
Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...DataWorks Summit/Hadoop Summit
 
Atlas and ranger epam meetup
Atlas and ranger epam meetupAtlas and ranger epam meetup
Atlas and ranger epam meetupAlex Zeltov
 
What the #$* is a Business Catalog and why you need it
What the #$* is a Business Catalog and why you need it What the #$* is a Business Catalog and why you need it
What the #$* is a Business Catalog and why you need it DataWorks Summit/Hadoop Summit
 
Implementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceImplementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceHortonworks
 
Data Governance Initiative
Data Governance InitiativeData Governance Initiative
Data Governance InitiativeDataWorks Summit
 
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...Hortonworks
 
Data Governance - Atlas 7.12.2015
Data Governance - Atlas 7.12.2015Data Governance - Atlas 7.12.2015
Data Governance - Atlas 7.12.2015Hortonworks
 
Extend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & Trifacta
Extend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & TrifactaExtend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & Trifacta
Extend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & TrifactaDataWorks Summit/Hadoop Summit
 
Building a data-driven authorization framework
Building a data-driven authorization frameworkBuilding a data-driven authorization framework
Building a data-driven authorization frameworkDataWorks Summit
 
Dynamic Column Masking and Row-Level Filtering in HDP
Dynamic Column Masking and Row-Level Filtering in HDPDynamic Column Masking and Row-Level Filtering in HDP
Dynamic Column Masking and Row-Level Filtering in HDPHortonworks
 
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015 Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015 Seetharam Venkatesh
 
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache FalconDriving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache FalconDataWorks Summit
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopHortonworks
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Innovative Management Services
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...DataWorks Summit
 

Similaire à Apache Atlas: Governance for your Data (20)

Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?
 
Classification based security in Hadoop
Classification based security in HadoopClassification based security in Hadoop
Classification based security in Hadoop
 
Enterprise Data Classification and Provenance
Enterprise Data Classification and ProvenanceEnterprise Data Classification and Provenance
Enterprise Data Classification and Provenance
 
Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...
Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...
Implementing the Business Catalog in the Modern Enterprise: Bridging Traditio...
 
HDP Next: Governance
HDP Next: GovernanceHDP Next: Governance
HDP Next: Governance
 
Atlas and ranger epam meetup
Atlas and ranger epam meetupAtlas and ranger epam meetup
Atlas and ranger epam meetup
 
What the #$* is a Business Catalog and why you need it
What the #$* is a Business Catalog and why you need it What the #$* is a Business Catalog and why you need it
What the #$* is a Business Catalog and why you need it
 
Implementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data GovernanceImplementing a Data Lake with Enterprise Grade Data Governance
Implementing a Data Lake with Enterprise Grade Data Governance
 
Data Governance Initiative
Data Governance InitiativeData Governance Initiative
Data Governance Initiative
 
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
 
Data Governance - Atlas 7.12.2015
Data Governance - Atlas 7.12.2015Data Governance - Atlas 7.12.2015
Data Governance - Atlas 7.12.2015
 
Extend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & Trifacta
Extend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & TrifactaExtend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & Trifacta
Extend Governance in Hadoop with Atlas Ecosystem: Waterline, Attivo & Trifacta
 
Building a data-driven authorization framework
Building a data-driven authorization frameworkBuilding a data-driven authorization framework
Building a data-driven authorization framework
 
Dynamic Column Masking and Row-Level Filtering in HDP
Dynamic Column Masking and Row-Level Filtering in HDPDynamic Column Masking and Row-Level Filtering in HDP
Dynamic Column Masking and Row-Level Filtering in HDP
 
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015 Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
Data Governance in Apache Falcon - Hadoop Summit Brussels 2015
 
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache FalconDriving Enterprise Data Governance for Big Data Systems through Apache Falcon
Driving Enterprise Data Governance for Big Data Systems through Apache Falcon
 
An Approach for Multi-Tenancy Through Apache Knox
An Approach for Multi-Tenancy Through Apache KnoxAn Approach for Multi-Tenancy Through Apache Knox
An Approach for Multi-Tenancy Through Apache Knox
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
 
Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...Treat your enterprise data lake indigestion: Enterprise ready security and go...
Treat your enterprise data lake indigestion: Enterprise ready security and go...
 

Plus de DataWorks Summit/Hadoop Summit

Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerDataWorks Summit/Hadoop Summit
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformDataWorks Summit/Hadoop Summit
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDataWorks Summit/Hadoop Summit
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...DataWorks Summit/Hadoop Summit
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...DataWorks Summit/Hadoop Summit
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLDataWorks Summit/Hadoop Summit
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)DataWorks Summit/Hadoop Summit
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...DataWorks Summit/Hadoop Summit
 

Plus de DataWorks Summit/Hadoop Summit (20)

Running Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in ProductionRunning Apache Spark & Apache Zeppelin in Production
Running Apache Spark & Apache Zeppelin in Production
 
State of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache ZeppelinState of Security: Apache Spark & Apache Zeppelin
State of Security: Apache Spark & Apache Zeppelin
 
Unleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache RangerUnleashing the Power of Apache Atlas with Apache Ranger
Unleashing the Power of Apache Atlas with Apache Ranger
 
Enabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science PlatformEnabling Digital Diagnostics with a Data Science Platform
Enabling Digital Diagnostics with a Data Science Platform
 
Revolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and ZeppelinRevolutionize Text Mining with Spark and Zeppelin
Revolutionize Text Mining with Spark and Zeppelin
 
Double Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSenseDouble Your Hadoop Performance with Hortonworks SmartSense
Double Your Hadoop Performance with Hortonworks SmartSense
 
Hadoop Crash Course
Hadoop Crash CourseHadoop Crash Course
Hadoop Crash Course
 
Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Apache Spark Crash Course
Apache Spark Crash CourseApache Spark Crash Course
Apache Spark Crash Course
 
Dataflow with Apache NiFi
Dataflow with Apache NiFiDataflow with Apache NiFi
Dataflow with Apache NiFi
 
Schema Registry - Set you Data Free
Schema Registry - Set you Data FreeSchema Registry - Set you Data Free
Schema Registry - Set you Data Free
 
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
Building a Large-Scale, Adaptive Recommendation Engine with Apache Flink and ...
 
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
Real-Time Anomaly Detection using LSTM Auto-Encoders with Deep Learning4J on ...
 
Mool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and MLMool - Automated Log Analysis using Data Science and ML
Mool - Automated Log Analysis using Data Science and ML
 
How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient How Hadoop Makes the Natixis Pack More Efficient
How Hadoop Makes the Natixis Pack More Efficient
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)The Challenge of Driving Business Value from the Analytics of Things (AOT)
The Challenge of Driving Business Value from the Analytics of Things (AOT)
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
 
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
From Regulatory Process Verification to Predictive Maintenance and Beyond wit...
 
Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop Backup and Disaster Recovery in Hadoop
Backup and Disaster Recovery in Hadoop
 

Dernier

Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 

Dernier (20)

Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 

Apache Atlas: Governance for your Data

  • 1. 1 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Atlas: Governance for your data Madhan Neethiraj Director - Engineering, Atlas PMC Suma Shivaprasad Staff Engineer, Atlas PMC
  • 2. 2 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Disclaimer  This document may contain product features and technology directions that are under development, may be under development in the future or may ultimately not be developed.  Project capabilities are based on information that is publicly available within the Apache Software Foundation project websites ("Apache"). Progress of the project capabilities can be tracked from inception to release through Apache, however, technical feasibility, market demand, user feedback and the overarching Apache Software Foundation community development process can all effect timing and final delivery.  This document’s description of these features and technology directions does not represent a contractual commitment, promise or obligation from Hortonworks to deliver these features in any generally available product.  Product features and technology directions are subject to change, and must not be included in contracts, purchase orders, or sales agreements of any kind.  Since this document contains an outline of general product development plans, customers should not rely upon it when making purchasing decisions.
  • 3. 3 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Agenda • Introduction • Apache Atlas • Overview • Data Provenance/Lineage • Classification • Metadata Catalog Search • Architecture • Demo • Roadmap • Q & A
  • 4. 4 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Atlas: Introduction Provides metadata-driven core foundational governance services for Hadoop and enterprise data ecosystem Data Lineage/Provenance • Captures data lineage across components Data Classification • Supports classification of data assets using tags – PII, PHI, PCI, EXPIRES_ON, CLAIMS, LIFE_INSURANCE Metadata Catalog Search • Free text search on metadata • Advanced search using DSL Integrations • OOB real-time metadata and lineage ingestion with Hive, Sqoop, Storm/Kafka • APIs for custom metadata ingestion • Apache Ranger integration for classification based security Metadata Repository • Flexible metamodel to capture technical, business, operational metadata • Out-of-box models for Hive, Storm, Sqoop, HDFS, Kafka, HBase • APIs to register custom models
  • 5. 5 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Background: DGI Community becomes Apache Atlas May 2015 Apache Atlas Incubation DGI group Kickoff Dec 2014 Apr 2017 HDP 2.6/ Apache 0.8 Release Global Financial Company * DGI: Data Governance Initiative Aug 2016 HDP 2.5/ Apache 0.7 Foundation Release Apache 0.8/HDP 2.6 • Simplified Search UI • Simplified APIs • Classification-based security for HDFS, Kafka, HBase • Knox SSO • Performance/scalability improvements Apache 0.7.1/HDP 2.5.3 • High availability support • LDAP Authentication/Authorization • Classification based security for Hive • UI Redesign • #Committers – 35 • Code contributors from - Hortonworks, IBM, Aetna, Merck, Target
  • 6. 6 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Lineage • Where does this data originate from (source/provenance)? • Upstream path: Path through all data assets and processes leading up to current data asset Impact • How is this data being used ? • What other data assets (derivative/dependent) does this impact? • Downstream path: Path through all data assets and processes leading out of current data asset Used for forensics • Impact analysis • Auditing and Compliance Apache Atlas : Lineage
  • 7. 7 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Atlas: Lineage and Impact
  • 8. 8 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Atlas: Connectors and Ecosystem Custom Integration PartnerPartner Apache Atlas RDBMS Apache Kafka
  • 9. 9 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Atlas: Classification • Categorize and curate data assets for easier discovery • Associate context with data assets – Governance, Security, Business, … GOVERNANCE SECURITY BUSINESS
  • 10. 10 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Atlas Classification : usecase – REFERENCE_DATA Security policy enforcement for denying updates on immutable data assets • REFERENCE_DATA classification associated with immutable hive_table eu_countries • Apache Ranger policies block updates on the table for all users except admins
  • 11. 11 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Atlas Classification : usecase – access expiry Data expiration • EXPIRES_ON classification with attribute expiry_date • tax_2009 table tagged with EXPIRES_ON(expiry_date=2016/12/31) • tax_2010 table tagged with EXPIRES_ON(expiry_date=2017/12/31) • Apache Ranger policies use the attribute to block access after expiry date
  • 12. 12 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Atlas Classification: usecase – attribute based authorization Data quality • Deny access to analysts group based on data quality threshold
  • 13. 13 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Atlas Classification: usecase – cross component Classification based security on cross-component data assets
  • 14. 14 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Atlas Classification: usecase - hierarchy Security policy enforcement based on classification hierarchy • Data assets classified as PII will be denied for all contractors • Data assets classified as FinancePII will be denied for anyone not in Finance group • Data assets classified as VendorPII will be denied for anyone not in PartnerManager group • ..hence contractors will be denied access to data assets classified as PII, FinancePII, and VendorPII PII Finance PII Vendor PII Deny: contractors Deny: public Except: Finance Deny: public Except: PartnerManager
  • 15. 15 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Metadata Catalog Search : Free Text Search for a hive_table classified as ‘PII’ and name starting with ‘emp’ Filter by Data Asset type Filter by Classification Search text Wildcards: emp*, *dept* Logical expressions: emp* AND *dept*
  • 16. 16 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Metadata Catalog Search : Advanced Filter by Data asset type Search for a hive_table named ‘employees’ and owner ‘hive’ DSL search with SQL like syntax Select columns from impressions table in raw database hive_column where table.name=‘impressions’ and table.db.name = ‘raw’ DSL query string
  • 17. Integration Core Apps Type System Graph Abstraction/Engine API <HTTP/REST> Titan Metadata Store <HBase> Index Store <Solr> UI Business Glossary (Roadmap) Metadata Sources Messaging <Kafka> Hive Sqoop Storm Custom Ranger Tag Based Policies Ingest / Export Search Apache Atlas: Architecture
  • 18. 18 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Business Taxonomy/Glossary • Vocabulary for business users • Categorized and curated by data stewards • Association with physical assets like database, tables for effective data classification • Discover data assets more naturally with glossary terms Connectors • Spark, NiFi, HBase Hive Column-Level Lineage Search/Filtering on Lineage Export/Import of Atlas lineage and metadata Apache Atlas : Roadmap
  • 19. 19 © Hortonworks Inc. 2011 – 2016. All Rights Reserved Questions
  • 20. 20 © Hortonworks Inc. 2011 – 2016. All Rights Reserved References • Apache Atlas • http://atlas.apache.org • http://hortonworks.com/apache/atlas • Apache Atlas community • https://community.hortonworks.com/spaces/64/governance-lifecycle- track.html?topics=Atlas&type=question • Apache Ranger • http://ranger.apache.org • http://hortonworks.com/apache/ranger • https://cwiki.apache.org/confluence/display/RANGER/Tag+Based+Policies

Notes de l'éditeur

  1. We have a lot to cover, want to apologize in advance
  2. How fast ? 7 months !
  3. Show – clearly identify customer metadata. Change Add customer classification example – Aetna – make the use case story have continuity. Use DX procedures to diagnosis ** bring meta from external systems into hadoop – keep it together