Apache Hadoop is being adopted across industries for its ability to store and process many new types of data in a modern data architecture. This “Any Data” architecture presents a challenge, however: organizations must reconcile data management realities as they bring existing and new data from disparate platforms under management.
Apache Atlas proposes to provide governance capabilities in Hadoop that use both prescriptive and forensic models, enriched by business taxonomical metadata. It is designed to exchange metadata with other tools and processes within and outside of the Hadoop stack, thereby enabling platform-agnostic governance.
4. Use Cases
● Financial Reporting: Chain of custody, Lineage narratives
● Healthcare: 30-day measures reporting
● Retail: Point of sale analysis, Price optimization
● Telco: Device log management, Correlation, Analysis & Mitigation
14. Atlas: Data Lifecycle Management
Focus on:
● Provenance
● Replication
● Data retention/eviction
● Late data handling
● Automation
Tech: Falcon
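The retention/eviction and late-data bullets above can be illustrated with a small sketch. It is not Falcon's API: `DatasetInstance`, `is_late`, and `select_for_eviction` are hypothetical names, and the sketch assumes each dataset instance carries a nominal time (the period the data describes) and an arrival time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DatasetInstance:
    """One dated instance (partition) of a dataset; a hypothetical model."""
    name: str
    nominal_time: datetime  # the period the data describes
    arrived_at: datetime    # when the data actually landed


def is_late(instance, cutoff):
    """Data counts as late when it lands more than `cutoff` after its nominal time."""
    return instance.arrived_at - instance.nominal_time > cutoff


def select_for_eviction(instances, retention, now):
    """Return the instances whose nominal time falls outside the retention window."""
    return [i for i in instances if now - i.nominal_time > retention]


now = datetime(2015, 6, 30)
old = DatasetInstance("clicks", datetime(2015, 1, 1), datetime(2015, 1, 1, 1))
recent = DatasetInstance("clicks", datetime(2015, 6, 29), datetime(2015, 6, 29, 8))

to_evict = select_for_eviction([old, recent], timedelta(days=90), now)
late = [i for i in (old, recent) if is_late(i, timedelta(hours=6))]
```

Keying both rules on the nominal time rather than the arrival time means a late arrival does not extend how long an instance is retained; that is a design assumption of this sketch.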
[Architecture diagram: Apache Atlas]
Components: Knowledge Store, Audit Store, Models, Type-System, Policy Rules, Taxonomies, Policy Engine, Data Lifecycle Management, Security, REST API
Services: Search, Lineage, Exchange
Domain models: Healthcare (HIPAA, HL7), Financial (SOX, Dodd-Frank), Energy (PPDM), Retail (PCI, PII), Custom/Other (CWM)
15. Atlas: Audit Store
Historical repository
● Security & Operational
● Indexed
● Searchable (DSL)
Tech:
● YARN ATS, HBase, Hive
● Solr, ElasticSearch
○ Pluggable
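To make "indexed" and "searchable" concrete, here is a minimal in-memory sketch of an audit store backed by an inverted index. It does not reproduce the Atlas search DSL or any Solr/ElasticSearch call: the `AuditStore` class and its field names are hypothetical, and field values are assumed to be hashable.

```python
from collections import defaultdict


class AuditStore:
    """Toy in-memory audit store with an inverted index; hypothetical."""

    def __init__(self):
        self._events = []
        self._index = defaultdict(set)  # (field, value) -> positions in _events

    def record(self, **fields):
        """Append an event and index every field/value pair."""
        position = len(self._events)
        self._events.append(dict(fields))
        for key, value in fields.items():
            self._index[(key, value)].add(position)
        return position

    def search(self, **criteria):
        """Return the events matching all criteria, in insertion order."""
        if not criteria:
            return list(self._events)
        matches = None
        for item in criteria.items():
            positions = self._index.get(item, set())
            matches = positions if matches is None else matches & positions
        return [self._events[p] for p in sorted(matches)]


store = AuditStore()
store.record(user="alice", action="read", kind="security")
store.record(user="bob", action="drop", kind="operational")
store.record(user="alice", action="drop", kind="security")

drops_by_alice = store.search(user="alice", action="drop")
```

A pluggable backend would keep the `record`/`search` surface and swap the dictionary index for an external search engine.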
16. Atlas: Policy Engine
Metadata driven
Rationalized at runtime
Geo/Time based rules
Prohibitions
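One way to read the bullets above: rules are attached to metadata tags rather than to individual datasets, they are resolved at request time, and a matching prohibition overrides any permit. The sketch below assumes that reading. `Rule` and `evaluate` are hypothetical names, not part of Atlas, and the region codes are placeholders.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class Rule:
    """A hypothetical policy rule keyed on a metadata tag."""
    tag: str
    effect: str                       # "permit" or "prohibit"
    regions: frozenset = frozenset()  # empty means any region
    start: time = time.min            # daily window, inclusive
    end: time = time.max

    def applies(self, tags, region, at):
        in_region = not self.regions or region in self.regions
        return self.tag in tags and in_region and self.start <= at <= self.end


def evaluate(rules, tags, region, at):
    """Resolve the rules at request time; any matching prohibition wins."""
    effects = {r.effect for r in rules if r.applies(tags, region, at)}
    if "prohibit" in effects:
        return "prohibit"
    return "permit" if "permit" in effects else "no-decision"


rules = [
    Rule("PII", "permit", start=time(9), end=time(17)),      # time-based rule
    Rule("PII", "prohibit", regions=frozenset({"US"})),      # geo-based prohibition
]

decision = evaluate(rules, {"PII"}, "EU", time(10))
```

Because the rules reference tags, retagging a dataset changes its effective policy without editing any rule.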
17. Atlas: Security
Enforces policies
Metadata driven
ABAC (not simple RBAC)
● Attribute-based access control
Tech: Ranger
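The difference between RBAC and ABAC can be sketched in a few lines. This is not Ranger's policy model or API: the function names, attribute keys, and the example policy are all hypothetical, chosen only to show that an ABAC decision can depend on attributes of the resource (such as a metadata tag) as well as of the user.

```python
def rbac_allows(user, role_grants, action):
    """RBAC: the decision depends only on the roles the user holds."""
    return any(action in role_grants.get(role, ()) for role in user["roles"])


def abac_allows(user, resource, action, policies):
    """ABAC: each policy is a predicate over user, resource, and action attributes."""
    return any(policy(user, resource, action) for policy in policies)


def clinicians_read_own_department_pii(user, resource, action):
    """Hypothetical policy: clinicians may read PII-tagged data of their own department."""
    return (
        action == "read"
        and "PII" in resource["tags"]
        and user["job"] == "clinician"
        and user["department"] == resource["department"]
    )


role_grants = {"analyst": {"read"}}
user = {"roles": ["analyst"], "job": "clinician", "department": "cardiology"}
own_records = {"tags": {"PII"}, "department": "cardiology"}
other_records = {"tags": {"PII"}, "department": "oncology"}
policies = [clinicians_read_own_department_pii]

rbac_decision = rbac_allows(user, role_grants, "read")
abac_own = abac_allows(user, own_records, "read", policies)
abac_other = abac_allows(user, other_records, "read", policies)
```

The RBAC check cannot tell the two record sets apart, since it never looks at the resource; the ABAC check can, because the department and tag attributes take part in the decision.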
18. Atlas: RESTful Interface
API everything
19. Atlas: Metadata Exchange
[Diagram: metadata exchanged between Apache Atlas and other tools and processes]