SlideShare une entreprise Scribd logo
1  sur  25
1 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Bridle Your Flying Islands And Castles In The Sky:
Built-in Governance And Security For The Cloud
DataWorks Summit San Jose
June 2017
2 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Presenters
Jeff Sposetti
Senior Director of Product Management, Cloud
Hortonworks Data Cloud, Cloudbreak
Srikanth Venkat
Senior Director of Product Management, Security & Governance
Apache Ranger, Apache Atlas, Apache Knox
3 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Agenda
 Introduction
 Security & Governance: Apache Ranger, Atlas and Knox
 Bringing It Together: Cloud and Data Lake Security
 Demo
 Wrap Up
 Q & A
4 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Background: Ephemeral Workloads + Cloud Storage
CLOUD STORAGE
S3
WORKLOAD CLUSTERS
Durable Ephemeral
5 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Introducing Hortonworks Data Cloud
 Focuses on business agility, rather than
infinite configurability and cluster
management
 Addresses prescriptive, ephemeral use
cases around Apache Spark + Apache Hive
 Pre-tuned and configured for use with
Amazon S3
Learn more:
http://hortonworks.com/products/cloud/aws/
6 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Quick demo…
7 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
CLOUD
DATA LAKE
SECURITY
8 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Security & Governance:
Apache Ranger, Atlas and Knox
9 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Protecting the Elephant in the Castle…..
Kerberos,
Wire Encryption
HDFS Encryption
Apache Ranger
Network Segmentation,
Firewalls
LDAP/AD
Apache Knox
1
0
© Hortonworks Inc. 2011 – 2017. All Rights
Reserved
Apache Knox Proxying Services
★ Provide access to Hadoop via proxying of
HTTP resources
★ Ecosystem APIs and UIs + Hadoop oriented
dispatching for Kerberos + doAs
(impersonation) etc.
Authentication Services
★ REST API access, WebSSO flow for UIs
★ LDAP/AD, Header based PreAuth
★ Kerberos, SAML, OAuth
Client DSL/SDK Services
★ Scripting through DSL
★ Using Knox Shell classes directly as SDK
11 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Apache Ranger
Comprehensive and Extensible Security Model
– Centralized platform to define, administer and manage
security policies across Hadoop components (HDFS, Hive,
HBase, YARN, Kafka, Solr, Storm, Knox, NiFi, Atlas)
– Extensible Architecture with ability to add custom policy
conditions, user context enrichers
Centralized Auditing
– Central audit location for all access requests
– Support multiple destination sources (HDFS, Solr, etc.)
– Real-time visual query interface
Fine-Grained Authorization
for data access control for Database, Table, Column, LDAP
Groups & Specific Users
Advanced Security
• Dynamic Security Policies: Tag (Atlas), Prohibition, Time, and
Location
• Dynamic Column Masking & Row Filtering
OPERATIONS SECURITY
GOVERNANCE
STORAGE
STORAGE
Machine
Learning
Batch
StreamingInteractive
Search
SECURITY
12
© Hortonworks Inc. , and Dataguise Inc. 2011 – 2017. All Rights Reserved
STRUCTURED
Atlas: Metadata & Governance
TRADITIONAL
RDBMS
METADATA
MPP
APPLIANCES
Kafka Storm
Sqoop
Hive
ATLAS
METADATA
Falcon
RANGER
Custom
Partners
Metadata-driven governance services for Hadoop and
enterprise big data ecosystems
Data Lineage/Provenance
 Along the entire data lifecycle with integrated Cross
component lineage
Data Classification
 Supports classification of data assets using tags (e.g. PII,
PHI, PCI etc.) and attributes
Metadata Catalog Search
 Free text search on metadata
 Advanced search using DSL
Integrations
across the Hadoop ecosystem, through a common metadata
store
 Free text search on metadata
 OOtB real-time metadata and lineage ingestion with Hive,
Sqoop, Storm/Kafka
 APIs for custom metadata ingestion
 Apache Ranger integration for classification based security
Key Benefits:
Modern Data Lakes need new ways to
govern because:
• Cost – Traditional staff ratio to data size not possible
• Diversity – Only way to manage velocity of new datasets
• Agility – Quick change based on tags / taxonomy
13
© Hortonworks Inc. , and Dataguise Inc. 2011 – 2017. All Rights Reserved
HDP – Security & Governance
Classification
Prohibition
Time
Location
Policies
PDP
Resource
Cache
Ranger
Manage Access Policies
and Audit Logs
Track Metadata
and Lineage
Atlas Client
Subscribers
to Topic
Gets Metadata
Updates
Atlas
Metastore
Tags
Assets
Entitles
Streams
Pipelines
Feeds
Hive
Tables
HDFS
Files
HBase
Tables
Entities
in Data
Lake
Industry First: Dynamic Tag-based Security Policies
14 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Bringing It Together:
Cloud and Data Lake Services for
Enterprise Ephemeral Workloads
15 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Ephemeral Workloads: Basic -> Advanced -> Enterprise
Basic Ephemeral Advanced Ephemeral Enterprise Ephemeral
Tuned and Optimized
Infrastructure
Simplified, Automated
Operations
S3 Integration
Protected Network Access
Schema - Shared (Hive Metastore) Shared (Hive Metastore)
Authentication Single-user Single-user Multi-User (LDAP/AD)
Authorization - - Security Policies (Ranger)
Audit - - Audit (Ranger)
16 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Ephemeral Workloads for the Cloud
CLOUD STORAGE
S3
WORKLOADS
Durable Ephemeral
Metastore
SCHEMA
Long Running
Security access to workload
clusters via a Protected Gateway
enabled for AuthN and HTTPS.
Define your data schema, security
policies, and metadata catalog
once for your ephemeral and
always-on workloads.
Atlas
CATALOG
Ranger
POLICY
SHARED DATA LAKE SERVICES
17 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Key Components for an Enterprise Ephemeral Deployment
SCHEMA POLICY AUDIT DIRECTORY
WHAT
Provides Hive schema (tables,
views, etc).
WHY
If you have 2+ workloads
accessing the same data, need
to share schema across those
workloads.
HOW
Externalize Hive Metastore
into for schema definition.
WHAT
Defines security policies
around Hive schema.
WHY
If you have 2+ users accessing
the same data, need policies
to be consistently available
and enforced.
HOW
Externalize and share Ranger
across workloads and store
policies external.
WHAT
Audit user access.
WHY
Capture data access activity.
HOW
Externalize and share Ranger
across workloads, leverage
cloud storage for audit data.
GATEWAY
WHAT
Provide single endpoint that
can be protected with SSL and
enabled for authentication to
access to cluster resources.
WHY
Avoid opening many ports,
some potentially w/o
authentication or SSL
protection.
HOW
Deploy a centralized protected
gateway automatically.
WHAT
Users and groups.
WHY
Provide authentication source
for users and authorization
source for groups.
HOW
Leverage external LDAP or
Active Directory.
18 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Architecture of an Enterprise Ephemeral Deployment
Access your cluster
components through the
protected gateway via SSL
on port 443 open on the
controller security group.
CONTROLLER
PROTECTED
GATEWAY
USER ACCESS
Zeppelin
HIVE LLAP / SPARK WORKLOADS
Hive
SHARED DATA LAKE SERVICES
Ranger
POLICY
(RDS)
AUDIT
(S3)
SCHEMA
(RDS)
DIRECTORY
(LDAP/AD)
Spark
Hive
Metasto
re
19 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Hortonworks Data Cloud + Shared Data Lake Services
1
2
3
Register an Authentication Source (i.e. LDAP/AD).
Create a “Shared Data Lake”, specify S3 Bucket & RDS.
When you create a cluster, ”attach” to the Shared Data Lake Services:
• for Multi-User AuthN (LDAP/AD)
• for AuthZ + Audit (Ranger)
• for Schema (Hive Metastore)
PREREQUISITES
• LDAP/AD
• S3 Bucket
• Amazon RDS Instance
20 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Longer demo…
21 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Guidelines for Shared Data Lake Services
 Stay Ephemeral. All of your data and metadata in S3 and RDS respectively, do not create
tables or files in the local HDFS.
 The Hive warehouse is setup to be on S3 for data lakes, create tables in this location
instead of individual S3 buckets, it will make them easier to manage.
 Use Hive “external tables” for tables that are outside this warehouse, typically if the
data is being ingested through some path outside of Hadoop
 Create S3 bucket policies that exactly match usage so that you can spin up clusters with
the least privilege.
22 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Wrap Up
23 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Bringing Big Data + Cloud + Security Together: The Takeaways
 Cloud driving more ephemeral data processing use cases
 Ephemeral workloads leverage cloud storage
 This pattern is driving an architectural approach for “shared data lake services”
 To provide shared security services for Authentication, Authorization and Audit
** Building blocks are Apache Ranger, Apache Atlas and Apache Knox
THINK
WORKLOADS
Deploy “what you need”…
Right tool, right job!
THINK
SHARED
Shared “security policies”…
End-to-end security for workloads!**
THINK
EPHEMERAL
Run “when you need”…
Workloads will come and go!
24 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Learn More
 Try Hortonworks Data Cloud 2.0 Technical Preview
– http://hortonworks.github.io/hdp-aws/
 BREAKOUT SESSIONS
– Wednesday, June 14 @ 5:50p, Don’t Let Spark Burn Your House: Perspectives on Securing Spark
 CRASH COURSE
– Thursday, June 15 @ 3:00p – 6:00p, Apache Spark and Apache Hive processing on the Cloud
 BIRDS OF A FEATHER
– Thursday, June 15 @ 5:00p, Security and Governance
– Thursday, June 15 @ 5:00p, Cloud and Operations
25 © Hortonworks Inc. 2011 – 2017. All Rights Reserved
Thank You
https://hortonworks.com/products/cloud/aws/
https://hortonworks.com/apache/ranger/
https://hortonworks.com/apache/atlas/

Contenu connexe

Tendances

Schema Registry - Set Your Data Free
Schema Registry - Set Your Data FreeSchema Registry - Set Your Data Free
Schema Registry - Set Your Data FreeDataWorks Summit
 
Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?DataWorks Summit/Hadoop Summit
 
Leveraging docker for hadoop build automation and big data stack provisioning
Leveraging docker for hadoop build automation and big data stack provisioningLeveraging docker for hadoop build automation and big data stack provisioning
Leveraging docker for hadoop build automation and big data stack provisioningEvans Ye
 
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise UsersApache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise UsersDataWorks Summit
 
The Unbearable Lightness of Ephemeral Processing
The Unbearable Lightness of Ephemeral ProcessingThe Unbearable Lightness of Ephemeral Processing
The Unbearable Lightness of Ephemeral ProcessingDataWorks Summit
 
The Time Has Come for Big-Data-as-a-Service
The Time Has Come for Big-Data-as-a-ServiceThe Time Has Come for Big-Data-as-a-Service
The Time Has Come for Big-Data-as-a-ServiceBlueData, Inc.
 
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...DataWorks Summit
 
Manage Add-On Services with Apache Ambari
Manage Add-On Services with Apache AmbariManage Add-On Services with Apache Ambari
Manage Add-On Services with Apache AmbariDataWorks Summit
 
Cloudy with a Chance of Hadoop - Real World Considerations
Cloudy with a Chance of Hadoop - Real World ConsiderationsCloudy with a Chance of Hadoop - Real World Considerations
Cloudy with a Chance of Hadoop - Real World ConsiderationsDataWorks Summit/Hadoop Summit
 
HAWQ Meets Hive - Querying Unmanaged Data
HAWQ Meets Hive - Querying Unmanaged DataHAWQ Meets Hive - Querying Unmanaged Data
HAWQ Meets Hive - Querying Unmanaged DataDataWorks Summit
 
Apache Ignite vs Alluxio: Memory Speed Big Data Analytics
Apache Ignite vs Alluxio: Memory Speed Big Data AnalyticsApache Ignite vs Alluxio: Memory Speed Big Data Analytics
Apache Ignite vs Alluxio: Memory Speed Big Data AnalyticsDataWorks Summit
 
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?DataWorks Summit
 
Big SQL: Powerful SQL Optimization - Re-Imagined for open source
Big SQL: Powerful SQL Optimization - Re-Imagined for open sourceBig SQL: Powerful SQL Optimization - Re-Imagined for open source
Big SQL: Powerful SQL Optimization - Re-Imagined for open sourceDataWorks Summit
 
Accelerating Big Data Insights
Accelerating Big Data InsightsAccelerating Big Data Insights
Accelerating Big Data InsightsDataWorks Summit
 
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...DataWorks Summit
 
GeoWave: Open Source Geospatial/Temporal/N-dimensional Indexing for Accumulo,...
GeoWave: Open Source Geospatial/Temporal/N-dimensional Indexing for Accumulo,...GeoWave: Open Source Geospatial/Temporal/N-dimensional Indexing for Accumulo,...
GeoWave: Open Source Geospatial/Temporal/N-dimensional Indexing for Accumulo,...DataWorks Summit
 
Apache hive essentials
Apache hive essentialsApache hive essentials
Apache hive essentialsSteve Tran
 
Hadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureHadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureUwe Printz
 
Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache Ranger ...
Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache Ranger ...Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache Ranger ...
Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache Ranger ...DataWorks Summit
 

Tendances (20)

An Approach for Multi-Tenancy Through Apache Knox
An Approach for Multi-Tenancy Through Apache KnoxAn Approach for Multi-Tenancy Through Apache Knox
An Approach for Multi-Tenancy Through Apache Knox
 
Schema Registry - Set Your Data Free
Schema Registry - Set Your Data FreeSchema Registry - Set Your Data Free
Schema Registry - Set Your Data Free
 
Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?Is your Enterprise Data lake Metadata Driven AND Secure?
Is your Enterprise Data lake Metadata Driven AND Secure?
 
Leveraging docker for hadoop build automation and big data stack provisioning
Leveraging docker for hadoop build automation and big data stack provisioningLeveraging docker for hadoop build automation and big data stack provisioning
Leveraging docker for hadoop build automation and big data stack provisioning
 
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise UsersApache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
Apache Knox Gateway "Single Sign On" expands the reach of the Enterprise Users
 
The Unbearable Lightness of Ephemeral Processing
The Unbearable Lightness of Ephemeral ProcessingThe Unbearable Lightness of Ephemeral Processing
The Unbearable Lightness of Ephemeral Processing
 
The Time Has Come for Big-Data-as-a-Service
The Time Has Come for Big-Data-as-a-ServiceThe Time Has Come for Big-Data-as-a-Service
The Time Has Come for Big-Data-as-a-Service
 
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
Dancing Elephants - Efficiently Working with Object Stores from Apache Spark ...
 
Manage Add-On Services with Apache Ambari
Manage Add-On Services with Apache AmbariManage Add-On Services with Apache Ambari
Manage Add-On Services with Apache Ambari
 
Cloudy with a Chance of Hadoop - Real World Considerations
Cloudy with a Chance of Hadoop - Real World ConsiderationsCloudy with a Chance of Hadoop - Real World Considerations
Cloudy with a Chance of Hadoop - Real World Considerations
 
HAWQ Meets Hive - Querying Unmanaged Data
HAWQ Meets Hive - Querying Unmanaged DataHAWQ Meets Hive - Querying Unmanaged Data
HAWQ Meets Hive - Querying Unmanaged Data
 
Apache Ignite vs Alluxio: Memory Speed Big Data Analytics
Apache Ignite vs Alluxio: Memory Speed Big Data AnalyticsApache Ignite vs Alluxio: Memory Speed Big Data Analytics
Apache Ignite vs Alluxio: Memory Speed Big Data Analytics
 
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
Security needs in Hadoop’s Current and Future – How Apache Ranger can help?
 
Big SQL: Powerful SQL Optimization - Re-Imagined for open source
Big SQL: Powerful SQL Optimization - Re-Imagined for open sourceBig SQL: Powerful SQL Optimization - Re-Imagined for open source
Big SQL: Powerful SQL Optimization - Re-Imagined for open source
 
Accelerating Big Data Insights
Accelerating Big Data InsightsAccelerating Big Data Insights
Accelerating Big Data Insights
 
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...
From Insights to Value - Building a Modern Logical Data Lake to Drive User Ad...
 
GeoWave: Open Source Geospatial/Temporal/N-dimensional Indexing for Accumulo,...
GeoWave: Open Source Geospatial/Temporal/N-dimensional Indexing for Accumulo,...GeoWave: Open Source Geospatial/Temporal/N-dimensional Indexing for Accumulo,...
GeoWave: Open Source Geospatial/Temporal/N-dimensional Indexing for Accumulo,...
 
Apache hive essentials
Apache hive essentialsApache hive essentials
Apache hive essentials
 
Hadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, FutureHadoop & Security - Past, Present, Future
Hadoop & Security - Past, Present, Future
 
Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache Ranger ...
Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache Ranger ...Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache Ranger ...
Extending Apache Ranger Authorization Beyond Hadoop: Review of Apache Ranger ...
 

Similaire à Bridle your Flying Islands and Castles in the Sky: Built-in Governance and Security for the Cloud

Fortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache Knox
Fortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache KnoxFortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache Knox
Fortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache KnoxDataWorks Summit
 
Curb your insecurity with HDP - Tips for a Secure Cluster
Curb your insecurity with HDP - Tips for a Secure ClusterCurb your insecurity with HDP - Tips for a Secure Cluster
Curb your insecurity with HDP - Tips for a Secure Clusterahortonworks
 
Cloudy with a chance of Hadoop - real world considerations
Cloudy with a chance of Hadoop - real world considerationsCloudy with a chance of Hadoop - real world considerations
Cloudy with a chance of Hadoop - real world considerationsDataWorks Summit
 
Moving towards enterprise ready Hadoop clusters on the cloud
Moving towards enterprise ready Hadoop clusters on the cloudMoving towards enterprise ready Hadoop clusters on the cloud
Moving towards enterprise ready Hadoop clusters on the cloudDataWorks Summit/Hadoop Summit
 
Introduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemIntroduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemShivaji Dutta
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop SecurityDataWorks Summit
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopHortonworks
 
Realtime analytics + hadoop 2.0
Realtime analytics + hadoop 2.0Realtime analytics + hadoop 2.0
Realtime analytics + hadoop 2.0Rommel Garcia
 
Realtime Analytics in Hadoop
Realtime Analytics in HadoopRealtime Analytics in Hadoop
Realtime Analytics in HadoopRommel Garcia
 
Discover hdp 2.2 hdfs - final
Discover hdp 2.2   hdfs - finalDiscover hdp 2.2   hdfs - final
Discover hdp 2.2 hdfs - finalHortonworks
 
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Hortonworks
 
Hadoop Present - Open Enterprise Hadoop
Hadoop Present - Open Enterprise HadoopHadoop Present - Open Enterprise Hadoop
Hadoop Present - Open Enterprise HadoopYifeng Jiang
 
Big data spain keynote nov 2016
Big data spain keynote nov 2016Big data spain keynote nov 2016
Big data spain keynote nov 2016alanfgates
 
Saving the elephant—now, not later
Saving the elephant—now, not laterSaving the elephant—now, not later
Saving the elephant—now, not laterDataWorks Summit
 
Micro services vs hadoop
Micro services vs hadoopMicro services vs hadoop
Micro services vs hadoopGergely Devenyi
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalHortonworks
 

Similaire à Bridle your Flying Islands and Castles in the Sky: Built-in Governance and Security for the Cloud (20)

Fortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache Knox
Fortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache KnoxFortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache Knox
Fortifying Multi-Cluster Hybrid Cloud Data Lakes using Apache Knox
 
Curb your insecurity with HDP - Tips for a Secure Cluster
Curb your insecurity with HDP - Tips for a Secure ClusterCurb your insecurity with HDP - Tips for a Secure Cluster
Curb your insecurity with HDP - Tips for a Secure Cluster
 
Cloudy with a chance of Hadoop - real world considerations
Cloudy with a chance of Hadoop - real world considerationsCloudy with a chance of Hadoop - real world considerations
Cloudy with a chance of Hadoop - real world considerations
 
Moving towards enterprise ready Hadoop clusters on the cloud
Moving towards enterprise ready Hadoop clusters on the cloudMoving towards enterprise ready Hadoop clusters on the cloud
Moving towards enterprise ready Hadoop clusters on the cloud
 
Introduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystemIntroduction to the Hadoop EcoSystem
Introduction to the Hadoop EcoSystem
 
Hadoop security
Hadoop securityHadoop security
Hadoop security
 
Improvements in Hadoop Security
Improvements in Hadoop SecurityImprovements in Hadoop Security
Improvements in Hadoop Security
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
 
Realtime analytics + hadoop 2.0
Realtime analytics + hadoop 2.0Realtime analytics + hadoop 2.0
Realtime analytics + hadoop 2.0
 
Realtime Analytics in Hadoop
Realtime Analytics in HadoopRealtime Analytics in Hadoop
Realtime Analytics in Hadoop
 
Discover hdp 2.2 hdfs - final
Discover hdp 2.2   hdfs - finalDiscover hdp 2.2   hdfs - final
Discover hdp 2.2 hdfs - final
 
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
 
Hadoop Present - Open Enterprise Hadoop
Hadoop Present - Open Enterprise HadoopHadoop Present - Open Enterprise Hadoop
Hadoop Present - Open Enterprise Hadoop
 
Big data spain keynote nov 2016
Big data spain keynote nov 2016Big data spain keynote nov 2016
Big data spain keynote nov 2016
 
Saving the elephant—now, not later
Saving the elephant—now, not laterSaving the elephant—now, not later
Saving the elephant—now, not later
 
Curb Your Insecurity - Tips for a Secure Cluster (with Spark too)!!
Curb Your Insecurity - Tips for a Secure Cluster (with Spark too)!!Curb Your Insecurity - Tips for a Secure Cluster (with Spark too)!!
Curb Your Insecurity - Tips for a Secure Cluster (with Spark too)!!
 
Curb your insecurity with HDP
Curb your insecurity with HDPCurb your insecurity with HDP
Curb your insecurity with HDP
 
Cloudbreak - Technical Deep Dive
Cloudbreak - Technical Deep DiveCloudbreak - Technical Deep Dive
Cloudbreak - Technical Deep Dive
 
Micro services vs hadoop
Micro services vs hadoopMicro services vs hadoop
Micro services vs hadoop
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.final
 

Plus de DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

Plus de DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Dernier

Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGSujit Pal
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 

Dernier (20)

Google AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAGGoogle AI Hackathon: LLM based Evaluator for RAG
Google AI Hackathon: LLM based Evaluator for RAG
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 

Bridle your Flying Islands and Castles in the Sky: Built-in Governance and Security for the Cloud

  • 1. 1 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Bridle Your Flying Islands And Castles In The Sky: Built-in Governance And Security For The Cloud DataWorks Summit San Jose June 2017
  • 2. 2 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Presenters Jeff Sposetti Senior Director of Product Management, Cloud Hortonworks Data Cloud, Cloudbreak Srikanth Venkat Senior Director of Product Management, Security & Governance Apache Ranger, Apache Atlas, Apache Knox
  • 3. 3 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Agenda  Introduction  Security & Governance: Apache Ranger, Atlas and Knox  Bringing It Together: Cloud and Data Lake Security  Demo  Wrap Up  Q & A
  • 4. 4 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Background: Ephemeral Workloads + Cloud Storage CLOUD STORAGE S3 WORKLOAD CLUSTERS Durable Ephemeral
  • 5. 5 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Introducing Hortonworks Data Cloud  Focuses on business agility, rather than infinite configurability and cluster management  Addresses prescriptive, ephemeral use cases around Apache Spark + Apache Hive  Pre-tuned and configured for use with Amazon S3 Learn more: http://hortonworks.com/products/cloud/aws/
  • 6. 6 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Quick demo…
  • 7. 7 © Hortonworks Inc. 2011 – 2017. All Rights Reserved CLOUD DATA LAKE SECURITY
  • 8. 8 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Security & Governance: Apache Ranger, Atlas and Knox
  • 9. 9 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Protecting the Elephant in the Castle….. Kerberos, Wire Encryption HDFS Encryption Apache Ranger Network Segmentation, Firewalls LDAP/AD Apache Knox
  • 10. 1 0 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Knox Proxying Services ★ Provide access to Hadoop via proxying of HTTP resources ★ Ecosystem APIs and UIs + Hadoop oriented dispatching for Kerberos + doAs (impersonation) etc. Authentication Services ★ REST API access, WebSSO flow for UIs ★ LDAP/AD, Header based PreAuth ★ Kerberos, SAML, OAuth Client DSL/SDK Services ★ Scripting through DSL ★ Using Knox Shell classes directly as SDK
  • 11. 11 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Apache Ranger Comprehensive and Extensible Security Model – Centralized platform to define, administer and manage security policies across Hadoop components (HDFS, Hive, HBase, YARN, Kafka, Solr, Storm, Knox, NiFi, Atlas) – Extensible Architecture with ability to add custom policy conditions, user context enrichers Centralized Auditing – Central audit location for all access requests – Support multiple destination sources (HDFS, Solr, etc.) – Real-time visual query interface Fine-Grained Authorization for data access control for Database, Table, Column, LDAP Groups & Specific Users Advanced Security • Dynamic Security Policies: Tag (Atlas), Prohibition, Time, and Location • Dynamic Column Masking & Row Filtering OPERATIONS SECURITY GOVERNANCE STORAGE STORAGE Machine Learning Batch StreamingInteractive Search SECURITY
  • 12. 12 © Hortonworks Inc. , and Dataguise Inc. 2011 – 2017. All Rights Reserved STRUCTURED Atlas: Metadata & Governance TRADITIONAL RDBMS METADATA MPP APPLIANCES Kafka Storm Sqoop Hive ATLAS METADATA Falcon RANGER Custom Partners Metadata-driven governance services for Hadoop and enterprise big data ecosystems Data Lineage/Provenance  Along the entire data lifecycle with integrated Cross component lineage Data Classification  Supports classification of data assets using tags (e.g. PII, PHI, PCI etc.) and attributes Metadata Catalog Search  Free text search on metadata  Advanced search using DSL Integrations across the Hadoop ecosystem, through a common metadata store  Free text search on metadata  OOtB real-time metadata and lineage ingestion with Hive, Sqoop, Storm/Kafka  APIs for custom metadata ingestion  Apache Ranger integration for classification based security Key Benefits: Modern Data Lakes need new ways to govern because: • Cost – Traditional staff ratio to data size not possible • Diversity – Only way to manage velocity of new datasets • Agility – Quick change based on tags / taxonomy
  • 13. 13 © Hortonworks Inc. , and Dataguise Inc. 2011 – 2017. All Rights Reserved HDP – Security & Governance Classification Prohibition Time Location Policies PDP Resource Cache Ranger Manage Access Policies and Audit Logs Track Metadata and Lineage Atlas Client Subscribers to Topic Gets Metadata Updates Atlas Metastore Tags Assets Entitles Streams Pipelines Feeds Hive Tables HDFS Files HBase Tables Entities in Data Lake Industry First: Dynamic Tag-based Security Policies
  • 14. 14 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Bringing It Together: Cloud and Data Lake Services for Enterprise Ephemeral Workloads
  • 15. 15 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Ephemeral Workloads: Basic -> Advanced -> Enterprise Basic Ephemeral Advanced Ephemeral Enterprise Ephemeral Tuned and Optimized Infrastructure Simplified, Automated Operations S3 Integration Protected Network Access Schema - Shared (Hive Metastore) Shared (Hive Metastore) Authentication Single-user Single-user Multi-User (LDAP/AD) Authorization - - Security Policies (Ranger) Audit - - Audit (Ranger)
  • 16. 16 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Ephemeral Workloads for the Cloud CLOUD STORAGE S3 WORKLOADS Durable Ephemeral Metastore SCHEMA Long Running Security access to workload clusters via a Protected Gateway enabled for AuthN and HTTPS. Define your data schema, security policies, and metadata catalog once for your ephemeral and always-on workloads. Atlas CATALOG Ranger POLICY SHARED DATA LAKE SERVICES
  • 17. 17 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Key Components for an Enterprise Ephemeral Deployment SCHEMA POLICY AUDIT DIRECTORY WHAT Provides Hive schema (tables, views, etc). WHY If you have 2+ workloads accessing the same data, need to share schema across those workloads. HOW Externalize Hive Metastore into for schema definition. WHAT Defines security policies around Hive schema. WHY If you have 2+ users accessing the same data, need policies to be consistently available and enforced. HOW Externalize and share Ranger across workloads and store policies external. WHAT Audit user access. WHY Capture data access activity. HOW Externalize and share Ranger across workloads, leverage cloud storage for audit data. GATEWAY WHAT Provide single endpoint that can be protected with SSL and enabled for authentication to access to cluster resources. WHY Avoid opening many ports, some potentially w/o authentication or SSL protection. HOW Deploy a centralized protected gateway automatically. WHAT Users and groups. WHY Provide authentication source for users and authorization source for groups. HOW Leverage external LDAP or Active Directory.
  • 18. 18 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Architecture of an Enterprise Ephemeral Deployment Access your cluster components through the protected gateway via SSL on port 443 open on the controller security group. CONTROLLER PROTECTED GATEWAY USER ACCESS Zeppelin HIVE LLAP / SPARK WORKLOADS Hive SHARED DATA LAKE SERVICES Ranger POLICY (RDS) AUDIT (S3) SCHEMA (RDS) DIRECTORY (LDAP/AD) Spark Hive Metasto re
  • 19. 19 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Hortonworks Data Cloud + Shared Data Lake Services 1 2 3 Register an Authentication Source (i.e. LDAP/AD). Create a “Shared Data Lake”, specify S3 Bucket & RDS. When you create a cluster, ”attach” to the Shared Data Lake Services: • for Multi-User AuthN (LDAP/AD) • for AuthZ + Audit (Ranger) • for Schema (Hive Metastore) PREREQUISITES • LDAP/AD • S3 Bucket • Amazon RDS Instance
  • 20. 20 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Longer demo…
  • 21. 21 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Guidelines for Shared Data Lake Services  Stay Ephemeral. All of your data and metadata in S3 and RDS respectively, do not create tables or files in the local HDFS.  The Hive warehouse is setup to be on S3 for data lakes, create tables in this location instead of individual S3 buckets, it will make them easier to manage.  Use Hive “external tables” for tables that are outside this warehouse, typically if the data is being ingested through some path outside of Hadoop  Create S3 bucket policies that exactly match usage so that you can spin up clusters with the least privilege.
  • 22. 22 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Wrap Up
  • 23. 23 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Bringing Big Data + Cloud + Security Together: The Takeaways  Cloud driving more ephemeral data processing use cases  Ephemeral workloads leverage cloud storage  This pattern is driving an architectural approach for “shared data lake services”  To provide shared security services for Authentication, Authorization and Audit ** Building blocks are Apache Ranger, Apache Atlas and Apache Knox THINK WORKLOADS Deploy “what you need”… Right tool, right job! THINK SHARED Shared “security policies”… End-to-end security for workloads!** THINK EPHEMERAL Run “when you need”… Workloads will come and go!
  • 24. 24 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Learn More  Try Hortonworks Data Cloud 2.0 Technical Preview – http://hortonworks.github.io/hdp-aws/  BREAKOUT SESSIONS – Wednesday, June 14 @ 5:50p, Don’t Let Spark Burn Your House: Perspectives on Securing Spark  CRASH COURSE – Thursday, June 15 @ 3:00p – 6:00p, Apache Spark and Apache Hive processing on the Cloud  BIRDS OF A FEATHER – Thursday, June 15 @ 5:00p, Security and Governance – Thursday, June 15 @ 5:00p, Cloud and Operations
  • 25. 25 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Thank You https://hortonworks.com/products/cloud/aws/ https://hortonworks.com/apache/ranger/ https://hortonworks.com/apache/atlas/

Notes de l'éditeur

  1. 12