CLOUDERA ALTUS ON AZURE
May 2018
© Cloudera, Inc. All rights reserved.
AGENDA
- Cloudera Altus Overview
- Altus Architecture Deep Dive
- ADLS Deep Dive
- Data Engineering Demo
- Roadmap
CLOUDERA ALTUS OVERVIEW
CLOUDERA ENTERPRISE DATA PLATFORM
The modern platform for machine learning & analytics, optimized for the cloud
WORKLOADS
- Data Engineering
- Data Science
- Analytic Database
- Operational Database
- 3rd Party Services
COMMON SERVICES
- Data Catalog
- Governance
- Security
- Lifecycle Management
CONTROL PLANE
STORAGE
- HDFS
- Kudu
- Microsoft ADLS
- Other cloud
CLOUDERA ALTUS
PAAS
• Simple
• Self-service
• Auto-elastic
• Role specific
[Diagram labels: DATA ENGINEERING, ANALYTIC DB, DATA SCIENCE, beta soon; DATA CATALOG, GOVERNANCE, SECURITY, LIFECYCLE MANAGEMENT, CONTROL PLANE; Microsoft ADLS, Other Cloud]
ALTUS DATA ENGINEERING
TRANSFORM DATA AT SCALE WITHOUT THE ADMINISTRATION
What is it?
- Short-lived
- Single tenant
- Spark, Hive, MapReduce, or YARN Cluster
Used for things like
- ETL jobs
- Batch processing
- With data living in ADLS
- Provides fast and easy job submission without cluster management
Generally Available on Azure
ALTUS ANALYTIC DATABASE
MULTITENANT ANALYTICS AND DW AT SCALE WITHOUT THE ADMINISTRATION
What is it?
- Long-lived
- Multi-tenant
- Impala Cluster
Used for things like
- Data warehousing
- Analytics
- With data living in ADLS
- Provides fast and easy analytics without cluster management
Available in Beta on Azure
CLOUDERA SDX (SHARED DATA EXPERIENCE)
EASIEST WAY TO COLLABORATE IN A SHARED ENVIRONMENT
• Unified security – protects sensitive data with consistent controls, even for transient and recurring workloads
• Consistent governance – enables secure self-service access to all relevant data and increases compliance
• Easy workload management – increases user productivity and boosts job predictability
• Flexible ingest and replication – aggregates a single copy of all data, provides disaster recovery, and eases migration
• Shared data catalog – defines and preserves structure and business context of data for new applications and partner solutions
Available in Beta on Azure
ALTUS WORKLOAD ANALYTICS
HELPING USERS FOCUS ONLY ON THEIR WORKLOADS – NOT CLUSTERS
- Troubleshoot jobs after cluster termination
- Insight into causes of job failure
- Identification and root cause analysis of slow jobs
- Define an SLA and get sizing recommendations
COMMON USE CASES
FLEXIBLE DEPLOYMENTS FOR OPTIMAL TOTAL COST OF OWNERSHIP
ELASTIC WORKLOADS
If customers move from persistent analytic clusters to elastic clusters, they can save money and gain agility.
DATA MART EXPANSION
Save the labor and/or upfront expense required to accommodate new teams & environments.
PEAK BURSTING
Save hardware/software costs by using transient nodes for peak workloads instead of always-on nodes.
[Diagram labels: Transient, Persistent; workloads: Data Engineering, Analytic Database, Data Science, Machine Learning]
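The peak-bursting argument is simple arithmetic, and a minimal sketch may make it concrete. The node counts, peak window, and node-hour price below are hypothetical placeholders, not Cloudera or Azure figures; the two functions only compare sizing for the peak all week against adding transient nodes during the peak.

```python
from dataclasses import dataclass

HOURS_PER_WEEK = 7 * 24


@dataclass(frozen=True)
class Workload:
    baseline_nodes: int          # nodes needed around the clock
    peak_nodes: int              # total nodes needed during the peak window
    peak_hours_per_week: float   # hours per week spent at peak


def weekly_cost_always_on(w: Workload, node_hour_price: float) -> float:
    """Persistent sizing: enough nodes for the peak run all week."""
    return w.peak_nodes * HOURS_PER_WEEK * node_hour_price


def weekly_cost_with_bursting(w: Workload, node_hour_price: float) -> float:
    """Baseline nodes run all week; the extra nodes exist only during the peak."""
    burst_nodes = w.peak_nodes - w.baseline_nodes
    baseline = w.baseline_nodes * HOURS_PER_WEEK * node_hour_price
    burst = burst_nodes * w.peak_hours_per_week * node_hour_price
    return baseline + burst


if __name__ == "__main__":
    # Hypothetical workload: 10 nodes normally, 40 nodes for 12 hours a week.
    w = Workload(baseline_nodes=10, peak_nodes=40, peak_hours_per_week=12)
    price = 0.50  # hypothetical price per node-hour
    print("always-on:", weekly_cost_always_on(w, price))
    print("bursting: ", weekly_cost_with_bursting(w, price))
```

The gap between the two results widens as the peak window gets shorter relative to the week, which is the case the slide describes.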
ALTUS ON AZURE ARCHITECTURE
ALTUS ARCHITECTURE
[Diagram labels. Cloudera Altus: Web UI, CLI, SDK/Partners, API, Job Metadata, Environment, Cluster Metadata, Telemetry Storage, Remote Management. User Cloud Account: Altus DE Cluster (Job Queue, Jobs, Workers), Object Store. Data flows: Input Data, Output Data, Job Logs, Telemetry Data (optional).]
ACCESS SECURITY
Clusters Created with User Assigned MSI
[Diagram labels: Altus, Cross Account Access, SSH, Customer Azure Subscription, Resource Group, Virtual Network/Subnet, Job, Workers (VMs), Data Access, ADLS]
Needed Permissions
Provide consent for cross account access
● Can be restricted to Resource Groups
● Can leverage custom Azure RBAC roles
● For POCs, recommended to have contributor access to subscription
Create a User Assigned MSI to:
● Read/write to ADLS folders/files
  ○ Governed by ACLs
Network Security Group:
● Allow SSH from Altus management plane to VMs
  ○ Limited to Altus IPs
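The Network Security Group requirement (SSH allowed only from Altus IPs) can be sketched as an inbound rule in the shape Azure Resource Manager templates use for `securityRules`. This is an illustrative sketch, not Cloudera's provisioning code: the rule name, the priority, and the helper function are hypothetical, and the CIDR ranges in the usage are documentation-only placeholders rather than real Altus addresses.

```python
import ipaddress
import json


def ssh_rule_for_management_plane(source_cidrs, priority=100):
    """Build an ARM-style inbound NSG rule that allows SSH only from the given CIDRs."""
    # ip_network validates each prefix and raises ValueError on malformed input.
    prefixes = [str(ipaddress.ip_network(cidr)) for cidr in source_cidrs]
    if not prefixes:
        raise ValueError("at least one source CIDR is required")
    return {
        "name": "allow-ssh-from-altus",  # hypothetical rule name
        "properties": {
            "priority": priority,
            "direction": "Inbound",
            "access": "Allow",
            "protocol": "Tcp",
            "sourceAddressPrefixes": prefixes,
            "sourcePortRange": "*",
            "destinationAddressPrefix": "*",
            "destinationPortRange": "22",
        },
    }


if __name__ == "__main__":
    # Placeholder ranges from the documentation address blocks, not real Altus IPs.
    rule = ssh_rule_for_management_plane(["203.0.113.0/24", "198.51.100.7/32"])
    print(json.dumps(rule, indent=2))
```

Validating the prefixes before building the rule keeps a typo from silently widening or breaking the allowed source range.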
DATA SECURITY
User Assigned MSI Permissions
[Diagram labels: Altus, Cross Account Access, SSH, Customer Azure Subscription, Resource Group, Virtual Network/Subnet, Job, Workers (VMs), Managed Disks, Data Access, ADLS]
Encrypted at rest by default
Encrypted at rest by default
● Can use custom keys (Azure Key Vault)
● Data, Logs
TLS in-cluster
Kerberos enabled
Communications encrypted
MICROSOFT ADLS DEEP DIVE
Azure covers 73 compliance offerings
Azure has the deepest and most comprehensive compliance coverage in the industry
Categories: US Gov, Global, Regional, Industry
- ISO 27001:2013
- ISO 27017:2015
- ISO 27018:2014
- ISO 22301:2012
- ISO 9001:2015
- ISO 20000-1:2011
- SOC 1 Type 2
- SOC 2 Type 2
- SOC 3
- CSA STAR Certification
- CSA STAR Attestation
- CSA STAR Self-Assessment
- WCAG 2.0
- FedRAMP High
- FedRAMP Moderate
- EAR
- DoD DISA SRG Level 5
- DoD DISA SRG Level 4
- DoD DISA SRG Level 2
- DFARS
- DoE 10 CFR Part 810
- NIST SP 800-171
- NIST CSF
- Section 508 VPATs
- PCI DSS Level 1
- GLBA
- FFIEC
- Shared Assessments
- FISC (Japan)
- APRA (Australia)
- FCA (UK)
- MAS + ABS (Singapore)
- 23 NYCRR 500
- HIPAA BAA
- HITRUST
- 21 CFR Part 11 (GxP)
- MARS-E
- NHS IG Toolkit (UK)
- NEN 7510:2011 (Netherlands)
- FERPA
- CDSA
- MPAA
- FACT (UK)
- DPP (UK)
- SOX
- Argentina PDPA
- Australia IRAP Unclassified
- Australia IRAP Protected
- Canada Privacy Laws
- China GB 18030:2005
- China DJCP (MLPS) Level 3
- Germany C5
- India MeitY
- Japan CS Mark Gold
- Japan My Number Act
- Netherlands BIR 2012
- New Zealand Gov CIO Fwk
- Singapore MTCS Level 3
- Spain ENS
- Spain DPA
- UK Cyber Essentials Plus
- UK G-Cloud
- UK PASF
- FIPS 140-2
- ITAR
- CJIS
- IRS 1075
- China TRUCS / CCCPPF
- EN 301 549
- EU ENISA IAF
- EU Model Clauses
- EU – US Privacy Shield
- Germany IT-Grundschutz workbook
https://aka.ms/AzureCompliance
Open source support
- Applications
- Infrastructure
- Management
- Databases & middleware
- App frameworks & tools
- DevOps
Azure Data Lake Store (ADLS)
A hyper scale repository for big data analytics workloads
- Store ANY DATA in its native format
- HADOOP FILE SYSTEM (HDFS) for the cloud
- ENTERPRISE GRADE
- No limits to SCALE
- Optimized for analytic workload PERFORMANCE
[Stack diagram labels: YARN, Hive | Spark | Impala, Cloudera 5.1x, Azure PaaS Services, ADL Store]
Architectures - Traditional Hadoop Cluster
[Diagram labels: Compute, Data; Hadoop Master, Hadoop Worker 1, Hadoop Worker 2; Local Disk, Local Disk; HDFS]
Cloudera on ADLS
[Diagram labels: Hadoop Master, Hadoop Worker 1, Hadoop Worker 2; ADLS]
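With ADLS standing in for cluster-local HDFS, jobs address data by URI instead of by a path on the cluster's own disks. ADLS (Gen1) paths take the form `adl://<account>.azuredatalakestore.net/<path>`. The helpers below are a small sketch for building and validating such URIs; the function names and the account name `examplelake` are hypothetical, and nothing here contacts Azure.

```python
from typing import Tuple
from urllib.parse import urlsplit

ADLS_HOST_SUFFIX = ".azuredatalakestore.net"


def adl_uri(account: str, path: str) -> str:
    """Build an adl:// URI for a path inside the given Data Lake Store account."""
    return f"adl://{account}{ADLS_HOST_SUFFIX}/{path.lstrip('/')}"


def split_adl_uri(uri: str) -> Tuple[str, str]:
    """Return (account, path) for an adl:// URI, or raise ValueError."""
    parts = urlsplit(uri)
    if parts.scheme != "adl" or not parts.netloc.endswith(ADLS_HOST_SUFFIX):
        raise ValueError(f"not an ADLS URI: {uri!r}")
    account = parts.netloc[: -len(ADLS_HOST_SUFFIX)]
    return account, parts.path


if __name__ == "__main__":
    uri = adl_uri("examplelake", "/raw/events/2018/05")  # hypothetical account
    print(uri)
    print(split_adl_uri(uri))
```

Because the URI names the storage account rather than a cluster, the same input and output locations remain valid after a transient cluster is terminated and a new one is created.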
ADLS – Under the hood
[Diagram labels: Data Lake Client, Data Lake Management Client, Data Lake Client SDK, File System/HDFS API, REST API (Data Access), Management API, Data Lake Store Frontend, Metadata Service, Naming Service, SSD-backed Data Lake Ingestion layer, Data Lake Store Backend, Scale out Storage, Azure ML, Microsoft R Server; numbered callouts 1 to 6]
Comparison between storage options

                  | Block based options                           | Filesystem based options
                  | VHDs on WASB           | Premium Storage      | WASB             | ADLS
------------------+------------------------+----------------------+------------------+--------------------------------------
Maximum volume    | 4 TB per disk          | 4 TB per disk        | 500 TB           | No limit (tested > exabytes)
Maximum item size | N/A                    | N/A                  | 4.75 TB          | No limit (tested > petabytes)
Physical media    | HDD                    | Flash/SSD            | HDD              | SSD + HDD
Replication       | LRS and GRS            | None                 | LRS and GRS      | LRS
Throughput        | 60 MBps per disk       | 250 MBps per disk    | 60 MBps per blob | Extremely high
RBAC              | N/A                    | N/A                  | N/A              | POSIX compliant (file & folder level)
Encryption        | SSE or Azure Key Vault | N/A                  | N/A              | Transparent (AES 256 + TLS 1.2)
Workloads         | any                    | any                  | low TBs          | >10 TBs
Locations         | all                    | most                 | all              | 4 and growing

https://docs.microsoft.com/en-us/azure/storage/storage-scalability-targets
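The "POSIX compliant (file & folder level)" entry for ADLS refers to POSIX-style ACLs, the same mechanism that governs what the user assigned MSI may read or write. The sketch below evaluates a generic POSIX-style ACL: owner first, then named users limited by the mask, then groups limited by the mask, then other. It is a simplified illustration under that assumption, not the ADLS implementation, and the principal and group names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

R, W, X = 4, 2, 1


def parse_perms(text: str) -> int:
    """Turn a permission string such as 'r-x' into a bitmask."""
    if len(text) != 3:
        raise ValueError(f"expected three characters, got {text!r}")
    bits = 0
    for ch, (letter, bit) in zip(text, (("r", R), ("w", W), ("x", X))):
        if ch == letter:
            bits |= bit
        elif ch != "-":
            raise ValueError(f"unexpected character {ch!r} in {text!r}")
    return bits


@dataclass
class Acl:
    owner: str
    owner_perms: int
    group_perms: Dict[str, int]      # owning group and named groups
    other_perms: int
    mask: int = R | W | X
    named_users: Dict[str, int] = field(default_factory=dict)

    def allows(self, principal: str, groups: Set[str], wanted: int) -> bool:
        if principal == self.owner:
            granted = self.owner_perms                     # mask does not limit the owner
        elif principal in self.named_users:
            granted = self.named_users[principal] & self.mask
        else:
            matching = [p for g, p in self.group_perms.items() if g in groups]
            if matching:
                union = 0
                for perms in matching:
                    union |= perms
                granted = union & self.mask
            else:
                granted = self.other_perms
        return (granted & wanted) == wanted


if __name__ == "__main__":
    acl = Acl(
        owner="etl-owner",                                 # hypothetical principals
        owner_perms=parse_perms("rwx"),
        group_perms={"analysts": parse_perms("r-x")},
        other_perms=parse_perms("---"),
        mask=parse_perms("r-x"),
        named_users={"altus-msi": parse_perms("rwx")},
    )
    print(acl.allows("altus-msi", set(), R))       # read passes the mask
    print(acl.allows("altus-msi", set(), R | W))   # write is removed by the mask
```

In this example the named-user entry grants `rwx`, but the `r-x` mask removes write, so a job running under that identity could read input folders without being able to write to them.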
Why Cloudera on Azure Data Lake Store?
- Separation of Compute & Storage
- Transient clusters for flexibility, lower TCO
- Shared storage for many optimized clusters
[Diagram labels: Compute, time (M T W R F S S), Data Lake Store]
ALTUS DEMO
ALTUS ROADMAP
ALTUS DATA ENGINEERING ROADMAP
Production Workflows
• Scheduling, orchestration
• Success/Failure notifications
• Enhanced debugging
• Failure handling
• Dependency management
Developer Workflows
• IDE + partner integration
• CI/CD
• Python SDK
• Interactive experience
• Altus DS Integration
Operational Efficiency
• Autoscaling, enhanced spot
• SLA and cost management
• Workload automation
ALTUS ANALYTIC DATABASE ROADMAP
Platform
• Pause, resume, and resize for clusters
• Shrink with graceful shutdown
• Altus SQL editor
• Autoscaling
Integrations
• SDX
• Workload XM
• Navigator
• Navigator Optimizer
Misc
• UDF support
• SQL CLI (impala-shell)
ALTUS PLATFORM ROADMAP
Altus Self Service
• Self-service subscription
Platform
• Increased Scalability
• Java SDK for ADB, SDX
Security
• Identity federation
• Enhanced security
THANK YOU
EVERYTHING YOU DON’T HAVE TO DO
FOCUS ON YOUR WORKLOADS, NOT THE CLUSTERS
- Install any software to start working
- Install any hardware
- Worry about cluster configuration
- Upgrade/reconfigure clusters
- OS upgrades/patching
- Resource Management

Introducing Cloudera Data Science Workbench for HDP 2.12.19
 
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
Shortening the Sales Cycle with a Modern Data Warehouse 1.30.19
 
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
Modernizing the Legacy Data Warehouse – What, Why, and How 1.23.19
 
Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3Modern Data Warehouse Fundamentals Part 3
Modern Data Warehouse Fundamentals Part 3
 
Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1Modern Data Warehouse Fundamentals Part 1
Modern Data Warehouse Fundamentals Part 1
 
Extending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the PlatformExtending Cloudera SDX beyond the Platform
Extending Cloudera SDX beyond the Platform
 
Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18Federated Learning: ML with Privacy on the Edge 11.15.18
Federated Learning: ML with Privacy on the Edge 11.15.18
 
Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360Analyst Webinar: Doing a 180 on Customer 360
Analyst Webinar: Doing a 180 on Customer 360
 
Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18Build a modern platform for anti-money laundering 9.19.18
Build a modern platform for anti-money laundering 9.19.18
 
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
 
Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18Introducing Workload XM 8.7.18
Introducing Workload XM 8.7.18
 
Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18Spark and Deep Learning Frameworks at Scale 7.19.18
Spark and Deep Learning Frameworks at Scale 7.19.18
 
How Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR complianceHow Cloudera SDX can aid GDPR compliance
How Cloudera SDX can aid GDPR compliance
 

Dernier

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 

Dernier (20)

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 

Self-service Big Data Analytics on Microsoft Azure

  • 1. CLOUDERA ALTUS ON AZURE May 2018
  • 2. AGENDA: Cloudera Altus Overview; Altus Architecture Deep Dive; ADLS Deep Dive; Data Engineering Demo; Roadmap
  • 3. CLOUDERA ALTUS OVERVIEW
  • 4. CLOUDERA ENTERPRISE DATA PLATFORM: The modern platform for machine learning & analytics optimized for the cloud. WORKLOADS: 3RD PARTY SERVICES, DATA ENGINEERING, DATA SCIENCE, ANALYTIC DATABASE, OPERATIONAL DATABASE. COMMON SERVICES: DATA CATALOG, GOVERNANCE, SECURITY, LIFECYCLE MANAGEMENT, CONTROL PLANE. STORAGE: HDFS, KUDU, Microsoft ADLS, Other cloud.
  • 5. CLOUDERA ALTUS PAAS: Simple; Self-service; Auto-elastic; Role specific. DATA ENGINEERING, ANALYTIC DB, DATA SCIENCE. DATA CATALOG, GOVERNANCE, SECURITY, CONTROL PLANE, LIFECYCLE MANAGEMENT. beta soon. Other Cloud, Microsoft ADLS.
  • 6. ALTUS DATA ENGINEERING: TRANSFORM DATA AT SCALE WITHOUT THE ADMINISTRATION. What is it? Short-lived; single tenant; Spark, Hive, MapReduce, or YARN cluster. Used for things like: ETL jobs; batch processing; with data living in ADLS; provides fast and easy job submission without cluster management. Generally Available on Azure.
  • 7. ALTUS ANALYTIC DATABASE: MULTITENANT ANALYTICS AND DW AT SCALE WITHOUT THE ADMINISTRATION. What is it? Long-lived; multi tenant; Impala cluster. Used for things like: data warehousing; analytics; with data living in ADLS; provides fast and easy analytics without cluster management. Available in Beta on Azure.
  • 8. CLOUDERA SDX (SHARED DATA EXPERIENCE): EASIEST WAY TO COLLABORATE IN A SHARED ENVIRONMENT. • Unified security: protects sensitive data with consistent controls, even for transient and recurring workloads • Consistent governance: enables secure self-service access to all relevant data and increases compliance • Easy workload management: increases user productivity and boosts job predictability • Flexible ingest and replication: aggregates a single copy of all data, provides disaster recovery, and eases migration • Shared data catalog: defines and preserves structure and business context of data for new applications and partner solutions. Available in Beta on Azure.
  • 9. ALTUS WORKLOAD ANALYTICS: HELPING USERS FOCUS ONLY ON THEIR WORKLOADS, NOT CLUSTERS. Troubleshoot jobs after cluster termination; insight into causes of job failure; identification and root cause analysis of slow jobs; define an SLA and get sizing recommendations.
  • 10. COMMON USE CASES: FLEXIBLE DEPLOYMENTS FOR OPTIMAL TOTAL COST OF OWNERSHIP. ELASTIC WORKLOADS: If customers move from persistent analytic clusters to elastic clusters, they can save money and gain agility. DATA MART EXPANSION: Save labor and/or upfront expense required to accommodate new teams & environments. PEAK BURSTING: Save hardware/software costs by using transient nodes for peak workloads as opposed to always-on nodes. Deployment types: Transient, Persistent. Workloads: DATA ENGINEERING, ANALYTIC DATABASE, DATA SCIENCE, MACHINE LEARNING.
  • 11. ALTUS ON AZURE ARCHITECTURE
  • 12. ALTUS ARCHITECTURE (diagram labels): Cloudera Altus, User Cloud Account, Object Store, Web UI, CLI, SDK/API, Partners, Job Metadata, Environment, Cluster Metadata, Altus DE Cluster, Telemetry Storage, Job, Job Queue, Workers, Input Data, Output Data, Telemetry Data (Optional), Job Logs, Remote Management
  • 13. ACCESS SECURITY. Diagram labels: Customer Azure Subscription, Resource Group, Virtual Network/Subnet, Clusters Created with User Assigned MSI, ADLS, Job, Workers VMs, Data Access, SSH, Altus, Cross Account Access. Needed Permissions: Provide consent for cross account access (can be restricted to Resource Groups; can leverage custom Azure RBAC roles; for POCs, recommended to have contributor access to subscription). Create a User Assigned MSI to read/write to ADLS folders/files (governed by ACLs). Network Security Group: allow SSH from Altus management plane to VMs (limited to Altus IPs).
  • 14. DATA SECURITY. Diagram labels: Customer Azure Subscription, Resource Group, Virtual Network/Subnet, User Assigned MSI Permissions, ADLS, Job, Workers VMs, Managed Disks, Data Access, SSH, Altus, Cross Account Access. Encrypted at rest by default: can use custom keys (Azure Key Vault); data, logs. Communications encrypted: TLS in-cluster; Kerberos enabled.
  • 15. MICROSOFT ADLS DEEP DIVE
  • 16.
  • 17. Azure covers 73 compliance offerings. Azure has the deepest and most comprehensive compliance coverage in the industry. Categories: US Gov, Global, Regional, Industry. Offerings: ISO 27001:2013; ISO 27017:2015; ISO 27018:2014; ISO 22301:2012; ISO 9001:2015; ISO 20000-1:2011; SOC 1 Type 2; SOC 2 Type 2; SOC 3; CSA STAR Certification; CSA STAR Attestation; CSA STAR Self-Assessment; WCAG 2.0; FedRAMP High; FedRAMP Moderate; EAR; DoD DISA SRG Level 5; DoD DISA SRG Level 4; DoD DISA SRG Level 2; DFARS; DoE 10 CFR Part 810; NIST SP 800-171; NIST CSF; Section 508 VPATs; PCI DSS Level 1; GLBA; FFIEC; Shared Assessments; FISC (Japan); APRA (Australia); FCA (UK); MAS + ABS (Singapore); 23 NYCRR 500; HIPAA BAA; HITRUST; 21 CFR Part 11 (GxP); MARS-E; NHS IG Toolkit (UK); NEN 7510:2011 (Netherlands); FERPA; CDSA; MPAA; FACT (UK); DPP (UK); SOX; Argentina PDPA; Australia IRAP Unclassified; Australia IRAP Protected; Canada Privacy Laws; China GB 18030:2005; China DJCP (MLPS) Level 3; Germany C5; India MeitY; Japan CS Mark Gold; Japan My Number Act; Netherlands BIR 2012; New Zealand Gov CIO Fwk; Singapore MTCS Level 3; Spain ENS; Spain DPA; UK Cyber Essentials Plus; UK G-Cloud; UK PASF; FIPS 140-2; ITAR; CJIS; IRS 1075; China TRUCS / CCCPPF; EN 301 549; EU ENISA IAF; EU Model Clauses; EU-US Privacy Shield; Germany IT-Grundschutz workbook. https://aka.ms/AzureCompliance
  • 18. Open source support: Applications; Infrastructure; Management; Databases & middleware; App frameworks & tools; DevOps
  • 19. Azure Data Lake Store (ADLS): A hyper scale repository for big data analytics workloads. Store ANY DATA in its native format; HADOOP FILE SYSTEM (HDFS) for the cloud; ENTERPRISE GRADE; No limits to SCALE; Optimized for analytic workload PERFORMANCE. Diagram labels: YARN; Hive | Spark | Impala; Cloudera 5.1x; Azure PaaS Services; ADL Store; Compute; Data
  • 20. HDFS Architectures - Traditional Hadoop Cluster (diagram labels): Hadoop Master, Hadoop Worker 1, Hadoop Worker 2, Local Disk, Local Disk
  • 22. ADLS: Under the hood (diagram labels): Data Lake Client, Data Lake Management Client, Data Lake Client SDK, REST API (Data Access), File System/HDFS API, Management API, Data Lake Store Frontend, Data Lake Ingestion layer, Data Lake Store Backend (SSD-backed), Scale out Storage, Metadata Service, Naming Service, Azure ML, Microsoft R Server
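  • 23. Comparison between storage options (block based options: VHDs on WASB, Premium Storage; filesystem based options: WASB, ADLS)
      Maximum volume: VHDs on WASB 4 TB per disk; Premium Storage 4 TB per disk; WASB 500 TB; ADLS no limit (tested > exabytes)
      Maximum item size: VHDs on WASB N/A; Premium Storage N/A; WASB 4.75 TB; ADLS no limit (tested > petabytes)
      Physical media: VHDs on WASB HDD; Premium Storage Flash/SSD; WASB HDD; ADLS SSD + HDD
      Replication: VHDs on WASB LRS and GRS; Premium Storage none; WASB LRS and GRS; ADLS LRS
      Throughput: VHDs on WASB 60 MBps per disk; Premium Storage 250 MBps per disk; WASB 60 MBps per blob; ADLS extremely high
      RBAC: VHDs on WASB N/A; Premium Storage N/A; WASB N/A; ADLS POSIX compliant (file & folder level)
      Encryption: VHDs on WASB SSE or Azure Key Vault; Premium Storage N/A; WASB N/A; ADLS transparent (AES 256 + TLS 1.2)
      Workloads: VHDs on WASB any; Premium Storage any; WASB low TBs; ADLS >10 TBs
      Locations: VHDs on WASB all; Premium Storage most; WASB all; ADLS 4 and growing
      Source: https://docs.microsoft.com/en-us/azure/storage/storage-scalability-targets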
  • 24. Why Cloudera on Azure Data Lake Store? Separation of Compute & Storage: transient clusters for flexibility, lower TCO; shared storage for many optimized clusters. Diagram labels: Compute time by day (M T W R F S S), Data Lake Store
  • 25. ALTUS DEMO
  • 26. ALTUS ROADMAP
  • 27. ALTUS DATA ENGINEERING ROADMAP. Production Workflows: scheduling, orchestration; success/failure notifications; enhanced debugging; failure handling; dependency management. Developer Workflows: IDE + partner integration; CI/CD; Python SDK; interactive experience; Altus DS integration. Operational Efficiency: autoscaling, enhanced spot; SLA and cost management; workload automation.
  • 28. ALTUS ANALYTIC DATABASE ROADMAP. Platform: pause, resume, and resize for clusters; shrink with graceful shutdown; Altus SQL editor; autoscaling. Integrations: SDX; Workload XM; Navigator; Navigator Optimizer. Misc: UDF support; SQL CLI (impala-shell).
  • 29. ALTUS PLATFORM ROADMAP. Altus Self Service: self-service subscription. Platform: increased scalability; Java SDK for ADB, SDX. Security: identity federation; enhanced security.
  • 31. EVERYTHING YOU DON'T HAVE TO DO: FOCUS ON YOUR WORKLOADS, NOT THE CLUSTERS. Install any software to start working; install any hardware; worry about cluster configuration; upgrade/reconfigure clusters; OS upgrades/patching; resource management.
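
The peak bursting argument on slide 10 (transient nodes for peak workloads instead of always-on nodes) can be illustrated with a little arithmetic. The following is only a sketch under stated assumptions: the `Workload` class, the node counts, the peak hours, and the price per node-hour are all hypothetical and are not taken from the deck or from any Cloudera or Azure price list.

```python
from dataclasses import dataclass

HOURS_PER_WEEK = 24 * 7


@dataclass
class Workload:
    """Hypothetical weekly workload profile (all values are illustrative)."""
    baseline_nodes: int         # nodes needed outside peak hours
    peak_nodes: int             # nodes needed during peak hours
    peak_hours_per_week: float  # how long the peak lasts each week
    node_hour_cost: float       # assumed price per node-hour


def persistent_cost(w: Workload) -> float:
    """Weekly cost when the cluster is sized for peak and left always on."""
    return w.peak_nodes * HOURS_PER_WEEK * w.node_hour_cost


def elastic_cost(w: Workload) -> float:
    """Weekly cost when only baseline nodes are always on and the
    remaining nodes are transient, running during peak hours only."""
    always_on = w.baseline_nodes * HOURS_PER_WEEK * w.node_hour_cost
    burst_nodes = w.peak_nodes - w.baseline_nodes
    burst = burst_nodes * w.peak_hours_per_week * w.node_hour_cost
    return always_on + burst


# Made-up example: 4 baseline nodes, 20 at peak, 10 peak hours per week.
w = Workload(baseline_nodes=4, peak_nodes=20,
             peak_hours_per_week=10, node_hour_cost=0.5)
print(f"persistent: {persistent_cost(w):.2f}")
print(f"elastic:    {elastic_cost(w):.2f}")
```

The gap between the two figures grows with the difference between baseline and peak node counts and shrinks as the peak takes up more of the week, which is why the deck frames transient clusters as a fit for bursty workloads.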