DataWorks Summit 2019 - Barcelona
Audi's Hadoop Journey into the Hybrid Cloud
Carsten Herbe (Audi Business Innovation GmbH, Germany)
AUDI AG DataWorks Summit Barcelona 2019 - Audi‘s Hadoop Journey into the Hybrid Cloud – Carsten Herbe2
About us
Audi AG
1.8 million cars per year*, 90,000 employees worldwide*
* source: https://www.audi.com/de/company.html
Audi mobility innovations
Audi on demand
Audi balanced technologies
Audi e-gas
Audi customer IT solutions
Audi Business Innovation GmbH
Munich-based subsidiary of Audi AG
Carsten Herbe
Audi Business Innovation GmbH
» Data Platform & Solution Architecture
» Technical Product Owner & Architect for Cloud Hadoop
» 5 years Hadoop, 3 years Kafka, 1 year AWS
» 10+ years Data Warehousing & BI
HAAP – Hybrid Audi Analytic Platform
Big Data Capabilities & Focus data domains
• Data domains: Finance, Purchase, Production, Quality, Sales, Car Data
• Programs, Projects, Data Scientists
• Capability areas: Embed Analytics; Analyze Data; Store, Distribute and Process Data; Deliver Information; Security; Infrastructure & Services; Provision Data; Deliver Service; Manage Information; Design & Maintain Solutions
• Capabilities: Authentication; Data Encryption; Auditing; Complex Event Processing; Analytical APIs; Dashboarding; Planning & Simulation; Visual Analytics; BI Report & OLAP; Statistical Methods; Analytical Script; Machine Learning; Data Warehouse; Analytical Databases; File Systems (HDFS); Stream Processing; ETL Framework; Batch Processing; Data Access / APIs; On-Prem Platform; Cloud Platform; Application Deployment; Hardware, Network, OS; Monitoring; Lifecycle Mgmt; Development Process & Methods; Master Data Mgmt; Data Lineage
Why cloud?
Audi’s motivation to extend its Hadoop platform to the cloud
Data “Locality”
• Audi is moving many applications to the cloud
• Data of one important use case is already in the cloud
Scalability
• Scaling clusters: number of nodes, node types, …
• Scaling stages: testing new features, upgrades, …
Functionality
• Adding nodes with GPUs
• Use a more flexible staging process
• Cloud services: S3, RDS, Docker Registry, …
• Reducing work on infrastructure
Goals
One platform as a hybrid solution
Hybrid
• Some related systems are currently only on-premise: DWH, Reporting Tool, …
• Some data sources remain on-premise (e.g. manufacturing)
One platform
• Write once, run everywhere: identical tech stack
• Single sign-on: on-prem principals used for cloud
• Data: easy data movement & shared metadata
Project Setup
Team setup & project mode
Mixed Team
• Companies: internal (Audi + ABI) + external (2 partners + Hortonworks)
• Bases: 4 cities in 2 countries
• Nationalities: 5
Collaboration
• Scrum-based
• Weekly two-day on-site workshop at the Audi project office
• Tools: Jira, Bitbucket, RocketChat
Goals
• Get experts on various topics (DevOps, Hadoop, AWS) together
• Knowledge transfer from external to internal
Sprint Structure and on-site workshops
Week 1
• Day 1, 10:00: Check-in
• Day 2, 15:00: Check-out
Week 2
• Day 1, 10:00: Check-in
• Day 2, 15:00: Check-out
Week 3
• Day 1, 10:00: Review
• Day 1, 13:00: Retrospective
• Day 1, 15:00: Planning
• Day 2, 15:00: Check-out
Each week, on demand
• Merge Meeting/Call
• Design Meeting/Call
Diagram labels: alignment, on-prem team, co-location, review
Choice of Technologies
Finding the best-fitting tech stack for Audi
AWS Infrastructure setup
• Options: CloudFormation, Terraform
• Choice: Terraform
• Reason: already used by other projects
Configuration Management
• Options: Terraform + Bash, Ansible, …
• Choice: Ansible
• Reasons: switched from Bash as complexity increased; already used by other projects
Hadoop Deployment
• Options: Ambari Blueprints, Cloudbreak
• Choice: Ambari Blueprints
• Reasons: Cloudbreak is difficult to integrate into the existing environment; no versioning with Cloudbreak yet
User management
• Options: local users managed manually; integration with corporate AD/LDAP; our own FreeIPA
• Choice: FreeIPA
• Reasons: AD integration was not possible (yet); highest flexibility (+AD later); DNS, Certificate Authority
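To illustrate the Ambari Blueprints choice above, the following sketch writes a minimal blueprint file and prints the REST call that would register it with Ambari. It is a dry run under stated assumptions: the blueprint name `caap-dev`, the host `ambari.example`, the stack version and the two host groups are hypothetical and not taken from the talk. Only the `/api/v1/blueprints/<name>` endpoint and the `X-Requested-By` header follow the standard Ambari REST conventions. A real blueprint needs more components (for example ZooKeeper and clients) before Ambari would accept it.

```shell
#!/usr/bin/env bash

# Hypothetical values for illustration only (not from the talk).
BLUEPRINT_NAME="caap-dev"
AMBARI_URL="http://ambari.example:8080"
BLUEPRINT_FILE="$(mktemp)"

# Minimal blueprint: one master host group and one worker host group.
# Stack version and component placement are assumptions.
cat > "${BLUEPRINT_FILE}" <<'EOF'
{
  "Blueprints": { "stack_name": "HDP", "stack_version": "2.6" },
  "host_groups": [
    { "name": "master", "cardinality": "1",
      "components": [ { "name": "NAMENODE" }, { "name": "RESOURCEMANAGER" } ] },
    { "name": "worker", "cardinality": "3",
      "components": [ { "name": "DATANODE" }, { "name": "NODEMANAGER" } ] }
  ]
}
EOF

# Print (do not run) the call that would register the blueprint with Ambari.
register_cmd() {
  echo "curl -u admin -H 'X-Requested-By: ambari' -X POST" \
       "-d @${BLUEPRINT_FILE} ${AMBARI_URL}/api/v1/blueprints/${BLUEPRINT_NAME}"
}

register_cmd
```

Because the blueprint is a plain JSON file, it can be kept in version control next to the Terraform and Ansible code, which fits the versioning concern listed for Cloudbreak.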
Hybrid Architecture
HAAP Architecture – Big Picture
[Architecture diagram; labels shown:]
• Environments: on premise, public cloud
• Zones, VPCs and regions: AAP messaging zone, AAP data zone, AAP BI App Zone, AWS Frankfurt – CAAP VPC, AWS Frankfurt – Hub VPC, AWS Ireland
• Components: Kafka (2x), Data Warehouse, Tableau, HDP, KDC (3x), Splunk, Deploy Automation, CAAP, FreeIPA
• Network: FW XTR (2x), FW LSZ (2x), FW Cloud, DXC
High-level AWS network architecture
[Network diagram; labels shown:]
• Sides: Cloud, On-Premise
• VPCs: hub VPC, Spoke VPC A, Spoke VPC B, Spoke VPC C, Spoke VPC D
• Connectivity: Cisco Router, Direct Connect, VPG (4x), FW Cloud, WAN Distri
Cloud Hadoop Platform: detailed view
[Deployment diagram; labels shown:]
• VPCs: blue VPC, hub VPC, mgmt VPC
• Subnets: mgmt public subnet, mgmt private subnet, blue public subnet, blue private hdp subnet, blue private rds subnet
• Hosts: bastion, deploy, FreeIPA, Ambari, KDC, Edge 1, Master 1, Data 1, Data 2, Data 3, LLAP 1
• Security groups: SG bastion, SG deploy, SG edge, SG IDM, SG master, SG worker, SG Ambari, SG KDC, SG hdp
• Connectivity: Cisco Router, DXC, IGW (2x), NAT GW (2x), VPG (2x), S3 endpoint (2x)
• AWS services: RDS Postgres, ECR registry, S3 (terraform state, backup, projects), CloudWatch, CloudTrail, IAM
User Management & Kerberos Trust
Kerberos realms
• Cloud DEV: MIT KDC, DEV.CAAP.AUDI.VWG
• Cloud PRD: MIT KDC, PRD.CAAP.AUDI.VWG
• FreeIPA: KDC, CAAP.AUDI.VWG
• on-prem DEV: MIT KDC, DEV.AUDI.VWG
• on-prem PRD: MIT KDC, PRD.AUDI.VWG
• Four one-way trusts connect the realms
LDAP
• carsten: <dev>
• carsten-adm: <dev, prd>
OS user management (two clusters each)
• local user mgmt
• FreeIPA user integration
Example
> kinit carsten@DEV.AUDI.VWG
> hdfs dfs -ls //ONPREMDEV:8020/user/carsten
> hdfs dfs -ls //CLOUDDEV:8020/user/carsten
> kinit carsten@CAAP.AUDI.VWG
> hdfs dfs -ls //CLOUDDEV:8020/user/carsten
> hdfs dfs -ls //CLOUDPRD:8020/user/carsten
> hdfs dfs -ls //ONPREMDEV:8020/user/carsten
Result marks shown on the slide: ✓ ✗ ✓ ✓ ✓ ✓ ✗
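The command sequence on this slide can be generated for any principal and list of clusters. The sketch below only prints the commands, so it runs without a KDC or a NameNode. The function name `print_access_checks` is hypothetical; the NameNode port 8020, the `//CLUSTER:8020` address form and the `/user/<name>` home directory are copied from the slide's example. Whether each listing succeeds depends on the trust configuration, which this sketch does not model.

```shell
#!/usr/bin/env bash

# Print the kinit and hdfs commands for one principal and a list of clusters.
# Dry run only: nothing is executed against a KDC or a NameNode.
print_access_checks() {
  local principal="$1"; shift
  local user="${principal%%@*}"   # strip the realm: carsten@REALM -> carsten
  echo "kinit ${principal}"
  local cluster
  for cluster in "$@"; do
    echo "hdfs dfs -ls //${cluster}:8020/user/${user}"
  done
}

# Cluster names as used on the slide.
print_access_checks "carsten@CAAP.AUDI.VWG" CLOUDDEV CLOUDPRD ONPREMDEV
```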
Lessons learned
With great freedom come great responsibilities …
Cloud
• You can do anything you want right away!
• But you have to do it yourself: e.g. DNS, LDAP, …
• Automation pays off but requires an initial investment
• Security must be considered from the start
Project setup
• Agile
• Strong involvement of the product owner required
• Distributed teams cost a lot of travelling time
• Different experts required: Cloud (AWS), Networking, DevOps, Hadoop, …
• Fluctuation: distribute knowledge
Looking into the Future
Staging process for projects and platform
[Staging diagram; labels shown:]
• <platform>: feature A, DEV, feature B
• <projects>: PRD, DEV & INT, INT, PRD, DEV, INT
Technologies on the road map
Cloud
• On-demand nodes with GPU for machine learning
• S3/Glacier for “cold” data
• Looking into Kafka as a Service (Confluent, AWS)
Data Plane
• Data Steward Service for hybrid Data Governance
• Data Lifecycle Manager for data transfers and backup
HDP 3.x
• Using Docker under YARN for more flexibility/functionality
• Hive 3 Kafka Integration
Machine Learning
• On-demand nodes with GPU for machine learning
• Data Science Workbench
WE ARE HIRING
https://www.audi.com/corporate/de/karriere/einstieg-bei-audi.html
https://karriere.audibusinessinnovation.com/

Contenu connexe

Tendances

IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...Mark Rittman
 
Introduction to Apache NiFi dws19 DWS - DC 2019
Introduction to Apache NiFi   dws19 DWS - DC 2019Introduction to Apache NiFi   dws19 DWS - DC 2019
Introduction to Apache NiFi dws19 DWS - DC 2019Timothy Spann
 
Building Fast Applications for Streaming Data
Building Fast Applications for Streaming DataBuilding Fast Applications for Streaming Data
Building Fast Applications for Streaming Datafreshdatabos
 
Your Self-Driving Car - How Did it Get So Smart?
Your Self-Driving Car - How Did it Get So Smart?Your Self-Driving Car - How Did it Get So Smart?
Your Self-Driving Car - How Did it Get So Smart?Hortonworks
 
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizon
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizonHadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizon
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizonDataWorks Summit/Hadoop Summit
 
Insights into Real World Data Management Challenges
Insights into Real World Data Management ChallengesInsights into Real World Data Management Challenges
Insights into Real World Data Management ChallengesDataWorks Summit
 
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadCloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadDataWorks Summit
 
How Apache Spark and Apache Hadoop are being used to keep banking regulators ...
How Apache Spark and Apache Hadoop are being used to keep banking regulators ...How Apache Spark and Apache Hadoop are being used to keep banking regulators ...
How Apache Spark and Apache Hadoop are being used to keep banking regulators ...DataWorks Summit
 
High Performance Spatial-Temporal Trajectory Analysis with Spark
High Performance Spatial-Temporal Trajectory Analysis with Spark High Performance Spatial-Temporal Trajectory Analysis with Spark
High Performance Spatial-Temporal Trajectory Analysis with Spark DataWorks Summit/Hadoop Summit
 
Bridging the gap: achieving fast data synchronization from SAP HANA by levera...
Bridging the gap: achieving fast data synchronization from SAP HANA by levera...Bridging the gap: achieving fast data synchronization from SAP HANA by levera...
Bridging the gap: achieving fast data synchronization from SAP HANA by levera...DataWorks Summit
 
Tracking crime as it occurs with apache phoenix, apache hbase and apache nifi
Tracking crime as it occurs with apache phoenix, apache hbase and apache nifiTracking crime as it occurs with apache phoenix, apache hbase and apache nifi
Tracking crime as it occurs with apache phoenix, apache hbase and apache nifiTimothy Spann
 
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...DataWorks Summit
 
Innovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data WarehouseInnovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data WarehouseDataWorks Summit
 
Meetup Oracle Database MAD: 2.1 Data Management Trends: SQL, NoSQL y Big Data
Meetup Oracle Database MAD: 2.1 Data Management Trends: SQL, NoSQL y Big Data Meetup Oracle Database MAD: 2.1 Data Management Trends: SQL, NoSQL y Big Data
Meetup Oracle Database MAD: 2.1 Data Management Trends: SQL, NoSQL y Big Data avanttic Consultoría Tecnológica
 
Airline reservations and routing: a graph use case
Airline reservations and routing: a graph use caseAirline reservations and routing: a graph use case
Airline reservations and routing: a graph use caseDataWorks Summit
 
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Precisely
 
Pivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewPivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewVMware Tanzu
 
The Destiny of Data
The Destiny of DataThe Destiny of Data
The Destiny of DataHortonworks
 

Tendances (20)

IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
IlOUG Tech Days 2016 - Big Data for Oracle Developers - Towards Spark, Real-T...
 
Introduction to Apache NiFi dws19 DWS - DC 2019
Introduction to Apache NiFi   dws19 DWS - DC 2019Introduction to Apache NiFi   dws19 DWS - DC 2019
Introduction to Apache NiFi dws19 DWS - DC 2019
 
Building Fast Applications for Streaming Data
Building Fast Applications for Streaming DataBuilding Fast Applications for Streaming Data
Building Fast Applications for Streaming Data
 
Your Self-Driving Car - How Did it Get So Smart?
Your Self-Driving Car - How Did it Get So Smart?Your Self-Driving Car - How Did it Get So Smart?
Your Self-Driving Car - How Did it Get So Smart?
 
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizon
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizonHadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizon
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizon
 
Insights into Real World Data Management Challenges
Insights into Real World Data Management ChallengesInsights into Real World Data Management Challenges
Insights into Real World Data Management Challenges
 
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road AheadCloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
 
On Demand HDP Clusters using Cloudbreak and Ambari
On Demand HDP Clusters using Cloudbreak and AmbariOn Demand HDP Clusters using Cloudbreak and Ambari
On Demand HDP Clusters using Cloudbreak and Ambari
 
How Apache Spark and Apache Hadoop are being used to keep banking regulators ...
How Apache Spark and Apache Hadoop are being used to keep banking regulators ...How Apache Spark and Apache Hadoop are being used to keep banking regulators ...
How Apache Spark and Apache Hadoop are being used to keep banking regulators ...
 
High Performance Spatial-Temporal Trajectory Analysis with Spark
High Performance Spatial-Temporal Trajectory Analysis with Spark High Performance Spatial-Temporal Trajectory Analysis with Spark
High Performance Spatial-Temporal Trajectory Analysis with Spark
 
Bridging the gap: achieving fast data synchronization from SAP HANA by levera...
Bridging the gap: achieving fast data synchronization from SAP HANA by levera...Bridging the gap: achieving fast data synchronization from SAP HANA by levera...
Bridging the gap: achieving fast data synchronization from SAP HANA by levera...
 
Tracking crime as it occurs with apache phoenix, apache hbase and apache nifi
Tracking crime as it occurs with apache phoenix, apache hbase and apache nifiTracking crime as it occurs with apache phoenix, apache hbase and apache nifi
Tracking crime as it occurs with apache phoenix, apache hbase and apache nifi
 
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
 
Innovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data WarehouseInnovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data Warehouse
 
Meetup Oracle Database MAD: 2.1 Data Management Trends: SQL, NoSQL y Big Data
Meetup Oracle Database MAD: 2.1 Data Management Trends: SQL, NoSQL y Big Data Meetup Oracle Database MAD: 2.1 Data Management Trends: SQL, NoSQL y Big Data
Meetup Oracle Database MAD: 2.1 Data Management Trends: SQL, NoSQL y Big Data
 
Airline reservations and routing: a graph use case
Airline reservations and routing: a graph use caseAirline reservations and routing: a graph use case
Airline reservations and routing: a graph use case
 
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: To...
 
Pivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical OverviewPivotal Big Data Suite: A Technical Overview
Pivotal Big Data Suite: A Technical Overview
 
Big Data at your Desk with KNIME
Big Data at your Desk with KNIMEBig Data at your Desk with KNIME
Big Data at your Desk with KNIME
 
The Destiny of Data
The Destiny of DataThe Destiny of Data
The Destiny of Data
 

Similaire à Audi's Hybrid Hadoop Journey

SQLSaturday #230 - Introduction to Microsoft Big Data (Part 1)
SQLSaturday #230 - Introduction to Microsoft Big Data (Part 1)SQLSaturday #230 - Introduction to Microsoft Big Data (Part 1)
SQLSaturday #230 - Introduction to Microsoft Big Data (Part 1)Sascha Dittmann
 
Cascading User Group Meet
Cascading User Group MeetCascading User Group Meet
Cascading User Group MeetVinoth Kannan
 
Apidays Paris 2023 - Harmonizing Cloud Horizons: A Journey to a Resilient Hyb...
Apidays Paris 2023 - Harmonizing Cloud Horizons: A Journey to a Resilient Hyb...Apidays Paris 2023 - Harmonizing Cloud Horizons: A Journey to a Resilient Hyb...
Apidays Paris 2023 - Harmonizing Cloud Horizons: A Journey to a Resilient Hyb...apidays
 
Journey to the Cloud with Red Hat
Journey to the Cloud with Red HatJourney to the Cloud with Red Hat
Journey to the Cloud with Red HatKen Thompson
 
Hadoop Desktop Cluster
Hadoop Desktop ClusterHadoop Desktop Cluster
Hadoop Desktop ClusterPaul Morse
 
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...Openbar
 
Deep dive into Google Cloud for Big Data
Deep dive into Google Cloud for Big DataDeep dive into Google Cloud for Big Data
Deep dive into Google Cloud for Big DataTu Le Dinh
 
Big data on google cloud
Big data on google cloudBig data on google cloud
Big data on google cloudTu Pham
 
Hadoop acm presentation
Hadoop acm presentationHadoop acm presentation
Hadoop acm presentationBrad Sarsfield
 
Red Hat Forum Poland 2019 - Red Hat Open Hybrid Cloud (keynote)
Red Hat Forum Poland 2019 - Red Hat Open Hybrid Cloud (keynote)Red Hat Forum Poland 2019 - Red Hat Open Hybrid Cloud (keynote)
Red Hat Forum Poland 2019 - Red Hat Open Hybrid Cloud (keynote)Eric D. Schabell
 
NA Adabas & Natural User Group Meeting April 2023
NA Adabas & Natural User Group Meeting April 2023NA Adabas & Natural User Group Meeting April 2023
NA Adabas & Natural User Group Meeting April 2023Software AG
 
Accelerating Innovation with Hybrid Cloud
Accelerating Innovation with Hybrid CloudAccelerating Innovation with Hybrid Cloud
Accelerating Innovation with Hybrid CloudJeff Jakubiak
 
SAP Cloud Platform - Integration, Extensibility & Services
SAP Cloud Platform - Integration, Extensibility & ServicesSAP Cloud Platform - Integration, Extensibility & Services
SAP Cloud Platform - Integration, Extensibility & ServicesAndrew Harding
 
Data & Analytics - Session 1 - Big Data Analytics
Data & Analytics - Session 1 -  Big Data AnalyticsData & Analytics - Session 1 -  Big Data Analytics
Data & Analytics - Session 1 - Big Data AnalyticsAmazon Web Services
 
Google Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoop
Google Cloud Dataproc - Easier, faster, more cost-effective Spark and HadoopGoogle Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoop
Google Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoophuguk
 
High Performance Big Data Loading for AWS: Deep Dive and Best Practices from ...
High Performance Big Data Loading for AWS: Deep Dive and Best Practices from ...High Performance Big Data Loading for AWS: Deep Dive and Best Practices from ...
High Performance Big Data Loading for AWS: Deep Dive and Best Practices from ...Amazon Web Services
 
Proposte ORACLE per la gestione dei contenuti digitali e per la ricerca scien...
Proposte ORACLE per la gestione dei contenuti digitali e per la ricerca scien...Proposte ORACLE per la gestione dei contenuti digitali e per la ricerca scien...
Proposte ORACLE per la gestione dei contenuti digitali e per la ricerca scien...Jürgen Ambrosi
 
Machine Learning in the Enterprise 2019
Machine Learning in the Enterprise 2019   Machine Learning in the Enterprise 2019
Machine Learning in the Enterprise 2019 Timothy Spann
 

Similaire à Audi's Hybrid Hadoop Journey (20)

SQLSaturday #230 - Introduction to Microsoft Big Data (Part 1)
SQLSaturday #230 - Introduction to Microsoft Big Data (Part 1)SQLSaturday #230 - Introduction to Microsoft Big Data (Part 1)
SQLSaturday #230 - Introduction to Microsoft Big Data (Part 1)
 
Cascading User Group Meet
Cascading User Group MeetCascading User Group Meet
Cascading User Group Meet
 
Apidays Paris 2023 - Harmonizing Cloud Horizons: A Journey to a Resilient Hyb...
Apidays Paris 2023 - Harmonizing Cloud Horizons: A Journey to a Resilient Hyb...Apidays Paris 2023 - Harmonizing Cloud Horizons: A Journey to a Resilient Hyb...
Apidays Paris 2023 - Harmonizing Cloud Horizons: A Journey to a Resilient Hyb...
 
Journey to the Cloud with Red Hat
Journey to the Cloud with Red HatJourney to the Cloud with Red Hat
Journey to the Cloud with Red Hat
 
Hadoop Desktop Cluster
Hadoop Desktop ClusterHadoop Desktop Cluster
Hadoop Desktop Cluster
 
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...
Openbar Kontich // Google Cloud: past, present and the (oh so sweet) future b...
 
Build a Cloud Day Paris
Build a Cloud Day ParisBuild a Cloud Day Paris
Build a Cloud Day Paris
 
Deep dive into Google Cloud for Big Data
Deep dive into Google Cloud for Big DataDeep dive into Google Cloud for Big Data
Deep dive into Google Cloud for Big Data
 
Big data on google cloud
Big data on google cloudBig data on google cloud
Big data on google cloud
 
Hadoop acm presentation
Hadoop acm presentationHadoop acm presentation
Hadoop acm presentation
 
Apresentação Hadoop
Apresentação HadoopApresentação Hadoop
Apresentação Hadoop
 
Red Hat Forum Poland 2019 - Red Hat Open Hybrid Cloud (keynote)
Red Hat Forum Poland 2019 - Red Hat Open Hybrid Cloud (keynote)Red Hat Forum Poland 2019 - Red Hat Open Hybrid Cloud (keynote)
Red Hat Forum Poland 2019 - Red Hat Open Hybrid Cloud (keynote)
 
NA Adabas & Natural User Group Meeting April 2023
NA Adabas & Natural User Group Meeting April 2023NA Adabas & Natural User Group Meeting April 2023
NA Adabas & Natural User Group Meeting April 2023
 
Accelerating Innovation with Hybrid Cloud
Accelerating Innovation with Hybrid CloudAccelerating Innovation with Hybrid Cloud
Accelerating Innovation with Hybrid Cloud
 
SAP Cloud Platform - Integration, Extensibility & Services
SAP Cloud Platform - Integration, Extensibility & ServicesSAP Cloud Platform - Integration, Extensibility & Services
SAP Cloud Platform - Integration, Extensibility & Services
 
Data & Analytics - Session 1 - Big Data Analytics
Data & Analytics - Session 1 -  Big Data AnalyticsData & Analytics - Session 1 -  Big Data Analytics
Data & Analytics - Session 1 - Big Data Analytics
 
Google Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoop
Google Cloud Dataproc - Easier, faster, more cost-effective Spark and HadoopGoogle Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoop
Google Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoop
 
High Performance Big Data Loading for AWS: Deep Dive and Best Practices from ...
High Performance Big Data Loading for AWS: Deep Dive and Best Practices from ...High Performance Big Data Loading for AWS: Deep Dive and Best Practices from ...
High Performance Big Data Loading for AWS: Deep Dive and Best Practices from ...
 
Proposte ORACLE per la gestione dei contenuti digitali e per la ricerca scien...
Proposte ORACLE per la gestione dei contenuti digitali e per la ricerca scien...Proposte ORACLE per la gestione dei contenuti digitali e per la ricerca scien...
Proposte ORACLE per la gestione dei contenuti digitali e per la ricerca scien...
 
Machine Learning in the Enterprise 2019
Machine Learning in the Enterprise 2019   Machine Learning in the Enterprise 2019
Machine Learning in the Enterprise 2019
 

Plus de DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Audi's Hybrid Hadoop Journey

  • 1. DataWorks Summit 2019 - Barcelona. Audi's Hadoop Journey into the Hybrid Cloud. Carsten Herbe (Audi Business Innovation GmbH, Germany)
  • 2. About us (slide footer: AUDI AG, DataWorks Summit Barcelona 2019, Audi's Hadoop Journey into the Hybrid Cloud, Carsten Herbe)
  • 3. Audi AG: 1.8 million cars per year*, 90,000 employees worldwide* (* source: https://www.audi.com/de/company.html)
  • 4. Audi Business Innovation GmbH: Munich-based subsidiary of Audi AG
      Shown on the slide: Audi mobility innovations, Audi on demand, Audi balanced technologies, Audi e-gas, Audi customer IT solutions
      Speaker: Carsten Herbe, Audi Business Innovation GmbH
        » Data Platform & Solution Architecture
        » Technical Product Owner & Architect for Cloud Hadoop
        » 5 years Hadoop, 3 years Kafka, 1 year AWS
        » 10+ years Data Warehousing & BI
  • 5. HAAP - Hybrid Audi Analytic Platform: Big Data Capabilities & Focus data domains (capability map diagram)
      Data domains: Finance; Purchase; Production; Quality; Sales; Car Data
      Users: Programs; Projects; Data Scientists
      Capability areas: Embed Analytics; Analyze Data; Store, Distribute and Process Data; Deliver Information; Security; Infrastructure & Services; Provision Data; Deliver Service; Manage Information; Design & Maintain Solutions
      Capabilities: Authentication; Data Encryption; Auditing; Complex Event Processing; Analytical APIs; Dashboarding; Planning & Simulation; Visual Analytics; BI Report & OLAP; Statistical Methods; Analytical Script; Data Warehouse; Analytical Databases; ETL Framework; Batch Processing; Data Access / APIs; On-Prem Platform; Cloud Platform; Application Deployment; Hardware, Network, OS; Monitoring; Lifecycle Mgmt; Development Process & Methods; Master Data Mgmt; Data Lineage; File Systems (HDFS); Stream Processing; Machine Learning
  • 6. Why cloud?
  • 7. Audi's motivation to extend its Hadoop platform to the cloud
      Data "Locality"
        • Audi is moving many applications to the cloud
        • Data of one important use case is already in the cloud
      Scalability
        • Scaling clusters: number of nodes, node types, …
        • Scaling stages: testing new features, upgrades, …
      Functionality
        • Adding nodes with GPUs
        • Use a more flexible staging process
        • Cloud services: S3, RDS, Docker Registry, …
        • Reducing work on infrastructure
  • 8. Goals: One platform as a hybrid solution
      Hybrid
        • Some related systems are currently on-premise only: DWH, reporting tool, …
        • Some data sources remain on-premise (e.g. manufacturing)
      One platform
        • Write once, run everywhere: identical tech stack
        • Single sign-on: on-prem principals used for cloud
        • Data: easy data movement & shared metadata
  • 9. Project Setup
  • 10. Team setup & project mode
      Mixed Team
        • Companies: internal (Audi + ABI) + external (2 partners + HWX)
        • Bases: 4 cities in 2 countries
        • Nationalities: 5
      Collaboration
        • Scrum-based
        • Weekly 2-day on-site workshop at the Audi project office
        • Tools: Jira, Bitbucket, RocketChat
      Goals
        • Bring experts on various topics (DevOps, Hadoop, AWS) together
        • Knowledge transfer from external to internal staff
  • 11. Sprint Structure and on-site workshops
      Week 1: Day 1, 10:00 check-in; Day 2, 15:00 check-out
      Week 2: Day 1, 10:00 check-in; Day 2, 15:00 check-out
      Week 3: Day 1, 10:00 review, 13:00 retrospective, 15:00 planning; Day 2, 15:00 check-out
      Every week: on-demand merge meeting/call, design meeting/call
      Further diagram labels: alignment on-prem team, co-location, Review
  • 12. Choice of Technologies
  • 13. Finding the best fitting tech stack for Audi
      AWS infrastructure setup
        • Options: CloudFormation, Terraform
        • Choice: Terraform
        • Reason: already used by other projects
      Configuration management
        • Options: Terraform + Bash, Ansible, …
        • Choice: Ansible
        • Reasons: switched from Bash as complexity increased; already used by other projects
      Hadoop deployment
        • Options: Ambari Blueprints, Cloudbreak
        • Choice: Ambari Blueprints
        • Reasons: Cloudbreak is difficult to integrate into the existing environment; no versioning with Cloudbreak yet
      User management
        • Options: local users managed manually, integration with corporate AD/LDAP, our own FreeIPA
        • Choice: FreeIPA
        • Reasons: AD integration was not possible (yet); highest flexibility (+AD later); DNS, Certificate Authority
  • 14. Hybrid Architecture
  • 15. HAAP Architecture - Big Picture (architecture diagram)
      Diagram labels, in extraction order: FW XTR, AAP messaging zone, AAP data zone, Kafka, Data Warehouse, AAP BI App Zone, Tableau, FW LSZ, FW LSZ, on premise, KDC, HDP, KDC, Splunk, FW XTR, AWS Frankfurt - CAAP VPC, AWS Ireland, Kafka, Deploy Automation, AWS Frankfurt - Hub VPC, public cloud, CAAP, KDC, FreeIPA, FW Cloud, DXC
  • 16. High-level AWS network architecture (network diagram)
      Diagram labels: Cloud, On-Premise, hub VPC, Spoke VPC A, Spoke VPC B, Spoke VPC C, Spoke VPC D, VPG (×4), Direct Connect, Cisco Router, FW Cloud, WAN Distri
  • 17. Cloud Hadoop Platform: detailed view (architecture diagram)
      VPCs: mgmt VPC, hub VPC, blue VPC
      Subnets: mgmt public subnet, mgmt private subnet, blue public subnet, blue private hdp subnet, blue private rds subnet
      Hosts: bastion, deploy, FreeIPA, Ambari, KDC, Edge 1, Master 1, Data 1, Data 2, Data 3, LLAP 1
      Security groups: SG bastion, SG deploy, SG edge, SG IDM, SG master, SG worker, SG Ambari, SG KDC, SG hdp
      Network: Cisco Router, DXC, IGW (×2), NAT GW (×2), VPG (×2), S3 endpoint (×2)
      AWS services: RDS Postgres, ECR registry, S3 (terraform state, backup, projects), CloudWatch, CloudTrail, IAM
  • 18. User Management & Kerberos Trust
      Realms
        • Cloud DEV, MIT KDC: DEV.CAAP.AUDI.VWG
        • Cloud PRD, MIT KDC: PRD.CAAP.AUDI.VWG
        • FreeIPA KDC: CAAP.AUDI.VWG
        • on-prem DEV, MIT KDC: DEV.AUDI.VWG
        • on-prem PRD, MIT KDC: PRD.AUDI.VWG
      Trusts: one-way trust (×4)
      LDAP: carsten: <dev>; carsten-adm: <dev, prd>
      OS: local user mgmt (×2); FreeIPA user integration (×2)
      Commands shown on the slide
        > kinit carsten@DEV.AUDI.VWG
        > hdfs dfs -ls //ONPREMDEV:8020/user/carsten
        > hdfs dfs -ls //CLOUDDEV:8020/user/carsten
        > kinit carsten@CAAP.AUDI.VWG
        > hdfs dfs -ls //CLOUDDEV:8020/user/carsten
        > hdfs dfs -ls //CLOUDPRD:8020/user/carsten
        > hdfs dfs -ls //ONPREMDEV:8020/user/carsten
      Result marks, in extraction order: ✓ ✗ ✓ ✓ ✓ ✓ ✗
  • 19. Lessons learned
  • 20. With great freedom come great responsibilities …
      Cloud
        • You can do anything you want right away!
        • But you have to do it yourself: e.g. DNS, LDAP, …
        • Automation pays off but requires an initial investment
        • Security must be considered from the start
      Project setup
        • Agile
        • Strong involvement of the product owner required
        • Distributed teams cost a lot of travel time
        • Different experts required: Cloud (AWS), Networking, DevOps, Hadoop, …
        • Staff turnover: distribute knowledge
  • 21. Looking into the Future
  • 22. Staging process for projects and platform (staging diagram)
      Diagram labels, in extraction order: PRD <projects>, feature A <platform>, DEV <platform>, feature B <platform>, DEV & INT <projects>, INT <projects>, PRD <projects>, DEV <projects>, INT <projects>
  • 23. Technologies on the road map
      Cloud
        • On-demand nodes with GPU for machine learning
        • S3/Glacier for "cold" data
        • Looking into Kafka as a Service (Confluent, AWS)
      Data Plane
        • Data Steward Service for hybrid data governance
        • Data Lifecycle Manager for data transfers and backup
      HDP3.x
        • Using Docker under YARN for more flexibility/functionality
        • Hive 3 Kafka integration
      Machine Learning
        • On-demand nodes with GPU for machine learning
        • Data Science Workbench
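
Slide 13 names Ambari Blueprints as the Hadoop deployment choice, partly because Cloudbreak offered no versioning yet. As a minimal sketch of why a blueprint versions well, the Python below builds a deliberately incomplete blueprint as a plain dictionary and serializes it deterministically. The blueprint name, the stack version, and the selection of components are assumptions for illustration; the host group names only loosely follow the Edge, Master and Data roles on slide 17, and a deployable cluster would need more components and configuration than shown here.

```python
import json


def build_blueprint(name, stack_version, data_nodes=3):
    """Return a deliberately incomplete Ambari blueprint as a plain dict.

    The component lists are an illustrative subset, not a deployable layout.
    """
    if data_nodes < 1:
        raise ValueError("data_nodes must be at least 1")
    return {
        "Blueprints": {
            "blueprint_name": name,          # hypothetical name
            "stack_name": "HDP",
            "stack_version": stack_version,  # hypothetical version
        },
        "host_groups": [
            {
                "name": "master",
                "cardinality": "1",
                "components": [
                    {"name": "NAMENODE"},
                    {"name": "RESOURCEMANAGER"},
                    {"name": "ZOOKEEPER_SERVER"},
                ],
            },
            {
                "name": "data",
                "cardinality": str(data_nodes),
                "components": [{"name": "DATANODE"}, {"name": "NODEMANAGER"}],
            },
            {
                "name": "edge",
                "cardinality": "1",
                "components": [{"name": "HDFS_CLIENT"}, {"name": "YARN_CLIENT"}],
            },
        ],
    }


def to_versionable_json(blueprint):
    """Serialize with sorted keys so that diffs in version control stay small."""
    return json.dumps(blueprint, indent=2, sort_keys=True) + "\n"


if __name__ == "__main__":
    print(to_versionable_json(build_blueprint("example-dev", "2.6")))
```

Because the output is plain JSON with a stable key order, the file can be committed next to the Terraform and Ansible code and reviewed like any other change.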
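
Slide 18 connects several Kerberos realms through one-way trusts. The following Python sketch illustrates only the general idea of a one-way trust: a service realm accepts its own principals plus principals from realms it explicitly trusts, and the reverse direction does not follow. The realm names and the trust edge are hypothetical placeholders, not Audi's configuration, since the transcript does not preserve the direction of the arrows. The check is direct only; transitive trust paths, which Kerberos can also support, are left out.

```python
from typing import Dict, Set

# Hypothetical trust map: service realm -> foreign realms whose principals
# it accepts. These names and edges are assumptions for illustration only.
TRUSTS: Dict[str, Set[str]] = {
    "DEV.CLUSTER.EXAMPLE.ORG": {"IDM.EXAMPLE.ORG"},
}


def realm_of(principal: str) -> str:
    """Return the realm part of a principal such as 'user@REALM'."""
    user, sep, realm = principal.rpartition("@")
    if not sep or not user or not realm:
        raise ValueError(f"not a valid principal: {principal!r}")
    return realm


def can_authenticate(principal: str, service_realm: str,
                     trusts: Dict[str, Set[str]] = TRUSTS) -> bool:
    """True if the service realm is the principal's own realm or trusts it directly."""
    realm = realm_of(principal)
    return realm == service_realm or realm in trusts.get(service_realm, set())


if __name__ == "__main__":
    # A user from the trusted realm reaches the cluster realm ...
    print(can_authenticate("alice@IDM.EXAMPLE.ORG", "DEV.CLUSTER.EXAMPLE.ORG"))
    # ... but a cluster-realm principal is not accepted in the user realm.
    print(can_authenticate("svc@DEV.CLUSTER.EXAMPLE.ORG", "IDM.EXAMPLE.ORG"))
```

Authentication is only the first gate. Whether an authenticated user may read a given HDFS path still depends on authorization, for example group membership like the LDAP entries on the slide.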
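
The road map on slide 23 mentions S3/Glacier for "cold" data. One common way to express such tiering is an S3 lifecycle rule that transitions objects to the GLACIER storage class after a number of days. The sketch below only builds the rule document as a Python dictionary; the prefix, the day count, and the rule ID are hypothetical, and nothing here is taken from Audi's setup.

```python
def glacier_lifecycle(prefix, days, rule_id="cold-data-to-glacier"):
    """Build an S3 lifecycle configuration that moves objects under
    `prefix` to the GLACIER storage class after `days` days.

    All argument values used in the example are illustrative assumptions.
    """
    # This sketch accepts only positive whole days.
    if not isinstance(days, int) or days <= 0:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical prefix and retention period.
    print(glacier_lifecycle("archive/", 180))
```

With boto3 installed and credentials configured, a dictionary of this shape can be passed as `LifecycleConfiguration` to the S3 client's `put_bucket_lifecycle_configuration` call together with a `Bucket` name.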