© OPITZ CONSULTING 2018
Information classification: Public
Überraschend mehr Möglichkeiten (Surprisingly more possibilities)
From Theory to Practice: Big Data Stories from the Field
Matthias Diekstall, Roland Wammers, Manuel Marowski
Agenda
1. DWH Modernization with AWS BigData
2. Advanced Analytics & Complex Event Processing at congstar
3. Stream Analytics & Machine Learning with AWS OC Quickstarter
1. DWH Modernization with AWS BigData at an Insurance Company
 Once upon a Time …
 Defined Targets
 Challenges
 Our Proposal
 Technical Implementation
 … and they lived happily ever after
Once upon a Time …
 Mid-sized insurance company
 6000 Employees
 4 M Clients
 14 M Contracts
 3.2 B EUR in Revenues
 Enterprise DWH established
 Standard Reporting in place
 Data Mining in a few departments
 Using MS Excel mostly
 Partially R desktop usage
Defined Targets
 Get a feeling for new technologies (Hadoop Ecosystem)
 Learn their approach to data processing
 Low investment
 "Big Data Test Drive"
 Increase flexibility for data sources
 Enable self service for departments on a larger scale
Challenges
 No tangible use case initially
 No decision regarding products/license model
 No good grasp of the fundamental concepts of Big Data technologies
 Few resources for driving this project
 No hardware available (short-term)
 Direct connectivity to source systems questionable
Our Proposal
 Quick start with a cloud-based solution
 Start small and allow for growth
 Allow a wide variety of technologies without having to dedicate resources to administration and operation
 To be more precise:
 Prepare environment for easy startup
 Train/coach employees in essential aspects
 Use AWS technologies
Technical Implementation
 AWS IAM for user management
 AWS S3 for data storage
 AWS EMR as the basis for data processing
 Hive
 Pig
 Spark
 Python
 Zeppelin as graphical frontend
 Augmented with RStudio
 Mini Tutorials for users
AWS Mini tutorials for users
… and they lived happily ever after
 Results
 Targets achieved at minimal cost (< $500 in ~ 3 months)
 Competency development
 Better understanding of "how it works"
 Lessons learned
 Focus on as few tools as possible
 Create simple step-by-step tutorials
 Even a hypothetical use case is better than none
2. Advanced Analytics & Complex Event Processing at congstar
 First Thoughts
 Creating the Base
 Working with the Data
 First Steps to Advanced Analytics
congstar GmbH
 Subsidiary of Telekom Deutschland GmbH
 Founded in July 2007
 Sells mobile contracts and DSL
 Over 4,500,000 customers
Motivation
 Better understanding of the user
 Improve the user experience
 Enhance existing systems
 Being prepared for future requirements
 Create new content in reasonable time
Challenges
 Building a big data system for advanced analytics and complex event processing in AWS
 Find the right technologies in the Hadoop ecosystem
 Find suitable AWS services
 Keeping the costs low
 Provisioning
 Replacing old systems with new technology
 Secure data transfer between on-premises and AWS
 Live agile
Infrastructure as code
 Testing resources and services via AWS management console
 Creating CloudFormation templates
 Infrastructure as code
 Create stacks for development, test and production system
 Working with stacks
 Adjustments made in the code
 Diff of old and new code
 Rollback function in case of error
 Establishing a secure VPN connection
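The workflow above (one template, separate stacks per environment, diff of old and new code) can be sketched roughly as follows. This is only an illustration, not the project's actual templates: the bucket naming scheme, the `build_template` and `template_diff` helpers, and the choice of an S3 bucket as the only resource are assumptions. The template is built as a plain Python dict and rendered to JSON, so no AWS access is needed to try it.

```python
import difflib
import json


def build_template(environment, versioning=False):
    """Return a minimal CloudFormation template as a plain dict."""
    bucket_properties = {
        # Hypothetical naming scheme: one data bucket per environment.
        "BucketName": f"example-datalake-{environment}",
    }
    if versioning:
        bucket_properties["VersioningConfiguration"] = {"Status": "Enabled"}
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Data lake bucket for the {environment} stack",
        "Resources": {
            "DataBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": bucket_properties,
            }
        },
    }


def template_diff(old, new):
    """Return a unified diff between two templates rendered as sorted JSON."""
    old_lines = json.dumps(old, indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(new, indent=2, sort_keys=True).splitlines()
    return list(difflib.unified_diff(old_lines, new_lines, "old", "new", lineterm=""))


# One template per stack: development, test and production.
templates = {env: build_template(env) for env in ("development", "test", "production")}

# An adjustment made in the code, reviewed as a diff before updating the stack.
changed = build_template("production", versioning=True)
diff = template_diff(templates["production"], changed)
print("\n".join(diff))
```

Keeping the template in version control means the diff can be reviewed before a stack update, and the previous version remains available if the update has to be rolled back.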
Overview of the basic Infrastructure
Collecting and loading data into S3
 Data transfer
 Initial connection can only be established from the on-premises network
 An on-premises solution is needed to transfer data into S3
 Apache NiFi
 Web UI
 Schedule flows
 No programming skills needed
 Limited to the available processors
 Format: CSV, Avro
Process data
 Using Spark (Scala)
 Fast data processing
 Needs implementation
 Format: Parquet or Avro – saves space, time and money
 Organize the data
 Layer
 Partitions
 Purpose
 Source
 …
Using spot instances
 Data-backup capabilities
 Set a max. bidding price you are willing to pay
 Saves time and money
 Cons:
 You lose the instances when the spot price exceeds your max. price
 2 minutes to save your data
 Hybrid model for Hadoop
 Master and 1/3 workers on on-demand instances
 Rest on spot instances
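The hybrid model can be illustrated with a small calculation. This is a sketch under stated assumptions: the function names are hypothetical, rounding the on-demand third up is a choice made here rather than something the slides specify, and the master node is assumed to be on-demand in addition to the workers counted below.

```python
import math


def split_workers(total_workers):
    """Return (on_demand, spot) worker counts: one third on-demand, rest on spot."""
    # Assumption: round the on-demand share up so the stable core is never too small.
    on_demand = math.ceil(total_workers / 3)
    return on_demand, total_workers - on_demand


def spot_instance_reclaimed(spot_price, max_price):
    """A spot instance is lost once the current spot price exceeds the maximum price."""
    return spot_price > max_price


on_demand, spot = split_workers(10)
print(on_demand, spot)  # 4 6
```

Keeping the master and a third of the workers on on-demand instances means that losing every spot instance at once slows the cluster down but does not take it offline.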
Get data available with SQL
 Create Glue catalog with a Glue crawler
 Scans all subfolders of an S3 path
 Tries to recognize the right format
 Classifies according to the file type
 Glue catalog
 Used as Hive metastore on an EMR cluster
 Used in Athena for ad hoc analytics
 Not all classifiers are perfect
 Manual adjustments of the crawler are required
 Manual adjustments of the table definitions are required
Testing Exasol on AWS Marketplace
 Starting Exasol on an EC2 instance
 Using an EBS-backed instance
 Testing various instances
 Duplicating the instance for more freedom in testing
 Testing different server types/sizes
 Testing licensed software (AWS Marketplace) before buying an expensive license
Amazon SageMaker
 JupyterHub
 Python-based API
 Focus on developing, training, testing and distributing ML models
 Easy switching between several algorithms
Outlook
 Combine Exasol with ML models created by SageMaker
3. Stream Analytics & Machine Learning with AWS OC Quickstarter
 Use case
 DWH offloading
 Architectural overview
 The data flow
 Industrial use case
Use case: Twitter Stream Analytics
[Diagram labels: Twitter, Streaming Data, Machine Learning sentiment analysis]
DWH Offloading
[Diagram labels: Source; DWH with Integration Layer, Enterprise Layer, User View Layer]
DWH Offloading
[Diagram labels: Data; Integration Layer; Enterprise Layer; Offload; Refined Data Lake; User View Layer; ETL]
Advantages of DWH-Offloading
 Cost savings by offloading data to low-cost storage
 Combining structured data with unstructured data
Used technologies
 Scala
 Hive, Oozie, Kafka, Spark, Sqoop
➢ Stream Processing
➢ DWH Offloading
➢ Scheduling
 Spark.ML
➢ sentiment analysis
 AWS
➢ infrastructure / Hadoop / HDFS / S3 / Data lake
 ELK Stack (Elasticsearch, Logstash, Kibana)
➢ Visualization / Indexed data access
Industrial use cases
 Predictive Maintenance
 Real-time error detection in production processes
 Dynamic evaluation of component quality
@OC_WIRE | OPITZCONSULTING | opitzconsulting | WWW.OPITZ-CONSULTING.COM
Contact us!
Big Data Stories from the Field
Matthias Diekstall
Developer
+49 201 892994-1753
Matthias.Diekstall@opitz-consulting.com
Roland Wammers
Solution Architect
+49 201 892994-1757
Roland.Wammers@opitz-consulting.com
Manuel Marowski
Developer
+49 201 892994-1748
Manuel.Marowski@opitz-consulting.com

Modern Applications Web Day | Continuous Delivery to Amazon EKS with SpinnakerModern Applications Web Day | Continuous Delivery to Amazon EKS with Spinnaker
Modern Applications Web Day | Continuous Delivery to Amazon EKS with Spinnaker
 
Building Smart Home skills for Alexa
Building Smart Home skills for AlexaBuilding Smart Home skills for Alexa
Building Smart Home skills for Alexa
 
Hotel or Taxi? "Sorting hat" for travel expenses with AWS ML infrastructure
Hotel or Taxi? "Sorting hat" for travel expenses with AWS ML infrastructureHotel or Taxi? "Sorting hat" for travel expenses with AWS ML infrastructure
Hotel or Taxi? "Sorting hat" for travel expenses with AWS ML infrastructure
 
Wild Rydes with Big Data/Kinesis focus: AWS Serverless Workshop
Wild Rydes with Big Data/Kinesis focus: AWS Serverless WorkshopWild Rydes with Big Data/Kinesis focus: AWS Serverless Workshop
Wild Rydes with Big Data/Kinesis focus: AWS Serverless Workshop
 
Log Analytics with AWS
Log Analytics with AWSLog Analytics with AWS
Log Analytics with AWS
 
Deep Dive into Concepts and Tools for Analyzing Streaming Data on AWS
Deep Dive into Concepts and Tools for Analyzing Streaming Data on AWS Deep Dive into Concepts and Tools for Analyzing Streaming Data on AWS
Deep Dive into Concepts and Tools for Analyzing Streaming Data on AWS
 
AWS Programme für Nonprofits
AWS Programme für NonprofitsAWS Programme für Nonprofits
AWS Programme für Nonprofits
 
Microservices and Data Design
Microservices and Data DesignMicroservices and Data Design
Microservices and Data Design
 
Serverless vs. Developers – the real crash
Serverless vs. Developers – the real crashServerless vs. Developers – the real crash
Serverless vs. Developers – the real crash
 
Query your data in S3 with SQL and optimize for cost and performance
Query your data in S3 with SQL and optimize for cost and performanceQuery your data in S3 with SQL and optimize for cost and performance
Query your data in S3 with SQL and optimize for cost and performance
 
Secret Management with Hashicorp’s Vault
Secret Management with Hashicorp’s VaultSecret Management with Hashicorp’s Vault
Secret Management with Hashicorp’s Vault
 
EKS Workshop
 EKS Workshop EKS Workshop
EKS Workshop
 
Scale to Infinity with ECS
Scale to Infinity with ECSScale to Infinity with ECS
Scale to Infinity with ECS
 
Containers on AWS - State of the Union
Containers on AWS - State of the UnionContainers on AWS - State of the Union
Containers on AWS - State of the Union
 
Deploying and Scaling Your First Cloud Application with Amazon Lightsail
Deploying and Scaling Your First Cloud Application with Amazon LightsailDeploying and Scaling Your First Cloud Application with Amazon Lightsail
Deploying and Scaling Your First Cloud Application with Amazon Lightsail
 
Building Personalized Data Products - From Idea to Product
Building Personalized Data Products - From Idea to ProductBuilding Personalized Data Products - From Idea to Product
Building Personalized Data Products - From Idea to Product
 

Dernier

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 

Dernier (20)

Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 

Analytics Web Day | From Theory to Practice: Big Data Stories from the Field

  • 1. © OPITZ CONSULTING 2018. Information classification: Public. Überraschend mehr Möglichkeiten ("Surprisingly more possibilities"). Big Data Stories from the Field: From Theory to Practice. Matthias Diekstall, Roland Wammers, Manuel Marowski
  • 2. Agenda: (1) DWH Modernization with AWS BigData; (2) Advanced Analytics & Complex Event Processing at congstar; (3) Stream Analytics & Machine Learning with AWS OC Quickstarter
  • 3. Agenda item 1, DWH Modernization with AWS BigData at an Insurance Company: Once upon a Time …; Defined Targets; Challenges; Our Proposal; Technical Implementation; … and they lived happily ever after
  • 4. Once upon a Time …: Mid-sized insurance company (6000 employees; 4 M clients; 14 M contracts; 3.2 B EUR in revenues); Enterprise DWH established; Standard reporting in place; Data mining in a few departments (mostly using MS Excel; some desktop use of R)
  • 5. Defined Targets: Get a feeling for new technologies (Hadoop ecosystem); Learn their approach to data processing; Low investment; "Big Data Test Drive"; Increase flexibility for data sources; Enable self-service for departments on a larger scale
  • 6. Challenges: No tangible use case initially; No decision regarding products/license model; No solid grasp of the fundamental concepts of Big Data technologies; Few resources for driving this project; No hardware available (short-term); Direct connectivity to source systems questionable
  • 7. Our Proposal: Quick start with a cloud-based solution; Start small and allow for growth; Allow a wide variety of technologies without having to dedicate resources to administration and operation. To be more precise: Prepare the environment for easy startup; Train/coach employees in essential aspects; Use AWS technologies
  • 8. Technical Implementation: AWS IAM for user management; AWS S3 for data storage; AWS EMR as the basis for data processing (Hive, Pig, Spark, Python); Zeppelin as graphical frontend; Augmented with RStudio; Mini tutorials for users
  • 9. [Slide without extracted text]
  • 10. AWS mini tutorials for users
  • 11. … and they lived happily ever after. Results: Targets achieved at minimal cost (< $500 in ~3 months); Competency development; Better understanding of "how it works". Lessons learned: Focus on as few tools as possible; Create simple step-by-step tutorials; Even a hypothetical use case is better than none
  • 12. Agenda item 2, Advanced Analytics & Complex Event Processing at congstar: First Thoughts; Creating the Base; Working with the Data; First Steps to Advanced Analytics
  • 13. congstar GmbH: Subsidiary of Telekom Deutschland GmbH; Founded in July 2007; Sells mobile contracts and DSL; Over 4,500,000 customers
  • 14. Motivation: Better understanding of the user; Improve the user experience; Enhance existing systems; Be prepared for future requirements; Create new content in reasonable time
  • 15. Challenges: Building a big data system for advanced analytics and complex event processing in AWS (finding the right technologies in the Hadoop ecosystem; finding suitable AWS services; keeping the costs low; provisioning); Replacing old systems with new technology; Secure data transfer between on-prem and AWS; Living agile
  • 16. Infrastructure as code: Testing resources and services via the AWS Management Console; Creating CloudFormation templates (infrastructure as code; create stacks for the development, test and production systems); Working with stacks (adjustments made in the code; diff of old and new code; rollback function in case of error); Establishing a secure VPN connection
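
The stack-per-environment idea on slide 16 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the resource name `DataLakeBucket`, the `Environment` tag and both helper functions are hypothetical and do not come from the deck. The sketch builds a minimal CloudFormation template as a dictionary for each environment and prints a textual diff between two versions, mirroring the "diff of old and new code" step.

```python
import difflib
import json

# Environments named on the slide: development, test and production.
ENVIRONMENTS = ("development", "test", "production")


def build_template(environment):
    """Return a minimal CloudFormation template (as a dict) for one environment.

    The bucket resource and its tag are illustrative assumptions.
    """
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment}")
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": f"Illustrative data lake bucket for the {environment} stack",
        "Resources": {
            "DataLakeBucket": {
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    "Tags": [{"Key": "Environment", "Value": environment}],
                },
            }
        },
    }


def diff_templates(old, new):
    """Return a unified diff between two templates, rendered as sorted JSON."""
    old_lines = json.dumps(old, indent=2, sort_keys=True).splitlines()
    new_lines = json.dumps(new, indent=2, sort_keys=True).splitlines()
    return list(difflib.unified_diff(old_lines, new_lines, "old", "new", lineterm=""))


if __name__ == "__main__":
    for line in diff_templates(build_template("test"), build_template("production")):
        print(line)
```

Keeping the template in code means the same definition can be deployed three times, and every adjustment is reviewable as a diff before the stack is updated.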
  • 17. Overview of the basic infrastructure [diagram; no text extracted]
  • 18. Collecting and loading data into S3. Data transfer: Initial connection only established from the on-prem network; Need an on-prem solution to transfer data into S3. NiFi: Web UI; Schedule flows; No programming skills needed; Limited to the processors used; Format: CSV, Avro
  • 19. Process data. Using Spark (Scala): Fast data processing; Needs implementation; Format: Parquet or Avro, which saves space, time and money. Organize the data: Layer; Partitions; Purpose; Source; …
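
One possible way to organize data by layer, source, purpose and partition, as slide 19 suggests, is a Hive-style `key=value` folder layout, which Spark, Hive and AWS Glue all recognize as partitions. The concrete order of the path segments and the function name below are assumptions made for illustration; the deck does not show the layout that was actually used.

```python
from datetime import date


def partition_prefix(layer, source, purpose, day):
    """Build a Hive-style partitioned key prefix for one day of data.

    The segment order (layer, source, purpose, date) is an illustrative
    assumption, not the layout used in the project.
    """
    for name, value in (("layer", layer), ("source", source), ("purpose", purpose)):
        # Reject empty values and values that would break the folder structure.
        if not value or "/" in value or "=" in value:
            raise ValueError(f"invalid {name}: {value!r}")
    return (
        f"{layer}/source={source}/purpose={purpose}/"
        f"year={day.year:04d}/month={day.month:02d}/day={day.day:02d}/"
    )


if __name__ == "__main__":
    print(partition_prefix("raw", "crm", "billing", date(2018, 6, 5)))
    # prints: raw/source=crm/purpose=billing/year=2018/month=06/day=05/
```

With such a layout, a query that filters on `year` and `month` only has to read the matching folders instead of the whole data set.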
  • 20. Using spot instances: Data-backup capabilities; Set a max. bidding price you are willing to pay; Saves time and money. Cons: You lose the instances when the spot price exceeds your max. price; Only 2 minutes to save your data. Hybrid model for Hadoop: Master and 1/3 of the workers on on-demand instances; Rest on spot instances
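
The hybrid model on slide 20 comes down to simple arithmetic: one third of the workers run on on-demand instances and the rest on spot instances. The sketch below assumes that the on-demand share is rounded up when the worker count is not divisible by three; the slide does not say how rounding was handled, and the function name is hypothetical.

```python
import math
from fractions import Fraction


def split_workers(total_workers, on_demand_share=Fraction(1, 3)):
    """Split a worker count into (on_demand, spot) counts.

    Rounding the on-demand share up is an assumption; it errs on the side
    of keeping more capacity when spot instances are reclaimed.
    """
    if total_workers < 0:
        raise ValueError("total_workers must not be negative")
    # Fraction keeps the arithmetic exact, so 9 workers give exactly 3.
    on_demand = math.ceil(Fraction(total_workers) * on_demand_share)
    return on_demand, total_workers - on_demand


if __name__ == "__main__":
    for workers in (3, 9, 10):
        on_demand, spot = split_workers(workers)
        print(f"{workers} workers: {on_demand} on-demand, {spot} spot")
```

The master node is counted separately and always stays on an on-demand instance, so losing every spot instance still leaves a working, if smaller, cluster.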
  • 21. Making data available with SQL: Create a Glue catalog with a Glue crawler (scans all subfolders of an S3 path; tries to recognize the right format; classifies according to the file type); Glue catalog (used as Hive metastore on an EMR cluster; used in Athena for ad hoc analytics); Not all classifiers are perfect (manual adjustments of the crawler are required; manual adjustments of the table definitions are required)
  • 22. Testing Exasol from the AWS Marketplace: Starting Exasol on an EC2 instance (using an EBS-backed instance); Testing various instances (duplicating the instance for more freedom in testing; testing different server types/sizes); Testing licensed software (AWS Marketplace) before buying an expensive license
  • 23. Amazon SageMaker: JupyterHub; Python-based API; Focusing on developing, learning, testing and distributing ML models; Easy switching between several algorithms
  • 24. Outlook: Combine Exasol with ML models created by SageMaker
  • 25. Stream Analytics & Machine Learning with AWS OC Quickstarter
  • 26. Agenda item 3, Stream Analytics & Machine Learning with AWS OC Quickstarter: Use case; DWH offloading; Architectural overview; The data flow; Industrial use case
  • 27. Use case: Twitter Stream Analytics [diagram labels: Twitter; Streaming Data; Machine Learning; sentiment analysis]
  • 28. DWH offloading [diagram labels: DWH; Integration Layer; Enterprise Layer; User View Layer; Source]
  • 29. DWH offloading [diagram labels: Data; Integration Layer; Enterprise Layer; Offload; Refined Data Lake; User View Layer; ETL]
  • 30. Advantages of DWH offloading: Cost savings by moving data to low-cost storage; Combining structured data with unstructured data
  • 31. Used technologies: Scala; Hive, Oozie, Kafka, Spark, Sqoop (stream processing, DWH offloading, scheduling); Spark ML (sentiment analysis); AWS (infrastructure / Hadoop / HDFS / S3 / data lake); ELK Stack (Elasticsearch, Logstash, Kibana) (visualization / indexed data access)
  • 32. [Slide without extracted text]
  • 33. [Slide without extracted text]
  • 34. Industrial use cases: Predictive maintenance; Real-time error detection in production processes; Dynamic evaluation of component quality
  • 35. Contact us! Matthias Diekstall, Developer, +49 201 892994-1753, Matthias.Diekstall@opitz-consulting.com; Roland Wammers, Solution Architect, +49 201 892994-1757, Roland.Wammers@opitz-consulting.com; Manuel Marowski, Developer, +49 201 892994-1748, Manuel.Marowski@opitz-consulting.com. @OC_WIRE; OPITZCONSULTING; opitzconsulting; WWW.OPITZ-CONSULTING.COM