© Hitachi Vantara Corporation 2018
Machine Learning: A Challenge for Architects
#DOAG2018, Nürnberg, 20 November 2018
Harald Erb
Solutions Engineer, EMEA Central
Data Analytics & IoT
[Figure labels: OT, 107+ years; IT, 58+ years; Business, Industrial, Consumer, City; Communications, Big Data, Analytics, Artificial Intelligence, Cloud, IT Systems; IoT; Insight; Hitachi]
Hitachi Vantara: What? And Why?
[Figure: portfolio diagram]
Legend: ◼ Industry solutions (Hitachi BUs, ISVs, SIs) ◼ IoT applications ◼ IoT platform services ◼ Edge-to-cloud infrastructure
• Industries: Manufacturing, Utilities, Logistics, Oil & Gas, Mining, Financial Services, Other
• Insights applications: Maintenance Insights, Manufacturing Insights, Video Insights
• Services: Co-Creation Services, Professional Services
• Analytics: Artificial Intelligence, Machine Learning, Stream Analytics, Batch Analytics
• Data Management: Data Orchestration, Data Engineering, Data Blending, Data Collection
• Asset: Asset Management, Asset Avatars, Data Stores
• Business Connectors; APIs, App-Enabling Studio; Alerts, Notifications; Dashboards, UI, UX
• Edge: Asset Integration, Device Control, Data Caching, Filtering
• Infrastructure: Edge controllers, appliances; Converged and hyperconverged; Block-, file- and object storage
• Other labels: Foundry, Pentaho
IoT Solution Portfolio of Hitachi Vantara
public class AgendaForThisTalk {
    // The three parts of this talk
    public String Topic1 = "Process, Architecture and 1 Example";
    public String Topic2 = "1 Dataset, 4 different Tools";
    public String Topic3 = "From Prototype to Production";
}
Process, Architecture and 1 Example_
Data Warehouse & Analytics back in 1998
[Figure label: Data Discovery]
Architecting for Analytics & Machine Learning today
Source: Carlton E. Sapp: “Preparing and Architecting for Machine Learning”, Gartner, 2017
Analytic Dashboard Example
Source: Carlton E. Sapp: “Preparing and Architecting for Machine Learning”, Gartner, 2017
End-to-end Fleet Management Solution
• Combine Sensor Data with Contextual Information
• Overcome scarce availability of capable technicians
• Lower costs, reduced customer downtime
Fleet Optimization
Truck Leasing Company
Issues: Trucks have become increasingly technology-based, capable technicians are scarce, and managing a large number of maintenance centers is expensive.
Business Objectives: Better fleet management: lower costs and more efficient maintenance to reduce customer downtime.
Strategic Goals: Take competitive advantage of truck data. 40,000-50,000 trucks are purchased per year, which yields more data than any truck OEM has. Gain a competitive edge through lower costs and predictive technology (automation in repairs and diagnostics).
Vehicle Sensor Data
Store models and sensor data
Asset Model · Sensor Data · Utility Vehicle Asset
• Air Pressure
• Axle Vibration
• Lights
• Load Weight
• Movement
• Temperature
Sensor Data Journey: Stream, Blend, Infer, Sense, Inspect, Embed & Integrate, Store
Adding Context to Sensor Data
IoT Data Refinery
Contextual Data · Sensor Data
Sensor Data Journey: Stream, Blend, Infer, Sense, Inspect, Embed & Integrate, Store
Vehicle Location
• GPS
• Lat / Long
• Mapping
• Movement
Vehicle Profile
• Make
• Model
• Mileage
Operational Systems
• Maintenance History
• Maintenance Schedule
• Service Centers
• Parts Ordering
• Parts Inventory
Business Outcomes
• Real-Time Fleet Status and Health
• Repair Recommendations
• Optimized Maintenance Scheduling
• Automated Parts Ordering
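The blending of sensor data with contextual data can be sketched in a few lines of Python. This is only an illustration, not the solution's actual implementation: the field names, the sample values, and both thresholds are assumptions made for the example.

```python
# Hypothetical sensor readings and contextual records; all names and values are illustrative.
sensor_readings = [
    {"vehicle_id": "T-001", "temperature_c": 118.0, "axle_vibration": 0.8},
    {"vehicle_id": "T-002", "temperature_c": 92.0, "axle_vibration": 0.3},
]
vehicle_profiles = {
    "T-001": {"make": "MakeA", "model": "X1", "mileage_km": 410_000},
    "T-002": {"make": "MakeB", "model": "Y2", "mileage_km": 120_000},
}
last_service_km = {"T-001": 350_000, "T-002": 110_000}

TEMPERATURE_LIMIT_C = 110.0   # assumed alert threshold
SERVICE_INTERVAL_KM = 50_000  # assumed maintenance interval


def blend(reading):
    """Join one sensor reading with its vehicle profile and maintenance history."""
    vid = reading["vehicle_id"]
    profile = vehicle_profiles[vid]
    record = {**reading, **profile}
    record["km_since_service"] = profile["mileage_km"] - last_service_km[vid]
    return record


def needs_inspection(record):
    """Simple rule standing in for the 'Infer' step: hot and overdue for service."""
    return (record["temperature_c"] > TEMPERATURE_LIMIT_C
            and record["km_since_service"] > SERVICE_INTERVAL_KM)


blended = [blend(r) for r in sensor_readings]
flagged = [r["vehicle_id"] for r in blended if needs_inspection(r)]
print(flagged)
```

In a real pipeline the rule in `needs_inspection` would be replaced by a trained model; the join itself is the part that gives the sensor values their business context.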
Fleet Management Dashboard: Situation Overview
Fleet Management Dashboard: Contextual View
Data Science – Personas & Process Model
Source: K. Bollhöfer, Chief Data Scientist, *um
Cross-functional team
Architecture Challenge
Source: D. Sculley, et al.: “Hidden technical debt in Machine learning systems”, 2015
1 Dataset, 4 different Tools_
1 Dataset, 4 Tools
Dataset: California House Prices
github.com/ageron/handson-ml/tree/master/datasets/housing
Tools:
Jupyter Notebook
• End-to-end ML projects
• ML with a preferred programming language such as Python, R, or Julia
• Live code embedded in a markup document
Oracle Data Visualization
• For data exploration
• ML to explain dependencies in the dataset
• 1-click analytical functions and model training possible
H2O Flow
• End-to-end ML platform with its own compute engine, AutoML, …
• Notebook UI; H2O algorithms accessible from Python and R
• Java export of trained models
Pentaho Data Integration
• Embedding ML code in ETL dataflows
• ML orchestration: model training and management
• Plugin Machine Intelligence
Free trial versions available for all tools!
Machine Learning Coding with Jupyter Notebook & Python
Start here: jupyter.org
Machine Learning with Jupyter Notebook & Python
ML Process support: End-to-end; focus on experimentation, not on production ML
Personas: Data Scientists
Useful:
• Notebooks allow reproducible ML from data ingestion to model evaluation
• Multiple programming languages; access to the latest ML frameworks via Python interfaces
• Sophisticated visualizations
Architecture & Development related:
• Trained ML models can be serialized/saved, e.g. via Pickle or Joblib
• Models can be published as REST API endpoints, e.g. via Flask
• Notebooks are stored as JSON files → code versioning and merging are not always easy; less comfortable than other IDEs (no syntax highlighting, no keyword completion)
• Large-scale ML is possible, e.g. via Apache Spark MLlib; cluster deployments are better done separately
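Serializing a trained model with Pickle can look like the following minimal sketch. The "model" here is just a dictionary of learned parameters with made-up values; it stands in for a real estimator object, which would be handled by `pickle` in the same way.

```python
import pickle

# A "trained model" reduced to its learned parameters (illustrative values).
model = {"kind": "linear", "intercept": 2.0, "coefficients": [0.5, -1.0]}


def predict(model, features):
    """Score one feature vector with the linear model."""
    weighted = sum(c * x for c, x in zip(model["coefficients"], features))
    return model["intercept"] + weighted


# Serialize after training ...
blob = pickle.dumps(model)

# ... and restore later, for example inside a scoring service.
restored = pickle.loads(blob)

print(predict(restored, [4.0, 1.0]))
```

Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code.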
Dataset Exploration with Oracle Data Visualization
Start here: www.oracle.com/technetwork/middleware/oracle-data-visualization
Dataset Exploration with Oracle Data Visualization
ML Process support: Data Understanding phase, Results Presentation & Storytelling
Personas: Business Analysts, Data Scientists
Useful:
• Highly interactive charts and filters and intuitive analysis support (e.g. pattern brushing)
• Supporting functions to highlight and explain attribute/variable dependencies
• Formula editor, advanced (ML) functions, and model training and scoring available for experimentation (experienced users only)
Architecture & Development related:
• Use Oracle Analytics Cloud for better collaboration; can be combined with Data Visualization Desktop
• Dataset preparation functionality is improving over time, but will not replace existing ETL platforms (scalability, job management)
• Good for rapid prototyping, but limited reusability of results (curated datasets, ML models)
Machine Learning Orchestration and more with Pentaho
Start here: community.hitachivantara.com/docs/DOC-1009931-downloads
Blog: community.hitachivantara.com/community/products-and-solutions/pentaho/blog/2018/10/16/deep-learning-coming-to-pentaho
Bring Your Own (ML) Code
ETL-Tools + Python/R – A Door Opener for Deep Learning?
Embedding ML Algorithms into Data Pipelines
“Model Zoo” managed by your Data Integration Solution
Machine Learning within Data Integration Platforms
ML Process support: Data Preparation phase, Operationalize ML, Experimentation?
Personas: Data Engineers, Data Scientists (supporting)
Useful:
• DI platforms provide advanced data preparation and automation features for effective creation of datasets
• Intuitive UI, drag & drop instead of coding
• Skilled DI team already in place
Architecture & Development related:
• DI platforms are optimized to use full computing power for ETL/ELT, but not for ML tasks (e.g. parallel execution might need extra effort)
• When using R / Python interfaces: make sure to collect status information and performance metrics of your ML model execution within DI logging
• The “ML toolbox” of DI solutions is often incomplete: script-based workarounds are needed, e.g. for imputation of missing values, working with the latest algorithms, or model management
• Limited and unintuitive data exploration features, e.g. visualisations
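A script step of the kind mentioned above might look like this sketch: it imputes missing values with the median and reports status and metrics through Python's standard `logging` module, from where a DI tool could pick them up. The logger name, the log format, and the sample column are assumptions for illustration; how a given DI platform captures script output is not shown here.

```python
import logging
import statistics
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("ml_step")  # hypothetical logger name


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def run_step(values):
    """Run the imputation and report status and metrics for the DI log."""
    start = time.perf_counter()
    try:
        result = impute_median(values)
    except statistics.StatisticsError:
        # Raised when there is no observed value to compute a median from.
        log.error("status=FAILED reason=no_observed_values")
        raise
    elapsed = time.perf_counter() - start
    missing = sum(v is None for v in values)
    log.info("status=OK rows=%d imputed=%d seconds=%.4f", len(values), missing, elapsed)
    return result


total_bedrooms = [3.0, None, 5.0, 4.0, None]  # illustrative column with gaps
print(run_step(total_bedrooms))
```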
End-to-end Machine Learning for everyone with H2O (Flow)
Start here: www.h2o.ai/download
Accessing a 2-node H2O cluster in an R environment
End-to-end Machine Learning for everyone with H2O Flow
ML Process support: End-to-end; focus is on model training/evaluation, feature engineering
Personas: Data Scientists, ambitious Business Analysts(?)
Useful:
• H2O Flow: intuitive notebook-style UI + user guidance
• Use the programming language you already know, such as R or Python
• AutoML can be used for automating the ML workflow, including training and tuning of many models within a user-specified time limit
Architecture & Development related:
• Takes advantage of the computing power of distributed systems and in-memory computing to accelerate machine learning
• Works on existing big data infrastructure, on bare metal or on top of existing Hadoop or Spark clusters; can ingest data from HDFS, Spark, S3
• Model deployment into production with Java (POJO) and binary formats (MOJO), Hive UDF, or as API endpoint
• H2O Flow: limited data exploration (no visualizations; only in combination with R/Python)
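The idea behind training and tuning many models within a time limit can be illustrated with a toy search loop. This is a conceptual sketch in plain Python, not H2O's API: the data, the one-parameter ridge model, the candidate list, and the function names are all assumptions made for the example.

```python
import time

# Tiny illustrative dataset: (x, y) pairs, split into training and validation.
train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
valid = [(5.0, 10.1), (6.0, 11.9)]


def fit(alpha):
    """Closed-form ridge fit of y = w * x; alpha is the regularization strength."""
    sxy = sum(x * y for x, y in train)
    sxx = sum(x * x for x, _ in train)
    return sxy / (sxx + alpha)


def validation_mse(w):
    """Mean squared error of the fitted slope on the validation pairs."""
    return sum((y - w * x) ** 2 for x, y in valid) / len(valid)


def search(candidates, max_runtime_secs):
    """Try candidates in order until the list or the time budget is exhausted."""
    deadline = time.monotonic() + max_runtime_secs
    leaderboard = []
    for alpha in candidates:
        # Always evaluate at least one candidate, then respect the budget.
        if leaderboard and time.monotonic() > deadline:
            break
        w = fit(alpha)
        leaderboard.append((validation_mse(w), alpha, w))
    return sorted(leaderboard)  # best (lowest error) first


leaderboard = search([0.0, 0.1, 1.0, 10.0], max_runtime_secs=5.0)
best_mse, best_alpha, best_w = leaderboard[0]
print(best_alpha, round(best_w, 3))
```

The sorted result plays the role of a leaderboard: the caller fixes the budget up front and takes whatever the best model is when time runs out.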
Another tool decision: Which ML Framework to choose?
Source: Dr. D. James: “Entscheidungsmatrix ‘Machine Learning’”, it-novum.com
From Prototype to Production_
From Exploratory Data Science to Production Workflows
Line of Governance
• Commercial exploitation
• Integration into operations
• Non-functional requirements
• Standardisation & governance
Model
• Unbounded discovery
• Self-service sandbox
• Wide toolset / IDEs
• Agile methods
Model deployment
Source: Sergei Izrailev: Design Patterns for Machine Learning in Production, 2017
• Data transformations must be the same in training and scoring
• Interface between building & scoring:
− In-memory: model is never persisted, train then score → single & applications
− Data only, e.g. PMML → code is independent
− Serialized objects (Pickle, R, Spark, custom) → reuse code
− Code + data, e.g. H2O’s POJO → code is generated
Detailed article in “Java aktuell” 06/2018
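One way to keep data transformations identical in training and scoring is to store the fitted transformation together with the model and to reuse it at scoring time. The sketch below shows this with a standardization step and a one-feature linear model; the function names and the data are illustrative assumptions, not taken from the cited source.

```python
import statistics


def fit_scaler(values):
    """Learn standardization parameters from the training data only."""
    return {"mean": statistics.fmean(values), "stdev": statistics.pstdev(values)}


def transform(scaler, value):
    """Apply the stored standardization to a single value."""
    return (value - scaler["mean"]) / scaler["stdev"]


def train(xs, ys):
    """Fit y = w * z + b by least squares on the standardized feature z."""
    scaler = fit_scaler(xs)
    zs = [transform(scaler, x) for x in xs]
    z_mean, y_mean = statistics.fmean(zs), statistics.fmean(ys)
    covariance = sum((z - z_mean) * (y - y_mean) for z, y in zip(zs, ys))
    variance = sum((z - z_mean) ** 2 for z in zs)
    w = covariance / variance
    b = y_mean - w * z_mean
    # The scaler travels with the model so scoring cannot drift from training.
    return {"scaler": scaler, "w": w, "b": b}


def score(bundle, x):
    """Apply the stored transformation first, then the model."""
    z = transform(bundle["scaler"], x)
    return bundle["w"] * z + bundle["b"]


bundle = train([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0])
print(score(bundle, 5.0))
```

Whichever interface is chosen from the list above, the bundle (model plus transformation parameters) is the unit that has to cross from building to scoring.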
H2O POJO (Plain Old Java Object):
• ML model implemented through Java classes
• Has dependencies on H2O-specific classes
Gradient Boosting Machines (GBM)
• A family of powerful machine-learning techniques for regression and classification problems
• GBMs produce a prediction model in the form of an ensemble of weak prediction models, typically decision trees
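The ensemble idea can be made concrete with a small gradient boosting sketch for regression with squared loss. Each weak learner is a decision stump (a tree with a single split) fitted to the residuals of the current ensemble. This is a teaching example in plain Python with made-up data, not H2O's GBM implementation.

```python
def fit_stump(xs, residuals):
    """Find the split on x that minimizes squared error (needs >= 2 distinct x values)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def fit_gbm(xs, ys, n_trees=50, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump fits the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        # For squared loss the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        predictions = [
            p + learning_rate * (left_value if x <= threshold else right_value)
            for x, p in zip(xs, predictions)
        ]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the shrunken contributions of all stumps on top of the base value."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(l if x <= t else r for t, l, r in stumps)


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
model = fit_gbm(xs, ys)
print(round(predict(model, 1.0), 2), round(predict(model, 6.0), 2))
```

No single stump can describe the data well, but the sum of many small corrections can, which is the sense in which the ensemble is built from weak prediction models.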
From Prototype to Machine Learning in Production
Source: Sergei Izrailev: Design Patterns for Machine Learning in Production, 2017
Applying ML Model for Scoring of new (unlabeled) Data
Source: K. Wähner: How to Build and Deploy Scalable Machine Learning in Production with Apache Kafka
• Embed the ML model (a Java application) in an Apache Kafka stream-processing application to apply it to new incoming events
• Spark Streaming also allows ML model serving via mini-batches and makes use of in-memory processing and load distribution
ML Model as API endpoint: Object detection example
Source: A. Rosebrock: “A scalable Keras + deep learning REST API”, pyimagesearch Blog
• “API first” approach instead of rewriting code to replicate the ML model in a programming language supported by the enterprise IT
• Web APIs make it easy for applications written in different languages to work together. A developer who needs a model for an ML-powered web application only needs the URL endpoint where the API is served
• Web service development frameworks like Flask allow prototyping in Python
• For production environments, an additional web server and a messaging layer should be considered
Keras: a simple, modular, and extensible deep learning library, written in Python and designed to enable fast experimentation with deep neural networks
Redis (Remote Dictionary Server): a distributed, in-memory key-value database with optional durability
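The "API first" idea can be tried out without any third-party package. The sketch below uses Python's built-in `http.server` instead of Flask, so it is a stand-in for the prototyping approach named above, not the cited author's code. The `/predict` path, the JSON payload shape, and the fixed linear "model" are assumptions for illustration.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Stand-in for a trained model: a fixed linear function (illustrative)."""
    return 2.0 + 0.5 * features[0] - 1.0 * features[1]


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":  # hypothetical endpoint path
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(payload["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def call_endpoint(url, features):
    """Client side: any language that speaks HTTP and JSON could do this."""
    request = urllib.request.Request(
        url,
        data=json.dumps({"features": features}).encode(),
        headers={"Content-Type": "application/json"},
    )
    # Empty ProxyHandler: talk to localhost directly, ignoring proxy settings.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read())


server = HTTPServer(("127.0.0.1", 0), PredictHandler)  # port 0: OS picks a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/predict"
result = call_endpoint(url, [4.0, 1.0])
server.shutdown()
server.server_close()
print(result)
```

The client only needs the URL and the payload format; the model's language and libraries stay hidden behind the endpoint. The built-in server is meant for experiments only, which matches the slide's advice to add a proper web server for production.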
Source: A. Rosebrock: “Building a simple Keras + deep learning REST API”, The Keras Blog
Object Detection – Use Case
PicSure AI platform for insurance solutions
Takeaway
• Find the right problem
• Define constraints
• Design components and interfaces
• Take into account organizational constraints
• Production can’t be an afterthought
• The process is a lot of work, but it’s not rocket science
Processing huge amounts of data with complex algorithms can take a bit too much time. Kudos to Randall Munroe / XKCD for the original.
Thank You
© Hitachi Vantara Corporation 2018

Contenu connexe

Tendances

Eh p8 sp10_delta_scope_final
Eh p8 sp10_delta_scope_finalEh p8 sp10_delta_scope_final
Eh p8 sp10_delta_scope_finalClara Sack
 
Pentaho data integration 4.0 and my sql
Pentaho data integration 4.0 and my sqlPentaho data integration 4.0 and my sql
Pentaho data integration 4.0 and my sqlAHMED ENNAJI
 
SAUG Melbourne plenary 2017 embedded analytics
SAUG Melbourne plenary 2017 embedded analyticsSAUG Melbourne plenary 2017 embedded analytics
SAUG Melbourne plenary 2017 embedded analyticspaul.hawking
 
Transform Your Data Integration Platform From Informatica To ODI
Transform Your Data Integration Platform From Informatica To ODI Transform Your Data Integration Platform From Informatica To ODI
Transform Your Data Integration Platform From Informatica To ODI Jade Global
 
New BI Tools with HANA
New BI Tools with HANANew BI Tools with HANA
New BI Tools with HANAtasmc
 
Informatica overview
Informatica overviewInformatica overview
Informatica overviewSwetha Naveen
 
MS Cloud Day - Introduction to Windows Azure platform and real world case study
MS Cloud Day - Introduction to Windows Azure platform and real world case studyMS Cloud Day - Introduction to Windows Azure platform and real world case study
MS Cloud Day - Introduction to Windows Azure platform and real world case studySpiffy
 
Sap HANA Presentation to SAPnsight Dallas Breakfast Huddle in June 2014
Sap HANA Presentation to SAPnsight Dallas Breakfast Huddle in June 2014Sap HANA Presentation to SAPnsight Dallas Breakfast Huddle in June 2014
Sap HANA Presentation to SAPnsight Dallas Breakfast Huddle in June 2014Denis ONeil
 
Lecture about SAP HANA and Enterprise Comupting at University of Halle
Lecture about SAP HANA and Enterprise Comupting at University of HalleLecture about SAP HANA and Enterprise Comupting at University of Halle
Lecture about SAP HANA and Enterprise Comupting at University of HalleTobias Trapp
 
Fulfilling real time analytics on obi apps platform
Fulfilling real time analytics on obi apps platformFulfilling real time analytics on obi apps platform
Fulfilling real time analytics on obi apps platformShiv Bharti
 
Sap s4 hana logistics ppt
Sap s4 hana logistics pptSap s4 hana logistics ppt
Sap s4 hana logistics pptRamaCharitha1
 
Modern Reporting at Scale: How to Distribute Information and Answers to the M...
Modern Reporting at Scale: How to Distribute Information and Answers to the M...Modern Reporting at Scale: How to Distribute Information and Answers to the M...
Modern Reporting at Scale: How to Distribute Information and Answers to the M...TIBCO Jaspersoft
 
Why and How Migrate Informatica to ODI | Infa to ODI Migration | Infa to ODI ...
Why and How Migrate Informatica to ODI | Infa to ODI Migration | Infa to ODI ...Why and How Migrate Informatica to ODI | Infa to ODI Migration | Infa to ODI ...
Why and How Migrate Informatica to ODI | Infa to ODI Migration | Infa to ODI ...Jade Global
 
Stratebi_Emilio_Arias_PCM14
Stratebi_Emilio_Arias_PCM14Stratebi_Emilio_Arias_PCM14
Stratebi_Emilio_Arias_PCM14Stratebi
 
Digital economy with the speed of s4 hana
Digital economy with the speed of s4 hanaDigital economy with the speed of s4 hana
Digital economy with the speed of s4 hanaKyyba Inc.
 
Restart EAM at OSRAM with a lean approach
Restart EAM at OSRAM with a lean approachRestart EAM at OSRAM with a lean approach
Restart EAM at OSRAM with a lean approachLeanIX GmbH
 
Self service BI overview + Power BI
Self service BI overview + Power BISelf service BI overview + Power BI
Self service BI overview + Power BIArthur Graus
 

Tendances (19)

Eh p8 sp10_delta_scope_final
Eh p8 sp10_delta_scope_finalEh p8 sp10_delta_scope_final
Eh p8 sp10_delta_scope_final
 
Pentaho data integration 4.0 and my sql
Pentaho data integration 4.0 and my sqlPentaho data integration 4.0 and my sql
Pentaho data integration 4.0 and my sql
 
SAUG Melbourne plenary 2017 embedded analytics
SAUG Melbourne plenary 2017 embedded analyticsSAUG Melbourne plenary 2017 embedded analytics
SAUG Melbourne plenary 2017 embedded analytics
 
Transform Your Data Integration Platform From Informatica To ODI
Transform Your Data Integration Platform From Informatica To ODI Transform Your Data Integration Platform From Informatica To ODI
Transform Your Data Integration Platform From Informatica To ODI
 
Resume (3)
Resume (3)Resume (3)
Resume (3)
 
New BI Tools with HANA
New BI Tools with HANANew BI Tools with HANA
New BI Tools with HANA
 
Informatica overview
Informatica overviewInformatica overview
Informatica overview
 
MS Cloud Day - Introduction to Windows Azure platform and real world case study
MS Cloud Day - Introduction to Windows Azure platform and real world case studyMS Cloud Day - Introduction to Windows Azure platform and real world case study
MS Cloud Day - Introduction to Windows Azure platform and real world case study
 
Sap HANA Presentation to SAPnsight Dallas Breakfast Huddle in June 2014
Sap HANA Presentation to SAPnsight Dallas Breakfast Huddle in June 2014Sap HANA Presentation to SAPnsight Dallas Breakfast Huddle in June 2014
Sap HANA Presentation to SAPnsight Dallas Breakfast Huddle in June 2014
 
Informatica session
Informatica sessionInformatica session
Informatica session
 
Lecture about SAP HANA and Enterprise Comupting at University of Halle
Lecture about SAP HANA and Enterprise Comupting at University of HalleLecture about SAP HANA and Enterprise Comupting at University of Halle
Lecture about SAP HANA and Enterprise Comupting at University of Halle
 
Fulfilling real time analytics on obi apps platform
Fulfilling real time analytics on obi apps platformFulfilling real time analytics on obi apps platform
Fulfilling real time analytics on obi apps platform
 
Sap s4 hana logistics ppt
Sap s4 hana logistics pptSap s4 hana logistics ppt
Sap s4 hana logistics ppt
 
Modern Reporting at Scale: How to Distribute Information and Answers to the M...
Modern Reporting at Scale: How to Distribute Information and Answers to the M...Modern Reporting at Scale: How to Distribute Information and Answers to the M...
Modern Reporting at Scale: How to Distribute Information and Answers to the M...
 
Why and How Migrate Informatica to ODI | Infa to ODI Migration | Infa to ODI ...
Why and How Migrate Informatica to ODI | Infa to ODI Migration | Infa to ODI ...Why and How Migrate Informatica to ODI | Infa to ODI Migration | Infa to ODI ...
Why and How Migrate Informatica to ODI | Infa to ODI Migration | Infa to ODI ...
 
Stratebi_Emilio_Arias_PCM14
Stratebi_Emilio_Arias_PCM14Stratebi_Emilio_Arias_PCM14
Stratebi_Emilio_Arias_PCM14
 
Digital economy with the speed of s4 hana
Digital economy with the speed of s4 hanaDigital economy with the speed of s4 hana
Digital economy with the speed of s4 hana
 
Restart EAM at OSRAM with a lean approach
Restart EAM at OSRAM with a lean approachRestart EAM at OSRAM with a lean approach
Restart EAM at OSRAM with a lean approach
 
Self service BI overview + Power BI
Self service BI overview + Power BISelf service BI overview + Power BI
Self service BI overview + Power BI
 

Similaire à Machine Learning - Eine Challenge für Architekten

UiPath + Alteryx CE Final_042822.pdf
UiPath + Alteryx CE Final_042822.pdfUiPath + Alteryx CE Final_042822.pdf
UiPath + Alteryx CE Final_042822.pdfDiana Gray, MBA
 
Analyti x mapping manager product overview presentation
Analyti x mapping manager product overview presentationAnalyti x mapping manager product overview presentation
Analyti x mapping manager product overview presentationAnalytixDataServices
 
Paris FOD Meetup #5 Cognizant Presentation
Paris FOD Meetup #5 Cognizant PresentationParis FOD Meetup #5 Cognizant Presentation
Paris FOD Meetup #5 Cognizant PresentationAbdelkrim Hadjidj
 
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?SnapLogic
 
Kettle: Pentaho Data Integration tool
Kettle: Pentaho Data Integration toolKettle: Pentaho Data Integration tool
Kettle: Pentaho Data Integration toolAlex Rayón Jerez
 
The Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futureThe Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futuremarkgrover
 
Lyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesLyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesKarthik Murugesan
 
Operational Machine Learning: Using Microsoft Technologies for Applied Data S...
Operational Machine Learning: Using Microsoft Technologies for Applied Data S...Operational Machine Learning: Using Microsoft Technologies for Applied Data S...
Operational Machine Learning: Using Microsoft Technologies for Applied Data S...Khalid Salama
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningProvectus
 
Machine Intelligence for Design Automation
Machine Intelligence for Design AutomationMachine Intelligence for Design Automation
Machine Intelligence for Design Automations.rohit
 
Digital Reinvention by NRB
Digital Reinvention by NRBDigital Reinvention by NRB
Digital Reinvention by NRBWilliam Poos
 
Jupyter in the modern enterprise data and analytics ecosystem
Jupyter in the modern enterprise data and analytics ecosystem Jupyter in the modern enterprise data and analytics ecosystem
Jupyter in the modern enterprise data and analytics ecosystem Gerald Rousselle
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Rittman Analytics
 
UiPath 23.4 Product Release Updates
UiPath 23.4 Product Release UpdatesUiPath 23.4 Product Release Updates
UiPath 23.4 Product Release UpdatesDianaGray10
 
Ecotech Presentation.pdf
Ecotech Presentation.pdfEcotech Presentation.pdf
Ecotech Presentation.pdfJobPuneRS
 

Similaire à Machine Learning - Eine Challenge für Architekten (20)

UiPath Insights
UiPath InsightsUiPath Insights
UiPath Insights
 
UiPath + Alteryx CE Final_042822.pdf
UiPath + Alteryx CE Final_042822.pdfUiPath + Alteryx CE Final_042822.pdf
UiPath + Alteryx CE Final_042822.pdf
 
Shaik Niyas Ahamed M Resume
Shaik Niyas Ahamed M ResumeShaik Niyas Ahamed M Resume
Shaik Niyas Ahamed M Resume
 
Analyti x mapping manager product overview presentation
Analyti x mapping manager product overview presentationAnalyti x mapping manager product overview presentation
Analyti x mapping manager product overview presentation
 
Paris FOD Meetup #5 Cognizant Presentation
Paris FOD Meetup #5 Cognizant PresentationParis FOD Meetup #5 Cognizant Presentation
Paris FOD Meetup #5 Cognizant Presentation
 
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
Intelligent data summit: Self-Service Big Data and AI/ML: Reality or Myth?
 
Resume
ResumeResume
Resume
 
Kettle: Pentaho Data Integration tool
Kettle: Pentaho Data Integration toolKettle: Pentaho Data Integration tool
Kettle: Pentaho Data Integration tool
 
The Lyft data platform: Now and in the future
The Lyft data platform: Now and in the futureThe Lyft data platform: Now and in the future
The Lyft data platform: Now and in the future
 
Lyft data Platform - 2019 slides
Lyft data Platform - 2019 slidesLyft data Platform - 2019 slides
Lyft data Platform - 2019 slides
 
Operational Machine Learning: Using Microsoft Technologies for Applied Data S...
Operational Machine Learning: Using Microsoft Technologies for Applied Data S...Operational Machine Learning: Using Microsoft Technologies for Applied Data S...
Operational Machine Learning: Using Microsoft Technologies for Applied Data S...
 
Feature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine LearningFeature Store as a Data Foundation for Machine Learning
Feature Store as a Data Foundation for Machine Learning
 
Amit_Kumar_CV
Amit_Kumar_CVAmit_Kumar_CV
Amit_Kumar_CV
 
Machine Intelligence for Design Automation
Machine Intelligence for Design AutomationMachine Intelligence for Design Automation
Machine Intelligence for Design Automation
 
Digital Reinvention by NRB
Digital Reinvention by NRBDigital Reinvention by NRB
Digital Reinvention by NRB
 
Jupyter in the modern enterprise data and analytics ecosystem
Jupyter in the modern enterprise data and analytics ecosystem Jupyter in the modern enterprise data and analytics ecosystem
Jupyter in the modern enterprise data and analytics ecosystem
 
Talend Metadata Bridge
Talend Metadata BridgeTalend Metadata Bridge
Talend Metadata Bridge
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
 
UiPath 23.4 Product Release Updates
UiPath 23.4 Product Release UpdatesUiPath 23.4 Product Release Updates
UiPath 23.4 Product Release Updates
 
Ecotech Presentation.pdf
Ecotech Presentation.pdfEcotech Presentation.pdf
Ecotech Presentation.pdf
 

Plus de Harald Erb

Actionable Insights with AI - Snowflake for Data Science
Actionable Insights with AI - Snowflake for Data ScienceActionable Insights with AI - Snowflake for Data Science
Actionable Insights with AI - Snowflake for Data ScienceHarald Erb
 
Snowflake for Data Engineering
Snowflake for Data EngineeringSnowflake for Data Engineering
Snowflake for Data EngineeringHarald Erb
 
Dataiku & Snowflake Meetup Berlin 2020
Dataiku & Snowflake Meetup Berlin 2020Dataiku & Snowflake Meetup Berlin 2020
Dataiku & Snowflake Meetup Berlin 2020Harald Erb
 
Does it only have to be ML + AI?
Does it only have to be ML + AI?Does it only have to be ML + AI?
Does it only have to be ML + AI?Harald Erb
 
Delivering rapid-fire Analytics with Snowflake and Tableau
Delivering rapid-fire Analytics with Snowflake and TableauDelivering rapid-fire Analytics with Snowflake and Tableau
Delivering rapid-fire Analytics with Snowflake and TableauHarald Erb
 
DOAG Big Data Days 2017 - Cloud Journey
DOAG Big Data Days 2017 - Cloud JourneyDOAG Big Data Days 2017 - Cloud Journey
DOAG Big Data Days 2017 - Cloud JourneyHarald Erb
 
Do you know what k-Means? Cluster-Analysen
Do you know what k-Means? Cluster-Analysen Do you know what k-Means? Cluster-Analysen
Do you know what k-Means? Cluster-Analysen Harald Erb
 
Exploratory Analysis in the Data Lab - Team-Sport or for Nerds only?
Exploratory Analysis in the Data Lab - Team-Sport or for Nerds only?Exploratory Analysis in the Data Lab - Team-Sport or for Nerds only?
Exploratory Analysis in the Data Lab - Team-Sport or for Nerds only?Harald Erb
 
Big Data Discovery + Analytics = Datengetriebene Innovation!
Big Data Discovery + Analytics = Datengetriebene Innovation!Big Data Discovery + Analytics = Datengetriebene Innovation!
Big Data Discovery + Analytics = Datengetriebene Innovation!Harald Erb
 
Big Data Discovery
Big Data DiscoveryBig Data Discovery
Big Data DiscoveryHarald Erb
 
DOAG News 2012 - Analytische Mehrwerte mit Big Data
DOAG News 2012 - Analytische Mehrwerte mit Big DataDOAG News 2012 - Analytische Mehrwerte mit Big Data
DOAG News 2012 - Analytische Mehrwerte mit Big DataHarald Erb
 
Machine Learning - Eine Challenge für Architekten (A Challenge for Architects)

  • 1. © Hitachi Vantara Corporation 2018. Machine Learning: A Challenge for Architects. #DOAG2018, Nürnberg, 20 November 2018. Harald Erb, Solutions Engineer, EMEA Central, Data Analytics & IoT
  • 2. Hitachi Vantara: What? And Why? OT (107+ years): Business, Industrial, Consumer, City. IT (58+ years): Communications, Big Data Analytics, Artificial Intelligence, Cloud, IT Systems. Hitachi: IoT, Insight.
  • 3. IoT Solution Portfolio of Hitachi Vantara
      • Analytics: Artificial Intelligence, Machine Learning, Stream Analytics, Batch Analytics
      • Data Management: Data Orchestration, Data Engineering, Data Blending, Data Collection
      • Asset: Asset Management, Asset Avatars, Data Stores
      • Business Connectors; APIs, App-Enabling Studio; Alerts, Notifications; Dashboards, UI, UX
      • Maintenance Insights, Manufacturing Insights, Video Insights
      • Manufacturing, Utilities, Logistics, Oil & Gas, Mining, Financial Services, Other
      • Co-Creation Services, Professional Services
      • Edge: Asset Integration, Device Control, Data Caching, Filtering
      • Foundry; Edge controllers, appliances; Converged and hyperconverged; Block, file and object storage; Pentaho
      • Legend: ◼ Industry solutions (Hitachi BUs, ISVs, SIs) ◼ IoT applications ◼ IoT platform services ◼ Edge-to-cloud infrastructure
  • 5. public class AgendaForThisTalk { public String Topic1 = "Process, Architecture and 1 Example"; public String Topic2 = "1 Dataset, 4 different Tools"; public String Topic3 = "From Prototype to Production"; }
  • 6. Process, Architecture and 1 Example_
  • 7. Data Warehouse & Analytics back in 1998 (figure label: Data Discovery)
  • 8. Architecting for Analytics & Machine Learning today. Source: Carlton E. Sapp: “Preparing and Architecting for Machine Learning”, Gartner, 2017
  • 9. Analytic Dashboard Example. Source: Carlton E. Sapp: “Preparing and Architecting for Machine Learning”, Gartner, 2017
  • 10. Fleet Optimization: End-to-end Fleet Management Solution
      • Combine sensor data with contextual information
      • Overcome the scarce availability of capable technicians
      • Lower costs, reduced customer downtime
  • 11. Truck Leasing Company
      • Issues: Trucks have become increasingly technology-based, capable technicians are scarce, and managing a large number of maintenance centers is expensive.
      • Business objectives: Better fleet management, meaning lower costs and more efficient maintenance to reduce customer downtime.
      • Strategic goals: Take competitive advantage of truck data. With 40,000-50,000 trucks purchased per year, the company has more data than any truck OEM. Gain a competitive edge through lower costs and predictive technology (automation in repairs and diagnostics).
  • 12. Vehicle Sensor Data: store models and sensor data
      • Asset model: Utility Vehicle Asset
      • Sensor data: Air Pressure, Axle Vibration, Lights, Load Weight, Movement, Temperature
      • Sensor Data Journey: Stream, Blend, Infer, Sense, Inspect, Embed & Integrate, Store
  • 13. Adding Context to Sensor Data (IoT Data Refinery: sensor data plus contextual data)
      • Sensor Data Journey: Stream, Blend, Infer, Sense, Inspect, Embed & Integrate, Store
      • Vehicle Location: GPS, Lat / Long, Mapping, Movement
      • Vehicle Profile: Make, Model, Mileage
      • Operational Systems: Maintenance History, Maintenance Schedule, Service Centers, Parts Ordering, Parts Inventory
      • Business Outcomes: Real-Time Fleet Status and Health, Repair Recommendations, Optimized Maintenance Scheduling, Automated Parts Ordering
  • 14. Fleet Management Dashboard: Situation Overview
  • 15. Fleet Management Dashboard: Contextual View
  • 16. Data Science – Personas & Process Model: cross-functional team. Source: K. Bollhöfer, Chief Data Scientist, *um
  • 17. Architecture Challenge. Source: D. Sculley, et al.: “Hidden technical debt in Machine learning systems”, 2015
  • 18. 1 Dataset, 4 different Tools_
  • 19. 1 Dataset, 4 Tools
      • Dataset: California House Prices, github.com/ageron/handson-ml/tree/master/datasets/housing
      • Jupyter Notebook: end-to-end ML projects; ML with a preferred programming language such as Python, R, or Julia; live code embedded in a markup document
      • Oracle Data Visualization: for data exploration; ML to explain dependencies in the dataset; 1-click analytical functions and model training possible
      • H2O Flow: end-to-end ML platform with its own compute engine, AutoML, …; notebook UI, H2O algorithms accessible from Python and R; Java export of trained models
      • Pentaho Data Integration: embedding ML code in ETL dataflows; ML orchestration (model training and management); Plugin Machine Intelligence
      • Free trial versions are available for all tools!
  • 20. Machine Learning Coding with Jupyter Notebook & Python. Start here: jupyter.org
  • 24. Machine Learning with Jupyter Notebook & Python
      • ML process support: end-to-end; focus on experimentation, not on production
      • ML personas: Data Scientists
      • Useful:
          − Notebooks allow reproducible ML from data ingestion to model evaluation
          − Multiple programming languages; access to the latest ML frameworks via the Python interface
          − Sophisticated visualizations
      • Architecture & development related:
          − Trained ML models can be serialized/saved, e.g. via Pickle or Joblib
          − Models can be published as REST API endpoints, e.g. via Flask
          − Notebooks are stored as JSON files → code versioning and merging are not always easy; less comfortable than other IDEs (no syntax highlighting, no keyword completion)
          − Large-scale ML is possible, e.g. via Apache Spark MLlib; cluster deployments are better done separately
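The serialization point on slide 24 can be illustrated with a minimal sketch. It assumes a toy one-feature least-squares model written in plain Python rather than a real ML framework; `fit_line`, `predict`, the file name `model.pkl`, and the data are hypothetical and not taken from the talk. Only the standard-library `pickle` module is used.

```python
import os
import pickle
import tempfile


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns the fitted parameters."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(params, x):
    """Apply the fitted line to a single feature value."""
    return params["intercept"] + params["slope"] * x


# Hypothetical toy data: median income (x) vs. house value (y).
incomes = [1.0, 2.0, 3.0, 4.0]
values = [100.0, 150.0, 200.0, 250.0]
model = fit_line(incomes, values)

# Persist the trained parameters, then load them as a separate scoring step would.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as fh:
    pickle.dump(model, fh)
with open(path, "rb") as fh:
    restored = pickle.load(fh)
```

Note that unpickling executes code chosen by whoever wrote the file, so a scoring service should only load pickles from a trusted source.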
  • 25. Dataset Exploration with Oracle Data Visualization. Start here: www.oracle.com/technetwork/middleware/oracle-data-visualization
  • 27. Dataset Exploration with Oracle Data Visualization
      • ML process support: Data Understanding phase, results presentation & storytelling
      • Personas: Business Analysts, Data Scientists
      • Useful:
          − Highly interactive charts and filters and intuitive analysis support (e.g. pattern brushing)
          − Supporting functions to highlight and explain attribute/variable dependencies
          − Formula editor, advanced (ML) functions, and model training and scoring available for experimentation (experienced users only)
      • Architecture & development related:
          − Use Oracle Analytics Cloud for better collaboration; can be combined with Data Visualization Desktop
          − Dataset preparation functionality is improving over time but will not replace existing ETL platforms (scalability, job management)
          − Good for rapid prototyping, but limited reusability of results (curated datasets, ML models)
  • 28. Machine Learning Orchestration and more with Pentaho. Start here: community.hitachivantara.com/docs/DOC-1009931-downloads Blog: community.hitachivantara.com/community/products-and-solutions/pentaho/blog/2018/10/16/deep-learning-coming-to-pentaho
  • 29. Bring Your Own (ML) Code
  • 30. ETL-Tools + Python/R – A Door Opener for Deep Learning?
  • 31. Embedding ML Algorithms into Data Pipelines
  • 32. “Model Zoo” managed by your Data Integration Solution
  • 33. Machine Learning within Data Integration Platforms
      • ML process support: Data Preparation phase, operationalizing ML, experimentation?
      • Personas: Data Engineers, Data Scientists (supporting)
      • Useful:
          − DI platforms provide advanced data preparation and automation features for creating datasets effectively
          − Intuitive UI, drag & drop instead of coding
          − Skilled DI team already in place
      • Architecture & development related:
          − DI platforms are optimized to use the full computing power for ETL/ELT, but not for ML tasks (e.g. parallel execution might need extra effort)
          − When using R / Python interfaces: make sure to collect status information and performance metrics of your ML model execution in the DI logging
          − The “ML toolbox” of DI solutions is often incomplete: script-based workarounds are needed, e.g. for imputation of missing values, working with the latest algorithms, or model management
          − Limited and unintuitive data exploration features, e.g. visualizations
  • 34. End-to-end Machine Learning for everyone with H2O (Flow). Start here: www.h2o.ai/download Accessing a 2-node H2O cluster in an R environment
  • 37. End-to-end Machine Learning for everyone with H2O Flow
      • ML process support: end-to-end; focus is on model training/evaluation and feature engineering
      • Personas: Data Scientists, ambitious Business Analysts(?)
      • Useful:
          − H2O Flow: intuitive notebook-style UI + user guidance
          − Use the programming language you already know, such as R or Python
          − AutoML can be used to automate the ML workflow, including training and tuning of many models within a user-specified time limit
      • Architecture & development related:
          − Takes advantage of the computing power of distributed systems and in-memory computing to accelerate machine learning
          − Works on existing big data infrastructure, on bare metal or on top of existing Hadoop or Spark clusters; can ingest data from HDFS, Spark, S3
          − Model deployment into production with Java (POJO) and binary formats (MOJO), Hive UDF, or as an API endpoint
          − H2O Flow: limited data exploration (no visualizations; only in combination with R/Python)
  • 38. Another tool decision: Which ML Framework to choose? Source: Dr. D. James: “Entscheidungsmatrix ‘Machine Learning’”, it-novum.com
  • 39. From Prototype to Production_
  • 40. From Exploratory Data Science to Production Workflows
      • Diagram labels: Line of Governance, Model
      • Commercial exploitation; integration into operations; non-functional requirements; standardisation & governance
      • Unbounded discovery; self-service sandbox; wide toolset / IDEs; agile methods
  • 41. Model deployment. Source: Sergei Izrailev: Design Patterns for Machine Learning in Production, 2017
      • Data transformations must be the same in training and scoring
      • Interface between building & scoring:
          − In-memory: model is never persisted, train then score → single & applications
          − Data only, e.g. PMML → code is independent
          − Serialized objects (Pickle, R, Spark, custom) → reuse code
          − Code + data, e.g. H2O’s POJO → code is generated
      • Detailed article in “Java aktuell” 06/2018
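One way to honor the first rule on slide 41 (identical transformations in training and scoring) is to store the fitted transformation parameters in the same artifact as the model, so the scoring side cannot drift from the training side. The following is a minimal sketch under that assumption; the function names, the nearest-neighbor toy model, and the data are hypothetical and not from the cited source.

```python
def fit_scaler(rows):
    """Learn per-column mean and standard deviation from the training rows."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return {"means": means, "stds": stds}


def transform(scaler, row):
    """The single transformation shared by training and scoring."""
    return [(v - m) / s for v, m, s in zip(row, scaler["means"], scaler["stds"])]


def train(rows, targets):
    """Fit the scaler, transform the training data, and bundle both in one artifact."""
    scaler = fit_scaler(rows)
    scaled = [transform(scaler, r) for r in rows]
    # Toy model: 1-nearest-neighbor lookup on the scaled features.
    return {"scaler": scaler, "points": scaled, "targets": targets}


def score(artifact, row):
    """Score a new row using the scaler that was fitted at training time."""
    x = transform(artifact["scaler"], row)
    dists = [sum((a - b) ** 2 for a, b in zip(x, p)) for p in artifact["points"]]
    return artifact["targets"][dists.index(min(dists))]


artifact = train([[1.0, 1000.0], [2.0, 2000.0], [3.0, 3000.0]], ["low", "mid", "high"])
label = score(artifact, [2.9, 2900.0])
```

Because `score` never re-fits the scaler, a change in the distribution of incoming data cannot silently alter how features are scaled.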
  • 43. H2O POJO (Plain Old Java Object):
      • ML model implemented through Java classes
      • Has dependencies on H2O-specific classes
  • 45. Gradient Boosting Machines (GBM)
      • A family of powerful machine-learning techniques for regression and classification problems
      • GBMs produce a prediction model in the form of an ensemble of weak prediction models, typically decision trees
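To make the "ensemble of weak prediction models" idea concrete, here is a minimal sketch of gradient boosting for regression with squared error, using one-split decision stumps on a single feature as the weak learners. It is a teaching sketch in plain Python, not H2O's implementation; all names and the data are hypothetical, and it assumes the feature has at least two distinct values.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimizes squared error on the residuals."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def fit_gbm(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        predictions = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                       for x, p in zip(xs, predictions)]
    return base, learning_rate, stumps


def predict_gbm(model, x):
    """Sum the shrunken contributions of all stumps on top of the base value."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(l if x <= t else r for t, l, r in stumps)


xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0]
model = fit_gbm(xs, ys)
```

Each stump on its own is a poor predictor; the accuracy comes from adding many of them, each correcting the errors left by the previous ones.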
  • 46. From Prototype to Machine Learning in Production. Source: Sergei Izrailev: Design Patterns for Machine Learning in Production, 2017
  • 47. Applying ML Model for Scoring of new (unlabeled) Data. Source: K. Wähner: How to Build and Deploy Scalable Machine Learning in Production with Apache Kafka
      • Add the ML model (= Java application) to an Apache Kafka stream processing application to apply it to new incoming events
      • Spark Streaming also allows ML model serving via mini batches and makes use of in-memory processing and load distribution
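The mini-batch serving pattern mentioned on slide 47 can be sketched without any streaming framework. The example below uses a plain Python iterable as a stand-in for the event stream; it does not use Kafka or Spark APIs, and `toy_model`, the `temperature` field, and the threshold are hypothetical.

```python
from itertools import islice


def mini_batches(events, batch_size):
    """Group an (possibly unbounded) event iterator into lists of at most batch_size."""
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def score_stream(events, model, batch_size=3):
    """Score events batch by batch and yield (event, score) pairs in arrival order."""
    for batch in mini_batches(events, batch_size):
        scores = model(batch)  # one model call per batch, not per event
        yield from zip(batch, scores)


def toy_model(batch):
    """Hypothetical model: flags sensor readings above a fixed threshold."""
    return [reading["temperature"] > 90 for reading in batch]


events = [{"temperature": t} for t in (70, 95, 80, 101)]
results = list(score_stream(events, toy_model, batch_size=3))
```

Batching trades a little latency for fewer model invocations, which matters when each call carries fixed overhead.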
  • 48. ML Model as API endpoint: Object detection example. Source: A. Rosebrock: “A scalable Keras + deep learning REST API”, pyimagesearch Blog
      • “API first” approach instead of rewriting code to replicate the ML model in a programming language supported by enterprise IT
      • Web APIs make it easy for applications written in different languages to work together: a developer who needs a model for an ML-powered web application only needs the URL endpoint where the API is served
      • Web service development frameworks like Flask allow prototyping in Python
      • For production environments, an additional web server and a messaging component should be considered
      • Keras: a simple, modular, and extensible deep learning library, written in Python and designed to enable fast experimentation with deep neural networks
      • Redis (Remote Dictionary Server): a distributed, in-memory key-value database with optional durability
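The "API first" idea can be sketched as a small JSON-over-HTTP prediction endpoint. The slide names Flask; to stay dependency-free, this sketch swaps in a plain WSGI callable from the Python standard library instead. The `/predict` path, the `median_income` field, and the `predict` function are hypothetical stand-ins, not part of the cited article.

```python
import io
import json


def predict(features):
    """Hypothetical stand-in for a trained model."""
    return 50.0 + 50.0 * features["median_income"]


def app(environ, start_response):
    """WSGI application exposing the model at POST /predict."""
    if environ.get("REQUEST_METHOD") != "POST" or environ.get("PATH_INFO") != "/predict":
        start_response("404 Not Found", [("Content-Type", "application/json")])
        return [json.dumps({"error": "not found"}).encode("utf-8")]
    length = int(environ.get("CONTENT_LENGTH") or 0)
    payload = json.loads(environ["wsgi.input"].read(length))
    body = json.dumps({"prediction": predict(payload)}).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call_app(path, payload):
    """Invoke the WSGI app in-process, without opening a network socket."""
    raw = json.dumps(payload).encode("utf-8")
    environ = {
        "REQUEST_METHOD": "POST",
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(raw)),
        "wsgi.input": io.BytesIO(raw),
    }
    status = []
    chunks = app(environ, lambda s, headers: status.append(s))
    return status[0], json.loads(b"".join(chunks))


status, body = call_app("/predict", {"median_income": 3.0})
```

For manual testing, `wsgiref.simple_server.make_server("127.0.0.1", 8000, app).serve_forever()` would serve the same callable over HTTP; as the slide notes, a production setup would put a dedicated web server in front of it.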
  • 49. Source: A. Rosebrock: "Building a simple Keras + deep learning REST API", The Keras Blog
  • 51. Object Detection – Use Case: PicSure AI Platform for Insurance Solutions
  • 52. Takeaway
      • Find the right problem
      • Define constraints
      • Design components and interfaces
      • Take into account organizational constraints
      • Production can’t be an afterthought
      • The process is a lot of work, but it’s not rocket science
      • Processing huge amounts of data with complex algorithms can take a bit too much time. Kudos to Randall Munroe / XKCD for the original.
  • 53. Thank You. © Hitachi Vantara Corporation 2018. All Rights Reserved