Open Source AI Platform for
Business Transformation
Desmond Chan
Senior Director of Marketing, H2O.ai
Agenda for H2O Introduction Webinar
▪ Company Introduction (5 mins)
▪ H2O Introduction and Demo (35 mins)
– Installation of H2O
– Flight delay prediction use case
• Use case description
• Data set description
• Data munging
• Model creation
▪ Q&A (10 mins)
H2O AI Platform
In-Memory, Distributed Machine Learning with Visual Intelligence
H2O AI in Spark with Data Prep and ML Pipelines
Operationalize Model Building and Deployment Governance.
Best-of-breed GPU Deep Learning with easy API and AutoML
TensorFlow, MXNet or Caffe and H2O
Deep Water
AI For Business Transformation
Insights on Text, Images, Transactions, Speech
Best Machine Learning Algorithms on Spark
Platform to Build and Scale Data Products.
Dual licensing (AGPL and Commercial)
H2O is the #1 Platform for Open Source AI
Open Source Drives Community Adoption
Companies Using H2O.ai
• 2014: 400
• 2015: 3,810
• 2016: 6,427
• 2017: 9,173
H2O.ai Users
• 2014: 1,000
• 2015: 38,257
• 2016: 54,163
• 2017: 83,108
* Data from July of every year, except for 2017 when data from Feb 21st are used.
H2O Recognized by Press and Customers
H2O.ai Strongly Positioned in Key Analyst Reports and Press
“Overall customer satisfaction is very high.”
“H2O is especially suited to IoT edge and device scenarios.”
“H2O had the highest reference customer analytics support score of all the vendors.”
H2O.ai is a Visionary in the Gartner Magic Quadrant for Data Science Platforms
“H2O.ai has significant adoption by large enterprises such as Macy’s, Comcast, and Capital One.”
“H2O.ai is best known for developing open source, cluster-distributed ML algorithms at a time (2011) when big data demanded them, but no one else had them.”
H2O.ai is a Strong Performer in the Forrester Predictive Analytics & Machine Learning
H2O.ai is among the Top 10 Hot Artificial Intelligence (AI) Technologies on Forbes
H2O.ai is named alongside Nvidia, Google, IBM, Intel, Microsoft, SAS, et al. in the Top 10 Hot Artificial Intelligence (AI) Technologies on Forbes, contributed by Gil Press
H2O Use Cases – Videos and Talks
Progressive (Auto Insurance: UBI Telematics)
Pawan Divarkarla, Chief Data Officer
“H2O is an enabler in how people are thinking about data.”
Zurich (Commercial Insurance: Risk Analytics)
Conor Jensen, Analytics Director
“Advanced analytics was one of the key investments we decided to make.”
Capital One (Financial Services: Customer Insights)
Brendan Herger, Data Scientist
“H2O is the best solution to iterate very quickly on large datasets and produce meaningful models.”
Nielsen Catalina (Digital Marketing: Consumer Behavior)
Satya Satyamoorthy, Director, Software Dev
“I am a big fan of open source. H2O is the best fit in terms of cost as well as ease of use and scalability and usability.”
Amy Wang
Math Hacker, H2O.ai
What is H2O?
Math Platform: Open source in-memory prediction engine
• Parallelized and distributed algorithms making the most use out of multithreaded systems
• GLM, Random Forest, GBM, PCA, etc.
API: Easy to use and adopt
• Written in Java – perfect for Java programmers
• REST API (JSON) – drives H2O from R, Python, Excel, Tableau
Big Data: More data? Or better models? BOTH
• Use all of your data – model without down sampling
• Run a simple GLM or a more complex GBM to find the best fit for the data
• More Data + Better Models = Better Predictions
H2O Algorithms
Supervised Learning
Statistical Analysis
• Generalized Linear Models: Binomial, Gaussian, Gamma, Poisson, and Tweedie
• Naive Bayes: Binary text classification
Ensembles
• Distributed Random Forest: Classification or regression models
• Gradient Boosting Machine: Ensembles of shallow decision trees with increasingly refined approximations
Deep Neural Networks
• Deep Learning: Creates multi-layer feed-forward neural networks starting with an input layer followed by multiple layers of nonlinear transformations
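The Gradient Boosting Machine bullet describes an ensemble of shallow trees that refines its approximation step by step. A minimal sketch of that idea in plain Python is given here, assuming squared-error loss, a single numeric feature, and depth-1 trees (stumps). The function names and the toy data are made up for illustration, and this is not H2O's GBM implementation.

```python
# Toy gradient boosting for regression with depth-1 trees (stumps).
# Assumptions: squared-error loss, one numeric feature, at least two distinct x values.

def fit_stump(xs, residuals):
    """Pick the threshold whose left/right means best fit the residuals."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gbm(xs, ys, n_trees=20, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, stumps, learning_rate


def predict_gbm(model, x, n_trees=None):
    """Predict with the first n_trees stumps (all of them when n_trees is None)."""
    base, stumps, learning_rate = model
    used = stumps if n_trees is None else stumps[:n_trees]
    return base + sum(learning_rate * predict_stump(s, x) for s in used)


def mse(model, xs, ys, n_trees=None):
    return sum((y - predict_gbm(model, x, n_trees)) ** 2
               for x, y in zip(xs, ys)) / len(xs)


# Toy data: y = x^2 on ten points.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]
model = fit_gbm(xs, ys)
```

Comparing `mse(model, xs, ys, 1)` with `mse(model, xs, ys, 20)` shows the training error shrinking as more stumps are added, which is the "increasingly refined" behaviour the slide refers to.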
H2O Algorithms
Unsupervised Learning
Clustering
• K-means: Partition observations into k clusters of the same spatial size. Categorical features are one-hot encoded.
• Archetypes [GLRM]: Partition observations into k archetypes.
Dimensionality Reduction
• Principal Component Analysis: Linearly transforms correlated variables to independent components
• Generalized Low Rank Model: Approximates data set as a product of two low dimensional factors. Extends PCA to handle sparse data, categorical data, and adds regularization.
Anomaly Detection
• Autoencoders [Deep Learning]: Create multi-layer feed-forward neural networks starting with an input layer followed by multiple layers of nonlinear transformations
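The K-means bullet notes that categorical features are one-hot encoded before clustering. A small sketch of what that encoding means is given here in plain Python; the helper names and the sample carrier codes are assumptions for illustration, not H2O code or data from the talk.

```python
def one_hot(values):
    """Turn categorical values into 0/1 indicator rows, one column per level."""
    levels = sorted(set(values))
    index = {level: i for i, level in enumerate(levels)}
    encoded = []
    for value in values:
        row = [0] * len(levels)
        row[index[value]] = 1
        encoded.append(row)
    return levels, encoded


def squared_distance(a, b):
    """Squared Euclidean distance, the quantity k-means minimizes within clusters."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical categorical column (airline carrier codes).
carriers = ["AA", "UA", "AA", "DL"]
levels, rows = one_hot(carriers)
```

After encoding, rows with the same category are at distance 0 and rows with different categories are at squared distance 2, so a distance-based method such as k-means can treat the column numerically.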
Accuracy with Speed and Scale
HDFS, S3, SQL, NoSQL
Classification, Regression, Feature Engineering, Matrix Factorization, Clustering, Munging
Fast Modeling Engine: In-Memory, Map Reduce/Fork Join, Columnar Compression
Deep Learning; PCA, GLM, Cox; Random Forest / GBM; Ensembles
Streaming, Nano Fast Java Scoring Engines
Reading Data into H2O with R
STEP 1
R user
h2o_df = h2o.importFile("../data/allyears2k.csv")
Reading Data from HDFS into H2O with R
STEP 2
2.1 R function call: h2o.importFile()
2.2 HTTP REST API request to H2O has HDFS path
2.3 H2O cluster initiates distributed ingest
2.4 H2O nodes request data (data.csv) from HDFS
Reading Data from HDFS into H2O with R
STEP 3
3.1 HDFS provides data (data.csv)
3.2 Distributed H2O Frame in DKV (H2O cluster)
3.3 Return pointer to data in REST API JSON response
3.4 h2o_df object created in R, holding the cluster IP, cluster port, and pointer to data
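The import steps end with the client holding only a pointer (cluster IP, cluster port, and a key to the data) while the rows stay in the cluster. A toy sketch of that arrangement in plain Python is given here. `FrameHandle`, `ToyCluster`, and the HDFS path are hypothetical names invented for illustration; no real H2O API or network call is involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameHandle:
    """Client-side reference to a frame; it stores no rows itself."""
    cluster_ip: str
    cluster_port: int
    frame_key: str


class ToyCluster:
    """Stand-in for the cluster: a key-value store mapping frame keys to data."""

    def __init__(self, ip, port):
        self.ip = ip
        self.port = port
        self.store = {}  # plays the role of the distributed key-value store

    def import_file(self, path, rows):
        # In the real flow the cluster reads the file itself; here the rows are
        # passed in so the sketch runs without any file system or network.
        key = path.rsplit("/", 1)[-1]
        self.store[key] = rows
        return FrameHandle(self.ip, self.port, key)

    def nrow(self, handle):
        # Work happens where the data lives; only the small result travels back.
        return len(self.store[handle.frame_key])


cluster = ToyCluster("127.0.0.1", 54321)
handle = cluster.import_file("hdfs://namenode/data/data.csv",
                             [[1, 2], [3, 4], [5, 6]])
```

The design point is that `handle` stays small no matter how large the data is: operations such as `cluster.nrow(handle)` are resolved on the cluster side through the key.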
Data Munging in R
Installing in R
R> install.packages("h2o")
Installing in Python
Terminal$ pip install h2o
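After the Python install, a quick way to confirm that the package is visible to the interpreter is to look it up without importing it. This is a small sketch using only the standard library; the helper name `is_installed` is made up here.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found by the import system."""
    return importlib.util.find_spec(package_name) is not None


print("h2o installed:", is_installed("h2o"))
```

If this reports `False`, the `pip install h2o` step most likely ran against a different Python environment than the one executing the check.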
Demo Time!
Questions?
Thanks for joining us!

Start Getting Your Feet Wet in Open Source Machine and Deep Learning

  • 1. Open Source AI Platform for Business Transformation
  • 2. Desmond Chan Senior Director of Marketing, H2O.ai
  • 3. Agenda for H2O Introduction Webinar ▪ Company Introduction (5 mins) ▪ H2O Introduction and Demo (35 mins) – Installation of H2O – Flight delay prediction use case • Use case description • Data set description • Data munging • Model creation ▪ Q&A (10 mins)
  • 4. H2O AI Platform In-Memory, Distributed Machine Learning with Visual Intelligence H2O AI in Spark with Data Prep and ML Pipelines Operationalize Model Building and Deployment Governance. Best-of-breed GPU Deep Learning with easy API and AutoML TensorFlow, MXNet or Caffe and H2O Deep Water AI For Business Transformation Insights on Text, Images, Transactions, Speech Best Machine Learning Algorithms on Spark Platform to Build and Scale Data Products. Dual licensing (AGPL and Commercial) H2O is the #1 Platform for Open Source AI
  • 5. Open Source Drives Community Adoption. Companies Using H2O.ai: 400 (2014), 3810 (2015), 6427 (2016), 9173 (2017). H2O.ai Users: 1000 (2014), 38257 (2015), 54163 (2016), 83108 (2017). * Data from July of every year, except for 2017 when data from Feb 21st are used.
  • 6. H2O Recognized by Press and Customers
  • 7. H2O.ai Strongly Positioned in Key Analyst Reports and Press “Overall customer satisfaction is very high.” “H2O is especially suited to IoT edge and device scenarios.” “H2O had the highest reference customer analytics support score of all the vendors.” H2O.ai is a Visionary in the Gartner Magic Quadrant for Data Science Platforms “H2O.ai has significant adoption by large enterprises such as Macy’s, Comcast, and Capital One.” “H2O.ai is best known for developing open source, cluster-distributed ML algorithms at a time (2011) when big data demanded them, but no one else had them.” H2O.ai is a Strong Performer in the Forrester Predictive Analytics & Machine Learning H2O.ai is a Top 10 Hot Artificial Intelligence (AI) Technologies on Forbes H2O.ai named alongside Nvidia, Google, IBM, Intel, Microsoft, SAS, et al. in Top 10 Hot Artificial Intelligence (AI) on Forbes - contributed by Gil Press
  • 8. H2O Use Cases – Videos and Talks Auto Insurance UBI Telematics Commercial Insurance Risk Analytics Financial Services Customer Insights Digital Marketing Consumer Behavior Pawan Divarkarla Chief Data Officer “H2O is an enabler in how people are thinking about data.” Conor Jensen Analytics Director “Advanced analytics was one of the key investments we decided to make.” Brendan Herger Data Scientist “H2O is the best solution to iterate very quickly on large datasets and produce meaningful models.” Satya Satyamoorthy Director, Software Dev “I am a big fan of open source. H2O is the best fit in terms of cost as well as ease of use and scalability and usability.” Progressive Zurich Capital One Nielsen Catalina
  • 10. What is H2O? Math Platform: Open source in-memory prediction engine • Parallelized and distributed algorithms making the most use out of multithreaded systems • GLM, Random Forest, GBM, PCA, etc. API: Easy to use and adopt • Written in Java – perfect for Java programmers • REST API (JSON) – drives H2O from R, Python, Excel, Tableau Big Data: More data? Or better models? BOTH • Use all of your data – model without down sampling • Run a simple GLM or a more complex GBM to find the best fit for the data • More Data + Better Models = Better Predictions
  • 11. H2O Algorithms – Supervised Learning (Statistical Analysis, Ensembles, Deep Neural Networks) • Generalized Linear Models: Binomial, Gaussian, Gamma, Poisson, and Tweedie • Naive Bayes: Binary Text Classification • Distributed Random Forest: Classification or Regression Models • Gradient Boosting Machine: Ensembles of shallow decision trees with increasingly refined approximations • Deep Learning: Create multi-layer feed forward neural networks starting with an input layer followed by multiple layers of nonlinear transformations
  • 12. H2O Algorithms – Unsupervised Learning (Clustering, Dimensionality Reduction, Anomaly Detection) • K-means: Partition observations into k clusters of the same spatial size. Categorical features are one-hot encoded. • Archetypes [GLRM]: Partition observations into k archetypes. • Principal Component Analysis: Linearly transforms correlated variables to independent components • Generalized Low Rank Model: Approximates data set as a product of two low dimensional factors. Extends PCA to handle sparse data, categorical data, and adds regularization. • Autoencoders [Deep Learning]: Create multi-layer feed forward neural networks starting with an input layer followed by multiple layers of nonlinear transformations
  • 13. Accuracy with Speed and Scale: HDFS, S3, SQL, NoSQL; Classification, Regression, Feature Engineering; In-Memory Map Reduce/Fork Join; Columnar Compression; Deep Learning; PCA, GLM, Cox; Random Forest / GBM; Ensembles; Fast Modeling Engine; Streaming; Nano Fast Java Scoring Engines; Matrix Factorization; Clustering; Munging
  • 14. Reading Data into H2O with R. STEP 1: The R user calls h2o_df = h2o.importFile("../data/allyears2k.csv")
  • 15. Reading Data from HDFS into H2O with R. STEP 2: 2.1 R function call h2o.importFile() 2.2 HTTP REST API request to H2O has HDFS path 2.3 H2O cluster initiates distributed ingest 2.4 H2O nodes request data (data.csv) from HDFS
  • 16. Reading Data from HDFS into H2O with R. STEP 3: 3.1 HDFS provides data (data.csv) 3.2 Distributed H2O Frame in DKV on the H2O cluster 3.3 Return pointer to data in REST API JSON response 3.4 h2o_df object created in R, holding the cluster IP, cluster port, and pointer to data
  • 18. Installing in R: R> install.packages("h2o") Installing in Python: Terminal$ pip install h2o
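
The import flow on slides 14 to 16 (a client call becomes a REST request, the cluster ingests the file, and the client gets back only a pointer) can be illustrated with a toy model. The sketch below is plain Python and does not use the h2o package. `ToyCluster`, `FrameHandle`, `import_file`, the JSON field names, the `storage` dict standing in for HDFS, and the sample values are all hypothetical, invented for illustration; none of this is H2O's actual API or wire format.

```python
import csv
import io
import json
from dataclasses import dataclass


class ToyCluster:
    """Hypothetical stand-in for an H2O cluster with a key-value store (DKV)."""

    def __init__(self, ip, port):
        self.ip = ip
        self.port = port
        self.dkv = {}  # key -> column data; the data never leaves the "cluster"

    def handle_import(self, request_json, storage):
        # The request carries only the path to the file (step 2.2).
        path = json.loads(request_json)["path"]
        # The cluster pulls the file from storage and parses it (steps 2.3 to 3.1).
        rows = list(csv.reader(io.StringIO(storage[path])))
        header, body = rows[0], rows[1:]
        key = "frame:" + path  # arbitrary key scheme, assumed for this sketch
        # Store the parsed frame column by column in the key-value store (step 3.2).
        self.dkv[key] = {
            name: [row[i] for row in body] for i, name in enumerate(header)
        }
        # Reply with a pointer to the data, not the data itself (step 3.3).
        return json.dumps({"ip": self.ip, "port": self.port, "key": key})


@dataclass(frozen=True)
class FrameHandle:
    """Client-side object: cluster IP, cluster port, and pointer to the data."""

    ip: str
    port: int
    key: str


def import_file(cluster, storage, path):
    # Client side of the flow: send the path, receive a pointer (step 3.4).
    request = json.dumps({"path": path})
    response = json.loads(cluster.handle_import(request, storage))
    return FrameHandle(response["ip"], response["port"], response["key"])


# Made-up sample data; the dict plays the role of HDFS.
storage = {"hdfs://example/data.csv": "x,y\n1,2\n3,4\n"}
cluster = ToyCluster("127.0.0.1", 54321)
h2o_df = import_file(cluster, storage, "hdfs://example/data.csv")
print(h2o_df)
```

The point of the design is that `h2o_df` stays small no matter how large the file is: the client holds three fields, and any later operation on the frame would be another request that names the same key.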