SlideShare une entreprise Scribd logo
1  sur  44
Télécharger pour lire hors ligne
SCALABLE DATA SCIENCE AND
DEEP LEARNING WITH H2O
Arno Candel, H2O.ai
O P E N
D A T A
S C I E N C E
C O N F E R E N C E_ BOSTON 2015
@opendatasci
H2O.ai

Machine Intelligence
Who Am I?
Arno Candel
Chief Architect,
Physicist & Hacker at H2O.ai
PhD Physics, ETH Zurich 2005
10+ yrs Supercomputing (HPC)
6 yrs at SLAC (Stanford Lin. Accel.)
3.5 yrs Machine Learning
1.5 yrs at H2O.ai
Fortune Magazine
Big Data All Star 2014
Follow me @ArnoCandel 2
H2O.ai

Machine Intelligence
Outline
• Introduction
• H2O Deep Learning Architecture
• Live Demos:
Flow GUI - Airline Dataset
R - MNIST World Record + Anomaly Detection
Flow GUI - Higgs Boson Classification
Sparkling Water - Chicago Crime Prediction
iPython - CitiBike Demand Prediction
Scoring Engine - Million Songs Classification
• Outlook
3
H2O.ai

Machine Intelligence
See All Demos In Full Tomorrow!
4
http://www.meetup.com/bostonml/events/222464750/
H2O.ai

Machine Intelligence
In-Memory ML
Distributed
Open Source
APIs
5
Memory-Efficient Data Structures
Cutting-Edge Algorithms
Use all your Data (No Sampling)
Accuracy with Speed and Scale
Ownership of Methods - Apache V2
Easy to Deploy: Bare, Hadoop, Spark, etc.
Java, Scala, R, Python, JavaScript, JSON
NanoFast Scoring Engine (POJO)
H2O - Product Overview
H2O.ai

Machine Intelligence
25,000 commits / 3yrs
H2O World Conference 2014
Team Work @ H2O.ai
6
Join H2O World 2015!
H2O.ai

Machine Intelligence
103 634 2789
463 2,887 13,237
Companies
Users
Mar 2014 July 2014 Mar 2015
Active Users
150+
7
Strong Community & Growth
5/25/15 @kdnuggets t.co/4xSgleSIdY
H2O.ai

Machine Intelligence
8
HDFS
S3
SQL
NoSQL
Classification
Regression
Feature
Engineering
Distributed In-Memory
Map Reduce/Fork Join
Columnar Compression
GLM, Deep Learning
K-Means, PCA, NB,
Cox
Random Forest / GBM
Ensembles
Fast Modeling Engine
Streaming
Nano Fast Java Scoring Engines
(POJO code generation)
Matrix
Factorization Clustering
Munging
Unsupervised
Supervised
Accuracy with Speed and Scale
Most code is written
in-house from scratch
H2O.ai

Machine Intelligence
9
Ad Optimization (200% CPA Lift with H2O)
P2B Model Factory (60k models,
15x faster with H2O than before)
Fraud Detection (11% higher accuracy with
H2O Deep Learning - saves millions)
…and many large insurance and financial
services companies!
Real-time marketing (H2O is 10x faster than
anything else)
Actual Customer Use Cases
H2O.ai

Machine Intelligence
10
h2o.ai/download & Run!
H2O.ai

Machine Intelligence
11
Airline Data: Predict Delayed Departure
Predict dep. delay Y/N
116M rows
31 colums
12 GB CSV
4 GB compressed
20 years of domestic
airline flight data
H2O.ai

Machine Intelligence
12
Results in Seconds on Big Data
Logistic Regression: ~20s
elastic net, alpha=0.5, lambda=1.379e-4 (auto)
Deep Learning: ~70s
4 hidden ReLU layers of 20 neurons, 2 epochs
8-node EC2 cluster: 64 virtual cores, 1GbE
Year, Month, Sched.
Dep. Time have
non-linear impact
Chicago, Atlanta,
Dallas:

often delayed
All cores maxed out
+9% AUC
+--+++
H2O.ai

Machine Intelligence
Multi-layer feed-forward Neural Network

Trained with back-propagation (SGD, ADADELTA)
+ distributed processing for big data
(fine-grain in-memory MapReduce on distributed data)
+ multi-threaded speedup
(async fork/join worker threads operate at FORTRAN speeds)
+ smart algorithms for fast & accurate results
(automatic standardization, one-hot encoding of categoricals, missing value imputation, weight &
bias initialization, adaptive learning rate, momentum, dropout/l1/L2 regularization, grid search, 

N-fold cross-validation, checkpointing, load balancing, auto-tuning, model averaging, etc.)
= powerful tool for (un)supervised machine
learning on real-world data
13
H2O Deep Learning
all 320 cores maxed out
H2O.ai

Machine Intelligence
threads: async
14
H2O Deep Learning Architecture
K-V
K-V
HTTPD
HTTPD
nodes/JVMs: sync
communication
w
w w
w w w w
w1
w3 w2
w4
w2+w4
w1+w3
w* = (w1+w2+w3+w4)/4
map:

each node trains a copy of
the weights and biases with
(some* or all of) its local
data with asynchronous F/J
threads
initial model: weights and biases w
updated model: w*
H2O in-memory non-
blocking hash map:

K-V store
reduce:

model averaging: average weights
and biases from all nodes,
speedup is at least #nodes/
log(#rows)
http://arxiv.org/abs/1209.4129
Keep iterating over the data (“epochs”), score at user-given times
Query & display the
model via JSON, WWW
2
2 4
31
1
1
1
4
3 2
1 2
1
i
*auto-tuned (default) or user-
specified number of rows per
MapReduce iteration
Main Loop:
H2O.ai

Machine Intelligence
15
H2O Deep Learning beats MNIST
MNIST: Handwritten digits: 28^2=784 gray-scale pixel values
full run: 10 hours on 10-node cluster
2 hours on desktop gets to 0.9% test set error
Just supervised training
on original 60k/10k dataset:
No data augmentation
No distortions
No convolutions
No pre-training
No ensemble
0.83% test set error:

current world record
1-liner: call h2o.deeplearning() in R
H2O.ai

Machine Intelligence
16
Unsupervised Anomaly Detection
The good The bad The ugly
Try it yourself!
Auto-Encoder learns
“Compressed Identity”
H2O.ai

Machine Intelligence
17
Images courtesy CERN / LHC
Higgs

vs

Background
Large Hadron Collider: Largest experiment of mankind!
$13+ billion, 16.8 miles long, 120 MegaWatts, -456F, 1PB/day, etc.
Higgs boson discovery (July ’12) led to 2013 Nobel prize!
Higgs Boson - Classification Problem
H2O.ai

Machine Intelligence
18
UCI Higgs Dataset: 11M rows, 29 cols
C2-C22: 21 low-level features

(detector data)
7 high-level features

(physics formulae)
Assume we don’t know Physics…
H2O.ai

Machine Intelligence
19
? ? ?
Former CERN baseline for AUC: 0.733 and 0.816
H2O Algorithm low-level H2O AUC all features H2O AUC
Generalized Linear Model 0.596 0.684
Random Forest 0.764 0.840
Gradient Boosted Trees 0.753 0.839
Neural Net 1 hidden layer 0.760 0.830
H2O Deep Learning ?
add
derived
features
Deep Learning for Higgs Detection?
Q: Can Deep Learning learn Physics for us?
H2O.ai

Machine Intelligence
20
EC2 Demo Cluster: 8 nodes, 64 cores
H2O Deep Learning:
Expect good cluster
utilization :)
H2O.ai

Machine Intelligence
21
Deep DL model on

low-level features
only
valid 500k rows
test 500k rows
train 10M rows
H2O Deep Learning Higgs Demo
H2O: same results as Nature paper
Deep Learning just learned Particle Physics!
8 EC2 nodes:
AUC = 0.86 after 100 mins
AUC = 0.87+ overnight
H2O.ai

Machine Intelligence
22
http://www.slideshare.net/0xdata/crime-deeplearningkey
http://www.datanami.com/2015/05/07/what-police-can-learn-from-deep-learning/
H2O Deep Learning in the News
Alex, Michal,
et al.
H2O.ai

Machine Intelligence
23
Weather + Census + Crime Data
H2O.ai

Machine Intelligence 24
Spark + H2O = Sparkling Water
H2O.ai

Machine Intelligence
25
Sparkling Water Demo
Instructions at h2o.ai/download
H2O.ai

Machine Intelligence
26
Parse & Munge with H2O, Convert to RDD
H2O Parser: Robust & Fast
Simple Column Selection
H2O.ai

Machine Intelligence
27
Parse & Munge with H2O, Convert to RDD
Munging: Date Manipulations
Conversion to DataFrame
H2O.ai

Machine Intelligence
28
Join RDDs with SQL, Convert to H2O
Spark SQL Query Execution
Convert back to H2OFrame
Split into Train 80% / Test 20%
H2O.ai

Machine Intelligence
29
Build H2O Deep Learning Model
Train a H2O Deep Learning
Model on Data obtained by
Spark SQL Query
Predict whether Arrest will be
made with AUC of 0.90+
H2O.ai

Machine Intelligence
30
Visualize Results with Flow
Using Flow to interactively plot
Arrest Rate (blue)
vs
Relative Occurrence (red)
per crime type.
H2O.ai

Machine Intelligence
31
Predict Rental Bike Demand in NYC
Cliff et al.
H2O.ai

Machine Intelligence
iPython
Notebook Demo
32
Group-By Aggregation
H2O.ai

Machine Intelligence
iPython
Notebook Demo
33
Model Building And Scoring
91% AUC baseline
H2O.ai

Machine Intelligence
34
Joining Bikes-Per-Day with Weather
H2O.ai

Machine Intelligence
35
Improved Models with Weather Data
93% AUC after joining
bike and weather data
H2O.ai

Machine Intelligence
36
Example: First GBM tree
Fast and easy path to Production (batch or real-time)!
POJO Scoring Engine
Standalone Java scoring code is auto-generated!
Note:
no heap allocation,
pure decision-making
H2O.ai

Machine Intelligence
More Info in H2O Booklets
https://leanpub.com/u/h2oai
http://learn.h2o.ai
37
H2O.ai

Machine Intelligence
38
Kaggle - H2O Starter Scripts
H2O.ai

Machine Intelligence
39
Kaggle - Active H2O Community
18k+ views in 2 weeks
H2O.ai

Machine Intelligence
40
Hyper-Parameter Tuning
93 numerical features
9 output classes
62k training set rows
144k test set rows
Ensemble of H2O DL + GBM => Top 10%
R script by DataGeek
H2O.ai

Machine Intelligence
41
Mastering Kaggle with H2O
DL + GBM
GBM
GBM + GLM
DRF + GLM
Stay tuned: Kaggle Master
@Mark_A_Landry recently
joined H2O as Competitive
Data Scientist!
www.meetup.com/Silicon-
Valley-Big-Data-Science/
events/222303884/
H2O.ai

Machine Intelligence
42
R’s data.table now in H2O!
H2O.ai

Machine Intelligence
Outlook - Algorithm Roadmap
• Ensembles (Erin LeDell et al.)
• Automated Hyper-Parameter Tuning
• Convolutional Layers for Deep Learning
• Natural Language Processing: tf-idf, Word2Vec, …
• Generalized Low Rank Models
• PCA, SVD, K-Means, Matrix Factorization
• Recommender Systems
And many more!
43
Public JIRAs - Join H2O!
H2O.ai

Machine Intelligence
Key Take-Aways
H2O is an open source predictive analytics
platform for data scientists and business analysts
who need scalable, fast and accurate machine
learning.
H2O Deep Learning is ready to take your
advanced analytics to the next level.
Try it on your data!
44
https://github.com/h2oai
H2O Google Group
http://h2o.ai
@h2oai
Thank You!

Contenu connexe

Tendances

Scalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OScalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2O
Sri Ambati
 
Microservices, containers, and machine learning
Microservices, containers, and machine learningMicroservices, containers, and machine learning
Microservices, containers, and machine learning
Paco Nathan
 

Tendances (20)

Project "Deep Water"
Project "Deep Water"Project "Deep Water"
Project "Deep Water"
 
Big Data Science with H2O in R
Big Data Science with H2O in RBig Data Science with H2O in R
Big Data Science with H2O in R
 
Intro to H2O in Python - Data Science LA
Intro to H2O in Python - Data Science LAIntro to H2O in Python - Data Science LA
Intro to H2O in Python - Data Science LA
 
Scalable Ensemble Machine Learning @ Harvard Health Policy Data Science Lab
Scalable Ensemble Machine Learning @ Harvard Health Policy Data Science LabScalable Ensemble Machine Learning @ Harvard Health Policy Data Science Lab
Scalable Ensemble Machine Learning @ Harvard Health Policy Data Science Lab
 
New Developments in H2O: April 2017 Edition
New Developments in H2O: April 2017 EditionNew Developments in H2O: April 2017 Edition
New Developments in H2O: April 2017 Edition
 
Scalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2OScalable Machine Learning in R and Python with H2O
Scalable Machine Learning in R and Python with H2O
 
Introduction to H2O and Model Stacking Use Cases
Introduction to H2O and Model Stacking Use CasesIntroduction to H2O and Model Stacking Use Cases
Introduction to H2O and Model Stacking Use Cases
 
AI Development with H2O.ai
AI Development with H2O.aiAI Development with H2O.ai
AI Development with H2O.ai
 
Introduction to GPUs for Machine Learning
Introduction to GPUs for Machine LearningIntroduction to GPUs for Machine Learning
Introduction to GPUs for Machine Learning
 
Automatic and Interpretable Machine Learning with H2O and LIME
Automatic and Interpretable Machine Learning with H2O and LIMEAutomatic and Interpretable Machine Learning with H2O and LIME
Automatic and Interpretable Machine Learning with H2O and LIME
 
Deep Learning with MXNet - Dmitry Larko
Deep Learning with MXNet - Dmitry LarkoDeep Learning with MXNet - Dmitry Larko
Deep Learning with MXNet - Dmitry Larko
 
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2ODeep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
Deep Water - Bringing Tensorflow, Caffe, Mxnet to H2O
 
H2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to EveryoneH2O Deep Water - Making Deep Learning Accessible to Everyone
H2O Deep Water - Making Deep Learning Accessible to Everyone
 
Machine Learning for Smarter Apps - Jacksonville Meetup
Machine Learning for Smarter Apps - Jacksonville MeetupMachine Learning for Smarter Apps - Jacksonville Meetup
Machine Learning for Smarter Apps - Jacksonville Meetup
 
H2O 3 REST API Overview
H2O 3 REST API OverviewH2O 3 REST API Overview
H2O 3 REST API Overview
 
Intro to H2O Machine Learning in R at Santa Clara University
Intro to H2O Machine Learning in R at Santa Clara UniversityIntro to H2O Machine Learning in R at Santa Clara University
Intro to H2O Machine Learning in R at Santa Clara University
 
From Kaggle to H2O - The True Story of a Civil Engineer Turned Data Geek
From Kaggle to H2O - The True Story of a Civil Engineer Turned Data GeekFrom Kaggle to H2O - The True Story of a Civil Engineer Turned Data Geek
From Kaggle to H2O - The True Story of a Civil Engineer Turned Data Geek
 
Microservices, containers, and machine learning
Microservices, containers, and machine learningMicroservices, containers, and machine learning
Microservices, containers, and machine learning
 
Spark streaming
Spark streamingSpark streaming
Spark streaming
 
Sparkling Water 2.0 - Michal Malohlava
Sparkling Water 2.0 - Michal MalohlavaSparkling Water 2.0 - Michal Malohlava
Sparkling Water 2.0 - Michal Malohlava
 

En vedette

Arno candel scalabledatascienceanddeeplearningwithh2o_odsc_boston2015
Arno candel scalabledatascienceanddeeplearningwithh2o_odsc_boston2015Arno candel scalabledatascienceanddeeplearningwithh2o_odsc_boston2015
Arno candel scalabledatascienceanddeeplearningwithh2o_odsc_boston2015
Sri Ambati
 
MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...
Sri Ambati
 

En vedette (20)

Data Science, Machine Learning, and H2O
Data Science, Machine Learning, and H2OData Science, Machine Learning, and H2O
Data Science, Machine Learning, and H2O
 
H2O Big Data Environments
H2O Big Data EnvironmentsH2O Big Data Environments
H2O Big Data Environments
 
Build Your Own Recommendation Engine
Build Your Own Recommendation EngineBuild Your Own Recommendation Engine
Build Your Own Recommendation Engine
 
Intro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWSIntro to Machine Learning with H2O and AWS
Intro to Machine Learning with H2O and AWS
 
Arno candel scalabledatascienceanddeeplearningwithh2o_odsc_boston2015
Arno candel scalabledatascienceanddeeplearningwithh2o_odsc_boston2015Arno candel scalabledatascienceanddeeplearningwithh2o_odsc_boston2015
Arno candel scalabledatascienceanddeeplearningwithh2o_odsc_boston2015
 
Classification of Crime
Classification of CrimeClassification of Crime
Classification of Crime
 
Introduction to data science with H2O-Chicago
Introduction to data science with H2O-ChicagoIntroduction to data science with H2O-Chicago
Introduction to data science with H2O-Chicago
 
San Francisco Crime Classification
San Francisco Crime ClassificationSan Francisco Crime Classification
San Francisco Crime Classification
 
H2O Machine Learning and Kalman Filters for Machine Prognostics
H2O Machine Learning and Kalman Filters for Machine PrognosticsH2O Machine Learning and Kalman Filters for Machine Prognostics
H2O Machine Learning and Kalman Filters for Machine Prognostics
 
Hadoop cluster os_tuning_v1.0_20170106_mobile
Hadoop cluster os_tuning_v1.0_20170106_mobileHadoop cluster os_tuning_v1.0_20170106_mobile
Hadoop cluster os_tuning_v1.0_20170106_mobile
 
Formal methods 8 - category theory (last one)
Formal methods   8 - category theory (last one)Formal methods   8 - category theory (last one)
Formal methods 8 - category theory (last one)
 
Building Machine Learning Applications with Sparkling Water
Building Machine Learning Applications with Sparkling WaterBuilding Machine Learning Applications with Sparkling Water
Building Machine Learning Applications with Sparkling Water
 
Applied Machine learning using H2O, python and R Workshop
Applied Machine learning using H2O, python and R WorkshopApplied Machine learning using H2O, python and R Workshop
Applied Machine learning using H2O, python and R Workshop
 
MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...
 
H2O World - Top 10 Deep Learning Tips & Tricks - Arno Candel
H2O World - Top 10 Deep Learning Tips & Tricks - Arno CandelH2O World - Top 10 Deep Learning Tips & Tricks - Arno Candel
H2O World - Top 10 Deep Learning Tips & Tricks - Arno Candel
 
H2O World - Sparkling Water - Michal Malohlava
H2O World - Sparkling Water - Michal MalohlavaH2O World - Sparkling Water - Michal Malohlava
H2O World - Sparkling Water - Michal Malohlava
 
H2O.ai - Road Ahead - keynote presentation by Sri Ambati
H2O.ai - Road Ahead - keynote presentation by Sri AmbatiH2O.ai - Road Ahead - keynote presentation by Sri Ambati
H2O.ai - Road Ahead - keynote presentation by Sri Ambati
 
GBM in H2O with Cliff Click: H2O API
GBM in H2O with Cliff Click: H2O APIGBM in H2O with Cliff Click: H2O API
GBM in H2O with Cliff Click: H2O API
 
Building Random Forest at Scale
Building Random Forest at ScaleBuilding Random Forest at Scale
Building Random Forest at Scale
 
Deep Water - GPU Deep Learning for H2O - Arno Candel
Deep Water - GPU Deep Learning for H2O - Arno CandelDeep Water - GPU Deep Learning for H2O - Arno Candel
Deep Water - GPU Deep Learning for H2O - Arno Candel
 

Similaire à Scalable Data Science and Deep Learning with H2O

Arno candel scalabledatascienceanddeeplearningwithh2o_reworkboston2015
Arno candel scalabledatascienceanddeeplearningwithh2o_reworkboston2015Arno candel scalabledatascienceanddeeplearningwithh2o_reworkboston2015
Arno candel scalabledatascienceanddeeplearningwithh2o_reworkboston2015
Sri Ambati
 
ArnoCandelScalabledatascienceanddeeplearningwithh2o_gotochg
ArnoCandelScalabledatascienceanddeeplearningwithh2o_gotochgArnoCandelScalabledatascienceanddeeplearningwithh2o_gotochg
ArnoCandelScalabledatascienceanddeeplearningwithh2o_gotochg
Sri Ambati
 
Petascale Analytics - The World of Big Data Requires Big Analytics
Petascale Analytics - The World of Big Data Requires Big AnalyticsPetascale Analytics - The World of Big Data Requires Big Analytics
Petascale Analytics - The World of Big Data Requires Big Analytics
Heiko Joerg Schick
 
Checkpointing the Un-checkpointable: MANA and the Split-Process Approach
Checkpointing the Un-checkpointable: MANA and the Split-Process ApproachCheckpointing the Un-checkpointable: MANA and the Split-Process Approach
Checkpointing the Un-checkpointable: MANA and the Split-Process Approach
inside-BigData.com
 

Similaire à Scalable Data Science and Deep Learning with H2O (20)

Arno candel scalabledatascienceanddeeplearningwithh2o_reworkboston2015
Arno candel scalabledatascienceanddeeplearningwithh2o_reworkboston2015Arno candel scalabledatascienceanddeeplearningwithh2o_reworkboston2015
Arno candel scalabledatascienceanddeeplearningwithh2o_reworkboston2015
 
ArnoCandelScalabledatascienceanddeeplearningwithh2o_gotochg
ArnoCandelScalabledatascienceanddeeplearningwithh2o_gotochgArnoCandelScalabledatascienceanddeeplearningwithh2o_gotochg
ArnoCandelScalabledatascienceanddeeplearningwithh2o_gotochg
 
Deep Learning in the Wild with Arno Candel
Deep Learning in the Wild with Arno CandelDeep Learning in the Wild with Arno Candel
Deep Learning in the Wild with Arno Candel
 
Saving Human Lives with the IoT
Saving Human Lives with the IoTSaving Human Lives with the IoT
Saving Human Lives with the IoT
 
Petascale Analytics - The World of Big Data Requires Big Analytics
Petascale Analytics - The World of Big Data Requires Big AnalyticsPetascale Analytics - The World of Big Data Requires Big Analytics
Petascale Analytics - The World of Big Data Requires Big Analytics
 
OpenACC Monthly Highlights: February 2022
OpenACC Monthly Highlights: February 2022OpenACC Monthly Highlights: February 2022
OpenACC Monthly Highlights: February 2022
 
Deep Learning through Examples
Deep Learning through ExamplesDeep Learning through Examples
Deep Learning through Examples
 
Python and H2O with Cliff Click at PyData Dallas 2015
Python and H2O with Cliff Click at PyData Dallas 2015Python and H2O with Cliff Click at PyData Dallas 2015
Python and H2O with Cliff Click at PyData Dallas 2015
 
Transform your Business with AI, Deep Learning and Machine Learning
Transform your Business with AI, Deep Learning and Machine LearningTransform your Business with AI, Deep Learning and Machine Learning
Transform your Business with AI, Deep Learning and Machine Learning
 
Top 10 Performance Gotchas for scaling in-memory Algorithms.
Top 10 Performance Gotchas for scaling in-memory Algorithms.Top 10 Performance Gotchas for scaling in-memory Algorithms.
Top 10 Performance Gotchas for scaling in-memory Algorithms.
 
qconsf 2013: Top 10 Performance Gotchas for scaling in-memory Algorithms - Sr...
qconsf 2013: Top 10 Performance Gotchas for scaling in-memory Algorithms - Sr...qconsf 2013: Top 10 Performance Gotchas for scaling in-memory Algorithms - Sr...
qconsf 2013: Top 10 Performance Gotchas for scaling in-memory Algorithms - Sr...
 
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...Computing Just What You Need: Online Data Analysis and Reduction  at Extreme ...
Computing Just What You Need: Online Data Analysis and Reduction at Extreme ...
 
Apache Beam (incubating)
Apache Beam (incubating)Apache Beam (incubating)
Apache Beam (incubating)
 
H2O at Poznan R Meetup
H2O at Poznan R MeetupH2O at Poznan R Meetup
H2O at Poznan R Meetup
 
Checkpointing the Un-checkpointable: MANA and the Split-Process Approach
Checkpointing the Un-checkpointable: MANA and the Split-Process ApproachCheckpointing the Un-checkpointable: MANA and the Split-Process Approach
Checkpointing the Un-checkpointable: MANA and the Split-Process Approach
 
Agents In An Exponential World Foster
Agents In An Exponential World FosterAgents In An Exponential World Foster
Agents In An Exponential World Foster
 
Aplicações Potenciais de Deep Learning à Indústria do Petróleo
Aplicações Potenciais de Deep Learning à Indústria do PetróleoAplicações Potenciais de Deep Learning à Indústria do Petróleo
Aplicações Potenciais de Deep Learning à Indústria do Petróleo
 
The Joys of Clean Data with Matt Dowle
The Joys of Clean Data with Matt DowleThe Joys of Clean Data with Matt Dowle
The Joys of Clean Data with Matt Dowle
 
scalable machine learning
scalable machine learningscalable machine learning
scalable machine learning
 
Functional architectural patterns
Functional architectural patternsFunctional architectural patterns
Functional architectural patterns
 

Plus de odsc

Kaggle The Home of Data Science
Kaggle The Home of Data ScienceKaggle The Home of Data Science
Kaggle The Home of Data Science
odsc
 
The Art of Data Science
The Art of Data Science The Art of Data Science
The Art of Data Science
odsc
 
Frontiers of Open Data Science Research
Frontiers of Open Data Science ResearchFrontiers of Open Data Science Research
Frontiers of Open Data Science Research
odsc
 

Plus de odsc (20)

Understanding the Chief Data Officer
Understanding the Chief Data Officer Understanding the Chief Data Officer
Understanding the Chief Data Officer
 
Machine-In-The-Loop for Knowledge Discovery
Machine-In-The-Loop for Knowledge DiscoveryMachine-In-The-Loop for Knowledge Discovery
Machine-In-The-Loop for Knowledge Discovery
 
API Driven Development
API Driven Development API Driven Development
API Driven Development
 
Mobile technology Usage by Humanitarian Programs: A Metadata Analysis
Mobile technology Usage by Humanitarian Programs: A Metadata AnalysisMobile technology Usage by Humanitarian Programs: A Metadata Analysis
Mobile technology Usage by Humanitarian Programs: A Metadata Analysis
 
Productionizing Deep Learning From the Ground Up
Productionizing Deep Learning From the Ground UpProductionizing Deep Learning From the Ground Up
Productionizing Deep Learning From the Ground Up
 
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and HiveBig Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive
Big Data Infrastructure: Introduction to Hadoop with MapReduce, Pig, and Hive
 
Think Breadth, Not Depth
Think Breadth, Not DepthThink Breadth, Not Depth
Think Breadth, Not Depth
 
Data Science at Dow Jones: Monetizing Data, News and Information
Data Science at Dow Jones: Monetizing Data, News and InformationData Science at Dow Jones: Monetizing Data, News and Information
Data Science at Dow Jones: Monetizing Data, News and Information
 
Spark, Python and Parquet
Spark, Python and Parquet Spark, Python and Parquet
Spark, Python and Parquet
 
Building a Predictive Analytics Solution with Azure ML
Building a Predictive Analytics Solution with Azure MLBuilding a Predictive Analytics Solution with Azure ML
Building a Predictive Analytics Solution with Azure ML
 
Beyond Names
Beyond NamesBeyond Names
Beyond Names
 
How Woman are Conquering the S&P 500
How Woman are Conquering the S&P 500How Woman are Conquering the S&P 500
How Woman are Conquering the S&P 500
 
Domain Expertise and Unstructured Data
Domain Expertise and Unstructured DataDomain Expertise and Unstructured Data
Domain Expertise and Unstructured Data
 
Kaggle The Home of Data Science
Kaggle The Home of Data ScienceKaggle The Home of Data Science
Kaggle The Home of Data Science
 
Open Source Tools & Data Science Competitions
Open Source Tools & Data Science Competitions Open Source Tools & Data Science Competitions
Open Source Tools & Data Science Competitions
 
Machine Learning with scikit-learn
Machine Learning with scikit-learnMachine Learning with scikit-learn
Machine Learning with scikit-learn
 
Bridging the Gap Between Data and Insight using Open-Source Tools
Bridging the Gap Between Data and Insight using Open-Source ToolsBridging the Gap Between Data and Insight using Open-Source Tools
Bridging the Gap Between Data and Insight using Open-Source Tools
 
Top 10 Signs of the Textpocalypse
Top 10 Signs of the TextpocalypseTop 10 Signs of the Textpocalypse
Top 10 Signs of the Textpocalypse
 
The Art of Data Science
The Art of Data Science The Art of Data Science
The Art of Data Science
 
Frontiers of Open Data Science Research
Frontiers of Open Data Science ResearchFrontiers of Open Data Science Research
Frontiers of Open Data Science Research
 

Dernier

Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Dernier (20)

Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 

Scalable Data Science and Deep Learning with H2O