Arno Candel introduces Deep Water, which brings TensorFlow, Caffe, and MXNet to H2O. It also brings support for GPUs, image classification, NLP, and much more to H2O (a minimal usage sketch follows below).
- Powered by the open source machine learning software H2O.ai. Contributors welcome at: https://github.com/h2oai
- To view videos on H2O open source machine learning software, go to: https://www.youtube.com/user/0xdata
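As a quick illustration of what Deep Water looks like in practice, here is a minimal R sketch; it assumes a Deep Water-enabled build of the h2o package that exposes h2o.deepwater(), and the dataset path and column names are placeholders, not the presenter's actual script.

```r
# Minimal sketch, assuming a Deep Water-enabled build of the h2o R package;
# the dataset path and column names are hypothetical.
library(h2o)
h2o.init()

# Labeled images: one column of image URIs, one column of class labels
train <- h2o.importFile("images.csv")

# Train a convolutional net on the GPU via the MXNet backend
model <- h2o.deepwater(x = "uri", y = "label",
                       training_frame = train,
                       backend = "mxnet",   # also "tensorflow" or "caffe"
                       network = "lenet",   # built-in CNN topology
                       gpu = TRUE,
                       epochs = 10)
h2o.performance(model)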
4. Deep Learning Pros and Cons
Pros:
• conceptually simple
• non-linear
• highly flexible and configurable
• learned features can be extracted (see the sketch after this list)
• can be fine-tuned with more data
• efficient for multi-class problems
• world-class at pattern recognition
Cons:
• hard to interpret
• theory not well understood
• slow to train and score
• overfits, needs regularization
• many hyper-parameters
• inefficient for categorical variables
• very data-hungry, learns slowly
Deep Learning has been boosted recently by faster computers
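Two of the pros above, extractable features and fine-tuning with more data, map directly onto H2O's R API; here is a minimal sketch (file paths, column indices, and layer sizes are assumptions for illustration).

```r
# Minimal sketch of feature extraction and fine-tuning with H2O Deep Learning;
# file paths and column indices are hypothetical.
library(h2o)
h2o.init()
train <- h2o.importFile("train.csv")

dl <- h2o.deeplearning(x = 1:10, y = 11, training_frame = train,
                       hidden = c(64, 64), epochs = 5)

# Extract the learned features: activations of hidden layer 2
deep_features <- h2o.deepfeatures(dl, train, layer = 2)

# Fine-tune the same network on more data by checkpointing from it
more <- h2o.importFile("more_data.csv")
dl_tuned <- h2o.deeplearning(x = 1:10, y = 11, training_frame = more,
                             checkpoint = dl@model_id,
                             hidden = c(64, 64),  # must match the original
                             epochs = 10)         # continues past epoch 5
```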
5. Brief History of A.I., ML and DL
John McCarthy
Princeton, Bell Labs, Dartmouth, later: MIT, Stanford
1955: “A proposal for the Dartmouth summer
research project on Artificial Intelligence”
with Marvin Minsky (MIT), Claude Shannon
(Bell Labs) and Nathaniel Rochester (IBM)
http://www.asiapacific-mathnews.com/04/0403/0015_0020.pdf
A step back: the term "A.I." was coined over 60 years ago
6. 1955 proposal for the Dartmouth summer research project on A.I.
“We propose that a 2-month, 10-man study of artificial
intelligence be carried out during the summer of 1956 at
Dartmouth College in Hanover, New Hampshire. The study is to
proceed on the basis of the conjecture that every aspect of
learning and any other feature of intelligence can in principle be
so precisely described that a machine can be made to simulate
it. An attempt will be made to find how to make machines use
language, form abstractions and concepts, solve kinds of problems
now reserved for humans, and improve themselves. We think that
a significant advance can be made in one or more of these
problems if a carefully selected group of scientists work on it
together for one summer.”
9. Step 3: Big Data + In-Memory Clusters
2011: Jeopardy (IBM Watson)
In-Memory Analytics/ML
4 TB of data (incl. Wikipedia), 90 servers,
16 TB RAM, Hadoop, 6 million logic rules
https://www.youtube.com/watch?v=P18EdAKuC1U https://en.wikipedia.org/wiki/Watson_(computer)
Note: IBM Watson received the question in electronic written form, and was often
able to press the answer button faster than the competing humans.
“No computer will ever answer random questions!?”
10. “No computer will ever understand my language!?”
2014: Google
(acquired Quest Visual)
Deep Learning
Convolutional and Recurrent
Neural Networks,
with training data from users
Step 4: Deep Learning
• Translate between 103 languages by typing
• Instant camera translation: Use your camera to translate text instantly in 29 languages
• Camera Mode: Take pictures of text for higher-quality translations in 37 languages
• Conversation Mode: Two-way instant speech translation in 32 languages
• Handwriting: Draw characters instead of using the keyboard in 93 languages
11. Step 5: Augmented Deep Learning
2014: Atari Games (DeepMind)
2016: AlphaGo (Google DeepMind)
Deep Learning
+ reinforcement learning, tree search,
Monte Carlo, GPUs, playing against itself, …
Go board has approx. 2E170 (about 2×10^170) possible positions (see the calculation below).
trained from raw pixel values, no human rules
“No computer will ever beat the best Go master!?”
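For scale, that 2E170 figure can be sanity-checked with a short calculation; the exact legal-position count cited below is from Tromp and Farnebäck's enumeration, stated here from memory.

```latex
% Each of the 19x19 = 361 points is empty, black, or white, so a loose
% upper bound on the number of board configurations is
\[ 3^{361} \approx 1.74 \times 10^{172} \]
% Most of those violate the capture rule; the number of legal
% positions (Tromp & Farneback) is
\[ L(19,19) \approx 2.08 \times 10^{170} \]
```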
12. Microsoft's ResNet entry won the 2015 ImageNet Large Scale Visual Recognition Challenge:
http://image-net.org/challenges/LSVRC/2015/
Step 6: A.I. Chatbots have Opinions too!
14. What about Jobs?
Anything that can be automated will be automated.
Jobs of the past:
assembly line work, teller (ATMs), taxi-firm receptionist
Jobs being automated away now:
resume matching, driving, language translation, education
Jobs being automated away soon:
healthcare, arts & crafts, entertainment, design, decoration,
software engineering, politics, management, professional
gaming, financial planning, auditing, real estate agent
Jobs of the future:
professional sports, food & wine reviewer
17. Significant Performance Gains with Deep Learning
Predict departure delay (Y/N) on 20 years of airline flight data
(116M rows, 12 cols, categorical + numerical data with missing values)
H2O Elastic Net (GLM): 10 secs, alpha=0.5, lambda=1.379e-4 (auto), AUC: 0.656
H2O Deep Learning: 45 secs, 4 hidden ReLU layers of 20 neurons, 1 epoch, AUC: 0.703
(AUC: higher is better, ranges from 0.5 to 1)
Feature importances show that features have non-linear impact; flights out of Chicago, Atlanta and Dallas are often delayed.
Hardware: 10 nodes, each with dual E5-2650 (8 cores, 2.6 GHz), 10GbE
The two model calls are sketched below.
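As an illustration, here is a minimal R sketch of the two models above; the file path and response column name are assumptions, and the exact benchmark script may differ.

```r
# Minimal sketch of the GLM vs. Deep Learning comparison above;
# file path and column names are hypothetical.
library(h2o)
h2o.init()

airlines <- h2o.importFile("airlines_20yr.csv")
y <- "IsDepDelayed"               # binary target: departure delayed Y/N
x <- setdiff(names(airlines), y)

# Elastic Net GLM: alpha = 0.5, lambda chosen automatically
glm_model <- h2o.glm(x = x, y = y, training_frame = airlines,
                     family = "binomial",
                     alpha = 0.5, lambda_search = TRUE)

# Deep Learning: 4 hidden ReLU layers of 20 neurons, 1 epoch
dl_model <- h2o.deeplearning(x = x, y = y, training_frame = airlines,
                             activation = "Rectifier",
                             hidden = c(20, 20, 20, 20),
                             epochs = 1,
                             variable_importances = TRUE)

h2o.auc(glm_model)   # ~0.656 in the benchmark above
h2o.auc(dl_model)    # ~0.703
h2o.varimp(dl_model) # feature importances
```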
18. H2O Deep Learning Community Quotes
Kaggle challenge 2nd place winner Colin Priest:
"For my final competition submission I used an ensemble of models, including 3 deep learning models built with R and h2o."
"I did really like H2O's deep learning implementation in R, though - the interface was great, the back end extremely easy to understand, and it was scalable and flexible. Definitely a tool I'll be going back to."
19. H2O Deep Learning Community Quotes
“H2O Deep Learning models outperform other Gleason
predicting models.”
“… combine ADAM and Apache Spark with H2O’s deep learning capabilities to predict
an individual’s population group based on his or her genomic data. Our results
demonstrate that we can predict these very well, with more than 99% accuracy.”