3. Nexxworks Bootcamp Ghent - 27/09/2017
Team of Data Engineers, Data Scientists & Machine Learning Engineers
Closing the gap between having lots of data and lacking insights
Robust & agile ML solutions through scalable APIs
Premium Partner of Google Cloud
8. Google’s 15+ years of innovation in data
Confidential & Proprietary - Google Cloud Platform
[Timeline, 2002-2016: GFS, MapReduce, Bigtable, Colossus, Dremel, Flume, Megastore, Spanner, Millwheel, Pub/Sub, F1, Dataflow, TensorFlow]
9. Decrease the innovation gap
[Timeline, 2002-2016, Google Cloud products listed in the same order as the technologies on the previous slide: GCS, Dataproc, Bigtable, GCS, BigQuery, Dataflow, Datastore, Spanner, Dataflow, Pub/Sub, F1, Dataflow, Cloud ML; extra label: NoSQL]
10.
11. Rapidly Accelerating Use of Deep Learning at Google
[Chart: number of projects using some form of deep learning, 2012-2015; y-axis from 0 to 1500]
Used across products:
12. What’s going on?
The cloud is very good at
handling, storing and manipulating
large volumes of data.
What we really care about now is
understanding the data.
17. The old, algorithmic approach
“apple”
“orange”
“banana”
IF (round) THEN
    IF (orange AND coarse) THEN
        “orange”
    ELSE IF (green AND smooth) THEN
        “apple”
    ELSE IF ...
        ...
ELSE IF …
    “banana”
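The hand-written rules on this slide could look roughly like this in Python. The feature names (`shape`, `colour`, `texture`), the "unknown" result and the fall-through to "banana" are illustrative assumptions; the slide elides the intermediate rules, so the sketch does not try to fill them in.

```python
def classify_fruit(shape, colour, texture):
    """Hand-coded rules in the spirit of the slide (hypothetical features)."""
    if shape == "round":
        if colour == "orange" and texture == "coarse":
            return "orange"
        if colour == "green" and texture == "smooth":
            return "apple"
        # Every new variety (a red apple, say) needs yet another rule here.
        return "unknown"
    # Assumption: anything that is not round falls through to "banana".
    return "banana"


print(classify_fruit("round", "orange", "coarse"))  # orange
print(classify_fruit("round", "red", "smooth"))     # unknown
```

The second call shows the weakness of the approach: a fruit the rule author did not anticipate is not recognised until someone writes a new rule.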
18. Let the machine find the rules
“apple”
“orange”
“banana”
?
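A minimal sketch of "letting the machine find the rules", assuming each fruit is described by three made-up numeric features. It uses a nearest-centroid classifier, one of the simplest learning methods and not the deep learning approach discussed later in the deck. The training data and feature values are invented for illustration.

```python
from collections import defaultdict
import math

# Hypothetical labelled examples: (roundness, hue, smoothness), all in [0, 1].
TRAINING_DATA = [
    ((0.90, 0.10, 0.20), "orange"),
    ((0.95, 0.12, 0.30), "orange"),
    ((0.90, 0.35, 0.90), "apple"),
    ((0.85, 0.30, 0.80), "apple"),
    ((0.20, 0.16, 0.90), "banana"),
    ((0.10, 0.17, 0.85), "banana"),
]


def fit_centroids(examples):
    """Learn one average feature vector (centroid) per label."""
    grouped = defaultdict(list)
    for features, label in examples:
        grouped[label].append(features)
    return {
        label: tuple(sum(column) / len(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


centroids = fit_centroids(TRAINING_DATA)
print(predict(centroids, (0.88, 0.11, 0.25)))  # orange
print(predict(centroids, (0.15, 0.16, 0.90)))  # banana
```

No rule is written by hand here: adding a new fruit means adding labelled examples, not code.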
22. MACHINE LEARNING in popular culture
+ ‘The Next Big Thing’
+ Sentient AI in the next 10 years (‘The Singularity’)
+ Will put humans out of a job
+ Foolproof
23. MACHINE LEARNING: reality
+ Been around for 60 years now
+ ‘Sentient next year’, every year, for the last 60 years
+ AI winters: 1970, 1990, … ?
+ Not foolproof
24. A person on a beach flying a kite.
A person skiing down a snow covered slope.
A group of giraffe standing next to each other.
25. A woman riding a horse on a dirt road.
An airplane is parked on the tarmac at an airport.
A group of people standing on top of a beach.
33. Logistics optimisation
Predicting the time packages stay in customs
+ Better predict airplane cargo loads for an international courier
+ For each parcel, predict the time under inspection by customs (0 min = not picked up)
+ Better prediction → Better planning
Better planning → Fuller cargos
Avoid sudden overload
34. Training the deep learning network (simplified)
Input: a parcel’s attributes (weight, size, text, colours, origin ...)
Output: the predicted package delay in customs
>100k examples
Full case: 3 days!
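The training setup above (attributes in, predicted delay out) is a supervised regression problem. The sketch below illustrates the idea with a plain linear model fitted by gradient descent, standing in for the deep network used in the real case. The two features, the example values and the hyperparameters are all invented for illustration.

```python
# Hypothetical examples: (weight_kg, from_outside_eu) -> minutes in customs.
EXAMPLES = [
    ((1.0, 0.0), 2.0), ((2.0, 1.0), 34.0), ((3.0, 0.0), 6.0), ((4.0, 1.0), 38.0),
    ((5.0, 0.0), 10.0), ((2.0, 0.0), 4.0), ((1.0, 1.0), 32.0), ((3.0, 1.0), 36.0),
]


def predict(weights, bias, features):
    """Linear prediction: weighted sum of the features plus a bias."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train(examples, learning_rate=0.02, epochs=5000):
    """Fit the model by full-batch gradient descent on mean squared error."""
    n_features = len(examples[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for features, target in examples:
            error = predict(weights, bias, features) - target
            for i, x in enumerate(features):
                grad_w[i] += 2 * error * x / len(examples)
            grad_b += 2 * error / len(examples)
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


weights, bias = train(EXAMPLES)
# Predicted delay for an unseen 2.5 kg parcel from outside the EU.
print(round(predict(weights, bias, (2.5, 1.0)), 1))
```

A real model would take far richer inputs (text, colours, origin) and learn non-linear interactions, which is where a deep network earns its keep.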
36. Predictive Maintenance: towards a highly dynamic Digital Twin
Design: theoretical model (STATIC MODEL)
Testing generates theoretical life-time model
- in-house testing
- remaining life-time in ‘ideal conditions’
Connect: collect data
- connected device
- Cloud computing/storage
Predict: create Digital Twin (HIGHLY DYNAMIC ML MODEL)
Machine Learning model predicts expected failure time
Client receives personalized model by including custom environmental factors
- usage, temperature, humidity…
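A rough sketch of the difference between the static model and the dynamic one, under loudly stated assumptions: `wear_factor` is a hand-written placeholder for the trained ML model, and its coefficients, the 25 °C and 50 % reference points and all names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    usage_hours_per_day: float
    temperature_c: float
    humidity_pct: float


def static_remaining_hours(rated_life_hours, hours_used):
    """Static model: remaining life-time under 'ideal conditions'."""
    return max(rated_life_hours - hours_used, 0.0)


def wear_factor(env):
    """Placeholder for the trained ML model: above 1 means faster wear than ideal.

    The coefficients are invented for illustration only.
    """
    heat = max(env.temperature_c - 25.0, 0.0) * 0.02
    damp = max(env.humidity_pct - 50.0, 0.0) * 0.01
    return 1.0 + heat + damp


def predicted_days_to_failure(rated_life_hours, hours_used, env):
    """Dynamic model: static estimate corrected for the client's environment."""
    remaining = static_remaining_hours(rated_life_hours, hours_used)
    effective_hours_per_day = env.usage_hours_per_day * wear_factor(env)
    return remaining / effective_hours_per_day


ideal = Environment(usage_hours_per_day=8, temperature_c=25, humidity_pct=50)
harsh = Environment(usage_hours_per_day=8, temperature_c=45, humidity_pct=80)
print(predicted_days_to_failure(10_000, 2_000, ideal))
print(predicted_days_to_failure(10_000, 2_000, harsh))
```

The same device gets a shorter predicted life in the harsher environment, which is the personalisation the slide describes.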
38. The solution
What did we do?
+ Shipped a state-of-the-art system with a turnaround time of under one month
+ System is not an ‘Artificial Intelligence’
+ System outperforms most solutions currently marketed as AI
Why use Google Cloud Platform?
+ Access to massive computational resources to train our models
+ Able to dynamically adjust to a changing number of requests
+ Leverage Google-powered APIs
57. Entity Extraction
● “I got charged 7 dollars for a Christmas Card that I never ordered. Plz remove this from my invoice, my telephone number is 472867486.”
● “My internet connection is no longer working. I have ADSL smart-50 and my address is NY city 52 A on 21st.”
Entity Extraction         Postprocessing
7 dollars                 $7.00
Christmas Card            Christmas-card
472867486                 +32472867486
ADSL smart-50             ADSL smart 50
NY city 52 A on 21st      52A, 21st avenue, New York
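Some of the postprocessing steps in the table can be approximated with simple normalisation functions. The sketch below covers only the amount, phone number and product name rows. The function names and regular expressions are assumptions, and the `+32` default is taken from the example in the table. Address normalisation would need a proper geocoding or address-parsing step and is left out.

```python
import re


def normalize_amount(text):
    """'7 dollars' -> '$7.00' (assumes a plain number followed by 'dollar(s)')."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*dollars?", text.strip(), re.IGNORECASE)
    return f"${float(match.group(1)):.2f}" if match else text


def normalize_phone(text, country_code="+32"):
    """'472867486' -> '+32472867486' (assumes a national number without prefix)."""
    digits = re.sub(r"\D", "", text)
    return country_code + digits.lstrip("0")


def normalize_product(text):
    """'ADSL smart-50' -> 'ADSL smart 50' (hyphens become single spaces)."""
    return re.sub(r"\s*-\s*", " ", text.strip())


print(normalize_amount("7 dollars"))       # $7.00
print(normalize_phone("472867486"))        # +32472867486
print(normalize_product("ADSL smart-50"))  # ADSL smart 50
```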
58. Other ideas?
Smart cities? (garbage detection, road conditions, mobile cameras for anomaly detection...)
61. MEDICAL IMAGE CLASSIFICATION
TECHNICAL DETAILS - DATA
Camelyon16: ISBI challenge on cancer metastasis detection in lymph nodes
Training & Evaluation: 110 tumor slides, 160 normal slides, 130 evaluation slides
Task 1: whole-slide level prediction (binary classification problem)
Task 2: find metastasis location (segmentation problem)
62.
63. MEDICAL IMAGE CLASSIFICATION
TECHNICAL DETAILS - CHALLENGE BREAKDOWN
Not just another standard image classifier…
Network architecture
Training set construction
Computing environment
Post-processing (classification and segmentation)
64. MEDICAL IMAGE CLASSIFICATION
TECHNICAL DETAILS - TRAINING
Patches are used to train a deep learning model
- started with a simple CIFAR-10 model
- ended up retraining Inception-v3 to experiment with synchronous updates across multiple GPUs
[Diagram: metastasis patches and normal patches are fed into a deep learning model, e.g. Inception-v3]
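Building the training set from patches can be sketched as follows. This is a toy version on nested lists, not the pipeline used in the project. The labelling rule (a patch counts as "metastasis" if it contains at least one annotated pixel) and the non-overlapping grid are assumptions made for illustration.

```python
def extract_patches(slide, mask, patch_size):
    """Cut a slide into non-overlapping square patches with a label each.

    `slide` and `mask` are 2D lists of equal shape; mask values are 1 for
    annotated metastasis pixels and 0 otherwise. Assumption: a patch is
    labelled "metastasis" if it contains at least one annotated pixel.
    """
    patches = []
    rows, cols = len(slide), len(slide[0])
    for top in range(0, rows - patch_size + 1, patch_size):
        for left in range(0, cols - patch_size + 1, patch_size):
            pixels = [row[left:left + patch_size] for row in slide[top:top + patch_size]]
            tumour = any(
                value
                for row in mask[top:top + patch_size]
                for value in row[left:left + patch_size]
            )
            patches.append((pixels, "metastasis" if tumour else "normal"))
    return patches


# Toy 4x4 "slide" with one annotated pixel in the top-right region.
slide = [[r * 4 + c for c in range(4)] for r in range(4)]
mask = [
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
labels = [label for _, label in extract_patches(slide, mask, patch_size=2)]
print(labels)  # ['normal', 'metastasis', 'normal', 'normal']
```

Real whole-slide images are far too large to load at once, so an actual pipeline would sample patches lazily and balance the two classes.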
65. MEDICAL IMAGE CLASSIFICATION
TECHNICAL DETAILS - MULTIPLE GPUs
Multiple GPUs were used to calculate model updates (based on TensorFlow tutorial)
- individual model replica on each GPU
- all GPUs process a batch of data and send their update of the model parameters to the CPU, which performs a synchronized update
Approach was extended to Cloud ML Engine GPUs during an external hackathon to build a dermatology POC in 1 day (starting from, among others, a pretrained VGG16)
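The synchronous scheme described above can be simulated without any GPU. In this sketch each "replica" computes a gradient on its own batch and the averaged gradient is applied once to the shared weight. The one-parameter model, the data and the learning rate are invented, and averaging gradients is one common way to realise the synchronized update, assumed here.

```python
def gradient(weight, batch):
    """Gradient of mean squared error for the toy model y = weight * x."""
    return sum(2 * (weight * x - y) * x for x, y in batch) / len(batch)


def synchronous_step(weight, batches, learning_rate):
    """One synchronized update across all replicas.

    Every replica computes a gradient on its own batch; a single averaged
    update is then applied to the shared weight.
    """
    replica_grads = [gradient(weight, batch) for batch in batches]  # one per "GPU"
    averaged = sum(replica_grads) / len(replica_grads)              # done on the "CPU"
    return weight - learning_rate * averaged


# Toy data for y = 3x, split into one batch per simulated GPU.
batches = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
weight = 0.0
for _ in range(200):
    weight = synchronous_step(weight, batches, learning_rate=0.01)
print(round(weight, 3))
```

Because all replicas start each step from the same weight, the result matches single-device training on the combined batch, only computed in parallel.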
81. Use reinforcement learning to optimize energy usage
Reinforcement learning allows for automatic parameter optimization:
- e.g. energy optimization: let an AI agent control settings to optimize energy consumption in data center cooling
- See also Google DeepMind
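A minimal sketch of an agent that learns a good setting by trial and error. It uses an epsilon-greedy multi-armed bandit, the simplest reinforcement learning setting, and is not a reconstruction of any production system. The setpoints and the `energy_cost` curve are made up for illustration.

```python
import random

SETPOINTS_C = [18, 20, 22, 24, 26]  # hypothetical cooling setpoints


def energy_cost(setpoint_c):
    """Made-up cost curve: cooling gets cheaper with a higher setpoint, but
    energy is wasted once the room gets too warm."""
    cooling = (30 - setpoint_c) * 1.0
    overheating = max(setpoint_c - 22, 0) ** 2 * 2.0
    return cooling + overheating


def run_agent(steps=500, epsilon=0.1, seed=0):
    """Return the setpoint with the lowest estimated cost after `steps` trials."""
    rng = random.Random(seed)
    estimates = {s: 0.0 for s in SETPOINTS_C}  # running mean cost per setpoint
    counts = {s: 0 for s in SETPOINTS_C}
    for step in range(steps):
        if step < len(SETPOINTS_C):
            action = SETPOINTS_C[step]                  # try every setpoint once
        elif rng.random() < epsilon:
            action = rng.choice(SETPOINTS_C)            # explore
        else:
            action = min(estimates, key=estimates.get)  # exploit lowest cost
        cost = energy_cost(action)
        counts[action] += 1
        estimates[action] += (cost - estimates[action]) / counts[action]
    return min(estimates, key=estimates.get)


print(run_agent())
```

In a real data center the cost is noisy and depends on the current state (load, weather), so a full reinforcement learning agent would condition its actions on observations instead of learning one fixed setting.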
85. Advantages that AlphaGo can leverage
1. Fully deterministic: no noise in the game
2. Fully observed: each player has complete information and there are no hidden variables (unlike poker, for example)
3. Discrete action space
4. Each game is relatively short (approximately 200 actions)
5. Target function is clear (win/lose) & fast to evaluate
6. Huge datasets of human gameplay are available to bootstrap the
learning, so AlphaGo doesn’t have to start from scratch
87. “If a top 1% CEO today
understands how software applications get built and
how that changes the way she/he manages,
a top 1% CEO in the near future
will understand how models get built and
how that changes the way she/he builds her/his organization.”
James Cham, Bloomberg Beta
88.
1. Educate yourself
2. AI strategy
3. TEST!
(Get your hands dirty)
92. TensorFlow Training
THANKS!
[Image panels: What my friends think I do / What other computer scientists think I do / What society thinks I do / What mathematicians think I do / What I think I do / What I actually do]