3. Fast growth in ML adoption at Google
[Chart: Google projects containing ML models, measured in unique project directories (0 to 4000), over time (2012 to 2016)]
Used across products:
5. Search: machine learning for search engines
RankBrain - A deep neural network for search ranking
● #1 improvement to ranking quality in 2+ years
● #3 signal for search ranking out of hundreds
7. An open source solution
Created by Google Brain team
Most popular ML project on GitHub
● Over 480 contributors
● 10,000 commits in 12 months
Multiple deployment options:
● Mobile, Desktop, Server, Cloud
● CPU, GPU
8. Main ML Capabilities on GCP
Call our trained models as APIs: our data + our models
● Cloud Vision API
● Cloud Translation API
● Cloud Natural Language API
● Cloud Speech API
● Cloud Video Intelligence API
Build your own models: your data + your model
● Cloud ML Engine
AutoML: your data + our model
● Cloud AutoML
11. Cloud Vision API
● Label & web detection
● OCR
● Logo detection
● Explicit content detection
● Crop hints
● Landmark detection
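The detection types listed on this slide map onto feature types in the Vision API's REST `images:annotate` method. The sketch below only builds the JSON request body and does not call the service. The helper name `build_annotate_request` and the short keys in `FEATURE_TYPES` are made up for this illustration; sending the body to `https://vision.googleapis.com/v1/images:annotate` with credentials is left out.

```python
import base64
import json

# Short names (hypothetical, for this sketch) mapped to Vision API feature types.
FEATURE_TYPES = {
    "label": "LABEL_DETECTION",
    "web": "WEB_DETECTION",
    "ocr": "TEXT_DETECTION",
    "logo": "LOGO_DETECTION",
    "explicit": "SAFE_SEARCH_DETECTION",
    "crop_hints": "CROP_HINTS",
    "landmark": "LANDMARK_DETECTION",
}


def build_annotate_request(image_bytes, features, max_results=5):
    """Build an images:annotate body for one image and the requested detections."""
    return {
        "requests": [
            {
                # Image bytes are sent base64-encoded in the "content" field.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [
                    {"type": FEATURE_TYPES[name], "maxResults": max_results}
                    for name in features
                ],
            }
        ]
    }


if __name__ == "__main__":
    body = build_annotate_request(b"<raw image bytes>", ["label", "ocr", "logo"])
    print(json.dumps(body, indent=2))
```

Several detections can be requested for the same image in one call, which is why `features` is a list.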
21. With AutoML Vision you can create custom
image classification models
What about other types of
data, like text?
Introducing AutoML
Natural Language
25. Tools to build, train, and serve your own models
TensorFlow and Cloud ML Engine
26. Hyperparameter tuning with HyperTune
● Automatic hyperparameter tuning
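● Runs multiple trials within a training job, scored against a specified hyperparameter metric
● Gaussian process to optimize the trials
[Figure: Objective plotted against HyperParam #1, with one point labeled "Want to find this" and others labeled "Not these"]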
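To make the idea of "multiple trials against an objective" concrete, here is a minimal sketch of a trial loop. It uses plain random search, not the Gaussian-process optimization named on the slide, and it is not HyperTune's implementation. The `objective` function is a made-up curve with a lower peak near 0.2 and a higher peak near 0.7, standing in for a real training run that reports a metric.

```python
import math
import random


def objective(x):
    """Hypothetical metric: a lower peak near x = 0.2 and a higher peak near x = 0.7."""
    high_peak = math.exp(-((x - 0.7) ** 2) / 0.005)
    low_peak = 0.5 * math.exp(-((x - 0.2) ** 2) / 0.01)
    return high_peak + low_peak


def run_trials(objective_fn, low, high, num_trials, seed=0):
    """Evaluate num_trials random settings and return the best (value, score)."""
    rng = random.Random(seed)
    best_value, best_score = None, float("-inf")
    for _ in range(num_trials):
        value = rng.uniform(low, high)   # one trial = one hyperparameter setting
        score = objective_fn(value)      # stands in for training and evaluating a model
        if score > best_score:
            best_value, best_score = value, score
    return best_value, best_score


if __name__ == "__main__":
    value, score = run_trials(objective, 0.0, 1.0, num_trials=200)
    print(f"best hyperparameter: {value:.3f}, objective: {score:.3f}")
```

A Gaussian-process tuner differs in how it picks the next trial: it uses the scores seen so far to choose promising settings instead of sampling blindly, which usually needs fewer trials.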
28. Cloud Datalab
● Data stored in GCS, source code in Cloud Repositories (ugit).
● Notebook interface - Leverage existing Jupyter modules and knowledge.
● Scalable (CPU, GPUs & memory) & cost-effective (pay as you use).
● Dataproc (Spark) integration for large scale exploratory data analysis.
● BigQuery “magic” and Cloud Storage integration.
30. ML End to End Pipeline
[Diagram: Inputs, Pre-processing, Asset Creation, Feature Creation, Train model (distributed training, hyper-parameter tuning), Model, Deploy (including model versioning) on Cloud ML. Remote clients make a REST API call with input variables, followed by Preprocess & Feature Creation, Model, and Prediction.]
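The "REST API call with input variables" step can be sketched as follows. This builds an online prediction request for a deployed model version without sending it. The project, model, and version names and the input fields are placeholders, and obtaining the access token is assumed to happen elsewhere; the helper name `build_predict_request` is made up for this illustration.

```python
import json
import urllib.request


def build_predict_request(project, model, version, instances, access_token):
    """Build (but do not send) a POST request for online prediction."""
    url = (
        "https://ml.googleapis.com/v1/"
        f"projects/{project}/models/{model}/versions/{version}:predict"
    )
    # Input variables travel as a list of instances in the JSON body.
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_predict_request(
        project="my-project",        # placeholder
        model="my_model",            # placeholder
        version="v1",                # placeholder
        instances=[{"speed": 80, "age": 30}],
        access_token="ACCESS_TOKEN", # placeholder
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Keeping the version in the URL is what lets clients pin to a specific deployed model version while newer versions are rolled out.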
31. 17+ Years of Tackling Big Data Problems
[Timeline, 2002 to 2016, with rows for Google Papers, Open Source, and Google Cloud Products]
Google Papers: GFS, MapReduce, Flume Java, BigTable, Dremel, PubSub, Millwheel, TensorFlow, Spanner
33. 17+ Years of Tackling Big Data Problems
[Timeline, 2002 to 2016, with rows for Google Papers, Open Source, and Google Cloud Products]
Google Papers: GFS, MapReduce, Flume Java, BigTable, Dremel, PubSub, Millwheel, TensorFlow, Spanner
Google Cloud Products: BigQuery, Pub/Sub, Dataflow, Bigtable, ML, Spanner
34. Google BigQuery
Google Cloud’s Enterprise Data Warehouse for Analytics
● Fully Managed and Serverless
● Petabyte-Scale and Fast
● Convenience of SQL
● Encrypted, Durable and Highly Available
35. Where to store your analytics data
[Diagram] Training: Inputs, Training & Validation Datasets, Preprocess & Feature Creation, Train/Tune Model, Trained Models, Deploy (API and Versioning)
[Diagram] Serving: Clients (Online Systems), REST API call with input variables, Preprocess & Feature Creation, Prediction
Integrate your data into BigQuery to perform Exploratory Data Analysis at scale
36. Data Engineering as part of your pipeline
[Diagram: Inputs, Pre-processing, Asset Creation, Feature Creation, Train model, Model, Deploy (including model versioning) on Cloud ML. Remote clients make a REST API call with input variables, followed by Preprocess & Feature Creation, Model, and Prediction.]
How to do your data engineering, and when to use what?
38. A new possibility with BQML:
Create ML models directly in BigQuery
Data analysts and data scientists can:
1. Use familiar SQL for machine learning
2. Train models over all their data in BigQuery
3. Not worry about hyperparameter tuning or feature transformations
39. BQML and SQL
● BQML provides a SQL interface for feature engineering and for creating and executing ML models on data in BigQuery
○ SQL statements for creation and execution of models
○ Automated hyperparameter tuning (transparent tuning)
○ Constructs for pre-processing and feature engineering
● The SQL constructs are part of the GoogleSQL standard
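-- Train a logistic regression model; the column aliased as label is the target.
CREATE MODEL my_models.car_accidents
OPTIONS(model_type='logistic_reg')
AS SELECT
  speed, age, ..., bad_accident AS label
FROM
  input_table;

-- Score rows with the trained model; ML.PREDICT returns predicted_label.
SELECT predicted_label
FROM
  ML.PREDICT(
    MODEL my_models.car_accidents,
    (SELECT speed, age, ...
     FROM input_table));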
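For readers who want to see what a logistic regression model does with columns such as speed and age, here is a minimal stand-alone sketch of the technique. It is not BigQuery ML's implementation: it fits the model with plain batch gradient descent, and the six rows of standardized data are invented purely for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(rows, labels, learning_rate=0.5, epochs=1000):
    """Fit weights and a bias by batch gradient descent on log loss."""
    num_features = len(rows[0])
    weights = [0.0] * num_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * num_features
        grad_b = 0.0
        for features, label in zip(rows, labels):
            score = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(score) - label  # derivative of log loss w.r.t. the score
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict(weights, bias, features):
    """Return 1 when the predicted probability is at least 0.5, else 0."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


# Invented, standardized (speed, age) rows; label 1 marks a bad accident.
ROWS = [(-1.0, -0.5), (-0.8, 0.0), (-0.6, 0.5), (1.0, 0.5), (0.8, 0.0), (0.6, -0.5)]
LABELS = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    weights, bias = train_logistic_regression(ROWS, LABELS)
    print("weights:", weights, "bias:", bias)
    print("prediction for (0.9, 0.2):", predict(weights, bias, (0.9, 0.2)))
```

In BQML the training loop, feature handling, and tuning are hidden behind the `CREATE MODEL` statement, which is the point of the slide: the analyst writes only the SQL.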
40. BigQuery ML: Further Extend ML Accessibility
Audiences: Developer, SQL Analyst, Data Scientist
Use cases and skills:
TensorFlow and Cloud ML Engine
● Build and deploy state-of-the-art custom models
● Requires deep understanding of ML and programming
BigQuery ML
● Build and deploy custom models using SQL
● Requires only basic understanding of ML
AutoML and Cloud ML APIs
● Build and deploy Google-provided models for standard use cases
● Requires almost no ML knowledge
48. Empowering developers and data scientists to run AI applications anywhere
1. Kubeflow Pipelines
Reuse: composable and reusable pipelines
2. AI Hub
Collaboration: one-stop catalog of pre-built pipelines & AI components
49. Kubeflow Pipelines
● Workbench to compose, deploy, and
manage end-to-end ML workflows
● Reliable and repeatable machine learning
for all users
● Rapid experimentation
● Integrated TFX tools for bias detection
● Open and hybrid
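The idea of composable, reusable workflow steps can be illustrated in a few lines of plain Python. This is a conceptual sketch only and does not use the Kubeflow Pipelines SDK; `compose_pipeline`, `drop_missing`, and `add_square` are hypothetical names invented for the example.

```python
from typing import Any, Callable, List

Step = Callable[[Any], Any]


def compose_pipeline(steps: List[Step]) -> Step:
    """Chain steps so each one receives the previous step's output."""
    def run(data: Any) -> Any:
        for step in steps:
            data = step(data)
        return data
    return run


def drop_missing(rows):
    """Preprocessing step: discard rows that contain a missing value."""
    return [row for row in rows if None not in row]


def add_square(rows):
    """Feature step: append the square of the first column to each row."""
    return [row + (row[0] ** 2,) for row in rows]


# The same steps are reused in two different pipelines.
clean_only = compose_pipeline([drop_missing])
clean_and_featurize = compose_pipeline([drop_missing, add_square])

if __name__ == "__main__":
    raw = [(1,), (None,), (3,)]
    print(clean_only(raw))
    print(clean_and_featurize(raw))
```

A real pipeline system adds what this sketch omits: each step runs in its own container, and runs are tracked so experiments can be repeated.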
50. AI Hub ALPHA
1. One stop AI catalog
Easily discover plug & play pipelines
& other content built by Google AI
and partners.
2. Enterprise-grade sharing controls
Host pipelines and ML content with private
sharing controls within an enterprise to
foster reuse within organizations.
3. Easy deployment on GCP and hybrid
Deploy pipelines via Kubeflow on GCP
and on premises.