IoT @ Google Scale
James Chittenden
Google Cloud Platform Solutions Engineer
jameschi@google.com
+James Chittenden
(Big Data Cloud Engineer)
jameschi@google.com
Big Data at Google
a.k.a. Data at Google
Manage the Entire Lifecycle of Big Data
[Diagram: stages Capture, Store, Process (Batch, Stream), Analyze]
Components shown: Cloud Logs, Google App Engine, Google Analytics Premium, Cloud Pub/Sub, BigQuery Storage (tables), Cloud Bigtable (NoSQL), Cloud Storage (files), Cloud Datastore, Cloud Dataflow, BigQuery Analytics (SQL), Real-time analytics and alerts, Cloud Monitoring
End to End View of the GCP IoT Architecture
Device to Device Protocols
● Device Discovery
● Device to Device authentication
● Device Configuration
● Protocol Routing
Machine Learning: Pattern Detection and Prediction
● Subscribers scan real-time streams and feed data into the Machine Learning Recognition algorithm
● Dataflow orchestrates streaming algorithms which compare data streams against the Experience Database
● Correlators detect known patterns and publish alerts using Cloud Pub/Sub
Cloud Storage Archival and Retrieval
● Data is periodically unloaded from Bigtable and stored in Cloud Storage for archival
● Data in Cloud Storage can be quickly reloaded into Bigtable should it need to be reprocessed
Cloud Pub/Sub
Real-time and reliable messaging with Pub/Sub
Messaging is a shock-absorber
Images by Connie Zhou
Availability
• Buffer new requests during outages
• Prevent overloads that cause outages
• Redirect requests to recover from outages
Throughput
• Smooth out spikes in new request rate
• Balance load across multiple workers
• Balance arrival rate with service rate
Latency
• Accept requests closer to the network edge
• Optimize message flow across regions
• Leverage shared efforts to improve protocols
Pub/Sub is a change-absorber
Images by Connie Zhou
Sources
• New data sources can plug into old data flows
• New data sources can use new schemas
• Common security policies for all sources
Sinks
• Data can be sent to new destinations
• Push and Pull delivery are both available
• Spans organizational boundaries
Transforms
• Select subsets of messages that matter
• Helps manage schema and version changes
• Can merge streams into new topics
Pub/Sub at Google
Chat & Mobile
• Every time your Gmail box pops up a new message, it’s because of a push notification to your browser or mobile device.
Ads & Budgets
• One of the most important real-time information streams in the company is advertising revenue: we use Pub/Sub to broadcast budgets to our entire fleet of search engines.
Push Notifications
• Google Cloud Messaging for Android delivers billions of messages a day, reliably and securely, for Google’s own mobile apps and the entire developer community.
Instant Search
• Updating search results as you type is a feat of real-time indexing that depends on Pub/Sub to update caches with breaking news.
[Diagram: Publisher → Topic → Subscriptions inside the Pub/Sub System. Delivery paths shown: Webhook Delivery (HTTP Server subscriber), HTTP Push Delivery (Google App Engine), Google RPC Delivery (Pull Subscriber), and a subscription for Cloud Dataflow. Labels: On-Prem/Cloud, Any Environment.]
[Diagram: Push Subscription and Pull Subscription. In each, a Msg passes between the Pub/Sub System and the Subscriber and is followed by an Ack; the arrows are labeled RPC Send and RPC Return.]
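The Ack step in the push and pull diagrams can be illustrated with a small in-memory model. This is only a sketch: `ToySubscription` and its methods are hypothetical names, not the Cloud Pub/Sub client API, and it assumes a simplified rule with no ack deadline, where any message that has not been acked is simply returned again by the next pull.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a pull subscription; hypothetical, not the Cloud Pub/Sub client API. */
public class ToySubscription {
    // Outstanding (not yet acknowledged) messages, keyed by id, kept in publish order.
    private final Map<Integer, String> outstanding = new LinkedHashMap<>();
    private int nextId = 0;

    /** Publisher side: hold the message until a subscriber acks it. Returns the message id. */
    public int publish(String payload) {
        int id = nextId++;
        outstanding.put(id, payload);
        return id;
    }

    /** Subscriber side: return up to max outstanding messages as "id:payload" strings. */
    public List<String> pull(int max) {
        List<String> batch = new ArrayList<>();
        for (Map.Entry<Integer, String> e : outstanding.entrySet()) {
            if (batch.size() == max) {
                break;
            }
            batch.add(e.getKey() + ":" + e.getValue());
        }
        return batch;
    }

    /** Subscriber side: acknowledge a message so it is not delivered again. */
    public void ack(int id) {
        outstanding.remove(id);
    }

    public static void main(String[] args) {
        ToySubscription sub = new ToySubscription();
        int first = sub.publish("temp=21");
        sub.publish("temp=22");
        System.out.println(sub.pull(1)); // the first message
        System.out.println(sub.pull(1)); // the same message again, because it was not acked
        sub.ack(first);
        System.out.println(sub.pull(1)); // the second message
    }
}
```

The point of the model is that the subscriber, not the publisher, decides when a message is done: a subscriber that crashes before calling `ack` sees the message again.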
“We don’t really run MapReduce at Google anymore”
- Urs Hoelzle
Google Dataflow
Google Technologies
[Timeline, 2002 to 2014+. Technologies shown: GFS, MapReduce, Bigtable, Dremel, Colossus, FlumeJava, Spanner, MillWheel, More!]
Dataflow Goodies
● Autoscaling mid-job
● Fully managed - No-Ops
● Intuitive Data Processing Framework
● Batch and Stream Processing in one
● Liquid sharding mid-job
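// Batch pipeline from the slide. ExtractTags and ExpandPrefixes are user-defined DoFns.
Pipeline p = Pipeline.create();
p.begin()
    .apply(TextIO.Read.from("gs://…"))       // read input text from Cloud Storage
    .apply(ParDo.of(new ExtractTags()))
    .apply(Count.create())
    .apply(ParDo.of(new ExpandPrefixes()))
    .apply(Top.largestPerKey(3))             // keep the 3 largest values per key
    .apply(TextIO.Write.to("gs://…"));       // write results back to Cloud Storage
p.run();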
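The slide does not show what ExtractTags and ExpandPrefixes do, so the following plain-Java sketch is an assumption about the pipeline's intent: extract hashtags, count them, expand each tag into all of its prefixes, and keep the top 3 tags per prefix (an autocomplete-style result). It uses only in-memory collections rather than the Dataflow SDK, and every name in it (`TagAutocompleteSketch`, `extractTags`, `count`, `topPerPrefix`) is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** In-memory sketch of the pipeline's assumed logic; not the Dataflow SDK. */
public class TagAutocompleteSketch {
    private static final Pattern TAG = Pattern.compile("#(\\w+)");

    /** Assumed ExtractTags step: pull "#tag" tokens out of one line of text. */
    public static List<String> extractTags(String line) {
        List<String> tags = new ArrayList<>();
        Matcher m = TAG.matcher(line);
        while (m.find()) {
            tags.add(m.group(1).toLowerCase());
        }
        return tags;
    }

    /** Count step: number of occurrences of each tag across all lines. */
    public static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String tag : extractTags(line)) {
                counts.merge(tag, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Assumed ExpandPrefixes step plus top-k per key: for each prefix, the k most frequent tags. */
    public static Map<String, List<String>> topPerPrefix(Map<String, Integer> counts, int k) {
        Map<String, List<String>> byPrefix = new TreeMap<>();
        for (String tag : counts.keySet()) {
            for (int i = 1; i <= tag.length(); i++) {
                byPrefix.computeIfAbsent(tag.substring(0, i), p -> new ArrayList<>()).add(tag);
            }
        }
        for (List<String> tags : byPrefix.values()) {
            // Highest count first; ties broken alphabetically so the result is deterministic.
            tags.sort((a, b) -> {
                int byCount = Integer.compare(counts.get(b), counts.get(a));
                return byCount != 0 ? byCount : a.compareTo(b);
            });
            if (tags.size() > k) {
                tags.subList(k, tags.size()).clear();
            }
        }
        return byPrefix;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("#iot sensors #cloud", "#iot at scale", "#ios app #iot");
        Map<String, Integer> counts = count(lines);
        // Tags starting with "i", most frequent first.
        System.out.println(topPerPrefix(counts, 3).get("i"));
    }
}
```

Each method corresponds to one `.apply(...)` stage, which is the appeal of the pipeline style: the stages stay the same whether they run over a list in memory or over a distributed data set.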
Deploy
Schedule & Monitor
[Figure: request rates of 800 RPS, 1200 RPS, 5000 RPS, and 50 RPS]
// Batch version
Pipeline p = Pipeline.create();
p.begin()
    .apply(TextIO.Read.from("gs://…"))
    .apply(ParDo.of(new ExtractTags()))
    .apply(Count.create())
    .apply(ParDo.of(new ExpandPrefixes()))
    .apply(Top.largestPerKey(3))
    .apply(TextIO.Write.to("gs://…"));
p.run();
// Streaming version: the lines shown for reading from and writing to Pub/Sub, with 5-minute fixed windows
    .apply(PubsubIO.Read.from("input_topic"))
    .apply(Window.<Integer>by(FixedWindows.of(5, MINUTES)))
    .apply(PubsubIO.Write.to("output_topic"));
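What a 5-minute fixed window does to a stream can be shown without the SDK. The sketch below is an illustration under stated assumptions, not Dataflow's implementation: windows are assumed to be aligned to the Unix epoch, timestamps are in milliseconds, and the names `FixedWindowSketch`, `windowStart`, and `sumPerWindow` are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustration of fixed (tumbling) windows; not the Dataflow SDK implementation. */
public class FixedWindowSketch {
    // Window size of 5 minutes, in milliseconds.
    public static final long WINDOW_MILLIS = 5 * 60 * 1000L;

    /** Start of the window containing the timestamp, assuming windows aligned to the epoch. */
    public static long windowStart(long timestampMillis) {
        // floorDiv keeps negative timestamps in the correct (earlier) window.
        return Math.floorDiv(timestampMillis, WINDOW_MILLIS) * WINDOW_MILLIS;
    }

    /** Group integer values by the window their timestamp falls in and sum each group. */
    public static Map<Long, Integer> sumPerWindow(long[] timestamps, int[] values) {
        Map<Long, Integer> sums = new TreeMap<>();
        for (int i = 0; i < timestamps.length; i++) {
            sums.merge(windowStart(timestamps[i]), values[i], Integer::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        long[] timestamps = {1_000L, 200_000L, 300_000L, 610_000L};
        int[] values = {1, 2, 3, 4};
        // Keys are window start times in milliseconds; values are per-window sums.
        System.out.println(sumPerWindow(timestamps, values));
    }
}
```

Windowing is what lets an aggregation such as a count or a top-k produce results on an unbounded stream: each window is a finite batch, so the same transforms apply.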
Unified Model
Pub/Sub + Dataflow + BigQuery Demo
Life of a Pipeline
[Diagram labels: Your Data, Pub/Sub, Cloud Storage, Bigtable, Dataflow, Fast ETL, Regex, JSON, UDFs, BigQuery, Spreadsheets, BI Tools, Coworkers, Applications + Reports]
Enterprise Big Data Architecture on Google
Why Dataflow?
● All the benefits of Hadoop-on-Google
● Plus a Fully-Managed Service
● Plus New, Intuitive Framework
● Plus True Stream Processing
● Plus Autoscaling and per-minute billing
Questions?

IoT at Google Scale

  • 1. IoT @ Google Scale James Chittenden Google Cloud Platform Solutions Engineer jameschi@google.com
  • 2. +James Chittenden (Big Data Cloud Engineer) jameschi@google.com
  • 3. Big Data at Google aka. Data at Google
  • 4. Manage the Entire Lifecycle of Big Data [diagram; stages: Capture, Store, Process, Analyze]. Products shown: Cloud Logs, Google App Engine, Google Analytics Premium, Cloud Pub/Sub, BigQuery Storage (tables), Cloud Bigtable (NoSQL), Cloud Storage (files), Cloud Datastore, Cloud Dataflow (batch and stream), BigQuery Analytics (SQL), real-time analytics and alerts, Cloud Monitoring
  • 5. End to End View of the GCP IoT Architecture
  • 6. Device to Device Protocols ● Device Discovery ● Device to Device authentication ● Device Configuration ● Protocol Routing
  • 7. Machine Learning: Pattern Detection and Prediction ● Subscribers scan real-time streams and feed data into the Machine Learning Recognition algorithm ● Dataflow orchestrates streaming algorithms that compare data streams against the Experience Database ● Correlators detect known patterns and publish alerts using Cloud Pub/Sub
  • 8. Cloud Storage Archival and Retrieval ● Data is periodically unloaded from Bigtable and stored in Cloud Storage for archival ● Data in Cloud Storage can be quickly reloaded into Bigtable should it need to be reprocessed.
  • 9. Cloud Pub/Sub Real-time and reliable messaging with Pub/Sub
  • 10. Messaging is a shock-absorber (images by Connie Zhou). Availability: • Buffer new requests during outages • Prevent overloads that cause outages • Redirect requests to recover from outages. Throughput: • Smooth out spikes in new request rate • Balance load across multiple workers • Balance arrival rate with service rate. Latency: • Accept requests closer to the network edge • Optimize message flow across regions • Leverage shared efforts to improve protocols
  • 11. Pub/Sub is a change-absorber (images by Connie Zhou). Sources: • New data sources can plug into old data flows • New data sources can use new schemas • Common security policies for all sources. Sinks: • Data can be sent to new destinations • Push and Pull delivery are both available • Spans organizational boundaries. Transforms: • Select subsets of messages that matter • Helps manage schema and version changes • Can merge streams into new topics
  • 12. Pub/Sub at Google. Chat & Mobile: Every time your Gmail inbox pops up a new message, it's because of a push notification to your browser or mobile device. Ads & Budgets: One of the most important real-time information streams in the company is advertising revenue; we use Pub/Sub to broadcast budgets to our entire fleet of search engines. Push Notifications: Google Cloud Messaging for Android delivers billions of messages a day, reliably and securely, for Google's own mobile apps and the entire developer community. Instant Search: Updating search results as you type is a feat of real-time indexing that depends on Pub/Sub to update caches with breaking news.
  • 13. [Diagram: a Publisher sends to a Topic in the Pub/Sub System, which fans out to several Subscriptions. Delivery paths shown: Webhook Delivery to an HTTP Server Subscriber, HTTP Push Delivery to Google App Engine, Google RPC Delivery to a Pull Subscriber, and a Subscription feeding Cloud Dataflow. Environment labels: On-Prem/Cloud, Any Environment.]
  • 14. [Diagram: Push Subscription vs. Pull Subscription. Each panel shows a Subscriber and the Pub/Sub System exchanging a Msg and an Ack; arrows are labeled RPC Send and RPC Return.]
  • 15. Google Dataflow: “We don’t really run MapReduce at Google anymore” - Urs Hoelzle
  • 16. Google Technologies [timeline, 2002 to 2014+]: GFS, MapReduce, Bigtable, Dremel, Colossus, FlumeJava, Spanner, MillWheel, and more
  • 17. Dataflow Goodies (five points): Autoscaling mid-job; Fully managed (No-Ops); Intuitive Data Processing Framework; Batch and Stream Processing in one; Liquid sharding mid-job
  • 18. Dataflow Goodies. Code shown on the slide:
        Pipeline p = Pipeline.create();
        p.begin()
         .apply(TextIO.Read.from("gs://…"))
         .apply(ParDo.of(new ExtractTags()))
         .apply(Count.create())
         .apply(ParDo.of(new ExpandPrefixes()))
         .apply(Top.largestPerKey(3))
         .apply(TextIO.Write.to("gs://…"));
        p.run();
  • 19. Dataflow Goodies: Deploy; Schedule & Monitor
  • 20. Dataflow Goodies [load figures shown: 800 RPS, 1200 RPS, 5000 RPS, 50 RPS]
  • 21. Dataflow Goodies
  • 22. Dataflow Goodies. Code shown on the slide (the batch pipeline from slide 18, with streaming lines alongside):
        Pipeline p = Pipeline.create();
        p.begin()
         .apply(TextIO.Read.from("gs://…"))
         .apply(ParDo.of(new ExtractTags()))
         .apply(Count.create())
         .apply(ParDo.of(new ExpandPrefixes()))
         .apply(Top.largestPerKey(3))
         .apply(TextIO.Write.to("gs://…"));
        p.run();
    Streaming lines:
         .apply(PubsubIO.Read.from("input_topic"))
         .apply(Window.<Integer>by(FixedWindows.of(5, MINUTES)))
         .apply(PubsubIO.Write.to("output_topic"));
  • 25. Pub/Sub + Dataflow + BigQuery Demo
  • 26. Life of a Pipeline
  • 28. Enterprise Big Data Architecture on Google [diagram labels: Your Data; PubSub, Cloud Storage, BigTable; Dataflow; Fast ETL; BigQuery; Regex, JSON, UDFs; Spreadsheets, BI Tools, Coworkers, Applications + Reports]
  • 29. Why Dataflow? (five points) All the benefits of Hadoop-on-Google; plus a Fully-Managed Service; plus a New, Intuitive Framework; plus Autoscaling and per-minute billing; plus True Stream Processing
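
Slide 14 contrasts push and pull subscriptions, both of which end with an acknowledgement. The following is a minimal in-memory sketch of that idea, not the Cloud Pub/Sub client API: the class `ToySubscription` and all of its methods are hypothetical names invented for illustration. It assumes that an unacknowledged message becomes deliverable again, and that in push mode the handler's return value plays the role of the ack.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Toy model of a single subscription (hypothetical; not the real Pub/Sub API). */
final class ToySubscription {
    private final Deque<String> pending = new ArrayDeque<>();          // not yet delivered
    private final Map<Integer, String> outstanding = new HashMap<>();  // delivered, not yet acked
    private int nextAckId = 0;

    void publish(String msg) {
        pending.addLast(msg);
    }

    /** Pull: the subscriber asks for one message. Returns an ack id, or -1 if none is pending. */
    int pull() {
        String msg = pending.pollFirst();
        if (msg == null) {
            return -1;
        }
        outstanding.put(nextAckId, msg);
        return nextAckId++;
    }

    String payload(int ackId) {
        return outstanding.get(ackId);
    }

    /** Ack: the message is finished and will not be delivered again. */
    void ack(int ackId) {
        outstanding.remove(ackId);
    }

    /** Simulates ack deadlines expiring: every unacked message is queued for redelivery. */
    void expireDeadlines() {
        pending.addAll(outstanding.values());
        outstanding.clear();
    }

    /** Push: the system calls the handler; returning true counts as the ack. */
    void pushAll(Predicate<String> handler) {
        int n = pending.size();
        for (int i = 0; i < n; i++) {
            String msg = pending.pollFirst();
            if (!handler.test(msg)) {
                pending.addLast(msg); // no ack: keep the message for a later retry
            }
        }
    }

    /** Messages the subscription still owes the subscriber. */
    int backlog() {
        return pending.size() + outstanding.size();
    }
}
```

In this sketch the difference between the two modes is only who initiates the call: the subscriber calls `pull()` and later `ack()`, whereas `pushAll()` has the system invoke the subscriber.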
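
The pipeline on slide 18 chains `ExtractTags`, `Count`, `ExpandPrefixes`, and `Top.largestPerKey(3)`. The slides do not show what `ExtractTags` or `ExpandPrefixes` do, so the sketch below rests on assumptions: that tags are `#`-prefixed tokens and that each tag is expanded into all of its prefixes, which would yield the most frequent tags per typed prefix. It uses plain Java collections rather than the Dataflow SDK, and `TagSuggestions` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java sketch of what the slide-18 pipeline might compute (assumed semantics). */
final class TagSuggestions {

    /** Assumed ExtractTags step: emit every whitespace-separated token that starts with '#'. */
    static List<String> extractTags(List<String> lines) {
        List<String> tags = new ArrayList<>();
        for (String line : lines) {
            for (String token : line.trim().split("\\s+")) {
                if (token.length() > 1 && token.startsWith("#")) {
                    tags.add(token.substring(1));
                }
            }
        }
        return tags;
    }

    /** Count step: number of occurrences of each tag. */
    static Map<String, Integer> count(List<String> tags) {
        Map<String, Integer> counts = new HashMap<>();
        for (String tag : tags) {
            counts.merge(tag, 1, Integer::sum);
        }
        return counts;
    }

    /** Assumed ExpandPrefixes + Top steps: for each prefix, keep the n most frequent tags. */
    static Map<String, List<String>> topPerPrefix(Map<String, Integer> counts, int n) {
        Map<String, List<Map.Entry<String, Integer>>> byPrefix = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            String tag = entry.getKey();
            for (int len = 1; len <= tag.length(); len++) {
                byPrefix.computeIfAbsent(tag.substring(0, len), k -> new ArrayList<>()).add(entry);
            }
        }
        Map<String, List<String>> top = new TreeMap<>();
        for (Map.Entry<String, List<Map.Entry<String, Integer>>> group : byPrefix.entrySet()) {
            List<Map.Entry<String, Integer>> candidates = group.getValue();
            candidates.sort((a, b) -> {
                int byCount = Integer.compare(b.getValue(), a.getValue()); // larger counts first
                return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey()); // ties: alphabetical
            });
            List<String> best = new ArrayList<>();
            for (int i = 0; i < Math.min(n, candidates.size()); i++) {
                best.add(candidates.get(i).getKey());
            }
            top.put(group.getKey(), best);
        }
        return top;
    }
}
```

Unlike the Dataflow version, this runs on one machine and holds everything in memory; it is meant only to make the data shapes between steps concrete.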
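
The streaming lines on slide 22 apply `FixedWindows.of(5, MINUTES)` to a Pub/Sub source. As a rough illustration of fixed windowing, and not of the Dataflow SDK's internals, the sketch below assigns each event timestamp to the window that contains it and counts events per window. `FixedWindowCounts` and its methods are hypothetical names, and it assumes windows are aligned to the epoch.

```java
import java.util.Map;
import java.util.TreeMap;

/** Sketch of fixed (tumbling) windows over event timestamps; assumes epoch-aligned windows. */
final class FixedWindowCounts {

    /** Start of the window containing the timestamp: the timestamp rounded down to a multiple of the size. */
    static long windowStart(long timestampMillis, long windowMillis) {
        return timestampMillis - Math.floorMod(timestampMillis, windowMillis);
    }

    /** Counts events per window, keyed by window start time in milliseconds. */
    static Map<Long, Integer> countPerWindow(long[] timestampsMillis, long windowMillis) {
        Map<Long, Integer> counts = new TreeMap<>();
        for (long t : timestampsMillis) {
            counts.merge(windowStart(t, windowMillis), 1, Integer::sum);
        }
        return counts;
    }
}
```

With a five-minute window (300,000 ms), events at 0 ms and 299,999 ms share a window, while an event at 300,000 ms starts the next one.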