This document discusses Google Cloud Platform's Internet of Things (IoT) architecture and services. It describes how IoT data can be captured over device protocols and streamed into Google Cloud Pub/Sub. Machine learning algorithms can then detect patterns in the real-time streams, and data is archived in Cloud Storage. Google Cloud Dataflow is highlighted for processing both batch and stream workloads, with features such as autoscaling, an intuitive programming model, and unified batch and stream processing.
4. Manage the Entire Lifecycle of Big Data
Capture: Cloud Logs, Google App Engine, Google Analytics Premium, Cloud Pub/Sub
Store: BigQuery Storage (tables), Cloud Bigtable (NoSQL), Cloud Storage (files), Cloud Datastore
Process: Cloud Dataflow (batch and stream)
Analyze: BigQuery Analytics (SQL), Cloud Dataflow, real-time analytics and alerts, Cloud Monitoring
6. Device to Device Protocols
● Device Discovery
● Device to Device authentication
● Device Configuration
● Protocol Routing
7. Machine Learning: Pattern Detection and Prediction
● Subscribers scan real-time streams and feed data into the Machine Learning Recognition algorithm
● Dataflow orchestrates streaming algorithms which compare data streams against the Experience Database
● Correlators detect known patterns and publish alerts using Cloud Pub/Sub
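The flow in the bullets above can be sketched in a few lines. This is a minimal in-memory illustration under stated assumptions, not the implementation behind the slide: `KNOWN_PATTERNS` stands in for the Experience Database, `correlate` for a correlator, and `publish_alert` for a Cloud Pub/Sub publish call. All three names, and the sample readings, are hypothetical.

```python
from collections import deque
from typing import Callable, Iterable

# Hypothetical stand-in for the "Experience Database": known patterns,
# each a named sequence of readings that should raise an alert.
KNOWN_PATTERNS = {
    "overheat": (70, 80, 90),
    "pressure_drop": (30, 20, 10),
}


def correlate(stream: Iterable[int],
              publish_alert: Callable[[str, int], None]) -> None:
    """Slide a window over the stream and publish an alert on each match."""
    longest = max(len(p) for p in KNOWN_PATTERNS.values())
    window: deque = deque(maxlen=longest)
    for position, reading in enumerate(stream):
        window.append(reading)
        recent = tuple(window)
        for name, pattern in KNOWN_PATTERNS.items():
            # Compare the most recent readings against each known pattern.
            if recent[-len(pattern):] == pattern:
                publish_alert(name, position)


# Hypothetical sample stream; a list collects alerts in place of Pub/Sub.
alerts = []
readings = [50, 60, 70, 80, 90, 40, 30, 20, 10]
correlate(readings, lambda name, pos: alerts.append((name, pos)))
print(alerts)
```

In a real deployment the callback would publish to an alerts topic rather than append to a list, and the exact-match comparison would likely be replaced by whatever recognition algorithm is in use.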
8. Cloud Storage Archival and Retrieval
● Data is periodically unloaded from Bigtable and stored in Cloud Storage for archival
● Data in Cloud Storage can be quickly reloaded into Bigtable should it need to be reprocessed.
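The unload and reload cycle described above might look like the following sketch. It makes several assumptions: a plain dictionary stands in for a Bigtable table, a local JSON-lines file stands in for a Cloud Storage object, and the row keys, the `ts` field, and the `archive_rows` and `reload_rows` helpers are hypothetical names chosen for illustration.

```python
import json
import tempfile
from pathlib import Path


def archive_rows(table: dict, cutoff: int, archive_path: Path) -> int:
    """Move rows with ts < cutoff out of the table into a JSON-lines file."""
    old_keys = [key for key, row in table.items() if row["ts"] < cutoff]
    with archive_path.open("a", encoding="utf-8") as f:
        for key in old_keys:
            record = {"key": key, "row": table.pop(key)}
            f.write(json.dumps(record) + "\n")
    return len(old_keys)


def reload_rows(table: dict, archive_path: Path) -> int:
    """Load archived rows back into the table for reprocessing."""
    count = 0
    with archive_path.open(encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            table[record["key"]] = record["row"]
            count += 1
    return count


# Hypothetical table contents, keyed by "device#timestamp".
table = {
    "dev1#100": {"ts": 100, "temp": 21},
    "dev1#200": {"ts": 200, "temp": 23},
    "dev2#300": {"ts": 300, "temp": 19},
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "archive.jsonl"
    archived = archive_rows(table, cutoff=250, archive_path=path)
    remaining = sorted(table)          # keys left in the hot table
    reloaded = reload_rows(table, path)

print(archived, remaining, reloaded)
```

Keeping the row key inside each archived record is what makes the reload step a straight write-back, with no need to reconstruct keys from the data.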
10. Messaging is a shock-absorber
Images by Connie Zhou
Availability
• Buffer new requests during outages
• Prevent overloads that cause outages
• Redirect requests to recover from outages
Throughput
• Smooth out spikes in new request rate
• Balance load across multiple workers
• Balance arrival rate with service rate
Latency
• Accept requests closer to the network edge
• Optimize message flow across regions
• Leverage shared efforts to improve protocols
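The throughput points above (smoothing spikes and balancing arrival rate against service rate) can be illustrated with a small deterministic simulation. It is a sketch only: the `simulate` function, the arrival counts, and the service rate are hypothetical, and an in-memory deque stands in for the message queue.

```python
from collections import deque
from typing import List, Tuple


def simulate(arrivals: List[int], service_rate: int) -> Tuple[List[int], int]:
    """Return the backlog after each tick and the total requests processed."""
    buffer: deque = deque()
    backlog = []
    processed = 0
    for tick, count in enumerate(arrivals):
        # New requests are buffered instead of hitting the workers directly.
        for i in range(count):
            buffer.append((tick, i))
        # Workers drain the buffer at a fixed rate, regardless of the spike.
        for _ in range(min(service_rate, len(buffer))):
            buffer.popleft()
            processed += 1
        backlog.append(len(buffer))
    return backlog, processed


# Hypothetical load: a spike of 10 requests arrives at tick 2.
arrivals = [2, 2, 10, 2, 0, 0, 2]
backlog, processed = simulate(arrivals, service_rate=3)
print(backlog, processed)
```

The workers never handle more than three requests per tick, the spike drains over the following ticks, and every request is eventually processed rather than dropped.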
11. Pub/Sub is a change-absorber
Sources
• New data sources can plug into old data flows
• New data sources can use new schemas
• Common security policies for all sources
Sinks
• Data can be sent to new destinations
• Push and Pull delivery are both available
• Spans organizational boundaries
Transforms
• Select subsets of messages that matter
• Helps manage schema and version changes
• Can merge streams into new topics
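A toy publish/subscribe model shows how the points above fit together: a new source plugs in without changes to existing sinks, a subscriber selects a subset of messages, and two streams merge into a new topic. The `Topic` class below is a hypothetical in-memory stand-in and not the Cloud Pub/Sub client API; topic names and message fields are likewise made up for the example.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Message = Dict[str, Any]
Handler = Callable[[Message], None]
Filter = Callable[[Message], bool]


class Topic:
    """Minimal in-memory topic with push-style delivery to subscribers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._subscribers: List[Tuple[Handler, Optional[Filter]]] = []

    def subscribe(self, handler: Handler,
                  message_filter: Optional[Filter] = None) -> None:
        self._subscribers.append((handler, message_filter))

    def publish(self, message: Message) -> None:
        for handler, message_filter in self._subscribers:
            # Deliver only the messages this subscriber asked for.
            if message_filter is None or message_filter(message):
                handler(message)


thermostats = Topic("thermostats")
cameras = Topic("cameras")          # a new source with a different schema
all_devices = Topic("all-devices")  # merged stream

# Transform: merge both source streams into one new topic.
thermostats.subscribe(all_devices.publish)
cameras.subscribe(all_devices.publish)

# Sinks: one archives everything, one selects only hot readings.
archive: List[Message] = []
alerts: List[Message] = []
all_devices.subscribe(archive.append)
all_devices.subscribe(alerts.append, lambda m: m.get("temp", 0) > 30)

thermostats.publish({"device": "t1", "temp": 35})
thermostats.publish({"device": "t2", "temp": 20})
cameras.publish({"device": "c1", "motion": True})
print(len(archive), alerts)
```

Neither sink needed to change when the `cameras` source was added, which is the sense in which the topic absorbs change.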
12. Pub/Sub at Google
Push Notifications
Every time your Gmail inbox pops up a new message, it’s because of a push notification to your browser or mobile device.
Ads & Budgets
One of the most important real-time information streams in the company is advertising revenue; we use Pub/Sub to broadcast budgets to our entire fleet of search engines.
Chat & Mobile
Google Cloud Messaging for Android delivers billions of messages a day, reliably and securely, for Google’s own mobile apps and the entire developer community.
Instant Search
Updating search results as you type is a feat of real-time indexing that depends on Pub/Sub to update caches with breaking news.
29. Why Dataflow?
● All the benefits of Hadoop-on-Google
● Plus a Fully-Managed Service
● Plus New, Intuitive Framework
● Plus Autoscaling and per-minute billing
● Plus True Stream Processing
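The idea of one processing model for batch and stream data can be illustrated without any SDK. The sketch below is plain Python and does not use the Dataflow API; `windowed_counts`, the event tuples, and the 10-second window are hypothetical, and it assumes events arrive in timestamp order, which a real streaming system cannot rely on.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple

Event = Tuple[int, str]  # (timestamp in seconds, device id)


def windowed_counts(events: Iterable[Event],
                    window_seconds: int) -> Iterator[Tuple[int, Dict[str, int]]]:
    """Yield (window_start, per-device counts) each time a fixed window closes."""
    current_start = None
    counts: Counter = Counter()
    for ts, key in events:
        start = ts - ts % window_seconds
        if current_start is not None and start != current_start:
            # The previous window is complete; emit it and start a new one.
            yield current_start, dict(counts)
            counts = Counter()
        current_start = start
        counts[key] += 1
    if current_start is not None:
        yield current_start, dict(counts)


# Hypothetical events, used once as a stored batch and once as a stream.
batch: List[Event] = [(1, "a"), (5, "b"), (12, "a"), (14, "a"), (31, "b")]


def live_stream() -> Iterator[Event]:
    """Stand-in for an unbounded source that yields events one at a time."""
    yield from batch


batch_result = list(windowed_counts(batch, 10))
stream_result = list(windowed_counts(live_stream(), 10))
print(batch_result == stream_result, batch_result)
```

The same function handles the stored list and the generator, which is the property the slide attributes to a unified batch and stream framework.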