IoT data: More and faster is not automatically better.
On optimal sampling strategies, how to calculate whether IoT pays off, and why it doesn't always have to be deep learning and real-time analytics. (Slides in German/English)
7. IoT cost expectations
many sensors +
complicated analytics +
expensive infrastructure
——————————————
IoT has little benefit
“…because my data scientist said the more the better”
8. 39% of survey participants
are worried about the cost
of an industrial IoT
solution.
“Why aren’t you doing IoT?”
10. Do I get more peanuts at Maxie Eisen or at Logenhaus?
[Figure: “on average” at Maxie Eisen vs. “on average” at Logenhaus, axis 0 to 100, 3 samples]
11. Do I get more peanuts at Maxie Eisen or at Logenhaus?
[Figure: “on average” at Maxie Eisen vs. “on average” at Logenhaus, axis 0 to 100, 4 samples]
12. Do I get more peanuts at Maxie Eisen or at Logenhaus?
[Figure: “on average” at Maxie Eisen vs. “on average” at Logenhaus, axis 0 to 100, n samples, with deviation]
statistical power through large numbers of samples
13. Statisticians and data scientists LOVE
larger sample sizes!
…but if sampling costs time and resources, we need a
compromise.
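The trade-off can be made concrete with the standard error of the mean, which shrinks with the square root of the number of samples. The following is a minimal sketch, not taken from the talk: the peanut counts are simulated, and the mean of 50 and spread of 15 are purely hypothetical values chosen for illustration.

```python
import math
import random
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


def simulate_counts(n, mean, spread, seed):
    """Draw n hypothetical peanut counts from a normal distribution."""
    rng = random.Random(seed)
    return [rng.gauss(mean, spread) for _ in range(n)]


if __name__ == "__main__":
    # Hypothetical numbers: more samples give a tighter estimate,
    # but each extra sample costs time and resources.
    for n in (3, 4, 30, 300):
        counts = simulate_counts(n, mean=50, spread=15, seed=1)
        print(n, round(standard_error(counts), 2))
```

Going from 3 to 30 samples cuts the uncertainty by roughly a factor of three; the next factor of three needs 300 samples. Whether that is worth the effort depends on the precision the job actually requires.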
15. Sampling strategy
precision and accuracy that can be achieved theoretically
precision and accuracy that is needed to get a job done
[Figure: four panels labelled “accurate and precise”, “not accurate, but precise”, “accurate, not precise” and “not what you want”]
16. Sweetening IoT for your customer
A few recommendations from the trenches:
• how to cut down on hardware costs
• how to cut down on software costs
many sensors + complicated analytics + expensive infrastructure
——————————————
IoT has little benefit
[Slide annotations: “less”, “reasonable”]
17. IoT - is it worth it?
The upgrade of a ‘dumb’ asset to
a ‘smart’ asset is an investment.
time,
money
19. Data sources
Let’s assume the future isn’t going to be
much different than the past…
• log from past site visits: approx. likelihood for maintenance
• a collection of traffic data that’s somewhat representative
21. Maintenance likelihood
• test for dependency between Monday and Wednesday tours: none
• test for dependency within tours: none
The assumption of temporal uniformity is reasonable.
22. Monte Carlo simulations
p1(need today), p2(need today), p3(need today), p4(need today), p5(need today), p6(need today)
patterns for a demand-driven tour
‘cost function’: sum of edges
[Figure labels: base, default tour]
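A minimal sketch of such a simulation, under stated assumptions: each site has an independent probability of needing a visit today, a demand-driven tour visits only the sites in need (in their default-tour order), and the cost is the sum of the travelled edges. The function names, the distance matrix and the probabilities are hypothetical and not taken from the project.

```python
import random


def tour_cost(stops, dist):
    """Sum of edge weights for base -> stops in order -> base (base is index 0)."""
    route = [0] + list(stops) + [0]
    return sum(dist[a][b] for a, b in zip(route, route[1:]))


def simulate_demand_tours(probs, dist, n_days, seed=0):
    """Mean cost of demand-driven tours over n_days simulated days.

    probs[i] is the assumed probability that site i + 1 needs a visit today.
    Sites keep their default-tour order; sites without need are skipped.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_days):
        needed = [i + 1 for i, p in enumerate(probs) if rng.random() < p]
        total += tour_cost(needed, dist)
    return total / n_days


if __name__ == "__main__":
    # Hypothetical layout: base plus three sites on the corners of a 3 x 4 rectangle.
    dist = [
        [0, 3, 5, 4],
        [3, 0, 4, 5],
        [5, 4, 0, 3],
        [4, 5, 3, 0],
    ]
    default = tour_cost([1, 2, 3], dist)
    demand = simulate_demand_tours([0.5, 0.2, 0.7], dist, n_days=10_000)
    print("default tour:", default, "demand-driven (mean):", round(demand, 2))
```

The difference between the default cost and the simulated mean is the saving that has to pay for the sensors.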
23. Travelling salesman problem
what’s the most reasonable tour from [icon] to [icon], visiting all [icon]?
heuristic search is good enough, but requires a distance matrix
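The slide does not say which heuristic was used. As one illustration, a nearest-neighbour heuristic on a distance matrix can be sketched as follows; the function names and the example matrix are hypothetical.

```python
def nearest_neighbour_tour(dist, start=0):
    """Greedy tour: always travel to the closest unvisited node, then return to start."""
    unvisited = set(range(len(dist))) - {start}
    tour = [start]
    while unvisited:
        here = tour[-1]
        # Ties are broken by node index to keep the result deterministic.
        nxt = min(unvisited, key=lambda j: (dist[here][j], j))
        tour.append(nxt)
        unvisited.remove(nxt)
    tour.append(start)
    return tour


def tour_length(tour, dist):
    """Sum of the edge weights along the tour."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:]))


if __name__ == "__main__":
    # Hypothetical symmetric distance matrix; node 0 is the base.
    dist = [
        [0, 3, 5, 4],
        [3, 0, 4, 5],
        [5, 4, 0, 3],
        [4, 5, 3, 0],
    ]
    tour = nearest_neighbour_tour(dist)
    print(tour, tour_length(tour, dist))
```

A greedy tour is not guaranteed to be optimal, but for a handful of sites per day it is usually close enough, and it only needs the distance matrix as input.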
24. Traffic harvesting
• based on Google API
• generate a distribution of travel times for each edge in the graph, dependent on time of day (weekdays only)
25. IoT - is it worth it?
[Figure: two plots of cost against weeks, annotated “awaiting confirmation!”]
27. Humans don’t scale that well…
labour:
expensive
sensor:
cheap
While the cost of the sensors is falling (and follows Moore’s
Law), digging them in and out for deployment and
maintenance is a significant cost factor.
28. Can we learn an optimal deployment and sampling pattern?
• sampling rate of 5-10 min
• data over 2 weeks in May 2015
• overall 2.6 million data points
Can we make customers’ budget go further by
• reducing the number of sensors in a geographic area?
• lowering the sampling rate for better battery life?
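One way to approach the second question is to downsample an existing high-resolution series and measure how much information is lost. This is a sketch under assumptions: the occupancy readings are made up, and "hold the last reading" is just one possible way to fill the gaps between samples.

```python
def downsample_hold(series, step):
    """Keep every step-th reading and hold it until the next kept reading."""
    return [series[(i // step) * step] for i in range(len(series))]


def mean_abs_error(a, b):
    """Mean absolute difference between two series of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    # Hypothetical occupancy (number of occupied spaces) at 5-minute resolution.
    occupancy = [0, 0, 1, 1, 2, 2, 2, 3, 3, 3, 2, 2, 1, 1, 0, 0]
    for step in (1, 2, 3, 6):
        coarse = downsample_hold(occupancy, step)
        print(f"{5 * step:>2} min sampling, error:",
              round(mean_abs_error(occupancy, coarse), 3))
```

If the error stays small at a coarser rate, the sensor can sleep longer between readings.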
30. Correlation and clustering
[Figure: three example plots labelled “correlated”, “anti-correlated” and “independent”]
[Figure: example items lorry, coach, car, bike, skateboard]
hierarchical clustering on the basis of a feature matrix
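As an illustration of the building blocks, the Pearson correlation coefficient between two occupancy profiles, and a matrix of pairwise distances (1 - r) that a hierarchical clustering could work on, can be sketched as follows. The function names and profiles are hypothetical, and treating 1 - r as the distance is an assumption rather than the method confirmed by the slides.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # convention used here: a constant profile correlates with nothing
    return cov / (spread_x * spread_y)


def correlation_distance_matrix(profiles):
    """Pairwise distance 1 - r: 0 for correlated, 2 for anti-correlated profiles."""
    return [[1 - pearson(a, b) for b in profiles] for a in profiles]


if __name__ == "__main__":
    # Hypothetical occupancy profiles of three parking lots over one day.
    lot_a = [0, 1, 3, 4, 3, 1]
    lot_b = [0, 2, 5, 8, 6, 2]   # fills up in step with lot_a
    lot_c = [4, 3, 1, 0, 1, 3]   # empties while lot_a fills
    for row in correlation_distance_matrix([lot_a, lot_b, lot_c]):
        print([round(d, 2) for d in row])
```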
31. Good news: temporal occupancy pattern roughly predicts neighbours
[Figure labels: lots in Southampton; lots around the corner of each other; 750 parking lots]
32. A caveat: Is a high degree of correlation a function of parking lot size?
finding two lots of 20 spaces that correlate
finding two lots of 3 spaces that correlate
[Figure: occupancy over the day, 0:00 to 23:59, for both cases, annotated “more likely” and “less likely”]
33. Bootstrapping in DBSCAN clusters
Simulation: Swap the occupancy vectors between parking
lots of similar size and test per grid cell if these lots still
correlate
35. Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
https://en.wikipedia.org/wiki/DBSCAN#/media/File:DBSCAN-Illustration.svg
2 parameters:
epsilon (distance)
minPoints (in cluster)
A - core points
B, C - border points (reachable from a core point, but not core points themselves)
N - noise point
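To show how the two parameters interact, here is a minimal pure-Python sketch of the DBSCAN idea. It is not the implementation used in the project, it scans all points for every neighbourhood query (fine for a demo, slow for large data), and the example coordinates are made up.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_points):
    """Return one cluster label per point; NOISE marks noise points."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_points:
            labels[i] = NOISE  # may still become a border point later
            continue
        cluster += 1           # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_points:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels


if __name__ == "__main__":
    # Hypothetical coordinates: two dense groups and one isolated point.
    points = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10),
              (50, 50)]
    print(dbscan(points, eps=1.5, min_points=3))
```

Unlike k-means, the number of clusters does not have to be chosen in advance; it follows from epsilon and minPoints.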
36. Stratification strategy
3 lots with cc > 0.5
2 spaces
4 spaces
4 spaces
Test:
1. Take occupancy profile of
ONE random 2-space parking
lot and TWO random 4-space
parking lots.
2. Determine cc.
3. Repeat n times and get a cc
distribution for that parking lot
combination.
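The three steps can be sketched as a small resampling routine. Two things here are assumptions, not statements from the slides: for a combination of three lots, "cc" is taken to be the mean of the pairwise Pearson correlation coefficients, and the occupancy profiles are invented for the example.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # convention used here for constant profiles
    return cov / (spread_x * spread_y)


def mean_pairwise_cc(profiles):
    """Assumed summary statistic: mean correlation over all pairs of profiles."""
    pairs = list(itertools.combinations(profiles, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def cc_distribution(lots_by_size, combination, n, seed=0):
    """Repeat n times: draw random lots per size class and record their cc.

    combination maps lot size -> number of lots to draw, e.g. {2: 1, 4: 2}.
    """
    rng = random.Random(seed)
    result = []
    for _ in range(n):
        drawn = []
        for size, count in combination.items():
            drawn.extend(rng.sample(lots_by_size[size], count))
        result.append(mean_pairwise_cc(drawn))
    return result


if __name__ == "__main__":
    # Hypothetical occupancy profiles, grouped by number of spaces.
    lots_by_size = {
        2: [[0, 1, 2, 1], [2, 1, 0, 1], [1, 2, 2, 0]],
        4: [[0, 2, 4, 2], [1, 3, 4, 3], [4, 2, 0, 2], [3, 3, 1, 0]],
    }
    null_cc = cc_distribution(lots_by_size, {2: 1, 4: 2}, n=1000)
    share = sum(c > 0.5 for c in null_cc) / len(null_cc)
    print("share of random combinations with cc > 0.5:", round(share, 3))
```

If random lots of the same sizes reach cc > 0.5 almost as often as real neighbours do, the observed correlation says more about lot size than about location.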
38. Suggested technology for trials
A temporary survey would have allowed us to make
the same recommendation, including the insight that
the provided 5-minute resolution is probably not required.
39. Sweetening IoT for your customer
A few recommendations from the trenches:
• how to cut down on hardware costs
• how to cut down on software costs
many sensors + complicated analytics + expensive infrastructure
——————————————
IoT has little benefit
[Slide annotations: “less”, “reasonable”]
40. My current pet hate: Deep Learning
Deep learning has delivered impressive
results mimicking human reasoning,
strategic thinking and creativity.
At the same time, big players
have released libraries such
that even ‘script kiddies’ can
apply deep learning.
It’s already leading to uncritical use
of deep learning when other methods
would be more appropriate.
41. “I need to do real-time analytics!”
Time scales: microseconds to seconds | seconds to minutes | minutes to hours | hours to weeks
Where: on device | on stream | in batch
Processing: in process | in database
Example questions: am I falling? counteract | battery level, should I land? | how many times did I stall? | what’s the best weather for flying?
Insight: operational insight | performance insight | strategic insight
Methods: e.g. Kalman filter | e.g. with machine learning | e.g. rules engine | e.g. summary stats
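To give an idea of how little code the fastest tier can need, here is a minimal sketch of a scalar Kalman filter that smooths a noisy sensor reading on the device. It assumes the true value is roughly constant between readings; the variances and the altitude readings are hypothetical and would have to be tuned for a real sensor.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a (nearly) constant value.

    x is the current estimate, p its variance. Returns the estimate after
    each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + process_var             # predict: uncertainty grows over time
        k = p / (p + measurement_var)   # Kalman gain: trust in the new reading
        x = x + k * (z - x)             # update with the measurement residual
        p = (1 - k) * p                 # uncertainty shrinks after the update
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    # Hypothetical noisy altitude readings around 10 m.
    readings = [10.4, 9.7, 10.1, 9.9, 10.6, 9.5, 10.2, 10.0]
    smoothed = kalman_1d(readings, process_var=1e-3, measurement_var=0.1, x0=readings[0])
    print([round(v, 2) for v in smoothed])
```

Each update is a handful of arithmetic operations and needs no history, which is why this kind of method fits on a microcontroller.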
42. Can IoT ever be real-time?
zone 1: real-time [µs]
zone 2: real-time [ms]
zone 3: real-time [s]
43. Edge, fog and cloud computing
Edge
Pro:
- immediate compression from raw data to actionable information
- cuts down traffic
- fast response
Con:
- loses potentially valuable raw data
- developing analytics on embedded systems requires specialists
- compute costs valuable battery life
Cloud
Pro:
- compute power
- scalability
- familiarity for developers
- integration centre across all data sources
- cheapest ‘real-time’ option
Con:
- traffic
Fog
Pro:
- same as Edge
- closer to ‘normal’ development work
- gateways often mains-powered
Con:
- loses potentially valuable raw data
44. Some of our examples for
real-time analytics
Choosing the appropriate
method and toolset on
every level.
45. Summary
‣ Preliminary surveys and data analysis can help to minimise the number of sensors and develop an optimal deployment strategy and sampling schedule.
‣ Super-fast analytics and state-of-the-art methods are not automatically the most useful solution.
‣ A good understanding of the type of insight that is required by the business model is essential.
Dr. Boris Adryan
@BorisAdryan
46. The Technical Foundations of IoT
Boris Adryan, Dominik Obermaier, Paul Fremantle
mobile communications series
Artech House, Boston | London
www.artechhouse.com
This comprehensive resource presents a technical introduction to
the components, architectures, software, and protocols of IoT.
This book was designed specifically for those interested in researching,
developing, and building IoT. The book covers the physics of electricity
and electromagnetism, laying the foundation for understanding the
components of modern electronics and computing. Readers learn about
the fundamental properties of IoT, along with security and privacy issues
related to developing and maintaining connected products.
From the launch of the Internet from ARPAnet in the 1960s, to recent
connected gadgets, this book highlights the integration of IoT in various
verticals such as industry, smart cities, connected vehicles, and smart
and assisted living. Overall design patterns, issues with UX and UI, and
different network topologies related to architectures of M2M and IoT
solutions are explored. Hardware development, power, sensors, and
embedded systems are discussed in detail. This book offers insight into
the software components that impinge on IoT solutions, their development,
network protocols, backend software, data analytics, and conceptual
interoperability.
Boris Adryan is the head of IoT & Data Analytics at Zuhlke Engineering (Germany)
and the founder of thingslearn Ltd (UK). He holds a Ph.D. in genetics from the
Max Planck Institute for Biophysical Chemistry, and led academic research as
a Royal Society University Research Fellow at the University of Cambridge.
Dominik Obermaier is the cofounder and CTO at dc-square company, where
he created the HiveMQ MQTT broker. He received his B.Sc. in computer science
from the University of Applied Sciences Landshut.
Paul Fremantle cofounded WSO2, where he was instrumental in creating
the Carbon middleware platform. He studied mathematics, philosophy and
computing at Oxford University, gaining B.A. and M.Sc. degrees. He is currently
pursuing his Ph.D. at the University of Portsmouth, focusing on security and
privacy of IoT.
ISBN 13: 978-1-63081-025-2
ISBN: 1-63081-025-8
to be published in June or July