Industry of Things World - Berlin 19-09-16

This talk makes the case for a measured use of big data pipelines and analytics methods based on the specific business case: one size doesn't fit all. Rather than buying the fastest stack and the most hyped methods, practitioners interested in analytics for Internet-of-Things deployments can save a lot of money by asking themselves a few questions that I lay out in the talk.

Published in: Data & Analytics

  1. 1. Impact of IoT analytics on the development budget Dr. Boris Adryan @BorisAdryan Industry of Things World, Berlin, 19th September 2016
  2. 2. Dr. Boris Adryan • with Zühlke Engineering since September 2016 • longstanding IoT enthusiast • Founder of thingslearn Ltd. • Board Member & Strategic Advisor for Pycom (microcontrollers), BioSelf (biosensors) and OpenSensors (IoT platform) • before: research group leader for data analytics and machine learning at University of Cambridge, England. @BorisAdryan
  3. 3. “I disagree with the notion that data is the new oil. It’s as infinite as the sun, and just like the power of the sun, we’re barely using it at the moment.” Mike Gualtieri, Forrester Research
  4. 4. 5V of Big Data • Volume: “doesn’t fit on my local drive” • Velocity: “process deals with hundreds of events per second” • Variety: “wouldn’t even know how to save this in an RDBMS” • Value: “actionable insight” • Veracity: “not sure how current, valid or complete it is”
  5. 5. IoT ≠ Big Data. Sensor devices produce large and small data. You may not immediately know how to deal with them, but that doesn’t automatically make them ‘big data’. It’s worth looking at the actual data problem before hiring a ‘big data specialist’ or buying an ‘analytics solution’.
  6. 6. 39% of survey participants are worried about the cost of an industrial IoT solution. “Why aren’t you doing IoT?”
  7. 7. This talk is about unfounded IoT fears. Hardware is often perceived as an investment that customers understand and therefore anticipate. There’s an air of magic around data and analytics. This leads to fear of: • having to hire specialists (for both data plumbing and analytics) • having to buy expensive services • losing control over the process due to a lack of understanding
  8. 8. You want actionable insight. [Diagram: data → “magic” (here be dragons! how to deal and what to do with the data) → insight → whatever you do in your vertical, ✓ better ✓ faster ✓ cheaper]
  9. 9. Do you need to employ a specialist? • Data is ✓ small (fits on your drive) and ✓ you know exactly what you’re looking for: not a ‘data problem’; ask your programmer. • Data is ✓ large (think data-centre scale) and ✓ you know exactly what you’re looking for: potentially ‘big data’; ask your sysadmin, then your programmer.
  10. 10. Let’s talk about IoT and the cloud
  11. 11. You have a choice. Actually, too much of it.
  12. 12. “My data problem must be special!” My company went to an IoT conference & all I got was this t-shirt and a bunch of buzzwords. Customers fear costs because they’re facing: ✓ unstructured data ✓ distributed ingestion and storage. Or they believe from hearsay that IoT automatically requires: ✓ real-time analytics ✓ sophisticated machine learning.
  13. 13. “I receive unstructured data!” • RDBMS: one table (name, age) with the row (Boris, 40); another table (name, city, job) with the row (Boris, Fra…, IoT). • Key-value DBs: {name: Boris, age: 40, city: Frankfurt}; {name: Boris, job: IoT / data science}; {name: Ilka, age: 39}; {name: Ilka, city: Frankfurt, job: pharma R&D}. SQL-ish syntax. NoSQL DBs run on commodity hardware: neither a ‘big data’ nor a ‘cloud’ problem.
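The key-value records on slide 13 can be handled with a few lines of ordinary code, which is the slide's point. The following is a minimal sketch, assuming the records arrive as Python dictionaries with differing fields; the helper name `merge_by_name` and the choice of `name` as the join key are illustrative assumptions, not part of the talk.

```python
from collections import defaultdict

# Records with differing fields, as on the slide (key-value style).
records = [
    {"name": "Boris", "age": 40, "city": "Frankfurt"},
    {"name": "Boris", "job": "IoT / data science"},
    {"name": "Ilka", "age": 39},
    {"name": "Ilka", "city": "Frankfurt", "job": "pharma R&D"},
]


def merge_by_name(rows):
    """Combine partial records that share the same 'name' (hypothetical join key)."""
    merged = defaultdict(dict)
    for row in rows:
        merged[row["name"]].update(row)
    return dict(merged)


people = merge_by_name(records)

# A simple query over the merged documents; missing fields are tolerated.
in_frankfurt = sorted(
    name for name, person in people.items() if person.get("city") == "Frankfurt"
)
print(in_frankfurt)
```

Nothing here needs a cluster or a cloud service: records with missing fields are simply dictionaries with fewer keys.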
  14. 14. “I got too many things!” [Diagram: many things send data over time to several brokers, which write to several storage nodes.] Even standard cloud offerings can do distributed ingestion and storage very well: not a ‘big data problem’.
  15. 15. Adapting a PaaS to your needs. • Zühlke IoT Platform, standard components (still, tedious to configure): security, I/O / broker, fast storage, device management, gateway, portal & user management, basic analytics. • Your USP: your apps & corporate design, your products and analytical services, your devices.
  16. 16. You want actionable insight. [Diagram: data → “magic” (here be dragons! how to deal and what to do with the data) → insight → whatever you do in your vertical, ✓ better ✓ faster ✓ cheaper] Basic data plumbing and storage is usually not the issue.
  17. 17. “The message is that there are known knowns. There are things we know that we know. There are known unknowns. That is to say there are things that we now know we don't know. But there are also unknown unknowns. There are things we don't know we don't know.” Donald Rumsfeld, ex US Secretary of Defense
  18. 18. Do you need to employ a specialist? Data is ✓ small or large ✓ you don’t know what to connect or how to find it (the “known unknowns”) ✓ you want to increase operational awareness (the “unknown unknowns”): a ‘data science problem’. You may just need a one-off solution. We can help to establish a machine learning pipeline to extract relevant information automatically.
  19. 19. What is machine learning? • unsupervised learning: get an overview of what’s in your data set • supervised learning: teach the machine to classify data on the basis of some previous training • statistical analysis: find rules and outliers on the basis of numerical data. Very relevant for predictive maintenance etc. [Example table, features wheels / motor / windows / seats: skates 4 / n / n / 0; bike 2 / n / n / 1; car 4 / y / y / 4; bus 6 / y / y / 9; lorry 6 / y / y / 2]
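The supervised-learning idea on slide 19 can be illustrated with a nearest-neighbour lookup over the vehicle table. This is a minimal sketch, not the method used in the talk: the grouping of the table values per vehicle is an assumption based on the transcript, and the `unknown` sample is made up for illustration.

```python
# Features per vehicle: (wheels, has_motor, has_windows, seats).
# The grouping of values is an assumption read from the slide transcript.
TRAINING = {
    "skates": (4, 0, 0, 0),
    "bike": (2, 0, 0, 1),
    "car": (4, 1, 1, 4),
    "bus": (6, 1, 1, 9),
    "lorry": (6, 1, 1, 2),
}


def distance(a, b):
    """Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def nearest_label(sample):
    """Return the label of the training example closest to the sample."""
    return min(TRAINING, key=lambda label: distance(TRAINING[label], sample))


# Hypothetical unseen item: 4 wheels, motor, windows, 5 seats.
unknown = (4, 1, 1, 5)
print(nearest_label(unknown))
```

A real classifier would scale the features and be trained on many examples per class; the sketch only shows that "classify on the basis of previous training" can start very small.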
  20. 20. How does classification work? Training data: flights cancelled in the past, described by features such as weather forecast, airport location, # of gates, # of runways, # of snowploughs, airline, aircraft. Training yields a classifier (a BLACK BOX) that holds: ranked list of relevant features, weight of features, thresholds for features, performance metric. New data → classifier → prediction.
  21. 21. Is this reliable? [Loop: data → training → classifier → performance assessment → good enough? yes: success! no: more data for training.] [Plot: sensitivity (“true positives”) against 1-specificity (“false positives”), both axes from 0 to 1.0, with a region marked ‘worse than random guess’.]
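The plot on slide 21 is a receiver operating characteristic (ROC) curve: for each decision threshold, sensitivity is TP / (TP + FN) and 1-specificity is FP / (FP + TN). The sketch below computes a few such points; the scores, labels and thresholds are invented for illustration, and `roc_points` is a hypothetical helper rather than a library function.

```python
def roc_points(scores, labels, thresholds):
    """Return (false positive rate, true positive rate) for each threshold.

    scores: classifier outputs, higher means "more likely positive".
    labels: ground truth, 1 for positive and 0 for negative.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    points = []
    for threshold in thresholds:
        predicted = [score >= threshold for score in scores]
        true_pos = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
        false_pos = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
        points.append((false_pos / negatives, true_pos / positives))
    return points


# Made-up example data.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels, thresholds=[0.0, 0.5, 1.0])
for fpr, tpr in points:
    print(f"1-specificity={fpr:.2f}  sensitivity={tpr:.2f}")
```

Points above the diagonal (sensitivity greater than 1-specificity) are better than guessing; points on or below it are not, which is how the "good enough?" decision in the loop can be made measurable.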
  22. 22. Where is your classifier located? [Diagram: data → “magic” (here be dragons!) → insight → whatever you do in your vertical, ✓ better ✓ faster ✓ cheaper] Stages shown: model building, training, operation, performance tracking. Groupings shown: R&D; on device, cloud or mobile app.
  23. 23. Is analytics just data crunching?
  24. 24. sound profile assessment result “predictive maintenance classifier”
  25. 25. “Do I need real-time analytics?” Time scales and example questions: • microseconds to seconds: am I falling? counteract • seconds to minutes: battery level, should I land? • minutes to hours: how many times did I stall? • hours to weeks: what’s the best weather for flying? Where the analytics runs: on device, on stream, in batch (in process or in database). Type of insight: operational, performance, strategic. Example methods: Kalman filter, machine learning, rules engine, summary stats.
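The difference between on-stream and batch analytics on slide 25 can be sketched in a few lines. This is an illustration under assumptions: the event fields (`battery`, `stalled`), the 20 % threshold and both function names are hypothetical, chosen to echo the slide's drone-style questions.

```python
from statistics import mean

LOW_BATTERY_PERCENT = 20  # hypothetical threshold for the rule


def on_stream_rule(event):
    """Seconds-to-minutes scale: a rule evaluated on each event as it arrives."""
    if event["battery"] < LOW_BATTERY_PERCENT:
        return "land"
    return "continue"


def batch_summary(events):
    """Hours-to-weeks scale: summary statistics over stored events."""
    return {
        "stalls": sum(1 for event in events if event["stalled"]),
        "mean_battery": mean(event["battery"] for event in events),
    }


# Made-up flight log.
flight_log = [
    {"battery": 80, "stalled": False},
    {"battery": 55, "stalled": True},
    {"battery": 15, "stalled": False},
]

decisions = [on_stream_rule(event) for event in flight_log]
summary = batch_summary(flight_log)
print(decisions)
print(summary)
```

Only the rule needs low latency; the summary can run in a database or a nightly job, which is usually the cheaper option.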
  26. 26. Can IoT ever be real-time? zone 1: real-time [µs] zone 2: real-time [ms] zone 3: real-time [s]
  27. 27. Edge, fog and cloud computing • Edge. Pro: immediate compression from raw data to actionable information; cuts down traffic; fast response. Con: loses potentially valuable raw data; developing analytics on embedded systems requires specialists; compute costs valuable battery life. • Fog. Pro: same as Edge; closer to ‘normal’ development work; gateways often mains-powered. Con: loses potentially valuable raw data. • Cloud. Pro: compute power; scalability; familiarity for developers; integration center across all data. Con: traffic.
  28. 28. Some of our examples for real-time analytics Choosing the appropriate method and toolset on every level.
  29. 29. Options for cloud-based real-time analytics • Some features can cost a bit, especially when you don’t really know what you’re doing and want to ‘try it out’. • A badly configured SMACK stack on your own commodity hardware can be slow and unreliable. • Your pre-trained classifier.
  30. 30. My current pet hate: Deep Learning. Deep learning has delivered impressive results mimicking human reasoning, strategic thinking and creativity. At the same time, big players have released libraries such that even ‘script kiddies’ can apply deep learning. This is already leading to uncritical use of deep learning where other methods would be more appropriate.
  31. 31. Summary ‣ Super-fast analytics and state-of-the-art methods are not automatically the most useful solution. ‣ A good understanding of the type of insight that is required by the business model is essential. ‣ There are many solutions readily available that might enable IoT projects very cost-effectively. Zühlke can advise on your options around IoT and data analytics, and provide complete solutions where needed. Dr. Boris Adryan @BorisAdryan, Industry of Things World, Berlin, 19th September 2016