2. Insights From Internet Of Things & Big Data
Kostya Goldstein
Sr. Program Manager Microsoft Russia
3. Agenda
Business insights through big data
Microsoft’s solution to big data
Intelligent systems service, HDInsight
Features & capabilities
Demo – HDInsight
Office as Big Data visualization platform
Self service BI – Features & capabilities
Demo – Power BI
Hackathon
Tessel
IoT Hands On Task
8. Internet of Things (IoT)
The network of physical objects that contain embedded technology to communicate and interact with their internal states or the external environment.
10. What is big data?
Internet of Things: Audio / Video, Operation logs, Text / Images, Sentiment, Data mart updates, E-government feeds, Weather, Wikis / Blogs, Clickstream, Sensors / RFID / Devices, GPS coordinates
WEB 2.0: Mobile devices, Advertising, Interaction, E-commerce, Digital marketing, Search marketing, Web server logs, Recommendations
ERP / CRM: Sales pipeline, Payables, Payroll, Inventory, Contacts, Deal tracking
Volume: gigabytes (10^9) | terabytes (10^12) | petabytes (10^15) | exabytes (10^18)
Velocity | Variety | Variability
Storage cost per gigabyte, USD
1980: $190,000
1990: $9,000
2000: $15
2010: $0.07
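To make the price drop concrete, the sketch below scales the per-gigabyte storage costs from this slide up to one terabyte. It assumes decimal units (1 TB = 10^12 bytes = 1,000 GB); the names `costPerGb`, `costPerTerabyte`, and `priceDropFactor` are illustrative and not part of the original deck.

```javascript
// Storage cost per gigabyte in USD, as listed on the slide.
const costPerGb = { 1980: 190000, 1990: 9000, 2000: 15, 2010: 0.07 };

// Assumption: decimal units, so 1 TB = 10^12 bytes = 1,000 GB.
const GB_PER_TB = 1000;

// Cost of storing one terabyte at a given year's price, rounded to whole dollars.
function costPerTerabyte(year) {
  return Math.round(costPerGb[year] * GB_PER_TB);
}

// How many times cheaper a gigabyte became between two years.
function priceDropFactor(fromYear, toYear) {
  return costPerGb[fromYear] / costPerGb[toYear];
}

for (const year of Object.keys(costPerGb)) {
  console.log(`${year}: about $${costPerTerabyte(year)} per TB`);
}
console.log(`1980 to 2010: about ${Math.round(priceDropFactor(1980, 2010))} times cheaper`);
```

At these prices a terabyte goes from roughly $190 million in 1980 to about $70 in 2010, which is why storing raw device and web data became practical.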
13. Intelligent Systems Service
Microsoft Solution For Internet Of Things
Drive Insights | Analytics Ready | Cloud and infrastructure | Devices and assets
Customer portal | Value | StreamInsight | Power BI | HDInsight
Windows Embedded: Connect new and existing devices using open-source agents or gateway technologies
Azure, HDInsight: Store machine-generated data with data from other sources in the cloud
Office 365, Power BI: View data, administer devices, and configure rules, alerts, and other actions using out-of-box or custom portals
Mine insights from your data to find gaps and opportunities to make better decisions and realize new business value
User input | Alerts | Sensors | Gateway | Agent | Devices
14. IoT Services Architecture & Platform Components
ISS (Intelligent Systems Service)
Agent | Gateway
Event Hub & Azure Service Bus
Event Processing & Rules Engine
Tables | BLOBS | SQL Azure | HDFS
IF {condition} THEN {action}
Azure Service Bus
Design & Engineering | Manufacturing & Supply Chain | Service & Maintenance | Customer Relationship
ID
Industrial Equipment
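The IF {condition} THEN {action} pattern of the rules engine can be sketched in a few lines. This is only an in-memory illustration, not the ISS rules engine itself; the field names (`deviceId`, `temperature`, `vibration`), the thresholds, and the `evaluate` helper are all hypothetical.

```javascript
// Hypothetical rules in the IF {condition} THEN {action} form from the slide.
// Field names and thresholds are invented for illustration.
const rules = [
  {
    name: 'overheat',
    condition: (event) => event.temperature > 80,
    action: (event) => `ALERT ${event.deviceId}: temperature ${event.temperature}`,
  },
  {
    name: 'vibration',
    condition: (event) => event.vibration > 0.5,
    action: (event) => `ALERT ${event.deviceId}: vibration ${event.vibration}`,
  },
];

// Evaluate every rule against one event and collect the actions that fired.
function evaluate(event, ruleSet) {
  return ruleSet
    .filter((rule) => rule.condition(event))
    .map((rule) => rule.action(event));
}

const alerts = evaluate({ deviceId: 'pump-1', temperature: 95, vibration: 0.1 }, rules);
console.log(alerts);
```

Keeping rules as data (a list of condition/action pairs) means new rules can be added without touching the event-processing loop.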
16. How To Generate Value From IoT Data
BIG DATA: Data powered by IoT & other business systems
Unstructured | Structured | Streaming
PB | TB | GB
BETTER Insights: Transform your business with better insights.
Advanced analytics | Interactivity + Exploration | Self-service analysis | Decision support
Data scientist | Business analyst | BI professional | Device operator
17. Big Data
18. Microsoft’s Big Data Solution Stack
Data Management and Enrichment | Insight | Familiar end user tool
Unstructured and structured data: Sensors, Devices, Bots, Crawlers, ERP, CRM, LOB apps
Interactive Reports With Power View | Excel With Power Pivot | Predictive Analytics
On MS Azure Cloud: Hadoop HDInsight, Machine learning, Event Hubs, Stream Analytics, Data Factory
19. Data Management And Enrichment
20. Hadoop And HDInsight Technology Stack
HDInsight Ecosystem
Metadata (HCatalog)
Graph (Pegasus)
Scripting (Pig)
Query (Hive)
Machine learning (Mahout)
Distributed processing (MapReduce)
Distributed storage (HDFS)
World’s data (Azure data marketplace)
Windows Azure storage
AD, System Center
Stats processing (RHadoop)
Business intelligence (Excel, Power View…)
Data integration (ODBC / Sqoop / REST)
NoSQL database (HBase)
PDW
Pipeline / workflow (Oozie)
Log file aggregation (Flume)
Hadoop Ecosystem
Top level interfaces: ETL Tools, BI Reporting, RDBMS
Top level abstractions: Pig, Hive, Sqoop
Distributed data processing: MapReduce
HBase: Database with real time access
At the base is a self healing clustered storage system: Hadoop distributed file system (HDFS)
26. The prototypical MapReduce example counts the appearance of each word in a set of documents
function map(String name, String document):
  // name: document name
  // document: document contents
  for each word w in document:
    emit (w, 1)

function reduce(String word, Iterator partialCounts):
  // word: a word
  // partialCounts: a list of aggregated partial counts
  sum = 0
  for each pc in partialCounts:
    sum += ParseInt(pc)
  emit (word, sum)
en.wikipedia.org
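The pseudocode can be tried out locally with a small in-memory version. This is a sketch, not Hadoop code: the `shuffle` helper stands in for the grouping step that the framework performs between map and reduce, lowercasing and splitting on non-word characters are simplifying assumptions, and the sample documents are invented.

```javascript
// map: emit a (word, 1) pair for every word in the document.
// Assumption: words are lowercased and split on non-word characters.
function map(name, document) {
  return document
    .toLowerCase()
    .split(/\W+/)
    .filter((w) => w.length > 0)
    .map((w) => [w, 1]);
}

// reduce: sum the partial counts collected for one word.
function reduce(word, partialCounts) {
  let sum = 0;
  for (const pc of partialCounts) sum += pc;
  return [word, sum];
}

// Stand-in for the framework's shuffle phase: group emitted values by key.
function shuffle(pairs) {
  const groups = new Map();
  for (const [key, value] of pairs) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  return groups;
}

// Run map over every document, group by word, then reduce each group.
function wordCount(documents) {
  const pairs = Object.entries(documents).flatMap(([name, text]) => map(name, text));
  const result = {};
  for (const [word, counts] of shuffle(pairs)) {
    const [key, total] = reduce(word, counts);
    result[key] = total;
  }
  return result;
}

const counts = wordCount({ 'a.txt': 'big data big insights', 'b.txt': 'Big data' });
console.log(counts); // big: 3, data: 2, insights: 1
```

On a real cluster the map calls run in parallel on different nodes and each reduce call receives the values for one key from all of them; the logic of the two functions stays the same.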
28. Solving the same task with Pig and Hive
Pig - Procedural
Users = load 'users' as (name, age, ipaddr);
Clicks = load 'clicks' as (user, url, value);
ValuableClicks = filter Clicks by value > 0;
UserClicks = join Users by name, ValuableClicks by user;
Geoinfo = load 'geoinfo' as (ipaddr, dma);
UserGeo = join UserClicks by ipaddr, Geoinfo by ipaddr;
ByDMA = group UserGeo by dma;
ValuableClicksPerDMA = foreach ByDMA generate group, COUNT(UserGeo);
store ValuableClicksPerDMA into 'ValuableClicksPerDMA';
Hive - Declarative
-- uc is an alias for the subquery (the name is arbitrary)
insert into table ValuableClicksPerDMA
select geoinfo.dma, count(*)
from geoinfo join (
  select users.name, users.ipaddr
  from users join clicks on (users.name = clicks.user)
  where clicks.value > 0
) uc on (geoinfo.ipaddr = uc.ipaddr)
group by geoinfo.dma;
https://developer.yahoo.com/blogs/hadoop/comparing-pig-latin-sql-constructing-data-processing-pipelines-444.html
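To see what both scripts compute, here is the same filter, join, group, and count pipeline on a handful of in-memory records. The sample rows are invented, and the sketch assumes that user names and IP addresses are unique in their lookup tables, so a `Map` lookup behaves like an inner join.

```javascript
// Hypothetical sample records; the field names follow the schemas on the slide.
const users = [
  { name: 'ann', age: 31, ipaddr: '10.0.0.1' },
  { name: 'bob', age: 45, ipaddr: '10.0.0.2' },
];
const clicks = [
  { user: 'ann', url: '/a', value: 2 },
  { user: 'ann', url: '/b', value: 0 },
  { user: 'bob', url: '/c', value: 5 },
];
const geoinfo = [
  { ipaddr: '10.0.0.1', dma: 501 },
  { ipaddr: '10.0.0.2', dma: 501 },
];

// Count valuable clicks (value > 0) per DMA.
function valuableClicksPerDma(userRows, clickRows, geoRows) {
  const userByName = new Map(userRows.map((u) => [u.name, u]));
  const dmaByIp = new Map(geoRows.map((g) => [g.ipaddr, g.dma]));
  const countByDma = new Map();
  for (const click of clickRows) {
    if (!(click.value > 0)) continue;                 // filter clicks by value > 0
    const user = userByName.get(click.user);          // join with users on name
    if (!user || !dmaByIp.has(user.ipaddr)) continue; // inner joins drop unmatched rows
    const dma = dmaByIp.get(user.ipaddr);             // join with geoinfo on ipaddr
    countByDma.set(dma, (countByDma.get(dma) || 0) + 1); // group by dma and count
  }
  return countByDma;
}

const perDma = valuableClicksPerDma(users, clicks, geoinfo);
console.log([...perDma.entries()]); // one entry: DMA 501 with 2 valuable clicks
```

The Pig version spells out these steps one relation at a time, while the Hive version states the result and leaves the ordering of steps to the query planner.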
36. Services – Service Bus / Event Hub
Overview
Service Bus: Relay | Queue | Topic | Notification | Event Hub
Interactive Dashboard(s) | Production Line(s)
37. Services – Service Bus / Event Hub
Partitions
Service Bus
Interactive Dashboard(s) | Production Line(s)
* 1 million producers
* 1 MB/sec aggregate per Event Hub
Event Hub
Reader 1 | Reader 2 | Reader 3 | …
Reader 1 | Reader 2 | Reader 3 | …
Consumer Group
Throughput Units
1 MB/s writes
2 MB/s reads
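The throughput unit figures above (1 MB/s writes, 2 MB/s reads per unit) allow a rough sizing estimate. The sketch below uses only those two numbers; the device count, message size, and number of consumer groups are hypothetical, and any other service limits are ignored.

```javascript
// Per-throughput-unit limits as stated on the slide.
const WRITE_MB_PER_SEC_PER_UNIT = 1;
const READ_MB_PER_SEC_PER_UNIT = 2;

// Number of throughput units needed to cover both the write and the read load.
function requiredThroughputUnits(writeMbPerSec, readMbPerSec) {
  const forWrites = Math.ceil(writeMbPerSec / WRITE_MB_PER_SEC_PER_UNIT);
  const forReads = Math.ceil(readMbPerSec / READ_MB_PER_SEC_PER_UNIT);
  return Math.max(1, forWrites, forReads);
}

// Hypothetical load: 2,000 devices, each sending one 512-byte message per second,
// read by two consumer groups that each receive the full stream.
const devices = 2000;
const bytesPerMessage = 512;
const writeMbPerSec = (devices * bytesPerMessage) / 1e6; // 1.024 MB/s
const readMbPerSec = writeMbPerSec * 2;                  // 2.048 MB/s

console.log(requiredThroughputUnits(writeMbPerSec, readMbPerSec)); // 2
```

Because every consumer group reads the whole stream, adding dashboards or processors raises the read side of the estimate even when the devices send nothing extra.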
38. Stream Analytics
Real-time stream processing in the cloud
Stream millions of events per second
Perform real-time analytics
Correlate across multiple streams of data
Reliable performance and predictable results
No hardware to deploy
Rapid development with familiar SQL-like language
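A typical real-time analytics task is a windowed aggregate, for example the average temperature per device over fixed time windows. The following is a plain in-memory sketch of that idea, not the Stream Analytics query language; the event fields (`deviceId`, `timestamp`, `temperature`) and the function name `tumblingAverage` are hypothetical.

```javascript
// Group events into fixed, non-overlapping (tumbling) windows and average a
// value per device in each window.
function tumblingAverage(events, windowMs) {
  const buckets = new Map();
  for (const { deviceId, timestamp, temperature } of events) {
    // Start of the window this event falls into.
    const windowStart = Math.floor(timestamp / windowMs) * windowMs;
    const key = `${deviceId}@${windowStart}`;
    const bucket = buckets.get(key) || { deviceId, windowStart, sum: 0, count: 0 };
    bucket.sum += temperature;
    bucket.count += 1;
    buckets.set(key, bucket);
  }
  return [...buckets.values()].map(({ deviceId, windowStart, sum, count }) => ({
    deviceId,
    windowStart,
    average: sum / count,
  }));
}

// Hypothetical events with timestamps in milliseconds.
const events = [
  { deviceId: 'd1', timestamp: 1000, temperature: 20 },
  { deviceId: 'd1', timestamp: 4000, temperature: 22 },
  { deviceId: 'd1', timestamp: 11000, temperature: 30 },
  { deviceId: 'd2', timestamp: 2000, temperature: 10 },
];
const windows = tumblingAverage(events, 10000);
console.log(windows);
```

A streaming service evaluates such windows continuously as events arrive, rather than over a finished array as done here.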
40. BIG Data To Better Insights
41. Q&A
A Powerful New Way To Work With Data
Self-service business intelligence with familiar Excel and the power of the cloud
43. Easily Discover And Access Data
From Internet | From File | From Database | And More…
44. Analyzing Data With Excel
Easily discover and access public and corporate data with Power Query
Model & analyze hundreds of millions of rows lightning fast with Power Pivot
Explore and visualize data in new ways with Power View and Power Map
47. Modules
▪ Accelerometer
▪ Ambient Light + Sound
▪ Audio
▪ Bluetooth Low Energy
▪ Camera
▪ Climate
▪ GPS
▪ GPRS
▪ Infrared
▪ MicroSD Card
▪ nRF24 Module
▪ Relay
▪ RFID
▪ Servo
48. What can you do with a Tessel?
▪ Ambient monitoring: monitor temperature, noise… Detect variations and take action / notify.
– Is the light on at home? Turn on Hue lights automatically at dark.
▪ Accelerometer: game controllers, activity trackers…
▪ Camera: take pictures on event, motion detection…
▪ Infrared: control your TV
– Clap your hands to turn it on
▪ Lots of project ideas: https://projects.tessel.io/projects
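The "detect variations and take action" idea from the ambient monitoring bullet can be prototyped without any hardware. The sketch below compares each reading with a slowly moving baseline; the threshold, the smoothing factor, and the sample light levels are assumptions, and in a real project the readings would come from a Tessel module instead of an array.

```javascript
// Returns a function that compares each reading with a running baseline and
// reports whether the reading deviates from it by more than `threshold`.
function createVariationDetector(threshold, smoothing = 0.2) {
  let baseline = null;
  return function check(reading) {
    if (baseline === null) {
      baseline = reading; // the first reading initialises the baseline
      return false;
    }
    const deviated = Math.abs(reading - baseline) > threshold;
    // Exponential moving average keeps the baseline tracking slow drift.
    baseline = baseline + smoothing * (reading - baseline);
    return deviated;
  };
}

// Hypothetical light levels between 0 and 1; the last one simulates the light going off.
const isVariation = createVariationDetector(0.3);
const readings = [0.80, 0.82, 0.79, 0.20];
const flags = readings.map((reading) => isVariation(reading));
console.log(flags); // [ false, false, false, true ]
```

When a flag comes back `true`, the device can take the action of your choice, such as sending a notification or switching a relay.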
49. Node.JS for the Tessel
▪ Node.js is usually used on the server-side; here we are going to use it on the client side!
▪ Node.js is well suited to real-time processing of events, thanks to its asynchronous nature; this is well adapted to a device whose main job is to monitor and process events (temperature / noise / light / etc.)
▪ Instead of listening to server-side events (GET, POST, etc.) you will be listening to module-specific events
▪ Events are handled using callbacks, functions that you pass when registering for the event
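The callback pattern described above can be tried on any machine with Node's built-in `events` module. `FakeClimateModule` below is a hypothetical stand-in for a hardware module, and the event names `ready` and `temperature` are assumptions; real Tessel modules define their own events, so check each module's documentation.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for a hardware module. It replays canned readings as
// 'temperature' events; a real module would emit when the sensor produces data.
class FakeClimateModule extends EventEmitter {
  start(readings) {
    this.emit('ready');
    for (const value of readings) this.emit('temperature', value);
  }
}

const climate = new FakeClimateModule();
const log = [];

// Register callbacks first, much as you would register GET/POST handlers on a server.
climate.on('ready', () => log.push('ready'));
climate.on('temperature', (celsius) => {
  if (celsius > 30) log.push(`too hot: ${celsius}`);
});

// Nothing happens until the module starts emitting events.
climate.start([22.5, 31.0, 24.0]);
console.log(log); // [ 'ready', 'too hot: 31' ]
```

The handlers hold all of the application logic; the main script only wires them up and then waits for events.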
50. Hello World: tessel run blinky.js
// Import the interface to Tessel hardware
var tessel = require('tessel');
// Set the led pins as outputs with initial states
// Truthy initial state sets the pin high
// Falsy sets it low.
var led1 = tessel.led[0].output(1);
var led2 = tessel.led[1].output(0);
setInterval(function () {
  console.log("I'm blinking! (Press CTRL + C to stop)");
  // Toggle the led states
  led1.toggle();
  led2.toggle();
}, 100);
51. More getting started: Wi-Fi
▪ Connect to local Wi-Fi – ExpoGeorgia
– User: pav#3
– Pass: 201567890
▪ OR
▪ Revert to using phone hotspot
▪ tessel wifi -n "iPhone 6" -p "Pass1234"
52. What can you do with Azure?
▪ In theory, anything you can do in Node.js
– In practice, some complex modules or projects will cause translation problems because not all Node constructs are fully supported
– Most notably, the Azure SDKs for Node.js seem to be causing some problems
– It might be easier to revert to plain old REST APIs when possible
▪ Upload stuff to Azure: Blob Storage
▪ Send monitoring/telemetry to Azure: Service Bus, Event Hubs
– Experiment with different protocols: HTTPS, AMQP, MQTT…
▪ Interact with mobile devices through Mobile Services
– Send notifications
▪ Samples on http://gist.github.com/tomconte and http://hypernephelist.com
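If you go the plain REST route for Service Bus or Event Hubs, each request needs a Shared Access Signature token in its Authorization header. The sketch below builds such a token with Node's built-in `crypto` module. The namespace, hub name, key name, and key are placeholders, and whether `crypto` behaves the same on the Tessel runtime is an assumption to verify on the board.

```javascript
const crypto = require('crypto');

// Build a Shared Access Signature token for a Service Bus / Event Hubs resource.
// expirySeconds is the expiry time as seconds since the Unix epoch.
function createSasToken(resourceUri, keyName, key, expirySeconds) {
  const encodedUri = encodeURIComponent(resourceUri);
  // The signed string is the URL-encoded resource URI, a newline, and the expiry.
  const stringToSign = `${encodedUri}\n${expirySeconds}`;
  const signature = crypto
    .createHmac('sha256', key)
    .update(stringToSign, 'utf8')
    .digest('base64');
  return (
    `SharedAccessSignature sr=${encodedUri}` +
    `&sig=${encodeURIComponent(signature)}` +
    `&se=${expirySeconds}&skn=${keyName}`
  );
}

// Placeholder values; substitute your own namespace, hub, policy name, and key.
const resourceUri = 'https://mynamespace.servicebus.windows.net/myhub';
const expiry = Math.floor(Date.now() / 1000) + 3600; // valid for one hour
const token = createSasToken(resourceUri, 'send', 'placeholder-key', expiry);
console.log(token);
```

The token is then sent as the `Authorization` header of an HTTPS POST carrying the telemetry payload; the exact path and the other required headers are listed in the Event Hubs REST reference.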
53. Let’s hack!
▪ Grab your hardware
▪ Pair up
– Might be best to have one person who knows JS/Node per pair
▪ Get something done in 4 hours
– Install Node, Tessel module, plug in board, upgrade firmware
– Use Notepad++ / Sublime Text / Visual Studio or whatever
– Do the Hello World thing
– Get connected to Wi-Fi
– Plug in a module, test it
– Do the lab https://github.com/Dx-ted-emea/iot-labs
– For advanced participants: HDInsight, use in ML, connect to mobile device
– Present your results/learnings/findings in the last 30 minutes