9. Computational Network Toolkit (CNTK)
Vowpal Wabbit XGBoost Rattle
mxnet Weka Apache Drill
CUDA, CUDNN, Nvidia Driver
GPU based Deep
Learning Tools
Use Deep Learning
Toolkit for DSVM
Ubuntu Edition Only
* Microsoft
Cognitive Toolkit
(CNTK)
Y Y
* Tensorflow Y Y
* MXNet Y Y
* Caffe & Caffe2 N Y
* Torch N Y
* Theano N Y
* Keras N Y
* NVidia Digits N Y
* CUDA, CUDNN,
Nvidia Driver
Y Y
13. Platform Services
Infrastructure Services
Web Apps
Mobile
Apps
API
Management
API Apps
Logic Apps
Notification
Hubs
Content
Delivery
Network (CDN)
Media
Services
BizTalk
Services
Hybrid
Connections
Service Bus
Storage
Queues
Hybrid
Operations
Backup
StorSimple
Azure Site
Recovery
Import/Export
SQL
Database
DocumentDB
Redis
Cache
Azure
Search
Storage
Tables
Data
Warehouse Azure AD
Health Monitoring
AD Privileged
Identity
Management
Operational
Analytics
Cloud
Services
Batch
RemoteApp
Service
Fabric
Visual Studio
App
Insights
Azure
SDK
VS Online
Domain Services
HDInsight Machine
Learning
Stream
Analytics
Data
Factory
Event
Hubs
Mobile
Engagement
Data
Lake
IoT Hub
Data
Catalog
Security &
Management
Azure Active
Directory
Multi-Factor
Authentication
Automation
Portal
Key Vault
Store/
Marketplace
VM Image Gallery
& VM Depot
Azure AD
B2C
Scheduler
The Azure Platform
14. 様々なアプリ開発が行われています
Web & mobile Business apps Microservice apps
Development & test Big data & analytics Internet of Things
Backup, recovery
& archive
High Performance
Computing
Digital media
43. Azure SQL Database Query Performance Insight
https://docs.microsoft.com/ja-jp/azure/sql-database/sql-database-query-performance
Query Store is turned on by default for Azure SQL Database
https://azure.microsoft.com/ja-jp/updates/query-store-on-is-the-new-default-for-azure-sql-database/
66. REFERENCE ASSEMBLY WebLogExtASM;
@rs =
EXTRACT
UserID string,
Start DateTime,
End DateTime,
Region string,
SitesVisited string,
PagesVisited string
FROM “/Logs/WebLogRecords.txt”
USING WebLogExtractor ();
@result = SELECT UserID,
(End.Subtract(Start)).TotalSeconds AS Duration
FROM @rs ORDER BY Duration DESC FETCH 10;
OUTPUT @result TO “/Logs/Results/top10.tsv"
USING Outputter.Tsv();
• 型定義は C# の型定義と同じ
• データをファイルから抽出・読み込み
するときに、スキーマが必要
Data Lake Store 内 のファイル
独自形式を解析するカスタム関数
C# の関数
行セット:
(中間テーブルの概念
に近い)
TSV形式で書き込む関数
69. 従来型の処理・分析 Azure Data Lake を中心とした処理・分析
Business
apps
Custom
apps
Sensors
and devices
ADL Store
People
非構造化データも
含めてあらゆる
データを格納
Azure SQL
DW
Azure AD
Power BI
ADF
ADL
Analytics
• 処理・分析業務の大半はデータ準備作業が占める
• 処理・分析業務に手間・時間が必要
Business
apps
Custom
apps
Sensors
and devices
HDInsight
ユーザー管理、認証
データの連携
Power BI
File System
Database
Database
Hadoop
DWH
Data Mart
70. デバイス
Machine
Learning
Stream
Analytics
SQL
Database
Azure
Storage
HDInsight
(Hadoop)
Event Hubs
BIツール
(Power BI など)
機器
制御装置
Stream Analytics
Data Factory
Data Lake
Store
SQL
Data
Warehouse
業務システム
Machine Learning
API
IoT Hub
Document
DB
Data Lake
Analytics service
Revolution R
Enterprise
Recommendations,
customer churn,
forecasting, etc.
Face, vision Speech, text
Cognitive
Services
①大量データの
受け入れ
②リアルタイム処理
データの集約
③データの蓄積
・構造化
・非構造化
・文書
など様々な形式
での保存
④データの加
工・移行
⑥機械学習
⑦Hadoop解析
⑧マイクロソフト技
術を用いた分散
解析
⑨・通常のR
・MSが機能を
追加した企業
向けR
を使用した解析
⑪クラウドベースの Self-
Service BI 機能
⑫機械学習の予測
モデルを業務システ
ムなどから呼び出し
可能にするサービス
外部クラウドなど
からのデータ取り
込み
②リアルタイム処理
データの加工
⑤ディープラーニング
⑩Excelを用いた可視化
95. Microsoft data platform solutions
Product Category Description More Info
SQL Server 2016 RDBMS Earned top spot in Gartner’s Operational Database magic
quadrant. JSON support. Linux TBD
https://www.microsoft.com/en-us/server-
cloud/products/sql-server-2016/
SQL Database RDBMS/DBaaS Cloud-based service that is provisioned and scaled quickly.
Has built-in high availability and disaster recovery. JSON
support
https://azure.microsoft.com/en-
us/services/sql-database/
SQL Data Warehouse MPP RDBMS/DBaaS Cloud-based service that handles relational big data.
Provision and scale quickly. Can pause service to reduce
cost
https://azure.microsoft.com/en-
us/services/sql-data-warehouse/
Analytics Platform System (APS) MPP RDBMS Big data analytics appliance for high performance and
seamless integration of all your data
https://www.microsoft.com/en-us/server-
cloud/products/analytics-platform-
system/
Azure Data Lake Store Hadoop storage Removes the complexities of ingesting and storing all of
your data while making it faster to get up and running with
batch, streaming, and interactive analytics
https://azure.microsoft.com/en-
us/services/data-lake-store/
Azure Data Lake Analytics On-demand analytics job
service/Big Data-as-a-
service
Cloud-based service that dynamically provisions resources
so you can run queries on exabytes of data. Includes U-
SQL, a new big data query language
https://azure.microsoft.com/en-
us/services/data-lake-analytics/
HDInsight PaaS Hadoop
compute/Hadoop
clusters-as-a-service
A managed Apache Hadoop, Spark, R, HBase, Kafka, and
Storm cloud service made easy
https://azure.microsoft.com/en-
us/services/hdinsight/
DocumentDB PaaS NoSQL: Document
Store
Get your apps up and running in hours with a fully
managed NoSQL database service that indexes, stores, and
queries data using familiar SQL syntax
https://azure.microsoft.com/en-
us/services/documentdb/
Azure Table Storage PaaS NoSQL: Key-value
Store
Store large amount of semi-structured data in the cloud https://azure.microsoft.com/en-
us/services/storage/tables/
98. They can all use Azure Analysis
Services
Azure Analysis Services
99. Azure Analysis Services
BI semantic model
Business logic & metrics
Data modeling
Security
Azure Analysis Services
Lifecycle management
In-memory
cache
On-premises
Cloud
Data sources
SQL Database
SQL Data Warehouse
Other data sources
SQL Server
Analytics platform
system
Other data sources
On-premises
Cloud
Client tools
Power BI
Excel
Third party BI tools
Power BI Desktop
104. DOMINOS STORES
Key Vault
Azure Active Directory
Azure Files
SSIS FRANCHISEE
AZURE WEST EUROPE
Azure Function
KEMPSSDE
SSRS
IN- AND EXTERNAL DATA SOURCES
107. Use Bindings in Your Code function.json
"bindings": [
{
"type": "httpTrigger",
"direction": "in",
"webHookType": "genericJson",
"name": "req"
},
{
"type": "http",
"direction": "out",
"name": "res"
},
{
"type": "queue",
"name": "eventOutput",
"queueName": "aievents1",
"connection":"AiStorageConnection",
"direction": "out"
}
]
public static class OrderHandler
{
[FunctionName("OrderWebhook")]
public static async Task<HttpResponseMessage> Run(
[HttpTrigger] HttpRequestMessage req,
[Queue("aievents1", Connection = "AiStorageConnection")]
IAsyncCollector<String> eventOutput,
TraceWriter log)
{
log.Info($"Webhook was triggered!");
string jsonContent = await req.Content.ReadAsStringAsync();
dynamic data = JsonConvert.DeserializeObject(jsonContent);
await eventOutput.AddAsync(
JsonConvert.SerializeObject(GetLogData(data)));
int orderId = PlaceOrder(data);
return req.CreateResponse(HttpStatusCode.OK,
new {orderNumber = orderId });
}
. . .
}
108. • Workflow in the cloud
• Powerful control flow
• Connect functions and
APIs
• Declarative definition to
persist in source control
and drive deployments
109. Logic Apps
Cloud APIs and platform
• Supports over 125 built-in connectors
• Scales to meet your needs
• Enables rapid development
• Extends with custom APIs and
Functions
API connections
• Authenticate once and reuse
113. Microsoft data platform solutions
Product Category Description More Info
SQL Server 2016 RDBMS Earned top spot in Gartner’s Operational Database magic
quadrant. JSON support. Linux TBD
https://www.microsoft.com/en-us/server-
cloud/products/sql-server-2016/
SQL Database RDBMS/DBaaS Cloud-based service that is provisioned and scaled quickly.
Has built-in high availability and disaster recovery. JSON
support
https://azure.microsoft.com/en-
us/services/sql-database/
SQL Data Warehouse MPP RDBMS/DBaaS Cloud-based service that handles relational big data.
Provision and scale quickly. Can pause service to reduce cost
https://azure.microsoft.com/en-
us/services/sql-data-warehouse/
Analytics Platform System (APS) MPP RDBMS Big data analytics appliance for high performance and
seamless integration of all your data
https://www.microsoft.com/en-us/server-
cloud/products/analytics-platform-system/
Azure Data Lake Store Hadoop storage Removes the complexities of ingesting and storing all of your
data while making it faster to get up and running with batch,
streaming, and interactive analytics
https://azure.microsoft.com/en-
us/services/data-lake-store/
Azure Data Lake Analytics On-demand analytics job
service/Big Data-as-a-
service
Cloud-based service that dynamically provisions resources so
you can run queries on exabytes of data. Includes U-SQL, a
new big data query language
https://azure.microsoft.com/en-
us/services/data-lake-analytics/
HDInsight PaaS Hadoop
compute/Hadoop
clusters-as-a-service
A managed Apache Hadoop, Spark, R, HBase, Kafka, and
Storm cloud service made easy
https://azure.microsoft.com/en-
us/services/hdinsight/
DocumentDB PaaS NoSQL: Document
Store
Get your apps up and running in hours with a fully managed
NoSQL database service that indexes, stores, and queries
data using familiar SQL syntax
https://azure.microsoft.com/en-
us/services/documentdb/
Azure Table Storage PaaS NoSQL: Key-value
Store
Store large amount of semi-structured data in the cloud https://azure.microsoft.com/en-
us/services/storage/tables/
114. Microsoft Big Data Portfolio
SQL Server Stretch
Business intelligence
Machine learning analytics
Insights
Azure SQL Database
SQL Server 2016
SQL Server 2016 Fast Track
Azure SQL DW
ADLS & ADLA
DocumentDB
HDInsight
Hadoop
Analytics Platform System
Sequential Scale Out + AcrossScale Up
Key
Relational Non-relational
On-premisesCloud
Microsoft has solutions covering
and connecting all four
quadrants – that’s why SQL
Server is one of the most utilized
databases in the world
115. Azure
Data Lake Store
Azure
Blob Storage
Purpose Optimized for big data analytics General purpose bulk storage
Use Cases Batch, Interactive, Streaming App backend, backup data, media storage for
streaming
Units of Storage Accounts / Folders / Files Accounts / Containers / Blobs
Structure Hierarchical File System Flat namespace
WebHDFS Implements WebHDFS No (WASB)
Security AD SAS keys
Storage Auto Shared/Files chunked Manually manage expansion/Files intact
Service State Generally Available. PolyBase
just supported!
Generally Available
Billing Pay for data stored and for I/O Pay for data stored and for I/O
Region Availability Two US regions (Other regions coming soon) All Azure Regions
ADL Store vs Blob Store
https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-comparison-with-blob-storage
116. Want
Hadoop?
Need exact
same on-
prem
Need
interactive /
streaming?
Mandatory
No strong opinion
Azure Marketplace (IaaS)
• Need all workloads exactly like on-
premises
• Need 100% Hortonworks/Cloudera/MapR
Azure HDInsight
• Most Hadoop workloads
• Fully managed by Microsoft
• Sell HDI + ADLS
• Stickier to Microsoft than VMs
• Can do interactive (Spark) and streaming
(Storm/Spark)
Azure Data Lake Analytics
• Easiest experience for admin: no sense of
clusters, instant scale per job
• Easiest experience for developers: Visual
Studio/U-SQL (C#+SQL)
• Sell ADLA + ADLS
• Batch workloads only
Need everything exactly
like on-prem
Need core
projects Yes Batch is OK
Always present
ADLA if .NET or
Visual Studio Shop
If .NET or
VS shop?
117. Azure SQL DW HDInsight Hive HDInsight Spark Azure Data Lake SQL Server (IaaS)
Volume Petabytes Petabytes Petabytes Petabytes Terabytes
Security Encryption, TD,
Audit
ADLS / Apache
Ranger
ADLS AAD Security
Groups (data)
Encryption, TD
Audit
Languages T-SQL HiveQL SparkSQL, HiveQL,
Scala, Java, Python,
R
U-SQL T-SQL
Extensibility No Yes, .NET/SerDe Yes, Packages Yes, .NET Yes, .NET CLR
External File
Types
ORC, TXT,
Parquet, RCFile
ORC, CSV, Parquet
+ others
Parquet, JSON,
Hive + others
Many ORC, TXT, Parquet,
RCFile
Admin Low-Medium Medium-High Medium-High Low High
Cost Model DWU Nodes & VM Nodes & VM Units/Jobs VM
Schema
Definition
Schema on Write
/ Polybase
Schema on Read Schema on Read Schema on Read Schema on Write /
Polybase
Max DB Size 240TB Comp (5X
= 1PB)
Unlimited 64TB (64 1TB drives)