Hadoop in the Cloud
The What, Why and How
from the Experts
SATO Naoki (@satonaoki)
Azure Technologist
Microsoft Japan
Hadoop
in the Cloud
Traditional Hadoop Clusters
Challenges with implementing Hadoop
Hadoop Clusters in the Cloud
Why Hadoop in the cloud?
Hadoop Clusters
Distributed Storage
• Files split across storage
• Files replicated
• Nearest node responds
• Abstracted Administration
Extensible
• APIs to extend functionality
• Add new capabilities
• Allow for inclusion in custom environments
Automated Failover
• Unmonitored failover to replicated data
• Built for resiliency
• Metadata stored for later retrieval
Hyper-Scale
• Add resources as desired
• Built to include commodity configs
• Direct correlation of performance and resources
Distributed Compute
• Distributed processing
• Resource Utilization
• Cost-Efficient method calls
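To make the "files split across storage" and "files replicated" bullets concrete, here is a minimal Python sketch. It is an illustration only: the function names, node names, and round-robin placement are assumptions for this example, and real HDFS uses its own (rack-aware) placement policy rather than this one.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a file's bytes into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: list[str], replication: int) -> dict[int, list[str]]:
    """Assign each block to `replication` distinct nodes, round-robin.

    Round-robin is a stand-in chosen for readability, not the HDFS policy.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds node count")
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


def readable_after_failure(placement: dict[int, list[str]], failed_node: str) -> bool:
    """True if every block still has at least one replica on a surviving node."""
    return all(
        any(node != failed_node for node in replicas)
        for replicas in placement.values()
    )


# Hypothetical four-node cluster storing a 10-byte "file" in 4-byte blocks.
blocks = split_into_blocks(b"x" * 10, block_size=4)
placement = place_replicas(len(blocks), ["n1", "n2", "n3", "n4"], replication=3)
```

With three replicas per block, losing any single node leaves every block readable, which is the property the "Automated Failover" bullets rely on.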
Cloud
(the same five properties: Distributed Storage, Extensible, Automated Failover, Hyper-Scale, Distributed Compute)
Hadoop in the Cloud
(the same five properties: Distributed Storage, Extensible, Automated Failover, Hyper-Scale, Distributed Compute)
Hadoop in the Cloud - Options
Scenarios for deploying Hadoop as hybrid
Traditional Hadoop Clusters – On Prem
[Diagram labels: Client; Job (jar) file; Hadoop Cluster; Master Node; Worker Node with Task Tracker, Tasks, and HDFS]
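The on-premises diagram shows a job being handed to the cluster and run as tasks on worker nodes that also hold the data. A toy Python sketch of that flow follows, using a word count as the job. The worker names and the `run_job` helper are hypothetical, and everything runs in one process here; it only mirrors the shape of the map and reduce phases, not Hadoop's actual scheduler.

```python
from collections import Counter


def map_task(block: str) -> Counter:
    """Map phase: count the words in one locally stored block."""
    return Counter(block.split())


def reduce_task(partials: list[Counter]) -> Counter:
    """Reduce phase: merge the per-block counts into one result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_job(blocks_by_worker: dict[str, list[str]]) -> Counter:
    """Stand-in for the master node: run one map task per block on the
    worker that holds it, then reduce the partial results."""
    partials = []
    for _worker, blocks in blocks_by_worker.items():
        for block in blocks:
            partials.append(map_task(block))
    return reduce_task(partials)


# Hypothetical cluster: two workers, each holding some blocks of the input.
result = run_job({"worker-1": ["a b a"], "worker-2": ["b c"]})
```

The point of the sketch is data locality: each map task reads only the block stored on its own worker, and only the small partial counts are combined afterwards.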
Hadoop Clusters in the Cloud
Azure HDInsight
Hadoop and Spark as a Service on Azure
• Fully managed Hadoop and Spark for the cloud
• 100% Open Source Hortonworks Data Platform
• Clusters up and running in minutes
• Managed, monitored and supported by Microsoft with the industry’s best enterprise SLA
• Use familiar BI tools for analysis, or open source notebooks for interactive data science
• 63% lower total cost of ownership than deploying your own Hadoop on-premises*
*IDC study “The Business Value and TCO Advantage of Apache Hadoop in the Cloud with Microsoft Azure HDInsight”
HDInsight Cluster Architecture
[Diagram labels: Azure Virtual Network; HTTPS traffic (ODBC/JDBC, WebHCatalog, Oozie, Ambari); Secure gateway (AuthN, HTTP Proxy); Highly available head nodes; Worker nodes; Azure Data Lake Store]
Decoupling Compute from Storage
Network: Latency? Consistency? Bandwidth?
Decoupling Compute from Storage
Network:
• HDD-like latency
• 50 Tb+ aggregate bandwidth [1]
• Strong consistency
[1] Azure Flat Network Architecture
Decoupling - Benefits
Azure Data Lake Store
A hyper scale repository for big data analytics workloads
• Hadoop File System (HDFS) for the cloud
• No limits to scale
• Store any data in its native format
• Enterprise grade access control and encryption
• Optimized for analytic workload performance
Cluster customization options
• Ad hoc: RDP to cluster, update config files (non-durable)
• Via Azure portal: Hive/Oozie Metastore, Storage accounts & VNETs, ScriptAction
• Via scripting / SDK: Config values, JAR file placement in cluster
HDInsight cluster provisioning states
[Diagram labels: Ready for deployment; Accepted; Cluster storage provisioned; Azure VM configuration; Configuring HDInsight; Customize cluster? (Yes / No); Cluster customization (custom script running); Cluster operational; Running; Timed Out; Error]
Cluster integration options
Each cluster surfaces a REST endpoint for integration, secured via basic authN over SSL
• /thrift – ODBC & JDBC
• /Templeton – Job submission, metadata management
• /ambari – Cluster health, monitoring
• /oozie – Job orchestration, scheduling
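As a rough illustration of "basic authN over SSL", the following Python sketch builds (but does not send) an HTTPS request with a Basic Authorization header. The cluster name, credentials, and the `build_request` helper are hypothetical. The host pattern `<cluster>.azurehdinsight.net` and the WebHCat status path `templeton/v1/status` do not appear in the slides and are assumptions here; check the service documentation before relying on them.

```python
import base64
import urllib.request


def build_request(cluster_name: str, path: str, user: str, password: str) -> urllib.request.Request:
    """Build an HTTPS request carrying HTTP Basic credentials.

    The host pattern below is an assumption, not taken from the slides.
    """
    url = f"https://{cluster_name}.azurehdinsight.net/{path.lstrip('/')}"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


# Hypothetical cluster and credentials; nothing is sent over the network here.
req = build_request("mycluster", "/templeton/v1/status", "admin", "secret")

# To actually call the endpoint, one would pass the request to urlopen:
# with urllib.request.urlopen(req) as response:
#     print(response.read().decode("utf-8"))
```

Because Basic credentials are only base64-encoded, not encrypted, they are safe only over the SSL/TLS channel the slide mentions, which is why the sketch hard-codes `https://`.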
Cloud Deployments for Big Data
Introducing Cortana Intelligence Suite
[Diagram: Data, Intelligence, Action]
• Data Sources: Apps; Sensors and devices
• Information Management: Event Hubs, Data Catalog, Data Factory
• Big Data Stores: SQL Data Warehouse, Data Lake Store
• Machine Learning and Analytics: HDInsight (Hadoop and Spark), Stream Analytics, Data Lake Analytics, Machine Learning
• Intelligence: Cortana, Bot Framework, Cognitive Services
• Dashboards & Visualizations: Power BI
• Action: People; Automated Systems; Apps (Web, Mobile, Bots)
Where Big Data is a cornerstone
[Diagram: the same Cortana Intelligence Suite components as on the previous slide]
[Diagram labels]
Stages: Data Generation, Streaming, Consumption, Processing, Storage; Operational, Analytical/Exploratory
Components: Excel BI, Power BI, Mahout, HiveQL, HIVE, Sqoop, Pig, Azure Data Lake Analytics, HBase on Azure HDInsight, Big Data Sources (Raw Unstructured), Log files, Storm for Azure HDInsight, Azure Stream Analytics, Spark Streaming for Azure HDInsight, Spark SQL, Spark MLlib, Azure Data Lake Store, U-SQL, Data Orchestration/Workflow, Azure Data Factory, Oozie for Azure HDInsight, Kafka for Azure HDInsight (future), SQL Server Integration Services, Azure Machine Learning, R Server, SQL Server R Services, SSRS, SharePoint BI, Transactional systems, Azure SQL DW, SQL Server APS, ETL, Azure Event Hubs, Data Warehouse, Azure Website, SSAS
Summary
• For more information on HDInsight visit: http://azure.com/hdinsight
• For more information on Data Lake visit: http://azure.com/datalake
http://microsoft-events.jp/mstechsummit/
© 2016 Microsoft Corporation. All rights reserved.

[Hadoop Summit 2016 Tokyo] Hadoop in the Cloud – The What, Why and How from the Experts
