HDP2.5 Updates
Yuta Imai
HDP 2.5 updates presented at the Big Data Analytics Technology Study Session @ NHN Techorus (http://futureofdata.connpass.com/event/40079/).
1.
Hortonworks Data Platform Updates
Yuta Imai, Hortonworks
© Hortonworks Inc. 2011 – 2016. All Rights Reserved
2.
Hortonworks Data Platform
3.
Hortonworks Data Platform: Release Strategy
• Extended Services: more frequent releases of Spark, Hive, Ambari and other Apache Data Access projects
• Hadoop Core: longer release arcs for core Apache Hadoop components: HDFS, YARN and MapReduce
[Timeline: 2016 to 2017]
4.
HORTONWORKS DATA PLATFORM
HDP 2.3 is Apache Hadoop; not “based on” Hadoop
Ongoing Innovation in Apache
Categories: Data Mgmt, Data Access, Governance & Integration, Operations, Security
Components: Hadoop & YARN, Flume, Oozie, Pig, Hive, Tez, Sqoop, Cloudbreak, Ambari, Slider, Kafka, Knox, Solr, Zookeeper, Spark, Falcon, Ranger, HBase, Atlas, Accumulo, Storm, Phoenix, Zeppelin
Releases: HDP 2.0 (Oct 2013), HDP 2.1 (April 2014), HDP 2.2 (Dec 2014), HDP 2.3 (Oct 2015), HDP 2.4 (Mar 2016), HDP 2.5* (2H2016)
[Figure: matrix of component versions per HDP release]
* HDP 2.5 – Shows current Apache branches being used. Final component version subject to change based on Apache release process.
** Spark 1.6.2 + Spark 2.0 – HDP 2.5 supports installation of both Spark 1.6.2 and Spark 2.0. Spark 2.0 is Technical Preview within HDP 2.5.
*** Hive 2.1 is Technical Preview within HDP 2.5.
5.
Hortonworks Data Platform 2.5 Key Highlights
• Interactive Query in Seconds: Hive with LLAP (Technical Preview)
• Enterprise Spark at Scale: Apache Zeppelin Notebook for Spark
• Real-Time Applications: Storm and HBase/Phoenix
• Streamlined Operations: Apache Ambari
• Dynamic Security: Apache Atlas + Ranger Integration
• Hortonworks Data Cloud (Technical Preview)
• Hortonworks HDB (Apache HAWQ)
6.
Interactive Query in Seconds: Hive with LLAP (Technical Preview)
7.
LLAP
8.
Hive 2 with LLAP: Enable Interactive Query in Seconds
• Developer Productivity: Interactive query in seconds
• Ease of Use and Adoption: 100% compatible with Hive SQL
• Enterprise Readiness: Linear scaling at terabytes volume of data
• Streamlined Operations: LLAP integration with Ambari with automated dashboards
9.
Why LLAP?
• People like Hive
• Disk->Mem is getting further away
– Cloud Storage isn’t co-located
– Disks are connected to the CPU via network
• Security landscape is changing
– Cells & Columns are the new security boundary, not files
– Safely masking columns needs a process boundary
• Concurrency, Performance & Scale are in conflict
– Concurrency at 100k queries/hour
– Latencies at 2-5 seconds/query
– Petabyte scale warehouses (with terabytes of “hot” data)
[Diagram: a node running an LLAP process with a cache and query fragments, on top of HDFS]
10.
What is LLAP?
• Hybrid model combining daemons and containers for fast, concurrent execution of analytical workloads (e.g. Hive SQL queries)
• Concurrent queries without specialized YARN queue setup
• Multi-threaded execution of vectorized operator pipelines
• Asynchronous IO and efficient in-memory caching
• Relational view of the data available through the API
• High performance scans, execution code pushdown
• Centralized data security
[Diagram: a node running an LLAP process with a cache and query fragments, on top of HDFS]
11.
Hive 2 with LLAP: Architecture Overview
• HiveServer2 (Query Endpoint): receives SQL queries over ODBC / JDBC
• Query Coordinators: several coordinators behind HiveServer2
• YARN Cluster: LLAP Daemons, each hosting Query Executors
• In-Memory Cache (Shared Across All Users)
• Deep Storage: HDFS and compatible stores (S3, WASB, Isilon)
12.
MR vs Tez vs Tez+LLAP
• Map – Reduce: intermediate results in HDFS
• Tez: optimized pipeline
• Tez with LLAP: resident process on nodes, with an in-memory columnar cache
• Map tasks read HDFS
[Diagram: task graphs of map (M), reduce (R) and Tez (T) tasks for the three execution models]
13.
So…
[Diagram: Tez task graph of map (M) and reduce (R) tasks]
14.
So…
[Diagram: task graphs for Tez and for Tez with LLAP (auto), with the AM]
15.
So…
[Diagram: task graphs for Tez, Tez with LLAP (auto) and Tez with LLAP (all), with their AMs]
16.
Hive 2 with LLAP: Preliminary Numbers
[Chart: Hive2.0 and LLAP, TPC-DS at 10 TB Scale, 18 Nodes. Series: Hive2.0-Tez and LLAP. Queries: q3, q7, q12, q13, q19, q21, q26, q27, q42, q43, q45, q52, q55, q60, q73, q84, q89, q91, q98. Vertical axis from 0 to 80.]
Min query time: Query 55: 2.38s
17.
ACID
18.
Key Features: EDW Offload
• ACID GA for Streaming and SQL:
– 50+ stabilization fixes.
– Tested at multi-terabyte scale with simultaneous ingest, delete and query.
• Better BI Tool Compatibility through Expanded OLAP Capabilities:
– Multi partition-by, multi order-by.
– Order by UDF/UDAF.
– Null order specification (nulls first or nulls last).
• Faster ETL with More Scalable Partition Loads:
– 2x faster dynamic partition loads.
• Procedural Extensions (Tech Preview):
– Procedural structures: loops, if/else.
– Determine min/max partition values.
– Copy data from external sources like FTP.
– Simplifies ETL / data load processes.
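To make the "null order specification" item concrete, here is a minimal sketch in plain Python of what NULLS FIRST and NULLS LAST mean for an ascending sort. It does not use Hive; the function name `order_by` and the sample rows are hypothetical and exist only for illustration.

```python
def order_by(rows, column, nulls_first=False):
    """Sort rows ascending on `column`, placing None (NULL) first or last.

    Assumes all non-None values in the column are mutually comparable.
    """
    def sort_key(row):
        value = row[column]
        is_null = value is None
        # The first tuple element decides where NULLs go;
        # the second orders the non-NULL values among themselves.
        null_rank = (not is_null) if nulls_first else is_null
        return (null_rank, 0 if is_null else value)

    return sorted(rows, key=sort_key)


# Hypothetical sample data.
rows = [{"v": 3}, {"v": None}, {"v": 1}]
print([r["v"] for r in order_by(rows, "v")])
print([r["v"] for r in order_by(rows, "v", nulls_first=True)])
```

The point of an explicit null order is that BI tools can state where NULLs should land instead of relying on an engine default.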
19.
HCatalog Stream Mutation API
[Diagram: the HCatalog Stream Mutation API writing ORC files into the buckets of a table on HDFS]
20.
SQL Compliance
21.
Apache Hive: Journey to SQL:2011 Analytics
Legend: Existing; Projected: HDP 3.0; Projected: HDP 2.5
Track Hive SQL Complete: HIVE-13554
Data Types
• Numeric: FLOAT/DOUBLE, DECIMAL, INT/TINYINT/SMALLINT/BIGINT, BOOLEAN
• String: CHAR / VARCHAR, STRING, BINARY
• Date, Time: DATE, TIMESTAMP, Interval Types
• Complex Types: ARRAY, MAP, STRUCT, UNION
SQL Features
• Core SQL Features: Date, Time and Arithmetical Functions; INNER, OUTER, CROSS and SEMI Joins; Derived Table Subqueries; Correlated + Uncorrelated Subqueries; UNION ALL; UDFs, UDAFs, UDTFs; Common Table Expressions; UNION DISTINCT
• Advanced Analytics: OLAP and Windowing Functions; CUBE and Grouping Sets
• Nested Data Analytics: Nested Data Traversal; Lateral Views
• ACID Transactions: INSERT / UPDATE / DELETE
File Formats
• Columnar: ORCFile, Parquet
• Text: CSV, Logfile
• Nested / Complex: Avro, JSON, XML, Custom Formats
• Other Features: XPath Analytics
Futures
• Procedural Extensions (PL/SQL)
• Primary Key / Foreign Key
• Non-Equijoin
• Scalable Cross Product
• Enhanced OLAP
• ACID MERGE
• Multi Subquery
• Comparison to sub-select
• INTERSECT and EXCEPT
22.
Enterprise Spark at Scale
23.
Apache Zeppelin GA: The Data Science Notebook
• Web-based data science notebook
• Interactive data ingestion and data exploration
• Easy sharing and collaboration
• Secure with single sign-on and encryption
25.
Apache Spark 2.0 (Technical Preview)
Structuring Spark: DataFrames, Datasets and Streaming
• Interactive data ingestion and data exploration
• Easy sharing and collaboration
• Secure with single sign-on and encryption
26.
Dynamic Security Policies: Apache Atlas and Ranger Integration
27.
Apache Atlas + Ranger - Powerful Together
28.
Dynamic Masking and Row Level Filtering
Source table:
Dept  SSN        CC No             Name      DOB        MRN         Policy ID
01    232323233  4539067047629850  John Doe  9/12/1969  8233054331  nj23j424
02    333287465  5391304868205600  Jane Doe  9/13/1969  3736885376  cadsd984
After Ranger Policy Enforcement, the Marketing group sees CC and SSN as masked values and MRN is nullified:
Dept  SSN        CC No                MRN   Name
01    xxxxx3233  4539 xxxx xxxx xxxx  null  John Doe
02    xxxxx7465  5391 xxxx xxxx xxxx  null  Jane Doe
A Dept employee only sees data specific to that department:
Dept  SSN        Name      MRN
01    232323233  John Doe  8233054331
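The two views on this slide can be sketched in a few lines of plain Python. This is only an illustration of the masking and row-filtering logic, not Ranger's policy API: the function names (`mask_ssn`, `mask_cc`, `marketing_view`, `department_view`) and the dictionary layout are assumptions, and the sample rows are the ones shown on the slide.

```python
# Sample rows copied from the slide's source table.
ROWS = [
    {"dept": "01", "ssn": "232323233", "cc_no": "4539067047629850",
     "name": "John Doe", "dob": "9/12/1969", "mrn": "8233054331",
     "policy_id": "nj23j424"},
    {"dept": "02", "ssn": "333287465", "cc_no": "5391304868205600",
     "name": "Jane Doe", "dob": "9/13/1969", "mrn": "3736885376",
     "policy_id": "cadsd984"},
]


def mask_ssn(ssn):
    """Keep only the last four characters visible."""
    return "x" * (len(ssn) - 4) + ssn[-4:]


def mask_cc(cc_no):
    """Keep only the first four digits visible, grouped as on the slide."""
    return cc_no[:4] + " xxxx xxxx xxxx"


def marketing_view(rows):
    """Column masking: SSN and CC are masked, MRN is nullified."""
    return [
        {"dept": r["dept"], "ssn": mask_ssn(r["ssn"]),
         "cc_no": mask_cc(r["cc_no"]), "mrn": None, "name": r["name"]}
        for r in rows
    ]


def department_view(rows, dept):
    """Row-level filtering: only rows of the caller's department."""
    return [
        {"dept": r["dept"], "ssn": r["ssn"], "name": r["name"], "mrn": r["mrn"]}
        for r in rows
        if r["dept"] == dept
    ]


for row in marketing_view(ROWS):
    print(row)
print(department_view(ROWS, "01"))
```

In the deck's setup the same decisions are made centrally by policy at query time, so the underlying table is never copied or rewritten per audience.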
29.
Expanded Native Connector: Dataset Lineage
[Diagram labels: Sqoop, Teradata Connector, Apache Kafka, Custom Activity Reporter, Metadata Repository, RDBMS]
30.
Apache Atlas Enables Business Catalog for Ease of Use
• Organize data assets along business terms
– Authoritative: Hierarchical business Taxonomy Creation
– Agile modeling: Model Conceptual, Logical, Physical assets
– Definition and assignment of tags like PII (Personally Identifiable Information)
• Comprehensive features for compliance
– Multiple user profiles including Data Steward and Business Analysts
– Object auditing to track “Who did it”
– Metadata Versioning to track “what did they do”
Key Benefits:
• Easy way to create business Taxonomy
• Useful for multiple user types including Data Steward and Business Analysts
• Comprehensive features for compliance
31.
Business Catalog
Model and explore metadata via the new Business Catalog in Apache Atlas
[Diagram label: Data Steward]
32.
Real Time Applications powered by Storm and HBase/Phoenix
33.
What’s New in Storm
• Developer Productivity: Sliding and tumbling windowing support
• Developer Productivity: New connectors for search and NoSQL Database
• Enterprise Readiness: Automatic back pressure
• Streamlined Operations: Resource aware scheduling and Storm view for Ambari
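The difference between tumbling and sliding windows can be shown with a small count-based sketch in plain Python. It is not Storm's windowing API; the helper names `tumbling_windows` and `sliding_windows` are hypothetical, and real stream processors typically also offer time-based variants.

```python
def tumbling_windows(events, size):
    """Non-overlapping windows: each event belongs to exactly one window.

    A trailing partial window is emitted if len(events) is not a
    multiple of `size`.
    """
    return [events[i:i + size] for i in range(0, len(events), size)]


def sliding_windows(events, size, slide):
    """Overlapping windows of `size` events, advancing `slide` events at a time.

    Only full windows are emitted.
    """
    return [events[i:i + size]
            for i in range(0, len(events) - size + 1, slide)]


events = [1, 2, 3, 4, 5, 6]
print(tumbling_windows(events, 3))
print(sliding_windows(events, 3, 1))
```

With `slide` equal to `size` and an input length that is a multiple of `size`, the sliding variant reduces to the tumbling one, which is a handy way to remember how the two relate.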
34.
What’s New in HBase and Phoenix
• Developer Productivity: Phoenix and Hive Integration to run HBase queries in Hive
• Enterprise Readiness: Incremental Backup/Restore
• Enterprise Readiness: Performance boost for high-scale loads
• Developer Productivity: Ad Hoc Analytics with connector to any ODBC BI tool
35.
Streamlined Operations: Apache Ambari
36.
Streamlined Operations
Phase 1: Advanced Metrics Visualization & Dashboarding
[Diagram: Ambari, Ambari Metrics System, Grafana]
Goal: Quickly understand cluster health metrics and key performance indicators
⬢ Capabilities
– Centralized Dashboarding focusing on component Health & Performance
– Ad-Hoc Graph Creation
⬢ Pre-Built Dashboards
– HDFS
– YARN
– HBase
⬢ Core Technologies
– Ambari Metrics System
– Grafana
37.
Ambari now includes pre-built dashboards for visualizing the most important cluster health metrics.
38.
Streamlined Operations Phase 2: Consolidated Cluster Activity Reporting
Goal: Quickly visualize and report on how business users and tenants are using the cluster: top 10 queues, users, and most time-consuming jobs
⬢ Capabilities
– Top K Activity Reporting
– Chargeback
⬢ Services Covered
– YARN
– MapReduce
– Hive/Tez
– Spark
– HDFS
⬢ Core Technologies
– Hortonworks SmartSense
– Apache Zeppelin
(Diagram labels: SmartSense, Ambari, Ambari Metrics System, Zeppelin)
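As an illustration of the Top K activity reporting idea on this slide, here is a minimal sketch in plain Python. It does not use SmartSense or Zeppelin; the record layout (user name, job run time in seconds) and the function name are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_k_users(jobs: Iterable[Tuple[str, float]], k: int) -> List[Tuple[str, float]]:
    """Return the k users with the largest total job run time.

    `jobs` is a hypothetical stream of (user, run_time_seconds) records.
    """
    usage: Counter = Counter()
    for user, seconds in jobs:
        usage[user] += seconds  # aggregate run time per user
    return usage.most_common(k)


if __name__ == "__main__":
    jobs = [("alice", 120.0), ("bob", 300.0), ("alice", 250.0), ("carol", 40.0)]
    print(top_k_users(jobs, 2))
```

The same aggregation keyed by queue or by job name would give the other "top 10" views named in the goal, and multiplying each total by a rate would give a simple chargeback figure.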
39.
Activity Explorer: Cluster Utilization Reporting
40.
Preview: Streamlined Operations Investments
Phase 3: Centralized & Contextual Log Search
Goal: When issues arise, be able to quickly find issues across all HDP components
⬢ Capabilities
– Rapid Search of all HDP component logs
– Search across time ranges, log levels, and for keywords
⬢ Core Technologies:
– Apache Ambari
– Apache Solr
– Apache Ambari Log Search
(Diagram labels: Solr, Ambari, Log Search)
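The three search dimensions listed above (time range, log level, keyword) can be sketched as a simple filter. This is plain Python over in-memory records, not the Ambari Log Search or Solr API; the `LogEntry` fields and the `search_logs` function are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional, Set


@dataclass
class LogEntry:
    timestamp: datetime
    level: str       # e.g. "INFO", "WARN", "ERROR"
    component: str   # e.g. "HDFS", "YARN"
    message: str


def search_logs(
    entries: Iterable[LogEntry],
    start: datetime,
    end: datetime,
    levels: Optional[Set[str]] = None,
    keyword: Optional[str] = None,
) -> List[LogEntry]:
    """Return entries inside [start, end] that match the optional filters."""
    matches = []
    for entry in entries:
        if not (start <= entry.timestamp <= end):
            continue  # outside the requested time range
        if levels is not None and entry.level not in levels:
            continue  # log level not requested
        if keyword is not None and keyword.lower() not in entry.message.lower():
            continue  # keyword missing (case-insensitive match)
        matches.append(entry)
    return matches
```

A real deployment would push these filters down to the search index rather than scanning records in application code.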
41.
Hortonworks Data Cloud
42.
Abstract: Governance and Security in Cloud
Today’s transportation marketplace is competitive and quickly evolving. Often, unexpected regulations can pose a serious risk to operations and the bottom line. With Hortonworks Data Cloud (HDC), we’ll show how to gain agility in adapting to new challenges that can turn problems into opportunities.
• Quickly provision a new analytic cloud environment
• Classify and Tag assets to find and understand your data
• Security and Audits service to meet compliance requirements
43.
44.
Learn More: http://hortonworks.github.io/hdp-aws/index.html
45.
Hortonworks HDB Powered by Apache HAWQ
46.
What is HDB / Apache HAWQ?
Hadoop-native SQL query engine and advanced analytics MPP database that offers high-performance interactive ANSI SQL query execution and machine learning for Data Analysts & Data Scientists who want to find insights from large/complex datasets.
47.
Hortonworks HDB Powered By Apache HAWQ
1. Interactive query performance
• Query performance in seconds
• Compatible with any ANSI SQL compliant BI Tool
• Larger number of concurrent users
2. MADlib big data Machine Learning in SQL for data scientists and data analysts
• Classification e.g. predict loan default
• Regression e.g. predict value of a sale
• Clustering e.g. marketing campaign segmentation, …
3. Data federation using HAWQ Extension Framework
• SQL queries against other data sources
(Diagram labels: BI Tool X, BI Tool Y, BI Tool Z; HDP Hortonworks Data Platform; Hortonworks HDB; SQL-89, SQL-92, SQL-2003)
48.
HDB / HAWQ Advantages
Advanced Analytics Performance: Exceptional MPP performance, low latency, high scalability, ACID reliability, fault tolerance
Most Complete Language Compliance: Higher degree of SQL compatibility, SQL-92, 99, 2003, OLAP, leverage existing SQL skills
Best-in-class Query Optimizer: Maximize performance and do advanced queries with confidence
Elastic Architecture for Scalability: Scale-up/down or scale-in/out, expand/shrink clusters on the fly
Tightly integrated w/ MADlib Machine Learning: Advanced MPP analytics, data science at scale, directly on Hadoop data
49.
New in HDF 2.0
50.
New Features of HDF 2.0
• Enterprise productivity via streamlined operations
– Ambari Integration of Apache NiFi, Kafka, Storm
– Apache Ranger authorization
– Modernized, more intuitive UI
– Multi-tenancy of dataflows
• 170+ processors, 30% more than in Apache NiFi 1.0
• Edge intelligence with Apache MiNiFi
• Increased security options with Apache Kafka 0.10
• 10X streaming analytics performance, windowing and productivity tools with Apache Storm 1.0
51.
Ambari Integration
52.
Comprehensive Storm-Ambari Views
53.
Multi-tenant Authorization: Read Permission
54.
Multi-tenant Authorization: NO Read Permission (talk about levels, where you can assign permissions)
55.
HDF 2.0 has 170+ Processors, 30% Increase from HDF 1.2
Hash, Extract, Merge, Duplicate, Scan, GeoEnrich, Replace, Convert, Split, Translate, Route Content, Route Context, Route Text, Control Rate, Distribute Load, Generate Table Fetch, Jolt Transform JSON, Prioritized Delivery, Encrypt, Tail, Evaluate, Execute, HL7, FTP, UDP, XML, SFTP, HTTP, Syslog, Email, HTML, Image, AMQP, MQTT, Fetch
All Apache project logos are trademarks of the ASF and the respective projects.
56.
Edge Intelligence with Apache MiNiFi
Key Features
• Guaranteed delivery
• Data buffering
‒ Backpressure
‒ Pressure release
• Prioritized queuing
• Flow specific QoS
‒ Latency vs. throughput
‒ Loss tolerance
• Data provenance
• Recovery / recording a rolling log of fine-grained history
• Designed for extension
• Small Footprint (~40MB)
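Two of the features listed above, backpressure and prioritized queuing, can be illustrated together with a minimal sketch. This is plain Python, not MiNiFi or NiFi code; the class name and the convention that a lower number means higher priority are assumptions for the example.

```python
import heapq
import itertools
from typing import Any, List, Optional, Tuple


class BoundedPriorityQueue:
    """A fixed-capacity buffer that delivers the most urgent item first."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._heap: List[Tuple[int, int, Any]] = []
        # Tie-breaker so items with equal priority keep their arrival order.
        self._counter = itertools.count()

    def offer(self, priority: int, item: Any) -> bool:
        """Try to enqueue; return False when full so the producer backs off."""
        if len(self._heap) >= self._capacity:
            return False  # backpressure signal to the upstream producer
        heapq.heappush(self._heap, (priority, next(self._counter), item))
        return True

    def poll(self) -> Optional[Any]:
        """Remove and return the highest-priority item, or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]
```

Rejecting new items when the buffer is full is only one possible policy; a pressure-release policy would instead drop or expire older data to make room.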
57.
New Stream Processing Features HDF 2.0
• New Storm Connectors
• Storm-Kafka Spout using new client APIs
• Storm Distributed Log Search
• Storm Dynamic Worker Profiling
• Kafka Grafana Integration
• Storm Grafana Integration
• Improved Nimbus HA
• Storm Automatic Back Pressure
• Storm Distributed cache
• Storm Windowing and State Management
• Storm Performance improvements
• Improved Kafka SASL
• Storm Topology Event inspector
• Storm Resource Aware Scheduling
• Storm Dynamic Log Levels
• Pacemaker Storm Daemon
• Kafka Rack Awareness
(Legend: Developer Productivity, Enterprise Readiness, Operational Simplicity)
58.
Thank You