SlideShare une entreprise Scribd logo
1  sur  19
Analytics on your data in place
Steve Watt, Red Hat

CC flickr Barta IV

@wattsteve
Hadoop at Red Hat

@wattsteve
But tonight I have my community hat on

CC flickr wcdumonts

@wattsteve
Hadoop in 2007
Platform Layers Technologies
Computational
Runtimes
FileSystems

HDFS or Amazon S3

Infrastructures

CC flickr wwarby

MapReduce, HBase

x86 or Amazon EC2

@wattsteve
Hadoop in 2013
Platform Layers

Technologies

Computational
Runtimes

YARN, GiRAPH, MapReduce,
HBase, Phoenix,
Spark/BDAS, Drill, Impala,
Stinger

FileSystems

HDFS + 13 Other Hadoop
FileSystems

Infrastructures

System on a Chip, x86,
Virtualization and Cloud

CC flickr lowfatbrains

@wattsteve
Observation #1: The Hadoop FileSystem Interface is
the keystone of the entire Ecosystem

CC flickr grufnik

@wattsteve
Observation #2: Moving data around just to analyze it
is slow and expensive. Especially if it requires a redundant
repository

.

CC flickr traftery

@wattsteve
So how does this work?
By leveraging Hadoop’s pluggable FileSystem architecture
Hadoop FS Clients

MapReduce

HBase

YARN

Any Application

Hadoop FileSystem Interface
Hadoop FileSystem Plugin

Hadoop FileSystem

FileSystem
Implementation

@wattsteve
Hadoop FileSystem Configuration for HDFS

Hadoop FS Clients

MapReduce

HBase

YARN

Any Application

Hadoop FileSystem Interface
HDFS Plugin

Hadoop FileSystem

HDFS

@wattsteve
What are some examples of where big
data is stored?
- Object Stores
- NoSQL Stores
- Distributed FileSystems
- Network Filers
- Databases

CC flickr birdwatcher63

@wattsteve
Network Filer Example
Hadoop FileSystem Configuration for GlusterFS
Hadoop FS Clients

MapReduce

HBase

YARN

Any Application

Hadoop FileSystem Interface
GlusterFS Plugin

Hadoop FileSystem

@wattsteve
Network Filer - Apache Hadoop on GlusterFS
Hadoop

Resource

Master Services

Manager

Management
Server

plugin

SWIFT

Hadoop

Node

Node

Node

Workers

Manager

Manager

Manager

plugin

plugin

plugin

NFS

FUSE

GlusterFS
FUSE

FUSE

FUSE

Trusted Peer

Trusted Peer

DAS Brick

DAS Brick

DAS Brick

Server 1

Server 2

Server 50

...

Trusted Peer

@wattsteve
Object Store Example
Hadoop FileSystem Configuration for SWIFT
Hadoop FS Clients

MapReduce

HBase

YARN

Any Application

Hadoop FileSystem Interface
SWIFT Plugin

Hadoop FileSystem

SWIFT

@wattsteve
NoSQL Example
Hadoop FileSystem Configuration for CassandraFS
Hadoop FS Clients

MapReduce

HBase

YARN

Any Application

Hadoop FileSystem Interface
CassandraFS Plugin

Hadoop FileSystem

@wattsteve
NoSQL - Apache Hadoop on CassandraFS

@wattsteve
We are working on filesystem tests within
Apache Hadoop-Common and Apache BigTop
as well as opening up ecosystem tools

CC flickr syume

@wattsteve
@wattsteve
@wattsteve
Closing Remarks
1. The amount of Hadoop FileSystems available
to you continues to increase
2. This is good! A vibrant ecosystem gives you
choice
3. Evaluate the option of analyzing your data in
place before deploying new environments

CC flickr zoomboy1

@wattsteve

Contenu connexe

Tendances

Enabling Apache Spark for Hybrid Cloud
Enabling Apache Spark for Hybrid CloudEnabling Apache Spark for Hybrid Cloud
Enabling Apache Spark for Hybrid CloudAlluxio, Inc.
 
Hadoop foundation for analytics,B Monica II M.sc computer science ,BON SECOUR...
Hadoop foundation for analytics,B Monica II M.sc computer science ,BON SECOUR...Hadoop foundation for analytics,B Monica II M.sc computer science ,BON SECOUR...
Hadoop foundation for analytics,B Monica II M.sc computer science ,BON SECOUR...BMonica1
 
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013Nick Galbreath
 
Public Terabyte Dataset Project: Web crawling with Amazon Elastic MapReduce
Public Terabyte Dataset Project: Web crawling with Amazon Elastic MapReducePublic Terabyte Dataset Project: Web crawling with Amazon Elastic MapReduce
Public Terabyte Dataset Project: Web crawling with Amazon Elastic MapReduceHadoop User Group
 
2 hadoop@e bay-hug-2010-07-21
2 hadoop@e bay-hug-2010-07-212 hadoop@e bay-hug-2010-07-21
2 hadoop@e bay-hug-2010-07-21Hadoop User Group
 
B.MONICA II M.SC COMPUTER SCIENCE
B.MONICA II M.SC COMPUTER SCIENCEB.MONICA II M.SC COMPUTER SCIENCE
B.MONICA II M.SC COMPUTER SCIENCEBMonica1
 
Putting Lipstick on Apache Pig at Netflix
Putting Lipstick on Apache Pig at NetflixPutting Lipstick on Apache Pig at Netflix
Putting Lipstick on Apache Pig at NetflixJeff Magnusson
 
Big Data Ingestion @ Flipkart Data Platform
Big Data Ingestion @ Flipkart Data PlatformBig Data Ingestion @ Flipkart Data Platform
Big Data Ingestion @ Flipkart Data PlatformNavneet Gupta
 
Hardware- and Network-Enhanced Software Systems for Cloud Computing, OW2 Open...
Hardware- and Network-Enhanced Software Systems for Cloud Computing, OW2 Open...Hardware- and Network-Enhanced Software Systems for Cloud Computing, OW2 Open...
Hardware- and Network-Enhanced Software Systems for Cloud Computing, OW2 Open...Ocean Project
 
Report Hadoop Map Reduce
Report Hadoop Map ReduceReport Hadoop Map Reduce
Report Hadoop Map ReduceUrvashi Kataria
 
Introduction to Apache Hadoop
Introduction to Apache HadoopIntroduction to Apache Hadoop
Introduction to Apache HadoopSteve Watt
 
Apache Arrow and Python: The latest
Apache Arrow and Python: The latestApache Arrow and Python: The latest
Apache Arrow and Python: The latestWes McKinney
 
Implementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache SparkImplementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache SparkDataWorks Summit
 
Facebook Hadoop Data & Applications
Facebook Hadoop Data & ApplicationsFacebook Hadoop Data & Applications
Facebook Hadoop Data & Applicationsdzhou
 
Big data advance topics - part 2.pptx
Big data   advance topics - part 2.pptxBig data   advance topics - part 2.pptx
Big data advance topics - part 2.pptxMoldovan Radu Adrian
 

Tendances (20)

Tame that Beast
Tame that BeastTame that Beast
Tame that Beast
 
Enabling Apache Spark for Hybrid Cloud
Enabling Apache Spark for Hybrid CloudEnabling Apache Spark for Hybrid Cloud
Enabling Apache Spark for Hybrid Cloud
 
Hadoop foundation for analytics,B Monica II M.sc computer science ,BON SECOUR...
Hadoop foundation for analytics,B Monica II M.sc computer science ,BON SECOUR...Hadoop foundation for analytics,B Monica II M.sc computer science ,BON SECOUR...
Hadoop foundation for analytics,B Monica II M.sc computer science ,BON SECOUR...
 
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013
Care and Feeding of Large Scale Graphite Installations - DevOpsDays Austin 2013
 
Public Terabyte Dataset Project: Web crawling with Amazon Elastic MapReduce
Public Terabyte Dataset Project: Web crawling with Amazon Elastic MapReducePublic Terabyte Dataset Project: Web crawling with Amazon Elastic MapReduce
Public Terabyte Dataset Project: Web crawling with Amazon Elastic MapReduce
 
2 hadoop@e bay-hug-2010-07-21
2 hadoop@e bay-hug-2010-07-212 hadoop@e bay-hug-2010-07-21
2 hadoop@e bay-hug-2010-07-21
 
B.MONICA II M.SC COMPUTER SCIENCE
B.MONICA II M.SC COMPUTER SCIENCEB.MONICA II M.SC COMPUTER SCIENCE
B.MONICA II M.SC COMPUTER SCIENCE
 
Putting Lipstick on Apache Pig at Netflix
Putting Lipstick on Apache Pig at NetflixPutting Lipstick on Apache Pig at Netflix
Putting Lipstick on Apache Pig at Netflix
 
Big Data Ingestion @ Flipkart Data Platform
Big Data Ingestion @ Flipkart Data PlatformBig Data Ingestion @ Flipkart Data Platform
Big Data Ingestion @ Flipkart Data Platform
 
Hardware- and Network-Enhanced Software Systems for Cloud Computing, OW2 Open...
Hardware- and Network-Enhanced Software Systems for Cloud Computing, OW2 Open...Hardware- and Network-Enhanced Software Systems for Cloud Computing, OW2 Open...
Hardware- and Network-Enhanced Software Systems for Cloud Computing, OW2 Open...
 
Big data and tools
Big data and tools Big data and tools
Big data and tools
 
Report Hadoop Map Reduce
Report Hadoop Map ReduceReport Hadoop Map Reduce
Report Hadoop Map Reduce
 
Introduction to Apache Hadoop
Introduction to Apache HadoopIntroduction to Apache Hadoop
Introduction to Apache Hadoop
 
Apache Arrow and Python: The latest
Apache Arrow and Python: The latestApache Arrow and Python: The latest
Apache Arrow and Python: The latest
 
Searching At Scale
Searching At ScaleSearching At Scale
Searching At Scale
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Hadoop - Apache Hive
Hadoop - Apache HiveHadoop - Apache Hive
Hadoop - Apache Hive
 
Implementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache SparkImplementing the Lambda Architecture efficiently with Apache Spark
Implementing the Lambda Architecture efficiently with Apache Spark
 
Facebook Hadoop Data & Applications
Facebook Hadoop Data & ApplicationsFacebook Hadoop Data & Applications
Facebook Hadoop Data & Applications
 
Big data advance topics - part 2.pptx
Big data   advance topics - part 2.pptxBig data   advance topics - part 2.pptx
Big data advance topics - part 2.pptx
 

Similaire à Steve Watt, Chief Architect, Hadoop and Big Data, Red Hat - 21st BDL meetup

Hadoop for the disillusioned
Hadoop for the disillusionedHadoop for the disillusioned
Hadoop for the disillusionedSteve Watt
 
4 hadoop for-the-disillusioned
4 hadoop for-the-disillusioned4 hadoop for-the-disillusioned
4 hadoop for-the-disillusionedBigDataCamp
 
SQL and Machine Learning on Hadoop using HAWQ
SQL and Machine Learning on Hadoop using HAWQSQL and Machine Learning on Hadoop using HAWQ
SQL and Machine Learning on Hadoop using HAWQpivotalny
 
HDFS tiered storage: mounting object stores in HDFS
HDFS tiered storage: mounting object stores in HDFSHDFS tiered storage: mounting object stores in HDFS
HDFS tiered storage: mounting object stores in HDFSDataWorks Summit
 
Hadoop demo ppt
Hadoop demo pptHadoop demo ppt
Hadoop demo pptPhil Young
 
20140202 fosdem-nosql-devroom-hadoop-yarn
20140202 fosdem-nosql-devroom-hadoop-yarn20140202 fosdem-nosql-devroom-hadoop-yarn
20140202 fosdem-nosql-devroom-hadoop-yarnDatalayer
 
Big data or big deal
Big data or big dealBig data or big deal
Big data or big dealeduarderwee
 
Overview of big data & hadoop version 1 - Tony Nguyen
Overview of big data & hadoop   version 1 - Tony NguyenOverview of big data & hadoop   version 1 - Tony Nguyen
Overview of big data & hadoop version 1 - Tony NguyenThanh Nguyen
 
Overview of Big data, Hadoop and Microsoft BI - version1
Overview of Big data, Hadoop and Microsoft BI - version1Overview of Big data, Hadoop and Microsoft BI - version1
Overview of Big data, Hadoop and Microsoft BI - version1Thanh Nguyen
 
field_guide_to_hadoop_pentaho
field_guide_to_hadoop_pentahofield_guide_to_hadoop_pentaho
field_guide_to_hadoop_pentahoMartin Ferguson
 
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFSDiscover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFSHortonworks
 
Lecture-20.pptx
Lecture-20.pptxLecture-20.pptx
Lecture-20.pptxmohaaalsa
 
Hadoop in Practice (SDN Conference, Dec 2014)
Hadoop in Practice (SDN Conference, Dec 2014)Hadoop in Practice (SDN Conference, Dec 2014)
Hadoop in Practice (SDN Conference, Dec 2014)Marcel Krcah
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHitendra Kumar
 
Data Engineering Quick Guide
Data Engineering Quick GuideData Engineering Quick Guide
Data Engineering Quick GuideAsim Jalis
 
Unified Batch and Real-Time Stream Processing Using Apache Flink
Unified Batch and Real-Time Stream Processing Using Apache FlinkUnified Batch and Real-Time Stream Processing Using Apache Flink
Unified Batch and Real-Time Stream Processing Using Apache FlinkSlim Baltagi
 
Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010nzhang
 

Similaire à Steve Watt, Chief Architect, Hadoop and Big Data, Red Hat - 21st BDL meetup (20)

Hadoop for the disillusioned
Hadoop for the disillusionedHadoop for the disillusioned
Hadoop for the disillusioned
 
4 hadoop for-the-disillusioned
4 hadoop for-the-disillusioned4 hadoop for-the-disillusioned
4 hadoop for-the-disillusioned
 
SQL and Machine Learning on Hadoop using HAWQ
SQL and Machine Learning on Hadoop using HAWQSQL and Machine Learning on Hadoop using HAWQ
SQL and Machine Learning on Hadoop using HAWQ
 
HDFS tiered storage: mounting object stores in HDFS
HDFS tiered storage: mounting object stores in HDFSHDFS tiered storage: mounting object stores in HDFS
HDFS tiered storage: mounting object stores in HDFS
 
Big Data Journey
Big Data JourneyBig Data Journey
Big Data Journey
 
Hadoop demo ppt
Hadoop demo pptHadoop demo ppt
Hadoop demo ppt
 
20140202 fosdem-nosql-devroom-hadoop-yarn
20140202 fosdem-nosql-devroom-hadoop-yarn20140202 fosdem-nosql-devroom-hadoop-yarn
20140202 fosdem-nosql-devroom-hadoop-yarn
 
What is hadoop
What is hadoopWhat is hadoop
What is hadoop
 
Big data or big deal
Big data or big dealBig data or big deal
Big data or big deal
 
Overview of big data & hadoop version 1 - Tony Nguyen
Overview of big data & hadoop   version 1 - Tony NguyenOverview of big data & hadoop   version 1 - Tony Nguyen
Overview of big data & hadoop version 1 - Tony Nguyen
 
Overview of Big data, Hadoop and Microsoft BI - version1
Overview of Big data, Hadoop and Microsoft BI - version1Overview of Big data, Hadoop and Microsoft BI - version1
Overview of Big data, Hadoop and Microsoft BI - version1
 
field_guide_to_hadoop_pentaho
field_guide_to_hadoop_pentahofield_guide_to_hadoop_pentaho
field_guide_to_hadoop_pentaho
 
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFSDiscover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
Discover HDP 2.1: Apache Hadoop 2.4.0, YARN & HDFS
 
Lecture-20.pptx
Lecture-20.pptxLecture-20.pptx
Lecture-20.pptx
 
Hadoop in Practice (SDN Conference, Dec 2014)
Hadoop in Practice (SDN Conference, Dec 2014)Hadoop in Practice (SDN Conference, Dec 2014)
Hadoop in Practice (SDN Conference, Dec 2014)
 
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log ProcessingHadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log Processing
 
Data Engineering Quick Guide
Data Engineering Quick GuideData Engineering Quick Guide
Data Engineering Quick Guide
 
Hadoop_arunam_ppt
Hadoop_arunam_pptHadoop_arunam_ppt
Hadoop_arunam_ppt
 
Unified Batch and Real-Time Stream Processing Using Apache Flink
Unified Batch and Real-Time Stream Processing Using Apache FlinkUnified Batch and Real-Time Stream Processing Using Apache Flink
Unified Batch and Real-Time Stream Processing Using Apache Flink
 
Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010Hive @ Hadoop day seattle_2010
Hive @ Hadoop day seattle_2010
 

Dernier

ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 

Dernier (20)

ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 

Steve Watt, Chief Architect, Hadoop and Big Data, Red Hat - 21st BDL meetup

  • 1. Analytics on your data in place Steve Watt, Red Hat CC flickr Barta IV @wattsteve
  • 2. Hadoop at Red Hat @wattsteve
  • 3. But tonight I have my community hat on CC flickr wcdumonts @wattsteve
  • 4. Hadoop in 2007 Platform Layers Technologies Computational Runtimes FileSystems HDFS or Amazon S3 Infrastructures CC flickr wwarby MapReduce, HBase x86 or Amazon EC2 @wattsteve
  • 5. Hadoop in 2013 Platform Layers Technologies Computational Runtimes YARN, GiRAPH, MapReduce, HBase, Phoenix, Spark/BDAS, Drill, Impala, Stinger FileSystems HDFS + 13 Other Hadoop FileSystems Infrastructures System on a Chip, x86, Virtualization and Cloud CC flickr lowfatbrains @wattsteve
  • 6. Observation #1: The Hadoop FileSystem Interface is the keystone of the entire Ecosystem CC flickr grufnik @wattsteve
  • 7. Observation #2: Moving data around just to analyze it is slow and expensive. Especially if it requires a redundant repository . CC flickr traftery @wattsteve
  • 8. So how does this work? By leveraging Hadoop’s pluggable FileSystem architecture Hadoop FS Clients MapReduce HBase YARN Any Application Hadoop FileSystem Interface Hadoop FileSystem Plugin Hadoop FileSystem FileSystem Implementation @wattsteve
  • 9. Hadoop FileSystem Configuration for HDFS Hadoop FS Clients MapReduce HBase YARN Any Application Hadoop FileSystem Interface HDFS Plugin Hadoop FileSystem HDFS @wattsteve
  • 10. What are some examples of where big data is stored? - Object Stores - NoSQL Stores - Distributed FileSystems - Network Filers - Databases CC flickr birdwatcher63 @wattsteve
  • 11. Network Filer Example Hadoop FileSystem Configuration for GlusterFS Hadoop FS Clients MapReduce HBase YARN Any Application Hadoop FileSystem Interface GlusterFS Plugin Hadoop FileSystem @wattsteve
  • 12. Network Filer - Apache Hadoop on GlusterFS Hadoop Resource Master Services Manager Management Server plugin SWIFT Hadoop Node Node Node Workers Manager Manager Manager plugin plugin plugin NFS FUSE GlusterFS FUSE FUSE FUSE Trusted Peer Trusted Peer DAS Brick DAS Brick DAS Brick Server 1 Server 2 Server 50 ... Trusted Peer @wattsteve
  • 13. Object Store Example Hadoop FileSystem Configuration for SWIFT Hadoop FS Clients MapReduce HBase YARN Any Application Hadoop FileSystem Interface SWIFT Plugin Hadoop FileSystem SWIFT @wattsteve
  • 14. NoSQL Example Hadoop FileSystem Configuration for CassandraFS Hadoop FS Clients MapReduce HBase YARN Any Application Hadoop FileSystem Interface CassandraFS Plugin Hadoop FileSystem @wattsteve
  • 15. NoSQL - Apache Hadoop on CassandraFS @wattsteve
  • 16. We are working on filesystem tests within Apache Hadoop-Common and Apache BigTop as well as opening up ecosystem tools CC flickr syume @wattsteve
  • 19. Closing Remarks 1. The amount of Hadoop FileSystems available to you continues to increase 2. This is good! A vibrant ecosystem gives you choice 3. Evaluate the option of analyzing your data in place before deploying new environments CC flickr zoomboy1 @wattsteve

Notes de l'éditeur

  1. Now lets look at some examples