Introduction to Apache Hadoop

Steve Watt

Apache Hadoop Presentation by Steve Watt at Data Day Austin 2011

Introduction to Apache Hadoop Steve Watt - IBM Big Data Lead @wattsteve #datadayaustin http://stevewatt.blogspot.com

The Origins of Hadoop

The Origins of Hadoop

So what exactly is Apache Hadoop?

It is a cluster technology with a single master and multiple slaves, designed for commodity hardware.

It consists of two runtimes: the Hadoop Distributed File System (HDFS) and Map/Reduce.

As data is copied onto HDFS, it is split into blocks, and the blocks are replicated to other machines (nodes) to provide redundancy.

Self-contained jobs are written in Map/Reduce and submitted to the cluster. The jobs run in parallel on each of the machines in the cluster, processing the data on the local machine (data locality).

Hadoop may execute or re-execute any of a job's tasks on any node in the cluster. Node failures are handled automatically by the framework.
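To make the blocking and replication idea concrete, here is a minimal Python sketch. It is not Hadoop code and does not use any Hadoop API: the function names, the tiny block size, and the round-robin replica placement are all assumptions chosen for illustration (real HDFS uses its own placement policy and much larger blocks).

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a byte string into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: list[str], replication: int) -> dict[int, list[str]]:
    """Assign each block to `replication` distinct nodes.

    Round-robin placement is an assumption made for this sketch only.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds the number of nodes")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


def readable_after_failure(placement: dict[int, list[str]], failed: set[str]) -> bool:
    """A file stays readable while every block keeps at least one live replica."""
    return all(
        any(node not in failed for node in replicas)
        for replicas in placement.values()
    )


if __name__ == "__main__":
    blocks = split_into_blocks(b"abcdefghij", block_size=4)
    nodes = ["node1", "node2", "node3", "node4"]  # hypothetical node names
    placement = place_replicas(len(blocks), nodes, replication=3)
    print(placement)
    print(readable_after_failure(placement, failed={"node1", "node2"}))
```

With three replicas per block, losing any two nodes in this toy cluster still leaves one copy of every block, which is the redundancy the slide describes.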

Hadoop – The Hadoop Cluster - Distributed File System - Map/Reduce

Hadoop - Map/Reduce
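The Map/Reduce programming model can be illustrated without a cluster. The following is a single-process Python sketch of a word count, assuming the usual three steps of map, shuffle (group by key), and reduce. The function names and the sample input are hypothetical, and this is not the Hadoop Java API. On a real cluster, each input split would be mapped on the node that stores it.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle step: group all emitted values by their key."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce step: sum the counts collected for one word."""
    return key, sum(values)


def run_job(splits: list[list[str]]) -> dict[str, int]:
    """Run map over every split, then shuffle and reduce the combined output."""
    mapped = [pair for split in splits for line in split for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    # Each inner list stands in for the data held on one node (an assumption).
    splits = [
        ["hadoop stores data", "hadoop processes data"],
        ["data locality matters"],
    ]
    print(run_job(splits))
```

Because each map call only looks at its own line and each reduce call only looks at one key, both steps can run in parallel across machines, which is what makes the model a good fit for a cluster.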

Hadoop – Map/Reduce – JobTracker Details

Examples of Industry using Hadoop

The Hadoop Ecosystem

Provisioning: ClusterChef / Apache Whirr
Offline Systems (Analytics): Hadoop
Online Systems (OLTP @ Scale): Cassandra / HBase
Scripting: Pig / WuKong
DBA: Hive
Non-Programmer: BigSheets / DataMeer
Load Tooling: Nutch / SQOOP / Flume

https://github.com/tomwhite/hadoop-ecosystem/raw/master/hadoop-ecosystem.dot.png

Installing and Running Hadoop - Demo

Introduction to Apache Hadoop

1. Introduction to Apache Hadoop Steve Watt - IBM Big Data Lead @wattsteve #datadayaustin http://stevewatt.blogspot.com

2. The Origins of Hadoop

5. So what exactly is Apache Hadoop?

6. Hadoop – The Hadoop Cluster - Distributed File System - Map/Reduce

9. Hadoop - Map/Reduce on the Cluster

10. Hadoop - Map/Reduce Logical Flow

11. Hadoop – Map/Reduce – JobTracker Details

12. Hadoop – Map/Reduce – Job Details

13. Examples of Industry using Hadoop

14. The Hadoop Ecosystem

15. Installing and Running Hadoop - Demo

Editor's notes

Credit: Doug Cutting for slide information
Credit: Tom White for picture
