Soumettre la recherche
Mettre en ligne
hadoop architecture -Big data hadoop
•
0 j'aime
•
58 vues
J
jasikadogra
Suivre
It includes big data notes.
Lire moins
Lire la suite
Formation
Signaler
Partager
Signaler
Partager
1 sur 53
Télécharger maintenant
Télécharger pour lire hors ligne
Recommandé
HDFS Erasure Code Storage - Same Reliability at Better Storage Efficiency
HDFS Erasure Code Storage - Same Reliability at Better Storage Efficiency
DataWorks Summit
Hadoop Meetup Jan 2019 - Overview of Ozone
Hadoop Meetup Jan 2019 - Overview of Ozone
Erik Krogen
HBase Accelerated: In-Memory Flush and Compaction
HBase Accelerated: In-Memory Flush and Compaction
DataWorks Summit/Hadoop Summit
Less is More: 2X Storage Efficiency with HDFS Erasure Coding
Less is More: 2X Storage Efficiency with HDFS Erasure Coding
Zhe Zhang
What's new in hadoop 3.0
What's new in hadoop 3.0
Heiko Loewe
HDF - Current status and Future Directions
HDF - Current status and Future Directions
The HDF-EOS Tools and Information Center
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
enissoz
Cross-DC Fault-Tolerant ViewFileSystem @ Twitter
Cross-DC Fault-Tolerant ViewFileSystem @ Twitter
DataWorks Summit/Hadoop Summit
Recommandé
HDFS Erasure Code Storage - Same Reliability at Better Storage Efficiency
HDFS Erasure Code Storage - Same Reliability at Better Storage Efficiency
DataWorks Summit
Hadoop Meetup Jan 2019 - Overview of Ozone
Hadoop Meetup Jan 2019 - Overview of Ozone
Erik Krogen
HBase Accelerated: In-Memory Flush and Compaction
HBase Accelerated: In-Memory Flush and Compaction
DataWorks Summit/Hadoop Summit
Less is More: 2X Storage Efficiency with HDFS Erasure Coding
Less is More: 2X Storage Efficiency with HDFS Erasure Coding
Zhe Zhang
What's new in hadoop 3.0
What's new in hadoop 3.0
Heiko Loewe
HDF - Current status and Future Directions
HDF - Current status and Future Directions
The HDF-EOS Tools and Information Center
HBase and HDFS: Understanding FileSystem Usage in HBase
HBase and HDFS: Understanding FileSystem Usage in HBase
enissoz
Cross-DC Fault-Tolerant ViewFileSystem @ Twitter
Cross-DC Fault-Tolerant ViewFileSystem @ Twitter
DataWorks Summit/Hadoop Summit
Difference between hadoop 2 vs hadoop 3
Difference between hadoop 2 vs hadoop 3
Manish Chopra
Parallel HDF5 Developments
Parallel HDF5 Developments
The HDF-EOS Tools and Information Center
Hadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the Field
DataWorks Summit
HDFS Erasure Coding in Action
HDFS Erasure Coding in Action
DataWorks Summit/Hadoop Summit
Caching and Buffering in HDF5
Caching and Buffering in HDF5
The HDF-EOS Tools and Information Center
ORC 2015: Faster, Better, Smaller
ORC 2015: Faster, Better, Smaller
The Apache Software Foundation
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?
Uwe Printz
Compression Options in Hadoop - A Tale of Tradeoffs
Compression Options in Hadoop - A Tale of Tradeoffs
DataWorks Summit
Performance Tuning in HDF5
Performance Tuning in HDF5
The HDF-EOS Tools and Information Center
Reference Architecture: Architecting Ceph Storage Solutions
Reference Architecture: Architecting Ceph Storage Solutions
Ceph Community
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great Taste
DataWorks Summit
Red Hat® Ceph Storage and Network Solutions for Software Defined Infrastructure
Red Hat® Ceph Storage and Network Solutions for Software Defined Infrastructure
Intel® Software
HDF Tools Tutorial
HDF Tools Tutorial
The HDF-EOS Tools and Information Center
Optimizing Hive Queries
Optimizing Hive Queries
Owen O'Malley
Big Lab Problems Solved with Spectrum Scale: Innovations for the Coral Program
Big Lab Problems Solved with Spectrum Scale: Innovations for the Coral Program
inside-BigData.com
HDF4 and HDF5 Performance Preliminary Results
HDF4 and HDF5 Performance Preliminary Results
The HDF-EOS Tools and Information Center
HDFS- What is New and Future
HDFS- What is New and Future
DataWorks Summit
Hadoop operations-2015-hadoop-summit-san-jose-v5
Hadoop operations-2015-hadoop-summit-san-jose-v5
Chris Nauroth
Apache Spark on Supercomputers: A Tale of the Storage Hierarchy with Costin I...
Apache Spark on Supercomputers: A Tale of the Storage Hierarchy with Costin I...
Databricks
HDF5 I/O Performance
HDF5 I/O Performance
The HDF-EOS Tools and Information Center
IOD 2013 - Crunch Big Data in the Cloud with IBM BigInsights and Hadoop
IOD 2013 - Crunch Big Data in the Cloud with IBM BigInsights and Hadoop
Leons Petražickis
Hadoop HDFS by rohitkapa
Hadoop HDFS by rohitkapa
kapa rohit
Contenu connexe
Tendances
Difference between hadoop 2 vs hadoop 3
Difference between hadoop 2 vs hadoop 3
Manish Chopra
Parallel HDF5 Developments
Parallel HDF5 Developments
The HDF-EOS Tools and Information Center
Hadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the Field
DataWorks Summit
HDFS Erasure Coding in Action
HDFS Erasure Coding in Action
DataWorks Summit/Hadoop Summit
Caching and Buffering in HDF5
Caching and Buffering in HDF5
The HDF-EOS Tools and Information Center
ORC 2015: Faster, Better, Smaller
ORC 2015: Faster, Better, Smaller
The Apache Software Foundation
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?
Uwe Printz
Compression Options in Hadoop - A Tale of Tradeoffs
Compression Options in Hadoop - A Tale of Tradeoffs
DataWorks Summit
Performance Tuning in HDF5
Performance Tuning in HDF5
The HDF-EOS Tools and Information Center
Reference Architecture: Architecting Ceph Storage Solutions
Reference Architecture: Architecting Ceph Storage Solutions
Ceph Community
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great Taste
DataWorks Summit
Red Hat® Ceph Storage and Network Solutions for Software Defined Infrastructure
Red Hat® Ceph Storage and Network Solutions for Software Defined Infrastructure
Intel® Software
HDF Tools Tutorial
HDF Tools Tutorial
The HDF-EOS Tools and Information Center
Optimizing Hive Queries
Optimizing Hive Queries
Owen O'Malley
Big Lab Problems Solved with Spectrum Scale: Innovations for the Coral Program
Big Lab Problems Solved with Spectrum Scale: Innovations for the Coral Program
inside-BigData.com
HDF4 and HDF5 Performance Preliminary Results
HDF4 and HDF5 Performance Preliminary Results
The HDF-EOS Tools and Information Center
HDFS- What is New and Future
HDFS- What is New and Future
DataWorks Summit
Hadoop operations-2015-hadoop-summit-san-jose-v5
Hadoop operations-2015-hadoop-summit-san-jose-v5
Chris Nauroth
Apache Spark on Supercomputers: A Tale of the Storage Hierarchy with Costin I...
Apache Spark on Supercomputers: A Tale of the Storage Hierarchy with Costin I...
Databricks
HDF5 I/O Performance
HDF5 I/O Performance
The HDF-EOS Tools and Information Center
Tendances
(20)
Difference between hadoop 2 vs hadoop 3
Difference between hadoop 2 vs hadoop 3
Parallel HDF5 Developments
Parallel HDF5 Developments
Hadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the Field
HDFS Erasure Coding in Action
HDFS Erasure Coding in Action
Caching and Buffering in HDF5
Caching and Buffering in HDF5
ORC 2015: Faster, Better, Smaller
ORC 2015: Faster, Better, Smaller
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?
Compression Options in Hadoop - A Tale of Tradeoffs
Compression Options in Hadoop - A Tale of Tradeoffs
Performance Tuning in HDF5
Performance Tuning in HDF5
Reference Architecture: Architecting Ceph Storage Solutions
Reference Architecture: Architecting Ceph Storage Solutions
In-memory Caching in HDFS: Lower Latency, Same Great Taste
In-memory Caching in HDFS: Lower Latency, Same Great Taste
Red Hat® Ceph Storage and Network Solutions for Software Defined Infrastructure
Red Hat® Ceph Storage and Network Solutions for Software Defined Infrastructure
HDF Tools Tutorial
HDF Tools Tutorial
Optimizing Hive Queries
Optimizing Hive Queries
Big Lab Problems Solved with Spectrum Scale: Innovations for the Coral Program
Big Lab Problems Solved with Spectrum Scale: Innovations for the Coral Program
HDF4 and HDF5 Performance Preliminary Results
HDF4 and HDF5 Performance Preliminary Results
HDFS- What is New and Future
HDFS- What is New and Future
Hadoop operations-2015-hadoop-summit-san-jose-v5
Hadoop operations-2015-hadoop-summit-san-jose-v5
Apache Spark on Supercomputers: A Tale of the Storage Hierarchy with Costin I...
Apache Spark on Supercomputers: A Tale of the Storage Hierarchy with Costin I...
HDF5 I/O Performance
HDF5 I/O Performance
Similaire à hadoop architecture -Big data hadoop
IOD 2013 - Crunch Big Data in the Cloud with IBM BigInsights and Hadoop
IOD 2013 - Crunch Big Data in the Cloud with IBM BigInsights and Hadoop
Leons Petražickis
Hadoop HDFS by rohitkapa
Hadoop HDFS by rohitkapa
kapa rohit
Aziksa hadoop architecture santosh jha
Aziksa hadoop architecture santosh jha
Data Con LA
HADOOP.pptx
HADOOP.pptx
Bharathi567510
Lecture 2 part 1
Lecture 2 part 1
Jazan University
Hdfs architecture
Hdfs architecture
Aisha Siddiqa
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?
Uwe Printz
Hadoop training in bangalore
Hadoop training in bangalore
Kelly Technologies
Ozone and HDFS’s evolution
Ozone and HDFS’s evolution
DataWorks Summit
Big Data in Container; Hadoop Spark in Docker and Mesos
Big Data in Container; Hadoop Spark in Docker and Mesos
Heiko Loewe
Introduction to HDFS and MapReduce
Introduction to HDFS and MapReduce
Derek Chen
Hadoop data management
Hadoop data management
Subhas Kumar Ghosh
big data ppt.ppt
big data ppt.ppt
ssuserec53e73
Ozone and HDFS's Evolution
Ozone and HDFS's Evolution
DataWorks Summit
Tutorial Haddop 2.3
Tutorial Haddop 2.3
Atanu Chatterjee
Lecture10_CloudServicesModel_MapReduceHDFS.pptx
Lecture10_CloudServicesModel_MapReduceHDFS.pptx
NIKHILGR3
Topic 9a-Hadoop Storage- HDFS.pptx
Topic 9a-Hadoop Storage- HDFS.pptx
DanishMahmood23
Hadoop -HDFS.ppt
Hadoop -HDFS.ppt
RamyaMurugesan12
Ozone and HDFS’s evolution
Ozone and HDFS’s evolution
DataWorks Summit
Hadoop Architecture and HDFS
Hadoop Architecture and HDFS
Edureka!
Similaire à hadoop architecture -Big data hadoop
(20)
IOD 2013 - Crunch Big Data in the Cloud with IBM BigInsights and Hadoop
IOD 2013 - Crunch Big Data in the Cloud with IBM BigInsights and Hadoop
Hadoop HDFS by rohitkapa
Hadoop HDFS by rohitkapa
Aziksa hadoop architecture santosh jha
Aziksa hadoop architecture santosh jha
HADOOP.pptx
HADOOP.pptx
Lecture 2 part 1
Lecture 2 part 1
Hdfs architecture
Hdfs architecture
Hadoop 3.0 - Revolution or evolution?
Hadoop 3.0 - Revolution or evolution?
Hadoop training in bangalore
Hadoop training in bangalore
Ozone and HDFS’s evolution
Ozone and HDFS’s evolution
Big Data in Container; Hadoop Spark in Docker and Mesos
Big Data in Container; Hadoop Spark in Docker and Mesos
Introduction to HDFS and MapReduce
Introduction to HDFS and MapReduce
Hadoop data management
Hadoop data management
big data ppt.ppt
big data ppt.ppt
Ozone and HDFS's Evolution
Ozone and HDFS's Evolution
Tutorial Haddop 2.3
Tutorial Haddop 2.3
Lecture10_CloudServicesModel_MapReduceHDFS.pptx
Lecture10_CloudServicesModel_MapReduceHDFS.pptx
Topic 9a-Hadoop Storage- HDFS.pptx
Topic 9a-Hadoop Storage- HDFS.pptx
Hadoop -HDFS.ppt
Hadoop -HDFS.ppt
Ozone and HDFS’s evolution
Ozone and HDFS’s evolution
Hadoop Architecture and HDFS
Hadoop Architecture and HDFS
Plus de jasikadogra
Email marketing notes- excelrange
Email marketing notes- excelrange
jasikadogra
Email marketing notes- excelrange
Email marketing notes- excelrange
jasikadogra
Email marketing
Email marketing
jasikadogra
Email marketing notes- excelrange
Email marketing notes- excelrange
jasikadogra
Email marketing- Excelrange
Email marketing- Excelrange
jasikadogra
FB marketing notes- Excelrange
FB marketing notes- Excelrange
jasikadogra
Email marketing notes- excelrange
Email marketing notes- excelrange
jasikadogra
Email marketing
Email marketing
jasikadogra
Email marketing notes- excelrange
Email marketing notes- excelrange
jasikadogra
Email marketing notes- excelrange
Email marketing notes- excelrange
jasikadogra
Email marketing
Email marketing
jasikadogra
FB NOTES
FB NOTES
jasikadogra
Email marketing Notes - excelrange
Email marketing Notes - excelrange
jasikadogra
Email marketing Notes - excelrange
Email marketing Notes - excelrange
jasikadogra
Email marketing notes- excelrange
Email marketing notes- excelrange
jasikadogra
Email marketing Notes- Excelrange
Email marketing Notes- Excelrange
jasikadogra
Email marketing
Email marketing
jasikadogra
Facebook Ads Notes
Facebook Ads Notes
jasikadogra
Wordpress tutorial - Excelrange
Wordpress tutorial - Excelrange
jasikadogra
digital marketing curriculum
digital marketing curriculum
jasikadogra
Plus de jasikadogra
(20)
Email marketing notes- excelrange
Email marketing notes- excelrange
Email marketing notes- excelrange
Email marketing notes- excelrange
Email marketing
Email marketing
Email marketing notes- excelrange
Email marketing notes- excelrange
Email marketing- Excelrange
Email marketing- Excelrange
FB marketing notes- Excelrange
FB marketing notes- Excelrange
Email marketing notes- excelrange
Email marketing notes- excelrange
Email marketing
Email marketing
Email marketing notes- excelrange
Email marketing notes- excelrange
Email marketing notes- excelrange
Email marketing notes- excelrange
Email marketing
Email marketing
FB NOTES
FB NOTES
Email marketing Notes - excelrange
Email marketing Notes - excelrange
Email marketing Notes - excelrange
Email marketing Notes - excelrange
Email marketing notes- excelrange
Email marketing notes- excelrange
Email marketing Notes- Excelrange
Email marketing Notes- Excelrange
Email marketing
Email marketing
Facebook Ads Notes
Facebook Ads Notes
Wordpress tutorial - Excelrange
Wordpress tutorial - Excelrange
digital marketing curriculum
digital marketing curriculum
Dernier
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
Thiyagu K
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
jbellavia9
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
negromaestrong
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
PECB
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
TechSoup
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
Poh-Sun Goh
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
MaryamAhmad92
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
EduSkills OECD
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
Jayanti Pande
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
QucHHunhnh
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
Ramakrishna Reddy Bijjam
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
RAM LAL ANAND COLLEGE, DELHI UNIVERSITY.
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
Thiyagu K
microwave assisted reaction. General introduction
microwave assisted reaction. General introduction
Maksud Ahmed
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
TechSoup
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
Chris Hunter
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
Celine George
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
christianmathematics
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
Celine George
Dernier
(20)
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
Sociology 101 Demonstration of Learning Exhibit
Sociology 101 Demonstration of Learning Exhibit
Seal of Good Local Governance (SGLG) 2024Final.pptx
Seal of Good Local Governance (SGLG) 2024Final.pptx
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ICT role in 21st century education and it's challenges.
ICT role in 21st century education and it's challenges.
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
Python Notes for mca i year students osmania university.docx
Python Notes for mca i year students osmania university.docx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
microwave assisted reaction. General introduction
microwave assisted reaction. General introduction
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
Advanced Views - Calendar View in Odoo 17
Advanced Views - Calendar View in Odoo 17
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
hadoop architecture -Big data hadoop
1.
© 2011 IBM
Corporation Hadoop architecture IBM Information Management – Cloud Computing Center of Competence IBM Canada Labs
2.
© 2011 IBM
Corporation Agenda • Terminology review • HDFS • MapReduce • Type of nodes • Topology awareness 2
3.
© 2011 IBM
Corporation Agenda • Terminology review • HDFS • MapReduce • Type of nodes • Topology awareness 3
4.
© 2011 IBM
Corporation 4 Node 1 Terminology review
5.
© 2011 IBM
Corporation 5 Node 2 Node 1 Terminology review
6.
© 2011 IBM
Corporation 6 Node 2 Node n … Node 1 Terminology review
7.
© 2011 IBM
Corporation 7 Rack 1 Node 2 Node n … Node 1 Terminology review
8.
© 2011 IBM
Corporation 8 Rack 1 Node 2 Node n … Node 1 Node 2 Node n … Rack 2 Node 1 Terminology review
9.
© 2011 IBM
Corporation 9 Rack 1 Node 2 Node n … Node 1 Node 2 Node n … Rack 2 Node 1 Node 2 Node n … Rack n Node 1 … Terminology review
10.
© 2011 IBM
Corporation 10 Hadoop cluster Rack 1 Node 2 Node n … Node 1 Node 2 Node n … Rack 2 Node 1 Node 2 Node n … Rack n Node 1 … Terminology review
11.
© 2011 IBM
Corporation Hadoop architecture • Two main components: – Hadoop Distributed File System (HDFS) 11 – MapReduce Engine
12.
© 2011 IBM
Corporation Agenda • Terminology review • HDFS • MapReduce • Type of nodes • Topology awareness 12
13.
© 2011 IBM
Corporation Hadoop distributed file system (HDFS) 13 • Hadoop file system that runs on top of existing file system • Designed to handle very large files with streaming data access patterns • Uses blocks to store a file or parts of a file
14.
© 2011 IBM
Corporation HDFS - Blocks 14 • File Blocks – 64MB (default), 128MB (recommended) – compare to 4KB in UNIX – Behind the scenes, 1 HDFS block is supported by multiple operating system (OS) blocks 128 MB OS Blocks HDFS Block . . .
15.
© 2011 IBM
Corporation HDFS - Blocks 15 • Fits well with replication to provide fault tolerance and availability • Advantages of blocks: – Fixed size – easy to calculate how many fit on a disk – A file can be larger than any single disk in the network – If a file or a chunk of the file is smaller than the block size, only needed space is used. Eg: 420MB file is split as: 128 MB 128 MB128 MB 36 MB
16.
© 2011 IBM
Corporation HDFS - Replication • Blocks with data are replicated to multiple nodes • Allows for node failure without data loss 16 Node 1 Node 2 Node 3
17.
© 2011 IBM
Corporation Writing a file to HDFS 17
18.
© 2011 IBM
Corporation Writing a file to HDFS 18
19.
© 2011 IBM
Corporation Writing a file to HDFS 19
20.
© 2011 IBM
Corporation Writing a file to HDFS 20
21.
© 2011 IBM
Corporation Writing a file to HDFS 21
22.
© 2011 IBM
Corporation Writing a file to HDFS 22
23.
© 2011 IBM
Corporation Writing a file to HDFS 23
24.
© 2011 IBM
Corporation Writing a file to HDFS 24
25.
© 2011 IBM
Corporation Writing a file to HDFS 25
26.
© 2011 IBM
Corporation Writing a file to HDFS 26
27.
© 2011 IBM
Corporation Writing a file to HDFS 27
28.
© 2011 IBM
Corporation HDFS Command line interface 28 • File System Shell (fs) • Invoked as follows: hadoop fs <args> • Example: • Listing the current directory in hdfs hadoop fs –ls .
29.
© 2011 IBM
Corporation HDFS Command line interface 29 • FS shell commands take paths URIs as argument • URI format: scheme://authority/path • Scheme: • For the local filesystem, the scheme is file • For HDFS, the scheme is hdfs hadoop fs –copyFromLocal file://myfile.txt hdfs://localhost/user/keith/myfile.txt • Scheme and authority are optional • Defaults are taken from configuration file core-site.xml
30.
© 2011 IBM
Corporation HDFS Command line interface 30 • Many POSIX-like commands • cat, chgrp, chmod, chown, cp, du, ls, mkdir, mv, rm, stat, tail • Some HDFS-specific commands • copyFromLocal, copyToLocal, get, getmerge, put, setrep
31.
© 2011 IBM
Corporation HDFS – Specific commands 31 • copyFromLocal / put • Copy files from the local file system into fs hadoop fs -copyFromLocal <localsrc> .. <dst> hadoop fs -put <localsrc> .. <dst> Or
32.
© 2011 IBM
Corporation HDFS – Specific commands 32 • copyToLocal / get • Copy files from fs into the local file system hadoop fs -copyToLocal [-ignorecrc] [-crc] <src> <localdst> hadoop fs -get [-ignorecrc] [-crc] <src> <localdst> Or
33.
© 2011 IBM
Corporation HDFS – Specific commands 33 • getMerge • Get all the files in the directories that match the source file pattern • Merge and sort them to only one file on local fs • <src> is kept hadoop fs -getmerge <src> <localdst>
34.
© 2011 IBM
Corporation HDFS – Specific commands 34 • setRep • Set the replication level of a file. • The -R flag requests a recursive change of replication level for an entire tree. • If -w is specified, waits until new replication level is achieved. hadoop fs -setrep [-R] [-w] <rep> <path/file>
35.
© 2011 IBM
Corporation Agenda • Terminology review • HDFS • MapReduce • Type of nodes • Topology awareness 35
36.
© 2011 IBM
Corporation MapReduce engine 36 • Technology from Google • A MapReduce program consists of map and reduce functions • A MapReduce job is broken into tasks that run in parallel
37.
© 2011 IBM
Corporation Agenda • Terminology review • HDFS • MapReduce • Type of nodes • Topology awareness 37
38.
© 2011 IBM
Corporation Types of nodes - Overview 38 • HDFS nodes – NameNode – DataNode • MapReduce nodes – JobTracker – TaskTracker • There are other nodes not discussed in this course
39.
© 2011 IBM
Corporation Types of nodes - Overview 39
40.
© 2011 IBM
Corporation Types of nodes - Overview 40
41.
© 2011 IBM
Corporation Types of nodes - Overview 41
42.
© 2011 IBM
Corporation Types of nodes - Overview 42
43.
© 2011 IBM
Corporation Types of nodes - NameNode 43 • NameNode – Only one per Hadoop cluster – Manages the filesystem namespace and metadata – Single point of failure, but mitigated by writing state to multiple filesystems – Single point of failure: Don’t use inexpensive commodity hardware for this node, large memory requirements
44.
© 2011 IBM
Corporation Types of nodes - DataNode 44 • DataNode – Many per Hadoop cluster – Manages blocks with data and serves them to clients – Periodically reports to name node the list of blocks it stores – Use inexpensive commodity hardware for this node
45.
© 2011 IBM
Corporation Types of nodes - JobTracker 45 • JobTracker node – One per Hadoop cluster – Receives job requests submitted by client – Schedules and monitors MapReduce jobs on task trackers
46.
© 2011 IBM
Corporation Types of nodes - TaskTracker 46 • TaskTracker node – Many per Hadoop cluster – Executes MapReduce operations – Reads blocks from DataNodes
47.
© 2011 IBM
Corporation Agenda • Terminology review • HDFS • MapReduce • Type of nodes • Topology awareness 47
48.
© 2011 IBM
Corporation Topology awareness (or Rack awareness) 48 Bandwidth becomes progressively smaller in the following scenarios:
49.
© 2011 IBM
Corporation Topology awareness 49 Bandwidth becomes progressively smaller in the following scenarios: 1.Process on the same node.
50.
© 2011 IBM
Corporation Bandwidth becomes progressively smaller in the following scenarios: 1.Process on the same node 2.Different nodes on the same rack Topology awareness 50
51.
© 2011 IBM
Corporation Bandwidth becomes progressively smaller in the following scenarios: 1.Process on the same node 2.Different nodes on the same rack 3.Nodes on different racks in the same data center Topology awareness 51
52.
© 2011 IBM
Corporation Bandwidth becomes progressively smaller in the following scenarios: 1.Process on the same node 2.Different nodes on the same rack 3.Nodes on different racks in the same data center 4.Nodes in different data centers Topology awareness 52
53.
© 2011 IBM
Corporation Thank you!
Télécharger maintenant