YouTube Link: https://youtu.be/aBCDy-dJE0Y
**Big Data Hadoop Certification Training: https://www.edureka.co/big-data-hadoop-training-certification **
This Edureka PPT on Hadoop Cluster provides detailed knowledge of Hadoop and its architecture, and will help you set up a multi-node cluster on your own. It covers the following topics:
What is a Hadoop Cluster?
Advantages of a Hadoop Cluster
Facebook’s Hadoop Cluster
Hadoop Cluster Architecture
Setting up a Hadoop Cluster
Hadoop Cluster Management System
3. www.edureka.co
WHAT IS A HADOOP CLUSTER?
ADVANTAGES OF HADOOP CLUSTER
FACEBOOK HADOOP CLUSTER
ARCHITECTURE OF HADOOP CLUSTER
SETTING UP A HADOOP CLUSTER
MANAGING A HADOOP CLUSTER
6. www.edureka.co
WHAT IS A CLUSTER?
Computer Cluster
• A computer cluster is a set of loosely or tightly connected computers.
• They work together so that, in many respects, they can be viewed as a single system.
• Computer clusters have each node set to perform the same task, controlled and scheduled by software.
7. www.edureka.co
WHAT IS A HADOOP CLUSTER?
Master
Slaves
8. www.edureka.co
WHAT IS A HADOOP CLUSTER?
Hadoop Cluster
• A Hadoop cluster is a set of connected commodity computers.
• They work together so that, in many respects, they can be viewed as a single system.
• Hadoop clusters have each node set to perform the same task, controlled and scheduled by the Master.
10. www.edureka.co
Advantages of Hadoop Cluster
The major advantages of a Hadoop cluster are as follows:
• Scalable
• Cost-effective
• Flexible
• Fast
• Resilient to failure
12. www.edureka.co
FACEBOOK HADOOP CLUSTER
• Facebook’s cluster is known as the beefiest Hadoop cluster.
• It comprises 4,000 machines and stores hundreds of millions of gigabytes of data.
• Facebook itself launched in 2004.
• Facebook has 2.38 billion accounts.
13. www.edureka.co
Facebook Hadoop Cluster
• Developers can freely write MapReduce programs in any language.
• SQL has been integrated to process extensive data sets.
• Use cases range from searching, log processing, and recommendation systems to data warehousing and video and image analysis.
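The "any language" point is typically realised through Hadoop Streaming, where the mapper and reducer are ordinary programs that exchange key/value records. The following is a minimal word-count sketch of that map, shuffle/sort, reduce flow. It runs locally in one process rather than on a cluster, and the function names `mapper`, `reducer`, and `run_local` are illustrative assumptions, not part of any Hadoop API.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Emit a (word, 1) pair for every word in the input lines."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(pairs):
    """Sum the counts per word; pairs must arrive sorted by key."""
    for word, group in groupby(pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce in a single process."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))


if __name__ == "__main__":
    print(run_local(["Hadoop stores data", "Hadoop processes data"]))
```

On a real cluster the sort between the two phases is performed by the framework, which is why the reducer here only assumes that equal keys arrive next to each other.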
18. www.edureka.co
NAMENODE
DATANODES
HADOOP CLUSTER ARCHITECTURE
19. www.edureka.co
Name Node
• Master daemon that manages the Data Nodes.
• Records the metadata of all the files.
• Receives a heartbeat and a block report from the Data Nodes.
Data Node
• Slave daemon that runs on each slave machine.
• The actual data is stored on the Data Nodes.
• Responsible for serving read and write requests.
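The heartbeat and block-report bullets can be pictured with a toy model. This is a sketch under stated assumptions, not Hadoop's implementation: the class `NameNodeSketch`, its method names, and the 30-second timeout are all hypothetical and chosen only for illustration.

```python
class NameNodeSketch:
    """Toy model of a NameNode tracking DataNode liveness and block locations."""

    def __init__(self, timeout_seconds=30.0):
        self.timeout_seconds = timeout_seconds  # assumed value, not a Hadoop default
        self.last_heartbeat = {}   # datanode id -> time of last heartbeat
        self.block_locations = {}  # block id -> set of datanode ids

    def receive_heartbeat(self, datanode, now):
        """Record that a DataNode reported in at time `now`."""
        self.last_heartbeat[datanode] = now

    def receive_block_report(self, datanode, block_ids):
        """Replace what is known about the blocks held by one DataNode."""
        for holders in self.block_locations.values():
            holders.discard(datanode)
        for block_id in block_ids:
            self.block_locations.setdefault(block_id, set()).add(datanode)

    def live_datanodes(self, now):
        """DataNodes whose last heartbeat is within the timeout."""
        return {dn for dn, seen in self.last_heartbeat.items()
                if now - seen <= self.timeout_seconds}

    def readable_replicas(self, block_id, now):
        """Live DataNodes that reported holding the given block."""
        return self.block_locations.get(block_id, set()) & self.live_datanodes(now)
```

The point of the sketch is the division of labour: the master holds only metadata (who is alive, which node has which block), while the blocks themselves stay on the Data Nodes that serve reads and writes.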
[Figure: NameNode and Secondary NameNode. The FS-image and Edit Log are combined into a final FS-image, and a new Edit Log is started.]
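The labels on this slide (FS-image, Edit Log, new Edit Log, final FS-image) describe a checkpoint: logged changes are folded into the namespace snapshot and logging restarts from empty. A minimal sketch of that idea follows, assuming a namespace modelled as a dictionary of path to block ids and an edit log of simple tuples. The operation names and both function names are hypothetical and do not reflect Hadoop's on-disk formats.

```python
def apply_edit_log(fs_image, edit_log):
    """Return a new namespace snapshot with the logged operations applied."""
    merged = dict(fs_image)  # copy so the old FS-image is left untouched
    for op, path, *args in edit_log:
        if op == "create":
            merged[path] = args[0]              # args[0] is the list of block ids
        elif op == "delete":
            merged.pop(path, None)
        elif op == "rename":
            merged[args[0]] = merged.pop(path)  # args[0] is the new path
        else:
            raise ValueError(f"unknown operation: {op}")
    return merged


def checkpoint(fs_image, edit_log):
    """Merge the edit log into the image and start a fresh, empty edit log."""
    final_image = apply_edit_log(fs_image, edit_log)
    new_edit_log = []
    return final_image, new_edit_log
```

Keeping the merge separate from the live log is what lets the NameNode keep accepting changes while the merge is carried out elsewhere.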
HADOOP CLUSTER ARCHITECTURE
20. www.edureka.co
• YARN (Yet Another Resource Negotiator) provides the ability to run non-MapReduce applications.
• The YARN framework is responsible for cluster resource management.
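Cluster resource management can be illustrated with a deliberately simplified allocator that hands out memory "containers" to applications of any type. This is only a first-fit toy under stated assumptions: real YARN schedulers are considerably more elaborate, and the function `allocate_containers` and its data shapes are hypothetical.

```python
def allocate_containers(node_free_mb, requests):
    """Grant each (app, memory_mb) request on the first node with enough free memory.

    Returns (granted, pending), where granted is a list of (app, node, memory_mb).
    """
    free = dict(node_free_mb)  # work on a copy of the free-memory table
    granted, pending = [], []
    for app, memory_mb in requests:
        node = next((n for n, mb in free.items() if mb >= memory_mb), None)
        if node is None:
            pending.append((app, memory_mb))   # no node can host it right now
        else:
            free[node] -= memory_mb
            granted.append((app, node, memory_mb))
    return granted, pending
```

Because the allocator only sees resource requests, it does not care whether the requester is a MapReduce job or some other kind of application, which is the property the first bullet highlights.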
HADOOP CLUSTER ARCHITECTURE
23. www.edureka.co
• The Rack Awareness algorithm reduces latency and provides fault tolerance by replicating data blocks across racks.
• It states that the first replica of a block is stored on a local rack and the next two replicas are stored on a different (remote) rack.
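The placement rule above can be sketched as a small function: one replica on the writer's rack, two on distinct nodes of a single remote rack. The topology format and the name `place_replicas` are assumptions made for illustration, and the sketch ignores factors a real cluster would weigh, such as node load and free space.

```python
import random


def place_replicas(topology, writer_rack, rng=random):
    """Pick three nodes: one on the writer's rack, two on one remote rack.

    topology maps rack name -> list of node names.
    """
    local_nodes = topology[writer_rack]
    # Only remote racks with at least two nodes can hold the other two replicas.
    remote_racks = [rack for rack, nodes in topology.items()
                    if rack != writer_rack and len(nodes) >= 2]
    if not local_nodes or not remote_racks:
        raise ValueError("need a local node and a remote rack with two nodes")
    first = rng.choice(local_nodes)
    remote_rack = rng.choice(remote_racks)
    second, third = rng.sample(topology[remote_rack], 2)
    return [first, second, third]
```

With this layout the loss of a whole rack still leaves at least one replica available, while the local copy keeps the first write close to the client.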
HADOOP CLUSTER ARCHITECTURE
32. www.edureka.co
MANAGE A HADOOP CLUSTER
33. www.edureka.co
• Hadoop offers both a command line interface and an API.
• It does not require any specific tool for management and monitoring.
• Some of the available options are:
1. Ambari
2. Hortonworks