1. Decide On Cluster Layout
There are four Hadoop components that we want to spread out
across the cluster:
◦ DataNode – actually stores and manages the data blocks;
◦ NameNode – acts as a catalogue service, recording which data is stored
where;
◦ JobTracker – tracks and manages submitted MapReduce jobs;
◦ TaskTracker – low-level worker that is issued tasks by the JobTracker.
Let's go with the following setup. This is a fairly typical layout:
DataNodes and TaskTrackers spread across the cluster, with a single
instance of the NameNode and JobTracker:
Node      Hostname            Components
Master    ec2-23-22-133-70    NameNode, JobTracker
Slave 1   ec2-23-20-53-36     DataNode, TaskTracker
Slave 2   ec2-184-73-42-163   DataNode, TaskTracker
2a. Configure Server Names
Log out of all of the machines and log back into the master
server.
The Hadoop configuration is located here on the server:
cd /home/ubuntu/hadoop-1.0.3/conf
Open the file ‘masters’ and replace the word ‘localhost’ with
the hostname of the server that you have allocated as the master:
cd /home/ubuntu/hadoop-1.0.3/conf
vi masters
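If you would rather not edit the file interactively, the same change can be made from the shell. A minimal sketch, using the example master hostname from the layout above (substitute your own hostname and conf path):

```shell
# Overwrite the 'masters' file with the master's hostname.
# ec2-23-22-133-70 is the example host from the layout above.
CONF_DIR=/home/ubuntu/hadoop-1.0.3/conf
echo "ec2-23-22-133-70" > "$CONF_DIR/masters"

# Confirm the file now contains exactly the master hostname.
cat "$CONF_DIR/masters"
```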
Open the file ‘slaves’ and replace the word ‘localhost’ with the
hostnames of the two servers you have allocated as slaves, one
per line:
cd /home/ubuntu/hadoop-1.0.3/conf
vi slaves
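As with ‘masters’, this edit can be scripted instead of done in vi. A minimal sketch, using the two example slave hostnames from the layout above (substitute your own):

```shell
# Overwrite the 'slaves' file with the slave hostnames, one per line.
# The two hosts below are the example slaves from the layout above.
CONF_DIR=/home/ubuntu/hadoop-1.0.3/conf
printf '%s\n' "ec2-23-20-53-36" "ec2-184-73-42-163" > "$CONF_DIR/slaves"

# Confirm the file lists both slaves on separate lines.
cat "$CONF_DIR/slaves"
```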