Setup Hadoop 2.x (2.2.0) on Ubuntu
In this tutorial I am going to guide you through setting up a Hadoop 2.2.0 environment on
Ubuntu.

Prerequisites
$ sudo apt-get install openjdk-7-jdk
$ java -version
java version "1.7.0_25"
OpenJDK Runtime Environment (IcedTea 2.3.12) (7u25-2.3.12-4ubuntu3)
OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)
$ cd /usr/lib/jvm
$ sudo ln -s java-7-openjdk-amd64 jdk
$ sudo apt-get install openssh-server

Add Hadoop Group and User
$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser
$ sudo adduser hduser sudo

After the user is created, log back in to Ubuntu as hduser.

Set Up SSH Keys
$ ssh-keygen -t rsa -P ''
...
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
...
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ ssh localhost
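
If ssh localhost still asks for a password, one common cause is overly permissive modes on the key files, which sshd rejects when its StrictModes option is on (the default). A minimal sketch of tightening them follows. SSH_DIR is a hypothetical variable introduced for illustration; it defaults to a scratch directory so the snippet can be tried safely, and would be set to ~/.ssh for real use.

```shell
#!/usr/bin/env bash
# Sketch: tighten permissions on the SSH directory and authorized_keys.
# SSH_DIR is a hypothetical variable; it defaults to a scratch directory.
# For a real run use: SSH_DIR="$HOME/.ssh"
SSH_DIR="${SSH_DIR:-$(mktemp -d)}"

mkdir -p "$SSH_DIR"
touch "$SSH_DIR/authorized_keys"

# Only the owner may enter the directory or read/write the key list.
chmod 700 "$SSH_DIR"
chmod 600 "$SSH_DIR/authorized_keys"

ls -ld "$SSH_DIR" "$SSH_DIR/authorized_keys"
```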

Setup Hadoop Environment Variables
$ cd ~
$ vi .bashrc
Paste the following at the end of the file:
# Hadoop variables
export JAVA_HOME=/usr/lib/jvm/jdk/
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
###end of paste
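
As a quick sanity check, the variables above can be verified once the shell has been reloaded. The sketch below repeats the relevant exports so that it is self-contained, then confirms that both Hadoop directories ended up on PATH. The on_path helper is a hypothetical name introduced for illustration and is not part of Hadoop.

```shell
#!/usr/bin/env bash
# Same values as the .bashrc block above.
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin

# on_path DIR: succeed if DIR is one of the colon-separated PATH entries.
# (Hypothetical helper, not part of Hadoop.)
on_path() {
  case ":$PATH:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

for dir in "$HADOOP_INSTALL/bin" "$HADOOP_INSTALL/sbin"; do
  if on_path "$dir"; then
    echo "on PATH: $dir"
  else
    echo "missing from PATH: $dir"
  fi
done
```

In an interactive session, running "source ~/.bashrc" applies the new variables without logging out.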

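The following steps assume the Hadoop 2.2.0 release has already been unpacked to /usr/local/hadoop (the HADOOP_INSTALL path above).
$ cd /usr/local/hadoop/etc/hadoop
$ vi hadoop-env.sh
# Modify JAVA_HOME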
export JAVA_HOME=/usr/lib/jvm/jdk/
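
The same edit can be made without opening an editor. The sketch below assumes GNU sed (as shipped with Ubuntu) and works on a scratch file so it can be tried safely. ENV_FILE is a hypothetical variable that would point at /usr/local/hadoop/etc/hadoop/hadoop-env.sh on a real system, and the stand-in line written to the scratch file is only an assumption about what the stock file contains.

```shell
#!/usr/bin/env bash
# ENV_FILE is hypothetical; by default a scratch file stands in for hadoop-env.sh.
ENV_FILE="${ENV_FILE:-$(mktemp)}"

# Stand-in content (an assumption about the stock file); written only if the file is empty.
if [ ! -s "$ENV_FILE" ]; then
  printf '%s\n' '# The java implementation to use.' 'export JAVA_HOME=${JAVA_HOME}' > "$ENV_FILE"
fi

# Replace whatever JAVA_HOME export is present with the symlink path used in this tutorial.
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/lib/jvm/jdk/|' "$ENV_FILE"

# Show the resulting line.
grep '^export JAVA_HOME=' "$ENV_FILE"
```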

Log back in to Ubuntu as hduser and check the Hadoop version
$ hadoop version
Hadoop 2.2.0
Subversion https://svn.apache.org/repos/asf/hadoop/common -r 1529768
Compiled by hortonmu on 2013-10-07T06:28Z
Compiled with protoc 2.5.0
From source with checksum 79e53ce7994d1628b240f09af91e1af4
This command was run using /usr/local/hadoop2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar

At this point, Hadoop is installed.

Configure Hadoop
$ cd /usr/local/hadoop/etc/hadoop
$ vi core-site.xml
# Paste the following between the <configuration> tags
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
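
For reference, a complete core-site.xml with the property in place might look like the sketch below, which writes the file with a here-document. CONF_DIR is a hypothetical variable that defaults to a scratch directory; on the installation described here it would be /usr/local/hadoop/etc/hadoop. Hadoop 2.x treats fs.default.name as a deprecated alias of fs.defaultFS; the sketch keeps the name used in this tutorial.

```shell
#!/usr/bin/env bash
# CONF_DIR is hypothetical; it defaults to a scratch directory.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# Quoted 'EOF' keeps the XML literal (no shell expansion inside).
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

cat "$CONF_DIR/core-site.xml"
```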

$ vi yarn-site.xml
# Paste the following between the <configuration> tags
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

$ mv mapred-site.xml.template mapred-site.xml
$ vi mapred-site.xml
# Paste the following between the <configuration> tags
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

$ cd ~
$ mkdir -p mydata/hdfs/namenode
$ mkdir -p mydata/hdfs/datanode
$ cd /usr/local/hadoop/etc/hadoop
$ vi hdfs-site.xml
# Paste the following between the <configuration> tags
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hduser/mydata/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hduser/mydata/hdfs/datanode</value>
</property>
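
The paths above are hard-coded for a user named hduser. As a sketch, the directories and the matching file: URIs can be derived from a base directory instead. DATA_ROOT and CONF_DIR are hypothetical variables that default to scratch locations here; on the setup in this tutorial they would be /home/hduser/mydata/hdfs and /usr/local/hadoop/etc/hadoop.

```shell
#!/usr/bin/env bash
# DATA_ROOT and CONF_DIR are hypothetical; both default to scratch locations.
DATA_ROOT="${DATA_ROOT:-$(mktemp -d)/mydata/hdfs}"
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# Create the storage directories referenced by the config.
mkdir -p "$DATA_ROOT/namenode" "$DATA_ROOT/datanode"

# Unquoted EOF so that $DATA_ROOT is expanded inside the here-document.
cat > "$CONF_DIR/hdfs-site.xml" <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:$DATA_ROOT/namenode</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:$DATA_ROOT/datanode</value>
  </property>
</configuration>
EOF

cat "$CONF_DIR/hdfs-site.xml"
```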

Format Namenode
hduser@ubuntu40:~$ hdfs namenode -format
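
Formatting initializes the NameNode metadata directory, so running it again on an existing installation discards the current file system metadata. A minimal guard, sketched below, skips the format step when the directory already looks formatted. It assumes a formatted NameNode directory contains a current/VERSION file. The needs_format helper and the NAME_DIR variable are hypothetical names, and the demonstration runs against a scratch directory rather than a real NameNode.

```shell
#!/usr/bin/env bash
# needs_format DIR: succeed if DIR has no current/VERSION file yet.
# (Hypothetical helper, not part of Hadoop.)
needs_format() {
  [ ! -f "$1/current/VERSION" ]
}

# NAME_DIR is hypothetical; in this tutorial it would be
# /home/hduser/mydata/hdfs/namenode.
NAME_DIR="${NAME_DIR:-$(mktemp -d)}"

if needs_format "$NAME_DIR"; then
  echo "not formatted yet: run 'hdfs namenode -format'"
else
  echo "already formatted: skipping format"
fi
```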

Start Hadoop Service
$ start-dfs.sh
....
$ start-yarn.sh
....
hduser@ubuntu40:~$ jps
If everything is successful, you should see the following services running:
2583 DataNode
2970 ResourceManager
3461 Jps
3177 NodeManager
2361 NameNode
2840 SecondaryNameNode
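
This check can also be scripted. The sketch below reads jps-style output and reports any of the five daemons listed above that are missing. check_daemons is a hypothetical helper; the sample input is copied from the output above, and on a live system the real command would be piped in instead (jps | check_daemons).

```shell
#!/usr/bin/env bash
# check_daemons: read "PID Name" lines on stdin; fail if an expected daemon is absent.
# (Hypothetical helper, not part of Hadoop.)
check_daemons() {
  local running missing=0 name
  # Keep only the second column (the process name).
  running="$(awk '{print $2}')"
  for name in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # -x matches whole lines, so NameNode does not match SecondaryNameNode.
    if ! printf '%s\n' "$running" | grep -qx "$name"; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

# Sample input copied from the jps output above.
sample='2583 DataNode
2970 ResourceManager
3461 Jps
3177 NodeManager
2361 NameNode
2840 SecondaryNameNode'

if printf '%s\n' "$sample" | check_daemons; then
  echo "all Hadoop daemons are running"
fi
```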

Run Hadoop Example
hduser@ubuntu:~$ cd /usr/local/hadoop
hduser@ubuntu:/usr/local/hadoop$ hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 5
Number of Maps = 2
Samples per Map = 5
13/10/21 18:41:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Starting Job
13/10/21 18:41:04 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
13/10/21 18:41:04 INFO input.FileInputFormat: Total input paths to process : 2
13/10/21 18:41:04 INFO mapreduce.JobSubmitter: number of splits:2
13/10/21 18:41:04 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
...

Hadoop FileSystem (HDFS) Tutorial
In this tutorial I will show some common commands for HDFS operations.
If you don't have Hadoop set up on your Linux machine, you can follow the Hadoop setup guide above.
Log in to Linux; "hduser" is the login used in the following examples.

Start Hadoop if it's not running
$ start-dfs.sh
....
$ start-yarn.sh

Create someFile.txt in your home directory
hduser@ubuntu:~$ vi someFile.txt
Paste any text you want into the file and save it.

Create Home Directory In HDFS (If it doesn't exist)
hduser@ubuntu:~$ hadoop fs -mkdir -p /user/hduser

Copy the file someFile.txt from the local disk to the user’s home directory in HDFS
hduser@ubuntu:~$ hadoop fs -copyFromLocal someFile.txt someFile.txt
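
The destination in this command is a relative path. HDFS resolves relative paths against the user's home directory, /user/<username>, which is why the file is later addressed as /user/hduser/someFile.txt. The following sketch only mimics that rule in plain shell to make it concrete; hdfs_resolve is a hypothetical helper and does not call Hadoop.

```shell
#!/usr/bin/env bash
# hdfs_resolve USER PATH: print the absolute HDFS path that PATH refers to.
# Absolute paths are returned unchanged; relative ones are placed under /user/USER.
# (Hypothetical helper that only illustrates the rule; it does not talk to HDFS.)
hdfs_resolve() {
  local user="$1" path="$2"
  case "$path" in
    /*) printf '%s\n' "$path" ;;
    *)  printf '/user/%s/%s\n' "$user" "$path" ;;
  esac
}

hdfs_resolve hduser someFile.txt               # prints /user/hduser/someFile.txt
hdfs_resolve hduser /user/hduser/someFile.txt  # prints /user/hduser/someFile.txt
```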

Get a directory listing of the user’s home directory in HDFS
hduser@ubuntu:~$ hadoop fs -ls
Found 1 items
-rw-r--r--   1 hduser supergroup          5 2013-10-27 17:57 someFile.txt

Display the contents of the HDFS file /user/hduser/someFile.txt
hduser@ubuntu:~$ hadoop fs -cat /user/hduser/someFile.txt

Get a directory listing of the HDFS root directory
hduser@ubuntu:~$ hadoop fs -ls /

Copy that file to the local disk as someFile2.txt
hduser@ubuntu:~$ hadoop fs -copyToLocal /user/hduser/someFile.txt someFile2.txt

Delete the file from HDFS
hduser@ubuntu:~$ hadoop fs -rm someFile.txt
Deleted someFile.txt
