Installing Apache Hadoop the Easy Way

Installing Hadoop on Ubuntu 16
INSTALL OPEN JDK
1

Install Java
 Do I have Java? In a terminal, type: java -version
 If I see the output below, then I don’t have Java installed; follow the instructions on the next slide
10/24/2016 Enrique Davila, Big Data Instructor, enrique.davila@gmail.com
2

Install Java
 Type:
 sudo apt-get install openjdk-8-jdk
 Type Y to continue the installation (it will take a while to complete)
3

Do I have java?
 To confirm that Java is installed on my Ubuntu system, type:
 java -version
 You will see the output below
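The check from the Java slides can also be scripted. The following is a minimal sketch, assuming a Bash shell; the helper name have_cmd is hypothetical and not part of any tool:

```shell
#!/usr/bin/env bash
# have_cmd is a hypothetical helper: it succeeds if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd java; then
  # java -version writes to stderr, so redirect it to stdout
  java -version 2>&1
else
  echo "Java not found; install it with: sudo apt-get install openjdk-8-jdk"
fi
```

Running the script before and after the installation shows both branches.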
4

Install OpenSSH
 Installing the OpenSSH server is mandatory:
 sudo apt-get install openssh-server
 Once the SSH server is installed, generate the keys with the command below:
 ssh-keygen -t rsa
 At the prompt for the file, press Enter
 At the prompt for the passphrase, press Enter
 At the prompt to enter the same passphrase again, press Enter
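Pressing Enter at every prompt produces a key in the default location with an empty passphrase. The same result can be sketched without prompts, assuming the standard ssh-keygen options -N (passphrase), -f (output file) and -q (quiet); the helper name make_key is hypothetical:

```shell
#!/usr/bin/env bash
# make_key is a hypothetical helper: it creates an RSA key pair with an empty
# passphrase at the given path, unless a key already exists there.
make_key() {
  local key_path="$1"
  if [ -f "$key_path" ]; then
    echo "Key already exists: $key_path"
    return 0
  fi
  # Create the parent directory with owner-only permissions if it is missing
  mkdir -p -m 700 "$(dirname "$key_path")"
  # -N "" sets an empty passphrase, -f sets the output file, -q suppresses output
  ssh-keygen -q -t rsa -N "" -f "$key_path"
}

# Example (default key location): make_key "$HOME/.ssh/id_rsa"
```

The existence check avoids overwriting a key that is already in use.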
5

SSH Keys
 Now we will copy the key to the user and host; in my case the user is hadoop and the host is hadoopdev:
 ssh-copy-id hadoop@hadoopdev
6

Download and Install
Hadoop
DOWNLOAD HADOOP FROM APACHE WEB PAGE
7

Download Apache Hadoop
 In the terminal, type the following command to create a new folder inside my Linux home folder, in this case /home/hadoop/:
 mkdir hadoop_install
 Then go into the new folder:
 cd hadoop_install
 And run the command below:
 wget http://www-eu.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
8

Download Apache Hadoop
 You will see output showing the progress of the download
9

Extract the Hadoop archive
 Once the download is complete, type the following command:
 tar -xvf hadoop-2.7.3.tar.gz
 You will now see 2 items in the folder; the new directory is called hadoop-2.7.3:
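To confirm what the archive contains, the extraction can be wrapped in a small function. This is a sketch under the assumption that the archive is gzip-compressed; the helper name unpack is hypothetical:

```shell
#!/usr/bin/env bash
# unpack is a hypothetical helper: it extracts a .tar.gz archive into a
# directory and prints the top-level names stored in the archive.
unpack() {
  local archive="$1" dest="$2"
  mkdir -p "$dest"
  tar -xzf "$archive" -C "$dest"
  # List archive members and keep only the first path component
  # (for the Hadoop archive this should be hadoop-2.7.3)
  tar -tzf "$archive" | cut -d/ -f1 | sort -u
}

# Example: unpack ~/hadoop_install/hadoop-2.7.3.tar.gz ~/hadoop_install
```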
10

Setup bashrc
 This is the Java location (very important for the next steps):
 Edit .bashrc
 Type:
 sudo gedit ~/.bashrc
11

Setup ~/.bashrc
 Add these lines to .bashrc
 Note that the previous slide shows the Java path; JAVA_HOME in .bashrc must point to the actual Java path
 #HADOOP VARIABLES START
 export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
 export HADOOP_INSTALL=/home/hadoop/hadoop_install/hadoop-2.7.3
 export PATH=$PATH:$HADOOP_INSTALL/bin
 export PATH=$PATH:$HADOOP_INSTALL/sbin
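The same variables can be appended from the terminal instead of an editor. The sketch below assumes Bash and takes the Java path and the Hadoop directory as arguments; the Hadoop directory should be the one that contains bin and sbin. The helper name add_hadoop_vars is hypothetical:

```shell
#!/usr/bin/env bash
# add_hadoop_vars is a hypothetical helper: it appends the Hadoop variables to
# the given rc file once, using the START marker to detect a previous run.
add_hadoop_vars() {
  local rc_file="$1" java_home="$2" hadoop_dir="$3"
  if grep -q '^#HADOOP VARIABLES START' "$rc_file" 2>/dev/null; then
    echo "Hadoop variables already present in $rc_file"
    return 0
  fi
  # \$PATH and \$HADOOP_INSTALL stay literal so they expand when the rc file is sourced
  cat >> "$rc_file" <<EOF
#HADOOP VARIABLES START
export JAVA_HOME=$java_home
export HADOOP_INSTALL=$hadoop_dir
export PATH=\$PATH:\$HADOOP_INSTALL/bin
export PATH=\$PATH:\$HADOOP_INSTALL/sbin
EOF
}

# Example: add_hadoop_vars ~/.bashrc /usr/lib/jvm/java-1.8.0-openjdk-amd64 <hadoop directory>
```

Checking for the marker first keeps the file from collecting duplicate PATH entries when the function runs more than once.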
12

Testing hadoop installation
 Type the following command to apply the ~/.bashrc changes (no need to restart):
 source ~/.bashrc
 Then type the command below (if you see output like this at this point, you’re doing well):
 hadoop version
13

Setup single node
CONFIGURE HADOOP AS A SINGLE NODE
14

Point the Hadoop conf file to your Java
 Go to the path:
 /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop
 Edit the file:
 sudo gedit hadoop-env.sh
15

Modifying hadoop-env.sh
 In hadoop-env.sh, set the value of JAVA_HOME to the same Java path used in ~/.bashrc
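The edit can also be made without opening an editor. This is a sketch, assuming the file already contains a line that starts with export JAVA_HOME= and that the Java path contains no | or & characters; the helper name set_java_home is hypothetical:

```shell
#!/usr/bin/env bash
# set_java_home is a hypothetical helper: it rewrites the JAVA_HOME line of a
# hadoop-env.sh style file and prints the resulting line.
set_java_home() {
  local env_file="$1" java_home="$2"
  # '|' is the sed delimiter because the path itself contains '/';
  # -i.bak edits in place and keeps a backup copy next to the file
  sed -i.bak "s|^export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$env_file"
  grep '^export JAVA_HOME=' "$env_file"
}

# Example: set_java_home hadoop-env.sh /usr/lib/jvm/java-1.8.0-openjdk-amd64
```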
16

Modify core-site.xml
 Create a folder called tmp in /home/hadoop/hadoop_install
 Add the following text to core-site.xml; the file is in the path: /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop_install/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system.</description>
</property>
</configuration>
17

Modify mapred-site.xml
 By default there is a file called mapred-site.xml.template; rename it to mapred-site.xml and then add the code below:
 The file is in the path: /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs at.</description>
</property>
</configuration>
18

Modify hdfs-site.xml
 We need to create 2 new folders, which will hold the name node and data node data:
 I placed these 2 folders in: /home/hadoop/hadoop_install/
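Creating the two folders can be sketched as follows. The folder names namenode and datanode are taken from the hdfs-site.xml values in this deck; the helper name make_hadoop_dirs is hypothetical:

```shell
#!/usr/bin/env bash
# make_hadoop_dirs is a hypothetical helper: it creates the namenode and
# datanode folders under the given base directory.
make_hadoop_dirs() {
  local base="$1"
  # -p creates missing parents and does not fail if the folders already exist
  mkdir -p "$base/namenode" "$base/datanode"
}

# Example: make_hadoop_dirs /home/hadoop/hadoop_install
```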
19

Modify hdfs-site.xml
Add the code below to hdfs-site.xml; the paths for the namenode and datanode are the 2 new folders you just created on the previous slide.
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/hadoop/hadoop_install/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/hadoop/hadoop_install/datanode</value>
</property>
</configuration>
#hdfs-site.xml is located on the path: /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop
20

Format the namenode
 Run the following command:
 hadoop namenode -format
21

Format the namenode part 2
 If everything is OK, you will see the message below:
22

Running Hadoop Single node
 Run the command:
 start-all.sh
 Then execute the command:
 jps
 You will see the following output
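The jps output can be checked automatically. The sketch below assumes a Hadoop 2.x setup in which the start script launches the HDFS and YARN daemons named in the loop; that list is an assumption and should be adjusted if the setup differs. The helper name missing_daemons is hypothetical:

```shell
#!/usr/bin/env bash
# missing_daemons is a hypothetical helper: it reads jps output on stdin and
# prints the name of every expected daemon that is not listed.
missing_daemons() {
  local jps_out name
  jps_out="$(cat)"
  # Assumed daemon list for a single-node Hadoop 2.x cluster
  for name in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # jps prints "<pid> <name>"; compare the whole second field so that
    # NameNode does not match SecondaryNameNode
    if ! printf '%s\n' "$jps_out" |
        awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
      echo "$name"
    fi
  done
}

# Example: jps | missing_daemons
```

An empty result means every expected daemon is running.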
23

Stop Cluster
 To stop the cluster, run: stop-all.sh
24

Web Interface: localhost:50070
 In the browser go to: localhost:50070
25

Applies to:
 This installation runs under:
 Ubuntu 16
 Hadoop 2.7.3
 Virtual Machine:
 2 Processors
 2 GB RAM
 2 Network Interfaces: the 1st as Bridge, the 2nd as NAT
26

Need help?
 Contact name:
 Enrique Davila Gutierrez
 Enrique.davila@Gmail.com
27
