Department of Computer Science, Tunghai University

Hadoop 2.2.0
Multi-node Installation on Ubuntu

康志強 G02357004
2014/1/3
Hadoop 2.2.0 (multi-node) Installation on Ubuntu
1. Introduction
2. Installation Environment
3. Installation Steps
   1. Environment overview
   2. Settings
   3. Add IP-to-hostname mappings for the three machines
   4. Set up passwordless SSH login from cloud001 to cloud002 and cloud003
   5. Install the JDK
   6. Disable the firewall
   7. Install Hadoop 2.2
   8. Start Hadoop 2.2
4. References
1. Introduction
(omitted)

2. Installation Environment

Host machine:
  CPU            Intel Core i7-4470 3.40GHz
  RAM            8 GB * 2
  Disk           128 GB SSD + 1 TB HDD
  Network        100/1000 Mbps Ethernet
  OS             Windows 7 64-bit
  VM platform    VMware(R) Workstation 10.0.0 build-1295980

Guest VMs:
  Guest OS       ubuntu-12.04.3-desktop-amd64
  VM RAM         2.0 GB
  VM disk        40 GB

3. Installation Steps

1. Environment overview
Here we build a cluster of three machines:

  Hostname   User/Password   OS                      Cluster roles
  cloud001   hduser/adm123   ubuntu-12.04.3 64-bit   NameNode, Secondary NameNode, ResourceManager
  cloud002   hduser/adm123   ubuntu-12.04.3 64-bit   DataNode, NodeManager
  cloud003   hduser/adm123   ubuntu-12.04.3 64-bit   DataNode, NodeManager

2. Settings
(1) Change the hostname to cloud001:
vim /etc/hostname
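After the edit, /etc/hostname should contain nothing but the machine's name, on a single line:

```
cloud001
```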

(2) Grant hduser sudo privileges:
vim /etc/sudoers
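The slide showing the sudoers change is not reproduced here; a typical edit grants hduser full sudo rights with a line like the following (an assumption about what the original screenshot showed; also prefer `visudo` over editing the file directly, since it validates the syntax before saving):

```
hduser  ALL=(ALL:ALL) ALL
```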

(3) Bring the system up to date:
sudo apt-get update
sudo apt-get upgrade

In practice, set up cloud001 completely first, clone it to create cloud002 and cloud003, and then just change each clone's hostname.


3. Add IP-to-hostname mappings for the three machines
hduser@cloud001:~$ vim /etc/hosts
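A typical /etc/hosts for this cluster looks like the following; the 192.168.1.x addresses are placeholders, so substitute the actual IPs of your three VMs:

```
127.0.0.1       localhost
192.168.1.101   cloud001
192.168.1.102   cloud002
192.168.1.103   cloud003
```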


4. Set up passwordless SSH login from cloud001 to cloud002 and cloud003
(1) Install SSH:
sudo apt-get install ssh

(2) Set up passwordless login locally by running the commands below from the home directory.
Create the .ssh directory and enter it:
hduser@ubuntu:~$ mkdir .ssh
hduser@ubuntu:~$ cd .ssh
Generate a key pair (just press Enter at every prompt):
hduser@ubuntu:~/.ssh$ ssh-keygen -t rsa
Append id_rsa.pub to the authorized keys:
hduser@ubuntu:~/.ssh$ cat id_rsa.pub >> authorized_keys
Restart the SSH service:
hduser@ubuntu:~/.ssh$ service ssh restart
Test:
ssh localhost
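One pitfall the steps above skip: sshd refuses key authentication when ~/.ssh or authorized_keys is group- or world-writable, which is a common reason `ssh localhost` still prompts for a password. A minimal sketch of the key setup with the permission fixes included (run as hduser; the guard skips key generation if a key already exists):

```shell
mkdir -p "$HOME/.ssh"
chmod 700 "$HOME/.ssh"
# generate an RSA key with an empty passphrase, only if none exists yet
[ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -q -t rsa -N '' -f "$HOME/.ssh/id_rsa"
# append the public key to the authorized list and lock down its permissions
cat "$HOME/.ssh/id_rsa.pub" >> "$HOME/.ssh/authorized_keys"
chmod 600 "$HOME/.ssh/authorized_keys"
```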


5. Install the JDK
Download jdk-7u45-linux-x64.tar.gz, copy it to /usr/lib/jvm, and run chmod on it:
hduser@ubuntu:/usr/lib/jvm$ chmod 755 jdk-7u45-linux-x64.tar.gz
Unpack it:
hduser@ubuntu:/usr/lib/jvm$ sudo tar zxvf ./jdk-7u45-linux-x64.tar.gz -C /usr/lib/jvm
Set the environment variables:
hduser@ubuntu:/usr/lib/jvm$ vim ~/.bashrc
Append at the end:
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
Run the following to make the changes take effect:
hduser@ubuntu:/usr/lib/jvm$ source ~/.bashrc

Test:
hduser@ubuntu:/usr/lib/jvm$ java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
hduser@ubuntu:/usr/lib/jvm$

6. Disable the firewall
hduser@ubuntu:/usr/lib/jvm$ sudo ufw disable
Firewall stopped and disabled on system startup
hduser@ubuntu:/usr/lib/jvm$
Takes effect after a reboot.


7. Install Hadoop 2.2
(1) Download hadoop-2.2.0.tar.gz and unpack it under /home/hduser:
hduser@ubuntu:~$ chmod 755 hadoop-2.2.0.tar.gz
hduser@ubuntu:~$ tar zxvf hadoop-2.2.0.tar.gz
(2) Configure Hadoop.
Before configuring, create the following directories on cloud001:
/home/hduser/dfs/name
/home/hduser/dfs/data
/home/hduser/temp
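The three directories can be created in one command; BASE is a convenience variable standing in for hduser's home directory (/home/hduser in this guide):

```shell
BASE="$HOME"   # /home/hduser when running as hduser
mkdir -p "$BASE/dfs/name" "$BASE/dfs/data" "$BASE/temp"
```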

The configuration files to modify are listed below:
~/hadoop-2.2.0/etc/hadoop/hadoop-env.sh
~/hadoop-2.2.0/etc/hadoop/yarn-env.sh
~/hadoop-2.2.0/etc/hadoop/slaves
~/hadoop-2.2.0/etc/hadoop/core-site.xml
~/hadoop-2.2.0/etc/hadoop/hdfs-site.xml
~/hadoop-2.2.0/etc/hadoop/mapred-site.xml (does not exist; copy mapred-site.xml.template to this name)
~/hadoop-2.2.0/etc/hadoop/yarn-site.xml

Edit hadoop-env.sh:
set JAVA_HOME (export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45)

Edit yarn-env.sh:
set JAVA_HOME (export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45)

Edit slaves (this file lists every slave node); write into it:
cloud002
cloud003
Edit core-site.xml:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://cloud001:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hduser/temp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>hadoop.proxyuser.hduser.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hduser.groups</name>
    <value>*</value>
  </property>
</configuration>

Edit hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>cloud001:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hduser/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hduser/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
Edit mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>cloud001:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>cloud001:19888</value>
  </property>
</configuration>
Edit yarn-site.xml:
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>cloud001:8040</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>cloud001:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>cloud001:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>cloud001:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>cloud001:8088</value>
  </property>
</configuration>
Set the environment variables:
hduser@cloud001:~$ vim ~/.bashrc

Append at the end:
export HADOOP_HOME=/home/hduser/hadoop-2.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
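A quick sanity check that the new PATH entries are in place after `source ~/.bashrc`; the export lines below just restate the ones above, with $HOME standing in for /home/hduser:

```shell
export HADOOP_HOME="$HOME/hadoop-2.2.0"
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
# print each PATH entry on its own line and keep only the hadoop ones
echo "$PATH" | tr ':' '\n' | grep "hadoop-2.2.0"
```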

(3) Clone the cloud001 image to cloud002 and cloud003, then change each clone's hostname.


8. Start Hadoop 2.2
(1) Enter the installation directory (cd ~/hadoop-2.2.0/) and format the namenode:
./bin/hdfs namenode -format

(2) Start HDFS:
./sbin/start-dfs.sh
At this point cloud001 runs: namenode, secondarynamenode.
cloud002 and cloud003 run: datanode.

(3) Start YARN:
./sbin/start-yarn.sh
Now cloud001 runs: namenode, secondarynamenode, resourcemanager.
cloud002 and cloud003 run: datanode, nodemanager.
(The JDK's jps command lists the running Java daemons on each node.)

(4) Check cluster status:
./bin/hdfs dfsadmin -report

(5) Check file and block information:
./bin/hdfs fsck / -files -blocks

(6) View HDFS through the NameNode web UI (http://cloud001:50070, the Hadoop 2.x default port).

(7) View the ResourceManager web UI (http://cloud001:8088, as configured in yarn-site.xml).

4. References:

1. http://blog.csdn.net/licongcong_0224/article/details/12972889
2. http://blog.csdn.net/focusheart/article/details/14005893 (single-node version)
3. http://dawndiy.com/archives/155/ (installing and configuring JDK 7 on Linux)
4. http://www.ithome.com.tw/itadm/article.php?c=73978&s=1 (introduction to Hadoop)
5. http://www.runpc.com.tw/content/cloud_content.aspx?id=105318 (introduction to Hadoop)
