Deploy Hadoop on Cluster
Install Hadoop in distributed mode
This document explains how to set up Hadoop on a real cluster. One node acts as the master and the
remaining two act as slaves. Production deployments use multi-node clusters to get the real power of
Hadoop. In this document we use three machines to deploy a Hadoop cluster.
Contents
1. Recommended Platform
2. Prerequisites:
3. Install java 7 (recommended oracle java)
3.1 Update the source list
3.2 Install Java:
4. Add entry of master and slaves in hosts file:
5. Configure SSH
5.1 Install Open SSH Server-Client
5.2 Generate key pairs
5.3 Configure password-less SSH
5.4 Check by SSH to slaves
5. Download Hadoop
5.1 Download Hadoop
6. Install Hadoop
6.1 Untar Tar ball
6.2 Go to HADOOP_HOME_DIR
7. Setup Configuration:
7.1 Edit configuration file conf/hadoop-env.sh and set JAVA_HOME
7.2 Edit configuration file conf/core-site.xml and add following entries:
7.3 Edit configuration file conf/hdfs-site.xml and add following entries:
7.4 Edit configuration file conf/mapred-site.xml and add following entries:
7.5 Edit configuration file conf/masters and add entry of secondary-master
7.6 Edit configuration file conf/slaves and add entry of slaves
7.7 Set environment variables
8. Setup Hadoop on slaves
8.1 Repeat the step-3 and step-4 on all the slaves
8.2 Create tar ball of configured Hadoop-setup and copy to all the slaves:
8.3 Untar configured Hadoop-setup on all the slaves
9. Start The Cluster
9.1 Format the name node:
9.2 Now start Hadoop services
9.2.1 Start HDFS services
9.2.2 Start Map-Reduce services
9.3 Check daemon status by running the jps command:
9.3.1 On master
9.3.2 On slave-01:
9.3.3 On slave-02:
10. Stop the cluster
10.1 Stop mapreduce services
10.2 Stop HDFS services
1. Recommended Platform
• OS: Ubuntu 12.04 or later (other distributions such as CentOS or Red Hat also work)
• Hadoop: Cloudera's Distribution for Apache Hadoop, CDH3U6 (Apache Hadoop 0.20.X / 1.X also
works)
2. Prerequisites:
• Java (Oracle Java is recommended for production)
• Password-less SSH (Hadoop needs password-less SSH from the master to all the slaves; this is
required for remote script invocation)
Run the following commands on the master node of the Hadoop cluster.
3. Install java 7 (recommended oracle java)
3.1 Update the source list
sudo apt-get update
sudo apt-get install python-software-properties
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
3.2 Install Java:
sudo apt-get install oracle-java7-installer
4. Add entry of master and slaves in hosts file:
Edit the hosts file and add the following entries:
sudo nano /etc/hosts
MASTER-IP master
SLAVE01-IP slave-01
SLAVE02-IP slave-02
(Replace MASTER-IP, SLAVE01-IP and SLAVE02-IP with the corresponding IP addresses.)
5. Configure SSH
5.1 Install Open SSH Server-Client
sudo apt-get install openssh-server openssh-client
5.2 Generate key pairs
ssh-keygen -t rsa -P ""
5.3 Configure password-less SSH
Append the contents of “$HOME/.ssh/id_rsa.pub” on the master to “$HOME/.ssh/authorized_keys” on all
the slaves.
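The append step in 5.3 can be sketched as a small shell function to run on each slave. The name install_pubkey is hypothetical (it is not part of Hadoop or OpenSSH). It assumes the master's public key file has already been copied to the slave, adds the key only if it is not already present, and tightens the permissions that sshd usually insists on.

```shell
#!/bin/sh
# install_pubkey PUBKEY_FILE AUTH_KEYS_FILE   (hypothetical helper)
# Appends the key in PUBKEY_FILE to AUTH_KEYS_FILE unless that exact
# line is already there, so running it twice does not duplicate the key.
install_pubkey() {
    pubkey_file=$1
    auth_file=$2
    auth_dir=$(dirname "$auth_file")

    # sshd typically ignores authorized_keys if these are too permissive
    mkdir -p "$auth_dir"
    chmod 700 "$auth_dir"
    touch "$auth_file"
    chmod 600 "$auth_file"

    # -x: whole-line match, -F: fixed string, -f: read the pattern from the key file
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}

# Example (the /tmp path is an assumption, use wherever you copied the key):
# install_pubkey /tmp/master_id_rsa.pub "$HOME/.ssh/authorized_keys"
```

On Ubuntu the OpenSSH client normally also provides ssh-copy-id, which performs the same append from the master side, for example ssh-copy-id slave-01.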
5.4 Check by SSH to slaves
ssh slave-01
ssh slave-02
5. Download Hadoop
5.1 Download Hadoop
http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u6.tar.gz
6. Install Hadoop
6.1 Untar Tar ball
tar xzf hadoop-0.20.2-cdh3u6.tar.gz
6.2 Go to HADOOP_HOME_DIR
cd hadoop-0.20.2-cdh3u6/
7. Setup Configuration:
7.1 Edit configuration file conf/hadoop-env.sh and set JAVA_HOME
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_65
(Set JAVA_HOME to the root of your Java installation; the path above is only an example.)
7.2 Edit configuration file conf/core-site.xml and add following entries:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop_admin/hdata/hadoop-${user.name}</value>
</property>
</configuration>
7.3 Edit configuration file conf/hdfs-site.xml and add following entries:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
7.4 Edit configuration file conf/mapred-site.xml and add following entries:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>master:9001</value>
</property>
</configuration>
7.5 Edit configuration file conf/masters and add entry of secondary-master
slave-01
(Use the IP address or alias of the node where the secondary master, the SecondaryNameNode, will run.)
7.6 Edit configuration file conf/slaves and add entry of slaves
slave-01
slave-02
7.7 Set environment variables
Update ~/.bashrc and set the HADOOP_HOME and PATH shell variables as follows:
nano ~/.bashrc
export HADOOP_HOME=/home/hadoop/hadoop-0.20.2-cdh3u6
export PATH=$PATH:$HADOOP_HOME/bin
Hadoop is now set up on the master.
8. Setup Hadoop on slaves.
8.1 Repeat the step-3 and step-4 on all the slaves
Step-3: “install Java”
Step-4: “Add entry of master, slaves in hosts file”
8.2 Create tar ball of configured Hadoop-setup and copy to all the slaves:
tar czf hadoop.tar.gz hadoop-0.20.2-cdh3u6
scp hadoop.tar.gz slave-01:~
scp hadoop.tar.gz slave-02:~
8.3 Untar configured Hadoop-setup on all the slaves
tar xzf hadoop.tar.gz
Run this command on all the slaves
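With more than a couple of slaves, steps 8.2 and 8.3 are easier to drive from the conf/slaves file. The sketch below only prints the commands so they can be reviewed first. The function name print_copy_commands is hypothetical, and it assumes the slaves file lists one host per line.

```shell
#!/bin/sh
# print_copy_commands SLAVES_FILE TARBALL   (hypothetical helper)
# Prints, without running them, the scp and untar commands for every host
# listed one per line in SLAVES_FILE. Blank lines and '#' comments are skipped.
print_copy_commands() {
    slaves_file=$1
    tarball=$2
    sed -e 's/#.*$//' -e '/^[[:space:]]*$/d' "$slaves_file" |
    # "|| [ -n "$host" ]" keeps a final line that has no trailing newline
    while read -r host || [ -n "$host" ]; do
        echo "scp $tarball $host:~"
        echo "ssh $host tar xzf $(basename "$tarball")"
    done
}

# Example, run from the directory that holds hadoop.tar.gz:
# print_copy_commands hadoop-0.20.2-cdh3u6/conf/slaves hadoop.tar.gz > copy.sh
```

After reviewing copy.sh, run it with sh copy.sh. Piping the output straight into sh is best avoided, because ssh can read from the same standard input and swallow the remaining commands.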
9. Start The Cluster
9.1 Format the name node:
bin/hadoop namenode -format
Do this only once, when you first install Hadoop; formatting again deletes all your data from HDFS.
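Because formatting is destructive, a guard can help. The sketch below is one possible check, not part of Hadoop. It assumes the NameNode keeps its metadata under a name directory (in Hadoop 0.20.X / 1.X this is dfs.name.dir, which defaults to ${hadoop.tmp.dir}/dfs/name) and that a formatted directory contains current/VERSION. Both function names are hypothetical.

```shell
#!/bin/sh
# namenode_is_formatted NAME_DIR   (hypothetical helper)
# Succeeds if NAME_DIR already looks like a formatted NameNode directory,
# judged by the presence of current/VERSION. Assumption: NAME_DIR is the
# directory your dfs.name.dir setting points to.
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# format_once NAME_DIR
# Prints the format command only when the directory is not yet formatted;
# otherwise prints a warning to stderr and returns non-zero.
format_once() {
    if namenode_is_formatted "$1"; then
        echo "refusing: $1 is already formatted" >&2
        return 1
    fi
    echo "bin/hadoop namenode -format"
}

# Example (the path follows hadoop.tmp.dir from 7.2 for a user named hadoop_admin,
# which is an assumption):
# format_once /home/hadoop_admin/hdata/hadoop-hadoop_admin/dfs/name
```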
9.2 Now start Hadoop services
9.2.1 Start HDFS services
bin/start-dfs.sh
Run this command on the master.
9.2.2 Start Map-Reduce services
bin/start-mapred.sh
Run this command on the master.
9.3 Check daemon status by running the jps command:
9.3.1 On master
jps
NameNode
JobTracker
9.3.2 On slave-01:
jps
TaskTracker
DataNode
SecondaryNameNode
9.3.3 On slave-02:
jps
TaskTracker
DataNode
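The check in 9.3 can be scripted. The sketch below takes jps output as text and reports which of the expected daemons are missing. It assumes the usual jps format of a process id followed by the class name, and the function name missing_daemons is hypothetical.

```shell
#!/bin/sh
# missing_daemons "JPS_OUTPUT" NAME...   (hypothetical helper)
# Prints each expected daemon name that does not appear as the second
# column of the given jps output; returns non-zero if any is missing.
missing_daemons() {
    jps_output=$1
    shift
    status=0
    for name in "$@"; do
        # Exact match on column 2, so SecondaryNameNode is not mistaken for NameNode
        if ! printf '%s\n' "$jps_output" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            echo "$name"
            status=1
        fi
    done
    return $status
}

# Examples, matching the daemon lists above:
# missing_daemons "$(jps)" NameNode JobTracker                        # on master
# missing_daemons "$(jps)" TaskTracker DataNode SecondaryNameNode     # on slave-01
# missing_daemons "$(jps)" TaskTracker DataNode                       # on slave-02
```

Empty output and a zero exit status mean every expected daemon is running on that node.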
10. Stop the cluster
10.1 Stop mapreduce services
bin/stop-mapred.sh
Run this command on the master.
10.2 Stop HDFS services
bin/stop-dfs.sh
Run this command on the master.

Contenu connexe

Tendances

Construction ofanoracle10glinuxserver 0.5
Construction ofanoracle10glinuxserver 0.5Construction ofanoracle10glinuxserver 0.5
Construction ofanoracle10glinuxserver 0.5sopan sonar
 
Xtrabackup工具使用简介 - 20110427
Xtrabackup工具使用简介 - 20110427Xtrabackup工具使用简介 - 20110427
Xtrabackup工具使用简介 - 20110427Jinrong Ye
 
EMC NetWorker Module for Microsoft SQL Server Release 5.1 ...
EMC NetWorker Module for Microsoft SQL Server Release 5.1 ...EMC NetWorker Module for Microsoft SQL Server Release 5.1 ...
EMC NetWorker Module for Microsoft SQL Server Release 5.1 ...webhostingguy
 
Db2 udb backup and recovery with ess copy services
Db2 udb backup and recovery with ess copy servicesDb2 udb backup and recovery with ess copy services
Db2 udb backup and recovery with ess copy servicesbupbechanhgmail
 
Mysql wp cluster_quickstart_windows
Mysql wp cluster_quickstart_windowsMysql wp cluster_quickstart_windows
Mysql wp cluster_quickstart_windowsRogério Rocha
 
D space manual 1.5.2
D space manual 1.5.2D space manual 1.5.2
D space manual 1.5.2tvcumet
 
BOOK - IBM Z vse using db2 on linux for system z
BOOK - IBM Z vse using db2 on linux for system zBOOK - IBM Z vse using db2 on linux for system z
BOOK - IBM Z vse using db2 on linux for system zSatya Harish
 
Metatron Technology Consulting 's MySQL to PostgreSQL ...
Metatron Technology Consulting 's MySQL to PostgreSQL ...Metatron Technology Consulting 's MySQL to PostgreSQL ...
Metatron Technology Consulting 's MySQL to PostgreSQL ...webhostingguy
 
Mater,slave on mysql
Mater,slave on mysqlMater,slave on mysql
Mater,slave on mysqlVasudeva Rao
 
WebHost Manager Online Help 1.0
WebHost Manager Online Help 1.0WebHost Manager Online Help 1.0
WebHost Manager Online Help 1.0webhostingguy
 
Jboss4 clustering
Jboss4 clusteringJboss4 clustering
Jboss4 clusteringshahdullah
 
PipelineProject
PipelineProjectPipelineProject
PipelineProjectMark Short
 
Architecting cloud
Architecting cloudArchitecting cloud
Architecting cloudTahsin Hasan
 

Tendances (18)

Construction ofanoracle10glinuxserver 0.5
Construction ofanoracle10glinuxserver 0.5Construction ofanoracle10glinuxserver 0.5
Construction ofanoracle10glinuxserver 0.5
 
Xtrabackup工具使用简介 - 20110427
Xtrabackup工具使用简介 - 20110427Xtrabackup工具使用简介 - 20110427
Xtrabackup工具使用简介 - 20110427
 
EMC NetWorker Module for Microsoft SQL Server Release 5.1 ...
EMC NetWorker Module for Microsoft SQL Server Release 5.1 ...EMC NetWorker Module for Microsoft SQL Server Release 5.1 ...
EMC NetWorker Module for Microsoft SQL Server Release 5.1 ...
 
Understand
UnderstandUnderstand
Understand
 
Db2 udb backup and recovery with ess copy services
Db2 udb backup and recovery with ess copy servicesDb2 udb backup and recovery with ess copy services
Db2 udb backup and recovery with ess copy services
 
Mysql wp cluster_quickstart_windows
Mysql wp cluster_quickstart_windowsMysql wp cluster_quickstart_windows
Mysql wp cluster_quickstart_windows
 
D space manual 1.5.2
D space manual 1.5.2D space manual 1.5.2
D space manual 1.5.2
 
BOOK - IBM Z vse using db2 on linux for system z
BOOK - IBM Z vse using db2 on linux for system zBOOK - IBM Z vse using db2 on linux for system z
BOOK - IBM Z vse using db2 on linux for system z
 
Metatron Technology Consulting 's MySQL to PostgreSQL ...
Metatron Technology Consulting 's MySQL to PostgreSQL ...Metatron Technology Consulting 's MySQL to PostgreSQL ...
Metatron Technology Consulting 's MySQL to PostgreSQL ...
 
Bugzilla guide
Bugzilla guideBugzilla guide
Bugzilla guide
 
Mater,slave on mysql
Mater,slave on mysqlMater,slave on mysql
Mater,slave on mysql
 
WebHost Manager Online Help 1.0
WebHost Manager Online Help 1.0WebHost Manager Online Help 1.0
WebHost Manager Online Help 1.0
 
Book hudson
Book hudsonBook hudson
Book hudson
 
Jboss4 clustering
Jboss4 clusteringJboss4 clustering
Jboss4 clustering
 
Memory Pools for C and C++
Memory Pools for C and C++Memory Pools for C and C++
Memory Pools for C and C++
 
installation_manual
installation_manualinstallation_manual
installation_manual
 
PipelineProject
PipelineProjectPipelineProject
PipelineProject
 
Architecting cloud
Architecting cloudArchitecting cloud
Architecting cloud
 

En vedette

Mapreduce advanced
Mapreduce advancedMapreduce advanced
Mapreduce advancedChirag Ahuja
 
An example Hadoop Install
An example Hadoop InstallAn example Hadoop Install
An example Hadoop InstallMike Frampton
 
Installing hadoop on ubuntu 16
Installing hadoop on ubuntu 16Installing hadoop on ubuntu 16
Installing hadoop on ubuntu 16Enrique Davila
 
Introducción a Big Data. HDInsight - Webcast Technet SolidQ
Introducción a Big Data. HDInsight - Webcast Technet SolidQIntroducción a Big Data. HDInsight - Webcast Technet SolidQ
Introducción a Big Data. HDInsight - Webcast Technet SolidQSolidQ
 
Big Data para Dummies
Big Data para DummiesBig Data para Dummies
Big Data para DummiesStratebi
 
Install Apache Hadoop for Development/Production
Install Apache Hadoop for  Development/ProductionInstall Apache Hadoop for  Development/Production
Install Apache Hadoop for Development/ProductionIMC Institute
 
Big data para principiantes
Big data para principiantesBig data para principiantes
Big data para principiantesCarlos Toxtli
 
Install hadoop in a cluster
Install hadoop in a clusterInstall hadoop in a cluster
Install hadoop in a clusterXuhong Zhang
 
Introduction to MapReduce | MapReduce Architecture | MapReduce Fundamentals
Introduction to MapReduce | MapReduce Architecture | MapReduce FundamentalsIntroduction to MapReduce | MapReduce Architecture | MapReduce Fundamentals
Introduction to MapReduce | MapReduce Architecture | MapReduce FundamentalsSkillspeed
 
Introducción al Big Data
Introducción al Big DataIntroducción al Big Data
Introducción al Big DataDavid Alayón
 
Jamaica
JamaicaJamaica
Jamaicachglat
 
St. Thomas and Peter Island
St. Thomas and Peter IslandSt. Thomas and Peter Island
St. Thomas and Peter Islandchglat
 
2011 05 26 museomemoriaandalucia 4_ay2bachay1bc
2011 05 26 museomemoriaandalucia 4_ay2bachay1bc2011 05 26 museomemoriaandalucia 4_ay2bachay1bc
2011 05 26 museomemoriaandalucia 4_ay2bachay1bcpabloacostarobles
 
Smart phones
Smart phonesSmart phones
Smart phonescmbh1
 
Great Exuma
Great ExumaGreat Exuma
Great Exumachglat
 
Justin Riviera Maya Options
Justin Riviera Maya OptionsJustin Riviera Maya Options
Justin Riviera Maya Optionschglat
 
Lauren Jamaica Options
Lauren Jamaica OptionsLauren Jamaica Options
Lauren Jamaica Optionschglat
 
David St. Lucia Options
David St. Lucia OptionsDavid St. Lucia Options
David St. Lucia Optionschglat
 

En vedette (20)

Mapreduce advanced
Mapreduce advancedMapreduce advanced
Mapreduce advanced
 
An example Hadoop Install
An example Hadoop InstallAn example Hadoop Install
An example Hadoop Install
 
Installing hadoop on ubuntu 16
Installing hadoop on ubuntu 16Installing hadoop on ubuntu 16
Installing hadoop on ubuntu 16
 
Introducción a Big Data. HDInsight - Webcast Technet SolidQ
Introducción a Big Data. HDInsight - Webcast Technet SolidQIntroducción a Big Data. HDInsight - Webcast Technet SolidQ
Introducción a Big Data. HDInsight - Webcast Technet SolidQ
 
Big Data para Dummies
Big Data para DummiesBig Data para Dummies
Big Data para Dummies
 
Install Apache Hadoop for Development/Production
Install Apache Hadoop for  Development/ProductionInstall Apache Hadoop for  Development/Production
Install Apache Hadoop for Development/Production
 
Big data para principiantes
Big data para principiantesBig data para principiantes
Big data para principiantes
 
Ppt recentschoolnieuws
Ppt recentschoolnieuwsPpt recentschoolnieuws
Ppt recentschoolnieuws
 
Install hadoop in a cluster
Install hadoop in a clusterInstall hadoop in a cluster
Install hadoop in a cluster
 
Introduction to MapReduce | MapReduce Architecture | MapReduce Fundamentals
Introduction to MapReduce | MapReduce Architecture | MapReduce FundamentalsIntroduction to MapReduce | MapReduce Architecture | MapReduce Fundamentals
Introduction to MapReduce | MapReduce Architecture | MapReduce Fundamentals
 
Introducción al Big Data
Introducción al Big DataIntroducción al Big Data
Introducción al Big Data
 
facebook^^
facebook^^facebook^^
facebook^^
 
Jamaica
JamaicaJamaica
Jamaica
 
St. Thomas and Peter Island
St. Thomas and Peter IslandSt. Thomas and Peter Island
St. Thomas and Peter Island
 
2011 05 26 museomemoriaandalucia 4_ay2bachay1bc
2011 05 26 museomemoriaandalucia 4_ay2bachay1bc2011 05 26 museomemoriaandalucia 4_ay2bachay1bc
2011 05 26 museomemoriaandalucia 4_ay2bachay1bc
 
Smart phones
Smart phonesSmart phones
Smart phones
 
Great Exuma
Great ExumaGreat Exuma
Great Exuma
 
Justin Riviera Maya Options
Justin Riviera Maya OptionsJustin Riviera Maya Options
Justin Riviera Maya Options
 
Lauren Jamaica Options
Lauren Jamaica OptionsLauren Jamaica Options
Lauren Jamaica Options
 
David St. Lucia Options
David St. Lucia OptionsDavid St. Lucia Options
David St. Lucia Options
 

Similaire à Deploy hadoop cluster

Administrator en
Administrator enAdministrator en
Administrator enCáo Già
 
GNU Gatekeeper 5.11
GNU Gatekeeper 5.11GNU Gatekeeper 5.11
GNU Gatekeeper 5.11J W
 
EMC NetWorker Module for Microsoft SQL Server Administrators ...
EMC NetWorker Module for Microsoft SQL Server Administrators ...EMC NetWorker Module for Microsoft SQL Server Administrators ...
EMC NetWorker Module for Microsoft SQL Server Administrators ...webhostingguy
 
IBM Connections 4.5 bidirectional synchronization
IBM Connections 4.5 bidirectional synchronizationIBM Connections 4.5 bidirectional synchronization
IBM Connections 4.5 bidirectional synchronizationmichele buccarello
 
Maa wp sun_apps11i_db10g_r2-2
Maa wp sun_apps11i_db10g_r2-2Maa wp sun_apps11i_db10g_r2-2
Maa wp sun_apps11i_db10g_r2-2Sal Marcus
 
Maa wp sun_apps11i_db10g_r2-2
Maa wp sun_apps11i_db10g_r2-2Maa wp sun_apps11i_db10g_r2-2
Maa wp sun_apps11i_db10g_r2-2Sal Marcus
 
Performance tuning for ibm tivoli directory server redp4258
Performance tuning for ibm tivoli directory server   redp4258Performance tuning for ibm tivoli directory server   redp4258
Performance tuning for ibm tivoli directory server redp4258Banking at Ho Chi Minh city
 
Implementing IBM SmartCloud Entry on IBM PureFlex System
Implementing IBM SmartCloud Entry on IBM PureFlex SystemImplementing IBM SmartCloud Entry on IBM PureFlex System
Implementing IBM SmartCloud Entry on IBM PureFlex SystemIBM India Smarter Computing
 
digital marketing training in bangalore
digital marketing training in bangaloredigital marketing training in bangalore
digital marketing training in bangaloreVenus Tech Inc.
 
Gnugk manual-2.3.2
Gnugk manual-2.3.2Gnugk manual-2.3.2
Gnugk manual-2.3.2rusbomber
 
Cockpit esp
Cockpit espCockpit esp
Cockpit espmsabry7
 
DB2 10 for Linux on System z Using z/VM v6.2, Single System Image Clusters an...
DB2 10 for Linux on System z Using z/VM v6.2, Single System Image Clusters an...DB2 10 for Linux on System z Using z/VM v6.2, Single System Image Clusters an...
DB2 10 for Linux on System z Using z/VM v6.2, Single System Image Clusters an...IBM India Smarter Computing
 
Apache Web server Complete Guide
Apache Web server Complete GuideApache Web server Complete Guide
Apache Web server Complete Guidewebhostingguy
 
Apache Web server Complete Guide
Apache Web server Complete GuideApache Web server Complete Guide
Apache Web server Complete Guidewebhostingguy
 

Similaire à Deploy hadoop cluster (20)

hci10_help_sap_en.pdf
hci10_help_sap_en.pdfhci10_help_sap_en.pdf
hci10_help_sap_en.pdf
 
SAP CPI-DS.pdf
SAP CPI-DS.pdfSAP CPI-DS.pdf
SAP CPI-DS.pdf
 
Administrator en
Administrator enAdministrator en
Administrator en
 
GNU Gatekeeper 5.11
GNU Gatekeeper 5.11GNU Gatekeeper 5.11
GNU Gatekeeper 5.11
 
HRpM_UG_731_HDS_M2
HRpM_UG_731_HDS_M2HRpM_UG_731_HDS_M2
HRpM_UG_731_HDS_M2
 
EMC NetWorker Module for Microsoft SQL Server Administrators ...
EMC NetWorker Module for Microsoft SQL Server Administrators ...EMC NetWorker Module for Microsoft SQL Server Administrators ...
EMC NetWorker Module for Microsoft SQL Server Administrators ...
 
IBM Connections 4.5 bidirectional synchronization
IBM Connections 4.5 bidirectional synchronizationIBM Connections 4.5 bidirectional synchronization
IBM Connections 4.5 bidirectional synchronization
 
Maa wp sun_apps11i_db10g_r2-2
Maa wp sun_apps11i_db10g_r2-2Maa wp sun_apps11i_db10g_r2-2
Maa wp sun_apps11i_db10g_r2-2
 
Maa wp sun_apps11i_db10g_r2-2
Maa wp sun_apps11i_db10g_r2-2Maa wp sun_apps11i_db10g_r2-2
Maa wp sun_apps11i_db10g_r2-2
 
Performance tuning for ibm tivoli directory server redp4258
Performance tuning for ibm tivoli directory server   redp4258Performance tuning for ibm tivoli directory server   redp4258
Performance tuning for ibm tivoli directory server redp4258
 
Implementing IBM SmartCloud Entry on IBM PureFlex System
Implementing IBM SmartCloud Entry on IBM PureFlex SystemImplementing IBM SmartCloud Entry on IBM PureFlex System
Implementing IBM SmartCloud Entry on IBM PureFlex System
 
digital marketing training in bangalore
digital marketing training in bangaloredigital marketing training in bangalore
digital marketing training in bangalore
 
Gnugk manual-2.3.2
Gnugk manual-2.3.2Gnugk manual-2.3.2
Gnugk manual-2.3.2
 
Sap setup guide
Sap setup guideSap setup guide
Sap setup guide
 
Cockpit esp
Cockpit espCockpit esp
Cockpit esp
 
DB2 10 for Linux on System z Using z/VM v6.2, Single System Image Clusters an...
DB2 10 for Linux on System z Using z/VM v6.2, Single System Image Clusters an...DB2 10 for Linux on System z Using z/VM v6.2, Single System Image Clusters an...
DB2 10 for Linux on System z Using z/VM v6.2, Single System Image Clusters an...
 
Db2 virtualization
Db2 virtualizationDb2 virtualization
Db2 virtualization
 
Apache Web server Complete Guide
Apache Web server Complete GuideApache Web server Complete Guide
Apache Web server Complete Guide
 
Apache Web server Complete Guide
Apache Web server Complete GuideApache Web server Complete Guide
Apache Web server Complete Guide
 
Administrator manual-e2
Administrator manual-e2Administrator manual-e2
Administrator manual-e2
 

Plus de Chirag Ahuja

Plus de Chirag Ahuja (9)

Word count example in hadoop mapreduce using java
Word count example in hadoop mapreduce using javaWord count example in hadoop mapreduce using java
Word count example in hadoop mapreduce using java
 
Big data introduction
Big data introductionBig data introduction
Big data introduction
 
Flume
FlumeFlume
Flume
 
Hbase
HbaseHbase
Hbase
 
Pig
PigPig
Pig
 
Hive : WareHousing Over hadoop
Hive :  WareHousing Over hadoopHive :  WareHousing Over hadoop
Hive : WareHousing Over hadoop
 
MapReduce basic
MapReduce basicMapReduce basic
MapReduce basic
 
Hdfs
HdfsHdfs
Hdfs
 
Hadoop introduction
Hadoop introductionHadoop introduction
Hadoop introduction
 

Dernier

04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改atducpo
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 

Dernier (20)

꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Smarteg dropshipping via API with DroFx.pptx

Deploy Hadoop cluster

  • 1. Deploy Hadoop on Cluster: Install Hadoop in distributed mode
    This document explains how to set up Hadoop on a real cluster. One node acts as the master and the remaining two act as slaves. Production deployments use a multi-node cluster to get the real power of Hadoop; this document uses 3 machines to deploy the cluster.
  • 2. Contents
    1. Recommended Platform
    2. Prerequisites
    3. Install java 7 (recommended oracle java)
      3.1 Update the source list
      3.2 Install Java
    4. Add entry of master and slaves in hosts file
    5. Configure SSH
      5.1 Install Open SSH Server-Client
      5.2 Generate key pairs
      5.3 Configure password-less SSH
      5.4 Check by SSH to slaves
    5. Download Hadoop
      5.1 Download Hadoop
    6. Install Hadoop
      6.1 Untar Tar ball
      6.2 Go to HADOOP_HOME_DIR
    7. Setup Configuration
      7.1 Edit configuration file conf/hadoop-env.sh and set JAVA_HOME
      7.2 Edit configuration file conf/core-site.xml and add following entries
      7.3 Edit configuration file conf/hdfs-site.xml and add following entries
      7.4 Edit configuration file conf/mapred-site.xml and add following entries
      7.5 Edit configuration file conf/masters and add entry of secondary-master
      7.6 Edit configuration file conf/slaves and add entry of slaves
      7.7 Set environment variables
    8. Setup Hadoop on slaves
      8.1 Repeat the step-3 and step-4 on all the slaves
      8.2 Create tar ball of configured Hadoop-setup and copy to all the slaves
      8.3 Untar configured Hadoop-setup on all the slaves
    9. Start The Cluster
      9.1 Format the name node
      9.2 Now start Hadoop services
        9.2.1 Start HDFS services
  • 3. Contents (continued)
        9.2.2 Start Map-Reduce services
      9.3 Check daemons status, by running jps command
        9.3.1 On master
        9.3.2 On slaves-01
        9.3.3 On slaves-02
    10. Stop the cluster
      10.1 Stop mapreduce services
      10.2 Stop HDFS services
  • 4.
    1. Recommended Platform
      • OS: Ubuntu 12.04 or later (other distributions such as CentOS or Red Hat can also be used)
      • Hadoop: Cloudera Distribution for Apache Hadoop, CDH3U6 (Apache Hadoop 0.20.X / 1.X can also be used)
    2. Prerequisites:
      • Java (Oracle Java is recommended for production)
      • Password-less SSH setup (Hadoop needs password-less SSH from the master to all the slaves; this is required for remote script invocation)
    Run the following commands on the master of the Hadoop cluster.
    3. Install Java 7 (Oracle Java recommended)
    3.1 Update the source list
      sudo apt-get update
      sudo apt-get install python-software-properties
      sudo add-apt-repository ppa:webupd8team/java
      sudo apt-get update
    3.2 Install Java:
      sudo apt-get install oracle-java7-installer
    4. Add entry of master and slaves in hosts file:
    Edit the hosts file and add the following entries:
      sudo nano /etc/hosts
      MASTER-IP master
      SLAVE01-IP slave-01
      SLAVE02-IP slave-02
    (Replace MASTER-IP, SLAVE01-IP and SLAVE02-IP with the corresponding IP addresses.)
    5. Configure SSH
    5.1 Install OpenSSH Server-Client
      sudo apt-get install openssh-server openssh-client
    5.2 Generate key pairs
      ssh-keygen -t rsa -P ""
    5.3 Configure password-less SSH
    Copy the contents of "$HOME/.ssh/id_rsa.pub" on the master into "$HOME/.ssh/authorized_keys" on all the slaves.
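Step 5.3 can be sketched as a small helper run on each slave after the master's id_rsa.pub has been transferred there (for example with scp). This is a minimal sketch, not part of the original deck: the function name append_key is hypothetical, and it assumes the public key file holds a single line. It appends the key rather than overwriting authorized_keys, skips the append if the key is already present, and tightens permissions, which sshd normally expects.

```shell
# append_key PUBKEY_FILE AUTH_FILE   (hypothetical helper name)
# Adds the public key to AUTH_FILE only if that exact line is not already there.
append_key() {
  pubkey_file=$1
  auth_file=$2
  mkdir -p "$(dirname "$auth_file")"
  touch "$auth_file"
  # -q quiet, -x whole-line match, -F fixed string, -f read pattern from file
  if ! grep -qxFf "$pubkey_file" "$auth_file"; then
    cat "$pubkey_file" >> "$auth_file"
  fi
  # Keep the directory and file private to the owner
  chmod 700 "$(dirname "$auth_file")"
  chmod 600 "$auth_file"
}

# Usage on a slave (assumes the key was copied to ~/id_rsa.pub):
#   append_key "$HOME/id_rsa.pub" "$HOME/.ssh/authorized_keys"
```

Where the ssh-copy-id utility from the OpenSSH client package is available, running `ssh-copy-id slave-01` from the master is a common one-step alternative.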
  • 5.
    5.4 Check by SSH to slaves
      ssh slave-01
      ssh slave-02
    5. Download Hadoop
    5.1 Download Hadoop
      http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u6.tar.gz
    6. Install Hadoop
    6.1 Untar Tar ball
      tar xzf hadoop-0.20.2-cdh3u6.tar.gz
    6.2 Go to HADOOP_HOME_DIR
      cd hadoop-0.20.2-cdh3u6/
    7. Setup Configuration:
    7.1 Edit configuration file conf/hadoop-env.sh and set JAVA_HOME
    Set JAVA_HOME to the root of your Java installation, for example:
      export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_65
    7.2 Edit configuration file conf/core-site.xml and add the following entries:
      <configuration>
        <property>
          <name>fs.default.name</name>
          <value>hdfs://master:9000</value>
        </property>
        <property>
          <name>hadoop.tmp.dir</name>
          <value>/home/hadoop_admin/hdata/hadoop-${user.name}</value>
        </property>
      </configuration>
    7.3 Edit configuration file conf/hdfs-site.xml and add the following entries:
      <configuration>
        <property>
          <name>dfs.replication</name>
          <value>2</value>
        </property>
  • 6.
      </configuration>
    7.4 Edit configuration file conf/mapred-site.xml and add the following entries:
      <configuration>
        <property>
          <name>mapred.job.tracker</name>
          <value>master:9001</value>
        </property>
      </configuration>
    7.5 Edit configuration file conf/masters and add entry of secondary-master
      slave-01
    (the IP or alias of the node where the secondary master will run)
    7.6 Edit configuration file conf/slaves and add entry of slaves
      slave-01
      slave-02
    7.7 Set environment variables
    Update ~/.bashrc and set or update the HADOOP_HOME and PATH shell variables as follows:
      nano ~/.bashrc
      export HADOOP_HOME=/home/hadoop/hadoop-0.20.2-cdh3u6
      export PATH=$PATH:$HADOOP_HOME/bin
    Hadoop is now set up on the master.
    8. Setup Hadoop on slaves
    8.1 Repeat step-3 and step-4 on all the slaves
      Step-3: "Install Java"
      Step-4: "Add entry of master and slaves in hosts file"
    8.2 Create tar ball of configured Hadoop-setup and copy to all the slaves:
      tar czf hadoop.tar.gz hadoop-0.20.2-cdh3u6
      scp hadoop.tar.gz slave-01:~
      scp hadoop.tar.gz slave-02:~
    8.3 Untar configured Hadoop-setup on all the slaves
      tar xzf hadoop.tar.gz
    Run this command on all the slaves.
    9. Start The Cluster
    9.1 Format the name node:
      bin/hadoop namenode -format
    Do this only once, when you first install Hadoop; formatting again deletes all your data from HDFS.
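Because step 9.1 must run only once, a guard around the format command can help avoid wiping HDFS by accident. The following is a minimal sketch under stated assumptions: the helper names is_formatted and format_once are hypothetical, dfs.name.dir is assumed to be left at its default of ${hadoop.tmp.dir}/dfs/name, hadoop.tmp.dir is assumed to be set as in step 7.2, and a formatted name directory is assumed to contain a current/VERSION file.

```shell
# Assumption: default dfs.name.dir under the hadoop.tmp.dir from step 7.2,
# with ${user.name} resolving to the login name of the current user.
NAME_DIR="/home/hadoop_admin/hdata/hadoop-$(id -un)/dfs/name"

# is_formatted DIR: succeeds if DIR already holds NameNode metadata
is_formatted() {
  [ -f "$1/current/VERSION" ]
}

# format_once DIR (hypothetical helper): format only when no metadata exists
format_once() {
  if is_formatted "$1"; then
    echo "NameNode already formatted at $1; skipping format"
  else
    bin/hadoop namenode -format
  fi
}

# Usage on the master, from HADOOP_HOME:
#   format_once "$NAME_DIR"
```

If dfs.name.dir is set explicitly in hdfs-site.xml, pass that directory to format_once instead of NAME_DIR.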
  • 7.
    9.2 Now start Hadoop services
    9.2.1 Start HDFS services
      bin/start-dfs.sh
    Run this command on the master.
    9.2.2 Start Map-Reduce services
      bin/start-mapred.sh
    Run this command on the master.
    9.3 Check daemons status by running the jps command:
    9.3.1 On master
      jps
      NameNode
      JobTracker
    9.3.2 On slave-01:
      jps
      TaskTracker
      DataNode
      SecondaryNameNode
    9.3.3 On slave-02:
      jps
      TaskTracker
      DataNode
    10. Stop the cluster
    10.1 Stop mapreduce services
      bin/stop-mapred.sh
    Run this command on the master.
    10.2 Stop HDFS services
      bin/stop-dfs.sh
    Run this command on the master.
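The daemon check in step 9.3 can be automated with a small filter over the jps output. This is a minimal sketch, assuming jps prints one "PID ClassName" pair per line; the function name missing_daemons is hypothetical. It uses whole-line matching so that SecondaryNameNode is not mistaken for NameNode.

```shell
# missing_daemons EXPECTED...   (hypothetical helper name)
# Reads jps output on stdin and prints each expected daemon that is not running.
missing_daemons() {
  # Second column of jps output is the class name
  running=$(awk '{print $2}')
  for daemon in "$@"; do
    # -x whole-line match, -F fixed string
    printf '%s\n' "$running" | grep -qxF "$daemon" || echo "$daemon"
  done
}

# Usage:
#   on master:    jps | missing_daemons NameNode JobTracker
#   on slave-01:  jps | missing_daemons TaskTracker DataNode SecondaryNameNode
#   on slave-02:  jps | missing_daemons TaskTracker DataNode
```

Empty output means every expected daemon for that node was found.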