HADOOP ON 64 BIT UBUNTU 14.04
In this guide, we'll install a single-node Hadoop cluster backed by the Hadoop Distributed
File System (HDFS) on Ubuntu 14.04.
INSTALLING JAVA
The Hadoop framework is written in Java, so we need a working JDK first.
# Update the source list
$ sudo apt-get update
# The OpenJDK project is the default version of Java
$ sudo apt-get install default-jdk
$ java -version
java version "1.7.0_55"
OpenJDK Runtime Environment (IcedTea 2.4.7) (7u55-2.4.7-1ubuntu1)
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)
ADDING A DEDICATED HADOOP USER
$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser
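The new hduser account has no administrative rights. Since several of the later steps use
sudo, we can optionally add hduser to the sudo group (a convenience assumption, not
strictly required if we run those steps from our regular account):
$ sudo adduser hduser sudo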
INSTALLING SSH
SSH has two main components:
1).ssh: The command we use to connect to remote machines - the client.
2).sshd: The daemon that runs on the server and allows clients to connect to it.
The ssh client is usually pre-installed on Linux, but to run the sshd daemon we need to
install the ssh package first. Use this command to do that:
$ sudo apt-get install ssh
This will install ssh on our machine. If the output looks similar to the following, it is
set up properly:
$ which ssh
/usr/bin/ssh
$ which sshd
/usr/sbin/sshd
CREATE AND SETUP SSH CERTIFICATES
Hadoop requires SSH access to manage its nodes, i.e. remote machines plus our local
machine. For our single-node setup of Hadoop, we therefore need to configure SSH access
to localhost.
So, we need to have SSH up and running on our machine and configure it to allow
SSH public key authentication.
Hadoop uses SSH to access its nodes, which would normally require the user to
enter a password. This requirement can be eliminated by creating and setting up an SSH
key pair using the following commands. If asked for a filename, just leave it blank and
press the enter key to continue.
$ su hduser
$ ssh-keygen -t rsa -P ""
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hduser/.ssh/id_rsa):
Created directory '/home/hduser/.ssh'.
Your identification has been saved in /home/hduser/.ssh/id_rsa.
Your public key has been saved in /home/hduser/.ssh/id_rsa.pub.
The key fingerprint is:
5a:48:c1:c2:18:f7:93:b5:7e:d2:b8:4b:c3:19:7b:bf hduser@k
The key's randomart image is:
$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
The cat command appends the newly created public key to the list of authorized keys, so
that Hadoop can use ssh without prompting for a password.
We can check if ssh works:
$ ssh localhost
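On the first connection, ssh asks us to confirm the host's fingerprint; answering yes adds
localhost to the known hosts. If the login succeeds without a password prompt, the keys are
working, and we can return to the previous shell:
$ exit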
INSTALL HADOOP
$ wget http://mirrors.sonic.net/apache/hadoop/common/hadoop-2.4.1/hadoop-2.4.1.tar.gz
$ tar xvzf hadoop-2.4.1.tar.gz
We want to move the Hadoop installation to the /usr/local/hadoop directory using the
following command:
$ sudo mv hadoop-2.4.1 /usr/local/hadoop
$ sudo chown -R hduser:hadoop /usr/local/hadoop
$ cd /usr/local/hadoop
$ pwd
/usr/local/hadoop
SETUP CONFIGURATION FILES
The following files will have to be modified to complete the Hadoop setup:
1).~/.bashrc
Before editing the .bashrc file in our home directory, we need to find the path where Java
has been installed, so that we can set the JAVA_HOME environment variable; see the check below.
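One way to locate it (a minimal check; the path shown matches the default-jdk package
installed earlier and may differ on other systems):
$ update-alternatives --config java
The reported binary, e.g. /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java, sits under the
JAVA_HOME directory, here /usr/lib/jvm/java-7-openjdk-amd64.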
Now we can append the following to the end of ~/.bashrc:
#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END
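For the new variables to take effect in the current shell, reload the file:
$ source ~/.bashrc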
2)./usr/local/hadoop/etc/hadoop/hadoop-env.sh
We need to set JAVA_HOME by modifying the hadoop-env.sh file:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
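Editing the file by hand works; alternatively, since hadoop-env.sh ships with an
export JAVA_HOME= line, a one-liner such as the following can rewrite it in place (a
sketch, assuming the stock Hadoop 2.4.1 file layout):
$ sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64|' /usr/local/hadoop/etc/hadoop/hadoop-env.sh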
3)./usr/local/hadoop/etc/hadoop/core-site.xml
The /usr/local/hadoop/etc/hadoop/core-site.xml file contains configuration properties that
Hadoop uses when starting up. This file can be used to override the default settings that
Hadoop starts with.
$ sudo mkdir -p /app/hadoop/tmp
$ sudo chown hduser:hadoop /app/hadoop/tmp
Open the file and enter the following in between the <configuration></configuration> tag:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/app/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
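Note that fs.default.name is the legacy key for this setting; Hadoop 2.x still honors it
but prefers the name fs.defaultFS for the same property.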
4)./usr/local/hadoop/etc/hadoop/mapred-site.xml.
By default, the /usr/local/hadoop/etc/hadoop/ folder contains a
/usr/local/hadoop/etc/hadoop/mapred-site.xml.template file, which has to be copied (or
renamed) to mapred-site.xml:
$ cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml
The mapred-site.xml file is used to specify which framework is being used for
MapReduce. We need to enter the following content in between the
<configuration></configuration> tag:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
</configuration>
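Note that mapred.job.tracker is a Hadoop 1.x property. With YARN (set up in the next
step), jobs are commonly directed to YARN instead by setting mapreduce.framework.name; a
minimal sketch of that alternative, not part of this walkthrough:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>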
5)./usr/local/hadoop/etc/hadoop/yarn-site.xml.
Add the following content in between the <configuration></configuration> tag:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
6)./usr/local/hadoop/etc/hadoop/hdfs-site.xml.
The /usr/local/hadoop/etc/hadoop/hdfs-site.xml file needs to be configured for each host in
the cluster that is being used. It is used to specify the directories which will be used as the
namenode and the datanode storage on that host.
Before editing this file, we need to create two directories which will hold the namenode
and the datanode data for this Hadoop installation. This can be done using the following
commands:
$ sudo mkdir -p /usr/local/hadoop_store/hdfs/namenode
$ sudo mkdir -p /usr/local/hadoop_store/hdfs/datanode
$ sudo chown -R hduser:hadoop /usr/local/hadoop_store
Open the file and enter the following content in between the
<configuration></configuration> tag:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
</configuration>
FORMAT THE NEW HADOOP FILESYSTEM
Now the Hadoop filesystem needs to be formatted so that we can start to use it. The format
command should be run by a user with write permission, since it creates a current directory
under the /usr/local/hadoop_store/hdfs/namenode folder:
hduser@k:~$ hadoop namenode -format
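In Hadoop 2.x this classic form still works but prints a deprecation warning; the preferred
equivalent is:
hduser@k:~$ hdfs namenode -format
On success, the output includes a line reporting that the storage directory under
/usr/local/hadoop_store/hdfs/namenode has been successfully formatted.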
STARTING HADOOP
Now it's time to start the newly installed single node cluster.
We can use start-all.sh (deprecated in Hadoop 2.x but still functional), or start-dfs.sh followed by start-yarn.sh:
hduser@k:/home/k$ start-all.sh
We can check if it's really up and running:
hduser@k:/home/k$ jps
6129 Jps
5484 NameNode
5811 SecondaryNameNode
5969 ResourceManager
6094 NodeManager
5610 DataNode
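If the five Hadoop daemons (NameNode, DataNode, SecondaryNameNode, ResourceManager and
NodeManager) all appear, the single-node cluster is up. As a further check, the default
Hadoop 2.x web interfaces are served at http://localhost:50070 (NameNode) and
http://localhost:8088 (YARN ResourceManager). When we are done, the cluster can be stopped
with:
hduser@k:/home/k$ stop-all.sh
or, equivalently, stop-dfs.sh followed by stop-yarn.sh.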
