Apache Hadoop Tutorial – We shall learn to install Apache Hadoop on Ubuntu. Java is a prerequisite to run Hadoop.
Install Apache Hadoop on Ubuntu
The following is a step-by-step guide to installing Apache Hadoop on Ubuntu.
Install Java
Hadoop is an open-source framework written in Java, so Java must be installed before Hadoop can run on your computer.
Open a terminal and run the following command:
$ sudo apt-get install default-jdk
To verify the Java installation, run the following command in the terminal:
$ java -version
The output for the command would be as shown below.
hadoopuser@tutorialkart:~# java -version
openjdk version "1.8.0_131"
OpenJDK Runtime Environment (build 1.8.0_131-8u131-b11-0ubuntu1.16.04.2-b11)
OpenJDK 64-Bit Server VM (build 25.131-b11, mixed mode)
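If you want to check the Java version from a script rather than by eye, the banner line can be parsed. The following is a minimal sketch, not an official tool: java_major_version is a hypothetical helper name, and it assumes the banner contains the version in double quotes, as in the output above.

```shell
# Extract the major Java version from a `java -version` banner line.
# Handles the legacy "1.8.0_131" scheme and the newer "11.0.2" scheme.
# java_major_version is a hypothetical helper, not part of Java or Hadoop.
java_major_version() {
    banner="$1"
    # Pull the quoted version string out of the banner.
    version=$(printf '%s\n' "$banner" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$version" in
        1.*) printf '%s\n' "$version" | cut -d. -f2 ;;   # 1.8.0_131 -> 8
        *)   printf '%s\n' "$version" | cut -d. -f1 ;;   # 11.0.2    -> 11
    esac
}

# java -version writes to stderr, so redirect it before parsing.
if command -v java >/dev/null 2>&1; then
    java_major_version "$(java -version 2>&1 | head -n 1)"
else
    echo "java not found on PATH" >&2
fi
```

With the output shown above, the function would print 8.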
Install Hadoop
Download the latest Hadoop binary package from [http://hadoop.apache.org/releases.html].
Look for the latest stable release (not an alpha release) and click the binary link provided for that release.

Click on the first mirror link

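Copy the downloaded tar file to /usr/lib/ and extract it there. The commands below use hadoop-2.8.1; substitute the version you downloaded.
$ sudo cp hadoop-2.8.1.tar.gz /usr/lib/
$ cd /usr/lib/
$ sudo tar zxf hadoop-2.8.1.tar.gz
$ sudo rm hadoop-2.8.1.tar.gz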
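As an alternative to copying the archive first, tar can unpack straight into a target directory with its -C option. The sketch below wraps that in a small function; extract_hadoop is a hypothetical name used only for illustration, and both arguments are supplied by the caller.

```shell
# Extract a tarball into a target directory without changing directory.
# extract_hadoop is a hypothetical helper, not part of Hadoop.
extract_hadoop() {
    archive="$1"
    dest="$2"
    # Fail early with a clear message if the archive is missing.
    [ -f "$archive" ] || { echo "archive not found: $archive" >&2; return 1; }
    mkdir -p "$dest" || return 1
    # -C makes tar unpack into $dest instead of the current directory.
    tar -xzf "$archive" -C "$dest"
}

# Equivalent one-liner for this tutorial (root is needed to write to /usr/lib):
#   sudo tar -xzf hadoop-2.8.1.tar.gz -C /usr/lib/
```

Extracting with -C leaves the downloaded archive where it is, so no copy in /usr/lib/ has to be removed afterwards.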
Provide the password if asked.
Set Java and Hadoop Path
Make sure the paths for Java and Hadoop are set in your .bashrc file. Open a terminal and run the following command to edit the file.
$ nano ~/.bashrc
Paste the following entries at the end of the .bashrc file.
#HADOOP VARIABLES START
export JAVA_HOME=/usr/lib/jvm/default-java/jre
export HADOOP_INSTALL=/usr/lib/hadoop-2.8.1
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END
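Because .bashrc is read by every new interactive shell, plain PATH=$PATH:... lines can add the same directory again in nested shells or when the file is re-sourced. One possible way to avoid duplicates is sketched below; path_append is a hypothetical helper name, and the install location is the one used in the entries above.

```shell
# Add a directory to PATH only when it is not already listed.
# path_append is a hypothetical helper name, not part of Hadoop or bash.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, leave PATH unchanged
        *) PATH="$PATH:$1" ;;        # otherwise append at the end
    esac
}

# Install location taken from the entries above.
export HADOOP_INSTALL=/usr/lib/hadoop-2.8.1
path_append "$HADOOP_INSTALL/bin"
path_append "$HADOOP_INSTALL/sbin"
export PATH
```

Wrapping PATH in colons before matching lets the same pattern catch an entry at the start, middle, or end of the list.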
Run Hadoop
After setting the paths for Hadoop and Java, reload the file with source ~/.bashrc (or open a new terminal) so that the changes take effect. You can then run the hadoop command from any directory in the terminal.
$ hadoop
The output should be as shown below:
Usage: hadoop [--config confdir] [COMMAND | CLASSNAME]
  CLASSNAME            run the class named CLASSNAME
 or
  where COMMAND is one of:
  fs                   run a generic filesystem user client
  version              print the version
  jar                  run a jar file
                       note: please use "yarn jar" to launch
                             YARN applications, not this command.
  checknative [-a|-h]  check native hadoop and compression libraries availability
  distcp               copy file or directories recursively
  archive -archiveName NAME -p * create a hadoop archive
  classpath            prints the class path needed to get the
                       Hadoop jar and the required libraries
  credential           interact with credential providers
  daemonlog            get/set the log level for each daemon
  trace                view and modify Hadoop tracing settings

Most commands print help when invoked w/o parameters.
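If the hadoop command is not found instead, the usual cause is that the new PATH entries have not been loaded yet. A small check along the following lines can confirm that both commands are reachable; require_commands is a hypothetical helper name, and the version subcommand is the one listed in the usage text above.

```shell
# Report whether each required command can be found on PATH.
# Returns 0 only when every named command is available.
# require_commands is a hypothetical helper, not part of Hadoop.
require_commands() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found:   $cmd -> $(command -v "$cmd")"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# java and hadoop are the two commands this tutorial installs.
if require_commands java hadoop; then
    hadoop version    # prints the installed Hadoop version
else
    echo "Reload ~/.bashrc (source ~/.bashrc) or recheck the paths set earlier."
fi
```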
Conclusion
In this Apache Hadoop Tutorial, we have successfully installed Hadoop on Ubuntu. In subsequent tutorials, we shall look into HDFS and MapReduce, starting with the Word Count example in Hadoop.