Saish JagTap

Posted on Jan 18, 2023

Installing Hadoop 3.1.0 on Windows 10

#hadoop #windows10 #installinghadoop

Step 1: Install Java JDK 1.8

After installing Java, check the version by running java -version in cmd.

Step 2: Download Hadoop 3.1.0 and extract it to the C:\ drive.
The commands later in this guide assume it ends up in C:\hadoop\hadoop-3.1.0.

Step 3: Setup System Environment Variables

To edit the system environment variables, open Environment Variables from System Properties.

Create 2 new user variables:

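Variable Name: HADOOP_HOME Variable Value: The path of the folder where Hadoop was extracted (the folder that contains bin, not the bin folder itself).

Variable Name: JAVA_HOME Variable Value: The path of the JDK installation folder (the folder that contains bin, not the bin folder itself).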

To put the Hadoop and Java bin directories on the Path, edit Path under System variables.

Click New and add the bin directory path of Hadoop, then do the same for Java.
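As a rough command-line equivalent of the dialog steps above, the same variables can be set for the current cmd session only. This is a sketch, not part of the original steps: C:\Java\jdk1.8.0_202 is a hypothetical JDK location, C:\hadoop\hadoop-3.1.0 is the Hadoop folder used later in this guide, and both variables are assumed to point at the folder that contains bin, since Hadoop's scripts look for %JAVA_HOME%\bin\java.exe.

```bat
rem Session-only settings; they vanish when this cmd window is closed.
rem Both paths are assumptions; adjust them to the actual install locations.
set "HADOOP_HOME=C:\hadoop\hadoop-3.1.0"
set "JAVA_HOME=C:\Java\jdk1.8.0_202"

rem Append both bin folders to the Path of this session.
set "PATH=%PATH%;%HADOOP_HOME%\bin;%JAVA_HOME%\bin"

rem Print the values to confirm them.
echo %HADOOP_HOME%
echo %JAVA_HOME%
```

Variables saved through System Properties only reach cmd windows opened afterwards, so open a new window before testing them.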

Step 4: Configuration
Edit the configuration files in the etc folder of the Hadoop directory.

Go to this link

Download the files and place them as follows:

Add bin_new to the hadoop directory.
Add core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml to the etc/hadoop directory, replacing the existing files.

Inside the etc/hadoop directory, edit hadoop-env.cmd.

Replace %JAVA_HOME% with the path of the folder where JDK 1.8 is installed, then save the file.
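For illustration, the edited line in hadoop-env.cmd might end up looking like the sketch below. The JDK path is a hypothetical example, and the "original" line is what a stock hadoop-env.cmd is expected to contain. A path without spaces is assumed, because a location such as C:\Program Files often triggers a "JAVA_HOME is incorrectly set" error in Hadoop's scripts.

```bat
rem Original line in a stock hadoop-env.cmd:
rem   set JAVA_HOME=%JAVA_HOME%
rem Edited line (hypothetical JDK path; adjust to the actual install folder):
set JAVA_HOME=C:\Java\jdk1.8.0_202
```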

Rename the bin folder to bin_old, then rename bin_new to bin.

Check whether Hadoop is installed correctly by running hadoop version in a new cmd window.
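The check can be sketched as the two commands below. The where command is an extra step not mentioned in the original; it is only there to show which hadoop script cmd picks up from the Path.

```bat
rem Show which hadoop script cmd resolves from the Path.
where hadoop

rem Print the installed Hadoop version.
hadoop version
```

If where reports that it cannot find the file, the Hadoop bin folder is most likely missing from the Path.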

If the command prints the version information, Hadoop is successfully installed in the system.

From the GitHub folder downloaded above, copy the hadoop-yarn-server-timelineservice-3.1.0.jar file and paste it into the hadoop/share/hadoop/yarn folder.

Step 5: Format NameNode
Once Hadoop is installed, format the NameNode. Do this only once, before the first start: formatting erases all the data inside HDFS, so it should not be repeated on an installation that already holds data.

Run the following command to format the NameNode:
hdfs namenode -format

Step 6: Start all the Hadoop daemons
Change the directory in cmd to the sbin folder of the Hadoop directory using the following command:
cd C:\hadoop\hadoop-3.1.0\sbin

Start the NameNode and DataNode using the start-dfs.cmd command.

Two cmd windows will open, one for the NameNode and one for the DataNode.

Start YARN using the start-yarn.cmd command.

Two more windows will open, one for the YARN ResourceManager and one for the YARN NodeManager.
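One way to confirm that all four daemons are up is the jps tool that ships with the JDK; this check is an addition to the original steps. When everything is running, its output should include NameNode, DataNode, ResourceManager, and NodeManager. The sketch also shows the matching stop scripts from the same sbin folder.

```bat
rem List the running JVM processes; jps ships with the JDK.
jps

rem When finished, stop the daemons from the same sbin folder.
call stop-yarn.cmd
call stop-dfs.cmd
```

If one of the names is missing from the jps output, the cmd window that opened for that daemon usually shows the error that stopped it.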

Step 7: Verification
To see information about the ResourceManager's current, successful, and failed jobs, open this link in a browser: http://localhost:8088/cluster
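The page can also be opened straight from cmd, as in the sketch below. The second URL is an addition to the original steps: it assumes the NameNode web UI listens on port 9870, the Hadoop 3.x default, and that the downloaded hdfs-site.xml does not override it.

```bat
rem Open the YARN ResourceManager UI (URL from the step above).
start "" "http://localhost:8088/cluster"

rem Open the NameNode UI; 9870 is the Hadoop 3.x default port
rem (assumption: not overridden in hdfs-site.xml).
start "" "http://localhost:9870"
```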