DEV Community

Saish JagTap

Installing Hadoop 3.1.0 in Windows 10

Step 1: Install Java JDK 1.8 on your system

After installing Java, check the installed version by running java -version in cmd.
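
If you would rather script the version check than read it by eye, the following is a minimal Python sketch. It assumes Python is available on the machine; the function names are hypothetical, and the only fact relied on is that java -version prints a banner containing a quoted version string.

```python
import re
import subprocess


def java_version_label(version_output):
    """Return '1.8' for Java 8 style banners, or the major number ('11', '17') otherwise."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found in the output")
    major, minor = match.group(1), match.group(2)
    # Java 8 and earlier report themselves as 1.x; later releases use 9, 11, 17, ...
    if major == "1" and minor is not None:
        return "1." + minor
    return major


def installed_java_version():
    """Run `java -version` and return its version label."""
    # `java -version` writes its banner to stderr, not stdout.
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return java_version_label(result.stderr)
```

Calling installed_java_version() should return "1.8" when JDK 1.8 is the Java found on the Path.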

Step 2: Download Hadoop 3.1.0 and extract it to the C:\ drive. The commands later in this guide assume the extracted folder is C:\hadoop\hadoop-3.1.0.

Step 3: Set up system environment variables

To edit the system environment variables, open System Properties and click Environment Variables.

Create 2 new user variables:

  1. Variable Name: HADOOP_HOME Variable Value: The path of the folder where Hadoop was extracted (the folder that contains bin, not the bin folder itself).

  2. Variable Name: JAVA_HOME Variable Value: The path of the Java JDK installation folder (the folder that contains bin, not the bin folder itself).

To add the Hadoop and Java bin directories to the system path, edit Path under System variables.

Click New and add the bin directory paths of Hadoop and Java.
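
A common source of trouble at this step is a variable that points at the wrong folder. Hadoop's Windows scripts append \bin to HADOOP_HOME and JAVA_HOME themselves, so both variables should name the install folders, while Path should contain the two bin folders. The sketch below is a hypothetical Python helper (the function name and messages are made up for illustration) that checks these three things against a mapping of environment variables.

```python
import ntpath


def check_hadoop_env(env):
    """Return a list of problems found in a mapping of environment variables."""
    problems = []
    raw_path = env.get("Path") or env.get("PATH", "")
    # Normalise Path entries so the comparison ignores case and trailing slashes.
    path_entries = {
        ntpath.normcase(entry.rstrip("\\/"))
        for entry in raw_path.split(";")
        if entry
    }
    for name in ("HADOOP_HOME", "JAVA_HOME"):
        value = env.get(name, "").rstrip("\\/")
        if not value:
            problems.append(name + " is not set")
            continue
        if ntpath.basename(value).lower() == "bin":
            problems.append(name + " should point to the install folder, not its bin folder")
        expected_bin = ntpath.normcase(ntpath.join(value, "bin"))
        if expected_bin not in path_entries:
            problems.append(name + "\\bin is not on Path")
    return problems
```

Open a new cmd window after changing the variables (already-open windows keep the old values), then call check_hadoop_env(dict(os.environ)) after importing os. An empty list means none of the three problems was detected.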

Step 4: Configuration
Several files in the etc folder of the Hadoop directory need to be edited or replaced.

Go to this link

Download the files and put them in the following locations:

  • Add bin_new to the hadoop directory.

  • Add core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml to the etc/hadoop directory, replacing the existing files.
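
Before starting anything, it can help to see what the replaced files actually configure. Hadoop's *-site.xml files share one layout: a configuration root element holding property entries, each with a name and a value. The following Python sketch (the function name is hypothetical) reads one such file into a dictionary so you can inspect it. It makes no assumption about which properties the downloaded files contain.

```python
import xml.etree.ElementTree as ET


def read_site_properties(xml_path):
    """Return {name: value} for every <property> in a Hadoop *-site.xml file."""
    root = ET.parse(xml_path).getroot()
    properties = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value", default="")
        if name:
            properties[name.strip()] = value.strip()
    return properties
```

For example, read_site_properties(r"C:\hadoop\hadoop-3.1.0\etc\hadoop\hdfs-site.xml") lists the HDFS settings, including any directories the NameNode and DataNode are told to use.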

In the etc/hadoop directory, open hadoop-env.cmd for editing.

Replace %JAVA_HOME% (in the line that sets JAVA_HOME) with the path of the folder where JDK 1.8 is installed, then save the file.

Rename the bin folder to bin_old, then rename bin_new to bin.

Check whether Hadoop is installed correctly by running hadoop version in cmd.

If the command prints the Hadoop version, Hadoop is successfully installed on the system.

From the GitHub folder downloaded above, copy the hadoop-yarn-server-timelineservice-3.1.0.jar file into the hadoop/share/hadoop/yarn folder.

Step 5: Format NameNode
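Once Hadoop is installed, format the NameNode before starting HDFS for the first time. Do this only once: formatting the NameNode again erases the HDFS metadata, which makes all data already stored in HDFS inaccessible.

Run the following command to format the NameNode:
hdfs namenode -format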
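
Because formatting wipes the NameNode's metadata, it is worth checking whether a format has already happened before running the command again. A formatted NameNode directory contains a current folder with a VERSION file in it. The Python sketch below (the function name is hypothetical) tests for that file. The directory to pass in is whatever dfs.namenode.name.dir is set to in your hdfs-site.xml, which this guide does not show, so the path in the usage note is only a placeholder.

```python
from pathlib import Path


def namenode_is_formatted(name_dir):
    """Return True if name_dir already holds a formatted NameNode image."""
    # A successful `hdfs namenode -format` creates <name_dir>/current/VERSION.
    return (Path(name_dir) / "current" / "VERSION").is_file()
```

A call might look like namenode_is_formatted(r"C:\path\to\namenode\dir"). If it returns True, running the format command again would discard the existing HDFS metadata.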

Step 6: Start all the Apache Hadoop daemons
Change the directory in cmd to the sbin folder of the Hadoop directory using the following command:
cd C:\hadoop\hadoop-3.1.0\sbin

Start the NameNode and DataNode using the start-dfs.cmd command.

Two cmd windows will open, one for the NameNode and one for the DataNode.

Start YARN using the start-yarn.cmd command.

Two more windows will open, one for the YARN ResourceManager and one for the YARN NodeManager.

Step 7: Verification
To view the ResourceManager's information about current, successful, and failed jobs, open this link in a browser: http://localhost:8088/cluster
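
If the page does not load, a quick way to tell whether the daemon is listening at all is to probe its port. The Python sketch below is a minimal example with hypothetical function names. Port 8088 comes from the link above. Port 9870 is the default NameNode web UI port in Hadoop 3.x and is an assumption here, since it can be overridden in hdfs-site.xml.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_hadoop_web_uis(host="localhost"):
    """Probe the web UI ports and return {description: reachable}."""
    ports = {
        "ResourceManager web UI": 8088,  # from http://localhost:8088/cluster
        "NameNode web UI": 9870,  # Hadoop 3.x default; an assumption for this setup
    }
    return {label: port_is_open(host, port) for label, port in ports.items()}
```

A False entry in the result of check_hadoop_web_uis() suggests the matching daemon did not start, in which case the error output in its cmd window is the place to look.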

Thank You for Reading

Top comments (1)

Sangram Mohanty

Hi Saish,
I followed all your steps, but I am receiving errors while trying to start the NameNode and DataNode.

Also, no "data" folder is created when I try to start the NameNode and DataNode.

Kindly help me fix this. Let me know how you were able to start the NameNode and DataNode, and how the "data" folder gets created.