Steps to install Hadoop 2.x release (YARN or Next-Gen) on a single-node cluster

Hadoop 2.x involves many changes to Hadoop and MapReduce. The centralized JobTracker service is replaced with a ResourceManager, which manages the resources in the cluster, and per-application ApplicationMasters, which manage the application lifecycle. These architectural changes enable Hadoop to scale to much larger clusters. For more details on the architectural changes in Hadoop next-gen (a.k.a. YARN), watch this video or visit this blog.

This post concentrates on installing Hadoop 2.x, a.k.a. YARN, a.k.a. next-gen, on a single-node cluster.

Prerequisites:

  • Java 6 or later installed
  • A dedicated user for Hadoop
  • SSH configured (passwordless login to localhost)
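
A quick way to verify or set up these prerequisites on a Debian/Ubuntu-style system (the exact commands below are an assumption; adapt them to your distribution):

$ java -version                                   # should report 1.6 or later
$ sudo addgroup hadoop                            # group used later for chown
$ sudo adduser --ingroup hadoop hduser            # dedicated Hadoop user
$ su - hduser
$ ssh-keygen -t rsa -P ""                         # key for passwordless SSH
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
$ ssh localhost                                   # should log in without a password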

Steps to install Hadoop 2.x:

1. Download tarball

You can download a tarball for Hadoop 2.x from the Apache Hadoop releases page. Extract it to a folder, say /home/hduser/yarn. We assume the dedicated Hadoop user is “hduser” and that it belongs to the group “hadoop”.
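
For example, assuming the tarball is fetched from the Apache archive (the exact mirror URL below is an assumption; use the link given on the releases page for your version):

$ wget -P /home/hduser/yarn http://archive.apache.org/dist/hadoop/common/hadoop-2.0.1-alpha/hadoop-2.0.1-alpha.tar.gz
$ tar -xzf /home/hduser/yarn/hadoop-2.0.1-alpha.tar.gz -C /home/hduser/yarn

Then give the Hadoop user ownership of the extracted directory: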

$ cd /home/hduser/yarn
$ sudo chown -R hduser:hadoop hadoop-2.0.1-alpha

2. Setup Environment Variables

$ export HADOOP_HOME=$HOME/yarn/hadoop-2.0.1-alpha
$ export HADOOP_MAPRED_HOME=$HOME/yarn/hadoop-2.0.1-alpha
$ export HADOOP_COMMON_HOME=$HOME/yarn/hadoop-2.0.1-alpha
$ export HADOOP_HDFS_HOME=$HOME/yarn/hadoop-2.0.1-alpha
$ export HADOOP_YARN_HOME=$HOME/yarn/hadoop-2.0.1-alpha
$ export HADOOP_CONF_DIR=$HOME/yarn/hadoop-2.0.1-alpha/etc/hadoop

This step is important: if you miss a variable or set a value incorrectly, the resulting errors are hard to trace and jobs will fail later.

Also, add these to your ~/.bashrc or other shell start-up script so that you don’t need to set them every time.
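
For example, appending them to ~/.bashrc (the paths assume the hadoop-2.0.1-alpha layout used above):

$ cat >> ~/.bashrc << 'EOF'
export HADOOP_HOME=$HOME/yarn/hadoop-2.0.1-alpha
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
EOF
$ source ~/.bashrc
$ echo $HADOOP_CONF_DIR    # quick sanity check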

3. Create directories

Create two directories to be used by the namenode and the datanode for HDFS storage.

$ mkdir -p $HOME/yarn/yarn_data/hdfs/namenode
$ mkdir -p $HOME/yarn/yarn_data/hdfs/datanode

4. Set up config files

$ cd $HADOOP_YARN_HOME

Add the following properties inside the <configuration> tag of each file mentioned below:

etc/hadoop/yarn-site.xml:

<property>
   <name>yarn.nodemanager.aux-services</name>
   <value>mapreduce_shuffle</value>
</property>
<property>
   <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
   <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

etc/hadoop/core-site.xml:

<property>
   <name>fs.default.name</name>
   <value>hdfs://localhost:9000</value>
</property>

etc/hadoop/hdfs-site.xml:

 
<property>
   <name>dfs.replication</name>
   <value>1</value>
</property>
<property>
   <name>dfs.namenode.name.dir</name>
   <value>file:/home/hduser/yarn/yarn_data/hdfs/namenode</value>
</property>
<property>
   <name>dfs.datanode.data.dir</name>
   <value>file:/home/hduser/yarn/yarn_data/hdfs/datanode</value>
</property>

etc/hadoop/mapred-site.xml:

If this file does not exist, create it and paste the content provided below.
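
Alternatively, many 2.x tarballs ship a template that can be copied as a starting point (whether your particular release includes it is an assumption; check etc/hadoop):

$ cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml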



   
<?xml version="1.0"?>
<configuration>
   <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
   </property>
</configuration>

5. Format namenode

This step is needed only the first time you set up the cluster; formatting again will erase the contents of HDFS. (In 2.x, bin/hdfs namenode -format is the preferred form; bin/hadoop namenode -format still works but prints a deprecation warning.)

$ bin/hadoop namenode -format

6. Start HDFS processes

Name node:

$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /home/hduser/yarn/hadoop-2.0.1-alpha/logs/hadoop-hduser-namenode-pc3-laptop.out
$ jps
18509 Jps
17107 NameNode

Data node:

$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /home/hduser/yarn/hadoop-2.0.1-alpha/logs/hadoop-hduser-datanode-pc3-laptop.out
$ jps
18509 Jps
17107 NameNode
17170 DataNode

7. Start Hadoop Map-Reduce Processes

Resource Manager:

$ sbin/yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /home/hduser/yarn/hadoop-2.0.1-alpha/logs/yarn-hduser-resourcemanager-pc3-laptop.out
$ jps
18509 Jps
17107 NameNode
17170 DataNode
17252 ResourceManager

Node Manager:

$ sbin/yarn-daemon.sh start nodemanager
starting nodemanager, logging to /home/hduser/yarn/hadoop-2.0.1-alpha/logs/yarn-hduser-nodemanager-pc3-laptop.out
$ jps
18509 Jps
17107 NameNode
17170 DataNode
17252 ResourceManager
17309 NodeManager

Job History Server:

$ sbin/mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /home/hduser/yarn/hadoop-2.0.1-alpha/logs/yarn-hduser-historyserver-pc3-laptop.out
$ jps
18509 Jps
17107 NameNode
17170 DataNode
17252 ResourceManager
17309 NodeManager
17626 JobHistoryServer

8. Run the famous wordcount example to verify the installation

$ mkdir in
$ cat > in/file
This is one line
This is another one

Press Ctrl+D to end the input.

Add this directory to HDFS:

$ bin/hadoop dfs -copyFromLocal in /in
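
To verify the upload (bin/hdfs dfs is the non-deprecated equivalent of bin/hadoop dfs in 2.x):

$ bin/hdfs dfs -ls /in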

Run wordcount example provided:

$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.*-alpha.jar wordcount /in /out

Check the output:

$ bin/hadoop dfs -cat /out/*
This 2
another 1
is 2
line 1
one 2

9. Web interface

Browse HDFS and check its health at http://localhost:50070 in the browser.

You can check the status of running applications using the following URL:

http://localhost:8088
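
If you prefer the command line, a quick reachability check of both web UIs (assuming curl is installed):

$ curl -s -o /dev/null -w "%{http_code}\n" http://localhost:50070   # NameNode UI
$ curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8088    # ResourceManager UI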

10. Stop the processes

$ sbin/hadoop-daemon.sh stop namenode
$ sbin/hadoop-daemon.sh stop datanode
$ sbin/yarn-daemon.sh stop resourcemanager
$ sbin/yarn-daemon.sh stop nodemanager
$ sbin/mr-jobhistory-daemon.sh stop historyserver
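
Alternatively, the combined scripts shipped in sbin/ can start and stop everything except the history server in one go (their presence in your release is an assumption):

$ sbin/start-dfs.sh
$ sbin/start-yarn.sh
$ sbin/stop-yarn.sh
$ sbin/stop-dfs.sh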

Happy Coding!!!

79 thoughts on “Steps to install Hadoop 2.x release (YARN or Next-Gen) on a single-node cluster”

  1. Pingback: Steps to install Hadoop 2.x release (Yarn or Next-Gen) on multi-node cluster | Rasesh Mori

  2. Thanks a lot! This is the best resource I found for the latest Hadoop release. Others were outdated. Faced an issue that the hostname was not found when formatting the data node. Using other internet resources I found that “hostname -fsqn” was giving an error. So I used yast > network settings > hostname/dns > assign hostname to loopback IP – checked the box. Took a long time to find it, as I am a Linux newbie.

  3. Hi,

    Nice. I am able to run jobs on my local machine.

    I am just wondering where the job history files are located. Do you have any idea about this? Can this location be configured in Hadoop 2.0?

  4. Hi ,

    When I am running bin/hadoop namenode -format
    I am getting following error:

    hduser@ubuntu:~/lab/install/hadoop-2.0.4-alpha$ bin/hadoop namenode -format
    bin/hadoop: line 26: /home/hduser/lab/install/hadoop-2.0.4-alpha/bin/../libexec/hadoop-config.sh: No such file or directory
    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.

    /home/hduser/lab/install/hadoop-2.0.4-alpha/bin/hdfs: line 34: /home/hduser/lab/install/hadoop-2.0.4-alpha/bin/../libexec/hdfs-config.sh: No such file or directory
    /home/hduser/lab/install/hadoop-2.0.4-alpha/bin/hdfs: line 150: cygpath: command not found
    /home/hduser/lab/install/hadoop-2.0.4-alpha/bin/hdfs: line 191: exec: : not found

  5. Pingback: » Hadoop 2.1.0 beta and Raspberry Pi Y12 Studio

  6. First of all, thanks a lot Rasesh for the wonderful tutorial. With its help, I configured Hadoop 2.1.0 Beta in the first shot. All components are up and running.
    nayan@nayan-desktop:~/dev$ jps
    11426
    1546 ResourceManager
    2038 JobHistoryServer
    2100 Jps
    1365 DataNode
    1275 NameNode
    1792 NodeManager

    But when I try to create a directory, it is failing.

    nayan@nayan-desktop:~/dev$ hdfs dfs -mkdir /home/nayan/hdfs/sample
    13/08/30 17:07:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
    mkdir: `/home/nayan/hdfs/sample’: No such file or directory

    Can you please tell what I am doing wrong?

    Thanks,
    Nayan

  7. Great tutorial, thanks for sharing.

    As of Hadoop 2.2.0 it looks like two things need to change in the yarn-site.xml configuration file (mapreduce.shuffle is now mapreduce_shuffle). Mine looks like this now:

    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
      <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>

    if I don’t do this, I see the following error:

    2013-10-14 22:26:21,228 FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
    java.lang.IllegalArgumentException: The ServiceName: mapreduce.shuffle set in yarn.nodemanager.aux-services is invalid.The valid service name should only contain a-zA-Z0-9_ and can not start with numbers

  8. urg, my copy/paste was botched up but hopefully this still makes sense. The value mapreduce.shuffle is now mapreduce_shuffle and the name yarn.nodemanager.aux-services.mapreduce.shuffle.class is now yarn.nodemanager.aux-services.mapreduce_shuffle.class

  9. Good Writeup..thanks..but i have one problem..everything works fine except that jobs dont show up on localhost:8088. The job list is empty…and when i submit any job the log shows it is using LocalJobRunner..any idea where i could be going wrong..
    Thanks in advance

  10. Pingback: HowTo: Setup Hadoop 2.X on Linux-Ubuntu (Single Node) | techo byte

  11. Pingback: Hadoop 2 Nodemanager not starting | QueryBy

  12. Pingback: What hadoop can and can’t do. | 做人要豁達大道

  13. Hi Rasesh
    Thanks for this wonderful tutorial.. It was really very helpful. I did all setup and when I am trying to run a job I am getting below error. I am not sure what is an issue. Can you help me finding the possible reason?

    13/11/04 15:54:59 INFO Configuration.deprecation: mapreduce.reduce.class is deprecated. Instead, use mapreduce.job.reduce.class
    13/11/04 15:54:59 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
    13/11/04 15:54:59 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
    13/11/04 15:54:59 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
    13/11/04 15:54:59 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
    13/11/04 15:54:59 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
    13/11/04 15:54:59 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1383608961434_0003
    13/11/04 15:54:59 INFO impl.YarnClientImpl: Submitted application application_1383608961434_0003 to ResourceManager at /0.0.0.0:8032
    13/11/04 15:54:59 INFO mapreduce.Job: The url to track the job: http://Office-Mac-2.local:8088/proxy/application_1383608961434_0003/
    13/11/04 15:54:59 INFO mapreduce.Job: Running job: job_1383608961434_0003
    13/11/04 15:55:02 INFO mapreduce.Job: Job job_1383608961434_0003 running in uber mode : false
    13/11/04 15:55:02 INFO mapreduce.Job: map 0% reduce 0%
    13/11/04 15:55:02 INFO mapreduce.Job: Job job_1383608961434_0003 failed with state FAILED due to: Application application_1383608961434_0003 failed 2 times due to AM Container for appattempt_1383608961434_0003_000002 exited with exitCode: 127 due to: Exception from container-launch:
    org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

      • Hi Lars,

        I found a problem in one of the hadoop environment variables set in .bash_profile file. I am not sure if its the same case for you. for reference I am giving the content of my file. Hope it will help

        export JAVA_HOME=$(/usr/libexec/java_home)
        export HADOOP_MAPRED_HOME=/Volumes/MacNewHD/Hadoop/hadoop-2.2.0/
        export HADOOP_COMMON_HOME=/Volumes/MacNewHD/Hadoop/hadoop-2.2.0/
        export HADOOP_HDFS_HOME=/Volumes/MacNewHD/Hadoop/hadoop-2.2.0/
        export YARN_HOME=/Volumes/MacNewHD/Hadoop/hadoop-2.2.0/
        export HADOOP_YARN_HOME=/Volumes/MacNewHD/Hadoop/hadoop-2.2.0/
        export HADOOP_CONF_DIR=/Volumes/MacNewHD/Hadoop/hadoop-2.2.0/etc/hadoop/
        export YARN_CONF_DIR=$HADOOP_CONF_DIR
        export PATH=/opt/subversion/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/binexport:$JAVA_HOME/bin:

      • Thanks Vivek! I solved it! I had some leftover configurations from other tutorials, so after replacing my config files with the original ones and making the changes explained in this tutorial it was fixed.

  14. Thanx a lot, this was super helpful!
    One minor change is that the config for nodemanager in yarn-site.xml seems to have ‘_’ instead of ‘.’ in it:

    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>

    NodeManager was failing to start without it.

  15. Hi,

    I am getting the following error while starting the nodemanager, due to this Nodemanager process killed. Should I need to configure any entry in etc/hosts file.

    013-11-08 16:46:49,701 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8031
    2013-11-08 16:46:53,738 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
    2013-11-08 16:46:56,742 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
    2013-11-08 16:47:00,750 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
    2013-11-08 16:47:03,758 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
    2013-11-08 16:47:07,766 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
    2013-11-08 16:47:11,774 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
    2013-11-08 16:47:14,782 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
    2013-11-08 16:47:18,790 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
    2013-11-08 16:47:22,798 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
    2013-11-08 16:47:25,806 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8031. Already tried 9 time(s); retry policy is RetryUpToMa

    Thanks
    Balan.P

      • I got this error when my resourcemanager was not running and I was trying to run mapreduce programs. First resourcemannager needs to be started.

  16. Pingback: Yarn, installation | Here

  17. Hi All,
    Im getting the classNotFound issue in all the executions. i doubt Im missing something very basic(Hadoop 2.2 on windows cygwin).
    I found comments mentioning to add -classpath “$(cygpath -pw “$CLASSPATH”)” in some place(Any idea where exactly??).

    Thanks,
    Niteesh

  18. Great, very helpful, a very very very big thank you !
    I am currently reading Hadoop in Action book and was unable to run hadoop on my ubuntu. But, after I followed your configurations I was able to run Hadoop successfully …. !

  19. How can I solve this error?
    http://localhost:8088/proxy/application_1384700761740_0001/

    HTTP ERROR 500

    Problem accessing /proxy/application_1384700761740_0001/. Reason:

    org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id ‘application_1384700761740_0001’ doesn’t exist in RM.
    Caused by:

    java.io.IOException: org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id ‘application_1384700761740_0001’ doesn’t exist in RM.
    at org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.doGet(WebAppProxyServlet.java:341)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:66)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
    at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
    at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
    at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1081)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
    Caused by: org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id ‘application_1384700761740_0001’ doesn’t exist in RM.
    at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:247)
    at org.apache.hadoop.yarn.server.webproxy.AppReportFetcher.getApplicationReport(AppReportFetcher.java:90)
    at org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.getApplicationReport(WebAppProxyServlet.java:221)
    at org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.doGet(WebAppProxyServlet.java:276)
    … 36 more
    Caused by:

    org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id ‘application_1384700761740_0001’ doesn’t exist in RM.
    at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:247)
    at org.apache.hadoop.yarn.server.webproxy.AppReportFetcher.getApplicationReport(AppReportFetcher.java:90)
    at org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.getApplicationReport(WebAppProxyServlet.java:221)
    at org.apache.hadoop.yarn.server.webproxy.WebAppProxyServlet.doGet(WebAppProxyServlet.java:276)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:66)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:900)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:834)
    at com.sun.jersey.spi.container.servlet.ServletContainer.doFilter(ServletContainer.java:795)
    at com.google.inject.servlet.FilterDefinition.doFilter(FilterDefinition.java:163)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:58)
    at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:118)
    at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:113)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.lib.StaticUserWebFilter$StaticUserFilter.doFilter(StaticUserWebFilter.java:109)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:1081)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:326)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

  20. Hi,
    I have set up Hadoop 2.2.0 on Windows7(64 bit) with Cygwin
    I am able to start dfs and yarn
    but now when trying to run Word Count it fails while launching container.
    It’s something related to classpath issue.
    As I am getting following error
    java.lang.NoClassDefFoundError: org/apache/hadoop/service/CompositeService
    and
    Could not find the main class: org.apache.hadoop.mapreduce.v2.app.MRAppMaster. Program will exit.
    I have set system level env. variables as follows
    HADOOP_COMMON_HOME=c:\hadoop
    HADOOP_CONF_DIR=c:\hadoop\etc\hadoop
    HADOOP_HDFS_HOME=c:\hadoop
    HADOOP_HOME=c:\hadoop
    HADOOP_MAPRED_HOME=c:\hadoop
    YARN_HOME=c:\hadoop

  21. Hello Friends,

    This was created when 2.0 was in the alpha stage, and since then there have been some changes in the configurations. Based on some comments and other sources, I have updated the steps (mostly a change in one environment variable: YARN_HOME to HADOOP_YARN_HOME). I hope this will help all of you.

  22. hduser@999:/usr/local/hadoop/hadoop/sbin$ ./start-all.sh
    This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
    Error: Cannot find configuration directory: /user/local/hadoop/hadoop/conf
    starting yarn daemons
    Error: Cannot find configuration directory: /user/local/hadoop/hadoop/conf

  23. Thanks for a really simple and yet working instructions. Just followed these steps with 2.2.0 and got word count example working.

  24. Hi, when I try to run the built-in pi program, I get the error below. Earlier it was running successfully with hduser. I tried both users, hduser as well as root, but could not fix it.
    Any help would be appreciated.

    root@bigdata:/usr/local/hadoop# hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 5
    Number of Maps = 2
    Samples per Map = 5
    14/03/11 17:10:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using bui
    ltin-java classes where applicable
    java.net.ConnectException: Call From bigdata/127.0.1.1 to localhost:9000 failed on connection exception: java.net
    .ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
    at org.apache.hadoop.ipc.Client.call(Client.java:1351)
    at org.apache.hadoop.ipc.Client.call(Client.java:1300)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at sun.proxy.$Proxy9.getFileInfo(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at sun.proxy.$Proxy9.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtoco
    lTranslatorPB.java:651)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1679)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1106)
    at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1102)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1102)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1397)
    at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:278)
    at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
    Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:597)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:547)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:642)
    at org.apache.hadoop.ipc.Client$Connection.access$2600(Client.java:314)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1399)
    at org.apache.hadoop.ipc.Client.call(Client.java:1318)
    … 33 more

  25. Pingback: hadoopy yarn | My Tech Notes

  26. Pingback: hadoop yarn 2.x single cluster | My Tech Notes

  27. Hi, when I used Hadoop version 1 I put all my libraries in the lib folder, but in Hadoop 2 there is only a native folder. Does anyone know where I should put my libraries so that my apps can use them?

    Thanks for your responses.

  28. Hi Raresh,
    great tutorial. I have one problem though. When I try to start the namenode I get this message:

    starting namenode, logging to /home/hduser/yarn/hadoop-2.4.0/logs/hadoop-root-namenode-iti-313.out
    Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /home/hduser/yarn/hadoop-2.4.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
    It’s highly recommended that you fix the library with ‘execstack -c ‘, or link it with ‘-z noexecstack’.

    Does anyone knows what is it about?

  29. Pingback: freestylerock

  30. Pingback: Setting Up an R-Hadoop System: Step-by-Step Guide | Big D8TA

  31. Hi Rasesimori,
    Thanks so much for your efforts on this step-by-step guide, which helped me a lot.
    I followed many other step-by-step guides which still lacked clarity. Even the Apache Hadoop website does not provide much information for getting started with a simple example.
    But you have done a great job. I set up the environment and executed my first Hadoop application.
    Thanks for your help and Keep posting useful information..

  32. Hi,
    I am getting following error-

    14/06/16 15:20:27 INFO mapreduce.Job: Job job_1401348657033_0013 running in uber mode : false
    14/06/16 15:20:27 INFO mapreduce.Job: map 0% reduce 0%
    14/06/16 15:20:27 INFO mapreduce.Job: Job job_1401348657033_0013 failed with state FAILED due to: Application application_1401348657033_0013 failed 2 times due to AM Container for appattempt_1401348657033_0013_000002 exited with exitCode: 1 due to: Exception from container-launch:
    org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:464)
    at org.apache.hadoop.util.Shell.run(Shell.java:379)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)

    .Failing this attempt.. Failing the application.
    14/06/16 15:20:27 INFO mapreduce.Job: Counters: 0

    Can Some one help ?????

    Many Thanks,
    ~Kedar

  33. Hi,

    I went through your post and it is very good. Could you please post a similar installation tutorial for a Windows environment with vanilla Hadoop?

    Thank You.
    Pavan Kumar.

  34. Pingback: Setting Up an R-Hadoop System and Predicting Future Website Visitors: Step-by-Step Guide |

  35. Hello,
    I am trying to install a Hadoop YARN single-node cluster, but it failed.
    Now I would like to remove all the related directories that were created and retry the installation, but I don't know how.
    Could you help me please?
    Besides, I have been stuck at step 4, adding more properties, and I don't know why either 😦
    Help me please!

    Thanks a lot in advance!

  36. Hi Rasesh Mori,
    Thanks for your tutorial; I installed Hadoop successfully with a single node. Now I have a problem and I hope you can help me!
    I ran a MapReduce program on approximately 6 GB of data and got this result:
    java.io.IOException: No space left on device
    Can you give me some advice or a solution for this problem?

    Thank you so much!

    Thang

  37. Great tutorial but your configuration is wrong.
    It took me some googling to fix it.

    My datanode didnt start because I copied your settings and you set
    dfs.datanode.data.dir property rather than
    dfs.datanode.name.dir property as you had for the namenode
    under the etc/hadoop/hdfs-site.xml file.

  38. Pingback: Hadoop Nodemanager and Resourcemanager not starting - TecHub

  39. Pingback: Building R + Hadoop system | Mathminers

  40. Pingback: R-Hadoop 시스템 구축 가이드 | Encaion

  41. Hey Rasesh, Nice Post!! Btw Do you know how to connect hive 2 with php using jdbc?? I am stuck on that I have done it in java but not able to do it in PHP.

  42. Pingback: Step-by-Step Guide to Setting Up an R-Hadoop System | Rhadoop

  43. Pingback: Introdução ao Hadoop + Instalando Hadoop de forma Distribuída – Here Is Mari

  45. Pingback: hadoop - Hadoop Nodemanager et Resourcemanager pas de départ

  46. Pingback: bigdata - Hadoop Nodemanager e Resourcemanager non di partenza
