Steps to install Hadoop 2.x release (Yarn or Next-Gen) on multi-node cluster

In the previous post, we saw how to set up Hadoop 2.x on a single node. Here, we will see how to set up a multi-node cluster.

The Hadoop 2.x release involves many changes to Hadoop and MapReduce. The centralized JobTracker service is replaced with a ResourceManager that manages the resources in the cluster and a per-application ApplicationMaster that manages the application lifecycle. These architectural changes enable Hadoop to scale to much larger clusters. For more details on architectural changes in Hadoop next-gen (a.k.a. Yarn), watch this video or visit this blog.

This post concentrates on installing Hadoop 2.x a.k.a. Yarn a.k.a. next-gen on a multi-node cluster.

Prerequisites:

  • Java 6 installed
  • Dedicated user for hadoop
  • SSH configured

Steps to install Hadoop 2.x:

1. Download tarball

You can download the tarball for Hadoop 2.x from here. Extract it to a folder, say /home/hduser/yarn, on the master and all the slaves. We assume the dedicated user for Hadoop is “hduser”.

NOTE: The master and all the slaves must have the same user, and the Hadoop directory must be at the same path on every node.

$ cd /home/hduser/yarn
$ sudo chown -R hduser:hadoop hadoop-2.0.1-alpha

2. Edit /etc/hosts

On every node, add the mapping between hostnames and IP addresses for the master and the slaves to the /etc/hosts file. Make sure that all the nodes in the cluster can ping each other.

Important Change:

127.0.0.1 localhost localhost.localdomain my-laptop
127.0.1.1 my-laptop

If you have provided an alias for localhost (as in the entries above), the RPC layer on other hosts will try to connect to my-laptop, which they cannot resolve, so the calls will fail.

Solution:

Assuming the machine (my-laptop) has the IP address “10.3.3.43”, make an entry as follows on all the other machines:

10.3.3.43       my-laptop
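
To spot such aliases before they cause RPC failures, you can scan a hosts file for names bound to a loopback address. The following is only a sketch: the function name loopback_aliases is made up for illustration and is not part of Hadoop, and it assumes the usual whitespace-separated /etc/hosts layout.

```shell
# Hypothetical helper: list hostnames that a hosts file binds to 127.x.x.x,
# ignoring the usual localhost names. Any name printed here needs a proper
# entry (real IP address) on the other machines.
loopback_aliases() {
  # $1 = path to a hosts file (normally /etc/hosts)
  awk '
    $1 ~ /^#/ { next }                      # skip comment lines
    $1 ~ /^127\./ {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break                # stop at a trailing comment
        if ($i != "localhost" && $i != "localhost.localdomain") print $i
      }
    }
  ' "$1" | sort -u
}

# Usage: loopback_aliases /etc/hosts
```

With the two example entries above, this would print my-laptop, which is the name that needs the 10.3.3.43 entry on the other nodes.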

3. Passwordless SSH

Make sure that the master is able to do a password-less ssh to all the slaves.
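
One common way to do this is with ssh-keygen and ssh-copy-id, run as hduser on the master. The sketch below is an assumption about your setup rather than the only way: the function names, the DRY_RUN switch and the KEY variable are made up for illustration, and by default the commands are only printed so you can review them first.

```shell
# Set DRY_RUN=0 to actually run the commands; by default they are only printed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper, not part of Hadoop.
setup_passwordless_ssh() {
  # $1 = remote user, remaining args = slave hostnames
  user="$1"; shift
  key="${KEY:-$HOME/.ssh/id_rsa}"
  # Create a key pair with an empty passphrase only if none exists yet.
  if [ ! -f "$key" ]; then
    run ssh-keygen -t rsa -N "" -f "$key"
  fi
  # Append the public key to authorized_keys on every slave.
  for host in "$@"; do
    run ssh-copy-id -i "$key.pub" "$user@$host"
  done
}

# Usage: setup_passwordless_ssh hduser slave1 slave2
```

Afterwards, ssh hduser@slave1 from the master should log in without asking for a password.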

4. Edit ~/.bashrc

export HADOOP_HOME=/home/hduser/yarn/hadoop-2.0.1-alpha
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop

5. Edit Hadoop environment files

Add JAVA_HOME to the following files.

Add the following line at the start of the script libexec/hadoop-config.sh:

export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386/

Add the following lines at the start of the script etc/hadoop/yarn-env.sh:

export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386/
export HADOOP_HOME=/home/hduser/yarn/hadoop-2.0.1-alpha
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop

Change the path as per your java installation.
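
If you are not sure where Java lives, the JAVA_HOME value can usually be derived from the resolved path of the java binary. This is a small sketch under assumptions: java_home_from_binary is a made-up helper name, and the usage line relies on readlink -f, which is available with GNU coreutils but not on every system.

```shell
# Hypothetical helper: turn the full path of the java executable into a
# JAVA_HOME candidate by stripping the trailing bin/java (and jre/, if any).
java_home_from_binary() {
  # $1 = fully resolved path of the java executable
  p="${1%/bin/java}"   # strip trailing /bin/java
  p="${p%/jre}"        # some JDK layouts nest the binary under jre/
  echo "$p"
}

# Usage (assumes GNU readlink):
#   java_home_from_binary "$(readlink -f "$(command -v java)")"
```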

6. Create Temp folder in HADOOP_HOME

$ mkdir -p $HADOOP_HOME/tmp

7. Add properties in configuration files

Make the changes below on all the machines:

$HADOOP_CONF_DIR/core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hduser/yarn/hadoop-2.0.1-alpha/tmp</value>
  </property>
</configuration>

$HADOOP_CONF_DIR/hdfs-site.xml :

<?xml version="1.0" encoding="UTF-8"?>
 <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
 <configuration>
   <property>
     <name>dfs.replication</name>
     <value>2</value>
   </property>
   <property>
     <name>dfs.permissions</name>
     <value>false</value>
   </property>
 </configuration>

$HADOOP_CONF_DIR/mapred-site.xml :

<?xml version="1.0"?>
<configuration>
 <property>
   <name>mapreduce.framework.name</name>
   <value>yarn</value>
 </property>
</configuration>

$HADOOP_CONF_DIR/yarn-site.xml :

<?xml version="1.0"?>
 <configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8040</value>
  </property>
 </configuration>
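
Since a single mistyped property name can keep a daemon from starting, it can help to read the values back from the files. The helper below is a rough sketch, not a real XML parser: get_prop is a made-up name, and it assumes each <name> and <value> element sits on its own line, as in the listings above.

```shell
# Hypothetical helper: print the value of one property from a Hadoop
# *-site.xml file. Relies on <name> and <value> being on separate lines.
get_prop() {
  # $1 = xml file, $2 = property name
  awk -v want="$2" '
    /<name>/  { n = $0; sub(/.*<name>/, "", n); sub(/<\/name>.*/, "", n) }
    /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                if (n == want) { print v; exit } }
  ' "$1"
}

# Usage:
#   get_prop "$HADOOP_CONF_DIR/core-site.xml" fs.default.name
#   get_prop "$HADOOP_CONF_DIR/yarn-site.xml" yarn.nodemanager.aux-services
```

Running it on every node and comparing the output is a quick way to confirm that all machines point at the same master.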

8. Add slaves

Add the slave entries to $HADOOP_CONF_DIR/slaves on the master machine:

slave1
slave2
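
Because the configuration must be identical on all machines and the Hadoop directory sits at the same path everywhere, the slaves file can also drive a simple copy of the configuration directory. The sketch below only prints the rsync commands so you can inspect them; push_conf_commands is a made-up helper name, and using rsync at all is an assumption (scp would work as well).

```shell
# Hypothetical helper: print one rsync command per host in the slaves file.
push_conf_commands() {
  # $1 = slaves file, $2 = configuration directory (same path on every node)
  while IFS= read -r host || [ -n "$host" ]; do
    case "$host" in
      ''|'#'*) continue ;;    # skip blank lines and comments
    esac
    echo "rsync -az $2/ $host:$2/"
  done < "$1"
}

# Usage (review the output, then pipe it to sh to execute):
#   push_conf_commands "$HADOOP_CONF_DIR/slaves" "$HADOOP_CONF_DIR"
```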

9. Format the namenode

$ bin/hadoop namenode -format

10. Start Hadoop Daemons

$ sbin/hadoop-daemon.sh start namenode
$ sbin/hadoop-daemons.sh start datanode
$ sbin/yarn-daemon.sh start resourcemanager
$ sbin/yarn-daemons.sh start nodemanager
$ sbin/mr-jobhistory-daemon.sh start historyserver

NOTE: For datanode and nodemanager, the scripts are *-daemons.sh and not *-daemon.sh. The *-daemon.sh scripts do not read the slaves file and hence start the process only on the local machine (the master).

11. Check installation

Check the jps output on the slaves and the master.

For master (DataNode and NodeManager show up here only if the master is also listed in the slaves file):

$ jps
6539 ResourceManager
6451 DataNode
8701 Jps
6895 JobHistoryServer
6234 NameNode
6765 NodeManager

For slaves:

$ jps
8014 NodeManager
7858 DataNode
9868 Jps

If these services are not up, check the logs in $HADOOP_HOME/logs directory to identify the issue.
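
The jps check can be scripted so that it reports only what is missing. This is a minimal sketch: missing_daemons is a made-up helper name, and it assumes jps prints one "<pid> <name>" pair per line as in the listings above.

```shell
# Hypothetical helper: print every expected daemon that does not appear
# in the jps output supplied on stdin.
missing_daemons() {
  # args = expected daemon names; stdin = jps output
  jps_out=$(cat)
  for d in "$@"; do
    echo "$jps_out" \
      | awk -v name="$d" '$2 == name { found = 1 } END { exit !found }' \
      || echo "$d"
  done
}

# Usage on a slave:
#   jps | missing_daemons DataNode NodeManager
# Usage on the master:
#   jps | missing_daemons NameNode ResourceManager JobHistoryServer
```

No output means every expected daemon is running on that machine.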

12. Run a demo application to verify installation

$ mkdir in
$ cat > in/file
This is one line
This is another one

Add this directory to HDFS:

$ bin/hadoop dfs -copyFromLocal in /in

Run wordcount example provided:

$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.*-alpha.jar wordcount /in /out

Check the output:

$ bin/hadoop dfs -cat /out/*
This 2
another 1
is 2
line 1
one 2
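
To double-check the result, the same counts can be computed locally with standard tools and compared with the job output. This is only a sketch for small inputs: local_wordcount is a made-up name, and the tab between word and count assumes the job's default text output format.

```shell
# Hypothetical helper: count whitespace-separated words from stdin and
# print "word<TAB>count", sorted by byte order like the job output above.
local_wordcount() {
  tr -s '[:space:]' '\n' \
    | grep -v '^$' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $2 "\t" $1 }'
}

# Usage:
#   cat in/file | local_wordcount
```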

13. Web interface

1. http://master:50070/dfshealth.jsp
2. http://master:8088/cluster
3. http://master:19888/jobhistory (for Job History Server)

14. Stopping the daemons

$ sbin/mr-jobhistory-daemon.sh stop historyserver
$ sbin/yarn-daemons.sh stop nodemanager
$ sbin/yarn-daemon.sh stop resourcemanager
$ sbin/hadoop-daemons.sh stop datanode
$ sbin/hadoop-daemon.sh stop namenode

15. Possible errors

If you get an exception stack trace similar to the one below:

Container launch failed for container_1350204169962_0002_01_000004 : java.lang.reflect.UndeclaredThrowableException
 at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
 at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:101)
 at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:149)
 at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:373)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:679)
Caused by: com.google.protobuf.ServiceException: java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "my-laptop":40365; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:187)
 at $Proxy29.startContainer(Unknown Source)
 at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:99)
 ... 5 more
Caused by: java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "my-laptop":40365; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
 at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:740)
 at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:248)
 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1261)
 at org.apache.hadoop.ipc.Client.call(Client.java:1141)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:184)
 ... 7 more
Caused by: java.net.UnknownHostException
 ... 11 more

Solution: Check the Important Change in Step 2 and apply the necessary changes.

 

Happy Coding!!!

– Rasesh Mori

93 thoughts on “Steps to install Hadoop 2.x release (Yarn or Next-Gen) on multi-node cluster”

  1. Hi, Rashesh
    Thanks for nice article

    You have mentioned that :
    “NOTE: Master and all the slaves must have the same user and hadoop directory on same path.”

    I have two slaves with the same user name, but I can’t use the same path for their Hadoop installation directory, as we have different directory structures designed for different servers (which I am trying to use as slaves) in our organization.

    That’s why it is throwing this error for the slaves:
    server4: bash: line 0: cd: /home/hduser/hadoop/libexec/..: No such file or directory
    (The hadoop is not installed in /home/hduser/hadoop , Actually it is installed in /test/user1/hadoop. I don’t have permission to create /home/hduser directory)

    Kindly assist me in this context as i am just one step behind for running hadoop on multinode clusters.

    Thanks

  2. Wow, one of the best writeups on Hadoop 2 installation I’ve seen. And it (still) works with 2.0.5 – tried it today on Mint Olivia. Thanks a lot!

    Regards,
    Stefan

    • Hi Stefan,

      Thanks!! Actually, I started working on 2.0.2 when it was in alpha stage and it was very difficult to install it. So, I thought why not make it easy for others 🙂

  3. If anyone is going to try this with the node manager and resource manager on the same instance, remember to configure yarn.nodemanager.localizer.address. Otherwise both the node manager and the resource manager will try to start the resource localizer (found out by looking at the YARN code) on the same ports.

  4. I followed the tutorial to install Hadoop 2.2.0 on a multi-node cluster.
    When I launch the $ sbin/yarn-daemons.sh start nodemanager command, the nodemanager is not launched, and in the log I had this error:

    FATAL org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Failed to initialize mapreduce.shuffle
    java.lang.IllegalArgumentException: The ServiceName: mapreduce.shuffle set in yarn.nodemanager.aux-services is invalid.The valid service name should only contain a-zA-Z0-9_ and can not start with numbers

    Can you help me ?

    • Hi,

      For the error message

      FATAL org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Failed to initialize mapreduce.shuffle
      java.lang.IllegalArgumentException: The ServiceName: mapreduce.shuffle set in yarn.nodemanager.aux-services is invalid.The valid service name should only contain a-zA-Z0-9_ and can not start with numbers

      You need to change the property

      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce.shuffle</value>
      </property>

      to use “_” between mapreduce and shuffle:

      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>

      I think this is related to the newer version. It also took me some time to fix.

      /Salman.

  6. Thanks for the great guide! After I set up Hadoop 2.2.0 on Ubuntu 13.10 with 1 master and 3 slaves, I don’t see NodeManager started on the master with the jps command (the others are fine). It’s only up on the slaves. Why?

  7. Thanks for the nice tutorial.
    I have set up hadoop2.2.0 on 3 clusters. Everything is going fine. NodeManager and Datanode are started in each clusters. But, when I run wordcount example, 100% mapping takes place and it gives following exception:

    map 100% reduce 0%
    13/11/28 09:57:15 INFO mapreduce.Job: Task Id : attempt_1385611768688_0001_r_000000_0, Status : FAILED
    Container launch failed for container_1385611768688_0001_01_000003 : org.apache.hadoop.yarn.exceptions.
    YarnException: Unauthorized request to start container.
    This token is expired. current time is 1385612996018 found 1385612533275
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

    I have gone through internet to find out solution. But I couldn’t find out. Help me out.

  9. Nice tutorial! I have luck on the first one but i have the following question for cluster setup:

    After starting Hadoop, NodeManager and DataNode are missing from master node (slaves nodes are fine tho):

    $jps
    14048 Jps
    13716 ResourceManager
    13602 NameNode
    13818 JobHistoryServer

    Would anyone be kind enough to point out what might have gone wrong?

  10. Hi very nice article thanks for your effort.
    I have one question. I have planned to do 3 node installation using VMware. Can I do all the changes in single node and copy the whole setup three times and do the changes? is it possible? Please help and let me know what changes i should do if that is possible.. thanks

  11. Thanks a lot for the write up. I setup the single node on a centos VM and it was running fine and I am trying to following your instruction to setup a 2 node cluster (2 centos VM’s) and am having problems. The password less ssh between the 2 VM and the setup of the namenode and datanodes worked fine and the datanode came up on both VM’s. However, when I tried to bringup the resource manager I found the following error message in the log:
    =================

    2014-01-06 18:45:46,388 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: Failed on local exception: java.net.SocketException: Unresolved address; Host Details : local host is: “master”; destination host is: (unknown):0;
    org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: Failed on local exception: java.net.SocketException: Unresolved address; Host Details : local host is: “master”; destination host is: (unknown):0;

    My VM’s hostname is NOT master, but somehow it used the name master and can’t resolve the slave’s hostname even though both are included in the slaves file and the /etc/hosts file.

    When I try to bring up the node manager I have the opposite problem:

    2014-01-06 17:37:18,166 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at master:8025
    2014-01-06 17:37:18,179 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Unexpected error starting NodeStatusUpdater
    java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: “master”:8025; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

    Any help is appreciated

    • Hi Tim,

      It seems the entry that maps the host name “master” to an IP is missing from the /etc/hosts file. If it is already present, you will have to search for this kind of error for the OS you are running.

      • Thanks Rasesh,
        So the master host has to have the hostname master? What about the slaves: can they have any names, or must they be named slave1, slave2, etc.?

      • So I added “master” as an alias to the /etc/hosts file to both master and slave VM’s and the resourceManager now comes up properly. However, when I try to bring up node manager it failed with a bind exception claiming that port 8040 is already in use so i ran the netstat and ps commands and found that Resource Manager is using that port already. I assume that this isn’t expected??
        Looking at the yarn-site.xml file resource manager is supposed to use this port so I changed the yarn-site.xml file to use port 8045 and the node manager comes up properly now. Why is node manager trying to use the same port?
        I didn’t change the yarn-site.xml file on the slave host but that worked without changes…
        Thanks for your help and explanations.

      • Thanks for the reminder. I did look through this list and since the network setup allows password less ssh, namenode and datanode bring up to work properly I assume that the hostname resolutions work OK on the network level. I didn’t know that you have to use “master” as one of the network alias for the master machine. It’s fixed now as per my reply to your last comment.

  13. Hi,
    I’m following this tutorial but in my master the nodemanager, after the start, dies in a few seconds.
    Where is the problem?
    The tutorial is great.
    Thanks

  14. hi,
    I am following this tutorial, but on my master node the datanode dies after a few seconds,
    so checked the logs and found this error
    fatal error exception in secure main

  15. Hi guys, I’m getting this error when I try to run hdfs dfs -copyFromLocal in /in

    14/01/24 12:57:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
    14/01/24 12:57:27 WARN hdfs.DFSClient: DataStreamer Exception
    org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /in/in/file._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)

    at org.apache.hadoop.ipc.Client.call(Client.java:1347)
    at org.apache.hadoop.ipc.Client.call(Client.java:1300)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
    copyFromLocal: File /in/in/file._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
    14/01/24 12:57:27 ERROR hdfs.DFSClient: Failed to close file /in/in/file._COPYING_
    org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /in/in/file._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)

    at org.apache.hadoop.ipc.Client.call(Client.java:1347)
    at org.apache.hadoop.ipc.Client.call(Client.java:1300)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)

    Here is my master jps list:
    30134 NameNode
    30724 ResourceManager
    30906 NodeManager
    31168 Jps
    30545 SecondaryNameNode

    and my slave list:
    18064 Jps
    17853 NodeManager
    17661 DataNode

    What am I missing?

    Best regards.

  16. Superb doc, but my datanodes are not starting on the slaves. This is the error in the log. Please help me out.

    Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured

  17. Thanks for such a nice tutorial. I am able to set up my cluster with Hadoop 2.2.0 with just one modification, which is as follows:

    $HADOOP_CONF_DIR/yarn-site.xml (change to “mapreduce_shuffle” instead of mapreduce.shuffle)

    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>

  19. Hi,
    I have some problems, when I start Hadoop services in Step 10 no error found but when I check by jps:
    In Master just has:
    12167 Jps
    11437 ResourceManager
    and in Slave just has:
    Jps
    Notice that I used JAVA jdk from oracle not the open jdk
    I think your tutorial is good
    Thank you so much

    • I checked logs and get these:
      2014-03-31 16:51:05,839 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
      2014-03-31 16:51:06,502 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid dfs.datanode.data.dir /home/hadoop/yarn/hadoop-2.2.0/tmp/dfs/data :
      java.io.FileNotFoundException: File file:/home/hadoop/yarn/hadoop-2.2.0/tmp/dfs/data does not exist
      at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:520)
      at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:398)
      at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:129)
      at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:146)
      at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:1698)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.getDataDirsFromURIs(DataNode.java:1745)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1722)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1642)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1665)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1837)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1858)
      2014-03-31 16:51:06,504 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
      java.io.IOException: All directories in dfs.datanode.data.dir are invalid: “/home/hadoop/yarn/hadoop-2.2.0/tmp/dfs/data”
      at org.apache.hadoop.hdfs.server.datanode.DataNode.getDataDirsFromURIs(DataNode.java:1754)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1722)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1642)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1665)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1837)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1858)
      2014-03-31 16:51:06,506 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
      2014-03-31 16:51:06,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:

      Thanks for your helps

      • Hi,
        I solved it already,
        Sorry because I didn’t read all comments for solution, just change from “.” to “_” and some problem within the permission on user hadoop
        I need to switch to hadoop user to configure the system 🙂
        Excellent tutor,
        Thanks you very much 🙂

      • Hi Ngoc Phan,

        I am facing a similar error. Can I know exactly what change fixes the issue?

        Thanks in Advance.
        Venkat

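  21. The property

    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce.shuffle</value>
    </property>

    has changed to

    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>

    Without the underscore, the node manager will not start!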

  25. Hi, Rashesh,
    Your article is wonderful. I tried to install a 3 node cluster according to your tutorial. I actually succeeded once with only the web interface part not done.
    When I tried to do this for the second time, I had a problem while formatting the name node. I used the following command: “bin/hdfs namenode -format”, and I got the following error message: “Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode”

    Do you have any idea why this is happening? I have set the .bashrc, the hadoop-env.sh, and the properties files just as I had done for the first time. I simply could not move on without this step. Any help would be greatly appreciated!

    Max

  27. Great blog! Do you have any suggestions for aspiring writers?
    I’m planning to start my own website soon but I’m a little lost on everything.

    Would you suggest starting with a free platform
    like WordPress or go for a paid option? There are so many choices out there that I’m completely confused ..
    Any ideas? Bless you!

    • I am not a blogger as such. For a project, I was trying to install hadoop and didn’t find any resources online. So, compiled this for my reference later on but it seems to be helping others as well 🙂

  32. Hello,

    After successfully installing a single node on the master and the slave, we tried multi-node. While doing the steps one by one, we formatted the namenode again as per the given steps. Do we need to format the namenode again? Please help.

    And again, when we do sbin/hadoop-daemon.sh start namenode, it shows
    starting namenode, logging to /home/hduser/yarn/hadoop/logs/hadoop-hduser-namenode-localhost.localdomain.out
    but when we do jps, it doesn’t show any namenode at all.

    when we do sbin/hadoop-daemons.sh start datanode it shows

    slave: mv: cannot stat ‘/home/hduser/yarn/hadoop/logs/hadoop-hduser-datanode-localhost.localdomain.out.1’: No such file or directory
    localhost: starting datanode, logging to /home/hduser/yarn/hadoop/logs/hadoop-hduser-datanode-localhost.localdomain.out
    slave: starting datanode, logging to /home/hduser/yarn/hadoop/logs/hadoop-hduser-datanode-localhost.localdomain.out
    slave: ulimit -a for user hduser
    slave: core file size (blocks, -c) 0
    slave: data seg size (kbytes, -d) unlimited
    slave: scheduling priority (-e) 0
    slave: file size (blocks, -f) unlimited
    slave: pending signals (-i) 63441
    slave: max locked memory (kbytes, -l) 64
    slave: max memory size (kbytes, -m) unlimited
    slave: open files (-n) 1024
    slave: pipe size (512 bytes, -p) 8
    [hduser@localhost hadoop]$ jps
    7290 Jps
    7091 DataNode

    Please help with this problem.

  33. May I simply say what a relief to find somebody who truly understands what they’re discussing online.
    You definitely understand how to bring a problem to light
    and make it important. More and more people should read this and understand this side of
    the story. I was surprised that you are not more popular given that you definitely possess the gift.

  34. Hello,
    I completed the Hadoop multi-node cluster setup using your steps. Thanks!
    I used Hadoop version 2.5.0.
    I want to use Eclipse to write MapReduce programs, but I am unable to connect
    Eclipse to Hadoop. What version of Eclipse should I use with Hadoop 2.5.0? We also need the hadoop-eclipse plugin, but a hadoop-eclipse plugin for Hadoop 2.5.0 is not available yet. What should I do? I am really stuck with this problem. Please help.

    • Hello,
      Is there something I should be careful about?
      I just followed the steps, but I still can’t run Hadoop.
      Even “echo $JAVA_HOME” doesn’t work,
      and I get “hadoop command not found” as well. Please, I really need help.

      Thanks a lot.

      Debian user

  35. Thank you for your post; it was very useful to me. I have one problem: I have a master and a slave in my Hadoop cluster. When I run the wordcount program on the master or the slave, the job runs on only one node. If I run it on the master, it is executed on the master only; the same happens on the slave node. My JobHistory also shows no sign of a running Application Manager or job, and no containers either.

  36. Hi, I am new to Hadoop and am getting the error below. Does anyone have any idea?
    hadoop jar /media/Naval_tools/hadoop-2.5.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.2.jar wordcount input ouput
    15/04/09 00:22:03 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
    15/04/09 00:22:04 WARN hdfs.DFSClient: DataStreamer Exception
    org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hadoop-yarn/staging/naval/.staging/job_1428518032753_0005/job.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1471)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2791)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:606)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:455)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

    at org.apache.hadoop.ipc.Client.call(Client.java:1411)
    at org.apache.hadoop.ipc.Client.call(Client.java:1364)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
    15/04/09 00:22:04 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/naval/.staging/job_1428518032753_0005
    org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hadoop-yarn/staging/naval/.staging/job_1428518032753_0005/job.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1471)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2791)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:606)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:455)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

    at org.apache.hadoop.ipc.Client.call(Client.java:1411)
    at org.apache.hadoop.ipc.Client.call(Client.java:1364)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)

  37. Nice tutorial. However, I would like to configure the Secondary NameNode settings on the master node but am unable to locate the file and the values that need to be configured. Please help me proceed with this.

  38. I have a question: there is an error in my YARN setup when I start it.

    start-yarn.sh
    starting yarn daemons
    starting resourcemanager, logging to /logs/yarn-hduser-resourcemanager-ubuntu.out
    nice: /bin/yarn: No such file or directory
    localhost: starting nodemanager, logging to /logs/yarn-hduser-nodemanager-ubuntu.out
    localhost: nice: /bin/yarn: No such file or directory

  39. Pingback: Building R + Hadoop system | Mathminers

  40. Hi Rasesh,
    Thanks for the excellent instructions. This was the only resource that I was able to find that led through all the steps with sufficient details that I could follow them. Unfortunately, I get all the way to running the example with

    $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /in /out

    only to get the error

    15/06/16 00:59:53 INFO mapreduce.Job: Task Id : attempt_1434414924941_0004_m_000000_0, Status : FAILED
    Rename cannot overwrite non empty destination directory /home/hduser/hadoop-2.6.0/nm-local-dir/usercache/hduser/appcache/application_1434414924941_0004/filecache/10
    java.io.IOException: Rename cannot overwrite non empty destination directory /home/hduser/hadoop-2.6.0/nm-local-dir/usercache/hduser/appcache/application_1434414924941_0004/filecache/10
    at org.apache.hadoop.fs.AbstractFileSystem.renameInternal(AbstractFileSystem.java:716)
    at org.apache.hadoop.fs.FilterFs.renameInternal(FilterFs.java:228)
    at org.apache.hadoop.fs.AbstractFileSystem.rename(AbstractFileSystem.java:659)
    at org.apache.hadoop.fs.FileContext.rename(FileContext.java:909)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:364)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

    Any suggestions on what is going on and how to fix it? Thanks!

  41. Hi,
    I am running my MR job on YARN. When I add the line conf.set("mapreduce.framework.name", "yarn"); to my job configuration, my job gets listed in the history server, but when I remove it, my job is not visible in the history server.
    I have already configured mapred-site.xml and yarn-env.sh as per your instructions, but I don’t know why it is not picking up that entry from the configuration.

  42. Pingback: Cài đặt Hadoop multi node cluster | My Blog - My Favourite

  43. As the admin of this site is hard at work, no doubt it will
    become famous very quickly, thanks to its quality content.

  44. Hi,
    Thanks for a nice tutorial.

    I ran into a problem while looking at the job history. Yesterday I was able to view the history of a particular job, but now I can’t. It says “could not load history” with an “Exception”, even though it reports that the WordCount job succeeded. I then restarted the history server, and now it shows {"RemoteException":{"exception":"NotFoundException","message":"java.lang.Exception: job, job_1449022650667_0005, is not found","javaClassName":"org.apache.hadoop.yarn.webapp.NotFoundException"}}
    Yesterday it was showing all the jobs; now it shows history for some jobs but not for all.
    I have run many trials for my course project, and now I need the time taken by each job.
    Is there any way to find the running time of jobs if I have the job IDs?

    Thanks
    Gandhi

  45. Pingback: R-Hadoop 시스템 구축 가이드 | Encaion

  46. Pingback: hdfs - hadoop complains about attempting to overwrite nonempty destination directory - CSS PHP

  47. Hi, I have a problem. I have 1 master and 3 slaves, and I don’t get an error or anything like that, but the web page doesn’t show the slaves; it just shows 0 live nodes, 0 dead nodes and 0 decommissioning nodes.

    thanks….

  48. Pingback: Introdução ao Hadoop + Instalando Hadoop de forma Distribuída – Here Is Mari
