Steps to install Hadoop 2.x release (Yarn or Next-Gen) on multi-node cluster

In the previous post, we saw how to set up Hadoop 2.x on a single node. Here, we will see how to set up a multi-node cluster.

The Hadoop 2.x release involves many changes to Hadoop and MapReduce. The centralized JobTracker service is replaced with a ResourceManager, which manages the resources in the cluster, and a per-application ApplicationMaster, which manages the application lifecycle. These architectural changes enable Hadoop to scale to much larger clusters. For more details on the architectural changes in Hadoop next-gen (a.k.a. YARN), watch this video or visit this blog.

This post concentrates on installing Hadoop 2.x (a.k.a. YARN, a.k.a. next-gen) on a multi-node cluster.

Prerequisites:

  • Java 6 installed on all nodes
  • A dedicated user (“hduser”) for Hadoop on all nodes
  • SSH configured and running on all nodes

Steps to install Hadoop 2.x:

1. Download tarball

You can download the tarball for Hadoop 2.x from here. Extract it to a folder, say /home/hduser/yarn, on the master and all the slaves. We assume the dedicated user for Hadoop is “hduser”.

NOTE: The master and all the slaves must have the same user, and the Hadoop directory must be at the same path on every node.
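
For example, assuming the downloaded tarball is named hadoop-2.0.1-alpha.tar.gz and sits in hduser’s home directory (the file name and the slave hostnames below are assumptions; adjust them to your setup), a rough sketch of getting it into place is:

$ mkdir -p /home/hduser/yarn
$ tar -xzf ~/hadoop-2.0.1-alpha.tar.gz -C /home/hduser/yarn
# copy the same tree to every slave so the path is identical everywhere
$ scp -r /home/hduser/yarn hduser@slave1:/home/hduser/
$ scp -r /home/hduser/yarn hduser@slave2:/home/hduser/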

$ cd /home/hduser/yarn
$ sudo chown -R hduser:hadoop hadoop-2.0.1-alpha

2. Edit /etc/hosts

Add the association between hostnames and IP addresses for the master and the slaves in the /etc/hosts file on all the nodes. Make sure that all the nodes in the cluster are able to ping each other.

Important Change:

127.0.0.1 localhost localhost.localdomain my-laptop
127.0.1.1 my-laptop

If you have provided an alias for localhost (as done in the entries above), protocol buffer RPC calls from other hosts will try to connect to my-laptop and fail with an UnknownHostException.

Solution:

Assuming the machine (my-laptop) has the IP address “10.3.3.43”, make an entry as follows on all the other machines:

10.3.3.43       my-laptop
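
For illustration, a consistent /etc/hosts for a three-node cluster (master, slave1, slave2) could look like the one below on every node; the IP addresses are made up, so substitute your own, and avoid aliasing the machine’s real hostname to a 127.x.x.x address:

127.0.0.1    localhost localhost.localdomain
10.3.3.10    master
10.3.3.11    slave1
10.3.3.12    slave2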

3. Password-less SSH

Make sure that the master is able to do a password-less ssh to all the slaves.
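
A minimal sketch of setting this up from the master, assuming no key pair exists yet and the slaves are named slave1 and slave2 (if ssh-copy-id is not available, append ~/.ssh/id_rsa.pub to ~/.ssh/authorized_keys on each slave manually):

$ ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
$ ssh-copy-id hduser@slave1
$ ssh-copy-id hduser@slave2
# verify that no password prompt appears
$ ssh slave1 exit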

4. Edit ~/.bashrc

export HADOOP_HOME=/home/hduser/yarn/hadoop-2.0.1-alpha
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
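
Reload the file and confirm the variables are visible in the shell, for example:

$ source ~/.bashrc
$ echo $HADOOP_HOME
/home/hduser/yarn/hadoop-2.0.1-alpha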

5. Edit Hadoop environment files

Add JAVA_HOME to the following files.

Add the following line at the start of libexec/hadoop-config.sh :

export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386/

Add the following lines at the start of etc/hadoop/yarn-env.sh :

export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386/
export HADOOP_HOME=/home/hduser/yarn/hadoop-2.0.1-alpha
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop

Change the path as per your Java installation.
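
If you are unsure where Java lives, one way to locate it on Debian/Ubuntu-style systems (assuming java is on the PATH) is to resolve the java symlink; JAVA_HOME is then the directory above jre/bin/java:

$ readlink -f $(which java)
/usr/lib/jvm/java-6-openjdk-i386/jre/bin/java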

6. Create Temp folder in HADOOP_HOME

$ mkdir -p $HADOOP_HOME/tmp

7. Add properties in configuration files

Make the changes mentioned below on all the machines:

$HADOOP_CONF_DIR/core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hduser/yarn/hadoop-2.0.1-alpha/tmp</value>
  </property>
</configuration>
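
fs.default.name still works in 2.x but is deprecated; if you prefer, the newer equivalent property name is fs.defaultFS with the same value:

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master:9000</value>
</property>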

$HADOOP_CONF_DIR/hdfs-site.xml :

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
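
By default the NameNode and DataNode keep their data under hadoop.tmp.dir. Optionally, you can point them at explicit directories instead; the property names below are the standard 2.x ones, but the paths are just examples, and the directories must exist and be writable by hduser on every node:

<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///home/hduser/yarn/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///home/hduser/yarn/hdfs/datanode</value>
</property>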

$HADOOP_CONF_DIR/mapred-site.xml :

<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
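
NOTE: Depending on the release, the distribution may ship only a template for this file. If $HADOOP_CONF_DIR/mapred-site.xml does not exist, create it from the template before adding the property above:

$ cp $HADOOP_CONF_DIR/mapred-site.xml.template $HADOOP_CONF_DIR/mapred-site.xml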

$HADOOP_CONF_DIR/yarn-site.xml :

<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8025</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8040</value>
  </property>
</configuration>

8. Add slaves

Add the slave entries in $HADOOP_CONF_DIR/slaves on the master machine. If you want the master to also run a DataNode and NodeManager (as in the jps output shown in step 11), add the master’s hostname to this file as well:

slave1
slave2

9. Format the namenode

$ bin/hadoop namenode -format
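
Note that formatting erases any existing HDFS metadata, so do it only once while setting up the cluster. In 2.x the hadoop command is deprecated for HDFS operations; the equivalent, preferred form is:

$ bin/hdfs namenode -format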

10. Start Hadoop Daemons

$ sbin/hadoop-daemon.sh start namenode
$ sbin/hadoop-daemons.sh start datanode
$ sbin/yarn-daemon.sh start resourcemanager
$ sbin/yarn-daemons.sh start nodemanager
$ sbin/mr-jobhistory-daemon.sh start historyserver

NOTE: For the datanode and nodemanager, the scripts are *-daemons.sh and not *-daemon.sh. The *-daemon.sh scripts do not look up the slaves file and hence will only start processes on the machine where they are run.
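
Alternatively, 2.x also ships combined scripts that start all the HDFS and YARN daemons across the cluster (using the slaves file) in one go:

$ sbin/start-dfs.sh
$ sbin/start-yarn.sh
$ sbin/mr-jobhistory-daemon.sh start historyserver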

11. Check installation

Check for jps output on slaves and master.

For the master (the DataNode and NodeManager appear here only if the master is also listed in the slaves file):

$ jps
6539 ResourceManager
6451 DataNode
8701 Jps
6895 JobHistoryServer
6234 NameNode
6765 NodeManager

For slaves:

$ jps
8014 NodeManager
7858 DataNode
9868 Jps

If these services are not up, check the logs in $HADOOP_HOME/logs directory to identify the issue.
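
The log files are named after the daemon, the user, and the hostname (for example, yarn-hduser-nodemanager-<hostname>.log; the exact names depend on your machines), so to inspect the tail of a daemon's log on the node where it failed:

$ ls $HADOOP_HOME/logs
$ tail -n 50 $HADOOP_HOME/logs/yarn-hduser-nodemanager-*.log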

12. Run a demo application to verify the installation

$ mkdir in
$ cat > in/file
This is one line
This is another one

Add this directory to HDFS:

$ bin/hadoop dfs -copyFromLocal in /in
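
On 2.x the hadoop dfs form prints a deprecation warning; the equivalent hdfs dfs command does the same thing:

$ bin/hdfs dfs -copyFromLocal in /in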

Run wordcount example provided:

$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.*-alpha.jar wordcount /in /out

Check the output:

$ bin/hadoop dfs -cat /out/*
This 2
another 1
is 2
line 1
one 2

13. Web interface

1. http://master:50070/dfshealth.jsp
2. http://master:8088/cluster
3. http://master:19888/jobhistory (for Job History Server)

14. Stopping the daemons

$ sbin/mr-jobhistory-daemon.sh stop historyserver
$ sbin/yarn-daemons.sh stop nodemanager
$ sbin/yarn-daemon.sh stop resourcemanager
$ sbin/hadoop-daemons.sh stop datanode
$ sbin/hadoop-daemon.sh stop namenode

15. Possible errors

If you get an exception stack trace similar to the one given below:

Container launch failed for container_1350204169962_0002_01_000004 : java.lang.reflect.UndeclaredThrowableException
 at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:135)
 at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:101)
 at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:149)
 at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:373)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:679)
Caused by: com.google.protobuf.ServiceException: java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "my-laptop":40365; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:187)
 at $Proxy29.startContainer(Unknown Source)
 at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagerPBClientImpl.startContainer(ContainerManagerPBClientImpl.java:99)
 ... 5 more
Caused by: java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: "my-laptop":40365; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
 at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:740)
 at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:248)
 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1261)
 at org.apache.hadoop.ipc.Client.call(Client.java:1141)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:184)
 ... 7 more
Caused by: java.net.UnknownHostException
 ... 11 more

Solution: Check the Important Change in Step 2 and apply the necessary changes.

 

Happy Coding!!!

– Rasesh Mori

95 thoughts on “Steps to install Hadoop 2.x release (Yarn or Next-Gen) on multi-node cluster”

  1. Hi, Rashesh
    Thanks for nice article

    You have mentioned that :
    “NOTE: Master and all the slaves must have the same user and hadoop directory on same path.”

    I have two slaves with same user name but i can’t make same path for their hadoop installation directory. As we have different directory structures designed for difference servers (That i am trying to use as slaves) in our organization.

    Thats why it is throwing me error for slaves as :
    server4: bash: line 0: cd: /home/hduser/hadoop/libexec/..: No such file or directory
    (The hadoop is not installed in /home/hduser/hadoop , Actually it is installed in /test/user1/hadoop. I don’t have permission to create /home/hduser directory)

    Kindly assist me in this context as i am just one step behind for running hadoop on multinode clusters.

    Thanks

  2. Wow, one of the best writeups on Hadoop 2 installation I’ve seen. And it (still) works with 2.0.5 – tried it today on Mint Olivia. Thanks a lot!

    Regards,
    Stefan

    • Hi Stefan,

      Thanks!! Actually, I started working on 2.0.2 when it was in alpha stage and it was very difficult to install it. So, I thought why not make it easy for others 🙂

  3. If anyone going to try this with node manager + resource manager in a same instance remember to configure yarn.nodemanager.localizer.address. Otherwise both node manager and resource manager will try to start resource localizer (found out by looking at YARN code) on same ports.

  4. I followed the tutorial of installing the hadoop 2.2.0 version on multi-node cluster .
    when i launch $ sbin/yarn-daemons.sh start nodemanager command , the nodemanager is not launched and in the log i had this erro:

    FATAL org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Failed to initialize mapreduce.shuffle
    java.lang.IllegalArgumentException: The ServiceName: mapreduce.shuffle set in yarn.nodemanager.aux-services is invalid.The valid service name should only contain a-zA-Z0-9_ and can not start with numbers

    Can you help me ?

    • Hi,

      For the error message

      FATAL org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices: Failed to initialize mapreduce.shuffle
      java.lang.IllegalArgumentException: The ServiceName: mapreduce.shuffle set in yarn.nodemanager.aux-services is invalid.The valid service name should only contain a-zA-Z0-9_ and can not start with numbers

      You need to change the property

      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce.shuffle</value>
      </property>

      to use “_” between mapreduce and shuffle:

      <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
      </property>

      I think something related to newer version. Also took me some time to fix this.

      /Salman.

  5. Pingback: What hadoop can and can’t do. | 做人要豁達大道

  6. Thanks for the greate guide! After I set up hadoop 2.2.0 on ubuntu 13.10 with 1 master and 3 slaves, I don’t see NodeManager started on the master with jps command(Others are fine). It’s only up on slaves. Why?

  7. Thanks for the nice tutorial.
    I have set up hadoop2.2.0 on 3 clusters. Everything is going fine. NodeManager and Datanode are started in each clusters. But, when I run wordcount example, 100% mapping takes place and it gives following exception:

    map 100% reduce 0%
    13/11/28 09:57:15 INFO mapreduce.Job: Task Id : attempt_1385611768688_0001_r_000000_0, Status : FAILED
    Container launch failed for container_1385611768688_0001_01_000003 : org.apache.hadoop.yarn.exceptions.
    YarnException: Unauthorized request to start container.
    This token is expired. current time is 1385612996018 found 1385612533275
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

    I have gone through internet to find out solution. But I couldn’t find out. Help me out.

  9. Nice tutorial! I have luck on the first one but i have the following question for cluster setup:

    After starting Hadoop, NodeManager and DataNode are missing from master node (slaves nodes are fine tho):

    $jps
    14048 Jps
    13716 ResourceManager
    13602 NameNode
    13818 JobHistoryServer

    Would anyone be kind enough to point out what might went wrong?

  10. Hi very nice article thanks for your effort.
    I have one question. I have planned to do 3 node installation using VMware. Can I do all the changes in single node and copy the whole setup three times and do the changes? is it possible? Please help and let me know what changes i should do if that is possible.. thanks

  11. Thanks a lot for the write up. I setup the single node on a centos VM and it was running fine and I am trying to following your instruction to setup a 2 node cluster (2 centos VM’s) and am having problems. The password less ssh between the 2 VM and the setup of the namenode and datanodes worked fine and the datanode came up on both VM’s. However, when I tried to bringup the resource manager I found the following error message in the log:
    =================

    2014-01-06 18:45:46,388 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: Failed on local exception: java.net.SocketException: Unresolved address; Host Details : local host is: “master”; destination host is: (unknown):0;
    org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.io.IOException: Failed on local exception: java.net.SocketException: Unresolved address; Host Details : local host is: “master”; destination host is: (unknown):0;

    My VM’s hostname is NOT master but omehow it used the name master and can’t resolve the slave’s hostname even though both are included in the slaves file and /etc/hosts file.

    When I try to bring up the node manager I have the opposite problem:

    2014-01-06 17:37:18,166 INFO org.apache.hadoop.yarn.client.RMProxy: Connecting to ResourceManager at master:8025
    2014-01-06 17:37:18,179 ERROR org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Unexpected error starting NodeStatusUpdater
    java.net.UnknownHostException: Invalid host name: local host is: (unknown); destination host is: “master”:8025; java.net.UnknownHostException; For more details see: http://wiki.apache.org/hadoop/UnknownHost
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

    Any help is appreciated

    • Hi Tim,

      It seems entry for master is missing in /etc/hosts file that maps host name “master” to an ip. If that is already present, you will have to search for these kind of errors for the OS you are running.

      • Thanks Rasesh,
        So the master host has to have the hostname master? What about the slaves can they have any names or do must they be named slave1, slave2, etc.?

      • So I added “master” as an alias to the /etc/hosts file to both master and slave VM’s and the resourceManager now comes up properly. However, when I try to bring up node manager it failed with a bind exception claiming that port 8040 is already in use so i ran the netstat and ps commands and found that Resource Manager is using that port already. I assume that this isn’t expected??
        Looking at the yarn-site.xml file resource manager is supposed to use this port so I changed the yarn-site.xml file to use port 8045 and the node manager comes up properly now. Why is node manager trying to use the same port?
        I didn’t change the yarn-site.xml file on the slave host but that worked without changes…
        Thanks for your help and explanations.

      • Thanks for the reminder. I did look through this list and since the network setup allows password less ssh, namenode and datanode bring up to work properly I assume that the hostname resolutions work OK on the network level. I didn’t know that you have to use “master” as one of the network alias for the master machine. It’s fixed now as per my reply to your last comment.

  12. Pingback: Từng bước chinh phục Hadoop 2.x (Next Gen or YARN): cài đặt cluster multi node | Duy Trí's Homepage

  13. Hi,
    I’m following this tutorial but in my master the nodemanager, after the start, dies in a few seconds.
    Where is the problem?
    The tutorial is great.
    Thanks

  14. hi,
    i am following this tutorial but in my masternode ,after sometime datanode dies in a feww seconds
    so checked the logs and found this error
    fatal error exception in secure main

  15. Hi guys, I’m getting this error when I try to run hdfs dfs -copyFromLocal in /in

    14/01/24 12:57:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
    14/01/24 12:57:27 WARN hdfs.DFSClient: DataStreamer Exception
    org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /in/in/file._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)

    at org.apache.hadoop.ipc.Client.call(Client.java:1347)
    at org.apache.hadoop.ipc.Client.call(Client.java:1300)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)
    copyFromLocal: File /in/in/file._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
    14/01/24 12:57:27 ERROR hdfs.DFSClient: Failed to close file /in/in/file._COPYING_
    org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /in/in/file._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:59582)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)

    at org.apache.hadoop.ipc.Client.call(Client.java:1347)
    at org.apache.hadoop.ipc.Client.call(Client.java:1300)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:330)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1226)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1078)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:514)

    Here is my master jps list:
    30134 NameNode
    30724 ResourceManager
    30906 NodeManager
    31168 Jps
    30545 SecondaryNameNode

    and my slave list:
    18064 Jps
    17853 NodeManager
    17661 DataNode

    What am I missing?

    Best regards.

  16. superb doc.but my datanodes are not starting on slaves.this is the error in the log.please help me out.

    Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured

  17. Thanks for such a nice tutorial .I am able to set up my cluster with Hadoop 2.2.0 with just one modification,Which is as follows

    $HADOOP_CONF_DIR/yarn-site.xml (change to “mapreduce_shuffle” instead of mapreduce.shuffle):

    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>

  18. Pingback: hadoopy yarn | My Tech Notes

  19. Hi,
    I have some problems, when I start Hadoop services in Step 10 no error found but when I check by jps:
    In Master just has:
    12167 Jps
    11437 ResourceManager
    and in Slave just has:
    Jps
    Notice that I used JAVA jdk from oracle not the open jdk
    I think your tutorial is good
    Thank you so much

    • I checked logs and get these:
      2014-03-31 16:51:05,839 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
      2014-03-31 16:51:06,502 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Invalid dfs.datanode.data.dir /home/hadoop/yarn/hadoop-2.2.0/tmp/dfs/data :
      java.io.FileNotFoundException: File file:/home/hadoop/yarn/hadoop-2.2.0/tmp/dfs/data does not exist
      at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:520)
      at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:398)
      at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:129)
      at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:146)
      at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:1698)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.getDataDirsFromURIs(DataNode.java:1745)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1722)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1642)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1665)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1837)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1858)
      2014-03-31 16:51:06,504 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
      java.io.IOException: All directories in dfs.datanode.data.dir are invalid: “/home/hadoop/yarn/hadoop-2.2.0/tmp/dfs/data”
      at org.apache.hadoop.hdfs.server.datanode.DataNode.getDataDirsFromURIs(DataNode.java:1754)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1722)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1642)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1665)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1837)
      at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1858)
      2014-03-31 16:51:06,506 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
      2014-03-31 16:51:06,510 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:

      Thanks for your helps

      • Hi,
        I solved it already,
        Sorry because I didn’t read all comments for solution, just change from “.” to “_” and some problem within the permission on user hadoop
        I need to switch to hadoop user to configure the system 🙂
        Excellent tutor,
        Thanks you very much 🙂

      • Hi Ngoc Phan,

        I am facing the simillar error. Can I know what exactly the change to fix the issue.

        Thanks in Advance.
        Venkat

  20. Pingback: My Hadoop Experiment « Todd's Notes on Random Topics

  21. The property

    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce.shuffle</value>
    </property>

    has changed to

    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
    </property>

    Without the underscore the node manager will not start!

  22. Pingback: BIG DATA : Hadoop Linux Yarn 2.2.0 | Sauget Charles-Henri – Blog Décisionnel Microsoft

  23. Its such as you learn my mind! You appear to understand
    so much about this, such as you wrote the ebook in it or something.
    I feel that you simply could do with some percent to drive the message home a bit, however other than
    that, this is magnificent blog. A fantastic read. I’ll certainly be
    back.

  24. Pingback: install cloudera cdh from local repository | wangxiangqian

  25. Hi, Rashesh,
    Your article is wonderful. I tried to install a 3 node cluster according to your tutorial. I actually succeeded once with only the web interface part not done.
    When I tried to do this for the second time, I had a problem while I formatting the name node. I use the following command: “bin/hdfs namenode -format”, and I got the following error message: “Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode”

    Do you have any idea why this is happening? I have set the .bashrc, the hadoop-env.sh, and the properties files just as I had done for the first time. I simply could not move on without this step. Any help would be greatly appreciated!

    Max

  26. Pingback: Setting Up an R-Hadoop System: Step-by-Step Guide | Big D8TA

  27. Great blog! Do you have any suggestions for aspiring writers?
    I’m planning to start my own website soon but I’m a little lost on everything.

    Would you suggest starting with a free platform
    like WordPress or go for a paid option? There are so many choices out there that I’m completely confused ..
    Any ideas? Bless you!

    • I am not a blogger as such. For a project, I was trying to install hadoop and didn’t find any resources online. So, compiled this for my reference later on but it seems to be helping others as well 🙂

  28. My developer is trying to persuade me to move to .net
    from PHP. I have always disliked the idea because of
    the costs. But he’s tryiong none the less. I’ve been using WordPress on several websites for about a year and am worried about
    switching to another platform. I have heard great things about
    blogengine.net. Is there a way I can import all my wordpress content into it?
    Any kind of help would be really appreciated!

  29. Pingback: Why Live Node on HDFS monitoring just recognize 1 from master node?

  30. Pingback: Setting Up an R-Hadoop System and Predicting Future Website Visitors: Step-by-Step Guide |

  31. Pingback: Apache Hadoop 2.5.0 セットアップ手順 その2 – クラスター構築手順 | hrendoh's memo

  32. Hello,

    After successfully installation of single node on master and slave.We tried Multi-node.while doing steps one by one..we formatted the namenode again as per the given steps..do we need to format the namenode again??…plss help..

    And again when we do sbin/hadoop-daemon.sh start namenode …it show
    starting namenode, logging to /home/hduser/yarn/hadoop/logs/hadoop-hduser-namenode-localhost.localdomain.out
    but when we do jps..it doesnt show any namenode at all…

    when we do sbin/hadoop-daemons.sh start datanode it shows

    slave: mv: cannot stat ‘/home/hduser/yarn/hadoop/logs/hadoop-hduser-datanode-localhost.localdomain.out.1’: No such file or directory
    localhost: starting datanode, logging to /home/hduser/yarn/hadoop/logs/hadoop-hduser-datanode-localhost.localdomain.out
    slave: starting datanode, logging to /home/hduser/yarn/hadoop/logs/hadoop-hduser-datanode-localhost.localdomain.out
    slave: ulimit -a for user hduser
    slave: core file size (blocks, -c) 0
    slave: data seg size (kbytes, -d) unlimited
    slave: scheduling priority (-e) 0
    slave: file size (blocks, -f) unlimited
    slave: pending signals (-i) 63441
    slave: max locked memory (kbytes, -l) 64
    slave: max memory size (kbytes, -m) unlimited
    slave: open files (-n) 1024
    slave: pipe size (512 bytes, -p) 8
    [hduser@localhost hadoop]$ jps
    7290 Jps
    7091 DataNode

    please help for this problem…Please..

  33. May I simply say what a relief to find somebody who truly understands what they’re discussing online.
    You definitely understand how to bring a problem to light
    and make it important. More and more people should read this and understand this side of
    the story. I was surprised that you are not more popular given that you definitely possess the gift.

  34. Hello,
    I completed with hadoop multicluster using ur steps.Thanks!
    I have used hadoop 2.5.0 version
    I want to use eclipse to write mapreduce programs,but i am unable to make connectivity between
    eclipse and hadoop.What version of eclipse should i use for hadoop 2.5.0 and we also need hadoop-eclipse plugin also for the same.But,hadoop-eclipse plugin for hadoop 2.5.0 is not available yet.What should I do..?.Really stuck with this problem,Please help.

    • Hello,
      Is there something I should be careful with?
      case I just follow the steps but i still cant run hadoop
      even “echo $JAVA_HOME” cant work
      “hadoop command not found” either , Please I really needs help.

      Thanks a lot.

      Debian user

  35. thank you for your post , it was very use full to me .. i have one problem .. that is i have master and slave in my hadoop cluster. when i run wordcount program in master or slave job is running on only one node. if i run in master then it is executed in master only same in salve node. in my jobHistory also no sign of running Application Manager and job. No Containers too

  36. Hi I am new to hadoop getting below error any one having any idea.
    hadoop jar /media/Naval_tools/hadoop-2.5.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.2.jar wordcount input ouput
    15/04/09 00:22:03 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
    15/04/09 00:22:04 WARN hdfs.DFSClient: DataStreamer Exception
    org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hadoop-yarn/staging/naval/.staging/job_1428518032753_0005/job.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1471)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2791)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:606)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:455)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

    at org.apache.hadoop.ipc.Client.call(Client.java:1411)
    at org.apache.hadoop.ipc.Client.call(Client.java:1364)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)
    15/04/09 00:22:04 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/naval/.staging/job_1428518032753_0005
    org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /tmp/hadoop-yarn/staging/naval/.staging/job_1428518032753_0005/job.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 0 datanode(s) running and no node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1471)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2791)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:606)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:455)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

    at org.apache.hadoop.ipc.Client.call(Client.java:1411)
    at org.apache.hadoop.ipc.Client.call(Client.java:1364)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy9.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:368)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1449)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1270)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:526)

  37. Nice tutorial. However I would like to configure Secondary NameNode settings on master node but unbale to locate file and values that need to be configured. Please help me to proceed further on this.

  38. i have a question ..their is an error coming in my yarn..when i am starting it

    start-yarn.sh
    starting yarn daemons
    starting resourcemanager, logging to /logs/yarn-hduser-resourcemanager-ubuntu.out
    nice: /bin/yarn: No such file or directory
    localhost: starting nodemanager, logging to /logs/yarn-hduser-nodemanager-ubuntu.out
    localhost: nice: /bin/yarn: No such file or directory

  39. Pingback: Building R + Hadoop system | Mathminers

  40. Hi Rasesh,
    Thanks for the excellent instructions. This was the only resource that I was able to find that led through all the steps with sufficient details that I could follow them. Unfortunately, I get all the way to running the example with

    $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /in /out

    only to get the error

    15/06/16 00:59:53 INFO mapreduce.Job: Task Id : attempt_1434414924941_0004_m_000000_0, Status : FAILED
    Rename cannot overwrite non empty destination directory /home/hduser/hadoop-2.6.0/nm-local-dir/usercache/hduser/appcache/application_1434414924941_0004/filecache/10
    java.io.IOException: Rename cannot overwrite non empty destination directory /home/hduser/hadoop-2.6.0/nm-local-dir/usercache/hduser/appcache/application_1434414924941_0004/filecache/10
    at org.apache.hadoop.fs.AbstractFileSystem.renameInternal(AbstractFileSystem.java:716)
    at org.apache.hadoop.fs.FilterFs.renameInternal(FilterFs.java:228)
    at org.apache.hadoop.fs.AbstractFileSystem.rename(AbstractFileSystem.java:659)
    at org.apache.hadoop.fs.FileContext.rename(FileContext.java:909)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:364)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

    Any suggestions on what is going on and how to fix it? Thanks!

  41. Hi,
    I am running my MR job on yarn . When add conf.set(“mapreduce.framework.name”, “yarn”); line in to my job configuration then my job get listed in hisroty server but when i removed that my job is not visible on history server.
    I have already configured mapred-site.xml ,yarn-env.sh as per your instructiuon . But don’t know why it is not taking that entry from configuration.

  42. Pingback: Cài đặt Hadoop multi node cluster | My Blog - My Favourite

  43. As the admin of this site is working, no uncertainty very rapidly
    it will be famous, due to its feature contents.

  44. Hi,
    Thanks for a nice tutorial.

    I ran into a problem while looking for history of the jobs. Yesterday I was able to view the history of a particular job and now I couldn’t. It is saying could not load history.”Exception” but it is saying WordCount job succeeded. Then I restarted the history server now it is showing”RemoteException”:{“exception”:”NotFoundException”,”message”:”java.lang.Exception: job, job_1449022650667_0005, is not found”,”javaClassName”:”org.apache.hadoop.yarn.webapp.NotFoundException”}}
    Actually it was showing all the jobs yesterday and now it is showing history for certain jobs but not for all.
    Actually I have done many trials for my course project .now I am in need of time taken by each job.
    May I please, know is there any way to know the running time of jobs if I have job ids.

    Thanks
    Gandhi

  45. Pingback: R-Hadoop 시스템 구축 가이드 | Encaion

  46. Pingback: hdfs - hadoop complains about attempting to overwrite nonempty destination directory - CSS PHP

  47. hi i get a problem, i have 1 master and 3 slaves and i don’t get a error or something like that, but in the web page does’tn show the slaves, just show 0 live nodes, 0 die nodes and 0 Decommissioning nodes

    thanks….

  48. Pingback: Introdução ao Hadoop + Instalando Hadoop de forma Distribuída – Here Is Mari

  49. Hi,
    i think , on master node , the services running should be NameNode, Secondary NameNode and Resource Manager only. Datanode and Node Manager services are for Datanodes i.e for Slave nodes. Correct me if I am wrong.
