Spring Boot YARN doesn't run on Hadoop 2.8.0: client cannot access DataNode
I'm trying to run the Spring Boot YARN example (https://spring.io/guides/gs/yarn-basic/) on Windows. In application.yml I changed fsUri and resourceManagerHost to point to my VM host at 192.168..., roughly as in the sketch below.
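For context, a minimal sketch of the relevant part of application.yml (property names as used in the gs-yarn-basic guide; [VM's IP] is a placeholder for the actual address):
spring:
  hadoop:
    # must match the fs.defaultFS address advertised by the cluster
    fsUri: hdfs://[VM's IP]:9000
    # YARN ResourceManager host
    resourceManagerHost: [VM's IP]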
But when I try to run the application, this exception appears:
DFSClient: Exception in createBlockOutputStream
java.net.ConnectException: Connection timed out: no further information
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.hdfs.DFSOutputStream.createSocketForPipeline(DFSOutputStream.java:1508)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1284)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1237)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:449)
[2017-05-27 19:59:49.570] boot - 7728 INFO [Thread-5] --- DFSClient: Abandoning BP-646365587-10.0.2.15-1495898351938:blk_1073741830_1006
[2017-05-27 19:59:49.602] boot - 7728 INFO [Thread-5] --- DFSClient: Excluding datanode DatanodeInfoWithStorage[10.0.2.15:50010,DS-f909ec7a-8374-4cdd-9cfc-0e778810d98c,DISK]
[2017-05-27 19:59:49.647] boot - 7728 WARN [Thread-5] --- DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /app/gs-yarn-basic/gs-yarn-basic-container-0.1.0.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
This means the DataNode cannot be reached from my host. For that reason I added the following to hdfs-site.xml:
<property>
<name>dfs.client.use.datanode.hostname</name>
<value>true</value>
<description>Whether clients should use datanode hostnames when
connecting to datanodes.
</description>
</property>
But it still throws the same exception.
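Note that dfs.client.use.datanode.hostname is a client-side setting, so adding it only to the server's hdfs-site.xml does not necessarily reach a remote Spring Boot client. A minimal sketch of passing it on the client instead, assuming Spring for Apache Hadoop's spring.hadoop.config map accepts arbitrary Hadoop properties (not verified against this exact version):
spring:
  hadoop:
    config:
      # make the DFS client connect to DataNodes by hostname rather than
      # the NAT-internal IP (10.0.2.15) they registered with the NameNode
      dfs.client.use.datanode.hostname: true
For this route to work, the DataNode's hostname must also resolve from the Windows host (for example via an entry in the hosts file).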
Hadoop 2.8.0 is installed and running on my VM. Here are the conf files:
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://0.0.0.0:9000</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop/hadoop-2.8.0/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop/hadoop-2.8.0/data/datanode</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<property>
<name>dfs.client.use.datanode.hostname</name>
<value>true</value>
<description>Whether clients should use datanode hostnames when
connecting to datanodes.
</description>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>8192</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>8192</value>
</property>
<property>
<name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
<value>99</value>
</property>
</configuration>
Solved: Spring connected to YARN after I changed 0.0.0.0:9000 to [VM's IP]:9000 in core-site.xml. Thanks @RameshMaharjan.
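For reference, a minimal sketch of the corrected core-site.xml on the VM ([VM's IP] stands for the actual address, as above):
<configuration>
<property>
<!-- bind the NameNode to a concrete, externally reachable address -->
<name>fs.defaultFS</name>
<value>hdfs://[VM's IP]:9000</value>
</property>
</configuration>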
Your core-site.xml should point to the Namenode address, but at the moment it points to 0.0.0.0, which means all addresses on the local machine. That yields ambiguous results, since every machine would then treat itself as the Namenode, and there should be only one Namenode in a Hadoop cluster. Replacing 0.0.0.0 with the Namenode's IP or hostname should solve the problem you are facing.
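As a quick sanity check (assuming the Hadoop binaries are on the PATH inside the VM), hdfs getconf prints the NameNode address the configuration resolves to:
hdfs getconf -confKey fs.defaultFS
# after the fix this should print hdfs://[VM's IP]:9000 rather than hdfs://0.0.0.0:9000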