Jobs finishing successfully even though IOException occurs
When running GridMix, my master node gets various IOExceptions, and I'd like to know whether this is something I should actually worry about, or whether it is just a transient event given that my jobs complete successfully:
IOException: Bad connect ack with firstBadLink: \
java.io.IOException: Bad response ERROR for block BP-49483579-10.0.1.190-1449960324681:blk_1073746606_5783 from datanode 10.0.1.192:50010
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:819)
Should I just ignore it?
try {
    ...
} catch (IOException iox) {
    // ***NOP***
}
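Or, less drastically, should I log it instead of swallowing it? A minimal sketch of what I mean (MyJobDriver and writeOutput are illustrative placeholders, not from my actual job):

import java.io.IOException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical driver class; writeOutput() stands in for the job's HDFS I/O.
public class MyJobDriver {
    private static final Logger LOG = LoggerFactory.getLogger(MyJobDriver.class);

    void writeOutput() throws IOException {
        // ... the HDFS write that occasionally throws ...
    }

    void runStep() {
        try {
            writeOutput();
        } catch (IOException iox) {
            // Log instead of a bare NOP so real failures stay visible.
            LOG.warn("IOException during HDFS write, continuing", iox);
        }
    }
}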
I can't be sure without knowing your complete setup, but most likely these exceptions occur while appending, during pipeline setup; in code terms, you could say stage == BlockConstructionStage.PIPELINE_SETUP_APPEND.
In any case, since your jobs complete successfully, you don't need to worry. They succeed because when an exception occurs while opening a DataOutputStream to the DataNode pipeline, the client simply keeps retrying until the pipeline is set up.
The exception occurs in org.apache.hadoop.hdfs.DFSOutputStream; the important code snippets are shown below for your understanding.
private boolean createBlockOutputStream(DatanodeInfo[] nodes, long newGS, boolean recoveryFlag) {
    // Code..
    if (pipelineStatus != SUCCESS) {
        if (pipelineStatus == Status.ERROR_ACCESS_TOKEN) {
            throw new InvalidBlockTokenException(
                "Got access token error for connect ack with firstBadLink as "
                + firstBadLink);
        } else {
            throw new IOException("Bad connect ack with firstBadLink as "
                + firstBadLink);
        }
    }
    // Code..
}
Now, createBlockOutputStream is called from setupPipelineForAppendOrRecovery, and as the comment on that method says: "It keeps on trying until a pipeline is setup".
/**
 * Open a DataOutputStream to a DataNode pipeline so that
 * it can be written to.
 * This happens when a file is appended or data streaming fails
 * It keeps on trying until a pipeline is setup
 */
private boolean setupPipelineForAppendOrRecovery() throws IOException {
    // Code..
    while (!success && !streamerClosed && dfsClient.clientRunning) {
        // Code..
        success = createBlockOutputStream(nodes, newGS, isRecovery);
    }
    // Code..
}
If you read through the complete org.apache.hadoop.hdfs.DFSOutputStream code, you will see that pipeline-setup attempts continue until a pipeline is created, whether for append or for a fresh write.
If you want to do something about it, you can try tuning the dfs.datanode.max.xcievers property in hdfs-site.xml; many people have reported this as the fix. Note that the Hadoop services need to be restarted after setting the property.
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>8192</value>
</property>
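To confirm the value actually in effect after the restart, you can read it back through the Hadoop Configuration API. A minimal sketch, assuming hdfs-site.xml is on the classpath (XcieverCheck is a hypothetical helper; note that dfs.datanode.max.xcievers is the legacy spelling, which newer Hadoop releases map to dfs.datanode.max.transfer.threads):

import org.apache.hadoop.conf.Configuration;

// Hypothetical helper: prints the effective value of the xciever/transfer-threads setting.
public class XcieverCheck {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.addResource("hdfs-site.xml");
        // Legacy key name, as used in the property above.
        System.out.println("dfs.datanode.max.xcievers = "
                + conf.get("dfs.datanode.max.xcievers", "not set"));
        // Newer key name that Hadoop 2.x+ reads in its place.
        System.out.println("dfs.datanode.max.transfer.threads = "
                + conf.get("dfs.datanode.max.transfer.threads", "not set"));
    }
}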