Hadoop copy from cluster to cluster fails due to "Mismatch in length of source"

I want to copy data from one cluster to another. I am using this command:

hadoop distcp hdfs://SOURCE-NAMENODE:9000/dir/ hdfs://DESTINATION-NAMENODE:9000/

I get this message:

18/04/11 12:05:37 INFO mapred.CopyMapper: Copying hdfs://SOURCE-NAMENODE:9000/SOURCE-NAMENODE/WALs/xxxx,18560,1523039740289/xxxx%2C18560%2C1523039740289.default.1523445499108 to hdfs://DESTINATION-NAMENODE:9000/SOURCE-NAMENODE/WALs/xxxx,18560,1523039740289/xxxx%2C18560%2C1523039740289.default.1523445499108
18/04/11 12:05:37 INFO mapred.RetriableFileCopyCommand: Creating temp file: hdfs://DESTINATION-NAMENODE:9000/.distcp.tmp.attempt_local2084770019_0001_m_000000_0
18/04/11 12:05:38 ERROR util.RetriableCommand: Failure in Retriable command: Copying hdfs://SOURCE-NAMENODE:9000/SOURCE-NAMENODE/WALs/xxxx,18560,1523039740289/xxxx%2C18560%2C1523039740289.default.1523445499108 to hdfs://DESTINATION-NAMENODE:9000/SOURCE-NAMENODE/WALs/xxxx,18560,1523039740289/xxxx%2C18560%2C1523039740289.default.1523445499108
java.io.IOException: Mismatch in length of source:hdfs://SOURCE-NAMENODE:9000/SOURCE-NAMENODE/WALs/xxxx,18560,1523039740289/xxxx%2C18560%2C1523039740289.default.1523445499108 and target:hdfs://DESTINATION-NAMENODE:9000/.distcp.tmp.attempt_local2084770019_0001_m_000000_0
    at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.compareFileLengths(RetriableFileCopyCommand.java:193)...

At the destination I only see the directories that were created, and no files.

Any ideas?

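This is probably because you are copying files that are still being written to. The failing paths are under WALs/, which look like HBase write-ahead logs; such files stay open while their writer is running, so their length can change between the time distcp records it and the time the copy finishes.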
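
One way to check this guess is to ask HDFS which files under the source directory are currently open for write. The following is a minimal sketch, not a definitive procedure: it assumes it is run on a host whose default filesystem is the source cluster, that fsck marks open files with the string OPENFORWRITE in its report, and the function names and the example path are only illustrative.

```shell
#!/usr/bin/env bash

# Keep only the fsck report lines that are flagged as open for write.
# Assumption: fsck marks such files with the string "OPENFORWRITE".
filter_open_files() {
  grep 'OPENFORWRITE' || true   # no match is not treated as an error
}

# List files under the given HDFS path that are currently open for write.
# Assumption: the default filesystem of this host is the source cluster.
list_open_files() {
  hdfs fsck "$1" -files -openforwrite | filter_open_files
}

# Example call (path taken from the log above; adjust to your layout):
# list_open_files /SOURCE-NAMENODE/WALs
```

If this prints any files, retrying the copy once the writer has closed them, or copying from an HDFS snapshot of the directory instead of the live directory, may avoid the mismatch.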