Shuffle failed on empty file. EOFException: Unexpected end of input stream
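I am trying to run a copy of a data processing pipeline on my local machine, with Hadoop and HBase in standalone mode. The same pipeline works correctly on the cluster.
The pipeline consists of several MapReduce jobs started one after another. One of these jobs has a mapper that writes nothing to its output (this depends on the input, but in my test it writes nothing), yet the job still has a reducer.
I get this exception while that job is running: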
16:42:19,322 [INFO] [localfetcher#13] o.a.h.i.c.CodecPool: Got brand-new decompressor [.gz]
16:42:19,322 [INFO] [localfetcher#13] o.a.h.m.t.r.LocalFetcher: localfetcher#13 about to shuffle output of map attempt_local509755465_0013_m_000000_0 decomp: 2 len: 6 to MEMORY
16:42:19,326 [WARN] [Thread-4749] o.a.h.m.LocalJobRunner: job_local509755465_0013 java.lang.Exception: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#13
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462) ~[hadoop-mapreduce-client-common-2.5.1.jar:?]
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529) [hadoop-mapreduce-client-common-2.5.1.jar:?]
Caused by: org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in localfetcher#13
at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:134) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:376) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319) ~[hadoop-mapreduce-client-common-2.5.1.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_181]
at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266) ~[?:1.8.0_181]
at java.util.concurrent.FutureTask.run(FutureTask.java) ~[?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
Caused by: java.io.EOFException: Unexpected end of input stream
at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:145) ~[hadoop-common-2.7.3.jar:?]
at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85) ~[hadoop-common-2.7.3.jar:?]
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:199) ~[hadoop-common-2.7.3.jar:?]
at org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.shuffle(InMemoryMapOutput.java:97) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:157) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.doCopy(LocalFetcher.java:102) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
at org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.run(LocalFetcher.java:85) ~[hadoop-mapreduce-client-core-2.7.3.jar:?]
I checked the files produced by the mapper. I expected them to be empty, because the mapper writes nothing to the output, but they contain some strange bytes:
File: /tmp/hadoop-egorkiruhin/mapred/local/localRunner/egorkiruhin/jobcache/job_local509755465_0013/attempt_local509755465_0013_m_000000_0/output/file.out
ÿÿÿÿ^@^@
File: /tmp/hadoop-egorkiruhin/mapred/local/localRunner/egorkiruhin/jobcache/job_local509755465_0013/attempt_local509755465_0013_m_000000_0/output/file.out.index
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^B^@^@^@^@^@^@^@^F^@^@^@^@dTG<93>
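As a reading aid, the sketch below decodes the two dumps above using only the JDK. It assumes (this is an assumption about Hadoop's map output layout, not something confirmed in this post) that a `file.out.index` record is three big-endian longs (start offset, raw length, on-disk length) followed by a checksum, and that `file.out` ends with a 4-byte CRC32 of the bytes before it. The class and method names are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class MapOutputBytes {
    // Bytes of file.out.index as dumped above (^@ = 0x00, ^B = 0x02, ^F = 0x06, <93> = 0x93).
    static final byte[] INDEX = {
        0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 2,
        0, 0, 0, 0, 0, 0, 0, 6,
        0, 0, 0, 0, 0x64, 0x54, 0x47, (byte) 0x93
    };

    // Bytes of file.out as dumped above (ÿ = 0xFF, ^@ = 0x00).
    static final byte[] DATA = {
        (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, 0, 0
    };

    /** Reads the first three big-endian longs. Assumed order: startOffset, rawLength, partLength. */
    static long[] decodeIndexRecord(byte[] index) {
        ByteBuffer buf = ByteBuffer.wrap(index); // ByteBuffer is big-endian by default
        return new long[] {buf.getLong(), buf.getLong(), buf.getLong()};
    }

    /** Standard CRC32 over bytes[off, off + len). */
    static long crc32(byte[] bytes, int off, int len) {
        CRC32 crc = new CRC32();
        crc.update(bytes, off, len);
        return crc.getValue();
    }

    /** Interprets the last four bytes as an unsigned big-endian integer. */
    static long trailingChecksum(byte[] bytes) {
        return ByteBuffer.wrap(bytes, bytes.length - 4, 4).getInt() & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        long[] rec = decodeIndexRecord(INDEX);
        System.out.printf("startOffset=%d rawLength=%d partLength=%d%n", rec[0], rec[1], rec[2]);
        System.out.printf("stored checksum=%08x computed=%08x%n",
                trailingChecksum(DATA), crc32(DATA, 0, DATA.length - 4));
    }
}
```

With the bytes shown above, this reports a start offset of 0, a raw length of 2 and an on-disk length of 6, which lines up with `decomp: 2 len: 6` in the log. The CRC32 of the first two bytes of `file.out` also equals its last four bytes (`ffff0000`). If the layout assumption holds, the file would be two plain marker bytes plus a checksum rather than a gzip stream, but that reading is not verified against the Hadoop source.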
I could not find an explanation for this problem, but I worked around it by turning off compression of the mapper output:
config.set("mapreduce.map.output.compress", "false");
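For context, here is a minimal sketch of where that setting could go so that it only affects local runs and leaves the cluster configuration alone. The helper name `forLocalRun` and the job name are hypothetical. It assumes the Hadoop client libraries are on the classpath and that the local runner is selected through `mapreduce.framework.name` being `local`.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class LocalRunConfig {
    /**
     * Hypothetical helper: turns off map output compression only when the
     * job would run with the local job runner.
     */
    static Configuration forLocalRun(Configuration config) {
        // Assumption: "mapreduce.framework.name" is "local" (or unset) for local runs.
        if ("local".equals(config.get("mapreduce.framework.name", "local"))) {
            config.setBoolean("mapreduce.map.output.compress", false);
        }
        return config;
    }

    public static void main(String[] args) throws Exception {
        Configuration config = forLocalRun(new Configuration());
        // Job.getInstance copies the configuration, so the flag has to be set before this call.
        Job job = Job.getInstance(config, "example-job"); // placeholder job name
        System.out.println("map output compression: "
                + job.getConfiguration().getBoolean("mapreduce.map.output.compress", false));
    }
}
```

The check on the framework name is a design choice for this sketch: the pipeline is reported to work on the cluster with compression enabled, so the workaround is limited to the standalone setup.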