使用 Presto 从 Alluxio 读取时通道关闭

Channel is closed while reading from Alluxio using Presto

我在 运行 Alluxio 上的 Presto 查询时遇到了这个堆栈跟踪。有时我的查询能够成功,但有时会因此错误而失败。这是什么意思,我该如何解决?

com.facebook.presto.spi.PrestoException: Error opening Hive split alluxio://xxxxx:19998/s3/data/m-00025 (offset=100663296, length=53990296) using org.apache.hadoop.mapred.TextInputFormat: Channel [id: 0xfa748b02, L:/xxxxx:34874 ! R:xxxxx/xxxxx:29999] is closed.
    at com.facebook.presto.hive.HiveUtil.createRecordReader(HiveUtil.java:219)
    at com.facebook.presto.hive.GenericHiveRecordCursorProvider.lambda$createRecordCursor[=10=](GenericHiveRecordCursorProvider.java:71)
    at com.facebook.presto.hive.authentication.NoHdfsAuthentication.doAs(NoHdfsAuthentication.java:23)
    at com.facebook.presto.hive.HdfsEnvironment.doAs(HdfsEnvironment.java:80)
    at com.facebook.presto.hive.GenericHiveRecordCursorProvider.createRecordCursor(GenericHiveRecordCursorProvider.java:70)
    at com.facebook.presto.hive.HivePageSourceProvider.createHivePageSource(HivePageSourceProvider.java:183)
    at com.facebook.presto.hive.HivePageSourceProvider.createPageSource(HivePageSourceProvider.java:93)
    at com.facebook.presto.spi.connector.classloader.ClassLoaderSafeConnectorPageSourceProvider.createPageSource(ClassLoaderSafeConnectorPageSourceProvider.java:44)
    at com.facebook.presto.split.PageSourceManager.createPageSource(PageSourceManager.java:56)
    at com.facebook.presto.operator.ScanFilterAndProjectOperator.getOutput(ScanFilterAndProjectOperator.java:216)
    at com.facebook.presto.operator.Driver.processInternal(Driver.java:379)
    at com.facebook.presto.operator.Driver.lambda$processFor(Driver.java:283)
    at com.facebook.presto.operator.Driver.tryWithLock(Driver.java:675)
    at com.facebook.presto.operator.Driver.processFor(Driver.java:276)
    at com.facebook.presto.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1053)
    at com.facebook.presto.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:162)
    at com.facebook.presto.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:456)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Channel [id: 0xfa748b02, L:/xxxxx:34874 ! R:xxxxx/xxxxx:29999] is closed.
    at alluxio.client.block.stream.NettyPacketReader$PacketReadHandler.channelUnregistered(NettyPacketReader.java:314)
    at alluxio.core.client.runtime.io.netty.channel.AbstractChannelHandlerContext.invokeChannelUnregistered(AbstractChannelHandlerContext.java:176)

这意味着 Alluxio 客户端(Presto)和 Alluxio worker 之间的连接意外关闭。

通常这是由于客户端GC暂停时间过长造成的。 Alluxio 客户端定期在连接上发送保持活动状态,但这可能会被完整的 GC 延迟(到 worker 关闭连接的点)。

你可以通过在 Presto 守护进程中添加 Java 选项 -XX:+PrintGCDetails-Xloggc:<file name here> 来验证是否存在 GC 压力。