Apache Beam:PubsubReader 因 NPE 而失败
Apache Beam: PubsubReader fails with NPE
我有一个 Beam 管道,它在应用一些转换后从 PubSub 读取并写入 BigQuery。管道始终失败并出现 NPE。我正在使用 Beam SDK 版本 0.6.0。关于我可能做错了什么的任何想法?我正在尝试 运行 使用 DirectRunner 的管道。
java.lang.NullPointerException
at org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubReader.ackBatch(PubsubUnboundedSource.java:640)
at org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubCheckpoint.finalizeCheckpoint(PubsubUnboundedSource.java:313)
at org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.getReader(UnboundedReadEvaluatorFactory.java:174)
at org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.processElement(UnboundedReadEvaluatorFactory.java:127)
at org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:139)
at org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:107)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
这个问题是因为一个Bug (BEAM-1656) in the DirectRunner and a precondition within PubsubCheckpoint. The bug in the DirectRunner was fixed in pull request 2237,被合并到Github master分支,但是在0.6.0发布之后。
更新到 0.7.0 每晚构建或从 github HEAD 构建将解决使用 DirectRunner 时的这个问题。
要更新到当前的夜间构建,您必须将以下存储库添加到项目的 pom.xml
。包含修复的 beam-runners-direct-java
模块的最早版本是 0.7.0-20170316.070901-9
,但并非所有模块都是使用此特定版本构建的,因此您可能必须指定单独的兼容版本或使用 0.7.0-SNAPSHOT
<repositories>
<repository>
<id>apache.snapshots</id>
<name>Apache Development Snapshot Repository</name>
<url>https://repository.apache.org/content/repositories/snapshots/</url>
<releases>
<enabled>false</enabled>
</releases>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
</repositories>
我有一个 Beam 管道,它在应用一些转换后从 PubSub 读取并写入 BigQuery。管道始终失败并出现 NPE。我正在使用 Beam SDK 版本 0.6.0。关于我可能做错了什么的任何想法?我正在尝试 运行 使用 DirectRunner 的管道。
java.lang.NullPointerException
at org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubReader.ackBatch(PubsubUnboundedSource.java:640)
at org.apache.beam.sdk.io.PubsubUnboundedSource$PubsubCheckpoint.finalizeCheckpoint(PubsubUnboundedSource.java:313)
at org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.getReader(UnboundedReadEvaluatorFactory.java:174)
at org.apache.beam.runners.direct.UnboundedReadEvaluatorFactory$UnboundedReadEvaluator.processElement(UnboundedReadEvaluatorFactory.java:127)
at org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:139)
at org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:107)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
这个问题是因为一个Bug (BEAM-1656) in the DirectRunner and a precondition within PubsubCheckpoint. The bug in the DirectRunner was fixed in pull request 2237,被合并到Github master分支,但是在0.6.0发布之后。
更新到 0.7.0 每晚构建或从 github HEAD 构建将解决使用 DirectRunner 时的这个问题。
要更新到当前的夜间构建,您必须将以下存储库添加到项目的 pom.xml
。包含修复的 beam-runners-direct-java
模块的最早版本是 0.7.0-20170316.070901-9
,但并非所有模块都是使用此特定版本构建的,因此您可能必须指定单独的兼容版本或使用 0.7.0-SNAPSHOT
<repositories>
<repository>
<id>apache.snapshots</id>
<name>Apache Development Snapshot Repository</name>
<url>https://repository.apache.org/content/repositories/snapshots/</url>
<releases>
<enabled>false</enabled>
</releases>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
</repositories>