Apache Beam Word Count Example via Spark runner and HDFS fails with "Failed to serialize and deserialize property"
I am trying to run the Apache Beam v2.0.0 word count example on Spark v1.6.x (via Yarn v2.7.3) so that it reads from and writes to HDFS (v2.7.3).
Currently, I submit the job with the following command:
bin/spark-submit --class org.apache.beam.examples.WordCount \
--master yarn --deploy-mode cluster \
test/word-count-beam-1.0-SNAPSHOT.jar \
--inputFile=hdfs://test/input/* \
--output=hdfs://test/output \
--runner=SparkRunner --sparkMaster=yarn
Unfortunately, the job fails with the following exception:
Failed to serialize and deserialize property 'hdfsConfiguration' with value '[Configuration: /usr/hdp/current/hadoop-client/conf/core-site.xml, /usr/hdp/current/hadoop-client/conf/hdfs-site.xml]'
Here is the full stack trace:
java.lang.IllegalStateException: Failed to serialize the pipeline options.
at org.apache.beam.runners.spark.translation.SparkRuntimeContext.serializePipelineOptions(SparkRuntimeContext.java:58)
at org.apache.beam.runners.spark.translation.SparkRuntimeContext.<init>(SparkRuntimeContext.java:41)
at org.apache.beam.runners.spark.translation.EvaluationContext.<init>(EvaluationContext.java:67)
at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:196)
at org.apache.beam.runners.spark.SparkRunner.run(SparkRunner.java:85)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:295)
at org.apache.beam.sdk.Pipeline.run(Pipeline.java:281)
at at.tmobile.bigdata.examples.WordCount.main(WordCount.java:184)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon.run(ApplicationMaster.scala:561)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Unexpected IOException (of type java.io.IOException): Failed to serialize and deserialize property 'hdfsConfiguration' with value '[Configuration: /usr/hdp/current/hadoop-client/conf/core-site.xml, /usr/hdp/current/hadoop-client/conf/hdfs-site.xml]'
at com.fasterxml.jackson.databind.JsonMappingException.fromUnexpectedIOE(JsonMappingException.java:163)
at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2342)
at org.apache.beam.runners.spark.translation.SparkRuntimeContext.serializePipelineOptions(SparkRuntimeContext.java:56)
... 12 more
Caused by: java.io.IOException: Failed to serialize and deserialize property 'hdfsConfiguration' with value '[Configuration: /usr/hdp/current/hadoop-client/conf/core-site.xml, /usr/hdp/current/hadoop-client/conf/hdfs-site.xml]'
at org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.ensureSerializable(ProxyInvocationHandler.java:710)
at org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.serialize(ProxyInvocationHandler.java:629)
at org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.serialize(ProxyInvocationHandler.java:618)
at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:128)
at com.fasterxml.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:2881)
at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2338)
... 13 more
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Conflicting property-based creators: already had [constructor for java.util.ArrayList, annotations: [null]], encountered [constructor for java.util.ArrayList, annotations: [null]]
at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:266)
at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCacheValueDeserializer(DeserializerCache.java:241)
at com.fasterxml.jackson.databind.deser.DeserializerCache.findValueDeserializer(DeserializerCache.java:142)
at com.fasterxml.jackson.databind.DeserializationContext.findRootValueDeserializer(DeserializationContext.java:394)
at com.fasterxml.jackson.databind.ObjectMapper._findRootDeserializer(ObjectMapper.java:3169)
at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3062)
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:2175)
at org.apache.beam.sdk.options.ProxyInvocationHandler$Serializer.ensureSerializable(ProxyInvocationHandler.java:708)
... 18 more
Caused by: java.lang.IllegalArgumentException: Conflicting property-based creators: already had [constructor for java.util.ArrayList, annotations: [null]], encountered [constructor for java.util.ArrayList, annotations: [null]]
at com.fasterxml.jackson.databind.deser.impl.CreatorCollector.verifyNonDup(CreatorCollector.java:228)
at com.fasterxml.jackson.databind.deser.impl.CreatorCollector.addPropertyCreator(CreatorCollector.java:168)
at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory._handleSingleArgumentConstructor(BasicDeserializerFactory.java:487)
at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory._addDeserializerConstructors(BasicDeserializerFactory.java:406)
at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory._constructDefaultValueInstantiator(BasicDeserializerFactory.java:325)
at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory.findValueInstantiator(BasicDeserializerFactory.java:266)
at com.fasterxml.jackson.databind.deser.BasicDeserializerFactory.createCollectionDeserializer(BasicDeserializerFactory.java:851)
at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer2(DeserializerCache.java:390)
at com.fasterxml.jackson.databind.deser.DeserializerCache._createDeserializer(DeserializerCache.java:348)
at com.fasterxml.jackson.databind.deser.DeserializerCache._createAndCache2(DeserializerCache.java:261)
... 25 more
Does anyone know how to solve this?
I ran into the same problem.
The modules loaded by java.util.ServiceLoader.load(com.fasterxml.jackson.databind.Module.class) are:
- DefaultScalaModule
- HadoopFileSystemModule
- ParanamerModule
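To reproduce the module list above on your own classpath, a minimal sketch along these lines may help. The class name ListJacksonModules and the helper providerNames are hypothetical names chosen for illustration. The service type is looked up by name so that the sketch also compiles without Jackson present, which is an assumption made here for convenience and not something the original setup requires.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ListJacksonModules {

    /**
     * Returns the class names of all providers that ServiceLoader finds
     * for the given service type, or an empty list if the service type
     * itself is not on the classpath.
     */
    static List<String> providerNames(String serviceClassName) {
        List<String> names = new ArrayList<>();
        try {
            Class<?> service = Class.forName(serviceClassName);
            for (Object provider : ServiceLoader.load(service)) {
                names.add(provider.getClass().getName());
            }
        } catch (ClassNotFoundException e) {
            // Service type not available: report no providers.
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints one line per Jackson module registered via META-INF/services.
        for (String name : providerNames("com.fasterxml.jackson.databind.Module")) {
            System.out.println(name);
        }
    }
}
```

Run it with the same classpath as the job (for example, the shaded word-count jar). If ParanamerModule no longer appears in the output after a rebuild, the module is no longer being auto-registered.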
The problem is with the hdfsConfiguration property, which has type ArrayList<Configuration>.
Excluding the paranamer dependency from the jackson-module-scala dependency in the spark-runner profile helps:
<profiles>
  <profile>
    <id>spark-runner</id>
    <dependencies>
      ...
      <dependency>
        <groupId>com.fasterxml.jackson.module</groupId>
        <artifactId>jackson-module-scala_2.10</artifactId>
        <version>2.8.8</version>
        <scope>runtime</scope>
        <exclusions>
          <exclusion>
            <groupId>com.fasterxml.jackson.module</groupId>
            <artifactId>jackson-module-paranamer</artifactId>
          </exclusion>
        </exclusions>
      </dependency>
      ...
    </dependencies>
  </profile>
</profiles>
ParanamerModule inspects property annotations and fails on the ArrayList constructor, but the module is optional, so it can be excluded.