Why does java.net.URL.toString throw a NullPointerException on EMR AMI 3.8.0?
My Hadoop job runs fine on Amazon Elastic MapReduce AMI 3.7.0. But when I upgraded to AMI version 3.8.0, the toString method of the java.net.URL class started throwing a NullPointerException:
java.lang.NullPointerException
at java.net.URL.toExternalForm(URL.java:925)
at java.net.URL.toString(URL.java:911)
at com.snowplowanalytics.iglu.client.repositories.HttpRepositoryRef.lookupSchema(HttpRepositoryRef.scala:602)
at com.snowplowanalytics.iglu.client.Resolver.recurse(Resolver.scala:236)
at com.snowplowanalytics.iglu.client.Resolver.lookupSchema(Resolver.scala:247)
at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$verifySchemaAndValidate$$anonfun$apply$$anonfun$apply.apply(validatableJson.scala:171)
at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$verifySchemaAndValidate$$anonfun$apply$$anonfun$apply.apply(validatableJson.scala:170)
at scalaz.Validation$class.flatMap(Validation.scala:141)
at scalaz.Success.flatMap(Validation.scala:347)
at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$verifySchemaAndValidate$$anonfun$apply.apply(validatableJson.scala:170)
at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$verifySchemaAndValidate$$anonfun$apply.apply(validatableJson.scala:169)
at scalaz.Validation$class.flatMap(Validation.scala:141)
at scalaz.Success.flatMap(Validation.scala:347)
at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$verifySchemaAndValidate.apply(validatableJson.scala:169)
at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$$anonfun$verifySchemaAndValidate.apply(validatableJson.scala:166)
at scalaz.Validation$class.flatMap(Validation.scala:141)
at scalaz.Success.flatMap(Validation.scala:347)
at com.snowplowanalytics.iglu.client.validation.ValidatableJsonMethods$.verifySchemaAndValidate(validatableJson.scala:166)
at com.snowplowanalytics.iglu.client.validation.ValidatableJsonNode.verifySchemaAndValidate(validatableJson.scala:244)
at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$$anonfun$extractAndValidateJson$$anonfun$apply.apply(Shredder.scala:267)
at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$$anonfun$extractAndValidateJson$$anonfun$apply.apply(Shredder.scala:266)
at scalaz.Validation$class.flatMap(Validation.scala:141)
at scalaz.Success.flatMap(Validation.scala:347)
at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$$anonfun$extractAndValidateJson.apply(Shredder.scala:266)
at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$$anonfun$extractAndValidateJson.apply(Shredder.scala:264)
at scala.Option.map(Option.scala:145)
at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$.extractAndValidateJson(Shredder.scala:264)
at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$.extractContexts(Shredder.scala:101)
at com.snowplowanalytics.snowplow.enrich.common.utils.shredder.Shredder$.shred(Shredder.scala:108)
at com.snowplowanalytics.snowplow.enrich.hadoop.ShredJob$$anonfun$loadAndShred.apply(ShredJob.scala:83)
at com.snowplowanalytics.snowplow.enrich.hadoop.ShredJob$$anonfun$loadAndShred.apply(ShredJob.scala:80)
at scalaz.Validation$class.flatMap(Validation.scala:141)
at scalaz.Success.flatMap(Validation.scala:347)
at com.snowplowanalytics.snowplow.enrich.hadoop.ShredJob$.loadAndShred(ShredJob.scala:80)
at com.snowplowanalytics.snowplow.enrich.hadoop.ShredJob$$anonfun.apply(ShredJob.scala:170)
at com.snowplowanalytics.snowplow.enrich.hadoop.ShredJob$$anonfun.apply(ShredJob.scala:169)
at com.twitter.scalding.MapFunction.operate(Operations.scala:58)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:99)
at cascading.flow.stream.FunctionEachStage.receive(FunctionEachStage.java:39)
at cascading.flow.stream.SourceStage.map(SourceStage.java:102)
at cascading.flow.stream.SourceStage.run(SourceStage.java:58)
at cascading.flow.hadoop.FlowMapper.run(FlowMapper.java:130)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:452)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:344)
at org.apache.hadoop.mapred.YarnChild.run(YarnChild.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:166)
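The URL the method is called on is not null. The exception is thrown inside the class's own toExternalForm method.
Why does this happen?
Here is the output of java -version on the AMI 3.8.0 cluster (on both the master and core nodes):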
[hadoop@ip-xxx-xx-xx-xx ~]$ java -version
java version "1.7.0_76"
Java(TM) SE Runtime Environment (build 1.7.0_76-b13)
Java HotSpot(TM) 64-Bit Server VM (build 24.76-b04, mixed mode)
And for AMI 3.7.0 (on both the master and core nodes):
[hadoop@ip-xxx-xx-xx-xx ~]$ java -version
java version "1.7.0_71"
Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)
Could the different JRE version be the culprit?
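As reluctant as I am to make this claim, it looks like a JVM bug. In the OpenJDK source for java.net.URL, the entire toExternalForm() method is a delegation to the handler, which is a transient field: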
    public String toExternalForm() {
        return handler.toExternalForm(this);
    }
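The only way this can throw an NPE is if handler is null. As far as I can tell, every constructor path and the readObject(ObjectInputStream) method make sure the handler field is set, and throw an exception (MalformedURLException or IOException) if it is not. For example: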
    private synchronized void readObject(java.io.ObjectInputStream s)
         throws IOException, ClassNotFoundException
    {
        s.defaultReadObject();  // read the fields
        if ((handler = getURLStreamHandler(protocol)) == null) {
            throw new IOException("unknown protocol: " + protocol);
        }
        ...
I notice that there is a public JRE 7u79 release, which I would suggest trying if upgrading to Java 8 is not feasible.
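One way to test the reasoning above outside the Hadoop job is a small standalone check that can be run on each JRE. The sketch below is only an illustration: the class name UrlRoundTrip, the helper roundTripToString, and the example.com URL are made up for this purpose and do not come from the job. It builds a URL, sends it through standard Java serialization (the readObject path quoted above), and calls toString on the copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.net.URL;

// Hypothetical diagnostic class; not part of the original job.
class UrlRoundTrip {

    // Serializes a URL with standard Java serialization, reads it back,
    // and returns toString() of the deserialized copy.
    static String roundTripToString(String spec) {
        try {
            URL original = new URL(spec);

            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(original);
            out.close();

            ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            URL copy = (URL) in.readObject();
            in.close();

            // toString() delegates to toExternalForm(), which uses the transient handler.
            return copy.toString();
        } catch (IOException | ClassNotFoundException e) {
            // MalformedURLException is an IOException, so it is covered here too.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Report which JRE actually runs the check.
        System.out.println("java.version = " + System.getProperty("java.version"));
        // Placeholder URL, assumed for illustration only.
        System.out.println(roundTripToString("http://example.com/schemas"));
    }
}
```

If this prints the URL on 1.7.0_71 but throws a NullPointerException on 1.7.0_76, that would support the JRE being at fault. If it succeeds on both, the null handler is more likely introduced somewhere else in the job.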