Spark-submit fails with yarn master, error: requirement failed at scala.Predef
My Spark job fails with the exception below, and I can't figure out which missing requirement is making the job fail:
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed
at scala.Predef$.require(Predef.scala:221)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$$anonfun$apply.apply(Client.scala:472)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$$anonfun$apply.apply(Client.scala:470)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources.apply(Client.scala:470)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources.apply(Client.scala:468)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:468)
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:727)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1021)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:742)
Spark-submit command:
spark-submit --conf spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/xyz/conf/log4j.xml \
-DHOME=/xyz/transformation -DENV=e1 \
-DJOB=xformation --conf spark.local.dir=/warehouse/tmp/spark1489619325 \
--queue dev --master yarn --deploy-mode cluster \
--properties-file /xyz/conf/job.conf \
--files /xyz/conf/e1.properties --class TransformationJob /xyz/job.jar
The same program runs fine with a local master.
Any suggestions would be a great help. Thanks in advance.
I had a huge list of jars on the classpath via the --jars option, and one of those jars was the culprit: once I removed it from --jars the problem went away. I'm still not sure why spark-submit failed because of that one jar.
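If the offending jar is not obvious, one way to narrow it down is to scan the --jars list for duplicates, since a resource added more than once to the distributed cache is exactly the situation the WARN line in the next answer points to. A rough, self-contained Scala sketch with placeholder paths (not the poster's actual jars):

object JarListCheck {
  // Flag entries that occur more than once, either as the exact same path or as
  // the same file name supplied from different directories; depending on the
  // Spark version, either kind of duplicate can trip the YARN client's
  // duplicate-resource check while preparing local resources.
  def report(jars: Seq[String]): Unit = {
    val samePath = jars.groupBy(identity).filter(_._2.size > 1)
    val sameName = jars.groupBy(p => p.substring(p.lastIndexOf('/') + 1))
                       .filter(_._2.distinct.size > 1)
    samePath.keys.foreach(p => println(s"exact duplicate path: $p"))
    sameName.foreach { case (n, ps) => println(s"name '$n' from: ${ps.mkString(", ")}") }
  }

  def main(args: Array[String]): Unit = {
    // Placeholder paths for illustration only.
    report(Seq(
      "/xyz/lib/common-utils.jar",
      "/xyz/other/common-utils.jar", // same name, different directory
      "/xyz/lib/parsers.jar",
      "/xyz/lib/parsers.jar"         // exact duplicate
    ))
  }
}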
I was getting a similar error:
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed
at scala.Predef$.require(Predef.scala:221)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$$anonfun$apply.apply(Client.scala:501)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$$anonfun$apply.apply(Client.scala:499)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources.apply(Client.scala:499)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources.apply(Client.scala:497)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:497)
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:763)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:143)
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1109)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1169)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
The fix:
In the terminal output or in the logs you will see a WARN line like this:
WARN Client: Resource file:... added multiple times to distributed cache.
Simply remove that extra jar from your spark-submit script. Hope this helps.
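As for why the exception itself says nothing more than "requirement failed": that is the default message of scala.Predef.require when it is called without a custom message, and the top of the stack traces above shows exactly such a require call inside Client.prepareLocalResources. A minimal Scala illustration:

object RequireDemo {
  def main(args: Array[String]): Unit = {
    try {
      // require with no message argument throws
      // IllegalArgumentException("requirement failed") when the condition is false.
      require(1 + 1 == 3)
    } catch {
      case e: IllegalArgumentException => println(e.getMessage) // prints "requirement failed"
    }
  }
}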
In my case the error was caused by my data files: I was pointing at the wrong training data location.
The training data and the test data did not match.
Training data: 0,1 0 0 1 0 0 1 0 0 1
Test data: 1, 2, 0, 0, 10
Once I corrected the path to the training data source, the problem was solved.
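For reference, here is a rough Scala sketch (not the poster's code; the parsing rules are guesses from the two sample lines above) of the kind of feature-length mismatch that can surface as the same bare "requirement failed", since MLlib-style code commonly guards vector dimensions with require:

object DataShapeCheck {
  // "0,1 0 0 1 0 0 1 0 0 1" -> label before the comma, then 10 space-separated features
  def parseTrain(line: String): Array[Double] =
    line.split(",", 2)(1).trim.split("\\s+").map(_.toDouble)

  // "1, 2, 0, 0, 10" -> label first, then 4 comma-separated features
  def parseTest(line: String): Array[Double] =
    line.split(",").map(_.trim).tail.map(_.toDouble)

  def main(args: Array[String]): Unit = {
    val train = parseTrain("0,1 0 0 1 0 0 1 0 0 1")
    val test  = parseTest("1, 2, 0, 0, 10")
    try {
      // A dimension guard of this shape fails with the same bare message
      // when the two files disagree (10 features vs 4 here).
      require(train.length == test.length)
    } catch {
      case e: IllegalArgumentException =>
        println(s"${e.getMessage}: train has ${train.length} features, test has ${test.length}")
    }
  }
}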