Spark-submit fails while passing --conf
I am trying to pass configuration to Spark on Amazon EMR as follows:
spark-submit --jars "/home/hadoop/transfer_cluster/run_spark/spark_jars/jars/trove-3.0.2.jar" --class SparkPTE bin/pte_sc.jar arabic_undirected -–conf spark.yarn.nodemanager.vmem-check-enabled=false
But I get the following error, because Spark cannot parse my configuration option:
18/04/06 07:48:22 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
Exception in thread "main" java.lang.NumberFormatException: For input string: "-–conf"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:580)
at java.lang.Integer.parseInt(Integer.java:615)
at SparkPTE.sparkContext(SparkPTE.java:91)
at SparkPTE.main(SparkPTE.java:79)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
at org.apache.spark.deploy.SparkSubmit$.doRunMain(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
If I put --conf before --jars, I get the following error:
spark-submit -–conf spark.yarn.nodemanager.vmem-check-enabled=false --jars "/home/hadoop/transfer_cluster/run_spark/spark_jars/jars/trove-3.0.2.jar" --class SparkPTE bin/pte_sc.jar arabic_undirected
Error: Unrecognized option: -–conf
The following works for me:
spark-submit --conf spark.yarn.nodemanager.vmem-check-enabled=false --jars "/home/hadoop/transfer_cluster/run_spark/spark_jars/jars/trove-3.0.2.jar" --class SparkPTE bin/pte_sc.jar arabic_undirected
You need to put the --conf option before the name of the jar you are trying to run, because everything that appears after the jar name is passed as an argument to that jar's main method. That is why your application's own Integer.parseInt crashed with NumberFormatException on the string "-–conf". Note also that both of your failing commands contain an en dash (–) instead of two ASCII hyphens in -–conf; spark-submit only recognizes --conf spelled with two hyphens, which is why it reported "Unrecognized option" even when the flag was placed before the jar.
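The split described above can be illustrated with a minimal sketch. This is a hypothetical simplification, not Spark's actual parser (`split_submit_args` and its assumption that each option takes exactly one value are inventions for illustration): options before the application jar belong to spark-submit, and everything after the jar is forwarded untouched to the application.

```python
def split_submit_args(argv):
    """Return (submit_options, app_jar, app_args).

    Hypothetical sketch of spark-submit's command-line split:
    flags before the jar are consumed by spark-submit itself;
    everything after the jar goes to the application's main().
    """
    submit_options = []
    i = 0
    # Consume "--flag value" pairs until we hit the application jar.
    while i < len(argv) and argv[i].startswith("--"):
        submit_options.append((argv[i], argv[i + 1]))
        i += 2
    app_jar = argv[i]
    app_args = argv[i + 1:]  # forwarded verbatim to the application
    return submit_options, app_jar, app_args

# A --conf placed after the jar lands in app_args, not submit_options,
# so the application sees "--conf" as an ordinary string argument:
opts, jar, args = split_submit_args([
    "--jars", "trove-3.0.2.jar",
    "--class", "SparkPTE",
    "bin/pte_sc.jar",
    "arabic_undirected",
    "--conf", "spark.yarn.nodemanager.vmem-check-enabled=false",
])
print(jar)   # bin/pte_sc.jar
print(args)  # ['arabic_undirected', '--conf', 'spark.yarn...=false']
```

In the failing command, SparkPTE received "--conf" where it expected a numeric argument, hence the NumberFormatException from Integer.parseInt.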