Spark Submit fails - /opt/cloudera/parcels/CDH/bin/spark-class: No such file or directory
I am following the Cloudera tutorial below and am on step "4. Submit the application with spark-submit". What am I doing wrong that makes the tutorial fail? I can find spark-shell and spark-submit in the /bin folder, but there is no spark-class.
https://www.cloudera.com/documentation/enterprise/5-5-x/topics/spark_streaming.html#streaming
export SPARK_HOME="/opt/cloudera/parcels/CDH"
spark-submit --master local[2] --conf "spark.dynamicAllocation.enabled=false" \
  --jars $SPARK_HOME/lib/spark/lib/spark-examples.jar \
  kafka_wordcount_keke.py localhost:2181 POCTopicKeke1
[Myadmin@Myclouderadatahub-mn0 lib]$ spark-submit --master local[2] --jars $SPARK_HOME/lib/spark/lib/spark-examples.jar kafka_wordcount_keke.py localhost:2181 POCTopicKeke1
/log/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/bin/../lib/spark/bin/spark-submit: line 27: /opt/cloudera/parcels/CDH/bin/spark-class: No such file or directory
/log/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/bin/../lib/spark/bin/spark-submit: line 27: exec: /opt/cloudera/parcels/CDH/bin/spark-class: cannot execute: No such file or directory
[Myadmin@Myclouderadatahub-mn0 lib]$
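A quick way to narrow this down (a sketch; the fallback parcel path is taken from the error message above) is to check what SPARK_HOME resolves to and whether that directory actually contains bin/spark-class:

```shell
# check_spark_home: report whether $SPARK_HOME (falling back to the CDH
# parcel root from the error message) actually contains bin/spark-class.
check_spark_home() {
  local home="${SPARK_HOME:-/opt/cloudera/parcels/CDH}"
  if [ -x "$home/bin/spark-class" ]; then
    echo "ok: $home/bin/spark-class"
  else
    echo "missing: $home/bin/spark-class"
  fi
}
check_spark_home
```

A "missing" verdict here is consistent with the behaviour in the question: on a CDH parcel install the parcel root's /bin holds only wrapper scripts, while the real Spark scripts (including spark-class) live under lib/spark/bin, so pointing SPARK_HOME at the parcel root makes spark-submit look for bin/spark-class in the wrong place.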
I hit a similar issue with CDH 5.13 and Spark 2.2:
/opt/cloudera/parcels/SPARK2-2.2.0.cloudera1-1.cdh5.12.0.p0.142354/bin/../lib/spark2/bin/pyspark: line 77: /opt/cloudera/parcels/SPARK2/bin/spark-submit: No such file or directory
After investigating, I found that I had manually set SPARK_HOME in /etc/profile to
export SPARK_HOME=/opt/cloudera/parcels/SPARK2
Even after commenting it out and reloading /etc/profile, it still did not work.
Workaround: the env command showed that SPARK_HOME was still set (strange), so I unset it with
unset SPARK_HOME
and it started working.
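The behaviour above is expected shell semantics rather than a CDH quirk: commenting an export out of /etc/profile only stops future logins from setting it; a value already exported into the running session survives until it is unset. A minimal reproduction:

```shell
# Simulate the stale variable: a value exported earlier in this session
# survives profile edits; only `unset` (or a fresh login shell) clears it.
export SPARK_HOME=/opt/cloudera/parcels/SPARK2   # stale value from the old profile
env | grep '^SPARK_HOME='                        # still visible to child processes
unset SPARK_HOME
env | grep '^SPARK_HOME=' || echo "SPARK_HOME cleared"
```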
I faced a similar problem with Spark 2.4.4:
bin/spark-submit --version
bin/spark-submit: line 27: /some/path/spark-2.4.4-bin-hadoop2.7/bin/spark-class: No such file or directory
bin/spark-submit: line 27: exec: /some/path/spark-2.4.4-bin-hadoop2.7/bin/spark-class: cannot execute: No such file or directory
Solution: define SPARK_HOME (I had not defined it):
export SPARK_HOME=/some/path/spark-2.4.4-bin-hadoop2.7
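To make that export survive new shells, it can also go in the shell startup file (a sketch; ~/.bashrc is an assumption for a bash login, and /some/path is the placeholder from the answer above, to be replaced with the real extraction directory):

```shell
# Persist SPARK_HOME for future shells; /some/path is the placeholder
# from the answer above -- substitute the real extraction directory.
echo 'export SPARK_HOME=/some/path/spark-2.4.4-bin-hadoop2.7' >> ~/.bashrc
# Apply it to the current shell as well:
export SPARK_HOME=/some/path/spark-2.4.4-bin-hadoop2.7
echo "$SPARK_HOME"
```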
I ran into the same Spark problem on CDH.
The crux of the issue is that CDH already sets SPARK_HOME in its spark-env.sh, and that setting overrides whatever SPARK_HOME the Linux environment provides.
If, as in my case, the company does not allow installing CDH's Spark client under /opt, change HADOOP_HOME and SPARK_HOME in spark-env.sh instead:
export SPARK_HOME=/home/Unionpay_Xzb/CDH/lib/spark
SPARK_PYTHON_PATH=""
if [ -n "$SPARK_PYTHON_PATH" ]; then
export PYTHONPATH="$PYTHONPATH:$SPARK_PYTHON_PATH"
fi
export HADOOP_HOME=/home/Unionpay_Xzb/CDH/lib/hadoop
export HADOOP_COMMON_HOME="$HADOOP_HOME"
With that, the detected user-defined SPARK_HOME is no longer overridden!