DSE 4.6 to DSE 4.7: Failed to find Spark assembly

After upgrading DSE 4.6 to 4.7, I have a problem with job-server-0.5.0. If I run server_start.sh I get the error "Failed to find Spark assembly in /usr/share/dse/spark/assembly/target/scala-2.10 You need to build Spark before running this program."

I found the code that raises this error in /usr/share/dse/spark/bin/compute-classpath.sh:

for f in ${assembly_folder}/spark-assembly*hadoop*.jar; do
  if [[ ! -e "$f" ]]; then
    echo "Failed to find Spark assembly in $assembly_folder" 1>&2
    echo "You need to build Spark before running this program." 1>&2
    exit 1
  fi
  ASSEMBLY_JAR="$f"
  num_jars=$((num_jars+1))
done

If I run /usr/share/dse/spark/bin/spark-submit I get the same error.

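If you are using DSE, you most likely should start spark-jobserver without going through compute-classpath.sh. You could try modifying the start script to use dse spark-submit instead, as in the example below.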

# job server jar needs to appear first so its deps take higher priority
# need to explicitly include app dir in classpath so logging configs can be found
#CLASSPATH="$appdir:$appdir/spark-job-server.jar:$($SPARK_HOME/bin/compute-classpath.sh)"

#exec java -cp $CLASSPATH $GC_OPTS $JAVA_OPTS $LOGGING_OPTS $CONFIG_OVERRIDES $MAIN $conffile 2>&1 &
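# spark-submit options must come before the application jar; anything after the jar is passed to the application
dse spark-submit --class $MAIN --driver-java-options "$GC_OPTS $JAVA_OPTS $LOGGING_OPTS" $appdir/spark-job-server.jar $conffile 2>&1 &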

https://github.com/spark-jobserver/spark-jobserver/blob/f5406a50406c59f26c878d7cee7334d6b9203312/bin/server_start.sh
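
For context on why the check in compute-classpath.sh fires: in sh and bash with default settings, a glob that matches nothing is left as the literal pattern, so the for loop runs once with a path that does not exist and the -e test fails. The sketch below reproduces that behaviour in a throwaway directory. The function name check_assembly and the demo jar name are made up for illustration and are not part of DSE or spark-jobserver.

```shell
# Hypothetical helper (not part of DSE or spark-jobserver): reports whether a
# directory contains a jar matching the pattern compute-classpath.sh looks for.
check_assembly() {
  assembly_folder="$1"
  for f in "$assembly_folder"/spark-assembly*hadoop*.jar; do
    # With default shell settings an unmatched glob stays as the literal
    # pattern, so "$f" names a file that does not exist.
    if [ ! -e "$f" ]; then
      echo "no assembly jar in $assembly_folder"
      return 1
    fi
    echo "found $f"
  done
}

# Demo on a temporary directory, so nothing under /usr/share/dse is touched.
demo_dir="$(mktemp -d)"
check_assembly "$demo_dir" || echo "check failed, as in the question"
touch "$demo_dir/spark-assembly-demo-hadoop.jar"   # made-up file name
check_assembly "$demo_dir"
rm -rf "$demo_dir"
```

Running the same function against the directory named in the error message would show whether any matching jar is present there; if none is, any script that goes through compute-classpath.sh will stop at this check.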