DataStax 6 standalone analytics server

I downloaded DataStax 6 and want to start a single node (on Mac El Capitan) for analytics (Spark would be fine, but Spark + Search would be better). I extracted the gz, set up the directory structure, and ran dse cassandra -ks. Startup seems to work fine and I can get to the Spark master; the problem is when I run dse spark-sql (or just dse spark). I keep getting the error below. Is it possible to set up a single node for development?

ERROR [ExecutorRunner for app-20180623083819-0000/212] 2018-06-23 08:40:28,323 SPARK-WORKER Logging.scala:91 - Error running executor
java.lang.IllegalStateException: Cannot find any build directories.
    at org.apache.spark.launcher.CommandBuilderUtils.checkState(CommandBuilderUtils.java:248) ~[spark-launcher_2.11-2.2.0.14.jar:2.2.0.14]
    at org.apache.spark.launcher.AbstractCommandBuilder.getScalaVersion(AbstractCommandBuilder.java:240) ~[spark-launcher_2.11-2.2.0.14.jar:2.2.0.14]
    at org.apache.spark.launcher.AbstractCommandBuilder.buildClassPath(AbstractCommandBuilder.java:194) ~[spark-launcher_2.11-2.2.0.14.jar:2.2.0.14]
    at org.apache.spark.launcher.AbstractCommandBuilder.buildJavaCommand(AbstractCommandBuilder.java:117) ~[spark-launcher_2.11-2.2.0.14.jar:2.2.0.14]
    at org.apache.spark.launcher.WorkerCommandBuilder.buildCommand(WorkerCommandBuilder.scala:39) ~[spark-core_2.11-2.2.0.14.jar:2.2.0.14]
    at org.apache.spark.launcher.WorkerCommandBuilder.buildCommand(WorkerCommandBuilder.scala:45) ~[spark-core_2.11-2.2.0.14.jar:2.2.0.14]
    at org.apache.spark.deploy.worker.CommandUtils$.buildCommandSeq(CommandUtils.scala:63) ~[spark-core_2.11-2.2.0.14.jar:6.0.0]
    at org.apache.spark.deploy.worker.CommandUtils$.buildProcessBuilder(CommandUtils.scala:51) ~[spark-core_2.11-2.2.0.14.jar:6.0.0]
    at org.apache.spark.deploy.worker.ExecutorRunner.fetchAndRunExecutor(ExecutorRunner.scala:150) ~[spark-core_2.11-2.2.0.14.jar:6.0.0]
    at org.apache.spark.deploy.worker.DseExecutorRunner$$anon.run(DseExecutorRunner.scala:80) [dse-spark-6.0.0.jar:6.0.0]
INFO  [dispatcher-event-loop-7] 2018-06-23 08:40:28,323 SPARK-WORKER Logging.scala:54 - Executor app-20180623083819-0000/212 finished with state FAILED message java.lang.IllegalStateException: Cannot find any build directories.
INFO  [dispatcher-event-loop-7] 2018-06-23 08:40:28,324 SPARK-MASTER Logging.scala:54 - Removing executor app-20180623083819-0000/212 because it is FAILED
INFO  [dispatcher-event-loop-0] 2018-06-23 08:40:30,288 SPARK-MASTER Logging.scala:54 - Received unregister request from application app-20180623083819-0000
INFO  [dispatcher-event-loop-0] 2018-06-23 08:40:30,292 SPARK-MASTER Logging.scala:54 - Removing app app-20180623083819-0000
INFO  [dispatcher-event-loop-0] 2018-06-23 08:40:30,295 SPARK-MASTER CassandraPersistenceEngine.scala:50 - Removing existing object 

Check the /var/lib/... directories referenced in dse.yaml; you should have write access to them. For example, check that the DSEFS directories are configured correctly, the AlwaysOn SQL directories, and so on.
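A quick way to sanity-check those permissions is a small shell loop. The paths below are placeholders (my assumption, not from dse.yaml); substitute whatever directories your dse.yaml actually points at:

```shell
#!/bin/sh
# Placeholder paths: replace with the directories your dse.yaml references
# (DSEFS data directories, AlwaysOn SQL log directories, etc.).
for dir in /var/lib/dsefs /var/lib/cassandra /var/log/cassandra; do
  if [ -d "$dir" ] && [ -w "$dir" ]; then
    echo "ok: $dir"
  else
    # directory is absent or this user cannot write to it
    echo "missing or not writable: $dir" >&2
  fi
done
```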

But really, the problem should be solved by starting DSE as dse cassandra -s -k rather than with the combined -ks flag...

P.S. I'm using the following script to point logs and the like at a specific directory:

export CASSANDRA_HOME=WHERE_YOU_EXTRACTED
export DATA_BASE_DIR=SOME_DIRECTORY
export DSE_DATA=${DATA_BASE_DIR}/data
export DSE_LOGS=${DATA_BASE_DIR}/logs
# set up where you want logs so you don't have to mess with logback.xml files
export CASSANDRA_LOG_DIR=$DSE_LOGS/cassandra
mkdir -p $CASSANDRA_LOG_DIR
# so we don't have to play with dse-spark-env.sh
export SPARK_WORKER_DIR=$DSE_DATA/spark/worker
# new setting in 6.0, in older versions set SPARK_LOCAL_DIRS
export SPARK_EXECUTOR_DIRS=$DSE_DATA/spark/rdd
export SPARK_LOCAL_DIRS=$DSE_DATA/spark/rdd
mkdir -p $SPARK_LOCAL_DIRS
export SPARK_WORKER_LOG_DIR=$DSE_DATA/spark/worker/
export SPARK_MASTER_LOG_DIR=$DSE_DATA/spark/master
# if you want to run the always on sql server
export ALWAYSON_SQL_LOG_DIR=$DSE_DATA/spark/alwayson_sql_server
export ALWAYSON_SQL_SERVER_AUDIT_LOG_DIR=$DSE_DATA/spark/alwayson_sql_server
# so tomcat logs for solr go to a place we know
export TOMCAT_LOGS=$DSE_LOGS/tomcat

PATH=${CASSANDRA_HOME}/bin:${CASSANDRA_HOME}/resources/cassandra/tools/bin:$PATH
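To use it, source the script in the shell where you start DSE, so the exports are visible to the launcher. A minimal sketch of that pattern (the file name dse-env.sh and the mktemp fallback are my assumptions, only there to keep the sketch self-contained):

```shell
#!/bin/sh
# dse-env.sh here is a trimmed stand-in for the full export script above.
cat > dse-env.sh <<'EOF'
export DATA_BASE_DIR="${DATA_BASE_DIR:-$(mktemp -d)}"
export DSE_DATA=${DATA_BASE_DIR}/data
export SPARK_LOCAL_DIRS=$DSE_DATA/spark/rdd
mkdir -p "$SPARK_LOCAL_DIRS"
EOF
. ./dse-env.sh   # source, don't execute: the exports must stay in this shell
echo "Spark scratch dir: $SPARK_LOCAL_DIRS"
# then, from the same shell:
#   dse cassandra -s -k
```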