How to get past "requires authentication" while connecting to a remote Cassandra cluster using SparkConf?

I am trying to do data analytics with Apache Spark and Cassandra, so I wrote the following Java code to access Cassandra running on a remote machine.

public class JavaDemo implements Serializable {
    private transient SparkConf conf;

    private JavaDemo(SparkConf conf) {
        this.conf = conf;
    }

    private void run() {
        JavaSparkContext sc = new JavaSparkContext(conf);
        generateData(sc);
        compute(sc);
        showResults(sc);
        sc.stop();
    }

    private void generateData(JavaSparkContext sc) {
        CassandraConnector connector = CassandraConnector.apply(sc.getConf());
        Session session = connector.openSession();

        // Prepare the schema
        session.execute("DROP KEYSPACE IF EXISTS java_api");
        session.execute("CREATE KEYSPACE java_api WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}");
        session.execute("CREATE TABLE java_api.products (id INT PRIMARY KEY, name TEXT, parents LIST<INT>)");
        session.execute("CREATE TABLE java_api.sales (id UUID PRIMARY KEY, product INT, price DECIMAL)");
        session.execute("CREATE TABLE java_api.summaries (product INT PRIMARY KEY, summary DECIMAL)");
    }

    private void compute(JavaSparkContext sc) {
        System.out.println("IN compute");
    }

    private void showResults(JavaSparkContext sc) {
        System.out.println("IN showResults");
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf();
        conf.setAppName("Java API demo");
        conf.setMaster("local[1]");
        System.out.println("---------------------------------");
        conf.set("spark.cassandra.connection.host", "192.168.1.219");

        JavaDemo app = new JavaDemo(conf);
        app.run();
    }
}

192.168.1.219 is my remote host, where Cassandra is running. The default port is 9160. When I run the program, I get the following error.

    15/01/29 10:14:26 INFO ui.SparkUI: Started Spark Web UI at http://Justin:4040
15/01/29 10:14:27 WARN core.FrameCompressor: Cannot find LZ4 class, you should make sure the LZ4 library is in the classpath if you intend to use it. LZ4 compression will not be available for the protocol.
Exception in thread "main" com.datastax.driver.core.exceptions.AuthenticationException: Authentication error on host /192.168.1.219:9042: Host /192.168.1.219:9042 requires authentication, but no authenticator found in Cluster configuration
    at com.datastax.driver.core.AuthProvider.newAuthenticator(AuthProvider.java:38)
    at com.datastax.driver.core.Connection.initializeTransport(Connection.java:139)
    at com.datastax.driver.core.Connection.<init>(Connection.java:111)
    at com.datastax.driver.core.Connection$Factory.open(Connection.java:445)
    at com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:216)
    at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:172)
    at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:80)
    at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:1145)
    at com.datastax.driver.core.Cluster.getMetadata(Cluster.java:313)
    at com.datastax.spark.connector.cql.CassandraConnector$.com$datastax$spark$connector$cql$CassandraConnector$$createSession(CassandraConnector.scala:166)
    at com.datastax.spark.connector.cql.CassandraConnector$$anonfun.apply(CassandraConnector.scala:151)
    at com.datastax.spark.connector.cql.CassandraConnector$$anonfun.apply(CassandraConnector.scala:151)
    at com.datastax.spark.connector.cql.RefCountedCache.createNewValueAndKeys(RefCountedCache.scala:36)
    at com.datastax.spark.connector.cql.RefCountedCache.acquire(RefCountedCache.scala:61)
    at com.datastax.spark.connector.cql.CassandraConnector.openSession(CassandraConnector.scala:72)
    at com.datastax.spark.demo.JavaDemo.generateData(JavaDemo.java:42)
    at com.datastax.spark.demo.JavaDemo.run(JavaDemo.java:34)
    at com.datastax.spark.demo.JavaDemo.main(JavaDemo.java:73)

Am I missing something? It connects directly to port 9042. How can I connect to it?

Your Cassandra cluster appears to be configured with authentication. Since you are not providing credentials, it won't let you connect. You can pass the credentials using the spark.cassandra.auth.username and spark.cassandra.auth.password properties, as described here.

So you can add:

conf.set("spark.cassandra.auth.username", "cassandra");            
conf.set("spark.cassandra.auth.password", "cassandra");

to your code to make this work.
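Hardcoding the password works for a quick test, but a common alternative is to read the credentials from the environment instead of baking them into the source. A minimal sketch, where the `CASSANDRA_USER`/`CASSANDRA_PASS` variable names and the `connectorSettings` helper are hypothetical (only the three `spark.cassandra.*` property keys come from the connector):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CassandraConf {
    // Hypothetical helper: collects the connector settings in one place.
    static Map<String, String> connectorSettings(String host, String user, String pass) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("spark.cassandra.connection.host", host);
        settings.put("spark.cassandra.auth.username", user);
        settings.put("spark.cassandra.auth.password", pass);
        return settings;
    }

    public static void main(String[] args) {
        // Assumed env var names; fall back to Cassandra's defaults for local testing.
        String user = System.getenv().getOrDefault("CASSANDRA_USER", "cassandra");
        String pass = System.getenv().getOrDefault("CASSANDRA_PASS", "cassandra");
        // In the real app you would apply each entry with conf.set(key, value),
        // then build the JavaSparkContext from that SparkConf as before.
        connectorSettings("192.168.1.219", user, pass)
            .forEach((k, v) -> System.out.println(k + " = " + (k.endsWith("password") ? "********" : v)));
    }
}
```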

If you have enabled authentication but haven't created/changed any users yet, you can use 'cassandra' as both the username and the password. In production, however, you should create a separate account and use that, and change the password of the cassandra user, since it has access to everything.
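Those account changes are made in CQL, for example via cqlsh while logged in as the default superuser. A sketch with placeholder names and passwords (CREATE USER syntax matches Cassandra 2.x with PasswordAuthenticator, the era of the stack trace above; newer versions express the same thing with CREATE ROLE):

```
-- Dedicated account for the Spark job; the name and password are examples.
CREATE USER spark_app WITH PASSWORD 's3cret' NOSUPERUSER;

-- Change the well-known default password of the superuser.
ALTER USER cassandra WITH PASSWORD 'a-new-strong-password';
```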