尝试 运行 一个简单的 spark streaming kafka 示例时出错

Getting an error while trying to run a simple spark streaming kafka example

我正在尝试 运行 一个简单的 kafka spark 流示例。这是我得到的错误。

16/10/02 20:45:43 INFO SparkEnv: Registering OutputCommitCoordinator Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.$scope()Lscala/xml/TopScope$; at org.apache.spark.ui.jobs.StagePage.(StagePage.scala:44) at org.apache.spark.ui.jobs.StagesTab.(StagesTab.scala:34) at org.apache.spark.ui.SparkUI.(SparkUI.scala:62) at org.apache.spark.ui.SparkUI$.create(SparkUI.scala:215) at org.apache.spark.ui.SparkUI$.createLiveUI(SparkUI.scala:157) at org.apache.spark.SparkContext.(SparkContext.scala:443) at org.apache.spark.streaming.StreamingContext$.createNewSparkContext(StreamingContext.scala:836) at org.apache.spark.streaming.StreamingContext.(StreamingContext.scala:84) at org.apache.spark.streaming.api.java.JavaStreamingContext.(JavaStreamingContext.scala:138) at com.application.SparkConsumer.App.main(App.java:27)

我正在使用以下 pom.xml 设置此示例。我试图找到这个缺失的 scala.Predef class,并为 spark-streaming-kafka-0-8-assembly 添加了缺失的依赖项,当我探索这个时我可以看到 class罐子。

<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka_2.11</artifactId>
    <version>0.8.2.0</version>
</dependency>
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>0.8.2.0</version>
</dependency>
<dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>2.0.0</version>
      <scope>provided</scope>
</dependency>
<dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.11</artifactId>
      <version>2.0.0</version>
      <scope>provided</scope>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka-0-8_2.11</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>2.11.0</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka-0-8-assembly_2.11</artifactId>
    <version>2.0.0</version>
</dependency>

我尝试了一个简单的 spark 字数统计示例,效果很好。当我使用这个 spark-streaming-kafka 时,我遇到了麻烦。我试图查找此错误,但没有成功。

这是代码片段。

        SparkConf sparkConf = new SparkConf().setAppName("someapp").setMaster("local[2]");
        // Create the context with 2 seconds batch size
        JavaStreamingContext jssc = new JavaStreamingContext(sparkConf, new Duration(2000));

        int numThreads = Integer.parseInt(args[3]);
        Map<String, Integer> topicMap = new HashMap<String,Integer>();
        topicMap.put("fast-messages", 1);
        Map<String, String> kafkaParams = new HashMap<String,String>();
        kafkaParams.put("metadata.broker.list", "localhost:9092");
        JavaPairReceiverInputDStream<String, String> messages = 
        KafkaUtils.createStream(jssc,"zoo1","my-consumer-group", topicMap); 

我用0.8.2.0 kafka的2.11好像有问题。切换到 2.10 后它工作正常。