Statistics.corr gives the following error in IntelliJ IDEA: Cannot resolve overloaded method 'corr'

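I am trying to follow this project: https://github.com/caroljmcdonald/spark-stock-sql/blob/master/src/main/scala/example/Stock.scala

In my IDE it reports the error "Cannot resolve overloaded method 'corr'" in the part of the code that computes the correlation between two columns read from a parquet file.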

    val df = sqlContext.read.parquet("joinstock.parquet")

    df.show
    df.printSchema

    df.explain()

    // COMMAND ----------

    //var agg_df = df.groupBy("location").agg(min("id"), count("id"), avg("date_diff"))
    df.select(year($"dt").alias("yr"), month($"dt").alias("mo"), $"apcclose", $"xomclose", $"spyclose").groupBy("yr", "mo").agg(avg("apcclose"), avg("xomclose"), avg("spyclose")).orderBy(desc("yr"), desc("mo")).show

    // COMMAND ----------

    df.select(year($"dt").alias("yr"), month($"dt").alias("mo"), $"apcclose", $"xomclose", $"spyclose").groupBy("yr", "mo").agg(avg("apcclose"), avg("xomclose"), avg("spyclose")).orderBy(desc("yr"), desc("mo")).explain

These are the lines for which IntelliJ reports "Cannot resolve overloaded method 'corr'":

    // COMMAND ----------
    var seriesX = df.select($"xomclose").map { row: Row => row.getAs[Double]("xomclose") } //.rdd
    var seriesY = df.select($"spyclose").map { row: Row => row.getAs[Double]("spyclose") } //.rdd
    var correlation = Statistics.corr(seriesX, seriesY, "pearson")

    // COMMAND ----------

    seriesX = df.select($"apcclose").map { row: Row => row.getAs[Double]("apcclose") } //.rdd
    seriesY = df.select($"xomclose").map { row: Row => row.getAs[Double]("xomclose") } //.rdd
    correlation = Statistics.corr(seriesX, seriesY, "pearson")

  }
}

You can try the DataFrame's own correlation method:

    var correlation = df.stat.corr("xomclose", "spyclose", "pearson")
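
If you would rather keep `Statistics.corr`, a likely cause of the error (assuming Spark 2.x or later, which the question does not state) is that `df.select(...).map { ... }` returns a `Dataset[Double]`, while the `corr` overloads in `org.apache.spark.mllib.stat.Statistics` expect an `RDD[Double]`. Converting to an RDD first, as the commented-out `//.rdd` in the question hints, should let the overload resolve. Below is a minimal sketch of both routes. The object name `CorrSketch`, the helper `corrViaRdd`, and the in-memory sample data are made up for illustration, and it assumes both columns are non-null `DoubleType`.

```scala
import org.apache.spark.mllib.stat.Statistics
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}

object CorrSketch {

  // Hypothetical helper: go through .rdd so Statistics.corr receives RDD[Double].
  // Assumes both columns are non-null DoubleType; getDouble fails otherwise.
  def corrViaRdd(df: DataFrame, colX: String, colY: String): Double = {
    val seriesX: RDD[Double] = df.select(colX).rdd.map(_.getDouble(0))
    val seriesY: RDD[Double] = df.select(colY).rdd.map(_.getDouble(0))
    Statistics.corr(seriesX, seriesY, "pearson")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("corr-sketch")
      .master("local[1]")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows standing in for joinstock.parquet.
    val df = Seq((1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0))
      .toDF("xomclose", "spyclose")

    // Route 1: mllib Statistics.corr on RDD[Double].
    println(corrViaRdd(df, "xomclose", "spyclose"))

    // Route 2: the DataFrame method from the answer above.
    println(df.stat.corr("xomclose", "spyclose", "pearson"))

    spark.stop()
  }
}
```

Note that `Statistics.corr` requires the two RDDs to have the same number of partitions and the same number of elements in each partition. Deriving both from the same DataFrame, as here, satisfies that. `df.stat.corr` avoids the issue entirely, which is one reason to prefer it when you only need the correlation of two columns.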