Issue with Spark's approxQuantile: not recognizing List<String>

I am using spark-sql 2.4.1 with Java 8 in my project.

I need to calculate the quantiles of some (computed) columns, namely con_dist_1 and con_dist_2, of the DataFrame df shown below:

+----+---------+-------------+----------+-----------+
|  id|     date|   revenue   |con_dist_1| con_dist_2|
+----+---------+-------------+----------+-----------+
|  10|1/15/2018|  0.010680705|         6|0.019875458|
|  10|1/15/2018|  0.006628853|         4|0.816039063|
|  10|1/15/2018|   0.01378215|         4|0.082049528|
|  10|1/15/2018|  0.010680705|         6|0.019875458|
|  10|1/15/2018|  0.006628853|         4|0.816039063|
|  10|1/15/2018|   0.01378215|         4|0.082049528|
|  10|1/15/2018|  0.010680705|         6|0.019875458|
|  10|1/15/2018|  0.010680705|         6|0.019875458|
|  10|1/15/2018|  0.014933087|         5|0.034681906|
|  10|1/15/2018|  0.014448282|         3|0.082049528|
+----+---------+-------------+----------+-----------+

List<String> calcColmns = Arrays.asList("con_dist_1", "con_dist_2");

When I try to use the first version of approxQuantile, i.e. approxQuantile(List<String>, List<Double>, double), as follows:

List<List<Double>> quants = df.stat().approxQuantile(calcColmns , Array(0.0,0.1,0.5),0.0);

I get this error:

The method approxQuantile(String, double[], double) in the type DataFrameStatFunctions is not applicable for the arguments (List, List, double)

What is wrong here? I am doing this in the Eclipse IDE. Why does it not pick the List<String> overload even though I am passing a List<String>?

I have added a snapshot of the API:

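It looks like this may be caused by using Array in the arguments to approxQuantile; Array(0.0,0.1,0.5) is Scala syntax, not Java, so it does not give you a List<Double>. The simplest fix is to use arrays for both the columns and the percentiles (this uses the third approxQuantile method in the API snapshot):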

String[] calcColmns = {"con_dist_1", "con_dist_2"};
double[] percentiles = {0.0,0.1,0.5};

Then call the function:

double[][] quants = df.stat().approxQuantile(calcColmns, percentiles, 0.0);
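
If the column names and probabilities already live in Java lists elsewhere in your code, you can convert them to arrays right before the call. The following is only a minimal sketch using the Java standard library: the class name QuantileArgs and its two helper methods are hypothetical names made up for illustration, not part of Spark, and the Spark call itself is left as a comment because no Dataset is created here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class (not part of Spark) for preparing approxQuantile arguments.
final class QuantileArgs {

    // Copies the column names from a List<String> into a String[].
    static String[] toColumnArray(List<String> columns) {
        return columns.toArray(new String[0]);
    }

    // Unboxes a List<Double> of probabilities into a primitive double[].
    static double[] toProbabilityArray(List<Double> probabilities) {
        return probabilities.stream().mapToDouble(Double::doubleValue).toArray();
    }

    public static void main(String[] args) {
        List<String> calcColmns = Arrays.asList("con_dist_1", "con_dist_2");
        List<Double> probs = Arrays.asList(0.0, 0.1, 0.5);

        String[] cols = toColumnArray(calcColmns);
        double[] percentiles = toProbabilityArray(probs);

        System.out.println(Arrays.toString(cols));
        System.out.println(Arrays.toString(percentiles));

        // Assuming a Dataset<Row> named df is in scope (not built in this sketch):
        // double[][] quants = df.stat().approxQuantile(cols, percentiles, 0.0);
    }
}
```

In the returned double[][], quants[i][j] is the j-th requested quantile of the i-th column, so with the arrays above quants[1][2] would be the median of con_dist_2.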