How do I calculate percentiles in Power Query based on grouping variables?
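I have a few columns of data, and I need to convert the Excel "PERCENTILE" function into Power Query format.
I have some code that I added as a function, but it does not give accurate results because it does not allow the data to be grouped by Category and Year. Anything in "Fully Discretionary 1.5-2.5" and 2014 needs to go into one percentile array; likewise, anything in "Fully Discretionary 2.5-3.5" and 2014 needs to go into a different percentile array.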
let
    Source = (list as any, k as number) => let
        Source = list,
        #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
        #"Sorted Rows" = Table.Sort(#"Converted to Table",{{"Column1", Order.Ascending}}),
        #"Added Index" = Table.AddIndexColumn(#"Sorted Rows", "Index", 1, 1),
        #"Added Custom" = Table.AddColumn(#"Added Index", "TheIndex", each Table.RowCount(#"Converted to Table")*k/100),
        #"Filtered Rows" = Table.SelectRows(#"Added Custom", each [Index] >= [TheIndex] and [Index] <= [TheIndex]+1),
        Custom1 = List.Average(#"Filtered Rows"[Column1])
    in
        Custom1
in
    Source
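So the expected result is that anything matching on the two columns (Year, Category) should go into the same array. At the moment, calling the function above just gives me errors.
I have also tried using Group By with the "Min, Median, and Max" outputs, but I also need the 10% and 90% percentiles.
Thanks in advance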
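After some findings on other sites and a lot of googling (most people just want to use DAX, but you can't if you are only using Power Query!), I found that someone had posted a very helpful answer.
Basically: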
//PercentileInclusive Function
(inputSeries as list, percentile as number) =>
let
    SeriesCount = List.Count(inputSeries),
    PercentileRank = percentile*(SeriesCount-1)+1, //percentile value between 0 and 1
    PercentileRankRoundedUp = Number.RoundUp(PercentileRank),
    PercentileRankRoundedDown = Number.RoundDown(PercentileRank),
    //value at the rank just below (or at) the target rank
    Percentile1 = List.Max(List.MinN(inputSeries,PercentileRankRoundedDown)),
    //value at the rank just above (or at) the target rank
    Percentile2 = List.Max(List.MinN(inputSeries,PercentileRankRoundedUp)),
    //linear interpolation between the two neighbouring values
    Percentile = Percentile1+(Percentile2-Percentile1)*(PercentileRank-PercentileRankRoundedDown)
in
    Percentile
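As a quick sanity check of the interpolation, here is a minimal sketch that inlines the same logic and runs it on a small list. The sample values and the step names (Sample, Median, Lower, Upper) are made up for illustration and are not part of the original answer.

```powerquery
let
    // Same logic as PercentileInclusive, inlined so this query runs on its own
    PercentileInclusive = (inputSeries as list, percentile as number) =>
        let
            SeriesCount = List.Count(inputSeries),
            PercentileRank = percentile * (SeriesCount - 1) + 1,
            RankUp = Number.RoundUp(PercentileRank),
            RankDown = Number.RoundDown(PercentileRank),
            Lower = List.Max(List.MinN(inputSeries, RankDown)),
            Upper = List.Max(List.MinN(inputSeries, RankUp)),
            Result = Lower + (Upper - Lower) * (PercentileRank - RankDown)
        in
            Result,

    // Hypothetical sample data; the input does not need to be sorted
    Sample = {40, 10, 30, 20},

    // Rank = 0.5 * (4 - 1) + 1 = 2.5, so the result lies halfway between
    // the 2nd smallest value (20) and the 3rd smallest value (30), i.e. 25
    Median = PercentileInclusive(Sample, 0.5)
in
    Median
```

Note that the percentile argument is a fraction between 0 and 1 (0.9 for the 90th percentile), not a number between 0 and 100 as in the function from the question.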
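The PercentileInclusive function replicates the PERCENTILE function found in Excel. Add it as its own query using "New Query" and the Advanced Editor, then use it in the step that groups your data. Start from a grouping formula such as: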
Table.Group(RenamedColumns, {"Country"}, {
    {"Sales Total", each List.Sum([Amount Sales]), type number},
    {"95 Percentile Sales", each List.Average([Amount Sales]), type number}})
In the above formula, RenamedColumns is the name of the previous step
in the script. Change the name to match your actual case. I've assumed
that the pre-grouping sales amount column is "Amount Sales". The names of
the grouped columns are "Sales Total" and "95 Percentile Sales".
Next modify the group formula, substituting List.Average with
PercentileInclusive:
Table.Group(RenamedColumns, {"Country"}, {
    {"Sales Total", each List.Sum([Amount Sales]), type number},
    {"95 Percentile Sales", each PercentileInclusive([Amount Sales],0.95), type number}})
This worked on my dataset, and the results match closely.
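For the case in the question (grouping on Year and Category, with 10% and 90% percentiles), the same pattern might look like the sketch below. The table contents, the "Value" column name and the output column names (P10, Median, P90) are assumptions for illustration; substitute your own previous step and column names.

```powerquery
let
    // Inlined so the sketch runs on its own; if PercentileInclusive already
    // exists as a separate query, this definition can be dropped
    PercentileInclusive = (inputSeries as list, percentile as number) =>
        let
            SeriesCount = List.Count(inputSeries),
            PercentileRank = percentile * (SeriesCount - 1) + 1,
            RankUp = Number.RoundUp(PercentileRank),
            RankDown = Number.RoundDown(PercentileRank),
            Lower = List.Max(List.MinN(inputSeries, RankDown)),
            Upper = List.Max(List.MinN(inputSeries, RankUp)),
            Result = Lower + (Upper - Lower) * (PercentileRank - RankDown)
        in
            Result,

    // Hypothetical sample data standing in for the previous step of the query
    Source = #table(
        type table [Year = Int64.Type, Category = text, Value = number],
        {
            {2014, "Fully Discretionary 1.5-2.5", 10},
            {2014, "Fully Discretionary 1.5-2.5", 20},
            {2014, "Fully Discretionary 1.5-2.5", 30},
            {2014, "Fully Discretionary 2.5-3.5", 5},
            {2014, "Fully Discretionary 2.5-3.5", 15},
            {2015, "Fully Discretionary 1.5-2.5", 40},
            {2015, "Fully Discretionary 1.5-2.5", 60}
        }
    ),

    // Each distinct (Year, Category) pair gets its own list of values,
    // so each percentile is computed per group rather than over the whole column
    Grouped = Table.Group(
        Source,
        {"Year", "Category"},
        {
            {"P10", each PercentileInclusive([Value], 0.1), type number},
            {"Median", each List.Median([Value]), type number},
            {"P90", each PercentileInclusive([Value], 0.9), type number}
        }
    )
in
    Grouped
```

Listing both columns in the second argument of Table.Group is what keeps rows from different years or categories out of each other's percentile array.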