Select row with MAX value per category Power BI
How can I select the row with the max value per category in M in Power BI? Suppose we have this table:
+----------+-------+------------+
| Category | Value | Date |
+----------+-------+------------+
| apples | 1 | 2018-07-01 |
| apples | 2 | 2018-07-02 |
| apples | 3 | 2018-07-03 |
| bananas | 7 | 2018-07-04 |
| bananas | 8 | 2018-07-05 |
| bananas | 9 | 2018-07-06 |
+----------+-------+------------+
The desired result is:
+----------+-------+------------+
| Category | Value | Date |
+----------+-------+------------+
| apples | 3 | 2018-07-03 |
| bananas | 9 | 2018-07-06 |
+----------+-------+------------+
Here is the starting table for PBI:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Category", type text}, {"Value", Int64.Type}, {"Date", type date}})
in
#"Changed Type"
I wonder whether there is a way to reach the desired result in later steps, within a single table, by adding some magic column IsMax:
+----------+-------+------------+-------+
| Category | Value | Date | IsMax |
+----------+-------+------------+-------+
| apples | 1 | 2018-07-01 | 0 |
| apples | 2 | 2018-07-02 | 0 |
| apples | 3 | 2018-07-03 | 1 |
| bananas | 7 | 2018-07-04 | 0 |
| bananas | 8 | 2018-07-05 | 0 |
| bananas | 9 | 2018-07-06 | 1 |
+----------+-------+------------+-------+
Doing a basic grouping in the Power Query editor (group by Category and take the max of Value) gives this table:
+----------+-------+
| Category | Value |
+----------+-------+
| apples | 3 |
| bananas | 9 |
+----------+-------+
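Add a custom column IsMax to this table, which is simply the value 1, and then merge it with the original table (left outer join), matching on Category and Value. Finally, expand the IsMax column to get the desired table, except that it has null instead of 0. You can replace the null values if you like.
Here is the M code for all of these steps: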
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}, {"Date", type date}, {"Category", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Category"}, {{"Value", each List.Max([Value]), Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "IsMax", each 1, Int64.Type),
#"Merged Queries" = Table.NestedJoin(#"Changed Type",{"Category", "Value"},#"Added Custom",{"Category", "Value"},"Added Custom",JoinKind.LeftOuter),
#"Expanded Added Custom" = Table.ExpandTableColumn(#"Merged Queries", "Added Custom", {"IsMax"}, {"IsMax"})
in
#"Expanded Added Custom"
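For the optional null replacement mentioned above, here is a minimal sketch. It assumes an inline #table as a stand-in for the compressed sample data, and the step names (Grouped, Flagged, ReplacedNulls and so on) are my own, not from the query above. The only new call is Table.ReplaceValue on the IsMax column.

```powerquery
let
    // Inline stand-in for the sample table (assumption: same columns and types)
    Source = #table(
        type table [Category = text, Value = Int64.Type, Date = date],
        {
            {"apples", 1, #date(2018, 7, 1)},
            {"apples", 2, #date(2018, 7, 2)},
            {"apples", 3, #date(2018, 7, 3)},
            {"bananas", 7, #date(2018, 7, 4)},
            {"bananas", 8, #date(2018, 7, 5)},
            {"bananas", 9, #date(2018, 7, 6)}
        }
    ),
    // Max Value per Category
    Grouped = Table.Group(Source, {"Category"}, {{"Value", each List.Max([Value]), Int64.Type}}),
    // Flag every max row with 1
    Flagged = Table.AddColumn(Grouped, "IsMax", each 1, Int64.Type),
    // Left outer join back onto the original rows
    Merged = Table.NestedJoin(Source, {"Category", "Value"}, Flagged, {"Category", "Value"}, "Flagged", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Merged, "Flagged", {"IsMax"}, {"IsMax"}),
    // Rows without a match come back as null; turn those into 0
    ReplacedNulls = Table.ReplaceValue(Expanded, null, 0, Replacer.ReplaceValue, {"IsMax"})
in
    ReplacedNulls
```

This is the same step the GUI generates from Replace Values on the IsMax column.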
Edit: A slightly simplified version that reproduces the "desired result" rather than the IsMax version:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}, {"Date", type date}, {"Category", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Category"}, {{"Value", each List.Max([Value]), Int64.Type}}),
#"Merged Queries" = Table.NestedJoin(#"Grouped Rows", {"Category", "Value"}, #"Changed Type", {"Category", "Value"}, "Grouped Rows", JoinKind.LeftOuter),
#"Expanded Grouped Rows" = Table.ExpandTableColumn(#"Merged Queries", "Grouped Rows", {"Date"}, {"Date"})
in
#"Expanded Grouped Rows"
Edit 2: @user11632362 pointed me to another solution with even fewer steps.
Everything happens in the grouping step.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}, {"Date", type date}, {"Category", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Category"}, {{"Value", each List.Max([Value]), Int64.Type}, {"Date", each Table.Max(_, "Value")[Date], type date}})
in
#"Grouped Rows"
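The key here is each Table.Max(_, "Value")[Date]. This effectively sorts the subtable by Value in descending order and returns the first row of the sorted result as a record (and the [Date] suffix returns the value of that record's Date field).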
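To see what that expression does in isolation, here is a small sketch applied to a single subtable. The Apples table is a hand-written stand-in for what Table.Group passes in as _ for the apples category, and the step names are my own.

```powerquery
let
    // Hypothetical subtable, as the grouping step would see it for "apples"
    Apples = #table(
        type table [Category = text, Value = Int64.Type, Date = date],
        {
            {"apples", 1, #date(2018, 7, 1)},
            {"apples", 2, #date(2018, 7, 2)},
            {"apples", 3, #date(2018, 7, 3)}
        }
    ),
    // Table.Max returns the whole row with the largest Value, as a record
    TopRow = Table.Max(Apples, "Value"),
    // Field access on that record pulls out just the Date
    TopDate = TopRow[Date]   // evaluates to #date(2018, 7, 3)
in
    TopDate
```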
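Note that this only pulls over a single column, Date. If you need to pull in a bunch of columns, it may make more sense to return the full record and expand all the needed fields in another step, rather than adding more columns to the grouping step.
For example,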
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6hAolKsDpIaI2Q1RlCBJFQ1xshqjKECyWA1SYl5QAhSZI6syATIAeEUNEUWyIpMgRwQTkVTZImsyAzIAeE0pdhYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t, Col1 = _t, Col2 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Category", type text}, {"Value", Int64.Type}, {"Date", type date}, {"Col1", Int64.Type}, {"Col2", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Category"}, {{"Value", each List.Max([Value]), Int64.Type}, {"TopValueRow", each Table.Max(_, "Value"), type record}}),
#"Expanded TopValueRow" = Table.ExpandRecordColumn(#"Grouped Rows", "TopValueRow", {"Date", "Col1", "Col2"}, {"Date", "Col1", "Col2"})
in
#"Expanded TopValueRow"
I ended up getting the MAX per category by way of an index. The idea is described here:
Approach #1 is a one-liner in an R transformation:
library(dplyr)
output <- dataset %>% group_by(Category) %>% mutate(row_no_by_category = row_number(desc(Date)))
Approach #2, done entirely in PBI:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
#"Grouped rows" = Table.Group(Source, {"Category"}, {{"NiceTable", each Table.AddIndexColumn(Table.Sort(_,{{"Date", Order.Descending}} ), "Index",1,1), type table}} ),
#"Expanded NiceTable" = Table.ExpandTableColumn(#"Grouped rows", "NiceTable", {"Value", "Date", "Index"}, {"Value", "Date", "Index"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded NiceTable", each ([Index] = 1))
in
#"Filtered Rows"
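Note that the query above ranks rows by Date, which only coincides with the largest Value because both increase together in the sample data. Below is a sketch of the same index idea ranking by Value instead. It assumes typed columns (sorting Value as text would put "9" ahead of "10"), uses an inline #table in place of the compressed sample, and the step names are my own.

```powerquery
let
    // Inline stand-in for the sample table, with Value typed as a number
    Source = #table(
        type table [Category = text, Value = Int64.Type, Date = date],
        {
            {"apples", 1, #date(2018, 7, 1)},
            {"apples", 2, #date(2018, 7, 2)},
            {"apples", 3, #date(2018, 7, 3)},
            {"bananas", 7, #date(2018, 7, 4)},
            {"bananas", 8, #date(2018, 7, 5)},
            {"bananas", 9, #date(2018, 7, 6)}
        }
    ),
    // Per category: sort by Value descending, then number the rows from 1
    Grouped = Table.Group(
        Source,
        {"Category"},
        {{"Ranked", each Table.AddIndexColumn(Table.Sort(_, {{"Value", Order.Descending}}), "Index", 1, 1), type table}}
    ),
    Expanded = Table.ExpandTableColumn(Grouped, "Ranked", {"Value", "Date", "Index"}, {"Value", "Date", "Index"}),
    // Index 1 is the row with the largest Value in each category
    TopRows = Table.SelectRows(Expanded, each [Index] = 1)
in
    TopRows
```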
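Another approach is to use the Remove Duplicates feature, but this requires sorting the data correctly first, so that the first row that appears is the right one to select.
For technical reasons (see the linked Stack Overflow posts and article), we need to buffer the table into memory to make sure the sort is "stable".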
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}, {"Date", type date}, {"Category", type text}}),
#"Sorted Rows" = Table.Sort(#"Changed Type",{{"Category", Order.Ascending}, {"Value", Order.Descending}}),
#"Removed Duplicates" = Table.Distinct(Table.Buffer(#"Sorted Rows"), {"Category"})
in
#"Removed Duplicates"
All of this can be done in the GUI, except for editing in the Table.Buffer wrapper in the last step.
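If you would rather not edit inside the generated last step, one option is to give the buffer its own step. A sketch, again with an inline #table standing in for the sample data and step names of my own choosing:

```powerquery
let
    // Inline stand-in for the sample table
    Source = #table(
        type table [Category = text, Value = Int64.Type, Date = date],
        {
            {"apples", 1, #date(2018, 7, 1)},
            {"apples", 2, #date(2018, 7, 2)},
            {"apples", 3, #date(2018, 7, 3)},
            {"bananas", 7, #date(2018, 7, 4)},
            {"bananas", 8, #date(2018, 7, 5)},
            {"bananas", 9, #date(2018, 7, 6)}
        }
    ),
    // Largest Value first within each Category
    Sorted = Table.Sort(Source, {{"Category", Order.Ascending}, {"Value", Order.Descending}}),
    // Load the sorted rows into memory so the order is fixed before deduplicating
    Buffered = Table.Buffer(Sorted),
    // Keep the first row seen for each Category
    Deduped = Table.Distinct(Buffered, {"Category"})
in
    Deduped
```

The behavior should be the same as wrapping inline; the separate step just keeps the Remove Duplicates step as the GUI generated it.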