Excel Power Query - aggregate continuous "transitive" overlapping time intervals
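I am trying to aggregate Table 1 below into Table 2 with Excel Power Query.
The goal is to combine continuous time intervals of the same group into a single row. For direct overlaps, such as events 5 and 6, this is easy. But that approach only merges events 1 and 2, and events 2 and 3, which leaves two entries (see Table 1b).
The problem is the "transitive" dependency: event 1 is connected to event 3 through event 2. Such a dependency can span more than 3 rows, so the transitive closure has to be determined.
In a general-purpose language this could be done by applying the current solution iteratively until nothing changes anymore. But how can it be done in Power Query?
Table 1 (original):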
Event ID | Group | Start | End |
---|---|---|---|
1 | A | 20.01.2022 12:00:00 | 20.01.2022 12:02:00 |
2 | A | 20.01.2022 12:01:00 | 20.01.2022 12:04:20 |
3 | A | 20.01.2022 12:03:10 | 20.01.2022 12:06:00 |
4 | A | 20.01.2022 12:08:00 | 20.01.2022 12:10:00 |
5 | B | 20.01.2022 12:00:50 | 20.01.2022 12:02:00 |
6 | B | 20.01.2022 12:01:00 | 20.01.2022 12:05:00 |
7 | B | 20.01.2022 12:06:00 | 20.01.2022 12:11:00 |
Table 1b (current intermediate result):
Event ID | Group | Start | End |
---|---|---|---|
1 | A | 20.01.2022 12:00:00 | 20.01.2022 12:04:20 |
2 | A | 20.01.2022 12:01:00 | 20.01.2022 12:06:00 |
4 | A | 20.01.2022 12:08:00 | 20.01.2022 12:10:00 |
5 | B | 20.01.2022 12:00:50 | 20.01.2022 12:05:00 |
7 | B | 20.01.2022 12:06:00 | 20.01.2022 12:11:00 |
Table 2 (desired result):
Event ID | Group | Start | End |
---|---|---|---|
1 | A | 20.01.2022 12:00:00 | 20.01.2022 12:06:00 |
4 | A | 20.01.2022 12:08:00 | 20.01.2022 12:10:00 |
5 | B | 20.01.2022 12:00:50 | 20.01.2022 12:05:00 |
7 | B | 20.01.2022 12:06:00 | 20.01.2022 12:11:00 |
Edit
Example that is not fully aggregated by the solution provided earlier:
Event ID | Group | Start | End |
---|---|---|---|
1 | A | 20.01.2022 12:02:12 | 20.01.2022 12:05:34 |
2 | A | 20.01.2022 12:02:54 | 20.01.2022 12:05:37 |
3 | A | 20.01.2022 12:05:36 | 20.01.2022 12:05:49 |
4 | A | 20.01.2022 12:05:45 | 20.01.2022 12:07:22 |
5 | A | 20.01.2022 12:06:03 | 20.01.2022 12:06:10 |
Result (previous solution):
Event ID | Group | Start | End |
---|---|---|---|
1 | A | 20.01.2022 12:02:12 | 20.01.2022 12:07:22 |
5 | A | 20.01.2022 12:02:54 | 20.01.2022 12:07:22 |
Result (accepted answer):
Event ID | Group | Start | End |
---|---|---|---|
1 | A | 20.01.2022 12:02:12 | 20.01.2022 12:07:22 |
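This was an interesting one. Updated based on feedback.
Step 1: Create a separate query and name it process (the function refers to itself by that name). Close & Load it before continuing.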
```powerquery
(xtable) =>
// For each row, compare its index list against every list in column Custom
// of the same group, and merge the lists that share at least one index
let
    Source = Table.Buffer(xtable),
    #"Added Custom" = Table.AddColumn(
        Source,
        "Custom2",
        each let
            begin = [Custom],
            mygroup = [Group]
        in
            List.Accumulate(
                Table.SelectRows(Source, each [Group] = mygroup)[Custom],
                begin,
                (state, current) =>
                    if List.ContainsAny(state, current)
                    then List.Distinct(List.Combine({current, state}))
                    else state
            )
    ),
    // Count the rows whose list changed in this pass. If the count is not zero, recurse.
    x = List.Sum(
        List.Transform(
            List.Positions(#"Added Custom"[Custom]),
            each if #"Added Custom"[Custom]{_} = #"Added Custom"[Custom2]{_} then 0 else 1
        )
    ),
    RemovePrioCustom = Table.RemoveColumns(#"Added Custom", {"Custom"}),
    AddNewCustom = Table.RenameColumns(RemovePrioCustom, {{"Custom2", "Custom"}}),
    // @process is a self-reference, so this query has to be named "process"
    recursive = if x = 0 then AddNewCustom else @process(AddNewCustom)
in
    recursive
```
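To see why a single pass is not enough, the `List.Accumulate` step can be tried in isolation. The sketch below is only an illustration: the index lists are written out by hand for the five events of the edit example (Index 0 to 4, each event overlapping only its neighbours), and `OnePass` is a hypothetical helper name, not part of the answer's code. The `List.Sort` is added purely to make the result easier to read.

```powerquery
let
    // Hand-written stand-in for the "Custom" column of the edit example:
    // row k lists the Index values of the rows that directly overlap row k
    Lists = {{0, 1}, {0, 1, 2}, {1, 2, 3}, {2, 3, 4}, {3, 4}},

    // One pass for one starting list: absorb every list that shares an index
    // with the running state (same merge rule as in the function)
    OnePass = (begin as list) as list =>
        List.Sort(
            List.Accumulate(
                Lists,
                begin,
                (state, current) =>
                    if List.ContainsAny(state, current)
                    then List.Distinct(List.Combine({current, state}))
                    else state
            )
        ),

    FirstRow = OnePass(Lists{0}),  // {0, 1, 2, 3, 4}: the chain is followed in row order
    LastRow = OnePass(Lists{4})    // {1, 2, 3, 4}: index 0 is still missing after one pass
in
    [First = FirstRow, Last = LastRow]
```

The last row only reaches index 0 once the list of an earlier row has already grown, which is what the recursive call provides.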
Step 2: Code for the table query, which uses the function above:
```powerquery
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source, {{"Start", type datetime}, {"End", type datetime}}),
    // Zero-based index, used below as a row position
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1, Int64.Type),
    // For each row i, list the Index of every row in the same group whose interval overlaps i's
    #"Added Custom" = Table.AddColumn(#"Added Index", "Custom",
        (i) => Table.SelectRows(#"Added Index", each [Group] = i[Group] and
            ([Start] >= i[Start] and [End] <= i[End] or
             [Start] <= i[Start] and [End] >= i[End] or
             [Start] <= i[Start] and [End] <= i[End] and [End] >= i[Start] or
             [Start] >= i[Start] and [End] >= i[End] and [Start] <= i[End])
        )[Index]
    ),
    // Expand every list to all transitively connected rows (function from step 1)
    MergeOverlap = process(#"Added Custom"),
    // Earliest Start and latest End over all connected rows
    #"Added Custom1" = Table.AddColumn(MergeOverlap, "StartMin", each List.Min(List.Transform([Custom], each MergeOverlap[Start]{_})), type datetime),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "EndMax", each List.Max(List.Transform([Custom], each #"Added Custom1"[End]{_})), type datetime),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom2", {"Start", "End", "Index", "Custom"}),
    // Connected rows now carry identical values, so keep one row per merged interval
    #"Removed Duplicates" = Table.Distinct(#"Removed Columns", {"Group", "StartMin", "EndMax"})
in
    #"Removed Duplicates"
```
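As a side note, the four-case comparison that builds the `Custom` column should reduce to a single two-sided test, provided every row has Start <= End. The sketch below illustrates that test on events 1 to 3 of Table 1; `Overlaps` is a hypothetical helper name introduced only for this example.

```powerquery
let
    // Hypothetical helper: two closed intervals overlap when each one
    // starts no later than the other one ends (assumes Start <= End)
    Overlaps = (a as record, b as record) as logical =>
        a[Start] <= b[End] and a[End] >= b[Start],

    // Events 1 to 3 of group A from Table 1
    E1 = [Start = #datetime(2022, 1, 20, 12, 0, 0), End = #datetime(2022, 1, 20, 12, 2, 0)],
    E2 = [Start = #datetime(2022, 1, 20, 12, 1, 0), End = #datetime(2022, 1, 20, 12, 4, 20)],
    E3 = [Start = #datetime(2022, 1, 20, 12, 3, 10), End = #datetime(2022, 1, 20, 12, 6, 0)],

    Checks = [
        E1_E2 = Overlaps(E1, E2),  // true
        E2_E3 = Overlaps(E2, E3),  // true
        E1_E3 = Overlaps(E1, E3)   // false: linked only through event 2
    ]
in
    Checks
```

The last check is the transitive case from the question: events 1 and 3 never overlap directly, so the index lists have to be merged afterwards.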
The function calls itself until no more changes occur, so it should work to whatever depth is needed.
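For comparison, here is a non-recursive sketch that uses a different technique from the function above: sort each group by Start and sweep through it once, extending the current interval while the next Start does not lie past the running End. It relies on the assumption that overlap on a single timeline is fully decided by sorted order. The sample data is Table 1 typed in by hand, `MergeGroup` is a hypothetical helper name, and the Event ID column is left out for brevity.

```powerquery
let
    // Table 1 from the question, typed in by hand so the sketch is self-contained
    Source = #table(
        type table [Group = text, Start = datetime, End = datetime],
        {
            {"A", #datetime(2022, 1, 20, 12, 0, 0),  #datetime(2022, 1, 20, 12, 2, 0)},
            {"A", #datetime(2022, 1, 20, 12, 1, 0),  #datetime(2022, 1, 20, 12, 4, 20)},
            {"A", #datetime(2022, 1, 20, 12, 3, 10), #datetime(2022, 1, 20, 12, 6, 0)},
            {"A", #datetime(2022, 1, 20, 12, 8, 0),  #datetime(2022, 1, 20, 12, 10, 0)},
            {"B", #datetime(2022, 1, 20, 12, 0, 50), #datetime(2022, 1, 20, 12, 2, 0)},
            {"B", #datetime(2022, 1, 20, 12, 1, 0),  #datetime(2022, 1, 20, 12, 5, 0)},
            {"B", #datetime(2022, 1, 20, 12, 6, 0),  #datetime(2022, 1, 20, 12, 11, 0)}
        }
    ),

    // Hypothetical helper: merge the intervals of one group in a single sweep
    MergeGroup = (t as table) as table =>
        let
            Sorted = Table.ToRecords(Table.Sort(t, {{"Start", Order.Ascending}})),
            Merged = List.Accumulate(
                Sorted,
                {},
                (acc, row) =>
                    let
                        last = List.Last(acc)
                    in
                        if List.IsEmpty(acc) or row[Start] > last[End]
                        // Gap before this row: start a new interval
                        then acc & {[Start = row[Start], End = row[End]]}
                        // Overlap or touch: extend the interval collected last
                        else List.RemoveLastN(acc, 1)
                            & {[Start = last[Start], End = List.Max({last[End], row[End]})]}
            )
        in
            Table.FromRecords(Merged, type table [Start = datetime, End = datetime]),

    Grouped = Table.Group(Source, {"Group"}, {{"Merged", MergeGroup, type table}}),
    Result = Table.ExpandTableColumn(Grouped, "Merged", {"Start", "End"})
in
    Result
```

On this input the result should match Table 2 without the Event ID column: two rows for group A (12:00:00 to 12:06:00 and 12:08:00 to 12:10:00) and two for group B (12:00:50 to 12:05:00 and 12:06:00 to 12:11:00). Intervals that merely touch are merged, in line with the inclusive comparisons in step 2.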