PowerBI - Comparing two sets of data on timestamp
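I am trying to compare two sets of data that are similar. They do not share all the same columns, but I only need to compare the employee ID, start time, and end time. I have joined the tables on employee ID. What I really need is to see whether the start and end times in the two tables overlap each other.
Here is the data from dataset 1: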
Emp ID | Start time | End Time
test-a | 11/14/2019 6:48 AM | 11/14/2019 7:35 AM
test-a | 11/14/2019 9:02 AM | 11/14/2019 11:46 AM
test-a | 11/14/2019 1:00 PM | 11/14/2019 2:00 PM
test-a | 11/14/2019 5:00 PM | 11/14/2019 9:15 PM
Here is the data from dataset 2:
Emp ID | Start time | End Time
test-a | 11/16/2019 4:48 AM | 11/16/2019 7:35 AM
test-a | 11/17/2019 9:02 AM | 11/17/2019 9:46 AM
test-a | 11/14/2019 7:00 PM | 11/14/2019 8:00 PM
test-a | 11/14/2019 5:00 PM | 11/14/2019 9:15 PM
Desired output:
Emp ID | Start time | End Time
test-a | 11/14/2019 5:00 PM | 11/14/2019 9:15 PM
test-a | 11/14/2019 7:00 PM | 11/14/2019 8:00 PM
test-a | 11/14/2019 5:00 PM | 11/14/2019 9:15 PM
Can someone help me solve this in PowerBI? Thanks in advance.
If I understand correctly, time overlap can be defined as follows.
Given two time periods starting and ending at (Start1, End1), (Start2, End2) respectively,
time overlap is the time period (Start3, End3) if Start3 < End3,
where
Start3 = MAX( Start1, Start2 )
and
End3 = MIN( End1, End2 )
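The definition above can be checked outside Power BI with a minimal Python sketch. The helper names `overlap` and `ts` are hypothetical and only serve this illustration; the timestamps are taken from the sample rows in the question.

```python
from datetime import datetime


def ts(text):
    """Parse a timestamp written like '11/14/2019 5:00 PM' (hypothetical helper)."""
    return datetime.strptime(text, "%m/%d/%Y %I:%M %p")


def overlap(start1, end1, start2, end2):
    """Return (start3, end3) if the two periods overlap, otherwise None."""
    start3 = max(start1, start2)  # Start3 = MAX( Start1, Start2 )
    end3 = min(end1, end2)        # End3 = MIN( End1, End2 )
    return (start3, end3) if start3 < end3 else None


# Dataset 1, row 4 against dataset 2, row 3: the second period lies inside the first.
a = overlap(ts("11/14/2019 5:00 PM"), ts("11/14/2019 9:15 PM"),
            ts("11/14/2019 7:00 PM"), ts("11/14/2019 8:00 PM"))

# Dataset 1, row 1 against dataset 2, row 1: different days, so no overlap.
b = overlap(ts("11/14/2019 6:48 AM"), ts("11/14/2019 7:35 AM"),
            ts("11/16/2019 4:48 AM"), ts("11/16/2019 7:35 AM"))

print(a)
print(b)
```

Because the comparison is strict (Start3 < End3), two periods that merely touch, with one ending exactly when the other starts, are not counted as overlapping.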
You can extract the overlaps by taking the Cartesian product of Dataset1 and Dataset2 and comparing the rows pair by pair.
Here is an example of a calculated table using DAX.
Time Overlap =
SELECTCOLUMNS(
    FILTER(
        // Pair every row of Dataset1 with every row of Dataset2.
        // The columns are renamed because CROSSJOIN requires unique column names.
        CROSSJOIN(
            SELECTCOLUMNS(
                Dataset1,
                "EmployeeID1", Dataset1[EmployeeID],
                "StartTime1", Dataset1[StartTime],
                "EndTime1", Dataset1[EndTime]
            ),
            SELECTCOLUMNS(
                Dataset2,
                "EmployeeID2", Dataset2[EmployeeID],
                "StartTime2", Dataset2[StartTime],
                "EndTime2", Dataset2[EndTime]
            )
        ),
        // Keep only pairs for the same employee whose periods overlap.
        [EmployeeID1] = [EmployeeID2]
            && MAX( [StartTime1], [StartTime2] ) < MIN( [EndTime1], [EndTime2] )
    ),
    // Return the overlapping period itself.
    "EmployeeID", [EmployeeID1],
    "StartTime", MAX( [StartTime1], [StartTime2] ),
    "EndTime", MIN( [EndTime1], [EndTime2] )
)
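To see which rows the cross-join-and-filter approach keeps on the sample data, the same logic can be sketched in Python. This is not part of the DAX solution; `time_overlaps`, `ts`, and the in-memory lists are hypothetical stand-ins for the calculated table and the two datasets.

```python
from datetime import datetime
from itertools import product


def ts(text):
    """Parse a timestamp written like '11/14/2019 5:00 PM' (hypothetical helper)."""
    return datetime.strptime(text, "%m/%d/%Y %I:%M %p")


# Sample rows from the question: (employee ID, start time, end time).
dataset1 = [
    ("test-a", ts("11/14/2019 6:48 AM"), ts("11/14/2019 7:35 AM")),
    ("test-a", ts("11/14/2019 9:02 AM"), ts("11/14/2019 11:46 AM")),
    ("test-a", ts("11/14/2019 1:00 PM"), ts("11/14/2019 2:00 PM")),
    ("test-a", ts("11/14/2019 5:00 PM"), ts("11/14/2019 9:15 PM")),
]
dataset2 = [
    ("test-a", ts("11/16/2019 4:48 AM"), ts("11/16/2019 7:35 AM")),
    ("test-a", ts("11/17/2019 9:02 AM"), ts("11/17/2019 9:46 AM")),
    ("test-a", ts("11/14/2019 7:00 PM"), ts("11/14/2019 8:00 PM")),
    ("test-a", ts("11/14/2019 5:00 PM"), ts("11/14/2019 9:15 PM")),
]


def time_overlaps(rows1, rows2):
    """Pair every row of rows1 with every row of rows2 and keep the overlaps."""
    result = []
    for (emp1, s1, e1), (emp2, s2, e2) in product(rows1, rows2):
        start, end = max(s1, s2), min(e1, e2)
        # Same employee, and the intersection of the two periods is non-empty.
        if emp1 == emp2 and start < end:
            result.append((emp1, start, end))
    return result


overlaps = time_overlaps(dataset1, dataset2)
for emp, start, end in overlaps:
    print(emp, start, end)
```

On the sample data this keeps two overlap periods for test-a on 11/14/2019: 7:00 PM to 8:00 PM and 5:00 PM to 9:15 PM. Note that the cross product grows with the product of the two row counts, so this approach is best suited to modest table sizes.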