When does a fixed window start in Apache Beam?
I have a join job that uses a fixed windowing strategy. The join job reads from two streams:
1. Record 1: ts: 2022-Mar-23-13:00:00, key: abcdef
2. Record 2: ts: 2022-Mar-23-18:00:00, key: xyzefg
If I use a fixed window of 24 hours and keep reading from this data stream, when do the panes for Record 1 and Record 2 start?
Is this correct?
Pane1: start = 2022-Mar-23-13:00:00, end = 2022-Mar-24-13:00:00, key = abcdef
Pane2: start = 2022-Mar-23-18:00:00, end = 2022-Mar-24-18:00:00, key = xyzefg
Or is this correct?
Pane1: start = 2022-Mar-23-13:00:00, end = 2022-Mar-24-13:00:00, key = abcdef
Pane2: start = 2022-Mar-23-13:00:00, end = 2022-Mar-24-13:00:00, key = xyzefg
Or is this correct?
Pane1: start = 2022-Mar-23-00:00:00, end = 2022-Mar-24-00:00:00, key = abcdef
Pane2: start = 2022-Mar-23-00:00:00, end = 2022-Mar-24-00:00:00, key = xyzefg
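It's the last one. Fixed windows are always 24 hours long*, and their boundaries are aligned to a fixed grid rather than to the first element seen, so the window assigned to an element generally extends both before and after the element's timestamp (unless the timestamp happens to fall exactly on a boundary).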
* Well, technically the end of the window is one microsecond before the next window starts, so as to prevent overlap and ambiguity of which window any timestamp belongs to.
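To make the alignment concrete, here is a minimal sketch in plain Python (no Beam dependency) of how a fixed window can be derived from a timestamp. It assumes the timestamps are in UTC, that windows are aligned to the Unix epoch, and that the offset is zero, which matches the default behaviour of Beam's FixedWindows as far as I know. The helper name fixed_window is made up for illustration and is not part of the Beam API.

```python
from datetime import datetime, timedelta, timezone

# Assumption: windows are aligned to the Unix epoch in UTC with a zero offset.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def fixed_window(ts, size, offset=timedelta(0)):
    """Return (start, end) of the fixed window containing ts (hypothetical helper).

    The end is exclusive: it is the start of the next window.
    """
    # Distance from the most recent window boundary at or before ts.
    into_window = (ts - EPOCH - offset) % size
    start = ts - into_window
    return start, start + size


day = timedelta(hours=24)
record1 = datetime(2022, 3, 23, 13, 0, tzinfo=timezone.utc)  # key: abcdef
record2 = datetime(2022, 3, 23, 18, 0, tzinfo=timezone.utc)  # key: xyzefg

window1 = fixed_window(record1, day)
window2 = fixed_window(record2, day)

print(window1[0].isoformat(), window1[1].isoformat())
print(window2[0].isoformat(), window2[1].isoformat())
```

Both records land in the same window, from 2022-03-23 00:00:00 to 2022-03-24 00:00:00 UTC. A non-zero offset would shift every boundary by the same amount, but the boundaries would still not depend on when the first element arrives.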