When does a fixed window start in Apache Beam?

I have a join job that uses a fixed windowing strategy. The join job reads from two streams:

1. Record 1: ts: 2022-Mar-23-13:00:00, key: abcdef
2. Record 2: ts: 2022-Mar-23-18:00:00, key: xyzefg

If I use a 24-hour fixed window and keep reading from these streams, when do the panes for Record 1 and Record 2 start?

Is this correct?

Pane1: start = 2022-Mar-23-13:00:00, end = 2022-Mar-24-13:00:00, key = abcdef
Pane2: start = 2022-Mar-23-18:00:00, end = 2022-Mar-24-18:00:00, key = xyzefg

Or is this correct?

Pane1: start = 2022-Mar-23-13:00:00, end = 2022-Mar-24-13:00:00, key = abcdef
Pane2: start = 2022-Mar-23-13:00:00, end = 2022-Mar-24-13:00:00, key = xyzefg

Or is this correct?

Pane1: start = 2022-Mar-23-00:00:00, end = 2022-Mar-24-00:00:00, key = abcdef
Pane2: start = 2022-Mar-23-00:00:00, end = 2022-Mar-24-00:00:00, key = xyzefg

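It's the last one. Fixed windows are aligned to the Unix epoch (plus an optional offset), not to the timestamp of the first element seen, so with no offset a 24-hour window runs from midnight to midnight UTC. Fixed windows are always 24 hours long*, and the window assigned to an element will typically extend both before and after the element's timestamp (unless the timestamp happens to fall exactly on a boundary).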

* Well, technically the end of the window is one microsecond before the next window starts, so as to prevent overlap and ambiguity of which window any timestamp belongs to.
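
To make the alignment concrete, here is a minimal sketch of the arithmetic in plain Python. It does not call Beam. The helper `fixed_window` is a hypothetical name for illustration, and the sketch assumes the timestamps in the question are in UTC, which the question does not state. The end it returns is the exclusive boundary where the next window begins.

```python
from datetime import datetime, timedelta, timezone

# Fixed windows are aligned to the Unix epoch, not to the first element seen.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def fixed_window(ts, size, offset=timedelta(0)):
    """Return (start, end) of the fixed window containing ts.

    Hypothetical helper for illustration only; `end` is the exclusive
    boundary, i.e. the start of the next window.
    """
    # Distance from the most recent window boundary back to ts.
    into_window = (ts - EPOCH - offset) % size
    start = ts - into_window
    return start, start + size


# Assumption: the question's timestamps are UTC.
record_1 = datetime(2022, 3, 23, 13, 0, tzinfo=timezone.utc)
record_2 = datetime(2022, 3, 23, 18, 0, tzinfo=timezone.utc)
day = timedelta(hours=24)

window_1 = fixed_window(record_1, day)
window_2 = fixed_window(record_2, day)
print(window_1)
print(window_2)
```

Both records land in the same window, starting at 2022-Mar-23 00:00:00 and ending at 2022-Mar-24 00:00:00, which matches the third option in the question. Passing a non-zero `offset` shifts every boundary by that amount, which is how a window could be made to start at, say, 13:00 each day.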