使用 Azure 流分析在 Tumbling Window 中获得第一条记录

Get first record in Tumbling Window using Azure Stream Analytics

我项目的某些部分在 Java 中使用 Esper 进行复杂的事件处理。我打算用 Azure 流分析替换 Esper。

用例: FTOD(当天第一张工单)和 FTOP(项目第一张工单)

我不断从 Eventhub 获取工单数据,并希望生成 2 种类型的警报(FTOD 和 FTOP)。我认为 thumblingWindow 最适合这种情况。

但我无法在 window 中选择第一条记录。有什么建议如何在 24 小时内选择第一条记录 window?

下面是 FTOD 的 Esper 查询

     String statementQuery = "context context_" + plantIdStr
          + " select distinct * from TicketInfoComplete as ticket where plantId = '"
          + entry.getKey() + "' and ruleType='FTOD' output first every 24 hours";

下面是我收到的消息数据

[{"DeviceSerialNumber":"190203XXX001TEST","MessageTimestamp":"2019-02-11T13:46:08.0000000Z","PlantId":"141","ProjectId":"Mobitest","ProjectName":"Mobitest","TicketNumber":"84855","TicketDateTimeinUTC":"2019-02-11T13:46:08.0000000Z","AdditionalInfo":{"value123":"value2"},"Timeout":60000,"Traffic":1,"Make":"Z99","TruckMake":"Z99","PlantName":"RMZ","Status":"Valid","PlantMakeSerialNumber":"Z99|190203XXX001TEST","ErrorMessageJsonString":"[]","Timezone":"India Standard Time"}]

根据您的描述,我认为您可以了解 LAST 运算符与 GROUP BY 条件。 LAST 允许在定义的约束内查找事件流中的最新事件。

In Stream Analytics, the scope of LAST (that is, how far back in history from the current event it needs to look) is always limited to a finite time interval, using the LIMIT DURATION clause. LAST can optionally be limited to only consider events that match the current event on a certain property or condition using the PARTITION BY and WHEN clauses. LAST is not affected by predicates in WHERE clause, join conditions in JOIN clause, or grouping expressions in GROUP BY clause of the current query.

请看上面文档中的例子:

SELECT    
       LAST(TicketNumber) OVER (LIMIT DURATION(hour, 24))  
FROM input 

总结一下,获取第一项时需要考虑isFirst方法

精确查询在使用 IsFirst 方法进行 FTOD 和 FTOP 警报后我使用了什么。

SELECT 
DeviceSerialNumber,MessageTimestamp,PlantId,TruckId,ProjectId,ProjectName,
CustomerId,CustomerName,TicketNumber,TicketDateTimeinUTC,TruckSerialNumber,
TruckMake,PlantName,PlantMakeSerialNumber,Timezone,'FTOD' as alertType
INTO
[alertOutput]
FROM
[ticketInput]
where ISFIRST(mi, 2)=1

SELECT 
DeviceSerialNumber,MessageTimestamp,PlantId,TruckId,ProjectId,ProjectName,
CustomerId,CustomerName,TicketNumber,TicketDateTimeinUTC,TruckSerialNumber,
TruckMake,PlantName,PlantMakeSerialNumber,Timezone,'FTOP' as alertType
INTO
[ftopOutput]
FROM
[ticketInput]
where ISFIRST(mi, 2) OVER (PARTITION BY PlantId) = 1