CloudWatch Insights Query - How to get a single count from counts
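I have a log file that contains playerId values, and some players have multiple entries in the file. I want an exact count of unique players, regardless of whether they have one entry or several in the log file.
The query below scans 497 records and returns 346 rows, one per player (346 is the number I want).
Query: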
fields @timestamp, @message
| sort @timestamp desc
| filter @message like /(playerId)/
| parse @message "\"playerId\": \"*\"" as playerId
| stats count(playerId) as CT by playerId
If I change the query to use count_distinct, I get exactly the result I want as a single number. Example below:
fields @timestamp, @message
| sort @timestamp desc
| filter @message like /(playerId)/
| parse @message "\"playerId\": \"*\"" as playerId
| stats count_distinct(playerId) as CT
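The problem with count_distinct, however, is that as the query expands to a larger timeframe or more records, the number of entries reaches the thousands and tens of thousands. That matters because of how Insights count_distinct behaves:
"Returns the number of unique values for the field. If the field has very high cardinality (contains many unique values), the value returned by count_distinct is just an approximation."
Docs: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_QuerySyntax.html
This is not acceptable, because I need exact numbers. After playing with the query a little, I believe the answer is to stick with count() rather than count_distinct(), but I have not been able to get it down to a single number. Below is an example that does not work. Any thoughts?
Example 1: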
fields @timestamp, @message
| sort @timestamp desc
| filter @message like /(playerId)/
| parse @message "\"playerId\": \"*\"" as playerId
| stats count(playerId) as CT by playerId
| stats count(*)
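This fails with the error: "We are having trouble understanding the query."
To be clear, I am looking for the exact count, returned as a single row showing the number.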
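For reference, here is a minimal sketch of what "exact" means here, written in plain Python rather than Insights syntax. It mimics the filter, parse, and group steps of the first query on raw log lines and then counts the groups, which is the single number being asked for. The function names and the sample lines are hypothetical and only there for illustration; the regex is an assumed equivalent of the parse pattern.

```python
import re
from collections import Counter

# Assumed regex equivalent of the parse pattern "\"playerId\": \"*\"".
PLAYER_ID_RE = re.compile(r'"playerId": "([^"]*)"')


def count_per_player(lines):
    """Rough equivalent of `stats count(playerId) as CT by playerId`."""
    counts = Counter()
    for line in lines:
        match = PLAYER_ID_RE.search(line)
        if match:  # lines without a playerId are skipped, like the filter step
            counts[match.group(1)] += 1
    return counts


def exact_unique_players(lines):
    """The number of groups is the exact distinct count, with no approximation."""
    return len(count_per_player(lines))


# Hypothetical sample: player "a" appears twice, player "b" once.
sample = [
    '"playerId": "a"',
    '"playerId": "b"',
    '"playerId": "a"',
    'a line with no player id',
]
print(exact_unique_players(sample))  # prints 2
```

In other words, the 346 rows returned by the grouped query already are the exact answer; the open question is only how to collapse those rows into one number inside Insights itself.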
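What if we introduce a dummy field that is hard-coded to "1"? The idea is to take its minimum per player, so that it stays "1" even if the same playerId occurs more than once, and then sum that field across players.
The log entry might look like this: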
[1]"playerId": "1b45b168-00ed-42fe-a977-a8553440fe1a"
Query:
fields @timestamp, @message
| sort @timestamp desc
| filter @message like /(playerId)/
| parse @message "[*]\"playerId\": \"*\"" as dummyValue, playerId
| stats sum(min(dummyValue)) by playerId as CT
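To make the arithmetic behind this idea concrete, here is a small Python sketch, not Insights syntax. It takes the minimum of the dummy value per playerId and then sums those minimums, which equals the number of distinct players as long as every entry carries the dummy value 1. Whether Insights accepts the nested sum(min(...)) form is not verified here. The function name, the regex, and the second playerId in the sample are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumed regex equivalent of the parse pattern "[*]\"playerId\": \"*\"".
ENTRY_RE = re.compile(r'\[([^\]]*)\]"playerId": "([^"]*)"')


def sum_of_min_dummy(lines):
    """Sum, over players, of the minimum dummy value seen for that player."""
    dummy_by_player = defaultdict(list)
    for line in lines:
        match = ENTRY_RE.search(line)
        if match:
            dummy_value, player_id = match.groups()
            dummy_by_player[player_id].append(int(dummy_value))
    # min() collapses repeated entries for a player to a single 1;
    # sum() then counts one per distinct player.
    return sum(min(values) for values in dummy_by_player.values())


# The first id is the one from the example entry; the second is made up.
sample = [
    '[1]"playerId": "1b45b168-00ed-42fe-a977-a8553440fe1a"',
    '[1]"playerId": "1b45b168-00ed-42fe-a977-a8553440fe1a"',
    '[1]"playerId": "hypothetical-second-player"',
]
print(sum_of_min_dummy(sample))  # prints 2
```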