Can Druid perform a nested query such that each result contains one dimension and a list of associated dimensions?
For example, given this data:
{"timestamp": "2011-01-12T00:00:00.000Z", "ip": "1", "user": "abc"}
{"timestamp": "2011-01-12T00:00:00.000Z", "ip": "1", "user": "def"}
{"timestamp": "2011-01-12T00:00:00.000Z", "ip": "1", "user": "hgi"}
{"timestamp": "2011-01-12T00:00:00.000Z", "ip": "2", "user": "mno"}
{"timestamp": "2011-01-12T00:00:00.000Z", "ip": "2", "user": "qrs"}
{"timestamp": "2011-01-12T00:00:00.000Z", "ip": "3", "user": "xyz"}
Is it possible to run an efficient query that returns:
{
  "timestamp": "...",
  "event": {
    "ip": 1,
    "user": ["abc", "def", "hgi"]
  }
},
{
  "timestamp": "...",
  "event": {
    "ip": 2,
    "user": ["mno", "qrs"]
  }
},
{
  "timestamp": "...",
  "event": {
    "ip": 3,
    "user": ["xyz"]
  }
}
If so, is it possible to limit the number of results in just the user list?
With a Druid groupBy query, you cannot apply a "sub-group" or "group_concat" function. These are simply not available. Druid groups your query results by the fields you select.
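One possible workaround, which is not part of the answer itself: group by both ip and user so that Druid returns one flat row per pair, then collapse those rows into lists on the client. The sketch below assumes the result rows have already been fetched and reduced to plain dicts. `collapse_users` and `max_users` are hypothetical names, and the per-list cap is applied client-side, not by Druid.

```python
from collections import defaultdict


def collapse_users(rows, max_users=None):
    """Collapse flat {"ip", "user"} rows into one user list per ip.

    max_users, if given, caps the length of each list (client-side only).
    """
    users_by_ip = defaultdict(list)
    for row in rows:
        users = users_by_ip[row["ip"]]
        if max_users is None or len(users) < max_users:
            users.append(row["user"])
    # Dicts preserve insertion order, so ips appear in first-seen order.
    return [{"ip": ip, "user": users} for ip, users in users_by_ip.items()]


# Flat rows built from the sample data in the question.
rows = [
    {"ip": "1", "user": "abc"},
    {"ip": "1", "user": "def"},
    {"ip": "1", "user": "hgi"},
    {"ip": "2", "user": "mno"},
    {"ip": "2", "user": "qrs"},
    {"ip": "3", "user": "xyz"},
]

grouped = collapse_users(rows, max_users=2)
```

Because the cap is applied after the rows arrive, it limits the size of the output but not the amount of data Druid has to return.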
Of course, you can group by ip and then count the number of rows, or even the number of distinct users.
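A minimal sketch of such a query, built as a Python dict in Druid's native JSON query format. The datasource name `events`, the interval, the output names `rows` and `distinct_users`, and the helper `build_group_by_ip_query` are all assumptions for illustration. Note that the `count` aggregator counts stored Druid rows, which can differ from the number of ingested events when rollup is enabled, and the `cardinality` aggregator returns an approximate distinct count.

```python
import json


def build_group_by_ip_query(datasource, interval):
    """Build a native groupBy query that counts rows and distinct users per ip."""
    return {
        "queryType": "groupBy",
        "dataSource": datasource,
        "granularity": "all",
        "intervals": [interval],
        "dimensions": ["ip"],
        "aggregations": [
            # Number of Druid rows in each ip group.
            {"type": "count", "name": "rows"},
            # Approximate number of distinct users in each ip group.
            {"type": "cardinality", "name": "distinct_users", "fields": ["user"]},
        ],
    }


# "events" is a placeholder datasource name.
query = build_group_by_ip_query("events", "2011-01-12/2011-01-13")
print(json.dumps(query, indent=2))
```

The printed JSON is what would be sent as the body of a native query request. Each result row then carries the ip plus the two counts, not the list of users.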