Druid count differs when we run the same query on daily and raw data

When I run a query against the ABS datasource in Druid I get one set of counts, but the same query run against the ABS_DAILY datasource returns different counts. ABS_DAILY is built from ABS (re-indexing spec below).
{
  "queryType" : "groupBy",
  "dataSource" : "ABS",
  "granularity" : "all",
  "intervals" : [ "2018-07-12T00:00:00.000Z/2018-07-13T00:00:00.000Z" ],
  "descending" : "false",
  "aggregations" : [ {
    "type" : "count",
    "name" : "COUNT",
    "fieldName" : "COUNT"
  } ],
  "postAggregations" : [ ],
  "dimensions" : [ "event_id" ]
}
The following JSON is submitted to Druid as a daily re-indexing task; it creates the ABS_DAILY segments for the given interval.
{
  "spec": {
    "ioConfig": {
      "firehose": {
        "dataSource": "ABS",
        "interval": "2018-07-12T00:00:00.000Z/2018-07-13T00:00:00.000Z",
        "metrics": null,
        "dimensions": null,
        "type": "ingestSegment"
      },
      "type": "index"
    },
    "dataSchema": {
      "granularitySpec": {
        "queryGranularity": "day",
        "intervals": [
          "2018-07-12T00:00:00.000Z/2018-07-13T00:00:00.000Z"
        ],
        "segmentGranularity": "day",
        "type": "uniform"
      },
      "dataSource": "ABS_DAILY",
      "metricsSpec": [],
      "parser": {
        "parseSpec": {
          "timestampSpec": {
            "column": "server_timestamp",
            "format": "dd MMMM, yyyy (HH:mm:ss)"
          },
          "dimensionsSpec": {
            "dimensionExclusions": [
              "server_timestamp"
            ],
            "dimensions": []
          },
          "format": "json"
        },
        "type": "string"
      }
    }
  },
  "type": "index"
}
The query below against ABS_DAILY returns counts that differ from ABS. It should not.
{
  "queryType" : "groupBy",
  "dataSource" : "ABS_DAILY",
  "granularity" : "all",
  "intervals" : [ "2018-07-12T00:00:00.000Z/2018-07-13T00:00:00.000Z" ],
  "descending" : "false",
  "aggregations" : [ {
    "type" : "count",
    "name" : "COUNT",
    "fieldName" : "COUNT"
  } ],
  "postAggregations" : [ ],
  "dimensions" : [ "event_id" ]
}
You are counting the rows of the daily rollup. To aggregate the pre-aggregated counts, you now need to sum the count column instead (note the longSum type):
{
  "queryType" : "groupBy",
  "dataSource" : "ABS_DAILY",
  "granularity" : "all",
  "intervals" : [ "2018-07-12T00:00:00.000Z/2018-07-13T00:00:00.000Z" ],
  "descending" : "false",
  "aggregations" : [ {
    "type" : "longSum",
    "name" : "COUNT",
    "fieldName" : "COUNT"
  } ],
  "postAggregations" : [ ],
  "dimensions" : [ "event_id" ]
}
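To see why the two aggregators disagree, here is a minimal Python sketch (not Druid code; the sample events and the `COUNT` column name are illustrative assumptions). Day-level rollup collapses raw events into one row per (day, event_id), so a `count` aggregator on the rolled-up datasource counts rollup rows, while a `longSum` over the pre-aggregated count column recovers the original number of events.

```python
from collections import Counter

# Hypothetical raw events in ABS: three events for A, one for B, same day.
raw_events = [
    {"day": "2018-07-12", "event_id": "A"},
    {"day": "2018-07-12", "event_id": "A"},
    {"day": "2018-07-12", "event_id": "A"},
    {"day": "2018-07-12", "event_id": "B"},
]

# Day-level rollup (what the ABS_DAILY ingestion produces): one row per
# (day, event_id), with COUNT holding the pre-aggregated event count.
rollup = Counter((e["day"], e["event_id"]) for e in raw_events)
daily_rows = [
    {"day": day, "event_id": ev, "COUNT": c} for (day, ev), c in rollup.items()
]

# A "count" aggregator on ABS_DAILY counts rollup rows per event_id.
count_agg = Counter(r["event_id"] for r in daily_rows)

# A "longSum" over COUNT recovers the raw event totals per event_id.
longsum_agg = {r["event_id"]: r["COUNT"] for r in daily_rows}

print(count_agg["A"], longsum_agg["A"])  # 1 3
```

With a single day of data the rollup has one row per event_id, so `count` returns 1 for every group while `longSum` matches the counts from the raw ABS datasource.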