Druid aggregate functions
I am building a UI on top of Druid to generate reports. In the query script I use the following aggregations:
{
  "type" : "doubleSum",
  "name" : "impressions",
  "fieldName" : "impressions"
},
{
  "type" : "doubleSum",
  "name" : "clicks",
  "fieldName" : "clicks"
},
{
  "type" : "doubleSum",
  "name" : "pvconversions",
  "fieldName" : "pvconversions"
},
{
  "type" : "doubleSum",
  "name" : "pcconversions",
  "fieldName" : "pcconversions"
}
I also need two more fields:
Total Conversions = pvconversions+pcconversions
CTR = Clicks / Impressions
I have not found any information on how to write them.
Can anyone help?
Thanks
You can do this with aggregations in a timeseries query. Isn't that what you are looking for?
You have to use post-aggregations in your query.
From the Druid documentation:
Post-aggregations are specifications of processing that should happen on aggregated values as they come out of Druid. If you include a post aggregation as part of a query, make sure to include all aggregators the post-aggregator requires.
For example, to compute the click-through rate (CTR = clicks / impressions), the post-aggregation looks like this:
"postAggregations" : [
  { "type" : "arithmetic",
    "name" : "CTR",
    "fn" : "/",
    "fields" : [
      { "type" : "fieldAccess", "name" : "clicks", "fieldName" : "clicks" },
      { "type" : "fieldAccess", "name" : "impressions", "fieldName" : "impressions" }
    ]
  }
]
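To make the semantics of an arithmetic post-aggregator concrete, here is a small local emulation in Python that evaluates such a spec against one row of already-aggregated values. It is only an illustrative sketch, not Druid's implementation: the function name `eval_post_agg` and the sample numbers are made up, and it assumes the documented Druid behavior that `/` returns 0 when the divisor is 0.

```python
import operator


def _safe_div(a, b):
    # Mirrors Druid's documented "/" behavior: division by zero yields 0.
    return 0.0 if b == 0 else a / b


_FUNCS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": _safe_div}


def eval_post_agg(spec, row):
    """Evaluate a post-aggregator spec against one row of aggregated values."""
    kind = spec["type"]
    if kind == "fieldAccess":
        # Read the value produced by the aggregator named in "fieldName".
        return float(row[spec["fieldName"]])
    if kind == "constant":
        return float(spec["value"])
    if kind == "arithmetic":
        # Nested specs are evaluated first, then folded left to right.
        values = [eval_post_agg(f, row) for f in spec["fields"]]
        result = values[0]
        for value in values[1:]:
            result = _FUNCS[spec["fn"]](result, value)
        return result
    raise ValueError(f"unsupported post-aggregator type: {kind}")


ctr_spec = {
    "type": "arithmetic", "name": "CTR", "fn": "/",
    "fields": [
        {"type": "fieldAccess", "name": "clicks", "fieldName": "clicks"},
        {"type": "fieldAccess", "name": "impressions", "fieldName": "impressions"},
    ],
}

# Hypothetical aggregated row: 25 clicks out of 1000 impressions.
print(eval_post_agg(ctr_spec, {"clicks": 25.0, "impressions": 1000.0}))  # 0.025
```

Note that the `fieldName` values must match the `name` of an aggregator in the same query, which is why the row keys here are `clicks` and `impressions`.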
Your problem can be solved with aggregations and postAggregations, as in the following snippet:
{
  "queryType":"timeseries",
  "dataSource":"data",
  "granularity":"hour",
  "descending":false,
  "aggregations":[
    {"type":"doubleSum", "name":"sum-pvconversions", "fieldName":"pvconversions"},
    {"type":"doubleSum", "name":"sum-pcconversions", "fieldName":"pcconversions"},
    {"type":"doubleSum", "name":"sum-clicks", "fieldName":"clicks"},
    {"type":"doubleSum", "name":"sum-impressions", "fieldName":"impressions"}
  ],
  "postAggregations":[
    {
      "type":"arithmetic",
      "name":"Conversions",
      "fn":"+",
      "fields":[
        {"type":"fieldAccess", "name":"postAgg-proceed", "fieldName":"sum-pvconversions"},
        {"type":"fieldAccess", "name":"postAgg-numbers", "fieldName":"sum-pcconversions"}
      ]
    },
    {
      "type":"arithmetic",
      "name":"CTR",
      "fn":"/",
      "fields":[
        {"type":"fieldAccess", "name":"postAgg-click", "fieldName":"sum-clicks"},
        {"type":"fieldAccess", "name":"postAgg-impression", "fieldName":"sum-impressions"}
      ]
    }
  ],
  "intervals":["2016-08-22T01/2016-08-29T03"],
  "context":{
    "skipEmptyBuckets":true
  }
}
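Since the question is about generating reports from a script, the same query can also be assembled programmatically. The following Python sketch builds it from a list of metric names, checks that every `fieldAccess` refers to a defined aggregator (the requirement quoted from the documentation), and defines a helper to POST it. The helper names and the broker URL `http://localhost:8082/druid/v2/` are assumptions for illustration; adjust them to your deployment.

```python
import json
import urllib.request

METRICS = ["pvconversions", "pcconversions", "clicks", "impressions"]


def field(name):
    """fieldAccess post-aggregator reading the aggregator called `name`."""
    return {"type": "fieldAccess", "name": name, "fieldName": name}


def arithmetic(name, fn, *field_names):
    """Arithmetic post-aggregator applying `fn` to the named aggregators."""
    return {"type": "arithmetic", "name": name, "fn": fn,
            "fields": [field(f) for f in field_names]}


def build_query(data_source, interval, granularity="hour"):
    return {
        "queryType": "timeseries",
        "dataSource": data_source,
        "granularity": granularity,
        "aggregations": [
            {"type": "doubleSum", "name": f"sum-{m}", "fieldName": m}
            for m in METRICS
        ],
        "postAggregations": [
            arithmetic("Conversions", "+", "sum-pvconversions", "sum-pcconversions"),
            arithmetic("CTR", "/", "sum-clicks", "sum-impressions"),
        ],
        "intervals": [interval],
    }


def missing_aggregators(query):
    """Names used by fieldAccess post-aggregators but not defined as aggregators."""
    defined = {a["name"] for a in query["aggregations"]}
    missing = set()

    def walk(spec):
        if spec["type"] == "fieldAccess" and spec["fieldName"] not in defined:
            missing.add(spec["fieldName"])
        for child in spec.get("fields", []):
            walk(child)

    for post_agg in query.get("postAggregations", []):
        walk(post_agg)
    return missing


def run_query(query, broker_url="http://localhost:8082/druid/v2/"):
    """POST the native query as JSON; the URL is an assumed local broker."""
    request = urllib.request.Request(
        broker_url,
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


query = build_query("data", "2016-08-22T01/2016-08-29T03")
assert not missing_aggregators(query)
print(json.dumps(query, indent=2))
```

Calling `run_query(query)` would send the request; it is left out of the example so the sketch runs without a Druid cluster.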
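Aggregations in Druid can only be used with aggregation queries such as timeseries, topN, and groupBy.
If you are simply aggregating the values of a column over time, the simplest approach is to write a timeseries query.
For example,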
{
  "queryType": "timeseries",
  "dataSource": "<datasource name>",
  "granularity": "day",
  "aggregations": [
    <Your aggregations here>
  ],
  "intervals": [ <Time interval (from/to)> ]
}
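A timeseries query returns a JSON array with one object per time bucket, each carrying a `timestamp` and a `result` map keyed by aggregator (and post-aggregator) name. For a reporting UI it is often convenient to flatten that into plain rows. The Python sketch below shows one way to do so; the function name `flatten_timeseries` and the sample response values are hypothetical, not real query output.

```python
def flatten_timeseries(response):
    """Turn [{"timestamp": ..., "result": {...}}, ...] into flat row dicts."""
    return [{"timestamp": bucket["timestamp"], **bucket["result"]}
            for bucket in response]


# Hypothetical response for a day-granularity query with two aggregators.
sample_response = [
    {"timestamp": "2016-08-22T00:00:00.000Z",
     "result": {"impressions": 1000.0, "clicks": 25.0}},
    {"timestamp": "2016-08-23T00:00:00.000Z",
     "result": {"impressions": 400.0, "clicks": 8.0}},
]

for row in flatten_timeseries(sample_response):
    print(row["timestamp"], row["impressions"], row["clicks"])
```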