Can BigQuery drop subtotal rows when responding to a ROLLUP query?
When I query BQ using ROLLUP with a grouping field that can have a large number of distinct values (campaign_group_id in this example),
for example:
SELECT
campaign_group_id AS campaign_group_id,
DATE(DATE_ADD(TIME, 3, 'HOUR')) AS DAY,
SUM(impressions) AS imps
FROM
[browser_traffic.2016_05_28],
[browser_traffic.2016_05_29]
WHERE
( DATE_ADD( TIME, 3, "HOUR") >= '2016-05-28 00:00:00'
AND DATE_ADD( TIME, 3, "HOUR") < '2016-05-30 00:00:00' )
GROUP EACH BY ROLLUP (campaign_group_id, DAY)
ORDER BY DAY ASC, campaign_group_id ASC
LIMIT 500
BQ returns many subtotal rows that are not useful for my use case:
+-------------------+------+-----------+
| campaign_group_id | day | imps |
+-------------------+------+-----------+
| NULL | NULL | 158423933 |
| 61 | NULL | 0 |
| 496 | NULL | 79870 |
| 497 | NULL | 10492 |
| 809 | NULL | 0 |
| 936 | NULL | 2451 |
| 937 | NULL | 0 |
| 940 | NULL | 6844 |
| 942 | NULL | 207685 |
| 946 | NULL | 0 |
| 961 | NULL | 0 |
| 975 | NULL | 16167 |
| 976 | NULL | 15767 |
| 1018 | NULL | 0 |
| 1020 | NULL | 0 |
| 1022 | NULL | 766875 |
| 1039 | NULL | 355765 |
...
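I need to somehow reduce the subtotal rows in the result but keep the grand total row (the first row in the result above).
Is it possible to have BQ return only the total rows for selected fields?
If you are interested in totals, you most likely do not need ROLLUP.
Instead, you can consider a regular GROUP BY, as below: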
SELECT
DATE(DATE_ADD(TIME, 3, 'HOUR')) AS DAY,
SUM(impressions) AS imps
FROM
[browser_traffic.2016_05_28],
[browser_traffic.2016_05_29]
WHERE
( DATE_ADD( TIME, 3, "HOUR") >= '2016-05-28 00:00:00'
AND DATE_ADD( TIME, 3, "HOUR") < '2016-05-30 00:00:00' )
GROUP BY 1
ORDER BY DAY ASC
LIMIT 500
You can filter the query results with another SELECT statement:
SELECT campaign_group_id, day, imps
FROM (
... your rollup query with LIMIT removed ...
)
WHERE (day IS NOT NULL) OR (campaign_group_id IS NULL)
LIMIT 500
Note that a GROUPING() function also exists, which will help here:
SELECT year, name, SUM(number) s,
GROUPING(year) is_grouping_year,
GROUPING(name) is_grouping_name
FROM [bigquery-public-data:usa_names.usa_1910_2013]
WHERE name IN ('John', 'Jovana')
AND year BETWEEN 2012 AND 2013
GROUP BY ROLLUP(name, year)
ORDER BY year, name
year name s is_grouping_year is_grouping_name
null null 21182 1 1
null John 21164 1 0
null Jovana 18 1 0
2012 John 10576 0 0
2012 Jovana 18 0 0
2013 John 10588 0 0
Basically, you are asking for the rows where is_grouping_year and is_grouping_name are either both 0 or both 1.
From the docs:
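As a minimal sketch of that filter, the GROUPING() query can be wrapped in an outer SELECT that keeps only the rows where the two flags are equal. This assumes legacy SQL (as in the queries above) and reuses the public usa_names table from the example; the column aliases are the ones defined there, and the same pattern should carry over to the campaign_group_id / day query.

```sql
-- Keep only detail rows (both flags 0) and the grand total row (both flags 1).
-- Subtotal rows have is_grouping_year = 1 and is_grouping_name = 0, so they are dropped.
SELECT year, name, s
FROM (
  SELECT year, name, SUM(number) s,
    GROUPING(year) is_grouping_year,
    GROUPING(name) is_grouping_name
  FROM [bigquery-public-data:usa_names.usa_1910_2013]
  WHERE name IN ('John', 'Jovana')
    AND year BETWEEN 2012 AND 2013
  GROUP BY ROLLUP(name, year)
)
WHERE is_grouping_year = is_grouping_name
ORDER BY year, name
```

Unlike filtering on NULL values directly, this approach does not confuse a rollup row with a row whose group key is genuinely NULL.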
When using the ROLLUP function, you can use the GROUPING function to
distinguish between rows that were added because of the ROLLUP
function and rows that actually have a NULL value for the group key.