How to GROUP BY a foreign key in BigQuery?
I'm working in BigQuery. I have three tables: branches, regions (collections of branches), and monthly spend by branch.
CREATE TABLE region (
  id integer NOT NULL,
  name varchar NOT NULL
);
CREATE TABLE branch (
  id integer NOT NULL,
  name varchar NOT NULL,
  region integer NOT NULL
);
CREATE TABLE spend (
  branch integer NOT NULL,
  amount float,
  month timestamp,
  item_code int
);
How can I calculate total spend by region by month?
I have this query, which gives total spend by branch by month:
SELECT branch,
month,
SUM(amount) AS total_amount
FROM [mytable]
GROUP BY branch,
month
But I can't work out how to group by region. I think I need an IN clause somewhere?
It's also a fairly large dataset (150GB / 500m rows in the spend table), so a large JOIN may not work.
Presumably you need a join and an aggregation, which I'm pretty sure BigQuery supports:
SELECT b.region, s.month, SUM(s.amount) AS total_amount
FROM spend s
JOIN branch b ON s.branch = b.id
GROUP BY b.region, s.month;
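To see what the join-and-aggregate query returns, here is a minimal sketch that runs the same SQL on a toy dataset. It uses Python's built-in sqlite3 module rather than BigQuery, so the column types, branch names, and sample rows are all assumptions made up for illustration; only the query shape comes from the answer above.

```python
import sqlite3

# In-memory SQLite database standing in for BigQuery (illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE branch (id INTEGER NOT NULL, name TEXT NOT NULL, region INTEGER NOT NULL);
CREATE TABLE spend (branch INTEGER NOT NULL, amount REAL, month TEXT, item_code INTEGER);

-- Hypothetical sample data: branches 1 and 2 are in region 10, branch 3 in region 20.
INSERT INTO branch VALUES (1, 'North A', 10), (2, 'North B', 10), (3, 'South A', 20);
INSERT INTO spend VALUES
  (1, 100.0, '2015-01', 1),
  (2,  50.0, '2015-01', 1),
  (3,  70.0, '2015-01', 2),
  (1,  30.0, '2015-02', 1);
""")

# Join each spend row to its branch, then group by the branch's region and the month.
rows = conn.execute("""
SELECT b.region, s.month, SUM(s.amount) AS total_amount
FROM spend s
JOIN branch b ON s.branch = b.id
GROUP BY b.region, s.month
ORDER BY b.region, s.month
""").fetchall()

for region, month, total in rows:
    print(region, month, total)
```

No IN clause is needed: the join attaches a region to every spend row, and the GROUP BY then collapses rows that share a region and month.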
SELECT r.name as region, [month], SUM(total_amount) AS total_amount
FROM (
SELECT branch, [month], SUM(amount) AS total_amount
FROM [mydataset.spend]
GROUP EACH BY branch, [month]
) AS s
JOIN [mydataset.branch] AS b ON s.branch = b.id
JOIN [mydataset.region] AS r ON b.region = r.id
GROUP BY 1, 2
The GROUP EACH BY and the pre-grouping in the sub-select are there to address your concern that "large JOINs may not work".
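As a sanity check on the pre-grouping idea, the sketch below compares "aggregate by branch first, then join" against "join first, then aggregate" on a toy dataset. It runs in Python's built-in sqlite3 module, not BigQuery, and all table contents are hypothetical. GROUP EACH BY is BigQuery legacy SQL syntax that SQLite does not have, so plain GROUP BY is used here, and the bracketed table and column names are dropped.

```python
import sqlite3

# In-memory SQLite database standing in for BigQuery (illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE region (id INTEGER NOT NULL, name TEXT NOT NULL);
CREATE TABLE branch (id INTEGER NOT NULL, name TEXT NOT NULL, region INTEGER NOT NULL);
CREATE TABLE spend (branch INTEGER NOT NULL, amount REAL, month TEXT, item_code INTEGER);

-- Hypothetical sample data.
INSERT INTO region VALUES (10, 'North'), (20, 'South');
INSERT INTO branch VALUES (1, 'North A', 10), (2, 'North B', 10), (3, 'South A', 20);
INSERT INTO spend VALUES
  (1, 100.0, '2015-01', 1),
  (1,  25.0, '2015-01', 2),
  (2,  50.0, '2015-01', 1),
  (3,  70.0, '2015-01', 2),
  (1,  30.0, '2015-02', 1);
""")

# Variant 1: collapse spend to one row per (branch, month) before joining,
# so the join only sees the small pre-aggregated result.
pre_aggregated = conn.execute("""
SELECT r.name AS region, s.month, SUM(s.total_amount) AS total_amount
FROM (
  SELECT branch, month, SUM(amount) AS total_amount
  FROM spend
  GROUP BY branch, month
) AS s
JOIN branch AS b ON s.branch = b.id
JOIN region AS r ON b.region = r.id
GROUP BY r.name, s.month
ORDER BY r.name, s.month
""").fetchall()

# Variant 2: join every raw spend row first, then aggregate.
joined_first = conn.execute("""
SELECT r.name AS region, s.month, SUM(s.amount) AS total_amount
FROM spend AS s
JOIN branch AS b ON s.branch = b.id
JOIN region AS r ON b.region = r.id
GROUP BY r.name, s.month
ORDER BY r.name, s.month
""").fetchall()

print(pre_aggregated == joined_first)
for row in pre_aggregated:
    print(row)
```

Both variants give the same totals because SUM can be applied in stages. Grouping by r.name, as the answer's query does, assumes region names are unique; if they are not, group by r.id as well.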