Can we define our own function in BigQuery? How do we create a date parameter in BigQuery?
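I am trying to calculate the turnover rate for my department over a period of time. For example, I want to know the turnover rate for 01/01/2021 - 05/01/2021. The sample data looks like this: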
UK       Status  HireDate    TermDate
BUV0060  TRM     01/23/2007  12/2/2015
BUV0098  TRM     11/13/2002  2/17/2017
BUV0439  TRM     04/17/2017  2/5/2018
202758   ACT     06/03/1996
17033    TRM     07/01/2019  6/11/2020
92121    ACT     02/24/2020
211343   ACT     04/11/2005
My code looks like this:
SELECT *,
       Terms / ((startheadcount + EndHeadcount) / 2) AS turnover
FROM (
  SELECT
    SUM(CASE WHEN HireDate < '2021-01-01' AND TermDate >= '2020-05-01'
               OR HireDate < '2021-01-01' AND TermDate IS NULL
             THEN 1 ELSE 0 END) AS startheadcount,
    SUM(CASE WHEN HireDate >= '2021-01-01' AND HireDate <= '2021-05-01'
             THEN 1 ELSE 0 END) AS NewHires,
    SUM(CASE WHEN TermDate >= '2020-01-01' AND TermDate <= '2020-05-01'
             THEN 1 ELSE 0 END) AS Terms,
    SUM(CASE WHEN HireDate < '2021-05-01' AND TermDate >= '2020-05-01'
               OR Status = "ACT"
             THEN 1 ELSE 0 END) AS EndHeadcount
  FROM `XXX.Turnover.Test`
)
Result:
startheadcount  NewHires  Terms  EndHeadcount  turnover
4718            231       221    4698          0.046941376380628716
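Just to make my life easier, I don't want to type in the date range every time. Can we define a function that asks me to enter the dates only once, so that the code below runs automatically?
Thanks!!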
Based on Gordon's answer:
You don't need a function. You can add the parameters in a derived table. I would also recommend COUNTIF():
SELECT *,
       Terms / ((startheadcount + EndHeadcount) / 2) AS turnover
FROM (SELECT COUNTIF(t.HireDate < params.DateStart AND
                     (t.TermDate >= params.DateEnd OR t.TermDate IS NULL)
                    ) AS startheadcount,
             COUNTIF(t.HireDate >= params.DateStart AND
                     t.HireDate <= params.DateEnd
                    ) AS NewHires,
             COUNTIF(t.TermDate >= params.DateStart AND
                     t.TermDate <= params.DateEnd
                    ) AS Terms,
             COUNTIF(t.HireDate < params.DateStart AND
                     t.TermDate >= params.DateEnd OR
                     t.Status = 'ACT'
                    ) AS EndHeadcount
      FROM `XXX.Turnover.Test` t CROSS JOIN
           -- Set the date range once here (01/01/2021 - 05/01/2021 in the question).
           (SELECT DATE('2021-01-01') AS DateStart,
                   DATE('2021-05-01') AS DateEnd
           ) params
     ) t
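Another way to enter the dates only once is to use BigQuery scripting variables. The following is a minimal sketch and not part of Gordon's answer. It assumes HireDate and TermDate are stored as DATE columns; the variable names date_start and date_end are made up for illustration, and the inline sample rows (taken from the question) stand in for `XXX.Turnover.Test` so the script can run on its own.

```sql
-- Hypothetical variable names: set the date range once here.
DECLARE date_start DATE DEFAULT DATE '2021-01-01';
DECLARE date_end   DATE DEFAULT DATE '2021-05-01';

WITH sample AS (
  -- A few rows from the question's sample data, assumed to be stored as DATE.
  SELECT * FROM UNNEST([
    STRUCT('BUV0060' AS UK, 'TRM' AS Status,
           DATE '2007-01-23' AS HireDate, DATE '2015-12-02' AS TermDate),
    ('BUV0098', 'TRM', DATE '2002-11-13', DATE '2017-02-17'),
    ('202758',  'ACT', DATE '1996-06-03', CAST(NULL AS DATE)),
    ('92121',   'ACT', DATE '2020-02-24', CAST(NULL AS DATE))
  ])
)
SELECT *,
       Terms / ((startheadcount + EndHeadcount) / 2) AS turnover
FROM (
  SELECT
    -- Same counting logic as the COUNTIF query, with variables in place of params.
    COUNTIF(HireDate < date_start
            AND (TermDate >= date_end OR TermDate IS NULL)) AS startheadcount,
    COUNTIF(HireDate BETWEEN date_start AND date_end)       AS NewHires,
    COUNTIF(TermDate BETWEEN date_start AND date_end)       AS Terms,
    COUNTIF((HireDate < date_start AND TermDate >= date_end)
            OR Status = 'ACT')                              AS EndHeadcount
  FROM sample
);
```

To run it against the real table, replace `sample` in the FROM clause with `XXX.Turnover.Test` and drop the WITH clause. Only the two DECLARE lines need editing when the period changes.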
To answer your question directly: BigQuery supports procedures.
Creating the procedure:
CREATE OR REPLACE PROCEDURE amazon.test(input_parameter int64, OUT out1 int64, OUT out2 int64)
BEGIN
SET (out1, out2) = (select as struct input_parameter, 2 * input_parameter);
END;
Calling the procedure:
DECLARE col1, col2 INT64;
CALL amazon.test(5, col1, col2);
SELECT col1, col2;
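If a reusable, function-like object is preferred for the turnover query itself, BigQuery also offers table functions (CREATE TABLE FUNCTION), which accept parameters and return a query result. The sketch below makes several assumptions: the dataset name mydataset and the function name turnover are hypothetical, HireDate and TermDate are assumed to be DATE columns, and the counting logic is copied from the COUNTIF query rather than independently verified.

```sql
-- Hypothetical dataset and function names: adjust to your project.
CREATE OR REPLACE TABLE FUNCTION mydataset.turnover(date_start DATE, date_end DATE)
AS (
  SELECT *,
         Terms / ((startheadcount + EndHeadcount) / 2) AS turnover
  FROM (
    SELECT
      COUNTIF(t.HireDate < date_start
              AND (t.TermDate >= date_end OR t.TermDate IS NULL)) AS startheadcount,
      COUNTIF(t.HireDate >= date_start
              AND t.HireDate <= date_end)                         AS NewHires,
      COUNTIF(t.TermDate >= date_start
              AND t.TermDate <= date_end)                         AS Terms,
      COUNTIF((t.HireDate < date_start AND t.TermDate >= date_end)
              OR t.Status = 'ACT')                                AS EndHeadcount
    FROM `XXX.Turnover.Test` t
  )
);

-- Enter the date range once, at call time.
SELECT *
FROM mydataset.turnover(DATE '2021-01-01', DATE '2021-05-01');
```

Compared with a procedure, a table function can be used directly in a FROM clause, so its result can be filtered or joined like any other table.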