Postgres recursive CTE or crosstab function
I am trying to generate some user statistics from a table that contains logging information.
**TABLE users**
user_id | user_name
-------------------
1 | julia
2 | bob
3 | sebastian
**TABLE logs**
user_id | action | timepoint
------------------------------------
1 | create_quote | 2015-01-01
1 | send_quote | 2015-02-03
1 | create_quote | 2015-02-02
1 | start_job | 2015-01-15
2 | start_job | 2015-02-23
2 | send_quote | 2015-03-04
2 | start_job | 2014-12-02
The output I want is the following table:
user_id | username  | create_quote | send_quote | start_job
-----------------------------------------------------------
1       | julia     | 2            | 1          | 1
2       | bob       | 0            | 1          | 1
3       | sebastian | 0            | 0          | 0
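It includes all users (even those with nothing logged), but only actions between the dates '2015-01-01' and '2015-05-31'. The actions are counted and grouped by action type and by user.
The SQL statement could look something like: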
SELECT * FROM myfunction() WHERE to_char(timepoint, 'YY/MM') BETWEEN '15/01' AND '15/05';
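Do you have any idea how to manage this? I have been experimenting with CTEs and recursion as well as with the crosstab function, but could not find a solution.
I think the crosstab function would be more elegant, but in case you don't have the extension loaded or, like me, struggle with its syntax, here is a clunky, brute-force way you could do it: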
CREATE OR REPLACE FUNCTION get_stats(
    from_date date,
    thru_date date)
  RETURNS table (
    user_id integer,
    username text,
    create_quote bigint,
    send_quote bigint,
    start_job bigint
  ) AS
$BODY$
  -- one row per user, one conditional sum per action type
  select
    l.user_id, u.username,
    sum (case when action = 'create_quote' then 1 else 0 end) as create_quote,
    sum (case when action = 'send_quote' then 1 else 0 end) as send_quote,
    sum (case when action = 'start_job' then 1 else 0 end) as start_job
  from
    logs l
    -- inner join: users with no log rows in the range are not returned
    join users u on l.user_id = u.user_id
  where
    l.timepoint between from_date and thru_date
  group by
    l.user_id, u.username
$BODY$
  LANGUAGE sql VOLATILE
  COST 100
  ROWS 1000;
Then your query would be:
select * from get_stats('2015-01-01', '2015-05-31')
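Personally, I would skip the function and just write it as a query, but it's conceivable there are reasons why you would want the function wrapper.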
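For comparison, here is a rough sketch of what the crosstab route might look like. It assumes the `tablefunc` extension can be installed, that `logs.user_id` is an `integer`, and that the name column in `users` is called `username` (the sample table header spells it `user_name`, so adjust as needed). Treat it as an illustration of the two-argument form of `crosstab`, not a tested drop-in.

```sql
-- Assumption: you are allowed to install the tablefunc extension.
CREATE EXTENSION IF NOT EXISTS tablefunc;

SELECT
    u.user_id,
    u.username,
    -- users or actions missing from the pivot come back as NULL, so default to 0
    coalesce(ct.create_quote, 0) AS create_quote,
    coalesce(ct.send_quote, 0)   AS send_quote,
    coalesce(ct.start_job, 0)    AS start_job
FROM users u
LEFT JOIN crosstab(
    -- source query: row name, category, value; ordered by row name
    $$ SELECT user_id, action, count(*)
       FROM logs
       WHERE timepoint BETWEEN '2015-01-01' AND '2015-05-31'
       GROUP BY user_id, action
       ORDER BY 1, 2 $$,
    -- category query: fixes the set and order of the output columns
    $$ VALUES ('create_quote'), ('send_quote'), ('start_job') $$
) AS ct (user_id integer, create_quote bigint, send_quote bigint, start_job bigint)
    ON ct.user_id = u.user_id
ORDER BY u.user_id;
```

The outer `LEFT JOIN` from `users` is what keeps users without any log rows in the result. The column list after `AS ct` has to match the categories one for one, which is a large part of why the syntax feels awkward.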
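-- EDIT --
Based on an attempted edit, I see you may be okay with a plain query. Also, you want users with no entries to be included.
With all of that in mind, I think this might work: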
select
  u.user_id, u.username,
  sum (case when action = 'create_quote' then 1 else 0 end) as create_quote,
  sum (case when action = 'send_quote' then 1 else 0 end) as send_quote,
  sum (case when action = 'start_job' then 1 else 0 end) as start_job
from
  users u
  -- the date filter lives in the join condition, not in WHERE,
  -- so users with no matching log rows are still returned with zeros
  left join logs l on
    l.user_id = u.user_id and
    l.timepoint between '2015-01-01' and '2015-05-31'
group by
  u.user_id, u.username
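If you are on PostgreSQL 9.4 or later, the same query can probably be written a little more compactly with the aggregate `FILTER` clause instead of `sum(case ...)`. This is only a sketch under the same assumptions as above (table and column names as used in this answer):

```sql
SELECT
    u.user_id,
    u.username,
    -- count only the joined log rows whose action matches; unmatched users get 0
    count(*) FILTER (WHERE l.action = 'create_quote') AS create_quote,
    count(*) FILTER (WHERE l.action = 'send_quote')   AS send_quote,
    count(*) FILTER (WHERE l.action = 'start_job')    AS start_job
FROM users u
LEFT JOIN logs l
       ON l.user_id = u.user_id
      AND l.timepoint BETWEEN '2015-01-01' AND '2015-05-31'
GROUP BY u.user_id, u.username
ORDER BY u.user_id;
```

For a user with no log rows in the range, the left join produces a single row with `l.action` set to NULL, so every filter rejects it and each count is 0. With the sample data this should give 2/1/1 for julia, 0/1/1 for bob and 0/0/0 for sebastian.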