How to count distinct values in different time windows in AWS Redshift
SELECT count(DISTINCT(c.visitid)) as count1,
t.prod
FROM x.t1 c
JOIN y.t1 t
ON c.headingid = t.prod_heading_id
WHERE
c.eventtimestamp BETWEEN '2021-01-01' AND '2021-04-02'
AND c.evaluation > 0
GROUP BY t.prod
ORDER BY count1 DESC
LIMIT 100
I also have a second time window, from '2020-01-01' to '2020-04-02', and I want the same per-group distinct count for it as count2.
You can use conditional aggregation:
SELECT count(DISTINCT c.visitid) filter (where c.eventtimestamp BETWEEN '2021-01-01' AND '2021-04-02') as cnt1,
count(DISTINCT c.visitid) filter (where c.eventtimestamp BETWEEN '2020-01-01' AND '2020-04-02') as cnt2,
t.prod
FROM x.t1 c JOIN
y.t1 t
ON c.headingid = t.prod_heading_id
WHERE c.evaluation > 0 AND
c.eventtimestamp BETWEEN '2020-01-01' AND '2021-04-02'
GROUP BY t.prod
ORDER BY cnt1 DESC;
In Redshift (or many other databases) where the FILTER clause is not available, the syntax is:
SELECT count(DISTINCT case when c.eventtimestamp BETWEEN '2021-01-01' AND '2021-04-02' then c.visitid end) as cnt1,
count(DISTINCT case when c.eventtimestamp BETWEEN '2020-01-01' AND '2020-04-02' then c.visitid end) as cnt2,
t.prod
FROM x.t1 c JOIN
y.t1 t
ON c.headingid = t.prod_heading_id
WHERE c.evaluation > 0 AND
c.eventtimestamp BETWEEN '2020-01-01' AND '2021-04-02'
GROUP BY t.prod
ORDER BY cnt1 DESC;
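The CASE form works because a CASE with no ELSE returns NULL for rows outside its window, and COUNT(DISTINCT ...) ignores NULLs. Below is a minimal sketch for checking that behaviour on toy data. It assumes SQLite (through Python's sqlite3 module) as a local stand-in for Redshift, and the table names `visits` and `products` and all rows are invented for illustration, replacing x.t1 and y.t1.

```python
import sqlite3

# Toy stand-ins for x.t1 (visits) and y.t1 (products); names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visits (visitid INTEGER, headingid INTEGER, eventtimestamp TEXT, evaluation INTEGER);
CREATE TABLE products (prod_heading_id INTEGER, prod TEXT);
INSERT INTO products VALUES (1, 'A'), (2, 'B');
INSERT INTO visits VALUES
  (10, 1, '2021-01-15', 1),
  (10, 1, '2021-02-01', 1),  -- same visit twice in the 2021 window: counted once
  (11, 1, '2021-03-01', 1),
  (12, 1, '2020-02-01', 1),
  (13, 2, '2020-03-01', 1),
  (14, 2, '2020-03-05', 0),  -- evaluation = 0: removed by the WHERE clause
  (15, 2, '2020-06-01', 1);  -- between the two windows: CASE yields NULL in both
""")

query = """
SELECT t.prod,
       COUNT(DISTINCT CASE WHEN c.eventtimestamp BETWEEN '2021-01-01' AND '2021-04-02'
                           THEN c.visitid END) AS cnt1,
       COUNT(DISTINCT CASE WHEN c.eventtimestamp BETWEEN '2020-01-01' AND '2020-04-02'
                           THEN c.visitid END) AS cnt2
FROM visits c
JOIN products t ON c.headingid = t.prod_heading_id
WHERE c.evaluation > 0
  AND c.eventtimestamp BETWEEN '2020-01-01' AND '2021-04-02'
GROUP BY t.prod
ORDER BY cnt1 DESC
"""

rows = conn.execute(query).fetchall()
print(rows)  # [('A', 2, 1), ('B', 0, 1)]
```

Visit 15 passes the outer WHERE range but falls in neither window, so it contributes to neither count. The outer range only trims the rows scanned; the CASE expressions decide which window each row belongs to.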