How do I aggregate an integer over a grouped column when I only want some of the rows included?
I have a table prices holding all the prices of some products:
CREATE TABLE prices (
id INT,
product_id INT, /*Foreign key*/
created_at TIMESTAMP,
price INT
);
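The first entry for a product_id is its initial selling price. If the product is reduced afterwards, a new entry is added.
I want to find the mean and total price change per day across all products.
Here is some sample data: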
INSERT INTO prices (id, product_id, created_at, price) VALUES (1, 1, '2020-01-01', 11000);
INSERT INTO prices (id, product_id, created_at, price) VALUES (2, 2, '2020-01-01', 3999);
INSERT INTO prices (id, product_id, created_at, price) VALUES (3, 3, '2020-01-01', 9999);
INSERT INTO prices (id, product_id, created_at, price) VALUES (4, 4, '2020-01-01', 2000);
INSERT INTO prices (id, product_id, created_at, price) VALUES (5, 1, '2020-01-02', 9999);
INSERT INTO prices (id, product_id, created_at, price) VALUES (6, 2, '2020-01-02', 2999);
INSERT INTO prices (id, product_id, created_at, price) VALUES (7, 5, '2020-01-02', 2999);
INSERT INTO prices (id, product_id, created_at, price) VALUES (8, 1, '2020-01-03', 8999);
INSERT INTO prices (id, product_id, created_at, price) VALUES (9, 1, '2020-01-03 10:00:00', 7000);
INSERT INTO prices (id, product_id, created_at, price) VALUES (10, 5, '2020-01-03', 4000);
INSERT INTO prices (id, product_id, created_at, price) VALUES (11, 6, '2020-01-03', 3999);
INSERT INTO prices (id, product_id, created_at, price) VALUES (12, 3, '2020-01-03', 6999);
The expected result should be:
date        mean_price_change  total_price_change
2020-01-01  0                  0
2020-01-02  1000.5             2001
2020-01-03  1666               4998
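Explanation:
- The mean and total price reduction on 2020-01-01 is 0, because all products are new on that date.
- On 2020-01-02, however, the mean price change is (11000-9999 + 3999-2999)/2 = 1000.5, because product_id 1 and 2 were reduced to 9999 and 2999 that day, from previous prices of 11000 and 3999. The total reduction is (11000-9999 + 3999-2999) = 2001.
- On 2020-01-03 only product_id 1, 3 and 5 changed. 1 changed twice that day: 9999 => 8999 => 7000 (the last one counts); 3 went from 9999 => 6999; and 5 went up from 2999 => 4000. Total: (9999-7000 + 9999-6999 + 2999-4000) = 4998. Mean price reduction that day: 1666.
I've also added the data here: https://www.db-fiddle.com/f/tJgoKFMJxcyg5gLDZMEP77/1
I've tried playing around with DISTINCT ON, but it doesn't seem to do the trick...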
Try this:
select
    created_at,
    avg(change),
    sum(change)
from
(
    with cte as
    (
        select
            id,
            product_id,
            created_at,
            lag(created_at) over(order by product_id, created_at) as last_date,
            price
        from prices
    )
    select
        c.id,
        c.product_id,
        c.created_at,
        c.last_date,
        p.price as last_price,
        c.price,
        COALESCE(p.price - c.price,0) as change
    from cte c
    left join prices p on c.product_id =p.product_id and c.last_date =p.created_at
    where p.price != c.price or p.price is null
) tmp
group by created_at
order by created_at
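The query below tracks all price changes. Note that we join the current price (aliased today) and an earlier price on these conditions:
- their product is the same
- earlier is indeed earlier than current
- earlier is the latest entry before the current date
- current is the latest entry on its day
select today.product_id,
       today.created_at,
       -- positive values are reductions, as in the expected result
       (earlier.price - today.price) as difference
from prices today
join prices earlier
  on today.product_id = earlier.product_id
 and earlier.created_at::date < today.created_at::date
where not exists (
    select 1
    from prices later
    where later.product_id = today.product_id and
    (
        -- a newer entry on the same day as today
        (later.created_at::date = today.created_at::date and later.created_at > today.created_at) or
        -- an entry between earlier and today's date
        (later.created_at > earlier.created_at and later.created_at::date < today.created_at::date)
    )
);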
Now, let's do some aggregation. The left join keeps new products, whose difference is NULL and is therefore ignored by avg() and sum():
select today.created_at::date as created_at,
       coalesce(avg(earlier.price - today.price), 0) as mean,
       coalesce(sum(earlier.price - today.price), 0) as total
from prices today
left join prices earlier
  on today.product_id = earlier.product_id
 and earlier.created_at::date < today.created_at::date
where not exists (
    select 1
    from prices later
    where later.product_id = today.product_id and
    (
        (later.created_at::date = today.created_at::date and later.created_at > today.created_at) or
        (later.created_at > earlier.created_at and later.created_at::date < today.created_at::date)
    )
)
group by today.created_at::date
order by today.created_at::date;
You seem to want lag() and aggregation:
select created_at, avg(prev_price - price), sum(prev_price - price)
from (select p.*, lag(price) over (partition by product_id order by created_at) as prev_price
from prices p
) p
group by created_at
order by created_at;
Product 1 has two prices on 2020-01-03. Once I fix that, I get the same results as in your question. Here is a db<>fiddle.
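To check which products have more than one price on the same day, a quick diagnostic along these lines may help. This is only a sketch for PostgreSQL; the aliases price_date and entries are names I made up for illustration.

```sql
-- List every product/day combination that has more than one price entry.
-- price_date and entries are illustrative aliases, not part of the original schema.
select product_id,
       created_at::date as price_date,
       count(*) as entries
from prices
group by product_id, created_at::date
having count(*) > 1
order by product_id, price_date;
```

On the sample data this returns a single row: product 1 on 2020-01-03 with 2 entries.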
EDIT:
To handle multiple prices per day:
select created_at::date, avg(prev_price - price), sum(prev_price - price)
from (select p.*, lag(price) over (partition by product_id order by created_at) as prev_price
      from (select distinct on (product_id, created_at::date) p.*
            from prices p
            -- created_at desc keeps the latest price of each day
            order by product_id, created_at::date, created_at desc
           ) p
     ) p
group by created_at::date
order by created_at::date;
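If you also want the column names from the question and 0 instead of NULL on days where every product is new, the same idea can be written with CTEs. This is a sketch under the assumption that PostgreSQL is used (DISTINCT ON and the :: cast are PostgreSQL-specific); the names last_per_day, with_prev and price_date are my own and not from the question.

```sql
-- Sketch for PostgreSQL; last_per_day, with_prev and price_date are illustrative names.
with last_per_day as (
    -- keep only the latest row per product and calendar day
    select distinct on (product_id, created_at::date)
           product_id,
           created_at::date as price_date,
           price
    from prices
    order by product_id, created_at::date, created_at desc, id desc
),
with_prev as (
    -- closing price of the product's previous day (NULL for a new product)
    select price_date,
           price,
           lag(price) over (partition by product_id order by price_date) as prev_price
    from last_per_day
)
select price_date,
       -- avg() and sum() skip NULLs, so new products do not count;
       -- coalesce turns an all-NULL day into 0
       coalesce(avg(prev_price - price), 0) as mean_price_change,
       coalesce(sum(prev_price - price), 0) as total_price_change
from with_prev
group by price_date
order by price_date;
```

On the sample data this gives 0 / 0 for 2020-01-01, 1000.5 / 2001 for 2020-01-02 and 1666 / 4998 for 2020-01-03, although avg() returns a numeric and may be displayed with trailing zeros.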