How to count over rows and avoid duplicates?

Question

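For a university project I have to calculate a KPI from the data in one table. The table stores data about supermarket shopping baskets, the items in each basket, and their product categories. I have to count all the product categories that were bought in a particular store. In the table it looks like this: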

StoreId   BasketID  CategoryId
1           1           1
1           1           2
1           1           3
1           2           1
1           2           3
1           2           4
2           3           1
2           3           2
2           3           3
2           4           1

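As the result of the query I want a table that counts the distinct product categories across all baskets associated with a store. Something like this: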

StoreId   Count(CategoryId)
1            4
2            3

If I write a non-dynamic statement with hard-coded values, it works:

select basket_hash, store_id, count(DISTINCT retailer_category_id)
from promo.checkout_item
where store_id = 2
  and basket_hash = 123
GROUP BY basket_hash, store_id;

But when I try to write it dynamically, SQL computes the count for each basket and adds the individual counts together:

select store_id,  Count(DISTINCT retailer_category_id) 
from promo.checkout_item
group by store_id;

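Written like this, it does not compare the categories across all baskets associated with a store, and I get duplicates, because a category can be in basket 1 as well as in basket 2.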

Can anybody help?!

Thanks!

Answer 1

Judging by your expected result, is the following statement what you want?

SELECT StoreId, COUNT(*)
FROM (
       SELECT DISTINCT StoreId, CategoryId
       FROM table_name
) AS t  -- alias needed for a subquery in FROM (PostgreSQL before version 16)
GROUP BY StoreId;

Replace "table_name" in the statement with the name of your table.

I am not sure what "dynamic way" means.
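
As a quick sanity check, here is a minimal sketch that runs a plain COUNT(DISTINCT ...) grouped by store against the sample rows from the question. The inline VALUES list and the CTE name checkout_item are assumptions for illustration (PostgreSQL syntax), standing in for the real promo.checkout_item table. On this data, grouping by store alone appears to deduplicate categories across baskets already:

```sql
-- Sample rows from the question, inlined so the query runs on its own.
-- Columns: store_id, basket_hash, retailer_category_id
with checkout_item (store_id, basket_hash, retailer_category_id) as (
    values
    (1,1,1),(1,1,2),(1,1,3),(1,2,1),(1,2,3),
    (1,2,4),(2,3,1),(2,3,2),(2,3,3),(2,4,1)
)
select store_id,
       -- DISTINCT is applied within each store, across all of its baskets
       count(distinct retailer_category_id) as distinct_categories
from checkout_item
group by store_id
order by store_id;
-- store_id | distinct_categories
--        1 |                   4
--        2 |                   3
```

If the real table returns larger numbers than this, it may be worth checking whether basket_hash (or another column) has slipped into the GROUP BY.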

Answer 2

I am confused by your requirements. This is what I think you mean:

with checkout_item (store_id, basket_hash, retailer_category_id) as (
    values 
    (1,1,1),(1,1,2),(1,1,3),(1,2,1),(1,2,3),
    (1,2,4),(2,3,1),(2,3,2),(2,3,3),(2,4,1)
)
select distinct store_id, basket_hash, store_cats, basket_cats
from (
    select store_id, basket_hash,
        max(store_cats) over (partition by store_id) as store_cats,
        max(basket_cats) over (partition by basket_hash) as basket_cats
    from (
        select store_id, basket_hash,
            dense_rank() over (
                partition by store_id
                order by retailer_category_id
            ) as store_cats,
            dense_rank() over (
                partition by basket_hash
                order by retailer_category_id
            ) as basket_cats
        from checkout_item
    ) s
) s
order by 1, 2
;
 store_id | basket_hash | store_cats | basket_cats 
----------+-------------+------------+-------------
        1 |           1 |          4 |           3
        1 |           2 |          4 |           3
        2 |           3 |          3 |           3
        2 |           4 |          3 |           1
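
The dense_rank() trick above is there because PostgreSQL does not allow DISTINCT inside a window function call. As an alternative sketch (not the approach used above), the same two counts can be computed with ordinary grouped aggregates and joined back together. The CTE names store_counts and basket_counts are made up for this example, and the per-basket count is grouped by store and basket, which matches the query above as long as a basket belongs to exactly one store:

```sql
with checkout_item (store_id, basket_hash, retailer_category_id) as (
    values
    (1,1,1),(1,1,2),(1,1,3),(1,2,1),(1,2,3),
    (1,2,4),(2,3,1),(2,3,2),(2,3,3),(2,4,1)
),
-- distinct categories per store, across all of its baskets
store_counts as (
    select store_id,
           count(distinct retailer_category_id) as store_cats
    from checkout_item
    group by store_id
),
-- distinct categories per basket
basket_counts as (
    select store_id, basket_hash,
           count(distinct retailer_category_id) as basket_cats
    from checkout_item
    group by store_id, basket_hash
)
select b.store_id, b.basket_hash, s.store_cats, b.basket_cats
from basket_counts b
join store_counts s on s.store_id = b.store_id
order by 1, 2;
```

On the sample rows this should return the same four rows as the result shown above.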

Tags: sql, postgresql, amazon-redshift