SQL get products and messages group by

I need to write an Oracle SQL query. I have two tables, products and messages. The products table looks like this:

product_id creation_date user_id category_id
p1 2017-03-01 u1 c1
p2 2018-05-23 u1 c3
p3 2019-06-21 u2 c1

The messages table looks like this:

message_id creation_date product_id user_from
m1 2018-03-01 p1 u2
m2 2019-08-19 p1 u5
m3 2020-10-10 p3 u7

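I want to list all products in a category, ordered by their total number of messages, together with the top 5 buyers for each product (the users who contacted that product, ordered by the total number of messages they sent).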

Example output table:

category_id product_id total_messages_for_product user_id messages
c1 p1 200 u1 10
c1 p1 200 u2 9
c1 p1 200 u3 7
c1 p1 200 u4 5
c1 p1 200 u5 4
c1 p2 150 u7 11
c1 p2 150 u8 10
c1 p2 150 u9 9
c1 p2 150 u10 7
c1 p2 150 u4 6

Something like this (not tested, since you didn't provide test data in a usable form):

with
  agg (product_id, user_from, messages, total_messages_for_product) as (
    select product_id, user_from, count(*),
           sum(count(*)) over (partition by product_id)
    from   messages
    group  by product_id, user_from
  )
select p.category_id, a.product_id, a.total_messages_for_product,
       a.user_from, a.messages
from   products p join agg a on p.product_id = a.product_id
order  by a.product_id, a.user_from   --  if/as needed
;

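Most of the work is done in the agg subquery (which uses only the messages table). Note the use of the analytic sum() function, partitioned by product, to get the total number of messages per product. The category_id is then obtained by joining to the products table.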
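To see what the nested aggregate does on its own, here is a minimal sketch that runs the agg logic against the three sample rows from the question. The rows are inlined in a WITH clause purely for illustration, so no tables need to exist. COUNT(*) is evaluated per (product_id, user_from) group first, and the analytic SUM then adds those counts up across each product.

```sql
-- Sample rows from the question, inlined so the query runs without creating tables.
WITH messages (message_id, creation_date, product_id, user_from) AS (
  SELECT 'm1', DATE '2018-03-01', 'p1', 'u2' FROM DUAL UNION ALL
  SELECT 'm2', DATE '2019-08-19', 'p1', 'u5' FROM DUAL UNION ALL
  SELECT 'm3', DATE '2020-10-10', 'p3', 'u7' FROM DUAL
)
SELECT product_id,
       user_from,
       COUNT(*) AS messages,                    -- messages per (product, user)
       SUM(COUNT(*)) OVER (PARTITION BY product_id)
         AS total_messages_for_product          -- sum of those counts per product
FROM   messages
GROUP  BY product_id, user_from
ORDER  BY product_id, user_from;
```

With this sample data, p1 has one message each from u2 and u5, so both of its rows carry a product total of 2, and the single p3 row carries a total of 1.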

You appear to want:

SELECT p.category_id,
       p.product_id,
       m.total_messages_for_product,
       m.user_from AS user_id,
       m.messages
FROM   products p
       INNER JOIN (
         SELECT product_id,
                user_from,
                COUNT(*) AS messages,
                SUM( COUNT(*) ) OVER ( PARTITION BY product_id )
                  AS total_messages_for_product,
                RANK() OVER (
                  PARTITION BY product_id ORDER BY COUNT(*) DESC
                ) AS messages_rank
         FROM   messages
         GROUP BY product_id, user_from
       ) m
       ON ( p.product_id = m.product_id )
WHERE  m.messages_rank <= 5;

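(Note: you can use ROW_NUMBER instead of RANK to get exactly the first 5 entries; RANK returns the top 5 entries with ties, so it may return more than 5 rows per product.)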
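A sketch of the ROW_NUMBER variant, assuming the same products and messages tables. The secondary sort on user_from is an assumption added here only to make the choice among tied users deterministic; any other tie-breaker would work as well.

```sql
SELECT p.category_id,
       p.product_id,
       m.total_messages_for_product,
       m.user_from AS user_id,
       m.messages
FROM   products p
       INNER JOIN (
         SELECT product_id,
                user_from,
                COUNT(*) AS messages,
                SUM( COUNT(*) ) OVER ( PARTITION BY product_id )
                  AS total_messages_for_product,
                -- ROW_NUMBER assigns 1, 2, 3, ... with no repeats, so at most
                -- 5 rows per product pass the filter below.
                -- The user_from tie-breaker is an illustrative assumption.
                ROW_NUMBER() OVER (
                  PARTITION BY product_id
                  ORDER BY COUNT(*) DESC, user_from
                ) AS messages_rn
         FROM   messages
         GROUP BY product_id, user_from
       ) m
       ON ( p.product_id = m.product_id )
WHERE  m.messages_rn <= 5;
```

The trade-off is that when several users are tied at the cut-off, ROW_NUMBER drops some of them arbitrarily (here, by user_from), whereas RANK keeps them all.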

Which, for your sample data:

CREATE TABLE products ( product_id, creation_date, user_id, category_id ) AS
SELECT 'p1', DATE '2017-03-01', 'u1', 'c1' FROM DUAL UNION ALL
SELECT 'p2', DATE '2018-05-23', 'u1', 'c3' FROM DUAL UNION ALL
SELECT 'p3', DATE '2019-06-21', 'u2', 'c1' FROM DUAL;

CREATE TABLE messages( message_id, creation_date, product_id, user_from ) AS
SELECT 'm1', DATE '2018-03-01', 'p1', 'u2' FROM DUAL UNION ALL
SELECT 'm2', DATE '2019-08-19', 'p1', 'u5' FROM DUAL UNION ALL
SELECT 'm3', DATE '2020-10-10', 'p3', 'u7' FROM DUAL;

Outputs:

CATEGORY_ID | PRODUCT_ID | TOTAL_MESSAGES_FOR_PRODUCT | USER_ID | MESSAGES
:---------- | :--------- | -------------------------: | :------ | -------:
c1          | p1         |                          2 | u5      |        1
c1          | p1         |                          2 | u2      |        1
c1          | p3         |                          1 | u7      |        1

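The question also asks for a single category and for products sorted by their total message count, which the query above leaves unspecified. One possible way to add both, sketched against the same sample tables: the literal 'c1' stands in for whichever category is wanted, and the exact sort keys after the total are an assumption about the desired order.

```sql
SELECT p.category_id,
       p.product_id,
       m.total_messages_for_product,
       m.user_from AS user_id,
       m.messages
FROM   products p
       INNER JOIN (
         SELECT product_id,
                user_from,
                COUNT(*) AS messages,
                SUM( COUNT(*) ) OVER ( PARTITION BY product_id )
                  AS total_messages_for_product,
                RANK() OVER (
                  PARTITION BY product_id ORDER BY COUNT(*) DESC
                ) AS messages_rank
         FROM   messages
         GROUP BY product_id, user_from
       ) m
       ON ( p.product_id = m.product_id )
WHERE  m.messages_rank <= 5
AND    p.category_id = 'c1'                     -- placeholder category
ORDER  BY m.total_messages_for_product DESC,    -- busiest products first
          p.product_id,                         -- keeps each product's rows together
          m.messages DESC,                      -- most active buyers first
          m.user_from;                          -- stable order among ties
```

Because the join is an inner join, products with no messages at all (p2 in the sample data) do not appear; a LEFT OUTER JOIN from products would be needed to list them.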