Finding categories that make up 70% of total turnover in SQL

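I want to find the categories that make up a given share of sales and segment them by that percentage in SQL. To do this, I first have to sort them by revenue in descending order and then select the top N%. For example, with a total revenue of 20M: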

Category Revenue
1        6.000.000
2        4.000.000
3        4.000.000
4        3.000.000
5        1.500.000
6        500.000
7        400.000
8        300.000
9        200.000
10       100.000
        
Total    20.000.000

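- Categories making up the first 70% of revenue (14M) - Segment A

- Categories making up the next 15% (3M) - Segment B

- Categories making up the next 10% (2M) - Segment C

- Categories making up the last 5% (1M) - Segment D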

So the segmentation should look like this:

Category Segment
1        A
2        A
3        A
4        B
5        C
6        C
7        D
8        D
9        D
10       D
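
The expected segmentation follows from the running total of revenue: the cumulative cut-offs are 70%, 85% and 95% of the total (14M, 17M and 19M here). A minimal Python sketch of that arithmetic, using only the sample figures above; the names `revenues`, `thresholds` and `segment_for` are illustrative and not part of the original post:

```python
from itertools import accumulate

# Sample data from the question: category -> revenue, already sorted descending.
revenues = {1: 6_000_000, 2: 4_000_000, 3: 4_000_000, 4: 3_000_000,
            5: 1_500_000, 6: 500_000, 7: 400_000, 8: 300_000,
            9: 200_000, 10: 100_000}
total = sum(revenues.values())

# Cumulative cut-offs in percent: A is the first 70%, B the next 15%, C the next 10%.
thresholds = [(70, "A"), (85, "B"), (95, "C")]


def segment_for(running_total: int) -> str:
    """Return the first segment whose cumulative cut-off the running total fits under."""
    for pct, label in thresholds:
        # Integer comparison avoids floating-point rounding exactly at a boundary.
        if running_total * 100 <= pct * total:
            return label
    return "D"


# Pair each category with the running total up to and including it.
segments = {
    category: segment_for(running)
    for category, running in zip(revenues, accumulate(revenues.values()))
}
print(segments)
```

Categories 3, 4 and 6 land exactly on a cut-off (14M, 17M and 19M), so whether the comparison is `<=` or `<` decides their segment.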
        
 

I'm sure there is a simpler way to get this result, but here is a [rather long] query that classifies the categories according to your logic:

with 
q as (
  select *,
    sum(revenue) over(order by revenue desc) as acc_revenue,
    sum(revenue) over() as tot_revenue
  from t
),
a as (
  select * from q where acc_revenue <= 0.7 * tot_revenue
),
b as (
  select * 
  from q
  where acc_revenue - (select max(acc_revenue) from a) <= 0.15 * tot_revenue
    and category not in (select category from a)
),
c as (
  select *
  from q
  where acc_revenue - (select max(acc_revenue) from b) <= 0.10 * tot_revenue
    and category not in (select category from a)
    and category not in (select category from b)
),
d as (
  select *
  from q
  where category not in (select category from a)
    and category not in (select category from b)
    and category not in (select category from c)
)
select *, 'A' as segment from a
union all select *, 'B' from b
union all select *, 'C' from c
union all select *, 'D' from d

Result:

 category  revenue  acc_revenue  tot_revenue  segment 
 --------- -------- ------------ ------------ ------- 
 1         6000000  6000000      20000000     A       
 2         4000000  14000000     20000000     A       
 3         4000000  14000000     20000000     A       
 4         3000000  17000000     20000000     B       
 5         1500000  18500000     20000000     C       
 6         500000   19000000     20000000     C       
 7         400000   19400000     20000000     D       
 8         300000   19700000     20000000     D       
 9         200000   19900000     20000000     D       
 10        100000   20000000     20000000     D       

See the running example at DB Fiddle.
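
Note that categories 2 and 3 both show an acc_revenue of 14000000 in the result. When a window has an ORDER BY but no explicit frame, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so rows tied on revenue are summed together. The sketch below contrasts this with a row-by-row running total. It assumes Python's bundled SQLite (3.25 or later, for window functions), which may not be the engine used in the fiddle; the column names `acc_range` and `acc_rows` are made up for the illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (category INTEGER, revenue INTEGER)")
# First four rows of the question's sample data; categories 2 and 3 are tied.
conn.executemany(
    "INSERT INTO t (category, revenue) VALUES (?, ?)",
    [(1, 6000000), (2, 4000000), (3, 4000000), (4, 3000000)],
)

rows = conn.execute(
    """
    SELECT category,
           -- Default RANGE frame: tied rows share one running total.
           SUM(revenue) OVER (ORDER BY revenue DESC) AS acc_range,
           -- Explicit ROWS frame plus a tie-breaker: one step per row.
           SUM(revenue) OVER (ORDER BY revenue DESC, category
                              ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS acc_rows
    FROM t
    ORDER BY revenue DESC, category
    """
).fetchall()

for row in rows:
    print(row)
```

With the default frame, category 2 already reports 14000000; with the ROWS frame it reports 10000000 and only category 3 reaches 14000000. For this sample both variants put categories 2 and 3 in segment A, but with ties that straddle a cut-off the two could disagree.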

This answer uses window functions, like the previous answer, but with a single CASE expression instead of multiple CTEs.

WITH
cte AS (
SELECT category, revenue,
    sum(revenue) OVER(ORDER BY revenue DESC, category)*1. /*multiply by 1. (or use CAST) to avoid integer truncation in immediate next step*/
        /sum(revenue) OVER() RunTtlPct /*Running total of sales, as a percent of the grand total*/
FROM t)

SELECT category, revenue,
    CASE
        WHEN RunTtlPct <= 0.7 THEN 'A'
        WHEN RunTtlPct <= 0.85 THEN 'B'
        WHEN RunTtlPct <= 0.95 THEN 'C'
        ELSE 'D'
    END Segment
FROM cte /*cte was included to avoid repeating RunTtlPct's expression in every WHEN clause.*/
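
To try the CASE-based query without a database server, here is a sketch that loads the question's sample data into an in-memory SQLite database through Python's sqlite3 module. It assumes SQLite 3.25 or later (for window functions), writes `1.` as `1.0`, and adds a final ORDER BY for stable output; the table name `t` follows the answers, while `sample` and `result` are illustrative names:

```python
import sqlite3

# Sample data from the question: (category, revenue).
sample = [(1, 6000000), (2, 4000000), (3, 4000000), (4, 3000000),
          (5, 1500000), (6, 500000), (7, 400000), (8, 300000),
          (9, 200000), (10, 100000)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (category INTEGER PRIMARY KEY, revenue INTEGER)")
conn.executemany("INSERT INTO t (category, revenue) VALUES (?, ?)", sample)

query = """
WITH cte AS (
    SELECT category, revenue,
           -- Running total as a fraction of the grand total; * 1.0 forces
           -- real division instead of integer division.
           SUM(revenue) OVER (ORDER BY revenue DESC, category) * 1.0
               / SUM(revenue) OVER () AS RunTtlPct
    FROM t
)
SELECT category,
       CASE
           WHEN RunTtlPct <= 0.7  THEN 'A'
           WHEN RunTtlPct <= 0.85 THEN 'B'
           WHEN RunTtlPct <= 0.95 THEN 'C'
           ELSE 'D'
       END AS Segment
FROM cte
ORDER BY revenue DESC, category
"""

result = conn.execute(query).fetchall()
for category, segment in result:
    print(category, segment)
```

On the sample data this reproduces the segmentation requested in the question (A for categories 1 to 3, B for 4, C for 5 and 6, D for the rest). Because the cut-offs are compared as floating-point fractions, rows that land exactly on a boundary are worth checking on the target database.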