Select where cumulative sum is less than a number (in order of priority)

Question

I have a table with id, cost, and priority columns:

create table a_test_table (id number(4,0), cost number(15,2), priority number(4,0));

insert into a_test_table (id, cost, priority) values (1, 1000000, 10);
insert into a_test_table (id, cost, priority) values (2, 10000000, 9);
insert into a_test_table (id, cost, priority) values (3, 5000000, 8);
insert into a_test_table (id, cost, priority) values (4, 19000000, 7);
insert into a_test_table (id, cost, priority) values (5, 20000000, 6);
insert into a_test_table (id, cost, priority) values (6, 15000000, 5);
insert into a_test_table (id, cost, priority) values (7, 2000000, 4);
insert into a_test_table (id, cost, priority) values (8, 3000000, 3);
insert into a_test_table (id, cost, priority) values (9, 3000000, 2);
insert into a_test_table (id, cost, priority) values (10, 8000000, 1);
commit;

select 
    id,
    to_char(cost, '$999,999,999') as cost,
    priority
from 
    a_test_table;

        ID COST            PRIORITY
---------- ------------- ----------
         1    $1,000,000         10
         2   $10,000,000          9
         3    $5,000,000          8
         4   $19,000,000          7
         5   $20,000,000          6
         6   $15,000,000          5
         7    $2,000,000          4
         8    $3,000,000          3
         9    $3,000,000          2
        10    $8,000,000          1

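Starting with the highest priority (descending), I want to select the rows whose cost adds up to no more than $20,000,000. A row that does not fit is skipped, and lower-priority rows that still fit are picked up.

The result would look like this: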

        ID COST            PRIORITY
---------- ------------- ----------
         1    $1,000,000         10
         2   $10,000,000          9
         3    $5,000,000          8
         7    $2,000,000          4

      Total: $18,000,000
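
For comparison, a plain analytic running total is not enough on its own. The sketch below is not a solution, and the alias running_cost is only an illustrative name. Once the running total passes the limit at id 4 it never comes back down, so the query returns ids 1, 2 and 3 (16,000,000 in total) and misses id 7, which would still fit:

```sql
-- Naive attempt: cumulative sum in priority order, then filter on it.
-- Rows after the first one that does not fit can never qualify,
-- because the running total keeps counting the skipped rows.
select id, cost, priority
from (
    select id, cost, priority,
           sum(cost) over (order by priority desc) as running_cost
    from a_test_table
)
where running_cost <= 20000000
order by priority desc;
```

The running total has to ignore the rows that were skipped, which is why the answers below use either a procedural loop or a recursive query.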

How can I do this with Oracle SQL?

Answer 1

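I'm too stupid to do this in plain SQL, so I tried PL/SQL: a function that returns a table. Here's how it works: loop through all the rows in the table in descending priority order and keep a running sum. If adding a row keeps the sum at or below the limit, fine: add the row's ID to an array and go on. Otherwise, skip the row and leave the sum as it was.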

SQL> create or replace function f_pri (par_limit in number)
       return sys.odcinumberlist
     is
       l_sum number := 0;                                  -- running total of accepted rows
       l_arr sys.odcinumberlist := sys.odcinumberlist();   -- IDs of accepted rows
     begin
       for cur_r in (select id, cost, priority
                     from a_test_table
                     order by priority desc
                    )
       loop
         l_sum := l_sum + cur_r.cost;
         if l_sum <= par_limit then
            -- the row fits: remember its ID
            l_arr.extend;
            l_arr(l_arr.last) := cur_r.id;
         else
            -- the row does not fit: undo the addition and skip it
            l_sum := l_sum - cur_r.cost;
         end if;
       end loop;
       return (l_arr);
     end;
     /

Function created.

Preparing the SQL*Plus environment so that the output looks prettier:

SQL> break on report
SQL> compute sum of cost on report
SQL> set ver off

Testing:

SQL> select t.id, t.cost, t.priority
     from table(f_pri(&par_limit)) x join a_test_table t on t.id = x.column_value
     order by t.priority desc;
Enter value for par_limit: 20000000

        ID       COST   PRIORITY
---------- ---------- ----------
         1    1000000         10
         2   10000000          9
         3    5000000          8
         7    2000000          4
           ----------
sum          18000000

SQL> /
Enter value for par_limit: 30000000

        ID       COST   PRIORITY
---------- ---------- ----------
         1    1000000         10
         2   10000000          9
         3    5000000          8
         7    2000000          4
         8    3000000          3
         9    3000000          2
           ----------
sum          24000000

6 rows selected.

SQL>
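
The function can also be called with a literal limit instead of the &par_limit substitution variable, which may be handier outside SQL*Plus. A minimal sketch, assuming f_pri has been created as shown above; the aliases picked_rows and total_cost are made up for this example:

```sql
-- Summarize what f_pri selects for a 20,000,000 limit.
-- column_value is the pseudo-column exposed by table() over sys.odcinumberlist.
select count(*)    as picked_rows,
       sum(t.cost) as total_cost
from table(f_pri(20000000)) x
     join a_test_table t on t.id = x.column_value;
```

With the sample data this should agree with the first test run above: 4 rows with a total cost of 18000000.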

Answer 2

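Here is a way to do it in pure SQL. I won't swear there isn't a better way.

Basically, it uses a recursive common table expression (i.e., the WITH costed... part) to compute every possible combination of elements, taken in descending priority order, whose total is at most 20,000,000.

Then it takes the first complete path from that result, i.e., the first path that cannot be extended by any further row.

Then it fetches all the rows in that path.

NOTE: the logic assumes that no id is longer than 5 digits. That's the LPAD(id,5,'0') stuff.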

WITH costed (id, cost, priority, running_cost, path) as 
( SELECT id, cost, priority, cost running_cost, lpad(id,5,'0') path
  FROM   a_test_table
  WHERE  cost <= 20000000
  UNION ALL 
  SELECT a.id, a.cost, a.priority, a.cost + costed.running_Cost, costed.path || '|' || lpad(a.id,5,'0')
  FROM   costed, a_test_table a 
  WHERE  a.priority < costed.priority
  AND    a.cost + costed.running_cost <= 20000000),
best_path as (  
SELECT *
FROM   costed c 
where not exists ( SELECT 'longer path' FROM costed c2 WHERE c2.path like c.path || '|%' )
order by path
fetch first 1 row only )
SELECT att.* 
FROM best_path cross join a_test_table att
WHERE best_path.path like '%' || lpad(att.id,5,'0') || '%'
order by att.priority desc;

+----+----------+----------+
| ID |   COST   | PRIORITY |
+----+----------+----------+
|  1 |  1000000 |       10 |
|  2 | 10000000 |        9 |
|  3 |  5000000 |        8 |
|  7 |  2000000 |        4 |
+----+----------+----------+
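
To see what the recursive part produces, it can help to look at a slice of costed directly. The sketch below reuses the CTE from the query above unchanged and only swaps the final SELECT; the LIKE filter on paths starting with ids 1, 2 and 3 is just an illustrative choice:

```sql
-- Inspect the candidate paths that start with ids 1, 2 and 3.
WITH costed (id, cost, priority, running_cost, path) as 
( SELECT id, cost, priority, cost running_cost, lpad(id,5,'0') path
  FROM   a_test_table
  WHERE  cost <= 20000000
  UNION ALL 
  SELECT a.id, a.cost, a.priority, a.cost + costed.running_cost, costed.path || '|' || lpad(a.id,5,'0')
  FROM   costed, a_test_table a 
  WHERE  a.priority < costed.priority
  AND    a.cost + costed.running_cost <= 20000000)
SELECT path, running_cost
FROM   costed
WHERE  path like '00001|00002|00003%'
ORDER BY path;
```

Working through the sample data by hand, this should list four paths: 00001|00002|00003 (16,000,000) and its extensions by id 7 (18,000,000), id 8 (19,000,000) and id 9 (19,000,000). None of the three extensions can be extended further, so all of them are complete paths, and the one ending in 00007 sorts first. That is the path the best_path step picks.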

UPDATE - Shorter version

This version uses MATCH_RECOGNIZE to find all the rows in the best group following the recursive CTE:

WITH costed (id, cost, priority, running_cost, path) as 
( SELECT id, cost, priority, cost running_cost, lpad(id,5,'0') path
  FROM   a_test_table
  WHERE  cost <= 20000000
  UNION ALL 
  SELECT a.id, a.cost, a.priority, a.cost + costed.running_Cost, costed.path || '|' || lpad(a.id,5,'0')
  FROM   costed, a_test_table a 
  WHERE  a.priority < costed.priority
  AND    a.cost + costed.running_cost <= 20000000)
  search depth first by priority desc set ord
SELECT id, cost, priority
FROM   costed c 
MATCH_RECOGNIZE (
  ORDER BY path
  MEASURES
    MATCH_NUMBER() AS mno
  ALL ROWS PER MATCH
  PATTERN (STRT ADDON*)
  DEFINE
    ADDON AS ADDON.PATH = PREV(ADDON.PATH) || '|' || LPAD(ADDON.ID,5,'0')
    )
WHERE mno = 1
ORDER BY priority DESC;

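UPDATE - Even shorter version, using a clever idea from the SQL Server link that the OP posted

*Edit: removed the use of ROWNUM=1 in the anchor part of the recursive CTE, since it depended on the arbitrary order in which rows would be returned. I'm surprised no one dinged me on that.*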

WITH costed (id, cost, priority, running_cost) as 
( SELECT id, cost, priority, cost running_cost
  FROM   ( SELECT * FROM a_test_table
           WHERE  cost <= 20000000
           ORDER BY priority desc
           FETCH FIRST 1 ROW ONLY )
  UNION ALL 
  SELECT a.id, a.cost, a.priority, a.cost + costed.running_Cost
  FROM   costed CROSS APPLY ( SELECT b.*
                              FROM   a_test_table b 
                              WHERE  b.priority < costed.priority
                              AND    b.cost + costed.running_cost <= 20000000
                              FETCH FIRST 1 ROW ONLY
                              ) a
)
CYCLE id SET is_cycle TO 'Y' DEFAULT 'N'
select id, cost, priority from costed
order by priority desc;

Answer 3

@ypercubeᵀᴹ posted this solution in the DBA.SE chat. It's pretty concise.

with  rt (id, cost, running_total, priority) as
(
    (
    select 
        id,
        cost,
        cost as running_total,
        priority
    from 
        a_test_table
    where cost <= 20000000 
    order by priority desc
    fetch first 1 rows only
    )

    union all

        select 
            t.id,
            t.cost,
            t.cost + rt.running_total,
            t.priority
        from a_test_table  t
             join rt 
             on t.priority < rt.priority      -- redundant but throws
                                              -- "cycle detected error" if omitted

             and t.priority =                             -- needed 
                 ( select max(tm.priority) from a_test_table tm
                   where tm.priority < rt.priority
                     and tm.cost + rt.running_total <= 20000000 )
    )
    select *
    from rt ;

(@ypercubeᵀᴹ wasn't interested in posting it themselves.)
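
The correlated subquery is the heart of this solution: given the row just added, it finds the highest remaining priority that still fits. A single step can be tried on its own. In this sketch the literals 8 and 16000000 are hand-picked stand-ins for rt.priority and rt.running_total after ids 1, 2 and 3 have been taken, and next_priority is an illustrative alias:

```sql
-- One step of the recursion, with the current state hard-coded:
-- last accepted priority = 8, running total = 16,000,000.
select max(tm.priority) as next_priority
from a_test_table tm
where tm.priority < 8
  and tm.cost + 16000000 <= 20000000;
```

With the sample data the rows that qualify are ids 7, 8 and 9 (priorities 4, 3 and 2), so the step returns 4, which is id 7. When nothing fits any more, max() returns NULL, the join condition matches no row, and the recursion ends.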
