How to loop through each row of every group (while doing "group by") in Oracle table

Question

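I have a table like this:

I want to group the table by the "customer_id" column and compute a "day - day[0]" column. Here "day" is the "Day" field of each row in the group, and "day[0]" is the day of the first row in the group. At the same time, I have to calculate the total risk as follows:

This is the table after grouping:

This is the total risk formula:

In effect, I have to loop through each row of each group to calculate the total risk.

My sample table looks like this: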

  CREATE TABLE risk_test
    (id          VARCHAR2 (32) NOT NULL PRIMARY KEY,
    customer_id  VARCHAR2 (40 BYTE),
    risk         NUMBER,
    day          VARCHAR2 (50 BYTE));

  insert into risk_test values(1,102,15,1);
  insert into risk_test values(2,102,16,1); 
  insert into risk_test values(3,104,11,1);  
  insert into risk_test values(4,102,17,2);
  insert into risk_test values(5,102,10,2);
  insert into risk_test values(6,102,13,3);
  insert into risk_test values(7,104,14,2);
  insert into risk_test values(8,104,13,2);
  insert into risk_test values(9,104,17,1);
  insert into risk_test values(10,104,16,2);

The expected answer looks like this:

Could you please guide me on how to do this in Oracle Database?

Any help is much appreciated.

Answer 1

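Your total risk calculation looks like a weighted average to me. That is, the average risk across each customer's rows, weighted by the day offset (day - day[0]) so that risks on later days count for more.

To compute it, you need a common table expression that first computes the day-weighted risk for each row. You can then compute the weighted average by dividing the sum of the weighted risks by the sum of the weights.

The query below illustrates the approach, with comments.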
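Written out, the weighted average described here looks roughly like the following. This is only a sketch of how this answer reads the formula: the symbols (c for a customer group, d for day, r for risk, w for the weight) are my own notation, and the +1 in the weight is this answer's adjustment rather than something confirmed by the question. The worked line uses customer 102 from the question's sample data (days 1, 1, 2, 2, 3 with risks 15, 16, 17, 10, 13).

```latex
\[
\text{total\_risk}_c = \frac{\sum_{i \in c} w_i \, r_i}{\sum_{i \in c} w_i},
\qquad
w_i = d_i - \min_{j \in c} d_j + 1
\]

% Worked example for customer 102 (weights 1, 1, 2, 2, 3):
\[
\text{total\_risk}_{102}
= \frac{1 \cdot 15 + 1 \cdot 16 + 2 \cdot 17 + 2 \cdot 10 + 3 \cdot 13}{1 + 1 + 2 + 2 + 3}
= \frac{124}{9} \approx 13.78
\]
```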

-- This first WITH clause is just sample data.  In your database you would
-- get rid of this and replace all references to "input" with your actual
-- table name
with input ( customer_id, risk, day ) AS ( 
  SELECT 1053, 100, 1 FROM DUAL UNION ALL
  SELECT 1053, 100, 1 FROM DUAL UNION ALL
  SELECT 1053, 100, 2 FROM DUAL UNION ALL
  SELECT 1053, 100, 2 FROM DUAL UNION ALL
  SELECT 1053, 100, 3 FROM DUAL UNION ALL
  SELECT 1054, 200, 1 FROM DUAL UNION ALL
  SELECT 1054, 200, 1 FROM DUAL UNION ALL
  SELECT 1054, 200, 3 FROM DUAL UNION ALL
  SELECT 1054, 200, 3 FROM DUAL UNION ALL
  SELECT 1054, 200, 4 FROM DUAL
  ),
-- This CTE computes the day offset for each row and multiplies by the risk to 
-- compute a day-weighted risk.
-- I added +1 to the day_offset, otherwise risks on the 1st day would not contribute
-- to the total risk, which I think is not what you intended(?)
weighted_input AS (
  SELECT i.customer_id, 
         i.risk, 
         i.day, 
         i.day - min(i.day) over ( partition by i.customer_id ) + 1 day_offset,
         ( i.day - min(i.day) over ( partition by i.customer_id ) + 1 ) * i.risk day_weighted_risk
  FROM   input i )
-- This is the main SELECT clause that gets all the weighted risks and computes
-- the group total risk, which appears the same in every row in each group.
SELECT wi.*,
       sum(wi.day_weighted_risk) over ( partition by wi.customer_id ) / sum(wi.day_offset) over ( partition by wi.customer_id ) total_risk
FROM   weighted_input wi;

+-------------+------+-----+------------+-------------------+------------+
| CUSTOMER_ID | RISK | DAY | DAY_OFFSET | DAY_WEIGHTED_RISK | TOTAL_RISK |
+-------------+------+-----+------------+-------------------+------------+
|        1053 |  100 |   1 |          1 |               100 |        100 |
|        1053 |  100 |   1 |          1 |               100 |        100 |
|        1053 |  100 |   2 |          2 |               200 |        100 |
|        1053 |  100 |   2 |          2 |               200 |        100 |
|        1053 |  100 |   3 |          3 |               300 |        100 |
|        1054 |  200 |   1 |          1 |               200 |        200 |
|        1054 |  200 |   1 |          1 |               200 |        200 |
|        1054 |  200 |   3 |          3 |               600 |        200 |
|        1054 |  200 |   3 |          3 |               600 |        200 |
|        1054 |  200 |   4 |          4 |               800 |        200 |
+-------------+------+-----+------------+-------------------+------------+

For your database, with an actual table and no need for the input CTE, it would be:

WITH weighted_input AS (
-- This CTE computes the day offset for each row and multiplies by the risk to 
-- compute a day-weighted risk.
-- I added +1 to the day_offset, otherwise risks on the 1st day would not contribute
-- to the total risk, which I think is not what you intended(?)
  SELECT i.customer_id, 
         i.risk, 
         i.day, 
         i.day - min(i.day) over ( partition by i.customer_id ) + 1 day_offset,
         ( i.day - min(i.day) over ( partition by i.customer_id ) + 1 ) * i.risk day_weighted_risk
  FROM   my_table i )
-- This is the main SELECT clause that gets all the weighted risks and computes
-- the group total risk, which appears the same in every row in each group.
SELECT wi.*,
       sum(wi.day_weighted_risk) over ( partition by wi.customer_id ) / sum(wi.day_offset) over ( partition by wi.customer_id ) total_risk
FROM   weighted_input wi;

Answer 2

Using the provided sample data, I believe this query should calculate the risk correctly:

Query

  SELECT o.*,
         ROUND (
               SUM (day_minus_day0 * risk) OVER (PARTITION BY customer_id)
             / SUM (day_minus_day0) OVER (PARTITION BY customer_id),
             5)    AS total_risk
    FROM (SELECT rt.*,
                 -- day is stored as VARCHAR2, so convert it before taking MIN
                 (TO_NUMBER (rt.day) - MIN (TO_NUMBER (rt.day)) OVER (PARTITION BY customer_id)) + 1 AS day_minus_day0
            FROM risk_test rt) o
ORDER BY customer_id, TO_NUMBER (day), TO_NUMBER (id);

Result

   ID    CUSTOMER_ID    RISK    DAY    DAY_MINUS_DAY0    TOTAL_RISK
_____ ______________ _______ ______ _________________ _____________
1     102                 15 1                      1      13.77778
2     102                 16 1                      1      13.77778
4     102                 17 2                      2      13.77778
5     102                 10 2                      2      13.77778
6     102                 13 3                      3      13.77778
3     104                 11 1                      1         14.25
9     104                 17 1                      1         14.25
7     104                 14 2                      2         14.25
8     104                 13 2                      2         14.25
10    104                 16 2                      2         14.25
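If one row per customer is enough, the same calculation can probably be collapsed with an actual GROUP BY instead of analytic sums. This is a minimal sketch, assuming the risk_test table from the question; the day_offset alias is my own name, and TO_NUMBER is applied because day is declared as VARCHAR2 (MIN on a string compares text, so '10' would sort before '2').

```sql
-- Sketch only: one row per customer, assuming the risk_test sample table.
SELECT   o.customer_id,
         -- weighted average: sum(weight * risk) / sum(weight)
         ROUND (SUM (o.day_offset * o.risk) / SUM (o.day_offset), 5) AS total_risk
FROM     (SELECT rt.customer_id,
                 rt.risk,
                 -- offset from the customer's first day, +1 so day 1 still counts
                 TO_NUMBER (rt.day)
                   - MIN (TO_NUMBER (rt.day)) OVER (PARTITION BY rt.customer_id)
                   + 1 AS day_offset
          FROM   risk_test rt) o
GROUP BY o.customer_id
ORDER BY o.customer_id;
```

With the sample rows from the question, this should give 13.77778 for customer 102 (124 / 9) and 14.25 for customer 104 (114 / 8), the same values as the TOTAL_RISK column in the result above.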
