Oracle SQL data migration: rows to columns based on month

CODE1   CODE2   CODE3   RATE    VALUE   MONTH
A       B       C       1       1       202001
A       B       C       1       1       202002
A       B       C       1       1       202003
A       B       C       2       1       202004
A       B       C       2       1       202005
A       B       C       1       1       202006
A       B       C       1       1       202007
A       B       C       1       1       202008
A       B       C       1       1       202009

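I am migrating data from an old system to a new one. The old system maintains its data monthly: the table holds one row per month, and when the data changes, that month's row is updated in place. The new system uses a start date and an end date to mark the active record, so a change means inserting a new row and updating the end date of the previous row.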

My expected data:

CODE1   CODE2   CODE3   RATE    VALUE   START_DT    END_DT
A       B       C       1       1       20200101    20200331
A       B       C       2       1       20200401    20200531
A       B       C       1       1       20200601    99991230

If the record is still active, we set the end date to "infinity", i.e. 999912 (shown as 99991230 in the table above).

But I only get two records. The result and my query are below:

CODE1   CODE2   CODE3   RATE    VALUE   START_DT    END_DT
A       B       C       2       1       20200401    20200531
A       B       C       1       1       20200601    99991230


SELECT CODE1, CODE2, CODE3, RATE, VALUE,
 TO_DATE(MIN(bus_month), 'yyyymm') AS START_DT,
 last_day(TO_DATE(replace(MAX(bus_month), $CURRENTMONTG, '999912'), 'yyyymm')) AS end_date
FROM TEST_TABLE
GROUP BY CODE1, CODE2, CODE3, RATE, VALUE

Because I group by CODE1, CODE2, CODE3, RATE, VALUE, I only get the latest data for each group and cannot get the older period.

Please help me get the expected table structure. Thanks in advance.

Please comment if more details are needed.

This is a gaps-and-islands problem, where you want to group together "adjacent" rows that have the same rate and value.
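To see why a plain GROUP BY is not enough here, the following minimal Python sketch (an illustration only, not Oracle code) mimics grouping by rate and value on the sample data. The function name `plain_group_by` and the tuple layout are assumptions made for this example; the three code columns are left out because they are constant in the sample.

```python
from collections import defaultdict

# Sample rows from the question as (rate, value, month).
rows = [(1, 1, 202001), (1, 1, 202002), (1, 1, 202003),
        (2, 1, 202004), (2, 1, 202005),
        (1, 1, 202006), (1, 1, 202007), (1, 1, 202008), (1, 1, 202009)]

def plain_group_by(rows):
    """Mimic GROUP BY rate, value with MIN(month) and MAX(month)."""
    months = defaultdict(list)
    for rate, value, month in rows:
        months[(rate, value)].append(month)
    return {key: (min(ms), max(ms)) for key, ms in months.items()}

result = plain_group_by(rows)
for key, span in result.items():
    print(key, span)
# (1, 1) (202001, 202009)
# (2, 1) (202004, 202005)
```

The two separate periods with rate 1 fall into the same group, so only two rows come back. What is missing is a grouping key that also distinguishes the islands.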

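One approach uses the difference between row numbers to build the groups. Assuming that the three codes define the base group, and that you want to start a new row whenever the rate or the value changes: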

select code1, code2, code3, rate, value, min(month) start_dt, 
    case when row_number() over(partition by code1, code2, code3 order by max(month) desc) = 1 then 999912 else max(month) end end_dt
from (
    select t.*,
        row_number() over(partition by code1, code2, code3 order by month) rn1,
        row_number() over(partition by code1, code2, code3, rate, value order by month) rn2
    from mytable t
) t
group by code1, code2, code3, rate, value, rn1 - rn2
order by start_dt

The conditional expression in the outer query sets the end date of the "last" period to 999912.

Demo on DB Fiddle:

CODE1 | CODE2 | CODE3 | RATE | VALUE | START_DT | END_DT
:---- | :---- | :---- | ---: | ----: | -------: | -----:
A     | B     | C     |    1 |     1 |   202001 | 202003
A     | B     | C     |    2 |     1 |   202004 | 202005
A     | B     | C     |    1 |     1 |   202006 | 999912
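If the row-number trick looks opaque, it may help to trace it outside the database. The Python sketch below is only an illustration under stated assumptions: `islands` and the `(rate, value, month)` tuple layout are hypothetical names, and the code columns are omitted because they are constant in the sample. It computes the same `rn1` and `rn2` as the query and groups on their difference.

```python
from collections import defaultdict

rows = [(1, 1, 202001), (1, 1, 202002), (1, 1, 202003),
        (2, 1, 202004), (2, 1, 202005),
        (1, 1, 202006), (1, 1, 202007), (1, 1, 202008), (1, 1, 202009)]

def islands(rows, open_end=999912):
    ordered = sorted(rows, key=lambda r: r[2])   # ORDER BY month
    seen = defaultdict(int)                      # running count per (rate, value)
    groups = defaultdict(list)
    for rn1, (rate, value, month) in enumerate(ordered, start=1):
        seen[(rate, value)] += 1
        rn2 = seen[(rate, value)]
        # rn1 - rn2 stays constant while rate and value do not change
        groups[(rate, value, rn1 - rn2)].append(month)
    periods = sorted([min(ms), max(ms), rate, value]
                     for (rate, value, _), ms in groups.items())
    if periods:
        periods[-1][1] = open_end                # latest period is still active
    return [(rate, value, start, end) for start, end, rate, value in periods]

for period in islands(rows):
    print(period)
# (1, 1, 202001, 202003)
# (2, 1, 202004, 202005)
# (1, 1, 202006, 999912)
```

For the sample data the differences are 0, 3 and 2 for the three islands, which is what keeps the two rate-1 periods apart.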

You can use MATCH_RECOGNIZE to perform a row-by-row comparison of the data:

SELECT code1,
       code2,
       code3,
       rate,
       value,
       start_dt,
       CASE end_dt
       WHEN TO_NUMBER( TO_CHAR( SYSDATE, 'YYYYMM' ) )
       THEN 999912
       ELSE end_dt
       END AS end_dt
FROM   table_name
MATCH_RECOGNIZE (
   PARTITION BY code1, code2, code3
   ORDER BY     month
   MEASURES     FIRST( rate ) AS rate,
                FIRST( value ) AS value,
                FIRST( month ) AS start_dt,
                LAST( month ) AS end_dt
   ONE ROW PER MATCH
   PATTERN      (FIRST_ROW EQUAL_ROWS*)
   DEFINE       EQUAL_ROWS AS (
                      EQUAL_ROWS.rate  = PREV(EQUAL_ROWS.rate)
                  AND EQUAL_ROWS.value = PREV(EQUAL_ROWS.value)
                  AND TO_DATE( EQUAL_ROWS.month, 'YYYYMM' )
                        = ADD_MONTHS( TO_DATE( PREV(EQUAL_ROWS.month), 'YYYYMM' ), 1 )
                )
)

So, for your sample data (with an extra 201912 row added):

CREATE TABLE table_name ( CODE1, CODE2, CODE3, RATE, VALUE, MONTH ) AS
SELECT 'A', 'B', 'C', 1, 1, 201912 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 1, 1, 202001 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 1, 1, 202002 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 1, 1, 202003 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 2, 1, 202004 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 2, 1, 202005 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 1, 1, 202006 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 1, 1, 202007 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 1, 1, 202008 FROM DUAL UNION ALL
SELECT 'A', 'B', 'C', 1, 1, 202009 FROM DUAL;

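This outputs the following when run in September 2020 (the CASE expression replaces END_DT with 999912 only when it equals the current month):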

CODE1 | CODE2 | CODE3 | RATE | VALUE | START_DT | END_DT
:---- | :---- | :---- | ---: | ----: | -------: | -----:
A     | B     | C     |    1 |     1 |   201912 | 202003
A     | B     | C     |    2 |     1 |   202004 | 202005
A     | B     | C     |    1 |     1 |   202006 | 999912
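The pattern above extends a match only while the rate and value are unchanged and the month is exactly one month after the previous row. As a rough illustration of that logic, here is a Python sketch; `next_month` and `match_runs` are hypothetical helper names, the code columns are omitted because they are constant in the sample, and the current-month CASE expression is left out.

```python
def next_month(yyyymm):
    """Return the month after a numeric YYYYMM value."""
    year, month = divmod(yyyymm, 100)
    return (year + 1) * 100 + 1 if month == 12 else yyyymm + 1

def match_runs(rows):
    """Collapse (rate, value, month) rows into (rate, value, start, end) runs."""
    runs = []
    for rate, value, month in sorted(rows, key=lambda r: r[2]):
        last = runs[-1] if runs else None
        if last and last[0] == rate and last[1] == value and month == next_month(last[3]):
            last[3] = month                            # same values, consecutive month
        else:
            runs.append([rate, value, month, month])   # start a new run
    return [tuple(run) for run in runs]

rows = [(1, 1, 201912), (1, 1, 202001), (1, 1, 202002), (1, 1, 202003),
        (2, 1, 202004), (2, 1, 202005),
        (1, 1, 202006), (1, 1, 202007), (1, 1, 202008), (1, 1, 202009)]

print(match_runs(rows))
# [(1, 1, 201912, 202003), (2, 1, 202004, 202005), (1, 1, 202006, 202009)]
```

Unlike the row-number difference approach, this comparison also starts a new period when a month is missing, for example between 202001 and 202003.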


Oracle SQL:

SELECT
     code1,code2,code3,rate,value,min(MONTH) start_dt,
     CASE
          -- the most recent period is still active, so give it the open-ended date
          WHEN ROW_NUMBER() OVER(PARTITION BY code1, code2, code3 ORDER BY max(MONTH) DESC) = 1 THEN 99991230
          ELSE max(MONTH)
     END end_dt
FROM
     (
     SELECT
          t.*,
          -- rn1 - rn2 is constant within each run of identical rate and value
          ROW_NUMBER() OVER(PARTITION BY code1, code2, code3 ORDER BY MONTH) rn1,
          ROW_NUMBER() OVER(PARTITION BY code1, code2, code3, rate, value ORDER BY MONTH) rn2
     FROM
          TBLTEST t
) t
GROUP BY
     code1,code2,code3,rate,value,rn1 - rn2
ORDER BY
     start_dt

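Approached in a natural way of thinking, the task is quite simple: compare the first five columns of adjacent rows, put the current row in the same group as the previous one when the values are the same, and start a new group when they differ, until the last record has been compared. Because SQL sets are unordered, we first have to build two row-number columns by hand, in a rather roundabout way, and then group by the relationship between them. Coming up with that solution takes considerable ingenuity.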

But the code is easy to write with the open-source esProc SPL:

  A
1 =connect("oracle")
2 =A1.query@x("SELECT * FROM TBLTEST ORDER BY MONTH")
3 =A2.groups@o(CODE1,CODE2,CODE3,RATE,VALUE;min(MONTH)/"01":STARTDT,string(date((max(MONTH)+1)/"01","yyyyMMdd")-1,"yyyyMMdd"):ENDDT)
4 >A3.m(-1).modify("99991230":ENDDT)

SPL supports ordered sets directly, so it is easy to start a new group whenever adjacent values differ.
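For readers who do not use SPL, the same "group adjacent rows" idea can be sketched in Python with `itertools.groupby`, which also groups only consecutive equal keys. This is an illustration under stated assumptions rather than a translation of the SPL script: `to_periods` is a hypothetical name, the rows are assumed to belong to a single CODE1/CODE2/CODE3 combination, and the open-ended date 99991230 is taken from the expected output in the question.

```python
import calendar
from itertools import groupby

rows = [("A", "B", "C", 1, 1, 202001), ("A", "B", "C", 1, 1, 202002),
        ("A", "B", "C", 1, 1, 202003), ("A", "B", "C", 2, 1, 202004),
        ("A", "B", "C", 2, 1, 202005), ("A", "B", "C", 1, 1, 202006),
        ("A", "B", "C", 1, 1, 202007), ("A", "B", "C", 1, 1, 202008),
        ("A", "B", "C", 1, 1, 202009)]

def to_periods(rows, open_end="99991230"):
    ordered = sorted(rows, key=lambda r: r[5])           # order by month
    periods = []
    # groupby starts a new group whenever the first five columns change
    for key, grp in groupby(ordered, key=lambda r: r[:5]):
        months = [r[5] for r in grp]
        first, last = months[0], months[-1]
        year, month = divmod(last, 100)
        last_day = calendar.monthrange(year, month)[1]   # days in the final month
        periods.append(key + (f"{first}01", f"{last}{last_day:02d}"))
    if periods:
        periods[-1] = periods[-1][:-1] + (open_end,)     # latest period stays open
    return periods

for period in to_periods(rows):
    print(period)
# ('A', 'B', 'C', 1, 1, '20200101', '20200331')
# ('A', 'B', 'C', 2, 1, '20200401', '20200531')
# ('A', 'B', 'C', 1, 1, '20200601', '99991230')
```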