SQL: Get row number which increases every time a value changes

I have the following table in Vertica:

+----------+----------+----------+
| column_1 | column_2 | column_3 |
+----------+----------+----------+
| a        |        1 |        1 |
| a        |        2 |        1 |
| a        |        3 |        1 |
| b        |        1 |        1 |
| b        |        2 |        1 |
| b        |        3 |        1 |
| c        |        1 |        1 |
| c        |        2 |        1 |
| c        |        3 |        1 |
| c        |        1 |        2 |
| c        |        2 |        2 |
| c        |        3 |        2 |
+----------+----------+----------+

The table is ordered by column_1 and column_3. I want to add a row number that increases every time column_1 or column_3 changes its value. It would look like this:

+----------+----------+----------+------------+
| column_1 | column_2 | column_3 | row_number |
+----------+----------+----------+------------+
| a        |        1 |        1 |          1 |
| a        |        2 |        1 |          1 |
| a        |        3 |        1 |          1 |
| b        |        1 |        1 |          2 |
| b        |        2 |        1 |          2 |
| b        |        3 |        1 |          2 |
| c        |        1 |        1 |          3 |
| c        |        2 |        1 |          3 |
| c        |        3 |        1 |          3 |
| c        |        1 |        2 |          4 |
| c        |        2 |        2 |          4 |
| c        |        3 |        2 |          4 |
+----------+----------+----------+------------+

I tried using partition over, but could not find the right syntax.

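Without an ORDER BY, a SQL data set is unordered. So, to establish an order for your example, I'm assuming the data set can be sorted with ORDER BY column_1, column_3, column_2.

  • If that assumption does not hold, you will have to add extra columns by which the data can be sorted deterministically.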

This gives the following query...

SELECT
  yourTable.*,
  DENSE_RANK() OVER (ORDER BY column_1, column_3) AS row_number
FROM
  yourTable
ORDER BY
  column_1, column_3, column_2
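
To try the DENSE_RANK() idea without a Vertica instance, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. It assumes the bundled SQLite is version 3.25 or later (needed for window functions). The alias group_no and the helper name dense_rank_groups are made up for this illustration.

```python
import sqlite3

# The twelve sample rows from the question: (column_1, column_2, column_3).
ROWS = [
    ("a", 1, 1), ("a", 2, 1), ("a", 3, 1),
    ("b", 1, 1), ("b", 2, 1), ("b", 3, 1),
    ("c", 1, 1), ("c", 2, 1), ("c", 3, 1),
    ("c", 1, 2), ("c", 2, 2), ("c", 3, 2),
]


def dense_rank_groups(rows):
    """Return (column_1, column_2, column_3, group_no) tuples in sorted order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourTable (column_1 TEXT, column_2 INTEGER, column_3 INTEGER)"
    )
    conn.executemany("INSERT INTO yourTable VALUES (?, ?, ?)", rows)
    # Same query shape as above; only the output alias differs.
    result = conn.execute(
        """
        SELECT column_1, column_2, column_3,
               DENSE_RANK() OVER (ORDER BY column_1, column_3) AS group_no
        FROM yourTable
        ORDER BY column_1, column_3, column_2
        """
    ).fetchall()
    conn.close()
    return result


for row in dense_rank_groups(ROWS):
    print(row)
```

Because the rank is computed from the values themselves, the rows can be inserted in any order and each (column_1, column_3) pair still gets the same number.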

Vertica has the CONDITIONAL_CHANGE_EVENT() analytic function. It starts at 0 and increments by 1 every time the expression that makes up its first argument changes.

Like this:

WITH
indata(column_1,column_2,column_3,rn) AS (
          SELECT 'a',1,1,1
UNION ALL SELECT 'a',2,1,1
UNION ALL SELECT 'a',3,1,1
UNION ALL SELECT 'b',1,1,2
UNION ALL SELECT 'b',2,1,2
UNION ALL SELECT 'b',3,1,2
UNION ALL SELECT 'c',1,1,3
UNION ALL SELECT 'c',2,1,3
UNION ALL SELECT 'c',3,1,3
UNION ALL SELECT 'c',1,2,4
UNION ALL SELECT 'c',2,2,4
UNION ALL SELECT 'c',3,2,4
)
SELECT
  *
, CONDITIONAL_CHANGE_EVENT(
  column_1||column_3::VARCHAR
  ) OVER w + 1 AS rownum
FROM indata
WINDOW w AS (ORDER BY column_1,column_3,column_2)
;
-- out  column_1 | column_2 | column_3 | rn | rownum 
-- out ----------+----------+----------+----+--------
-- out  a        |        1 |        1 |  1 |      1
-- out  a        |        2 |        1 |  1 |      1
-- out  a        |        3 |        1 |  1 |      1
-- out  b        |        1 |        1 |  2 |      2
-- out  b        |        2 |        1 |  2 |      2
-- out  b        |        3 |        1 |  2 |      2
-- out  c        |        1 |        1 |  3 |      3
-- out  c        |        2 |        1 |  3 |      3
-- out  c        |        3 |        1 |  3 |      3
-- out  c        |        1 |        2 |  4 |      4
-- out  c        |        2 |        2 |  4 |      4
-- out  c        |        3 |        2 |  4 |      4
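
For readers without Vertica at hand, the counting rule described above (start at 0, add 1 whenever the expression differs from the previous row) can be emulated in a few lines of Python. This is only a sketch of that rule, not Vertica's implementation. The function name is borrowed for readability, and tuples stand in for the concatenated column_1||column_3 key.

```python
def conditional_change_event(values):
    """Return a counter per row: 0 for the first row, +1 at every change."""
    counters = []
    count = 0
    for i, value in enumerate(values):
        # Compare with the previous row only; the first row never counts as a change.
        if i > 0 and value != values[i - 1]:
            count += 1
        counters.append(count)
    return counters


# (column_1, column_3) for the twelve sample rows, already in sorted order.
keys = [("a", 1)] * 3 + [("b", 1)] * 3 + [("c", 1)] * 3 + [("c", 2)] * 3

# Adding 1 turns the zero-based counter into the requested numbering.
rownum = [n + 1 for n in conditional_change_event(keys)]
print(rownum)  # [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4]

# A value that comes back later starts a new group, unlike DENSE_RANK.
print(conditional_change_event(["x", "x", "y", "x"]))  # [0, 0, 1, 2]
```

The counter depends on row order, which is why the window in the query needs an ORDER BY that makes the order deterministic.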

This also works, and it does not need the table to be sorted:

  1. Find the distinct values of column_1 and column_3 and give each combination a new index
  2. Join the original table back on column_1 and column_3

-- t2 numbers each distinct (column_1, column_3) pair; joining it back tags every row
select t1.*, t2.row_number
from
your_table t1
join
(select column_1, column_3,
        row_number() over (order by column_1, column_3) as row_number
 from (select distinct column_1, column_3 from your_table) foo) t2
on
t1.column_1=t2.column_1 and t1.column_3=t2.column_3;
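
The same distinct-then-join idea can be sketched in plain Python to see why the input order does not matter. The name number_by_distinct_pairs is hypothetical, and sorting the distinct pairs is an assumption made here so that the numbering follows the order shown in the question.

```python
def number_by_distinct_pairs(rows):
    """Tag each (column_1, column_2, column_3) row with the index of its pair."""
    # Step 1: distinct (column_1, column_3) pairs, numbered from 1 in sorted order.
    pairs = sorted({(c1, c3) for c1, _c2, c3 in rows})
    index = {pair: n for n, pair in enumerate(pairs, start=1)}
    # Step 2: "join" every original row back to its pair's index.
    return [(c1, c2, c3, index[(c1, c3)]) for c1, c2, c3 in rows]


# Deliberately unsorted input.
sample = [("c", 1, 2), ("a", 1, 1), ("b", 2, 1), ("c", 3, 1), ("a", 2, 1)]
for row in number_by_distinct_pairs(sample):
    print(row)
```

The lookup keeps the rows in their input order, so a final sort on column_1, column_3, column_2 is still needed if the output should match the table in the question.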