Parse text data in PostgreSQL

Question

I have a PostgreSQL database in which one table has two text columns that store data like this:

id|         col1              |                     col2                      |
------------------------------------------------------------------------------|
1 | value_1, value_2, value_3 | name_1(date_1), name_2(date_2), name_3(date_3)|
2 | value_4, value_5, value_6 | name_4(date_4), name_5(date_5), name_6(date_6)|

I need to parse these rows into a new table like this:

id |  col1   |  col2  |  col3  |
1  | value_1 | name_1 | date_1 |
1  | value_2 | name_2 | date_2 |
...|   ...   |  ...   |  ...   |
2  | value_6 | name_6 | date_6 |

How can I do this?

Answer 1

Step-by-step demo: db<>fiddle

SELECT
    id,
    u_col1 as col1,
    col2_matches[1] as col2,                                     -- 5
    col2_matches[2] as col3
FROM 
    mytable,
    unnest(                                                      -- 3
        regexp_split_to_array(col1, ', '),                       -- 1
        regexp_split_to_array(col2, ', ')                        -- 2
    ) as u (u_col1, u_col2),
    regexp_matches(u_col2, '(.+)\((.+)\)') as col2_matches       -- 4

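1. Split the data in the first column into an array.
2. Split the data in the second column into an array of the form {a(a), b(b), c(c)}.
3. Expand both arrays in parallel so that each pair of elements gets its own record.
4. Split each element of the form a(b) into the array {a,b}.
5. Select the required columns. For col2 and col3, use the first and second elements of the array from step 4.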
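
Since the question asks for the result in a new table, the steps above can be sketched end to end as follows. This is a minimal sketch, not part of the original answer: the column types of mytable are assumed (the question only says the two columns are text), and parsed is a hypothetical name for the target table.

```sql
-- Assumed schema: the question only states that col1 and col2 are text.
CREATE TABLE mytable (
    id   integer,
    col1 text,
    col2 text
);

INSERT INTO mytable (id, col1, col2) VALUES
    (1, 'value_1, value_2, value_3', 'name_1(date_1), name_2(date_2), name_3(date_3)'),
    (2, 'value_4, value_5, value_6', 'name_4(date_4), name_5(date_5), name_6(date_6)');

-- "parsed" is a hypothetical name for the new table.
CREATE TABLE parsed AS
SELECT
    t.id,
    u.u_col1          AS col1,
    m.col2_matches[1] AS col2,   -- text before the parentheses
    m.col2_matches[2] AS col3    -- text inside the parentheses
FROM
    mytable AS t,
    -- Multi-array unnest pairs the elements of both arrays by position.
    unnest(
        regexp_split_to_array(t.col1, ', '),
        regexp_split_to_array(t.col2, ', ')
    ) AS u (u_col1, u_col2),
    -- Captures name and date from an element of the form name(date).
    regexp_matches(u.u_col2, '(.+)\((.+)\)') AS m (col2_matches);

SELECT * FROM parsed ORDER BY id, col1;
```

Two behaviours are worth keeping in mind with this approach. If the two lists in a row have different lengths, unnest pads the shorter one with NULL. And because regexp_matches returns no row when the pattern does not match, any element of col2 that is not of the form name(date) is silently dropped from the result.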

postgresql

postgresql-9.5