Convert comma delimited values in a column into rows
I want to convert comma-delimited values into rows in Redshift.
For example:
store |location |products
-----------------------------
1 |New York |fruit, drinks, candy...
The expected output is:
store |location | products
-------------------------------
1 |New York | fruit
1 |New York | drinks
1 |New York | candy
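Is there a simple way to split the words on a delimiter and turn them into rows? I have been looking at this solution, but it does not work yet: https://help.looker.com/hc/en-us/articles/360024266693-Splitting-Strings-into-Rows-in-the-Absence-of-Table-Generating-Functions
Any suggestions would be greatly appreciated.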
If you know the maximum number of values, I think you can use split_part():
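select t.store, t.location,
       trim(split_part(t.products, ',', n.n)) as product  -- trim drops the space after each comma
from t join
     (select 1 as n union all
      select 2 union all
      select 3 union all
      select 4
     ) n  -- one row per position; extend to the maximum number of values
     on split_part(t.products, ',', n.n) <> '';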
You can also use:
select t.store, t.location, split_part(products, ',', 1) as product
from t
union all
select t.store, t.location, split_part(products, ',', 2) as product
from t
where split_part(products, ',', 2) <> ''
union all
select t.store, t.location, split_part(products, ',', 3) as product
from t
where split_part(products, ',', 3) <> ''
union all
select t.store, t.location, split_part(products, ',', 4) as product
from t
where split_part(products, ',', 4) <> ''
union all
. . .
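Since the query above repeats the same branch for every position, it could be generated rather than typed by hand. A minimal Python sketch, assuming the table `t` and the columns from the question; the helper name `build_split_query` and its `max_parts` parameter are made up for this illustration:

```python
def build_split_query(max_parts: int, table: str = "t") -> str:
    """Build one SELECT branch per position and glue them with UNION ALL."""
    if max_parts < 1:
        raise ValueError("max_parts must be at least 1")
    branches = []
    for n in range(1, max_parts + 1):
        branch = (
            f"select store, location, split_part(products, ',', {n}) as product\n"
            f"from {table}"
        )
        # Position 1 is always selected; later positions are kept only when non-empty.
        if n > 1:
            branch += f"\nwhere split_part(products, ',', {n}) <> ''"
        branches.append(branch)
    return "\nunion all\n".join(branches) + ";"


if __name__ == "__main__":
    print(build_split_query(4))
```

You still have to pick `max_parts` yourself, so this only saves typing; it does not remove the need to know the maximum number of values.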
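First, you need to create a numbers table, because joining to another table is the only way in Redshift to turn one row into many rows (there is no flatten or unnest functionality).
- For example, a table with 1024 rows holding the values 1..1024
Then you can join to it and use split_part():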
SELECT
    your_table.*,
    numbers.ordinal,
    split_part(your_table.products, ',', numbers.ordinal) AS product
FROM
    your_table
INNER JOIN
    numbers
    -- one joined row per value: a list with k commas holds k + 1 values
    ON  numbers.ordinal >= 1
    AND numbers.ordinal <= regexp_count(your_table.products, ',') + 1
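But...
Redshift is terrible at predicting the number of rows required. It will join against all 1024 rows and then reject the ones that don't match.
It performs like a dog.
That is because the design assumption is that this kind of processing is always done before loading into Redshift.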
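If the splitting is meant to happen before the load, as suggested above, one option is to normalise the file first. A rough Python sketch, assuming the source data is a CSV file with the columns `store`, `location` and `products` from the question; the function name and the in-memory sample are illustrative only:

```python
import csv
import io


def explode_products(src, dst):
    """Read rows with a comma-separated `products` column, write one row per product."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(
        dst, fieldnames=["store", "location", "product"], lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        for product in row["products"].split(","):
            product = product.strip()
            if product:  # skip empty entries such as a trailing comma
                writer.writerow(
                    {"store": row["store"], "location": row["location"], "product": product}
                )


if __name__ == "__main__":
    source = io.StringIO('store,location,products\n1,New York,"fruit, drinks, candy"\n')
    target = io.StringIO()
    explode_products(source, target)
    print(target.getvalue())
```

The resulting file already has one product per row, so it could be loaded as-is and no splitting is needed inside Redshift.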
CREATE TABLE temptbl
(
store INT,
location NVARCHAR(MAX),
products NVARCHAR(MAX)
)
INSERT temptbl SELECT 1, 'New York', 'Fruit, drinks, candy'
To check the table once it has been created:
select * from temptbl
;WITH tmp(store, location, DataItem, products) AS
(
SELECT
store,
location,
LEFT(products, CHARINDEX(',', products + ',') - 1),
STUFF(products, 1, CHARINDEX(',', products + ','), '')
FROM temptbl
UNION all
SELECT
store ,
location,
LEFT(products, CHARINDEX(',', products + ',') - 1),
STUFF(products, 1, CHARINDEX(',', products + ','), '')
FROM tmp
WHERE
products > ''
)
SELECT
store,
location,
DataItem
FROM tmp
Running the query above puts each comma-separated value on its own row, which is the output you asked for.
Hope this solves it :)
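The query above appears to use SQL Server syntax (`STUFF`, `CHARINDEX`), so it may need adapting for other engines. To try the same peel-off-the-first-item recursion without a SQL Server instance, here is a sketch that ports it to SQLite through Python's `sqlite3` module. The port is an assumption rather than part of the answer: `instr` and `substr` stand in for `CHARINDEX`, `LEFT` and `STUFF`, and `trim` is added to drop the space after each comma.

```python
import sqlite3

# Each step takes the text before the first comma as DataItem and keeps
# the remainder in `products` until nothing is left.
SPLIT_SQL = """
WITH RECURSIVE tmp(store, location, DataItem, products) AS (
    SELECT store, location,
           substr(products, 1, instr(products || ',', ',') - 1),
           substr(products, instr(products || ',', ',') + 1)
    FROM temptbl
    UNION ALL
    SELECT store, location,
           substr(products, 1, instr(products || ',', ',') - 1),
           substr(products, instr(products || ',', ',') + 1)
    FROM tmp
    WHERE products > ''
)
SELECT store, location, trim(DataItem) FROM tmp
"""


def split_rows(conn):
    """Return one (store, location, product) tuple per comma-separated value."""
    return conn.execute(SPLIT_SQL).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE temptbl (store INTEGER, location TEXT, products TEXT)")
    conn.execute("INSERT INTO temptbl VALUES (1, 'New York', 'Fruit, drinks, candy')")
    for row in split_rows(conn):
        print(row)
```

The recursion runs once per value in the list, so no maximum number of values has to be known in advance.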
MySQL is also fine.
CREATE TABLE test
SELECT 1 store, 'New York' location, 'fruit,drinks,candy' products;
SELECT store, location, product
FROM test
CROSS JOIN JSON_TABLE(CONCAT('["', REPLACE(products, ',', '","'), '"]'),
"$[*]" COLUMNS (product VARCHAR(255) PATH "$")) jsontable
store |location |product
-------------------------
1 |New York |fruit
1 |New York |drinks
1 |New York |candy
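In MySQL, this works for up to four comma-separated values. Note the use of UNION rather than UNION ALL: when a row has fewer than four values, the later branches return its last value again, and UNION removes those duplicates.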
SELECT store, location,
TRIM(SUBSTRING_INDEX(products, ',', 1)) product
FROM inventory
UNION
SELECT store, location,
TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(products, ',', 2), ',', -1))
FROM inventory
UNION
SELECT store, location,
TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(products, ',', 3), ',', -1))
FROM inventory
UNION
SELECT store, location,
TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(products, ',', 4), ',', -1))
FROM inventory
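The reason for `UNION` is easier to see once you look at what `SUBSTRING_INDEX` returns for a position past the end of the list: it hands back the last item again. A small Python model of that behaviour, assuming the documented MySQL semantics for a non-empty delimiter; the helper names are invented for this illustration:

```python
def substring_index(s: str, delim: str, count: int) -> str:
    """Model of MySQL SUBSTRING_INDEX for a non-empty delimiter."""
    parts = s.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter (whole string if too few).
        return delim.join(parts[:count])
    if count < 0:
        # Everything to the right of the count-th delimiter, counting from the end.
        return delim.join(parts[count:])
    return ""


def nth_item(products: str, n: int) -> str:
    """Mirror TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(products, ',', n), ',', -1))."""
    return substring_index(substring_index(products, ",", n), ",", -1).strip()


if __name__ == "__main__":
    products = "fruit, drinks, candy"
    branches = [nth_item(products, n) for n in range(1, 5)]
    print(branches)                       # the fourth branch repeats 'candy'
    print(list(dict.fromkeys(branches)))  # de-duplicated, as UNION would do
```

One side effect worth knowing: a product that is genuinely listed twice for the same store would be collapsed by `UNION` as well.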
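I'll echo what others have said. IMHO, comma-separated values are a poor table design.
- Ugly SQL. Being able to read and reason about SQL is very important. Clarity always wins.
- Also, AWS shareholders will love you for it, because you will spend a lot of extra money on Redshift.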