SQL LEFT JOIN with SPLIT
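I want to do a left join between two tables whose join columns are formatted differently. I use REPLACE to strip the "[ ]", but I can't split one of the rows into two, so the join doesn't work for it.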
emp_tbl
+--------+-------+
| emp    | state |
+--------+-------+
| Steve  | [1]   |
| Greg   | [2|3] |
| Steve  | [4]   |
+--------+-------+

state_tbl
+------+------+
| id   | name |
+------+------+
| 1    | AL   |
| 2    | NV   |
| 3    | AZ   |
| 4    | NH   |
+------+------+
Desired output:
+--------+------+
| Steve | AL |
| Greg | NV |
| Greg | AZ |
| Steve | NH |
+--------+------+
SELECT emp_tbl.emp, state_tbl.name
FROM emp_tbl
LEFT JOIN state_tbl on state_tbl.id = REPLACE(REPLACE(emp_tbl.state, '[', ''), ']', '')
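With this query I can strip the "[]" and join, but the row that holds two states does not match anything.
Your query will never produce 4 rows, because the left-hand table only has 3 rows. You need to flatten the rows that contain multiple state ids before joining.
- Prepare the tables and data: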
create or replace table emp_tbl (emp varchar, state string);
create or replace table state_tbl (id varchar, name varchar);
insert into emp_tbl values
('Steve', '[1]'), ('Greg', '[2|3]'), ('Steve', '[4]');
insert into state_tbl values
(1, 'AL'), (2, 'NV'), (3, 'AZ'), (4, 'NH');
- Then the query below should give you the data you want:
with emp_tbl_tmp as (
select emp, parse_json(replace(state, '|', ',')) as states from emp_tbl
),
flattened_tbl as (
select emp, value as state_id from emp_tbl_tmp, table(flatten(input => states))
)
select emp, name from flattened_tbl emp
left join state_tbl state on (emp.state_id = state.id);
Or, if you want to save a step:
with flattened_emp_tbl as (
select emp, value as state_id
from emp_tbl,
table(flatten(
input => parse_json(replace(state, '|', ','))
))
)
select emp, name from flattened_emp_tbl emp
left join state_tbl state
on (emp.state_id = state.id);
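To see the flatten-then-join logic without a Snowflake session, here is a minimal plain-Python sketch of the same idea. It is not Snowflake code: the helper names `flatten_states` and `left_join` are made up for illustration, and the sample rows are the ones from the question.

```python
# Plain-Python illustration of "flatten first, then left join".
# flatten_states and left_join are hypothetical helpers, not Snowflake APIs.
emp_tbl = [("Steve", "[1]"), ("Greg", "[2|3]"), ("Steve", "[4]")]
state_tbl = {"1": "AL", "2": "NV", "3": "AZ", "4": "NH"}


def flatten_states(rows):
    """Yield one (emp, state_id) pair per id inside the bracketed list."""
    for emp, state in rows:
        # Strip the surrounding brackets, then split on the pipe separator.
        for state_id in state.strip("[]").split("|"):
            yield emp, state_id


def left_join(pairs, lookup):
    """Keep every left-hand pair; the name is None when no id matches."""
    return [(emp, lookup.get(state_id)) for emp, state_id in pairs]


result = left_join(flatten_states(emp_tbl), state_tbl)
for emp, name in result:
    print(emp, name)
```

The three input rows become four pairs after flattening, which is why the join can only return four rows once the split happens before it.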
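Here is how you can do it:
-- strip the brackets, split on '|', and flatten the resulting array into rows
select tw.emp, state_tbl.name
from emp_tbl tw,
lateral flatten(input => split(replace(replace(tw.state, '[', ''), ']', ''), '|')) s
left join state_tbl on s.value::varchar = state_tbl.id;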