How to unnest multiple columns while including nulls
I have a table that looks like this:
id | site_names | site_addresses | industries | feis |
---|---|---|---|---|
30 | Borden Incorporated | 198 Saluda St , Chester , SC , 29706-1579 , United States\|198 Saluda St, Chester, SC 29706, USA\|198 Saluda St Chester SC 29706-1579 United States | Food and Cosmetics | 12345\|45678 |
31 | Butterkrust Bakeries, Inc.\|Flowers Baking Co. of Lakeland, LLC\|Southern Bakeries, Inc. dba Butterkrust Bakeries | null | Food\|Food and Cosmetics | 12345 |
33 | Church & Dwight Canada Corp. | 5485 RUE FERRIER , , MONTREAL, QUEBEC Quebec , , -- , CA | null | null |
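I want to split the table into a materialized view in which each row is one of the possible combinations produced by splitting site_names, site_addresses, industries, and feis. For example, a few of the rows for this data would be: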
id | site_name | site_address | industry | fei |
---|---|---|---|---|
30 | Borden Incorporated | 198 Saluda St , Chester , SC | Food and Cosmetics | 12345 |
30 | Borden Incorporated | 198 Saluda St , Chester , SC | Food and Cosmetics | 45678 |
30 | Borden Incorporated | 198 Saluda St, Chester, SC 29706, USA | Food and Cosmetics | 12345 |
30 | Borden Incorporated | 198 Saluda St, Chester, SC 29706, USA | Food and Cosmetics | 45678 |
... | | | | |
31 | Butterkrust Bakeries, Inc. | null | Food | 12345 |
31 | Flowers Baking Co. of Lakeland, LLC | null | Food | 12345 |
I have tried several ways to do this. The closest I have gotten is this code:
(
with Expanded2 as (
    select raw_site_data.id as id_fei,
           feis.feis
    from raw_site_data,
         unnest(string_to_array(raw_site_data.feis, '|')) feis
),
Expanded3 as (
    select raw_site_data.id as id_name,
           site_names.site_names
    from raw_site_data,
         unnest(string_to_array(raw_site_data.site_names, '|')) site_names
),
Expanded4 as (
    select raw_site_data.id as id_address,
           site_addresses.site_addresses
    from raw_site_data,
         unnest(string_to_array(raw_site_data.site_addresses, '|')) site_addresses
),
Expanded5 as (
    select raw_site_data.id as id_industry,
           industries.industries
    from raw_site_data,
         unnest(string_to_array(raw_site_data.industries, '|')) industries
)
select id_fei as site_id, feis as fei, site_names as site_name,
       site_addresses as site_address, industries as industry
from Expanded2, Expanded3, Expanded4, Expanded5
where Expanded2.id_fei = Expanded3.id_name
  and Expanded3.id_name = Expanded4.id_address
  and Expanded4.id_address = Expanded5.id_industry
);
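This is very close, but the result leaves out every row that contains a null. Does anyone know how to write this query so that rows with nulls are included in the result?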
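Some additional context that may be relevant:
- id is a unique, non-null integer in the original table
- site_names, site_addresses, industries, and feis all need to be split, and any of them may contain nulls that I want to keep
- I'm using Postgres 13
Thanks!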
If you want all combinations, you can use a single query:
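-- "left join lateral ... on true" keeps a source row even when the split
-- returns no rows (as it does for a NULL column); that column is then NULL.
select rd.id, site_name, site_address, industry, fei
from raw_site_data rd left join lateral
     regexp_split_to_table(rd.site_names, '\|') site_name
     on true left join lateral
     regexp_split_to_table(rd.site_addresses, '\|') site_address
     on true left join lateral
     regexp_split_to_table(rd.industries, '\|') industry
     on true left join lateral
     regexp_split_to_table(rd.feis, '\|') fei
     on true;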
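To see why the original attempt loses rows and what the lateral left join changes, here is a small self-contained sketch. It makes a few assumptions that are not confirmed by the question: the `null` entries are real SQL NULLs (not the string 'null'), the data sits in a temporary copy of `raw_site_data`, and `site_combinations` is a hypothetical view name. It also swaps in the question's `unnest(string_to_array(...))` in place of `regexp_split_to_table`, which should behave the same for a literal `|` delimiter.

```sql
-- Temporary table holding the three sample rows from the question.
-- Assumption: "null" in the question means SQL NULL.
create temp table raw_site_data (
    id             integer primary key,
    site_names     text,
    site_addresses text,
    industries     text,
    feis           text
);

insert into raw_site_data (id, site_names, site_addresses, industries, feis) values
    (30,
     'Borden Incorporated',
     '198 Saluda St , Chester , SC , 29706-1579 , United States|198 Saluda St, Chester, SC 29706, USA|198 Saluda St Chester SC 29706-1579 United States',
     'Food and Cosmetics',
     '12345|45678'),
    (31,
     'Butterkrust Bakeries, Inc.|Flowers Baking Co. of Lakeland, LLC|Southern Bakeries, Inc. dba Butterkrust Bakeries',
     null,
     'Food|Food and Cosmetics',
     '12345'),
    (33,
     'Church & Dwight Canada Corp.',
     '5485 RUE FERRIER , , MONTREAL, QUEBEC Quebec , , -- , CA',
     null,
     null);

-- string_to_array(NULL, '|') is NULL, and unnest(NULL) returns zero rows.
-- With a comma (inner) join that removes the whole source row; with
-- LEFT JOIN LATERAL ... ON true the row is kept and the column is NULL.
-- site_combinations is a hypothetical name used only for this sketch.
create temp view site_combinations as
select rd.id, n.site_name, a.site_address, i.industry, f.fei
from raw_site_data rd
left join lateral unnest(string_to_array(rd.site_names, '|'))     as n(site_name)    on true
left join lateral unnest(string_to_array(rd.site_addresses, '|')) as a(site_address) on true
left join lateral unnest(string_to_array(rd.industries, '|'))     as i(industry)     on true
left join lateral unnest(string_to_array(rd.feis, '|'))           as f(fei)          on true;

select * from site_combinations order by id;
```

With this sample, id 33 should come back as a single row whose industry and fei are both NULL, and every row for id 31 should have a NULL site_address. For the real table, the same select could presumably be wrapped in `create materialized view ... as`; note that a materialized view cannot be built on a temporary table, so the temp objects here are for experimentation only.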