如何快速识别 array-columns 中的不同组合,然后在 sql 中取消嵌套
How do I identify distinct combinations across array-columns and then unnest in sql presto
我有一个名为程序的数据库创建为
CREATE TABLE programs (
name varchar(200) NOT NULL,
role varchar(200) NOT NULL,
section text[] NOT NULL,
sub_section text[] NOT NULL,
title text[] NOT NULL
);
INSERT INTO programs (name, role, section, sub_section, title) VALUES
('John','Lead','{"VII","VII","VII"}','{"A","A","C"}','{"STUDY","STUDY","STUDY"}'),
('Olga','Member','{"VII","VII"}','{"A","A"}','{"STUDY","STUDY"}'),
('Ben','Co-Lead','{"XI","X"}','{"A","B"}','{"STUDY","TRAVEL"}'),
('Ana','Member','{"VII","II","VI"}','{"A","ALL","B"}','{"STUDY","STUDY","TRAVEL"}');
这是 table 的样子
| name | role | section | sub_section | title |
| ---- | ------- | ------------ | ----------- | ------------------------ |
| John | Lead | VII,VII,VII | A,A,C | STUDY,STUDY,STUDY |
| Olga | Member | VII,VII | A,A | STUDY,STUDY |
| Ben | Co-Lead | XI,X | A,B | STUDY,TRAVEL |
| Ana | Member | VII,II,VI | A,ALL,B | STUDY,STUDY,TRAVEL |
我想识别整个部分、sub-section 和标题列的不同组合,以及取消嵌套以将其作为输出
| name | role | section.sub_section | title |
| ---- | ------- | ------------------- | ------------------------ |
| John | Lead | VII.A | STUDY
| John | Lead | VII.C | STUDY
| Olga | Member | VII.A | STUDY
| Ben | Co-Lead | XI.A | STUDY
| Ben | Co-Lead | X.B | TRAVEL
| Ana | Member | VII.A | STUDY
| Ana | Member | II.ALL | STUDY
| Ana | Member | VI.B | TRAVEL
我是 SQL 的新手,我真的很难获得所需的输出。非常感谢您的帮助。
你想要的数据没有显示“组合横跨部分,sub-section,和标题列”,看来你需要根据位置匹配相应的数组,所以你可以unnest
并按要区分的字段分组。
假设相应的列包含 varchars 数组(如果不是 - 您将需要使用一些字符串函数来转换它们):
-- sample data
WITH dataset (name, role, section, sub_section, title) AS (
VALUES ('John','Lead',array['VII','VII','VII'],array['A','A','C'],array['STUDY','STUDY','STUDY']),
('Olga','Member',array['VII','VII'],array['A','A'],array['STUDY','STUDY']),
('Ben','Co-Lead',array['XI','X'],array['A','B'],array['STUDY','TRAVEL']),
('Ana','Member',array['VII','II','VI'],array['A','ALL','B'],array['STUDY','STUDY','TRAVEL'])
)
--query
select name,
role,
sec || '.' || sub_sec "section.sub_section",
t title
from dataset
cross join unnest(section, sub_section, title) as t(sec, sub_sec, t)
group by name, role, sec, sub_sec, t
order by name
输出:
name
role
section.sub_section
title
Ana
Member
VII.A
STUDY
Ana
Member
II.ALL
STUDY
Ana
Member
VI.B
TRAVEL
Ben
Co-Lead
XI.A
STUDY
Ben
Co-Lead
X.B
TRAVEL
John
Lead
VII.A
STUDY
John
Lead
VII.C
STUDY
Olga
Member
VII.A
STUDY
我有一个名为程序的数据库创建为
CREATE TABLE programs (
name varchar(200) NOT NULL,
role varchar(200) NOT NULL,
section text[] NOT NULL,
sub_section text[] NOT NULL,
title text[] NOT NULL
);
INSERT INTO programs (name, role, section, sub_section, title) VALUES
('John','Lead','{"VII","VII","VII"}','{"A","A","C"}','{"STUDY","STUDY","STUDY"}'),
('Olga','Member','{"VII","VII"}','{"A","A"}','{"STUDY","STUDY"}'),
('Ben','Co-Lead','{"XI","X"}','{"A","B"}','{"STUDY","TRAVEL"}'),
('Ana','Member','{"VII","II","VI"}','{"A","ALL","B"}','{"STUDY","STUDY","TRAVEL"}');
这是 table 的样子
| name | role | section | sub_section | title |
| ---- | ------- | ------------ | ----------- | ------------------------ |
| John | Lead | VII,VII,VII | A,A,C | STUDY,STUDY,STUDY |
| Olga | Member | VII,VII | A,A | STUDY,STUDY |
| Ben | Co-Lead | XI,X | A,B | STUDY,TRAVEL |
| Ana | Member | VII,II,VI | A,ALL,B | STUDY,STUDY,TRAVEL |
我想识别整个部分、sub-section 和标题列的不同组合,以及取消嵌套以将其作为输出
| name | role | section.sub_section | title |
| ---- | ------- | ------------------- | ------------------------ |
| John | Lead | VII.A | STUDY
| John | Lead | VII.C | STUDY
| Olga | Member | VII.A | STUDY
| Ben | Co-Lead | XI.A | STUDY
| Ben | Co-Lead | X.B | TRAVEL
| Ana | Member | VII.A | STUDY
| Ana | Member | II.ALL | STUDY
| Ana | Member | VI.B | TRAVEL
我是 SQL 的新手,我真的很难获得所需的输出。非常感谢您的帮助。
你想要的数据没有显示“组合横跨部分,sub-section,和标题列”,看来你需要根据位置匹配相应的数组,所以你可以unnest
并按要区分的字段分组。
假设相应的列包含 varchars 数组(如果不是 - 您将需要使用一些字符串函数来转换它们):
-- sample data
WITH dataset (name, role, section, sub_section, title) AS (
VALUES ('John','Lead',array['VII','VII','VII'],array['A','A','C'],array['STUDY','STUDY','STUDY']),
('Olga','Member',array['VII','VII'],array['A','A'],array['STUDY','STUDY']),
('Ben','Co-Lead',array['XI','X'],array['A','B'],array['STUDY','TRAVEL']),
('Ana','Member',array['VII','II','VI'],array['A','ALL','B'],array['STUDY','STUDY','TRAVEL'])
)
--query
select name,
role,
sec || '.' || sub_sec "section.sub_section",
t title
from dataset
cross join unnest(section, sub_section, title) as t(sec, sub_sec, t)
group by name, role, sec, sub_sec, t
order by name
输出:
name | role | section.sub_section | title |
---|---|---|---|
Ana | Member | VII.A | STUDY |
Ana | Member | II.ALL | STUDY |
Ana | Member | VI.B | TRAVEL |
Ben | Co-Lead | XI.A | STUDY |
Ben | Co-Lead | X.B | TRAVEL |
John | Lead | VII.A | STUDY |
John | Lead | VII.C | STUDY |
Olga | Member | VII.A | STUDY |