Collapse Rows into a Single Field via Netezza Query
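I'm working with a table in which variables are stored in long/tall format. I need to convert it to wide format for use in a project. Essentially, I need to aggregate a text field, or collapse rows by name. Sample data is below; the table I'm actually working with has ~400k rows: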
IID  NAME     LANGUAGE  TID
1    William  English   76
1    William  French    82
1    William  Spanish   12
1    William  German    63
2    George   German    39
2    George   French    53
3    Dave     English   29
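What I need is one row per person ID/name, with a single field listing all the languages that person speaks. I don't need to consider the transaction ID (TID).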
IID  NAME     LANGUAGES
1    William  English_French_German_Spanish
2    George   French_German
3    Dave     English
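My database is Netezza, which is a derivative of PostgreSQL. I created a SQL fiddle using PostgreSQL. I managed to capture two languages, but my query drops everything beyond the first and last language when a person has more than 2, and it shows the same language twice when a person has only 1. Can anyone point me in the right direction?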
http://sqlfiddle.com/#!15/55706/1
SELECT T1.IID, T1.NAME,
MIN(T1.LANGUAGE) || '_' || MAX(T1.LANGUAGE) AS LANGUAGES
FROM Table1 AS T1
GROUP BY T1.IID, T1.NAME
ORDER BY T1.IID
;
Try using group_concat.
Your query would look something like this:
SELECT T1.IID, T1.NAME,
GROUP_CONCAT(T1.LANGUAGE,'_') AS LANGUAGES
FROM Table1 AS T1
GROUP BY T1.IID, T1.NAME
ORDER BY T1.IID;
Here is a blog link that may help you better understand this analytic function.
Hope this helps.
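A side note for anyone testing in the PostgreSQL fiddle rather than in Netezza: GROUP_CONCAT is not a PostgreSQL built-in, and the closest built-in aggregate there is STRING_AGG. The sketch below is PostgreSQL-only and makes no claim about Netezza; the inline VALUES list simply stands in for Table1 with the sample rows from the question.

```sql
-- PostgreSQL-only sketch; not verified against Netezza.
-- The CTE stands in for Table1 using the sample rows from the question.
WITH Table1 (IID, NAME, LANGUAGE, TID) AS (
    VALUES
        (1, 'William', 'English', 76),
        (1, 'William', 'French',  82),
        (1, 'William', 'Spanish', 12),
        (1, 'William', 'German',  63),
        (2, 'George',  'German',  39),
        (2, 'George',  'French',  53),
        (3, 'Dave',    'English', 29)
)
SELECT T1.IID, T1.NAME,
       -- ORDER BY inside the aggregate fixes the order of the joined values
       STRING_AGG(T1.LANGUAGE, '_' ORDER BY T1.LANGUAGE) AS LANGUAGES
FROM Table1 AS T1
GROUP BY T1.IID, T1.NAME
ORDER BY T1.IID;
```

Unlike the MIN/MAX attempt in the question, this handles any number of languages per person, because the aggregate sees every row in the group.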
I found the answer while searching for documentation on the command in Dhaval's answer. There's a thread on IBM's DeveloperWorks community that addresses this exact question, Group Concat in Netezza. The solution that worked for me is within the 5th response, written by Diwakar Nahata. Here is the code that solved the problem for me:
SELECT A.IID, A.NAME,
RTRIM(MAX(CASE RNO WHEN 1 THEN A.LANGUAGE ELSE '' END)||','||
MAX(CASE RNO WHEN 2 THEN A.LANGUAGE ELSE '' END)||','||
MAX(CASE RNO WHEN 3 THEN A.LANGUAGE ELSE '' END)||','||
MAX(CASE RNO WHEN 4 THEN A.LANGUAGE ELSE '' END)||','||
MAX(CASE RNO WHEN 5 THEN A.LANGUAGE ELSE '' END)||','||
MAX(CASE RNO WHEN 6 THEN A.LANGUAGE ELSE '' END),',') AS LANGUAGES
FROM (SELECT
IID,
NAME,
LANGUAGE,
ROW_NUMBER()
OVER (PARTITION BY IID, NAME ORDER BY LANGUAGE) AS RNO
FROM Table1 ) AS A
GROUP BY A.IID, A.NAME
;
Here is a link to the working SQL fiddle. The fiddle is set to PostgreSQL, but this query also works well for me in Netezza.
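The query above joins the languages with commas, while the output requested in the question uses underscores. A minimal variant with '_' as both the separator and the RTRIM character is sketched below. It is written for PostgreSQL (the inline VALUES list standing in for Table1 is an assumption for illustration), and it keeps the same limits as the original: at most six languages per person, and a language value that itself ends in '_' would lose that trailing character.

```sql
-- Sketch for PostgreSQL; the CTE stands in for Table1 with the question's sample rows.
WITH Table1 (IID, NAME, LANGUAGE, TID) AS (
    VALUES
        (1, 'William', 'English', 76),
        (1, 'William', 'French',  82),
        (1, 'William', 'Spanish', 12),
        (1, 'William', 'German',  63),
        (2, 'George',  'German',  39),
        (2, 'George',  'French',  53),
        (3, 'Dave',    'English', 29)
)
SELECT A.IID, A.NAME,
       -- Each MAX(CASE ...) picks the language at one rank, or '' if that rank is absent.
       -- RTRIM then strips the separators left over from the empty slots.
       RTRIM(MAX(CASE RNO WHEN 1 THEN A.LANGUAGE ELSE '' END) || '_' ||
             MAX(CASE RNO WHEN 2 THEN A.LANGUAGE ELSE '' END) || '_' ||
             MAX(CASE RNO WHEN 3 THEN A.LANGUAGE ELSE '' END) || '_' ||
             MAX(CASE RNO WHEN 4 THEN A.LANGUAGE ELSE '' END) || '_' ||
             MAX(CASE RNO WHEN 5 THEN A.LANGUAGE ELSE '' END) || '_' ||
             MAX(CASE RNO WHEN 6 THEN A.LANGUAGE ELSE '' END), '_') AS LANGUAGES
FROM (SELECT IID, NAME, LANGUAGE,
             -- Rank each person's languages alphabetically, starting at 1
             ROW_NUMBER() OVER (PARTITION BY IID, NAME ORDER BY LANGUAGE) AS RNO
      FROM Table1) AS A
GROUP BY A.IID, A.NAME
ORDER BY A.IID;
```

With the sample data this should give English_French_German_Spanish for William, French_German for George, and English for Dave, which matches the layout asked for in the question.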