FULL OUTER JOIN to merge tables with PostgreSQL
Following up on a related post, I still have an issue when I apply the answer given by @Vao Tsun to a bigger dataset, this time made of 4 tables instead of the 2 tables in that related post.
Here are my datasets:
-- Table 'brcht' (empty)
insee | annee | nb
-------+--------+-----
-- Table 'cana'
insee | annee | nb
-------+--------+-----
036223 | 2017 | 1
086001 | 2016 | 2
-- Table 'font' (empty)
insee | annee | nb
-------+--------+-----
-- Table 'nr'
insee | annee | nb
-------+--------+-----
036223 | 2013 | 1
036223 | 2014 | 1
086001 | 2013 | 1
086001 | 2014 | 2
086001 | 2015 | 4
086001 | 2016 | 2
Here is the query:
SELECT
COALESCE(brcht.insee, cana.insee, font.insee, nr.insee) AS insee,
COALESCE(brcht.annee, cana.annee, font.annee, nr.annee) AS annee,
COALESCE(brcht.nb,0) AS brcht,
COALESCE(cana.nb,0) AS cana,
COALESCE(font.nb,0) AS font,
COALESCE(nr.nb,0) AS nr,
COALESCE(brcht.nb,0) + COALESCE(cana.nb,0) + COALESCE(font.nb,0) + COALESCE(nr.nb,0) AS total
FROM public.brcht
FULL OUTER JOIN public.cana ON brcht.insee = cana.insee AND brcht.annee = cana.annee
FULL OUTER JOIN public.font ON cana.insee = font.insee AND cana.annee = font.annee
FULL OUTER JOIN public.nr ON font.insee = nr.insee AND font.annee = nr.annee
ORDER BY COALESCE(brcht.insee, cana.insee, font.insee, nr.insee), COALESCE(brcht.annee, cana.annee, font.annee, nr.annee);
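In the result, I still get two rows instead of one for insee='086001' (see below). I need one row for each insee and annee combination; in this example, the two 2 values should be on the same row, with the total column showing 4.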
Thanks again for your help!
Here is a SQL script to easily create the tables above:
CREATE TABLE public.brcht (insee CHARACTER VARYING(10), annee INTEGER, nb INTEGER);
CREATE TABLE public.cana (insee CHARACTER VARYING(10), annee INTEGER, nb INTEGER);
CREATE TABLE public.font (insee CHARACTER VARYING(10), annee INTEGER, nb INTEGER);
CREATE TABLE public.nr (insee CHARACTER VARYING(10), annee INTEGER, nb INTEGER);
INSERT INTO public.cana (insee, annee, nb) VALUES ('036223', 2017, 1), ('086001', 2016, 2);
INSERT INTO public.nr(insee, annee, nb) VALUES ('036223', 2013, 1), ('036223', 2014, 1), ('086001', 2013, 1), ('086001', 2014, 2), ('086001', 2015, 4), ('086001', 2016, 2);
Try:
t=# SELECT
COALESCE(brcht.insee, cana.insee, font.insee, nr.insee) AS insee,
COALESCE(brcht.annee, cana.annee, font.annee, nr.annee) AS annee,
COALESCE(brcht.nb,0) AS brcht,
COALESCE(cana.nb,0) AS cana,
COALESCE(font.nb,0) AS font,
COALESCE(nr.nb,0) AS nr,
COALESCE(brcht.nb,0) + COALESCE(cana.nb,0) + COALESCE(font.nb,0) + COALESCE(nr.nb,0) AS total
FROM public.brcht
FULL OUTER JOIN public.cana ON brcht.insee = cana.insee AND brcht.annee = cana.annee
FULL OUTER JOIN public.font ON cana.insee = font.insee AND cana.annee = font.annee
FULL OUTER JOIN public.nr ON cana.insee = nr.insee AND cana.annee = nr.annee
ORDER BY COALESCE(brcht.insee, cana.insee, font.insee, nr.insee), COALESCE(brcht.annee, cana.annee, font.annee, nr.annee);
insee | annee | brcht | cana | font | nr | total
--------+-------+-------+------+------+----+-------
036223 | 2013 | 0 | 0 | 0 | 1 | 1
036223 | 2014 | 0 | 0 | 0 | 1 | 1
036223 | 2017 | 0 | 1 | 0 | 0 | 1
086001 | 2013 | 0 | 0 | 0 | 1 | 1
086001 | 2014 | 0 | 0 | 0 | 2 | 2
086001 | 2015 | 0 | 0 | 0 | 4 | 4
086001 | 2016 | 0 | 2 | 0 | 2 | 4
(7 rows)
In your example, you join nr against font, while you probably want to join it against cana?..
Also take a look here:
https://www.postgresql.org/docs/current/static/queries-table-expressions.html#QUERIES-JOIN
In the absence of parentheses, JOIN clauses nest left-to-right
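To make the quoted rule concrete, the FROM clause of the question is read as if it were parenthesized as in the sketch below. It uses the same tables and join conditions as the question; only the parentheses and the layout are added for illustration.

```sql
-- Explicit nesting equivalent to the question's FROM clause:
-- each join takes the whole result to its left as its left input.
SELECT *
FROM (
        (public.brcht
         FULL OUTER JOIN public.cana
             ON brcht.insee = cana.insee AND brcht.annee = cana.annee)
        FULL OUTER JOIN public.font
            ON cana.insee = font.insee AND cana.annee = font.annee
     )
FULL OUTER JOIN public.nr
    ON font.insee = nr.insee AND font.annee = nr.annee;
```

The last join therefore compares nr only with the font columns of the intermediate result, whatever the other tables contain.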
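Update
To explain the logic:
Try select * from public.brcht, then add the other tables one by one. The columns from each "righter" table appear, so when you run the query with all four tables joined, you get: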
t=# select *
FROM public.brcht
FULL OUTER JOIN public.cana ON brcht.insee = cana.insee AND brcht.annee = cana.annee
FULL OUTER JOIN public.font ON cana.insee = font.insee AND cana.annee = font.annee
FULL OUTER JOIN public.nr ON font.insee = nr.insee AND font.annee = nr.annee
t-# ;
insee | annee | nb | insee | annee | nb | insee | annee | nb | insee | annee | nb
-------+-------+----+--------+-------+----+-------+-------+----+--------+-------+----
| | | 036223 | 2017 | 1 | | | | | |
| | | 086001 | 2016 | 2 | | | | | |
| | | | | | | | | 036223 | 2013 | 1
| | | | | | | | | 036223 | 2014 | 1
| | | | | | | | | 086001 | 2013 | 1
| | | | | | | | | 086001 | 2014 | 2
| | | | | | | | | 086001 | 2015 | 4
| | | | | | | | | 086001 | 2016 | 2
(8 rows)
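So column 8 is font.annee (note that it is empty everywhere, as is font.insee in column 7). You join these against nr.insee and nr.annee, nothing matches, so the full join keeps all rows from the join of the first three tables plus all rows from the nr table: 2 + 6 = 8 rows.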
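Joining nr against cana works for this data set because cana happens to hold the matching keys. As a more general sketch (not part of the original answer), each table can instead be joined against the COALESCE of the keys of all tables to its left, so a match is still found when the immediately preceding table has no row for that key. The column names are those of the question; the layout and the ORDER BY 1, 2 shorthand are my own choices.

```sql
-- Sketch only: join each table against the keys of ALL tables to its left,
-- so a row still matches when the previous table has no row for that key.
SELECT
    COALESCE(brcht.insee, cana.insee, font.insee, nr.insee) AS insee,
    COALESCE(brcht.annee, cana.annee, font.annee, nr.annee) AS annee,
    COALESCE(brcht.nb, 0) AS brcht,
    COALESCE(cana.nb, 0)  AS cana,
    COALESCE(font.nb, 0)  AS font,
    COALESCE(nr.nb, 0)    AS nr,
    COALESCE(brcht.nb, 0) + COALESCE(cana.nb, 0)
      + COALESCE(font.nb, 0) + COALESCE(nr.nb, 0) AS total
FROM public.brcht
FULL OUTER JOIN public.cana
    ON  brcht.insee = cana.insee
    AND brcht.annee = cana.annee
FULL OUTER JOIN public.font
    ON  COALESCE(brcht.insee, cana.insee) = font.insee
    AND COALESCE(brcht.annee, cana.annee) = font.annee
FULL OUTER JOIN public.nr
    ON  COALESCE(brcht.insee, cana.insee, font.insee) = nr.insee
    AND COALESCE(brcht.annee, cana.annee, font.annee) = nr.annee
ORDER BY 1, 2;
```

On the sample data this should return the same seven rows as the query at the top of this answer, and it keeps working if a key exists only in, say, font and nr.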
You need to apply GROUP BY and SUM() to the count columns, on top of the query you are using now.
select
insee, annee
, sum(brcht) brcht
, sum(cana) cana
, sum(font) font
, sum(nr) nr
, sum(total) total
from (
SELECT
COALESCE(brcht.insee, cana.insee, font.insee, nr.insee) AS insee,
COALESCE(brcht.annee, cana.annee, font.annee, nr.annee) AS annee,
COALESCE(brcht.nb,0) AS brcht,
COALESCE(cana.nb,0) AS cana,
COALESCE(font.nb,0) AS font,
COALESCE(nr.nb,0) AS nr,
COALESCE(brcht.nb,0) + COALESCE(cana.nb,0) + COALESCE(font.nb,0) + COALESCE(nr.nb,0) AS total
FROM public.brcht
FULL OUTER JOIN public.cana ON brcht.insee = cana.insee AND brcht.annee = cana.annee
FULL OUTER JOIN public.font ON cana.insee = font.insee AND cana.annee = font.annee
FULL OUTER JOIN public.nr ON font.insee = nr.insee AND font.annee = nr.annee
) d
group by
insee, annee
Inspired by the other answers, but perhaps better organized:
SELECT *,
brcht + cana + font + nr AS total
FROM (SELECT insee,
annee,
SUM(Coalesce(brcht.nb, 0)) brcht,
SUM(Coalesce(cana.nb, 0)) cana,
SUM(Coalesce(font.nb, 0)) font,
SUM(Coalesce(nr.nb, 0)) nr
FROM brcht
full outer join cana USING (insee, annee)
full outer join font USING (insee, annee)
full outer join nr USING (insee, annee)
GROUP BY insee,
annee) t
ORDER BY insee,
annee;
Which gives:
insee | annee | brcht | cana | font | nr | total
--------+-------+-------+------+------+----+-------
036223 | 2013 | 0 | 0 | 0 | 1 | 1
036223 | 2014 | 0 | 0 | 0 | 1 | 1
036223 | 2017 | 0 | 1 | 0 | 0 | 1
086001 | 2013 | 0 | 0 | 0 | 1 | 1
086001 | 2014 | 0 | 0 | 0 | 2 | 2
086001 | 2015 | 0 | 0 | 0 | 4 | 4
086001 | 2016 | 0 | 2 | 0 | 2 | 4
(7 rows)
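For comparison, here is a sketch of a join-free variant that none of the answers above propose: stack the four tables with UNION ALL, tag each row with its source table, then aggregate per (insee, annee). The src column and the u alias are hypothetical names introduced for the example, and the FILTER clause assumes PostgreSQL 9.4 or later.

```sql
-- Sketch only: stack all rows, tag them with their source, aggregate per key.
SELECT
    insee,
    annee,
    COALESCE(SUM(nb) FILTER (WHERE src = 'brcht'), 0) AS brcht,
    COALESCE(SUM(nb) FILTER (WHERE src = 'cana'),  0) AS cana,
    COALESCE(SUM(nb) FILTER (WHERE src = 'font'),  0) AS font,
    COALESCE(SUM(nb) FILTER (WHERE src = 'nr'),    0) AS nr,
    COALESCE(SUM(nb), 0)                              AS total
FROM (
    SELECT 'brcht'::text AS src, insee, annee, nb FROM public.brcht
    UNION ALL
    SELECT 'cana'::text, insee, annee, nb FROM public.cana
    UNION ALL
    SELECT 'font'::text, insee, annee, nb FROM public.font
    UNION ALL
    SELECT 'nr'::text, insee, annee, nb FROM public.nr
) AS u
GROUP BY insee, annee
ORDER BY insee, annee;
```

One design difference worth noting: if a table ever holds several rows for the same (insee, annee), a chain of full joins multiplies the matching rows before summing, whereas this form counts every source row exactly once.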