Having difficulty generating distinct rows without duplicates in a SELF JOIN (SQL)
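I'm new to SQL and thought I could use it to create a list of my employer's clients. Unfortunately, most clients there have more than one account, and the file has a separate row for each account.
I tried to use a self join to create one row per client, with multiple columns for the accounts.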
SELECT DISTINCT A.Account_Number AS Account_1, B.Account_Number AS Account_2, A.Client_Name
FROM client_table AS A, client_table AS B
WHERE A.Account_Number <> B.Account_Number
AND A.Client_Name = B.Client_Name
ORDER BY A.Client_Name;
Unfortunately, the resulting table looks like this:
Account_1 | Account_2 | Client_name |
---|---|---|
000001 | 000002 | Joe Shmo |
000001 | 000003 | Joe Shmo |
000002 | 000003 | Joe Shmo |
000002 | 000001 | Joe Shmo |
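I know that for more than two accounts I would need more than two joins, but I haven't figured out how to do that yet.
Is there a way to prevent the duplicate entries?
By the way, I'm using BigQuery.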
You can do this with LEFT JOIN. Repeat the join as many times as needed for the number of accounts a user might have.
SELECT DISTINCT ct.Client_Name, ct1.Account_Number AS Account_1, ct2.Account_Number AS Account_2,
ct3.Account_Number AS Account_3, ct4.Account_Number AS Account_4
FROM client_table ct
LEFT JOIN client_table ct1 ON ct1.Client_Name = ct.Client_Name AND ct1.Account_Number = (SELECT MIN(Account_Number) FROM client_table WHERE Client_Name = ct.Client_Name)
LEFT JOIN client_table ct2 ON ct2.Client_Name = ct.Client_Name AND ct2.Account_Number = (SELECT MIN(Account_Number) FROM client_table WHERE Client_Name = ct.Client_Name AND Account_Number > ct1.Account_Number)
LEFT JOIN client_table ct3 ON ct3.Client_Name = ct.Client_Name AND ct3.Account_Number = (SELECT MIN(Account_Number) FROM client_table WHERE Client_Name = ct.Client_Name AND Account_Number > ct2.Account_Number)
LEFT JOIN client_table ct4 ON ct4.Client_Name = ct.Client_Name AND ct4.Account_Number = (SELECT MIN(Account_Number) FROM client_table WHERE Client_Name = ct.Client_Name AND Account_Number > ct3.Account_Number)
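As a side note on the two-account case: the mirrored rows in the question (000001/000002 and 000002/000001) come from the `<>` comparison, which matches every pair of accounts in both orders. A minimal sketch, assuming the question's client_table and column names, keeps only one ordering of each pair by using `<` instead:

```sql
-- Sketch only: same table and columns as in the question.
-- "<" keeps each pair once (lower account number first),
-- so the mirrored row is never produced.
SELECT DISTINCT
  A.Client_Name,
  A.Account_Number AS Account_1,
  B.Account_Number AS Account_2
FROM client_table AS A
JOIN client_table AS B
  ON A.Client_Name = B.Client_Name
 AND A.Account_Number < B.Account_Number
ORDER BY Client_Name, Account_1, Account_2;
```

This removes only the mirrored duplicates. A client with three accounts still yields three rows (one per pair), and a client with a single account does not appear at all, so it is not a substitute for the one-row-per-client queries.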
You can use GROUP_CONCAT() to get a comma-separated list.
Or you can use RANK() in a CTE to rank your accounts and then list them.
Note: any 7th or later account will not be shown!
create table clients (
cname varchar(10),
account char(6));
insert into clients values
('Joe Shmo','000001'),
('Joe Shmo','000002'),
('Joe Shmo','000003'),
('Joe Shmo','000004');
select
cname "Name",
group_concat(account ) account_numbers
from clients
group by cname
order by cname;
Name | account_numbers
:------- | :--------------------------
Joe Shmo | 000001,000002,000003,000004
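GROUP_CONCAT as used above is MySQL syntax. In BigQuery standard SQL the closest equivalent is STRING_AGG. A minimal sketch, assuming a table with the same cname and account columns as the fiddle above already exists in BigQuery:

```sql
-- BigQuery standard SQL sketch; assumes clients(cname, account) as above.
-- STRING_AGG joins the values with the given delimiter; the ORDER BY
-- inside the call fixes the order of the accounts within the list.
SELECT
  cname,
  STRING_AGG(account, ',' ORDER BY account) AS account_numbers
FROM clients
GROUP BY cname
ORDER BY cname;
```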
with c as
(select
cname,
account,
rank() over (partition by cname order by account) ranking
from clients)
select
coalesce(max(case when ranking=1 then account end),' -') account1,
coalesce(max(case when ranking=2 then account end),' -') account2,
coalesce(max(case when ranking=3 then account end),' -') account3,
coalesce(max(case when ranking=4 then account end),' -') account4,
coalesce(max(case when ranking=5 then account end),' -') account5,
coalesce(max(case when ranking=6 then account end),' -') account6
from c
group by cname
account1 | account2 | account3 | account4 | account5 | account6
:------- | :------- | :------- | :------- | :------- | :-------
000001 | 000002 | 000003 | 000004 | - | -
For more information on group_concat in BigQuery, see the following post:
In a first step, number each client's accounts. Then use conditional aggregation to get one column per account.
select
client_name,
max(case when rn = 1 then account_number end) as account1,
max(case when rn = 2 then account_number end) as account2,
max(case when rn = 3 then account_number end) as account3,
max(case when rn = 4 then account_number end) as account4,
max(case when rn = 5 then account_number end) as account5,
max(case when rn = 6 then account_number end) as account6
from
(
select
client_name,
account_number,
row_number() over (partition by client_name order by account_number) as rn
from client_table
) numbered
group by client_name
order by client_name;
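If the fixed number of columns is a concern, one possible BigQuery-specific variation (a sketch, not part of the answer above) is to collect each client's accounts into an ordered array, so that no account is silently dropped. Individual columns can still be picked out with SAFE_OFFSET, which returns NULL when a client has fewer accounts. It assumes the same client_table and column names; per_client and account_count are names made up for this example:

```sql
-- BigQuery standard SQL sketch; per_client and account_count are
-- illustrative names, not from the original question or answers.
WITH per_client AS (
  SELECT
    client_name,
    -- One ordered array of account numbers per client; NULLs are skipped.
    ARRAY_AGG(account_number IGNORE NULLS ORDER BY account_number) AS accounts
  FROM client_table
  GROUP BY client_name
)
SELECT
  client_name,
  accounts[SAFE_OFFSET(0)] AS account1,  -- NULL if the client has no such account
  accounts[SAFE_OFFSET(1)] AS account2,
  accounts[SAFE_OFFSET(2)] AS account3,
  ARRAY_LENGTH(accounts) AS account_count
FROM per_client
ORDER BY client_name;
```

The account_count column makes it easy to spot clients who have more accounts than there are output columns.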