Having difficulty generating distinct rows without duplicates in a SELF JOIN (SQL)

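I'm new to SQL, and I thought I could use it to build a list of my employer's clients. Unfortunately, most clients have more than one account, and the file has a separate row for each account.

I'm trying to use a self join to produce one row per client, with the accounts spread across multiple columns.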

SELECT DISTINCT A.Account_Number AS Account_1, B.Account_Number AS Account_2, A.Client_Name
FROM client_table AS A, client_table AS B
WHERE A.Account_Number <> B.Account_Number
AND A.Client_Name = B.Client_Name
ORDER BY A.Client_Name;

Unfortunately, the result is a table that looks like this:

Account_1 Account_2 Client_name
000001 000002 Joe Shmo
000001 000003 Joe Shmo
000002 000003 Joe Shmo
000002 000001 Joe Shmo

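I know that for more than two accounts I'd need more than two joins, but I haven't figured out how to do that yet.

Is there a way to prevent the duplicate entries?

By the way, I'm using BigQuery.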

You can do this with LEFT JOIN. Repeat the join once for each additional account a client might have.

SELECT DISTINCT ct.Client_Name, ct1.Account_Number AS Account_1, ct2.Account_Number AS Account_2,
                ct3.Account_Number AS Account_3, ct4.Account_Number AS Account_4
FROM client_table ct
LEFT JOIN client_table ct1 ON ct1.Client_Name = ct.Client_Name AND ct1.Account_Number = (SELECT MIN(Account_Number) FROM client_table WHERE Client_Name = ct.Client_Name)
LEFT JOIN client_table ct2 ON ct2.Client_Name = ct.Client_Name AND ct2.Account_Number = (SELECT MIN(Account_Number) FROM client_table WHERE Client_Name = ct.Client_Name AND Account_Number > ct1.Account_Number)
LEFT JOIN client_table ct3 ON ct3.Client_Name = ct.Client_Name AND ct3.Account_Number = (SELECT MIN(Account_Number) FROM client_table WHERE Client_Name = ct.Client_Name AND Account_Number > ct2.Account_Number)
LEFT JOIN client_table ct4 ON ct4.Client_Name = ct.Client_Name AND ct4.Account_Number = (SELECT MIN(Account_Number) FROM client_table WHERE Client_Name = ct.Client_Name AND Account_Number > ct3.Account_Number)
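
For the two-account case in the question, the mirrored rows (000001/000002 and 000002/000001) come from the `<>` comparison, which matches every pair in both directions. A minimal sketch of one way to avoid that, assuming the same column names as the question and using a few inline rows (hypothetical sample data) in place of the real client_table:

```sql
-- Inline sample rows stand in for client_table (hypothetical data).
WITH client_table AS (
  SELECT 'Joe Shmo' AS Client_Name, '000001' AS Account_Number UNION ALL
  SELECT 'Joe Shmo', '000002' UNION ALL
  SELECT 'Joe Shmo', '000003'
)
SELECT A.Client_Name,
       A.Account_Number AS Account_1,
       B.Account_Number AS Account_2
FROM client_table AS A
JOIN client_table AS B
  ON A.Client_Name = B.Client_Name
 -- "<" instead of "<>" keeps (000001, 000002) but drops its mirror (000002, 000001)
 AND A.Account_Number < B.Account_Number
ORDER BY A.Client_Name, Account_1, Account_2;
```

With the three sample accounts this returns three rows: (000001, 000002), (000001, 000003) and (000002, 000003). It removes the mirrored pairs, but it still produces one row per pair rather than one row per client, so it only helps when a client has at most two accounts.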

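You can use GROUP_CONCAT() to get a comma-separated list.
Alternatively, you can use RANK() in a CTE to order each client's accounts and then list them in separate columns.
Note: the query below has six account columns, so a seventh or later account will not be shown!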

create table clients (
cname varchar(10),
account char(6));
insert into clients values
('Joe Shmo','000001'),
('Joe Shmo','000002'),
('Joe Shmo','000003'),
('Joe Shmo','000004');

select 
   cname "Name", 
   group_concat(account ) account_numbers
from clients
group by cname
order by cname;

Name     | account_numbers            
:------- | :--------------------------
Joe Shmo | 000001,000002,000003,000004

with c as 
(select
cname,
account,
rank() over (partition by cname order by account) ranking
from clients)
select 
coalesce(max(case when ranking=1 then account end),'   -') account1,
coalesce(max(case when ranking=2 then account end),'   -') account2,
coalesce(max(case when ranking=3 then account end),'   -') account3,
coalesce(max(case when ranking=4 then account end),'   -') account4,
coalesce(max(case when ranking=5 then account end),'   -') account5,
coalesce(max(case when ranking=6 then account end),'   -') account6
from c
group by cname;

account1 | account2 | account3 | account4 | account5 | account6
:------- | :------- | :------- | :------- | :------- | :-------
000001   | 000002   | 000003   | 000004   |    -     |    -    
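
Since the question mentions BigQuery: GROUP_CONCAT() belongs to BigQuery's legacy SQL dialect, and in Standard SQL the counterpart is STRING_AGG(). A minimal sketch of the comma-separated list under that assumption, with the sample rows inlined so that no table has to be created:

```sql
-- Inline sample rows stand in for the clients table (same data as the example above).
WITH clients AS (
  SELECT 'Joe Shmo' AS cname, '000001' AS account UNION ALL
  SELECT 'Joe Shmo', '000002' UNION ALL
  SELECT 'Joe Shmo', '000003' UNION ALL
  SELECT 'Joe Shmo', '000004'
)
SELECT
  cname AS name,
  -- The ORDER BY inside the aggregate makes the order of the list deterministic.
  STRING_AGG(account, ',' ORDER BY account) AS account_numbers
FROM clients
GROUP BY cname
ORDER BY cname;
```

Unlike the six-column pivot, the aggregated string has no fixed limit on the number of accounts per client.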

For more information about group_concat in BigQuery, see the following post:

First, number each client's accounts. Then use conditional aggregation to get one column per account.

select 
  client_name,
  max(case when rn = 1 then account_number end) as account1,
  max(case when rn = 2 then account_number end) as account2,
  max(case when rn = 3 then account_number end) as account3,
  max(case when rn = 4 then account_number end) as account4,
  max(case when rn = 5 then account_number end) as account5,
  max(case when rn = 6 then account_number end) as account6
from
(
  select
    client_name,
    account_number,
    row_number() over (partition by client_name order by account_number) as rn
  from client_table
) numbered
group by client_name
order by client_name;
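
If the number of accounts per client is not known in advance, a fixed set of account columns will silently drop the extras. One possible alternative in BigQuery Standard SQL, offered only as a sketch, is to collect the accounts into an array with ARRAY_AGG() and read individual positions with SAFE_OFFSET(), which returns NULL for a missing position. The inline rows and the second client are hypothetical sample data:

```sql
-- Inline sample rows stand in for client_table (hypothetical data).
WITH client_table AS (
  SELECT 'Joe Shmo' AS client_name, '000001' AS account_number UNION ALL
  SELECT 'Joe Shmo', '000002' UNION ALL
  SELECT 'Joe Shmo', '000003' UNION ALL
  SELECT 'Jane Doe', '000010'
),
grouped AS (
  SELECT
    client_name,
    -- One array per client, sorted by account number.
    ARRAY_AGG(account_number ORDER BY account_number) AS accounts
  FROM client_table
  GROUP BY client_name
)
SELECT
  client_name,
  ARRAY_LENGTH(accounts) AS account_count,
  accounts[SAFE_OFFSET(0)] AS account1,  -- NULL when the position does not exist
  accounts[SAFE_OFFSET(1)] AS account2,
  accounts[SAFE_OFFSET(2)] AS account3,
  accounts                                -- the full list, however long it is
FROM grouped
ORDER BY client_name;
```

The account_count column makes it easy to spot clients with more accounts than the query has columns for, and the accounts array still carries every account for them.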