Determining the unique household in a column

Name     | Name_Partner | Startdate_relation | Enddate_relation
:------- | :----------- | :----------------- | :---------------
John     | Wilma        | 01-01-1990         | NULL
Wilma    | John         | 01-01-1990         | NULL
John     | Lucy         | 01-01-1995         | 31-01-1995
Lucy     | John         | 01-01-1995         | 31-01-1995
Lucy     | Ronaldo      | 01-02-1995         | NULL
Ronaldo  | Lucy         | 01-02-1995         | NULL
Ronaldo  | Kim          | 01-01-1995         | 31-01-1995
Ronaldo  | Kim          | 01-01-1998         | 24-07-1998
Kim      | Ronaldo      | 01-01-1995         | 31-01-1995
Kim      | Ronaldo      | 01-01-1998         | 24-07-1998
Kim      | Angelina     | 01-02-1995         | NULL
Angelina | Kim          | 01-02-1995         | NULL

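I have a table containing the information above (say, table "information_Client"). The table has unique client numbers rather than names, but for this example I have used fictional names. I want a query that determines the unique households. The table stores each relationship in both directions, which means the same person appears in both the "name" and "name_partner" columns. If two people have a relationship, the UniqueHousehold column should always hold the first person of the "name" column. I am not only looking for relationships that have not ended, but for all relationships. I want to show the unique households through the years/months. Is there a query for this in sql-server?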

Expected result:

Name     | Name_Partner | UniqueHousehold | Startdate_relation | Enddate_relation
:------- | :----------- | :-------------- | :----------------- | :---------------
John     | Wilma        | John            | 01-01-1990         | NULL
Wilma    | John         | John            | 01-01-1990         | NULL
John     | Lucy         | John            | 01-01-1995         | 31-01-1995
Lucy     | John         | John            | 01-01-1995         | 31-01-1995
Lucy     | Ronaldo      | Lucy            | 01-02-1995         | NULL
Ronaldo  | Lucy         | Lucy            | 01-02-1995         | NULL
Ronaldo  | Kim          | Kim             | 01-01-1995         | 31-01-1995
Ronaldo  | Kim          | Kim             | 01-01-1998         | 24-07-1998
Kim      | Ronaldo      | Kim             | 01-01-1995         | 31-01-1995
Kim      | Ronaldo      | Kim             | 01-01-1998         | 24-07-1998
Kim      | Angelina     | Angelina        | 01-02-1995         | NULL
Angelina | Kim          | Angelina        | 01-02-1995         | NULL

The code I tried is this:

SELECT NAME, NAME_PARTNER, Startdate_relation, Enddate_relation, Load_DateTime
INTO #Customer
FROM
    Information_Client
;
WITH RowNum AS(
 SELECT *, ROW_NUMBER() OVER(ORDER BY NAME) AS Id
  FROM #Customer)

INSERT INTO Client
(      [Name]
      ,[Name_Partner]
      ,[UniqueHousehold]
      ,[Startdate_relation]
      ,[Enddate_relation]
      ,[Load_DateTime]
)
  SELECT
    C.NAME
    , C.NAME_PARTNER
    , COALESCE(HH.NAME, C.NAME) AS UniqueHousehold
    , C.Startdate_relation
    , C.Enddate_relation
    , C.Load_DateTime

FROM
    RowNum           AS C
    LEFT JOIN RowNum AS HH
        ON C.NAME_PARTNER = HH.NAME
        AND C.Id > HH.Id

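What I get is multiple rows for what should be a single row, because of the generated IDs. I want just one row per relationship, as in the table above.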

This is the result I get now:

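Name    | Name_Partner | UniqueHousehold | Startdate_relation | Enddate_relation | C.ID | HH.ID
:------ | :----------- | :-------------- | :----------------- | :--------------- | :--- | :----
Ronaldo | Kim          | Kim             | 01-01-1995         | 31-01-1995       | 29   | NULL
Ronaldo | Kim          | Kim             | 01-01-1998         | 24-07-1998       | 30   | NULL
Kim     | Ronaldo      | Kim             | 01-01-1995         | 31-01-1995       | 6177 | 29
Kim     | Ronaldo      | Kim             | 01-01-1995         | 31-01-1995       | 6177 | 30
Kim     | Ronaldo      | Kim             | 01-01-1998         | 24-07-1998       | 6178 | 29
Kim     | Ronaldo      | Kim             | 01-01-1998         | 24-07-1998       | 6178 | 30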

Desired result:

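Name    | Name_Partner | UniqueHousehold | Startdate_relation | Enddate_relation
:------ | :----------- | :-------------- | :----------------- | :---------------
Ronaldo | Kim          | Kim             | 01-01-1995         | 31-01-1995
Ronaldo | Kim          | Kim             | 01-01-1998         | 24-07-1998
Kim     | Ronaldo      | Kim             | 01-01-1995         | 31-01-1995
Kim     | Ronaldo      | Kim             | 01-01-1998         | 24-07-1998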

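To identify a unique household we need to include both unique identifiers. To make sure that 1-2 is treated the same as 2-1, we always put whichever comes first alphabetically in first position. Here we use names, but the same principle works with ids, as long as we combine them by concatenating with a separator rather than by numeric addition. We need to be able to tell the difference between A+BC and AB+C, and between 10+20 and 20+10.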
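As a small illustration of why the separator matters, the sketch below compares the three ways of combining two identifiers. The literal values are made up for the example and are not taken from the question's data.

```sql
-- Hypothetical identifiers, for illustration only.
select
  concat('A', 'BC')      as no_separator_1,   -- 'ABC'
  concat('AB', 'C')      as no_separator_2,   -- 'ABC', collides with the line above
  concat('A', '-', 'BC') as with_separator_1, -- 'A-BC'
  concat('AB', '-', 'C') as with_separator_2, -- 'AB-C', now distinct
  10 + 20                as added_1,          -- 30
  5 + 25                 as added_2;          -- 30, a different pair with the same sum
```

Plain concatenation and numeric addition can both map two different pairs to the same key; concatenation with a separator that cannot occur inside an identifier avoids this.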

create table t (Name  varchar(10),Name_Partner    varchar(10),Startdate_relation  date,Enddate_relation date);
insert into t values
('John','Lucy','1995-01-01','1995-01-31'),
('Lucy','John','1995-01-01','1995-01-31'),
('Ronaldo','Kim','1995-01-01','1995-01-31'),
('Ronaldo','Kim','1998-01-01','1998-07-24'),
('Kim','Ronaldo','1995-01-01','1995-01-31'),
('Kim','Ronaldo','1998-01-01','1998-07-24');
insert into t (Name, Name_Partner, Startdate_relation) values
('John','Wilma','1990-01-01' ),
('Wilma','John','1990-01-01' ),
('Lucy','Ronaldo','1995-02-01' ),
('Ronaldo','Lucy','1995-02-01' ),
('Kim','Angelina','1995-02-01' ),
('Angelina','Kim','1995-02-01' );
select
  name,
  name_partner,
  case when name < name_partner then concat(name,'-',name_partner)
       else concat(name_partner,'-',name) end unique_household,
  Startdate_relation,
  Enddate_relation
from t
order by 3;
GO
name     | name_partner | unique_household | Startdate_relation | Enddate_relation
:------- | :----------- | :--------------- | :----------------- | :---------------
Kim      | Angelina     | Angelina-Kim     | 1995-02-01         | null            
Angelina | Kim          | Angelina-Kim     | 1995-02-01         | null            
John     | Lucy         | John-Lucy        | 1995-01-01         | 1995-01-31      
Lucy     | John         | John-Lucy        | 1995-01-01         | 1995-01-31      
John     | Wilma        | John-Wilma       | 1990-01-01         | null            
Wilma    | John         | John-Wilma       | 1990-01-01         | null            
Ronaldo  | Kim          | Kim-Ronaldo      | 1995-01-01         | 1995-01-31      
Ronaldo  | Kim          | Kim-Ronaldo      | 1998-01-01         | 1998-07-24      
Kim      | Ronaldo      | Kim-Ronaldo      | 1995-01-01         | 1995-01-31      
Kim      | Ronaldo      | Kim-Ronaldo      | 1998-01-01         | 1998-07-24      
Lucy     | Ronaldo      | Lucy-Ronaldo     | 1995-02-01         | null            
Ronaldo  | Lucy         | Lucy-Ronaldo     | 1995-02-01         | null            
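
Since the real table is said to hold client numbers rather than names, the same query might look like the sketch below. The column names client_id and partner_id and the sample values are assumptions made for this example; the question does not show the actual columns.

```sql
-- Assumption: integer client numbers in hypothetical columns client_id / partner_id.
select
  client_id,
  partner_id,
  -- smaller id first, so both directions of a relationship get the same key
  case when client_id < partner_id then concat(client_id, '-', partner_id)
       else concat(partner_id, '-', client_id) end as unique_household
from (values (10, 20), (20, 10),
             (1, 120), (120, 1),
             (11, 20), (20, 11)) as v (client_id, partner_id)
order by unique_household;
```

With these values the pairs (1, 120) and (11, 20) would both become 1120 without the separator; with it they stay apart as 1-120 and 11-20. The ids are compared as integers here, so "first" means numerically smaller rather than alphabetically first.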
