Determining the unique household in a column
Name | Name_Partner | Startdate_relation | Enddate_relation |
---|---|---|---|
John | Wilma | 01-01-1990 | NULL |
Wilma | John | 01-01-1990 | NULL |
John | Lucy | 01-01-1995 | 31-01-1995 |
Lucy | John | 01-01-1995 | 31-01-1995 |
Lucy | Ronaldo | 01-02-1995 | NULL |
Ronaldo | Lucy | 01-02-1995 | NULL |
Ronaldo | Kim | 01-01-1995 | 31-01-1995 |
Ronaldo | Kim | 01-01-1998 | 24-07-1998 |
Kim | Ronaldo | 01-01-1995 | 31-01-1995 |
Kim | Ronaldo | 01-01-1998 | 24-07-1998 |
Kim | Angelina | 01-02-1995 | NULL |
Angelina | Kim | 01-02-1995 | NULL |
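I have a table with the information above (say, table "information_Client"). The real table has unique customer numbers instead of names, but for this example I used made-up names. I want to query the unique households that can be determined. The table stores each relationship in both directions, so the same person appears in both the "name" column and the "name_partner" column. If two people are in a relationship, the UniqueHousehold column should always be the first person of the "name" column. I am not looking only for relationships that have not ended, but for all relationships. I want to show the unique households by year/month. Is there a query for this in SQL Server?
Expected result: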
Name | Name_Partner | UniqueHousehold | Startdate_relation | Enddate_relation |
---|---|---|---|---|
John | Wilma | John | 01-01-1990 | NULL |
Wilma | John | John | 01-01-1990 | NULL |
John | Lucy | John | 01-01-1995 | 31-01-1995 |
Lucy | John | John | 01-01-1995 | 31-01-1995 |
Lucy | Ronaldo | Lucy | 01-02-1995 | NULL |
Ronaldo | Lucy | Lucy | 01-02-1995 | NULL |
Ronaldo | Kim | Kim | 01-01-1995 | 31-01-1995 |
Ronaldo | Kim | Kim | 01-01-1998 | 24-07-1998 |
Kim | Ronaldo | Kim | 01-01-1995 | 31-01-1995 |
Kim | Ronaldo | Kim | 01-01-1998 | 24-07-1998 |
Kim | Angelina | Angelina | 01-02-1995 | NULL |
Angelina | Kim | Angelina | 01-02-1995 | NULL |
The code I tried is this:
SELECT NAME, NAME_PARTNER, Startdate_relation, Enddate_relation, Load_DateTime
INTO #Customer
FROM
Information_Client
;
WITH RowNum AS(
SELECT *, ROW_NUMBER() OVER(ORDER BY NAME) AS Id
FROM #Customer)
INSERT INTO Client
( [Name]
,[Name_Partner]
,[UniqueHousehold]
,[Startdate_relation]
,[Enddate_relation]
,[Load_DateTime]
)
SELECT
C.NAME
, C.NAME_PARTNER
, COALESCE(HH.NAME, C.NAME) AS UniqueHousehold
, C.Startdate_relation
, C.Enddate_relation
, C.Load_DateTime
FROM
RowNum AS C
LEFT JOIN RowNum AS HH
ON C.NAME_PARTNER = HH.NAME
AND C.Id > HH.Id
What I get is multiple rows for a single row, because of the generated Id. I want only one row, as in the table above.
This is the result I get now:
Name | Name_Partner | UniqueHousehold | Startdate_relation | Enddate_relation | C.ID | HH.ID |
---|---|---|---|---|---|---|
Ronaldo | Kim | Kim | 01-01-1995 | 31-01-1995 | 29 | NULL |
Ronaldo | Kim | Kim | 01-01-1998 | 24-07-1998 | 30 | NULL |
Kim | Ronaldo | Kim | 01-01-1995 | 31-01-1995 | 6177 | 29 |
Kim | Ronaldo | Kim | 01-01-1995 | 31-01-1995 | 6177 | 30 |
Kim | Ronaldo | Kim | 01-01-1998 | 24-07-1998 | 6178 | 29 |
Kim | Ronaldo | Kim | 01-01-1998 | 24-07-1998 | 6178 | 30 |
Desired result:
Name | Name_Partner | UniqueHousehold | Startdate_relation | Enddate_relation |
---|---|---|---|---|
Ronaldo | Kim | Kim | 01-01-1995 | 31-01-1995 |
Ronaldo | Kim | Kim | 01-01-1998 | 24-07-1998 |
Kim | Ronaldo | Kim | 01-01-1995 | 31-01-1995 |
Kim | Ronaldo | Kim | 01-01-1998 | 24-07-1998 |
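To identify a unique household we need to include both unique identifiers. To make sure that 1-2 is treated the same as 2-1, we specify that whichever comes first alphabetically is placed first. Here we use names, but the same principle works with ids, as long as we combine them by concatenating with a separator rather than by numeric addition. Without a separator, A+BC and AB+C would produce the same key; without a fixed order, 10+20 and 20+10 would produce different ones.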
create table t (Name varchar(10),Name_Partner varchar(10),Startdate_relation date,Enddate_relation date);
insert into t values
('John','Lucy','1995-01-01','1995-01-31'),
('Lucy','John','1995-01-01','1995-01-31'),
('Ronaldo','Kim','1995-01-01','1995-01-31'),
('Ronaldo','Kim','1998-01-01','1998-07-24'),
('Kim','Ronaldo','1995-01-01','1995-01-31'),
('Kim','Ronaldo','1998-01-01','1998-07-24');
insert into t (Name, Name_Partner, Startdate_relation) values
('John','Wilma','1990-01-01' ),
('Wilma','John','1990-01-01' ),
('Lucy','Ronaldo','1995-02-01' ),
('Ronaldo','Lucy','1995-02-01' ),
('Kim','Angelina','1995-02-01' ),
('Angelina','Kim','1995-02-01' );
select
name,
name_partner,
case when name < name_partner then concat(name,'-',name_partner)
else concat(name_partner,'-',name) end unique_household,
Startdate_relation,
Enddate_relation
from t
order by 3;
GO
name | name_partner | unique_household | Startdate_relation | Enddate_relation
:------- | :----------- | :--------------- | :----------------- | :---------------
Kim | Angelina | Angelina-Kim | 1995-02-01 | null
Angelina | Kim | Angelina-Kim | 1995-02-01 | null
John | Lucy | John-Lucy | 1995-01-01 | 1995-01-31
Lucy | John | John-Lucy | 1995-01-01 | 1995-01-31
John | Wilma | John-Wilma | 1990-01-01 | null
Wilma | John | John-Wilma | 1990-01-01 | null
Ronaldo | Kim | Kim-Ronaldo | 1995-01-01 | 1995-01-31
Ronaldo | Kim | Kim-Ronaldo | 1998-01-01 | 1998-07-24
Kim | Ronaldo | Kim-Ronaldo | 1995-01-01 | 1995-01-31
Kim | Ronaldo | Kim-Ronaldo | 1998-01-01 | 1998-07-24
Lucy | Ronaldo | Lucy-Ronaldo | 1995-02-01 | null
Ronaldo | Lucy | Lucy-Ronaldo | 1995-02-01 | null
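The expected result in the question shows a single name in UniqueHousehold rather than a combined key. A minimal sketch of that variant follows, assuming "first person" can be read as the alphabetically smaller of the two names (an assumption, although it does agree with every row of the sample data). The inline `VALUES` rows are a subset of the sample data and the alias `r` is only for illustration.

```sql
-- Sketch: label each relationship with the alphabetically smaller name.
-- Assumption: "first person" means the smaller of Name and Name_Partner
-- under the column's collation.
SELECT
    r.Name,
    r.Name_Partner,
    CASE WHEN r.Name < r.Name_Partner THEN r.Name
         ELSE r.Name_Partner
    END AS UniqueHousehold,
    r.Startdate_relation,
    r.Enddate_relation
FROM (VALUES
    -- The casts on the first row fix the column types for the rows below.
    ('John',     'Wilma',    CAST('1990-01-01' AS date), CAST(NULL AS date)),
    ('Wilma',    'John',     '1990-01-01', NULL),
    ('John',     'Lucy',     '1995-01-01', '1995-01-31'),
    ('Lucy',     'John',     '1995-01-01', '1995-01-31'),
    ('Ronaldo',  'Kim',      '1995-01-01', '1995-01-31'),
    ('Kim',      'Ronaldo',  '1995-01-01', '1995-01-31'),
    ('Kim',      'Angelina', '1995-02-01', NULL),
    ('Angelina', 'Kim',      '1995-02-01', NULL)
) AS r (Name, Name_Partner, Startdate_relation, Enddate_relation)
ORDER BY UniqueHousehold, r.Startdate_relation;
```

Note that a single name does not identify a couple on its own: John-Wilma and John-Lucy both come out as John. The combined key above is the safer choice when households need to be counted or grouped.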