How to concatenate some elements in SQL
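I have a problem in SQL.
I have written a lot of queries, and now I want to concatenate my first row and my last row on certain specific elements.
Here is a perfect example: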
id  enter_date  exit_date   money
1   02/02/2020  28/02/2020  200$
1   28/02/2020  28/02/2020  220$
1   28/02/2020  04/05/2020  250$
2   12/08/2020  17/12/2020  500$
2   17/12/2020  .           700$
My goal is:
id  enter_date  exit_date   money
1   02/02/2020  04/05/2020  250$
2   12/08/2020  .           700$
As you can see, I take enter_date from the first row and all the other elements from the last row. I want to concatenate my first row and my last row.
You can use window functions, if they are available in your RDBMS.
SELECT DISTINCT
id
,FIRST_VALUE(enter_date) OVER (PARTITION BY id ORDER BY enter_date, exit_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS enter_date
,LAST_VALUE(exit_date) OVER (PARTITION BY id ORDER BY enter_date, exit_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS exit_date
,LAST_VALUE(money) OVER (PARTITION BY id ORDER BY enter_date, exit_date ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS money
FROM YourTable
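To try the window-function query on the sample data, here is a minimal sketch that runs it in an in-memory SQLite database from Python. Several things here are assumptions, not part of the original post: the dates are rewritten as ISO strings so that text ordering matches date ordering, the missing exit date is stored as NULL, money is stored as a plain integer, and the helper name first_and_last_per_id is made up for illustration. It needs SQLite 3.25 or later, the release that added window functions.

```python
import sqlite3

# Sample rows from the question. Assumptions: ISO date strings (so text
# ordering equals date ordering), NULL for the missing exit date, and
# money as a plain integer without the "$" sign.
SAMPLE_ROWS = [
    (1, "2020-02-02", "2020-02-28", 200),
    (1, "2020-02-28", "2020-02-28", 220),
    (1, "2020-02-28", "2020-05-04", 250),
    (2, "2020-08-12", "2020-12-17", 500),
    (2, "2020-12-17", None, 700),
]

# Same window definition as in the answer, shared by all three columns.
WINDOW_SPEC = (
    "PARTITION BY id ORDER BY enter_date, exit_date "
    "ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING"
)

QUERY = f"""
SELECT DISTINCT
     id
    ,FIRST_VALUE(enter_date) OVER ({WINDOW_SPEC}) AS enter_date
    ,LAST_VALUE(exit_date)   OVER ({WINDOW_SPEC}) AS exit_date
    ,LAST_VALUE(money)       OVER ({WINDOW_SPEC}) AS money
FROM YourTable
ORDER BY id
"""


def first_and_last_per_id(rows):
    """Load rows into an in-memory table and run the window query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE YourTable "
            "(id INTEGER, enter_date TEXT, exit_date TEXT, money INTEGER)"
        )
        conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in first_and_last_per_id(SAMPLE_ROWS):
        print(row)
```

The explicit ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING frame matters for LAST_VALUE: with the default frame, which ends at the current row, it would only return the current row's value (or that of a later row tied on the ORDER BY columns).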
As long as you have no duplicate (id, exit_date) pairs, you can also use:
select AGG_TABLE.id
,AGG_TABLE.min_enter_date as enter_date
,AGG_TABLE.max_exit_date as exit_date
,SOURCE_TABLE.money
FROM (
SELECT id
,MIN(enter_date) as min_enter_date
,MAX(exit_date) as max_exit_date
FROM YourTable
GROUP BY id ) as AGG_TABLE
INNER JOIN YourTable as SOURCE_TABLE
ON SOURCE_TABLE.id=AGG_TABLE.id
AND AGG_TABLE.max_exit_date=SOURCE_TABLE.exit_date
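The aggregate-and-join variant can be checked the same way. This is a rough sketch under the same assumptions as before, none of which come from the original post: an in-memory SQLite database, ISO date strings, NULL for the missing exit date, integer money, and a made-up helper name min_max_join. One thing it suggests: MAX skips NULL values, so for id 2 the open-ended last row is not the one picked up, and the result differs from the goal stated in the question.

```python
import sqlite3

# Sample rows from the question. Assumptions: ISO date strings, NULL for
# the missing exit date, money as a plain integer.
SAMPLE_ROWS = [
    (1, "2020-02-02", "2020-02-28", 200),
    (1, "2020-02-28", "2020-02-28", 220),
    (1, "2020-02-28", "2020-05-04", 250),
    (2, "2020-08-12", "2020-12-17", 500),
    (2, "2020-12-17", None, 700),
]

# The query from the answer, with an ORDER BY added for stable output.
QUERY = """
SELECT AGG_TABLE.id
      ,AGG_TABLE.min_enter_date AS enter_date
      ,AGG_TABLE.max_exit_date  AS exit_date
      ,SOURCE_TABLE.money
FROM (
    SELECT id
          ,MIN(enter_date) AS min_enter_date
          ,MAX(exit_date)  AS max_exit_date
    FROM YourTable
    GROUP BY id ) AS AGG_TABLE
INNER JOIN YourTable AS SOURCE_TABLE
    ON SOURCE_TABLE.id = AGG_TABLE.id
   AND AGG_TABLE.max_exit_date = SOURCE_TABLE.exit_date
ORDER BY AGG_TABLE.id
"""


def min_max_join(rows):
    """Load rows into an in-memory table and run the aggregate + join query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE YourTable "
            "(id INTEGER, enter_date TEXT, exit_date TEXT, money INTEGER)"
        )
        conn.executemany("INSERT INTO YourTable VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in min_max_join(SAMPLE_ROWS):
        print(row)
```

For id 1 this gives the wanted row. For id 2 it returns the row with exit date 2020-12-17 and money 500, so this form only fits data where the last row always has a non-missing exit date.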
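Since this is being done in SAS, a data step will be more efficient than SQL if the processing happens on the SAS server. You can do this with by-group processing, the retain statement, and first. and last. logic.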
data want;
    set have;
    /* BY-group processing expects have to be sorted by id and enter_date */
    by id enter_date;
    /* Keep this value across data step iterations instead of resetting it */
    retain first_enter_date;
    /* On the first row of each id group, store the enter date */
    if first.id then first_enter_date = enter_date;
    /* On the last row of each id group, set enter_date to the
       stored enter date and output only that row */
    if last.id then do;
        enter_date = first_enter_date;
        output;
    end;
    drop first_enter_date;
run;
Output:
id  enter_date  exit_date   money
1   02/02/2020  04/05/2020  250$
2   12/08/2020  .           700$
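For readers without SAS at hand, the first./last. logic of the data step can be mimicked in a few lines of Python. This is only an illustrative sketch, not a translation the original answer provides: the function name collapse_first_last, the tuple layout, the ISO date strings and the use of None for the missing exit date are all assumptions.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question as (id, enter_date, exit_date, money).
# Assumptions: ISO date strings, None for the missing exit date.
SAMPLE_ROWS = [
    (1, "2020-02-02", "2020-02-28", 200),
    (1, "2020-02-28", "2020-02-28", 220),
    (1, "2020-02-28", "2020-05-04", 250),
    (2, "2020-08-12", "2020-12-17", 500),
    (2, "2020-12-17", None, 700),
]


def collapse_first_last(rows):
    """Keep one row per id: enter_date of the first row, the rest from the last."""
    # Mirror "by id enter_date": sort by id, then enter_date. Python's sort
    # is stable, so rows tied on both keys keep their input order.
    ordered = sorted(rows, key=itemgetter(0, 1))
    result = []
    for row_id, group in groupby(ordered, key=itemgetter(0)):
        group = list(group)
        first_enter_date = group[0][1]      # role of the retained variable (first.id)
        _, _, exit_date, money = group[-1]  # values of the last row (last.id)
        result.append((row_id, first_enter_date, exit_date, money))
    return result


if __name__ == "__main__":
    for row in collapse_first_last(SAMPLE_ROWS):
        print(row)
```

As in the data step, the result depends on the row order within each id, so rows that share the same id and enter_date should already be in the intended order before the call.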