How to transform a range of records to the values of the record after that range in SQL?
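I am trying to replace some incorrectly entered records within a specific date range with the correct ones. However, I am not sure whether there is an efficient way to do this. So my question is: how can I transform a (static) range of records to the values of the record that comes after that range in SQL? The example below clarifies what I am trying to achieve.
In this example, customer number 1 belongs to group number 0 (None) from 25-06-2020 to 29-06-2020. From 30-06-2020 to 05-07-2020 the group number for customer number 1 changes from 0 to 11. This static period contains incorrect records and should be changed to the values that are valid on 06-07-2020 (group number == 10). Is there a way to do this?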
I think the window function first_value() does what you want:
    select
        date,
        customer_number,
        -- first_value() takes the value from the earliest-dated row of each customer
        first_value(group_number) over (partition by customer_number order by date) as group_number,
        first_value(role) over (partition by customer_number order by date) as role
    from mytable
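To see what this query returns, it can be run against a small sample. The sketch below is only an illustration: it assumes an in-memory SQLite database driven from Python's sqlite3 module (window functions need SQLite 3.25 or newer), and it reuses the eight rows shown in the result table of the last answer. With an ascending `order by date`, first_value() picks each customer's earliest row, so on this sample every row comes back with group 0 and role 'None'.

```python
import sqlite3

# Sample rows taken from the result table in the last answer (a subset of the question's data).
ROWS = [
    ("2020-06-25", 1, 0, "None"),
    ("2020-06-26", 1, 0, "None"),
    ("2020-06-27", 1, 0, "None"),
    ("2020-06-28", 1, 0, "None"),
    ("2020-06-29", 1, 0, "None"),
    ("2020-06-30", 1, 11, "Leader"),
    ("2020-07-01", 1, 11, "Leader"),
    ("2020-07-06", 1, 10, "Member"),
]

# Assumption: SQLite stands in for the real database; dates are stored as ISO text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table mytable (date text, customer_number integer, "
    "group_number integer, role text)"
)
conn.executemany("insert into mytable values (?, ?, ?, ?)", ROWS)

query = """
select date,
       customer_number,
       first_value(group_number) over (partition by customer_number order by date) as group_number,
       first_value(role) over (partition by customer_number order by date) as role
from mytable
order by date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Whether taking the earliest value is the right fix depends on the data; in the question, the wanted value is the one that follows the faulty range.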
If I understand correctly, you can use window functions to fetch the data for the specific date and case logic to assign it to the specific date range:
    select t.*,
           -- faulty range from the question: 2020-06-30 through 2020-07-05
           (case when date >= '2020-06-30' and date <= '2020-07-05'
                 then max(case when date = '2020-07-06' then group_number end) over (partition by customer_number)
                 else group_number
            end) as imputed_group_number,
           (case when date >= '2020-06-30' and date <= '2020-07-05'
                 then max(case when date = '2020-07-06' then role end) over (partition by customer_number)
                 else role
            end) as imputed_role
    from t;
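The case-plus-window approach can be tried out on a small sample. This is a minimal sketch under a few assumptions: an in-memory SQLite database accessed through Python's sqlite3 module (SQLite 3.25 or newer for window functions), the eight rows from the result table in the last answer, and a faulty range of 2020-06-30 through 2020-07-05 as described in the question.

```python
import sqlite3

# Sample rows taken from the result table in the last answer (a subset of the question's data).
ROWS = [
    ("2020-06-25", 1, 0, "None"),
    ("2020-06-26", 1, 0, "None"),
    ("2020-06-27", 1, 0, "None"),
    ("2020-06-28", 1, 0, "None"),
    ("2020-06-29", 1, 0, "None"),
    ("2020-06-30", 1, 11, "Leader"),
    ("2020-07-01", 1, 11, "Leader"),
    ("2020-07-06", 1, 10, "Member"),
]

# Assumption: SQLite stands in for the real database; dates are stored as ISO text,
# so string comparison orders them correctly.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (date text, customer_number integer, "
    "group_number integer, role text)"
)
conn.executemany("insert into t values (?, ?, ?, ?)", ROWS)

query = """
select t.*,
       case when date >= '2020-06-30' and date <= '2020-07-05'
            then max(case when date = '2020-07-06' then group_number end)
                 over (partition by customer_number)
            else group_number
       end as imputed_group_number,
       case when date >= '2020-06-30' and date <= '2020-07-05'
            then max(case when date = '2020-07-06' then role end)
                 over (partition by customer_number)
            else role
       end as imputed_role
from t
order by date
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The inner case expression is null for every row except the one dated 2020-07-06, so max() over the customer's partition simply carries that one row's value to all rows of the same customer.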
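If you want to update the values, you can use a JOIN:
    update t
        set group_number = tt.group_number,
            role = tt.role
        from t tt  -- self-join: tt supplies the row that is valid on 2020-07-06
        where tt.customer_number = t.customer_number
          and tt.date = '2020-07-06'
          and t.date >= '2020-06-30' and t.date <= '2020-07-05'  -- only overwrite the faulty range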
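Not every engine accepts a FROM clause in UPDATE. As an alternative (a different technique from the join above, not the answerer's statement), the same update can be sketched with correlated subqueries. The example assumes an in-memory SQLite database via Python's sqlite3 module, the eight rows from the result table in the last answer, and a faulty range of 2020-06-30 through 2020-07-05 as described in the question.

```python
import sqlite3

# Sample rows taken from the result table in the last answer (a subset of the question's data).
ROWS = [
    ("2020-06-25", 1, 0, "None"),
    ("2020-06-26", 1, 0, "None"),
    ("2020-06-27", 1, 0, "None"),
    ("2020-06-28", 1, 0, "None"),
    ("2020-06-29", 1, 0, "None"),
    ("2020-06-30", 1, 11, "Leader"),
    ("2020-07-01", 1, 11, "Leader"),
    ("2020-07-06", 1, 10, "Member"),
]

# Assumption: SQLite stands in for the real database; dates are stored as ISO text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (date text, customer_number integer, "
    "group_number integer, role text)"
)
conn.executemany("insert into t values (?, ?, ?, ?)", ROWS)

# Overwrite only the faulty range, and only for customers that actually
# have a row on 2020-07-06 (otherwise the subqueries would yield NULL).
conn.execute("""
update t
set group_number = (select tt.group_number from t tt
                    where tt.customer_number = t.customer_number
                      and tt.date = '2020-07-06'),
    role = (select tt.role from t tt
            where tt.customer_number = t.customer_number
              and tt.date = '2020-07-06')
where date >= '2020-06-30' and date <= '2020-07-05'
  and exists (select 1 from t tt
              where tt.customer_number = t.customer_number
                and tt.date = '2020-07-06')
""")
conn.commit()

rows = conn.execute(
    "select date, group_number, role from t order by date"
).fetchall()
for row in rows:
    print(row)
```

The exists() guard is a design choice for safety: without it, a customer with no row on the reference date would have the faulty range set to NULL instead of being left untouched.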
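You can follow the example below. Here I chose the following condition: a row with role = 'Leader' is a bad record, so the next available group_number and role are applied to it; the results are shown in the group_number1 and role1 columns.
I used a smaller subset of the rows from your Excel example.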
    select date1
          ,customer_number
          ,group_number
          ,case when role='Leader' then
                -- next good (non-Leader) row of the same customer after this date
                (select t1.group_number
                   from t t1
                  where t1.customer_number=t.customer_number
                    and t1.date1>t.date1
                    and t1.role<>'Leader'
                  order by t1.date1 asc
                  limit 1
                )
                else group_number
           end as group_number1
          ,role
          ,case when role='Leader' then
                (select t1.role
                   from t t1
                  where t1.customer_number=t.customer_number
                    and t1.date1>t.date1
                    and t1.role<>'Leader'
                  order by t1.date1 asc
                  limit 1
                )
                else role
           end as role1
      from t
    order by 1
+------------+-----------------+--------------+---------------+--------+--------+
| DATE1 | CUSTOMER_NUMBER | GROUP_NUMBER | GROUP_NUMBER1 | ROLE | ROLE1 |
+------------+-----------------+--------------+---------------+--------+--------+
| 2020-06-25 | 1 | 0 | 0 | None | None |
| 2020-06-26 | 1 | 0 | 0 | None | None |
| 2020-06-27 | 1 | 0 | 0 | None | None |
| 2020-06-28 | 1 | 0 | 0 | None | None |
| 2020-06-29 | 1 | 0 | 0 | None | None |
| 2020-06-30 | 1 | 11 | 10 | Leader | Member |
| 2020-07-01 | 1 | 11 | 10 | Leader | Member |
| 2020-07-06 | 1 | 10 | 10 | Member | Member |
+------------+-----------------+--------------+---------------+--------+--------+
db fiddle link
https://dbfiddle.uk/?rdbms=db2_11.1&fiddle=c95d12ced067c1df94947848b5a94c14