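How do I fill null columns with values from the previous row?
I'm trying to do something here. I want every column to be filled with a value, but where a column is null, I want it filled with the last non-null value of that column from the preceding rows.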
with cte as (
select '2019-11-12 16:01:55' as timestamp, null as owner_id, null as owner_assigneddate, null as lastmodifieddate union all
select '2019-11-12 19:03:18' as timestamp, 39530934 as owner_id, '2019-11-12 19:03:18' as owner_assigneddate, '2019-11-12 19:03:18' as lastmodifieddate union all
select '2019-11-12 19:03:19' as timestamp, null as owner_id, null as owner_assigneddate, '2019-11-12 19:03:19' as lastmodifieddate union all
select '2019-11-12 19:03:20' as timestamp, null as owner_id, null as owner_assigneddate, '2019-11-12 19:03:20' as lastmodifieddate union all
select '2019-11-12 19:03:31' as timestamp, 40320368 as owner_id, '2019-11-12 19:03:31' as owner_assigneddate, '2019-11-12 19:03:31' as lastmodifieddate union all
select '2019-11-12 19:03:33' as timestamp, null as owner_id, null as owner_assigneddate, '2019-11-12 19:03:33' as lastmodifieddate union all
select '2019-11-12 19:03:56' as timestamp, null as owner_id, null as owner_assigneddate, '2019-11-12 19:03:356' as lastmodifieddate)
select timestamp,
owner_id,
owner_assigneddate,
lastmodifieddate,
COALESCE(owner_id, LEAD(owner_id) OVER(ORDER BY timestamp DESC)) AS test_column
from cte order by timestamp asc
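With the query above, I have only managed to carry the value into the very next row.
What I want is to fill every null with the value from the preceding rows.
The value on row 4 should be 39530934, and the value on row 7 should be 40320368.
I think I'm missing something here, but I don't know what.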
This should work with your cte definition:
...
select timestamp,
owner_id,
owner_assigneddate,
lastmodifieddate,
LAST_VALUE(owner_id IGNORE NULLS)
OVER(ORDER BY timestamp ASC ROWS BETWEEN
UNBOUNDED PRECEDING AND CURRENT ROW) AS test_column
from cte order by timestamp asc
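To make the expected result concrete, here is a minimal plain-Python sketch of the forward-fill behaviour that LAST_VALUE(owner_id IGNORE NULLS) over a frame ending at the current row is meant to produce. The function name forward_fill is hypothetical and this is not BigQuery code; it only mirrors the semantics on the owner_id values from the question, listed in timestamp order.

```python
def forward_fill(values):
    """Replace each None with the most recent non-None value seen so far."""
    filled = []
    last_seen = None  # stays None until the first non-null value appears
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# owner_id column from the question, already sorted by timestamp
owner_ids = [None, 39530934, None, None, 40320368, None, None]

if __name__ == "__main__":
    for original, filled in zip(owner_ids, forward_fill(owner_ids)):
        print(original, "->", filled)
```

The first row stays null because there is no earlier non-null value to carry forward, which matches what the window function returns for it.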
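As far as I know, BigQuery does not support ignore nulls in window functions, although the other answers here use it with LAST_VALUE. Here is a solution that does not need it: it relies on a window max to locate the record that holds the last non-null owner_id (this assumes that timestamps are unique). With that information, you can bring in the corresponding owner_id with a join.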
select
    c.timestamp,
    coalesce(c.owner_id, c_lag.owner_id) owner_id,
    c.owner_assigneddate,
    c.lastmodifieddate
from
    (
        select
            cte.*,
            max(case when owner_id is not null then timestamp end)
                over(order by timestamp rows unbounded preceding) target_timestamp
        from cte
    ) c
    left join cte c_lag
        on c.owner_id is null
        and c_lag.timestamp = c.target_timestamp
timestamp | owner_id | owner_assigneddate | lastmodifieddate
:------------------ | -------: | :------------------ | :-------------------
2019-11-12 16:01:55 | null | null | null
2019-11-12 19:03:18 | 39530934 | 2019-11-12 19:03:18 | 2019-11-12 19:03:18
2019-11-12 19:03:19 | 39530934 | null | 2019-11-12 19:03:19
2019-11-12 19:03:20 | 39530934 | null | 2019-11-12 19:03:20
2019-11-12 19:03:31 | 40320368 | 2019-11-12 19:03:31 | 2019-11-12 19:03:31
2019-11-12 19:03:33 | 40320368 | null | 2019-11-12 19:03:33
2019-11-12 19:03:56 | 40320368 | null | 2019-11-12 19:03:356
Note: if it helps you understand the logic, you can run the inner query on its own to see what it returns (see the DB fiddle).
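For readers without the fiddle at hand, the following sketch runs the inner query on its own and returns target_timestamp for each row. It is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than BigQuery (window functions need SQLite 3.25 or later), it loads just the timestamp and owner_id columns into a real table named cte, and the helper name inner_query_result is hypothetical.

```python
import sqlite3

# (timestamp, owner_id) pairs from the question; other columns are omitted
ROWS = [
    ("2019-11-12 16:01:55", None),
    ("2019-11-12 19:03:18", 39530934),
    ("2019-11-12 19:03:19", None),
    ("2019-11-12 19:03:20", None),
    ("2019-11-12 19:03:31", 40320368),
    ("2019-11-12 19:03:33", None),
    ("2019-11-12 19:03:56", None),
]

# The inner query from the answer: a running max over the timestamps
# of the rows whose owner_id is not null.
INNER_QUERY = """
select
    timestamp,
    owner_id,
    max(case when owner_id is not null then timestamp end)
        over (order by timestamp rows unbounded preceding) as target_timestamp
from cte
order by timestamp
"""


def inner_query_result():
    """Run the inner query against an in-memory SQLite table named cte."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table cte (timestamp text, owner_id integer)")
    conn.executemany("insert into cte values (?, ?)", ROWS)
    result = conn.execute(INNER_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in inner_query_result():
        print(row)
```

Each row ends up with the timestamp of the latest row at or before it that has a non-null owner_id (or null when there is none yet), which is exactly the key the outer left join uses to look up the owner_id.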
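Edit
Rereading this, I see that the information provided by the window max is already available in your original data, in the column owner_assigneddate, so this can be much simpler. Note that it relies on owner_assigneddate being populated on the rows where owner_id is null; on rows where it is null too, the join finds no match.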
select
    c.timestamp,
    coalesce(c.owner_id, c_lag.owner_id) owner_id,
    c.owner_assigneddate,
    c.lastmodifieddate
from
    cte c
    left join cte c_lag
        on c.owner_id is null
        and c_lag.timestamp = c.owner_assigneddate
In BigQuery, use LAST_VALUE() with the IGNORE NULLS option, together with COALESCE():
select timestamp,
COALESCE(owner_id, last_value(owner_id ignore nulls) over (order by timestamp)) as owner_id,
COALESCE(owner_assigneddate, LAST_VALUE(owner_assigneddate IGNORE NULLS) OVER (ORDER BY TIMESTAMP)) as owner_assigneddate,
COALESCE(lastmodifieddate, LAST_VALUE(lastmodifieddate IGNORE NULLS) OVER (ORDER BY TIMESTAMP)) as lastmodifieddate
from cte order by timestamp asc