SQLite: merge rows in the same table, using an older record to fill blanks in a newer record
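I have a SQLite table that stores records identified by a composite key.
New records are added for each id on different dates, but a new record often does not fill in every column. The incomplete columns may be null or blank, for example: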
row | id | rdate | data1 | data2 |
---|---|---|---|---|
0 | 1 | 01/01/2009 | foo | boo |
1 | 1 | 04/01/2010 | foo1 | bar1 |
2 | 1 | 08/01/2010 | fooX | <null> |
3 | 2 | 01/01/2010 | foo2 | bar2 |
4 | 2 | 04/01/2010 | | |
5 | 3 | 01/01/2010 | foo3 | bar3 |
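I want to periodically update the records that share an id so that blank columns in the most recent record (by rdate) are filled with data from the previous record. In the example above, data from row 1 is used to fill the blank column in row 2.
After running the query, the table would look like this: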
row | id | rdate | data1 | data2 |
---|---|---|---|---|
0 | 1 | 01/01/2009 | foo | boo |
1 | 1 | 04/01/2010 | foo1 | bar1 |
2 | 1 | 08/01/2010 | fooX | bar1 |
3 | 2 | 01/01/2010 | foo2 | bar2 |
4 | 2 | 04/01/2010 | foo2 | bar2 |
5 | 3 | 01/01/2010 | foo3 | bar3 |
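I have tried to build a query that does this, but I am struggling to work out how, or even whether it can be done.
I have looked at questions about merging records, but they approach the problem from a deduplication angle, and I could not find anything that fits my case. COALESCE looks promising, but I have not been able to figure out how to build a query that uses it.
Any help or advice would be much appreciated.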
You can use update with two correlated subqueries:
update t
    set data1 = coalesce(data1,
                         (select t2.data1
                          from t t2
                          where t2.id = t.id and
                                t2.rdate < t.rdate and
                                t2.data1 is not null
                          order by t2.rdate desc  -- most recent earlier value
                          limit 1
                         )
                        ),
        data2 = coalesce(data2,  -- keep data2 if it is already set
                         (select t2.data2
                          from t t2
                          where t2.id = t.id and
                                t2.rdate < t.rdate and
                                t2.data2 is not null
                          order by t2.rdate desc
                          limit 1
                         )
                        )
    where data1 is null or data2 is null;
To also handle blank values (empty or whitespace-only strings), you have to change several small things in the query:
UPDATE tablename AS t1
    SET data1 = (CASE WHEN TRIM(t1.data1) <> ''
                      THEN t1.data1
                      ELSE (SELECT t2.data1
                            FROM tablename t2
                            WHERE t2.id = t1.id AND t2.rdate < t1.rdate AND TRIM(t2.data1) <> ''
                            ORDER BY t2.rdate DESC
                            LIMIT 1
                           )
                 END),
        data2 = (CASE WHEN TRIM(t1.data2) <> ''
                      THEN t1.data2
                      ELSE (SELECT t2.data2
                            FROM tablename t2
                            WHERE t2.id = t1.id AND t2.rdate < t1.rdate AND TRIM(t2.data2) <> ''
                            ORDER BY t2.rdate DESC
                            LIMIT 1
                           )
                 END)
    WHERE data1 IS NULL OR data2 IS NULL OR TRIM(data1) = '' OR TRIM(data2) = '';
For each of the columns data1 and data2, use a correlated subquery that returns the last non-empty value in that column:
UPDATE tablename AS t1
SET data1 = COALESCE(
        NULLIF(TRIM(t1.data1), ''),
        (SELECT t2.data1 FROM tablename t2
         WHERE t2.id = t1.id AND t2.rdate < t1.rdate AND NULLIF(TRIM(t2.data1), '') IS NOT NULL
         ORDER BY t2.rdate DESC LIMIT 1)
    ),
    data2 = COALESCE(
        NULLIF(TRIM(t1.data2), ''),
        (SELECT t2.data2 FROM tablename t2
         WHERE t2.id = t1.id AND t2.rdate < t1.rdate AND NULLIF(TRIM(t2.data2), '') IS NOT NULL
         ORDER BY t2.rdate DESC LIMIT 1)
    )
WHERE NULLIF(TRIM(t1.data1), '') IS NULL OR NULLIF(TRIM(t1.data2), '') IS NULL
See the demo.
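The expression NULLIF(TRIM(x), '') does the blank detection in the query above. As a minimal sketch, assuming Python's standard sqlite3 module and an in-memory database (the sample values are made up for illustration), this is how it treats null, empty, whitespace-only, and padded values:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sample values: NULL, empty string, whitespace only, padded text.
samples = [None, "", "   ", " foo "]

# Evaluate NULLIF(TRIM(?), '') for each value; blanks collapse to NULL (None).
normalized = [
    conn.execute("SELECT NULLIF(TRIM(?), '')", (value,)).fetchone()[0]
    for value in samples
]

for value, result in zip(samples, normalized):
    print(repr(value), "->", repr(result))
```

One consequence worth noting: because the trimmed value is the first argument to COALESCE, a non-blank value that is kept in an updated row is stored without its surrounding spaces.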
However, it is better to first update the table so that every empty value is replaced with null:
UPDATE tablename
SET data1 = NULLIF(TRIM(data1), ''),
    data2 = NULLIF(TRIM(data2), '')
WHERE TRIM(data1) = '' OR TRIM(data2) = ''
The code can then be simplified to:
UPDATE tablename AS t1
SET data1 = COALESCE(
        t1.data1,
        (SELECT t2.data1 FROM tablename t2
         WHERE t2.id = t1.id AND t2.rdate < t1.rdate AND t2.data1 IS NOT NULL
         ORDER BY t2.rdate DESC LIMIT 1)
    ),
    data2 = COALESCE(
        t1.data2,
        (SELECT t2.data2 FROM tablename t2
         WHERE t2.id = t1.id AND t2.rdate < t1.rdate AND t2.data2 IS NOT NULL
         ORDER BY t2.rdate DESC LIMIT 1)
    )
WHERE data1 IS NULL OR data2 IS NULL
See the demo.
Result:
id | rdate | data1 | data2 |
---|---|---|---|
1 | 2009-01-01 | foo | boo |
1 | 2010-01-04 | foo1 | bar1 |
1 | 2010-01-08 | fooX | bar1 |
2 | 2010-01-01 | foo2 | bar2 |
2 | 2010-01-04 | foo2 | bar2 |
3 | 2010-01-01 | foo3 | bar3 |
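The two-step approach (normalize blanks to null, then fill from the previous record) can be tried end to end with the following sketch. It assumes Python's standard sqlite3 module, an in-memory database, and a primary key of (id, rdate), which is a guess at the composite key in the question. The blank row uses an empty string and a whitespace-only string as stand-ins for the blanks in the sample data. The outer table is referenced by name rather than through an alias, which is equivalent here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed schema: the (id, rdate) primary key is a guess, not taken from the question.
conn.execute("""
    CREATE TABLE tablename (
        id    INTEGER NOT NULL,
        rdate TEXT    NOT NULL,
        data1 TEXT,
        data2 TEXT,
        PRIMARY KEY (id, rdate)
    )
""")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?)",
    [
        (1, "2009-01-01", "foo", "boo"),
        (1, "2010-01-04", "foo1", "bar1"),
        (1, "2010-01-08", "fooX", None),
        (2, "2010-01-01", "foo2", "bar2"),
        (2, "2010-01-04", "", "  "),  # blank values standing in for the empty cells
        (3, "2010-01-01", "foo3", "bar3"),
    ],
)

# Step 1: turn empty or whitespace-only values into NULL.
conn.execute("""
    UPDATE tablename
    SET data1 = NULLIF(TRIM(data1), ''),
        data2 = NULLIF(TRIM(data2), '')
    WHERE TRIM(data1) = '' OR TRIM(data2) = ''
""")

# Step 2: fill each NULL from the most recent earlier row with the same id.
conn.execute("""
    UPDATE tablename
    SET data1 = COALESCE(
            data1,
            (SELECT t2.data1 FROM tablename t2
             WHERE t2.id = tablename.id AND t2.rdate < tablename.rdate
               AND t2.data1 IS NOT NULL
             ORDER BY t2.rdate DESC LIMIT 1)
        ),
        data2 = COALESCE(
            data2,
            (SELECT t2.data2 FROM tablename t2
             WHERE t2.id = tablename.id AND t2.rdate < tablename.rdate
               AND t2.data2 IS NOT NULL
             ORDER BY t2.rdate DESC LIMIT 1)
        )
    WHERE data1 IS NULL OR data2 IS NULL
""")
conn.commit()

rows = conn.execute(
    "SELECT id, rdate, data1, data2 FROM tablename ORDER BY id, rdate"
).fetchall()
for row in rows:
    print(row)
```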
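Note that the dates in your sample data are not comparable as strings, so the condition t2.rdate < t1.rdate will not order them correctly.
Change them to the format 'YYYY-MM-DD'.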
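As a rough sketch of that conversion, assuming the existing values are text in DD/MM/YYYY form (an assumption that matches the result table, where 04/01/2010 becomes 2010-01-04), the string can be rearranged with substr. The two sample rows are made up to show why the original format sorts incorrectly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER, rdate TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [(1, "08/01/2010"), (1, "01/01/2011")],  # hypothetical DD/MM/YYYY values
)

# Compared as text, the 2011 date sorts before the 2010 date.
before = [r[0] for r in conn.execute("SELECT rdate FROM tablename ORDER BY rdate")]

# Rearrange DD/MM/YYYY into YYYY-MM-DD; only rows matching the old shape are touched.
conn.execute("""
    UPDATE tablename
    SET rdate = substr(rdate, 7, 4) || '-' || substr(rdate, 4, 2) || '-' || substr(rdate, 1, 2)
    WHERE rdate LIKE '__/__/____'
""")
conn.commit()

# After the conversion, text order matches chronological order.
after = [r[0] for r in conn.execute("SELECT rdate FROM tablename ORDER BY rdate")]

print(before)
print(after)
```

If the stored values were actually MM/DD/YYYY, the second and third substr calls would need to be swapped.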