How to find first non-null record for every column and group by id

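I have a table that stores events, and I want to create a view of the latest/current state for each Id.
Each row of the view should hold, for every column, the non-null value with the highest sequence number for that Id.
The sequence number is carried by the event. My SQL skills are a bit rusty because I have spent most of my time working with Cassandra.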

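I spent a whole day trying to figure out how to do this and tried many things, such as COALESCE, FIRST_VALUE, and various sub-SELECT queries. I think posting my failed attempts here would only cause confusion.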

This is the table containing the events:

|----|------|------|----------|
| Id | A    | B    | Sequence |
|----|------|------|----------|
| 1  | a0   | b0   | 0        |
|----|------|------|----------|
| 2  | a0   | b6   | 0        |
|----|------|------|----------|
| 1  | a1   | NULL | 1        |
|----|------|------|----------|
| 2  | a1   | NULL | 1        |
|----|------|------|----------|
| 2  | NULL | b2   | 2        |
|----|------|------|----------|
| 2  | a3   | b3   | 3        |
|----|------|------|----------|
| 2  | NULL | b4   | 4        |
|----|------|------|----------|

...and the view I want to achieve:

|----|----|----|----------|
| Id | A  | B  | Sequence |
|----|----|----|----------|
| 1  | a1 | b0 | 1        |
|----|----|----|----------|
| 2  | a3 | b4 | 4        |
|----|----|----|----------|

A simple solution is to use good old-fashioned correlated subqueries.

First, create and populate a sample table (please save us this step in your future questions):

DECLARE @T AS TABLE
(
    Id int,
    A char(2),
    B char(2),
    Sequence int
)

INSERT INTO @T (Id, A, B, Sequence) VALUES
(1, 'a0', 'b0', 0),
(2, 'a0', 'b6', 0),
(1, 'a1', NULL, 1),
(2, 'a1', NULL, 1),
(2, NULL, 'b2', 2),
(2, 'a3', 'b3', 3),
(2, NULL, 'b4', 4);

The query:

SELECT  Id, 
        (
            -- get the last non-null A value for the specified Id
            SELECT TOP 1 A 
            FROM @T As T1
            WHERE T1.Id = T0.Id
            AND A IS NOT NULL
            ORDER BY Sequence DESC
        ) As A,
        (
            -- get the last non-null B value for the specified Id
            SELECT TOP 1 B 
            FROM @T As T1
            WHERE T1.Id = T0.Id
            AND B IS NOT NULL
            ORDER BY Sequence DESC
        ) As B,
        MAX(Sequence) As Sequence
FROM @T As T0
GROUP BY Id

Result:

Id  A   B   Sequence
1   a1  b0  1
2   a3  b4  4
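
If you would rather read the table once instead of running one subquery per column, the same result can probably be obtained with ROW_NUMBER and conditional aggregation. This is only a sketch against the sample data above, not a tuned solution: the CTE name Ranked and the columns RnA and RnB are names I made up for illustration, and it assumes Sequence is unique within each Id.

```sql
-- Sample data repeated so the batch runs on its own.
DECLARE @T AS TABLE (Id int, A char(2), B char(2), Sequence int);

INSERT INTO @T (Id, A, B, Sequence) VALUES
(1, 'a0', 'b0', 0), (2, 'a0', 'b6', 0), (1, 'a1', NULL, 1), (2, 'a1', NULL, 1),
(2, NULL, 'b2', 2), (2, 'a3', 'b3', 3), (2, NULL, 'b4', 4);

WITH Ranked AS
(
    SELECT  Id, A, B, Sequence,
            -- Rank rows per Id: non-null A first, then highest Sequence first
            ROW_NUMBER() OVER (
                PARTITION BY Id
                ORDER BY CASE WHEN A IS NULL THEN 1 ELSE 0 END, Sequence DESC
            ) AS RnA,
            -- Same idea for B (RnA and RnB are hypothetical names)
            ROW_NUMBER() OVER (
                PARTITION BY Id
                ORDER BY CASE WHEN B IS NULL THEN 1 ELSE 0 END, Sequence DESC
            ) AS RnB
    FROM @T
)
SELECT  Id,
        -- Only the row ranked first for A contributes a value here
        MAX(CASE WHEN RnA = 1 THEN A END) AS A,
        -- Only the row ranked first for B contributes a value here
        MAX(CASE WHEN RnB = 1 THEN B END) AS B,
        MAX(Sequence) AS Sequence
FROM Ranked
GROUP BY Id;
```

If a column is NULL in every row for an Id, the row ranked first carries NULL and the output for that column is NULL, which matches what the subquery version returns.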

If you keep trying, the results will come :-) I managed to figure it out. For anyone interested, here is the code:

SELECT [Id],
       [A] = (
           SELECT TOP (1) [A]
           FROM [dbo].[Table]
           WHERE [A] IS NOT NULL
               AND [Current].[Id] = [Id]
           ORDER BY [Sequence] DESC
       ),
       [B] = (
           SELECT TOP (1) [B]
           FROM [dbo].[Table]
           WHERE [B] IS NOT NULL
               AND [Current].[Id] = [Id]
           ORDER BY [Sequence] DESC
       ),
       [HighestSequence] = (
           SELECT TOP (1) [Sequence]
           FROM [dbo].[Table]
           WHERE [Current].[Id] = [Id]
           ORDER BY [Sequence] DESC
       )
FROM (SELECT [Id] FROM [dbo].[Table]) AS [Current]
GROUP BY [Id]

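I don't know how well the query will perform, but it is adequate for my scenario. If you spot any flaws, please let me know. Improvements are always welcome.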
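
On the performance question, one thing that might help is an index that lets each TOP (1) subquery seek on Id and read rows in descending Sequence order. The following is a sketch under assumptions: it supposes [dbo].[Table] has no comparable index yet, and the index name IX_Table_Id_Sequence is hypothetical. I have not measured it against real data.

```sql
-- Hypothetical index name; key order matches the subqueries'
-- WHERE [Id] = ... ORDER BY [Sequence] DESC pattern.
CREATE NONCLUSTERED INDEX IX_Table_Id_Sequence
ON [dbo].[Table] ([Id], [Sequence] DESC)
INCLUDE ([A], [B]);  -- carry A and B so the subqueries can be answered from the index alone
```

Whether this pays off depends on how many events each Id accumulates and how often the table is written to, so it is worth comparing the execution plans before and after.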