Filling null values in a table with the previous non-null value in each column (every column has a few nulls)
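I need to transform Table A into Table B, i.e. fill every null value with the previous non-null value in the same column.
This has to be done for each column independently.
Here is the original table: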
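FromCompany | Container | Numbers | ToCompany | Location
:---------- | :---------- | ------: | :------------ | :----------
DISCOVERY | HALU 330308 | 5 | MAGNA CHARGE | St-Laurent
null | ATSU 827944 | 0 | LEEZA DIST. | null
null | null | 4 | null | null
COLUMBIA | CAIU 807457 | 3 | La Cie Canada | Baie D'Urfe
null | null | 6 | null | null
null | null | 0 | null | null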
The final table should be:
FromCompany | Container | Numbers | ToCompany | Location
:---------- | :---------- | ------: | :------------ | :----------
DISCOVERY | HALU 330308 | 5 | MAGNA CHARGE | St-Laurent
DISCOVERY | ATSU 827944 | 0 | LEEZA DIST | St-Laurent
DISCOVERY | ATSU 827944 | 4 | LEEZA DIST | St-Laurent
COLUMBIA | CAIU 807457 | 3 | La Cie Canada | Baie D'Urfe
COLUMBIA | CAIU 807457 | 6 | La Cie Canada | Baie D'Urfe
COLUMBIA | CAIU 807457 | 0 | La Cie Canada | Baie D'Urfe
Any help would be greatly appreciated.
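Normally, if your table had an identity column or some other guaranteed way of ordering the rows, you could use a CTE to do this relatively efficiently. We don't have that luxury here, so an alternative is to use a much less efficient CURSOR instead.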
-- Cursor variables
DECLARE @FromCompanyCursor varchar(20),
@ContainerCursor varchar(20),
@NumbersCursor int,
@ToCompanyCursor varchar(20),
@LocationCursor varchar(20),
@FromCompany varchar(20),
@Container varchar(20),
@Numbers int,
@ToCompany varchar(20),
@Location varchar(20);
-- Cursor declaration
DECLARE C CURSOR FOR
(
SELECT FromCompany,
Container,
Numbers,
ToCompany,
Location
FROM TableName
)
FOR UPDATE OF FromCompany, Container, Numbers, ToCompany, Location;
OPEN C;
-- Get first row from the cursor
FETCH NEXT FROM C INTO @FromCompanyCursor, @ContainerCursor, @NumbersCursor, @ToCompanyCursor, @LocationCursor;
-- While we still have rows to iterate over
WHILE @@FETCH_STATUS = 0
BEGIN
-- Keep track of the last non-null value
SELECT @FromCompany = CASE WHEN @FromCompanyCursor IS NOT NULL THEN @FromCompanyCursor ELSE @FromCompany END,
@Container = CASE WHEN @ContainerCursor IS NOT NULL THEN @ContainerCursor ELSE @Container END,
@Numbers = CASE WHEN @NumbersCursor IS NOT NULL THEN @NumbersCursor ELSE @Numbers END,
@ToCompany = CASE WHEN @ToCompanyCursor IS NOT NULL THEN @ToCompanyCursor ELSE @ToCompany END,
@Location = CASE WHEN @LocationCursor IS NOT NULL THEN @LocationCursor ELSE @Location END;
-- Update the table with the last non-null values
UPDATE TableName
SET FromCompany = @FromCompany,
Container = @Container,
Numbers = @Numbers,
ToCompany = @ToCompany,
Location = @Location
WHERE CURRENT OF C;
-- Get the next row from the cursor
FETCH NEXT FROM C INTO @FromCompanyCursor, @ContainerCursor, @NumbersCursor, @ToCompanyCursor, @LocationCursor;
END
-- Don't forget to close the cursor!
CLOSE C;
DEALLOCATE C;
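Note that procedural, row-by-row operations like this are very inefficient in SQL Server, so a solution like this should only be used as a one-off operation or as part of scheduled maintenance.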
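As most commenters pointed out, you really do need a column to order the dataset by. Since your data comes from a CSV file, you could edit the file before loading it to add an auto-incrementing row number.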
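To make the rest of this answer reproducible, here is a minimal sketch of what the loaded table might look like once it has such a row number. The table name `mytable`, the column name `id`, and the column types (borrowed from the variable declarations in the cursor answer) are assumptions for illustration, not something taken from your schema.

```sql
-- Hypothetical staging table: `id` carries the original CSV line order.
CREATE TABLE mytable (
    id          int         NOT NULL PRIMARY KEY,
    FromCompany varchar(20) NULL,
    Container   varchar(20) NULL,
    Numbers     int         NULL,
    ToCompany   varchar(20) NULL,
    Location    varchar(20) NULL
);

-- Sample rows from the question; empty cells are loaded as NULL.
INSERT INTO mytable (id, FromCompany, Container, Numbers, ToCompany, Location)
VALUES
    (1, 'DISCOVERY', 'HALU 330308', 5, 'MAGNA CHARGE',  'St-Laurent'),
    (2, NULL,        'ATSU 827944', 0, 'LEEZA DIST.',   NULL),
    (3, NULL,        NULL,          4, NULL,            NULL),
    (4, 'COLUMBIA',  'CAIU 807457', 3, 'La Cie Canada', 'Baie D''Urfe'),
    (5, NULL,        NULL,          6, NULL,            NULL),
    (6, NULL,        NULL,          0, NULL,            NULL);
```

The only property the queries rely on is that `id` is unique and increases in the order the rows appeared in the file.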
Assuming you have such a column (id), here is a SQL Server solution that fills each NULL value with the previous non-NULL value in the same column.
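The basic idea is to assign each record to a group, where the group number is the id of the most recent record (at or before the current one) that has a non-null value in that column. To fill 5 columns, we need 5 groups.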
SELECT
t.* ,
MAX(CASE WHEN FromCompany IS NOT NULL THEN id END) OVER(ORDER BY id ROWS UNBOUNDED PRECEDING) AS grpFromCompany,
MAX(CASE WHEN Container IS NOT NULL THEN id END) OVER(ORDER BY id ROWS UNBOUNDED PRECEDING) AS grpContainer,
MAX(CASE WHEN Numbers IS NOT NULL THEN id END) OVER(ORDER BY id ROWS UNBOUNDED PRECEDING) AS grpNumbers,
MAX(CASE WHEN ToCompany IS NOT NULL THEN id END) OVER(ORDER BY id ROWS UNBOUNDED PRECEDING) AS grpToCompany,
MAX(CASE WHEN Location IS NOT NULL THEN id END) OVER(ORDER BY id ROWS UNBOUNDED PRECEDING) AS grpLocation
FROM mytable t
Returns:
id | FromCompany | Container | Numbers | ToCompany | Location | grpFromCompany | grpContainer | grpNumbers | grpToCompany | grpLocation
-: | :---------- | :---------- | ------: | :------------ | :---------- | -------------: | -----------: | ---------: | -----------: | ----------:
1 | DISCOVERY | HALU 330308 | 5 | MAGNA CHARGE | St-Laurent | 1 | 1 | 1 | 1 | 1
2 | null | ATSU 827944 | 0 | LEEZA DIST. | null | 1 | 2 | 2 | 2 | 1
3 | null | null | 4 | null | null | 1 | 2 | 3 | 2 | 1
4 | COLUMBIA | CAIU 807457 | 3 | La Cie Canada | Baie D'Urfe | 4 | 4 | 4 | 4 | 4
5 | null | null | 6 | null | null | 4 | 4 | 5 | 4 | 4
6 | null | null | 0 | null | null | 4 | 4 | 6 | 4 | 4
Now we can turn this into a CTE and use it to look up the relevant values in the table:
WITH mycte AS (
SELECT
t.* ,
MAX(CASE WHEN FromCompany IS NOT NULL THEN id END) OVER(ORDER BY id ROWS UNBOUNDED PRECEDING) AS grpFromCompany,
MAX(CASE WHEN Container IS NOT NULL THEN id END) OVER(ORDER BY id ROWS UNBOUNDED PRECEDING) AS grpContainer,
MAX(CASE WHEN Numbers IS NOT NULL THEN id END) OVER(ORDER BY id ROWS UNBOUNDED PRECEDING) AS grpNumbers,
MAX(CASE WHEN ToCompany IS NOT NULL THEN id END) OVER(ORDER BY id ROWS UNBOUNDED PRECEDING) AS grpToCompany,
MAX(CASE WHEN Location IS NOT NULL THEN id END) OVER(ORDER BY id ROWS UNBOUNDED PRECEDING) AS grpLocation
FROM mytable t
)
SELECT
id,
(SELECT FromCompany FROM mytable WHERE id = grpFromCompany) AS FromCompany,
(SELECT Container FROM mytable WHERE id = grpContainer) AS Container,
(SELECT Numbers FROM mytable WHERE id = grpNumbers) AS Numbers,
(SELECT ToCompany FROM mytable WHERE id = grpToCompany) AS ToCompany,
(SELECT Location FROM mytable WHERE id = grpLocation) AS Location
FROM mycte
GO
id | FromCompany | Container | Numbers | ToCompany | Location
-: | :---------- | :---------- | ------: | :------------ | :----------
1 | DISCOVERY | HALU 330308 | 5 | MAGNA CHARGE | St-Laurent
2 | DISCOVERY | ATSU 827944 | 0 | LEEZA DIST. | St-Laurent
3 | DISCOVERY | ATSU 827944 | 4 | LEEZA DIST. | St-Laurent
4 | COLUMBIA | CAIU 807457 | 3 | La Cie Canada | Baie D'Urfe
5 | COLUMBIA | CAIU 807457 | 6 | La Cie Canada | Baie D'Urfe
6 | COLUMBIA | CAIU 807457 | 0 | La Cie Canada | Baie D'Urfe
db<>fiddle here
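The query above only returns the filled values. If you want to write them back to the table, as the question seems to ask, something along these lines might work. It is a sketch under the same assumptions (a table named `mytable` with a unique ordering column `id`); the aliases `g`, `fc`, `ct`, `nb`, `tc` and `lc` are made up for the example.

```sql
WITH mycte AS (
    SELECT
        id,
        MAX(CASE WHEN FromCompany IS NOT NULL THEN id END) OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) AS grpFromCompany,
        MAX(CASE WHEN Container   IS NOT NULL THEN id END) OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) AS grpContainer,
        MAX(CASE WHEN Numbers     IS NOT NULL THEN id END) OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) AS grpNumbers,
        MAX(CASE WHEN ToCompany   IS NOT NULL THEN id END) OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) AS grpToCompany,
        MAX(CASE WHEN Location    IS NOT NULL THEN id END) OVER (ORDER BY id ROWS UNBOUNDED PRECEDING) AS grpLocation
    FROM mytable
)
UPDATE t
SET FromCompany = fc.FromCompany,
    Container   = ct.Container,
    Numbers     = nb.Numbers,
    ToCompany   = tc.ToCompany,
    Location    = lc.Location
FROM mytable AS t
JOIN mycte AS g ON g.id = t.id
-- LEFT JOINs: a leading NULL with no earlier value has a NULL group id
-- and simply stays NULL instead of dropping the row from the update.
LEFT JOIN mytable AS fc ON fc.id = g.grpFromCompany
LEFT JOIN mytable AS ct ON ct.id = g.grpContainer
LEFT JOIN mytable AS nb ON nb.id = g.grpNumbers
LEFT JOIN mytable AS tc ON tc.id = g.grpToCompany
LEFT JOIN mytable AS lc ON lc.id = g.grpLocation;
```

Each column is looked up through its own group id, so a value filled in one column never depends on where the nulls fall in another. It is probably worth running the SELECT version first and comparing its output with the expected table before updating real data.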