How do I insert missing data into a big data (94 million records) table
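Seconds are missing from the data:
I have some big data, 94 million records, in a table named [ESdata-1sec]. The data represents S&P 500 eMini futures: a date and time plus open, high, low, close, and volume values.
I want to insert data for the missing seconds.
For example, the top record is 00:05:23 and the next record in the table is 00:05:33 (10 seconds later), so [ESdata-1sec] is missing 9 seconds of entries.
I want to insert records for 00:05:24, 00:05:25, 00:05:26 ... 00:05:32.
I have a table (DateRanges) of DateTime entries that contains a record for every second, so I can join/union against it to get the date and time to insert. The open, high, low, close, and volume values will be the same as in the last existing record (00:05:23).
I have no idea how to do this efficiently. I could brute-force loop through the [ESdata-1sec] table and add a record whenever a second is missing, but I'm not sure I would live long enough to see the result.
Any help with this would be much appreciated.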
You can try using an aggregate window function to create group IDs (GID) together with a JOIN, as shown below.
Script to generate the input:
CREATE TABLE #ESdata1sec
(
    [Date]   DATE,
    [Time]   TIME,
    [Open]   FLOAT,
    [High]   FLOAT,
    [Low]    FLOAT,
    [Close]  FLOAT,
    [Volume] INT
)
INSERT INTO #ESdata1sec VALUES
    ('2009-09-30', '00:05:23.000', 1056.25, 1056.25, 1056.25, 1056.25, 1),
    ('2009-09-30', '00:05:33.000', 1056.25, 1056.25, 1056.25, 1056.25, 1),
    ('2009-09-30', '00:05:51.000', 1056.25, 1056.25, 1056.25, 1056.25, 5),
    ('2009-09-30', '00:05:54.000', 1056.25, 1056.25, 1056.25, 1056.25, 4),
    ('2009-09-30', '00:05:55.000', 1056.25, 1056.25, 1056.25, 1056.25, 28),
    ('2009-09-30', '00:05:57.000', 1056.00, 1056.25, 1056.00, 1056.25, 15),
    ('2009-09-30', '00:06:07.000', 1056.25, 1056.25, 1056.25, 1056.25, 55),
    ('2009-09-30', '00:06:14.000', 1056.25, 1056.25, 1056.25, 1056.25, 10),
    ('2009-09-30', '00:06:19.000', 1056.25, 1056.25, 1056.25, 1056.25, 8)
GO
CREATE TABLE #DateRanges
(
    [Date] DATETIME
)
DECLARE @start AS DATETIME = '2009-09-30 00:05:23.000',
        @end   AS DATETIME = '2009-09-30 00:06:19.000'
WHILE @start <= @end
BEGIN
    INSERT INTO #DateRanges VALUES (@start)
    SET @start = DATEADD(SECOND, 1, @start)
END
GO
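Before the full query, it may help to look at the grouping step on its own. The sketch below runs against the two temp tables created above and only shows how the running count produces a GID; the alias names are the same ones the solution uses. It assumes SQL Server 2012 or later (for SUM ... OVER (ORDER BY ...)) and that each (Date, Time) pair appears at most once in #ESdata1sec.

```sql
-- Illustration only: show which real row each second gets assigned to.
SELECT
    d.[Date]
    ,e.ID                                    -- NULL for seconds that have no row in #ESdata1sec
    ,SUM(CASE WHEN e.ID IS NOT NULL THEN 1 END)
        OVER (ORDER BY d.[Date]) AS GID      -- running count of real rows seen so far
FROM #DateRanges d
LEFT OUTER JOIN
(
    SELECT
        ROW_NUMBER() OVER (ORDER BY [Date], [Time]) AS ID
        ,CAST([Date] AS DATETIME) + CAST([Time] AS DATETIME) AS FullDate
    FROM #ESdata1sec
) e
    ON e.FullDate = d.[Date]
ORDER BY
    d.[Date];
```

With the sample data, 00:05:23 gets ID 1 and GID 1, the gap seconds 00:05:24 through 00:05:32 get a NULL ID but still GID 1, and 00:05:33 gets ID 2 and GID 2. Because the GID of a gap second equals the ID of the last real row before it, joining GID back to ID carries that row's values forward.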
Solution:
WITH x AS
(
    SELECT
        -- Running count of real rows: a gap second inherits the ID of the last real row before it
        SUM(CASE WHEN ID IS NOT NULL THEN 1 END) OVER (ORDER BY d.[Date]) AS GID
        ,[ID]
        ,d.[Date]
        ,[Open], [High], [Low], [Close], [Volume]
    FROM #DateRanges d
    LEFT OUTER JOIN
    (
        SELECT
            -- Number the real rows in time order
            ROW_NUMBER() OVER (ORDER BY [Date], [Time]) AS ID
            -- Combine the separate DATE and TIME columns so they can be matched to #DateRanges
            ,CAST([Date] AS DATETIME) + CAST([Time] AS DATETIME) AS [FullDate]
            ,[Open], [High], [Low], [Close], [Volume]
        FROM #ESdata1sec
    ) e
        ON e.FullDate = d.[Date]
)
SELECT
    a.[Date]
    ,b.[Open], b.[High], b.[Low], b.[Close], b.[Volume]
FROM x a
LEFT OUTER JOIN
(
    -- Only the real rows carry values
    SELECT ID
        ,[Open], [High], [Low], [Close], [Volume]
    FROM x
    WHERE
        ID IS NOT NULL
) b
    ON a.GID = b.ID
ORDER BY
    a.[Date]
Output:
+-------------------------+---------+---------+---------+---------+--------+
| Date | Open | High | Low | Close | Volume |
+-------------------------+---------+---------+---------+---------+--------+
| 2009-09-30 00:05:23.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:24.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:25.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:26.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:27.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:28.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:29.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:30.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:31.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:32.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:33.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:34.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:35.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:36.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:37.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:38.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:39.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:40.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:41.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:42.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:43.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:44.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:45.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:46.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:47.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:48.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:49.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:50.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 1 |
| 2009-09-30 00:05:51.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 5 |
| 2009-09-30 00:05:52.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 5 |
| 2009-09-30 00:05:53.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 5 |
| 2009-09-30 00:05:54.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 4 |
| 2009-09-30 00:05:55.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 28 |
| 2009-09-30 00:05:56.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 28 |
| 2009-09-30 00:05:57.000 | 1056 | 1056.25 | 1056 | 1056.25 | 15 |
| 2009-09-30 00:05:58.000 | 1056 | 1056.25 | 1056 | 1056.25 | 15 |
| 2009-09-30 00:05:59.000 | 1056 | 1056.25 | 1056 | 1056.25 | 15 |
| 2009-09-30 00:06:00.000 | 1056 | 1056.25 | 1056 | 1056.25 | 15 |
| 2009-09-30 00:06:01.000 | 1056 | 1056.25 | 1056 | 1056.25 | 15 |
| 2009-09-30 00:06:02.000 | 1056 | 1056.25 | 1056 | 1056.25 | 15 |
| 2009-09-30 00:06:03.000 | 1056 | 1056.25 | 1056 | 1056.25 | 15 |
| 2009-09-30 00:06:04.000 | 1056 | 1056.25 | 1056 | 1056.25 | 15 |
| 2009-09-30 00:06:05.000 | 1056 | 1056.25 | 1056 | 1056.25 | 15 |
| 2009-09-30 00:06:06.000 | 1056 | 1056.25 | 1056 | 1056.25 | 15 |
| 2009-09-30 00:06:07.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 55 |
| 2009-09-30 00:06:08.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 55 |
| 2009-09-30 00:06:09.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 55 |
| 2009-09-30 00:06:10.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 55 |
| 2009-09-30 00:06:11.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 55 |
| 2009-09-30 00:06:12.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 55 |
| 2009-09-30 00:06:13.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 55 |
| 2009-09-30 00:06:14.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 10 |
| 2009-09-30 00:06:15.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 10 |
| 2009-09-30 00:06:16.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 10 |
| 2009-09-30 00:06:17.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 10 |
| 2009-09-30 00:06:18.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 10 |
| 2009-09-30 00:06:19.000 | 1056.25 | 1056.25 | 1056.25 | 1056.25 | 8 |
+-------------------------+---------+---------+---------+---------+--------+
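The solution query only returns the filled-in series, while the question asks for the missing seconds to be inserted. One possible way to do that is to feed the same grouping into an INSERT that keeps only the gap rows. This is a minimal sketch against the temp tables from the input script, not a tested answer for the 94-million-row table: it assumes SQL Server 2012 or later, assumes each (Date, Time) pair is unique, and splits the original derived table into a separate CTE named e purely for readability.

```sql
-- Sketch: insert only the seconds that are missing from #ESdata1sec.
WITH e AS
(
    -- Real rows, numbered in time order, with DATE and TIME combined
    SELECT
        ROW_NUMBER() OVER (ORDER BY [Date], [Time]) AS ID
        ,CAST([Date] AS DATETIME) + CAST([Time] AS DATETIME) AS FullDate
        ,[Open], [High], [Low], [Close], [Volume]
    FROM #ESdata1sec
),
x AS
(
    -- Every second in the range, tagged with the ID of the last real row (GID)
    SELECT
        SUM(CASE WHEN e.ID IS NOT NULL THEN 1 END) OVER (ORDER BY d.[Date]) AS GID
        ,e.ID
        ,d.[Date] AS FullDate
    FROM #DateRanges d
    LEFT OUTER JOIN e
        ON e.FullDate = d.[Date]
)
INSERT INTO #ESdata1sec ([Date], [Time], [Open], [High], [Low], [Close], [Volume])
SELECT
    CAST(x.FullDate AS DATE)
    ,CAST(x.FullDate AS TIME)
    ,e.[Open], e.[High], e.[Low], e.[Close], e.[Volume]
FROM x
INNER JOIN e
    ON e.ID = x.GID          -- values come from the last real row before the gap
WHERE
    x.ID IS NULL;            -- skip seconds that already have a row
```

The INNER JOIN also drops any seconds in #DateRanges that fall before the first real row, since their GID is NULL and there is nothing to carry forward. On the full table it would probably be worth running this one day (or another bounded date range) at a time by filtering both #DateRanges and the source rows, so that each statement sorts and logs a manageable number of rows.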