Fill Missing Financial Time Series Data From One Table to Another with SQL
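This is part of an open-source Python project I am building on GitHub, which can be found here:
I have a lot of financial time series data from FXCM, and it is full of gaps. These gaps need to be filled with other data from the database. I'm stuck; can anyone help?
The database and tables are created by a Python script, which can be found here.
A snippet of the code is below.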
CREATE DATABASE IF NOT EXISTS fxcm_bar_GBPUSD;
CREATE TABLE IF NOT EXISTS fxcm_bar_GBPUSD.tbl_GBPUSD_m1 (
`date` DATETIME NOT NULL,
`bidopen` DECIMAL(19,6) NULL,
`bidhigh` DECIMAL(19,6) NULL,
`bidlow` DECIMAL(19,6) NULL,
`bidclose` DECIMAL(19,6) NULL,
`askopen` DECIMAL(19,6) NULL,
`askhigh` DECIMAL(19,6) NULL,
`asklow` DECIMAL(19,6) NULL,
`askclose` DECIMAL(19,6) NULL,
`volume` BIGINT NULL,
PRIMARY KEY (`date`))
ENGINE=InnoDB;
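The following two queries cover the 1-minute and 5-minute intervals respectively. You can see that the 1-minute table is missing a lot of data points. Before I resort to 'predicting' values, the 5-minute table has some data points that can help fill the gaps.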
MariaDB [(none)]> select * from fxcm_bar_GBPUSD.tbl_GBPUSD_m1 where date >= "2002-3-31 17:00:00" and date <= "2002-3-31 18:00:00";
+---------------------+----------+----------+----------+----------+----------+----------+----------+----------+--------+
| date | bidopen | bidhigh | bidlow | bidclose | askopen | askhigh | asklow | askclose | volume |
+---------------------+----------+----------+----------+----------+----------+----------+----------+----------+--------+
| 2002-03-31 17:01:00 | 1.425900 | 1.425900 | 1.425800 | 1.425800 | 1.426200 | 1.426200 | 1.426100 | 1.426100 | 0 |
| 2002-03-31 17:15:00 | 1.425800 | 1.425800 | 1.425700 | 1.425800 | 1.426100 | 1.426100 | 1.426000 | 1.426100 | 0 |
| 2002-03-31 17:17:00 | 1.425800 | 1.425800 | 1.425600 | 1.425600 | 1.426100 | 1.426100 | 1.425900 | 1.425900 | 0 |
| 2002-03-31 17:20:00 | 1.425600 | 1.425700 | 1.425500 | 1.425700 | 1.425900 | 1.426000 | 1.425800 | 1.426000 | 0 |
| 2002-03-31 17:22:00 | 1.425700 | 1.425800 | 1.425700 | 1.425800 | 1.426000 | 1.426100 | 1.426000 | 1.426100 | 0 |
| 2002-03-31 17:24:00 | 1.425800 | 1.425800 | 1.425600 | 1.425600 | 1.426100 | 1.426100 | 1.425900 | 1.425900 | 0 |
| 2002-03-31 17:29:00 | 1.425600 | 1.425800 | 1.425600 | 1.425800 | 1.425900 | 1.426100 | 1.425900 | 1.426100 | 0 |
| 2002-03-31 17:31:00 | 1.425800 | 1.425800 | 1.425600 | 1.425600 | 1.426100 | 1.426100 | 1.425900 | 1.425900 | 0 |
| 2002-03-31 17:48:00 | 1.425600 | 1.425600 | 1.425200 | 1.425200 | 1.425900 | 1.425900 | 1.425500 | 1.425500 | 0 |
+---------------------+----------+----------+----------+----------+----------+----------+----------+----------+--------+
9 rows in set (0.00 sec)
MariaDB [(none)]> select * from fxcm_bar_GBPUSD.tbl_GBPUSD_m5 where date >= "2002-3-31 17:00:00" and date <= "2002-3-31 18:00:00";
+---------------------+----------+----------+----------+----------+----------+----------+----------+----------+--------+
| date | bidopen | bidhigh | bidlow | bidclose | askopen | askhigh | asklow | askclose | volume |
+---------------------+----------+----------+----------+----------+----------+----------+----------+----------+--------+
| 2002-03-31 17:00:00 | 1.425900 | 1.425900 | 1.425800 | 1.425800 | 1.426200 | 1.426200 | 1.426100 | 1.426100 | 0 |
| 2002-03-31 17:15:00 | 1.425800 | 1.425800 | 1.425600 | 1.425600 | 1.426100 | 1.426100 | 1.425900 | 1.425900 | 0 |
| 2002-03-31 17:25:00 | 1.425600 | 1.425800 | 1.425600 | 1.425800 | 1.425900 | 1.426100 | 1.425900 | 1.426100 | 0 |
| 2002-03-31 17:30:00 | 1.425800 | 1.425800 | 1.425600 | 1.425600 | 1.426100 | 1.426100 | 1.425900 | 1.425900 | 0 |
| 2002-03-31 17:45:00 | 1.425600 | 1.425600 | 1.425200 | 1.425200 | 1.425900 | 1.425900 | 1.425500 | 1.425500 | 0 |
| 2002-03-31 18:00:00 | 1.425200 | 1.425500 | 1.425200 | 1.425500 | 1.425500 | 1.425800 | 1.425500 | 1.425800 | 0 |
+---------------------+----------+----------+----------+----------+----------+----------+----------+----------+--------+
7 rows in set (0.01 sec)
MariaDB [(none)]>
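Likewise, the 1-minute table has data points that can fill in missing data points in the 5-minute table.
After the tables exchange values, they should look like this (1-minute table first, then the 5-minute table).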
+---------------------+----------+----------+----------+----------+----------+----------+----------+----------+--------+
| date | bidopen | bidhigh | bidlow | bidclose | askopen | askhigh | asklow | askclose | volume |
+---------------------+----------+----------+----------+----------+----------+----------+----------+----------+--------+
| 2002-03-31 17:00:00 | 1.425900 | 1.425900 | 1.425800 | 1.425800 | 1.426200 | 1.426200 | 1.426100 | 1.426100 | 0 |
| 2002-03-31 17:01:00 | 1.425900 | 1.425900 | 1.425800 | 1.425800 | 1.426200 | 1.426200 | 1.426100 | 1.426100 | 0 |
| 2002-03-31 17:15:00 | 1.425800 | 1.425800 | 1.425700 | 1.425800 | 1.426100 | 1.426100 | 1.426000 | 1.426100 | 0 |
| 2002-03-31 17:17:00 | 1.425800 | 1.425800 | 1.425600 | 1.425600 | 1.426100 | 1.426100 | 1.425900 | 1.425900 | 0 |
| 2002-03-31 17:20:00 | 1.425600 | 1.425700 | 1.425500 | 1.425700 | 1.425900 | 1.426000 | 1.425800 | 1.426000 | 0 |
| 2002-03-31 17:22:00 | 1.425700 | 1.425800 | 1.425700 | 1.425800 | 1.426000 | 1.426100 | 1.426000 | 1.426100 | 0 |
| 2002-03-31 17:24:00 | 1.425800 | 1.425800 | 1.425600 | 1.425600 | 1.426100 | 1.426100 | 1.425900 | 1.425900 | 0 |
| 2002-03-31 17:25:00 | 1.425600 | 1.425800 | 1.425600 | 1.425800 | 1.425900 | 1.426100 | 1.425900 | 1.426100 | 0 |
| 2002-03-31 17:29:00 | 1.425600 | 1.425800 | 1.425600 | 1.425800 | 1.425900 | 1.426100 | 1.425900 | 1.426100 | 0 |
| 2002-03-31 17:30:00 | 1.425800 | 1.425800 | 1.425600 | 1.425600 | 1.426100 | 1.426100 | 1.425900 | 1.425900 | 0 |
| 2002-03-31 17:31:00 | 1.425800 | 1.425800 | 1.425600 | 1.425600 | 1.426100 | 1.426100 | 1.425900 | 1.425900 | 0 |
| 2002-03-31 17:45:00 | 1.425600 | 1.425600 | 1.425200 | 1.425200 | 1.425900 | 1.425900 | 1.425500 | 1.425500 | 0 |
| 2002-03-31 17:48:00 | 1.425600 | 1.425600 | 1.425200 | 1.425200 | 1.425900 | 1.425900 | 1.425500 | 1.425500 | 0 |
| 2002-03-31 18:00:00 | 1.425200 | 1.425500 | 1.425200 | 1.425500 | 1.425500 | 1.425800 | 1.425500 | 1.425800 | 0 |
+---------------------+----------+----------+----------+----------+----------+----------+----------+----------+--------+
+---------------------+----------+----------+----------+----------+----------+----------+----------+----------+--------+
| date | bidopen | bidhigh | bidlow | bidclose | askopen | askhigh | asklow | askclose | volume |
+---------------------+----------+----------+----------+----------+----------+----------+----------+----------+--------+
| 2002-03-31 17:00:00 | 1.425900 | 1.425900 | 1.425800 | 1.425800 | 1.426200 | 1.426200 | 1.426100 | 1.426100 | 0 |
| 2002-03-31 17:15:00 | 1.425800 | 1.425800 | 1.425600 | 1.425600 | 1.426100 | 1.426100 | 1.425900 | 1.425900 | 0 |
| 2002-03-31 17:20:00 | 1.425600 | 1.425700 | 1.425500 | 1.425700 | 1.425900 | 1.426000 | 1.425800 | 1.426000 | 0 |
| 2002-03-31 17:25:00 | 1.425600 | 1.425800 | 1.425600 | 1.425800 | 1.425900 | 1.426100 | 1.425900 | 1.426100 | 0 |
| 2002-03-31 17:30:00 | 1.425800 | 1.425800 | 1.425600 | 1.425600 | 1.426100 | 1.426100 | 1.425900 | 1.425900 | 0 |
| 2002-03-31 17:45:00 | 1.425600 | 1.425600 | 1.425200 | 1.425200 | 1.425900 | 1.425900 | 1.425500 | 1.425500 | 0 |
| 2002-03-31 18:00:00 | 1.425200 | 1.425500 | 1.425200 | 1.425500 | 1.425500 | 1.425800 | 1.425500 | 1.425800 | 0 |
+---------------------+----------+----------+----------+----------+----------+----------+----------+----------+--------+
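There are still missing data points, but the data that was just filled in is real data.
I will then use Python to perform further interpolation outside the database, which is not part of this question.
How do I get these two tables to exchange and insert the missing rows without cross-contamination?
Thanks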
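I guess this is what you want. I may be wrong, though; it is hard to tell.
None of your bid* data is modified by any of the operations, so your problem seems to be equivalent to two tables with a datetime (here tp) that identifies the row and some payload, abstracted here to a text column (here t) for convenience.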
-- Example setup
CREATE TABLE minutes1 (tp datetime, t text, PRIMARY KEY (tp));
CREATE TABLE minutes5 (tp datetime, t text, PRIMARY KEY (tp));
-- keep common data 00:00:00 as is
INSERT INTO minutes1 VALUES ('2017-01-01 00:00:00', 'a');
INSERT INTO minutes1 VALUES ('2017-01-01 00:01:00', 'b');
-- add this 00:05:00 to minutes5 because would fit there and is missing
INSERT INTO minutes1 VALUES ('2017-01-01 00:05:00', 'c');
-- keep common data for 00:00:00 as is
INSERT INTO minutes5 VALUES ('2017-01-01 00:00:00', '1');
-- add this 00:10:00 to minutes1 because would fit there and is missing
INSERT INTO minutes5 VALUES ('2017-01-01 00:10:00', '2');
table minutes1
tp | t
'2017-01-01 00:00:00' | 'a'
'2017-01-01 00:01:00' | 'b'
'2017-01-01 00:05:00' | 'c'
table minutes5
tp | t
'2017-01-01 00:00:00' | '1'
'2017-01-01 00:10:00' | '2'
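Solution strategy
We never change existing data in either table. We only insert what is missing. Therefore no cross-contamination can occur:
- If the data is in both tables, nothing happens.
- If the data is in one table but not the other, it can be inserted safely.
- If the data is in neither table, there is nothing to transfer anyway.
- Always respect the granularity.
- It is always possible to insert from the 5-minute steps into the 1-minute steps.
- It is only possible to insert from the 1-minute steps into the 5-minute steps when the minute value is divisible by 5.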
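The rules above can be sketched in plain Python before touching the database. This is only an illustration of the logic, not the SQL itself: the function name `fill_missing` and the use of dictionaries keyed by `datetime` are assumptions made for this sketch, and the sample values mirror the `minutes1` and `minutes5` example tables.

```python
from datetime import datetime


def fill_missing(minutes1, minutes5):
    """Return filled copies of both tables without overwriting existing rows.

    Hypothetical helper: each table is modeled as a dict of {timestamp: payload}.
    """
    filled1 = dict(minutes1)
    filled5 = dict(minutes5)
    # Every 5-minute timestamp is also a valid 1-minute timestamp.
    for tp, row in minutes5.items():
        filled1.setdefault(tp, row)  # setdefault never replaces an existing row
    # A 1-minute timestamp fits the 5-minute grid only on multiples of 5 minutes.
    for tp, row in minutes1.items():
        if tp.minute % 5 == 0:
            filled5.setdefault(tp, row)
    return filled1, filled5


m1 = {
    datetime(2017, 1, 1, 0, 0): "a",
    datetime(2017, 1, 1, 0, 1): "b",
    datetime(2017, 1, 1, 0, 5): "c",
}
m5 = {
    datetime(2017, 1, 1, 0, 0): "1",
    datetime(2017, 1, 1, 0, 10): "2",
}
new1, new5 = fill_missing(m1, m5)
```

The row that exists in both tables (00:00) keeps its own value on each side, which is the "no cross-contamination" property.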
From minutes5 to minutes1
If the data is missing in minutes1, this is always safe, because the granularity of minutes1 is finer than that of minutes5.
INSERT INTO minutes1
SELECT * FROM minutes5
WHERE tp NOT IN (SELECT tp FROM minutes1);
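From minutes1 to minutes5
We cannot insert a timestamp such as minute 2 into a table with a granularity of 5 minutes.
We use the same strategy as above, with an additional WHERE MINUTE(tp) % 5 = 0 clause to check the granularity.
INSERT INTO minutes5
SELECT * FROM minutes1
WHERE MINUTE(tp) % 5 = 0 AND tp NOT IN (SELECT tp FROM minutes5);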
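To map the two statements back onto the tables from the question, the key column becomes `date` and the payload becomes the eight price columns plus `volume`. The following is a minimal sketch under stated assumptions: it uses Python's built-in `sqlite3` as a stand-in for MariaDB so that it can run anywhere, which means `MINUTE(date)` is replaced by SQLite's `strftime('%M', date)`, the database qualifier `fxcm_bar_GBPUSD.` is dropped, and the column types are simplified. The helper name `fill_missing` is hypothetical. Only a few rows from the question's output are loaded.

```python
import sqlite3

COLUMNS = ["date", "bidopen", "bidhigh", "bidlow", "bidclose",
           "askopen", "askhigh", "asklow", "askclose", "volume"]
COLUMN_LIST = ", ".join(COLUMNS)

# A subset of the rows shown in the question.
M1_ROWS = [
    ("2002-03-31 17:01:00", 1.4259, 1.4259, 1.4258, 1.4258, 1.4262, 1.4262, 1.4261, 1.4261, 0),
    ("2002-03-31 17:15:00", 1.4258, 1.4258, 1.4257, 1.4258, 1.4261, 1.4261, 1.4260, 1.4261, 0),
    ("2002-03-31 17:20:00", 1.4256, 1.4257, 1.4255, 1.4257, 1.4259, 1.4260, 1.4258, 1.4260, 0),
]
M5_ROWS = [
    ("2002-03-31 17:00:00", 1.4259, 1.4259, 1.4258, 1.4258, 1.4262, 1.4262, 1.4261, 1.4261, 0),
    ("2002-03-31 17:15:00", 1.4258, 1.4258, 1.4256, 1.4256, 1.4261, 1.4261, 1.4259, 1.4259, 0),
    ("2002-03-31 17:25:00", 1.4256, 1.4258, 1.4256, 1.4258, 1.4259, 1.4261, 1.4259, 1.4261, 0),
]

conn = sqlite3.connect(":memory:")
for table, rows in (("tbl_GBPUSD_m1", M1_ROWS), ("tbl_GBPUSD_m5", M5_ROWS)):
    conn.execute(
        f"CREATE TABLE {table} (date TEXT PRIMARY KEY, bidopen REAL, bidhigh REAL, "
        "bidlow REAL, bidclose REAL, askopen REAL, askhigh REAL, asklow REAL, "
        "askclose REAL, volume INTEGER)"
    )
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", rows)


def fill_missing(target, source, five_minute_grid_only=False):
    """Copy rows whose date is absent from target; existing rows are never updated."""
    grid_filter = (
        "AND CAST(strftime('%M', date) AS INTEGER) % 5 = 0"  # SQLite stand-in for MINUTE(date) % 5 = 0
        if five_minute_grid_only else ""
    )
    conn.execute(
        f"INSERT INTO {target} ({COLUMN_LIST}) "
        f"SELECT {COLUMN_LIST} FROM {source} "
        f"WHERE date NOT IN (SELECT date FROM {target}) {grid_filter}"
    )


fill_missing("tbl_GBPUSD_m1", "tbl_GBPUSD_m5")
fill_missing("tbl_GBPUSD_m5", "tbl_GBPUSD_m1", five_minute_grid_only=True)

m1_dates = [r[0] for r in conn.execute("SELECT date FROM tbl_GBPUSD_m1 ORDER BY date")]
m5_dates = [r[0] for r in conn.execute("SELECT date FROM tbl_GBPUSD_m5 ORDER BY date")]
m1_bidlow = conn.execute(
    "SELECT bidlow FROM tbl_GBPUSD_m1 WHERE date = '2002-03-31 17:15:00'").fetchone()[0]
m5_bidlow = conn.execute(
    "SELECT bidlow FROM tbl_GBPUSD_m5 WHERE date = '2002-03-31 17:15:00'").fetchone()[0]
```

Naming the columns explicitly instead of using `SELECT *` keeps the copy correct even if the two tables ever declare their columns in a different order. The 17:15 row exists in both tables with different `bidlow` values, and each table keeps its own.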
Expected result
SELECT * FROM minutes1;
SELECT * FROM minutes5;
table minutes1
tp | t
'2017-01-01 00:00:00' | 'a'
'2017-01-01 00:01:00' | 'b' -- not added to minutes5
'2017-01-01 00:05:00' | 'c'
'2017-01-01 00:10:00' | '2' -- copied from minutes5
table minutes5
tp | t
'2017-01-01 00:00:00' | '1'
'2017-01-01 00:10:00' | '2'
'2017-01-01 00:05:00' | 'c' -- copied from minutes1
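The expected result can be checked end to end with a small script. This sketch assumes SQLite (through Python's built-in `sqlite3`) as a stand-in for MariaDB, so the timestamps are stored as text and `MINUTE(tp)` is written as `strftime('%M', tp)`; the rest follows the example setup and the two INSERT statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE minutes1 (tp TEXT PRIMARY KEY, t TEXT);
CREATE TABLE minutes5 (tp TEXT PRIMARY KEY, t TEXT);
INSERT INTO minutes1 VALUES ('2017-01-01 00:00:00', 'a');
INSERT INTO minutes1 VALUES ('2017-01-01 00:01:00', 'b');
INSERT INTO minutes1 VALUES ('2017-01-01 00:05:00', 'c');
INSERT INTO minutes5 VALUES ('2017-01-01 00:00:00', '1');
INSERT INTO minutes5 VALUES ('2017-01-01 00:10:00', '2');
""")

# minutes5 -> minutes1: always allowed, only rows missing from minutes1.
conn.execute("""
INSERT INTO minutes1
SELECT * FROM minutes5
WHERE tp NOT IN (SELECT tp FROM minutes1)
""")

# minutes1 -> minutes5: only rows on the 5-minute grid and missing from minutes5.
conn.execute("""
INSERT INTO minutes5
SELECT * FROM minutes1
WHERE CAST(strftime('%M', tp) AS INTEGER) % 5 = 0
  AND tp NOT IN (SELECT tp FROM minutes5)
""")

rows1 = conn.execute("SELECT tp, t FROM minutes1 ORDER BY tp").fetchall()
rows5 = conn.execute("SELECT tp, t FROM minutes5 ORDER BY tp").fetchall()
for row in rows1 + rows5:
    print(row)
```

The `ORDER BY tp` is added here only to make the output deterministic; without it the rows may come back in insertion order, as in the listing above.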
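Remarks
You may consider adding a CHECK constraint to guarantee the integrity of the minutes5 table with MINUTE(tp) % 5 = 0. Please consult your MariaDB manual for how to achieve this. It would probably look something like this.
ALTER TABLE minutes5
ADD CONSTRAINT check_minutes5_is_multiple_of_5
CHECK (MINUTE(tp) % 5 = 0);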
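The effect of such a constraint can be tried out locally. This sketch again assumes SQLite through Python's `sqlite3` rather than MariaDB, so the constraint is declared inline in CREATE TABLE and uses `strftime('%M', tp)` in place of `MINUTE(tp)`; how MariaDB reports the violation will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE minutes5 (
    tp TEXT PRIMARY KEY,
    t TEXT,
    CONSTRAINT check_minutes5_is_multiple_of_5
        CHECK (CAST(strftime('%M', tp) AS INTEGER) % 5 = 0)
)
""")

# A timestamp on the 5-minute grid is accepted.
conn.execute("INSERT INTO minutes5 VALUES ('2017-01-01 00:05:00', 'c')")

# A timestamp off the grid violates the constraint and is rejected.
try:
    conn.execute("INSERT INTO minutes5 VALUES ('2017-01-01 00:02:00', 'x')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM minutes5").fetchone()[0]
```

With the constraint in place, an INSERT that forgets the granularity filter fails loudly instead of silently polluting the 5-minute table.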