Using a subquery in the WHERE clause of gapfill in TimescaleDB

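I want to run TimescaleDB's gapfill function with start and end dates that are generated automatically. For example, I want to run gapfill between the minimum and maximum entries in the database.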

Given the dataset playground:

CREATE TABLE public.playground (
    value1 numeric,
    "timestamp" bigint,
    name "char"
);

INSERT INTO playground(name, value1, timestamp)
VALUES ('test', 100, 1599100000000000000);

INSERT INTO playground(name, value1, timestamp)
VALUES ('test', 100, 1599100001000000000);

INSERT INTO playground(name, value1, timestamp)
VALUES ('test', 100, 1599300000000000000);

I tried to fetch the data like this:

SELECT time_bucket_gapfill(300E9::BIGINT, timestamp) as bucket
FROM playground
WHERE 
    timestamp >= (SELECT COALESCE(MIN(timestamp), 0) FROM playground)
    AND
    timestamp < (SELECT COALESCE(MAX(timestamp), 0) FROM playground)
GROUP BY bucket

I get an error:

ERROR: missing time_bucket_gapfill argument: could not infer start from WHERE clause

If I run the query with hard-coded timestamps instead, it works fine. For example:

SELECT time_bucket_gapfill(300E9::BIGINT, timestamp) as bucket
FROM playground
WHERE timestamp >= 0 AND timestamp < 15900000000000000
GROUP BY bucket

Another approach, passing the start and end dates as arguments to the gapfill function, also fails:

WITH bounds AS (
    SELECT COALESCE(MIN(timestamp), 0) AS min, COALESCE(MAX(timestamp), 0) AS max
    FROM playground
    WHERE timestamp >= 0 AND timestamp < 15900000000000000
),
gapfill AS (
    SELECT time_bucket_gapfill(300E9::BIGINT, timestamp, bounds.min, bounds.max) AS bucket
    FROM playground, bounds
    GROUP BY bucket
)
SELECT * FROM gapfill

ERROR: invalid time_bucket_gapfill argument: start must be a simple expression

When start and finish are inferred from the WHERE clause, only direct column references are supported.

See: https://github.com/timescale/timescaledb/issues/1345

So something like the following might work (I don't have access to TimescaleDB to test it), but give it a try:

SELECT
    -- Arguments: bucket width, time column, start, finish
    time_bucket_gapfill(300E9::BIGINT, timestamp, time_range.min, time_range.max) AS bucket
FROM
    (
        SELECT
            COALESCE(MIN(timestamp), 0)   AS min
            , COALESCE(MAX(timestamp), 0) AS max
        FROM
            playground
    ) AS time_range
    , playground
WHERE
    timestamp >= time_range.min
    AND timestamp < time_range.max
GROUP BY
    bucket;

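time_bucket_gapfill only accepts start and finish values that can be evaluated to constants at query planning time. Expressions built from constants and now() therefore work, but an expression that accesses a table does not.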
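To illustrate the kind of expression that is accepted, here is a minimal sketch. It assumes a hypothetical table metrics with a timestamptz column named time and a numeric column named value (neither is part of the question), and it has not been run against a live instance:

```sql
-- Hypothetical table: metrics(time timestamptz, value numeric).
-- start and finish use only constants and now(), so they can be
-- evaluated to constants when the query is planned.
SELECT
    time_bucket_gapfill(INTERVAL '5 minutes', time,
                        now() - INTERVAL '1 day', now()) AS bucket,
    avg(value) AS avg_value
FROM metrics
WHERE time >= now() - INTERVAL '1 day'
  AND time < now()
GROUP BY bucket
ORDER BY bucket;
```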

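As long as time_bucket_gapfill has this limitation, the desired behavior cannot be achieved in a single query. The workaround is to compute the start and finish values separately and then pass them to the query that calls time_bucket_gapfill. This can be done in a stored procedure or in the application.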
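One way the stored-procedure variant of this workaround could look is sketched below. The function name gapfill_playground is made up for illustration, and the sketch is untested. The idea is to read the bounds first and then build the gapfill query dynamically, so that the bounds reach the planner as literals:

```sql
-- Hypothetical helper, not part of TimescaleDB.
CREATE OR REPLACE FUNCTION gapfill_playground(bucket_width bigint)
RETURNS TABLE (bucket bigint)
LANGUAGE plpgsql AS $$
DECLARE
    range_start  bigint;
    range_finish bigint;
BEGIN
    -- Step 1: compute the bounds in a separate query.
    SELECT COALESCE(MIN("timestamp"), 0), COALESCE(MAX("timestamp"), 0)
      INTO range_start, range_finish
      FROM playground;

    -- Step 2: embed the bounds as literals in a dynamic query, so
    -- time_bucket_gapfill sees constants at planning time.
    RETURN QUERY EXECUTE format(
        'SELECT time_bucket_gapfill(%L::bigint, "timestamp", %L::bigint, %L::bigint)
           FROM playground
          WHERE "timestamp" >= %L::bigint AND "timestamp" < %L::bigint
          GROUP BY 1
          ORDER BY 1',
        bucket_width, range_start, range_finish, range_start, range_finish);
END;
$$;

-- Usage: 300E9 nanoseconds (5 minutes) per bucket, as in the question.
SELECT * FROM gapfill_playground(300000000000);
```

The same two steps carry over to application code: run the MIN/MAX query, then send a second query with the two values filled in.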

As a side note, if a PREPARE statement is used in PostgreSQL 12, it is important to explicitly disable generic plans, for the same reason.
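A minimal sketch of the prepared-statement case, assuming the standard PostgreSQL 12 setting plan_cache_mode is the intended way to disable generic plans. The statement name gapfill_range is made up, and the sketch is untested:

```sql
-- Force a custom plan on every EXECUTE, so the parameters are known
-- constants whenever the statement is planned.
SET plan_cache_mode = force_custom_plan;

PREPARE gapfill_range(bigint, bigint) AS
SELECT time_bucket_gapfill(300000000000::bigint, "timestamp", $1, $2) AS bucket
FROM playground
WHERE "timestamp" >= $1 AND "timestamp" < $2
GROUP BY bucket;

-- Bounds taken from the sample rows in the question.
EXECUTE gapfill_range(1599100000000000000, 1599300000000000000);
```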