How to improve the performance of getting the last timestamp in TimescaleDB

SELECT timeseries_id, "timestamp" FROM enhydris_timeseriesrecord WHERE timeseries_id=6661 ORDER BY "timestamp" DESC LIMIT 1;

(The table has about 66m records, of which about 0.5m have timeseries_id=6661.)

This query takes about 1-2 seconds to run, which I think is too much.

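If it were using a simple btree index, it should find what it is looking for after about 30 iterations. As far as I can tell from running EXPLAIN ANALYZE on the query, it does use the index, but it has to do so in every chunk, and there are apparently 1374 chunks.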

How can I make the query faster?

                 Table "public.enhydris_timeseriesrecord"
    Column     |           Type           | Collation | Nullable | Default 
---------------+--------------------------+-----------+----------+---------
 timeseries_id | integer                  |           | not null | 
 timestamp     | timestamp with time zone |           | not null | 
 value         | double precision         |           |          | 
 flags         | character varying(237)   |           | not null | 
Indexes:
    "enhydris_timeseriesrecord_pk" PRIMARY KEY, btree (timeseries_id, "timestamp")
    "enhydris_timeseriesrecord_timeseries_id_idx" btree (timeseries_id)
    "enhydris_timeseriesrecord_timestamp_idx" btree ("timestamp" DESC)
    "enhydris_timeseriesrecord_timestamp_timeseries_id_idx" btree ("timestamp", timeseries_id)
Foreign-key constraints:
    "enhydris_timeseriesrecord_timeseries_fk" FOREIGN KEY (timeseries_id) REFERENCES enhydris_timeseries(id) DEFERRABLE INITIALLY DEFERRED
Triggers:
    ts_insert_blocker BEFORE INSERT ON enhydris_timeseriesrecord FOR EACH ROW EXECUTE PROCEDURE _timescaledb_internal.insert_blocker()
Number of child tables: 1374 (Use \d+ to list them.)

Update: EXPLAIN plan

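The database has to go to the sub-index of every chunk to find the latest timestamp for timeseries_id=x. It uses the indexes correctly (as you can see from the EXPLAIN output): it does an index scan, not a full scan, on the sub-index of each chunk. So it performs more than 1000 index scans. No chunks can be pruned, because the planner cannot know which chunks contain entries for that specific timeseries_id.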
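To see the per-chunk index scans described above, the plan can be inspected directly. The second query is only a sketch of how a time bound could give the planner something to prune on; the one-year window is an assumption, and the query returns no row if the series has no data inside that window.

```sql
-- Show the actual plan; expect one index scan per chunk.
EXPLAIN (ANALYZE, BUFFERS)
SELECT timeseries_id, "timestamp"
FROM enhydris_timeseriesrecord
WHERE timeseries_id = 6661
ORDER BY "timestamp" DESC
LIMIT 1;

-- Assumption: series 6661 has at least one record in the last year.
-- With a bound on "timestamp", chunks entirely older than the bound
-- may be excluded instead of being scanned.
EXPLAIN (ANALYZE, BUFFERS)
SELECT timeseries_id, "timestamp"
FROM enhydris_timeseriesrecord
WHERE timeseries_id = 6661
  AND "timestamp" > now() - INTERVAL '1 year'
ORDER BY "timestamp" DESC
LIMIT 1;
```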

Also, you have 1300 chunks for only 66m records, which is about 50k rows per chunk. That is too few rows per chunk. The Timescale docs give the following recommendation:

The key property of choosing the time interval is that the chunk (including indexes) belonging to the most recent interval (or chunks if using space partitions) fit into memory. As such, we typically recommend setting the interval so that these chunk(s) comprise no more than 25% of main memory.

https://docs.timescale.com/latest/using-timescaledb/hypertables#best-practices

Reducing the number of chunks will significantly improve query performance.
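A minimal sketch of how the chunk count might be checked and the chunk interval widened, using the TimescaleDB functions show_chunks and set_chunk_time_interval. The one-year interval is an illustrative assumption, not a value taken from the docs; pick it according to the memory guideline quoted above.

```sql
-- Count the chunks currently backing the hypertable.
SELECT count(*) FROM show_chunks('enhydris_timeseriesrecord');

-- Use a larger interval for chunks created from now on.
-- Assumption: INTERVAL '1 year' is only an example value.
SELECT set_chunk_time_interval('enhydris_timeseriesrecord', INTERVAL '1 year');
```

Note that set_chunk_time_interval only affects chunks created afterwards. Existing chunks keep their size, so reducing the current chunk count would likely mean copying the data into a new hypertable created with the larger interval.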

Additionally, you might gain even more query performance if you use TimescaleDB compression, which will further reduce the number of chunks that need to be scanned; you could segment by timeseries_id (https://docs.timescale.com/latest/api#compression). Or you could define a continuous aggregate that holds the last item per timeseries_id (https://docs.timescale.com/latest/api#continuous-aggregates).
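The two options could look roughly like the sketch below. It assumes TimescaleDB 2.x syntax (1.x used different names for the policy function and for creating continuous aggregates). The view name timeseries_last_timestamp, the 30-day compression threshold, the one-day bucket, and the policy offsets are all hypothetical choices made for illustration.

```sql
-- Option 1: compression, segmented by timeseries_id.
ALTER TABLE enhydris_timeseriesrecord SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'timeseries_id'
);
-- Assumption: compress chunks older than 30 days.
SELECT add_compression_policy('enhydris_timeseriesrecord', INTERVAL '30 days');

-- Option 2: a continuous aggregate holding the latest timestamp
-- per timeseries_id and day (hypothetical view name).
CREATE MATERIALIZED VIEW timeseries_last_timestamp
WITH (timescaledb.continuous) AS
SELECT timeseries_id,
       time_bucket(INTERVAL '1 day', "timestamp") AS bucket,
       max("timestamp") AS last_timestamp
FROM enhydris_timeseriesrecord
GROUP BY timeseries_id, bucket;

-- Keep the aggregate refreshed (offsets are example values).
SELECT add_continuous_aggregate_policy('timeseries_last_timestamp',
    start_offset      => INTERVAL '3 days',
    end_offset        => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour');

-- The last timestamp is then read from the much smaller aggregate.
SELECT max(last_timestamp)
FROM timeseries_last_timestamp
WHERE timeseries_id = 6661;
```

The aggregate only stores one row per series per day, so the final query touches far fewer rows than the original one, at the cost of the result lagging behind until the next refresh unless real-time aggregation is enabled.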