Set the time property as the M dimension of a PostGIS geometry or as a separate attribute
First, the basic version information:
psql (PostgreSQL) 12.7 (Ubuntu 12.7-1.pgdg18.04+1)
postgis | 3.1.1
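I use a spatial database so that I can quickly query GPS trajectories that fall within a specified time scope and space boundary. The basic information about my data is as follows: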
-- geometry table column (there are 50,000 rows in table mpart5w-wkt)
test=# \d "mpart5w-wkt"
Table "public.mpart5w-wkt"
Column | Type | Collation | Nullable | Default
-----------+----------------------------+-----------+----------+---------
driver_id | character varying | | |
order_id | character varying | | |
geom | geometry(LineStringM,4326) | | |
Indexes:
"mpart5w-wkt_driver_id_idx" btree (driver_id)
"mpart5w-wkt_geom_idx" gist (geom gist_geometry_ops_nd)
-- meta info
test=# select * from geometry_columns where f_table_name='mpart5w-wkt';
f_table_catalog | f_table_schema | f_table_name | f_geometry_column | coord_dimension | srid | type
-----------------+----------------+--------------+-------------------+-----------------+------+-------------
test | public | mpart5w-wkt | geom | 3 | 4326 | LINESTRINGM
(1 row)
-- sample data: LINESTRING M (lon lat timestamp)
test=# select st_astext(geom) from "mpart5w-wkt" limit 1;
LINESTRING M (104.04538 30.70745 1538402919,104.04538 30.70744 1538402928,104.04537 30.70745 1538402938,104.04536 30.70743 1538402948,104.04537 30.7074 1538402958, ...)
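To emphasize: geom has the geometry type (LineStringM, 4326), and a GiST index has already been built on the geom column.
The first question: does the M dimension support multi-dimensional indexing?
I checked the official manual on multi-dimensional indexes. It shows that the 4D operator class can be used to get a 4-dimensional BRIN index: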
CREATE INDEX [indexname] ON [tablename]
USING BRIN ([geome_col] brin_geometry_inclusion_ops_4d);
Likewise, the following syntax gives an n-dimensional GiST index for the geometry type:
CREATE INDEX [indexname] ON [tablename] USING GIST ([geometryfield] gist_geometry_ops_nd);
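So I assume that building a 4D GiST index would help, as long as the ZM dimensions are provided.
Most function descriptions say "This function supports 3d and will not drop the z-index." They say nothing about the m-index, and I don't know much more about it myself.
In short, I have found no solid evidence on whether the M dimension supports multi-dimensional indexing, or on how to use such an index on the M dimension.
Maybe I should create the table like this instead, so that I no longer have to deal with the M dimension?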
create table "part5w-wkt"(
driver_id varchar,
order_id varchar,
geom geometry(Linestring, 4326),
min_time timestamp,
max_time timestamp
);
-- example (both start_time and end_time are parameters)
select * from "part5w-wkt"
where st_intersects(
geom,
ST_MakeEnvelope(104.067, 30.657, 104.083, 30.671, 4326)
) and (
(min_time < start_time and start_time < max_time)
or
(min_time < end_time and end_time < max_time)
)
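One detail of the time filter above is worth checking: it only matches rows where start_time or end_time falls strictly inside (min_time, max_time), so a trajectory that lies entirely within the query window is skipped. A minimal sketch of the usual interval-overlap test follows. The rows and the query window are hypothetical values made up for illustration, and the CTEs stand in for the time columns of "part5w-wkt", so it runs on plain PostgreSQL without PostGIS.

```sql
-- Hypothetical sample rows standing in for the time columns of "part5w-wkt".
WITH trips(order_id, min_time, max_time) AS (
    VALUES
        ('a', timestamp '2018-10-01 14:00:00', timestamp '2018-10-01 14:30:00'),
        ('b', timestamp '2018-10-01 13:00:00', timestamp '2018-10-01 16:00:00'),
        ('c', timestamp '2018-10-01 17:00:00', timestamp '2018-10-01 17:20:00')
),
-- Hypothetical query window; in the real query these are the two parameters.
params(start_time, end_time) AS (
    VALUES (timestamp '2018-10-01 13:30:00', timestamp '2018-10-01 15:00:00')
)
SELECT t.order_id
FROM trips t
CROSS JOIN params p
-- Two intervals overlap when each one starts before the other one ends.
WHERE t.min_time < p.end_time
  AND t.max_time > p.start_time
ORDER BY t.order_id;
-- Rows 'a' and 'b' overlap the window, 'c' does not.
-- The original OR-based predicate would miss 'a', which lies inside the window.
```

With this form, plain btree indexes on min_time and max_time are one option for supporting the time filter. Whether the planner combines them with the spatial index depends on the data, so it is worth confirming with EXPLAIN.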
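The second question: how can a bounding-box query make use of a 2D index?
After all, there is no evidence that using the M dimension with a GiST index is more convenient than using a 2D geometry with separate time attributes. So I decided to test the 2D index first: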
-- test 1
explain analyze
select count(order_id) from "mpart5w-wkt"
where st_intersects(
st_force2d(geom),
ST_MakeEnvelope(104.067, 30.657, 104.083, 30.671, 4326)
);
-- test 2
explain analyze
select count(order_id) from "mpart5w-wkt"
where st_intersects(
st_force2d(geom),
st_geometryfromtext(
'polygon((104.067 30.671, 104.083 30.671, 104.083 30.657, 104.067 30.657, 104.067 30.671))',
4326
)
);
-- test 3
explain analyze
select count(order_id) from "mpart5w-wkt"
where st_intersects(
st_force2d(geom),
'SRID=4326;polygon((104.067 30.671, 104.083 30.671, 104.083 30.657, 104.067 30.657, 104.067 30.671))'::geometry
);
-- almost the same result
Finalize Aggregate (cost=547292.05..547292.06 rows=1 width=8) (actual time=817.698..824.482 rows=1 loops=1)
-> Gather (cost=547291.84..547292.05 rows=2 width=8) (actual time=817.380..824.451 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=546291.84..546291.85 rows=1 width=8) (actual time=804.706..804.707 rows=1 loops=3)
-> Parallel Seq Scan on "mpart5w-wkt" (cost=0.00..546291.83 rows=2 width=19) (actual time=97.585..803.734 rows=4394 loops=3)
Filter: st_intersects(st_force2d(geom), '0103000020E610000001000000050000003F355EBA49045A40D578E92631A83E403F355EBA49045A40B29DEFA7C6AB3E405A643BDF4F055A40B29DEFA7C6AB3E405A643BDF4F055A40D578E92631A83E403F355EBA49045A40D578E92631A83E40'::geometry)
Rows Removed by Filter: 12272
Planning Time: 0.268 ms
JIT:
Functions: 17
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 2.771 ms, Inlining 99.859 ms, Optimization 117.378 ms, Emission 74.123 ms, Total 294.132 ms
Execution Time: 825.745 ms
(14 rows)
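However, all three tests give almost the same result: no matter how many variations I tried, the spatial index was never used. Removing the st_force2d() call is even less efficient.
Adding the extra time constraint to the same query makes it slower still.
By the way, if I often query by lon-lat bounding box and also need to compute distances, which SRID should I use to store the GPS trajectory geometries: 4326 or 3857?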
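You can tell the index to store the geom records already forced to two dimensions by using the function ST_Force2D in the index creation, so that the database does not have to do it at query time: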
CREATE INDEX idx_part5w_wkt_geom ON "part5w-wkt"
USING gist (ST_Force2D(geom) gist_geometry_ops_nd);
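Simply omitting ST_Force2D from CREATE INDEX has a similar effect, as long as you also don't use it later in the WHERE clause. Long story short: the way columns are indexed and the way they are queried must match, otherwise the index might not be used.
Demo: db<>fiddle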
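As a hedged illustration of that matching rule, the sketch below pairs each query with an index built on the same expression. The index name idx_mpart5w_force2d and the one-hour time window are hypothetical. The second pair assumes the documented behaviour of the &&& operator, which compares n-D bounding boxes and can use an index created with gist_geometry_ops_nd, such as the one already present on "mpart5w-wkt". Treating the epoch-second M values as the time axis of the query box is an assumption based on the sample data in the question.

```sql
-- (1) Expression index: the query has to repeat ST_Force2D(geom) exactly.
CREATE INDEX idx_mpart5w_force2d                 -- hypothetical index name
    ON "mpart5w-wkt" USING gist (ST_Force2D(geom));

EXPLAIN
SELECT count(order_id)
FROM "mpart5w-wkt"
WHERE ST_Intersects(
        ST_Force2D(geom),
        ST_MakeEnvelope(104.067, 30.657, 104.083, 30.671, 4326));

-- (2) n-D index on the raw column, queried with the n-D overlap operator.
-- The diagonal line below only serves to produce an X/Y/M bounding box;
-- the M values are a hypothetical window in epoch seconds.
EXPLAIN
SELECT count(order_id)
FROM "mpart5w-wkt"
WHERE geom &&& ST_SetSRID(
        ST_MakeLine(
            ST_MakePointM(104.067, 30.657, 1538402400),
            ST_MakePointM(104.083, 30.671, 1538406000)),
        4326);
```

Note that &&& is a bounding-box test only, so the second query can return trajectories whose box overlaps the window even though the line itself never enters it. An exact check would need an additional filter on top. In both cases the EXPLAIN output is the way to confirm whether an index scan is actually chosen.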