How to group by SDO_GEOMETRY data type

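I have a table with 3 columns (p_id, user_id and shape, the ones selected in the query below).

I want to group by the SDO_GEOMETRY column to remove duplicate shapes.

However, every time I do this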

SELECT
  p_id, user_id, shape
FROM table1
GROUP BY shape

I get the error

ORA-22901: cannot compare VARRAY or LOB attributes of an object type

Right, that won't work. Here's a short walkthrough.

Table contents:

SQL> select id, geom from test;

ID   GEOM(SDO_GTYPE, SDO_SRID, SDO_POINT(X, Y, Z), SDO_ELEM_INFO, SDO_ORDINATES)
---- -------------------------------------------------------------------------------------
4026 SDO_GEOMETRY(2001, 8307, SDO_POINT_TYPE(16,9176596, 46,2173069, NULL), NULL, NULL)
4027 SDO_GEOMETRY(2001, 8307, SDO_POINT_TYPE(16,9184437, 46,2219955, NULL), NULL, NULL)
4028 SDO_GEOMETRY(2001, 8307, SDO_POINT_TYPE(16,9826714, 46,2176214, NULL), NULL, NULL)
5000 SDO_GEOMETRY(2001, 8307, SDO_POINT_TYPE(16,9176596, 46,2173069, NULL), NULL, NULL)

SQL>

IDs 4026 and 5000 have the same geometry so, as you said, you want to get rid of one of them.

None of the following works:

DISTINCT:

SQL> select distinct id, geom from test;
select distinct id, geom from test
                    *
ERROR at line 1:
ORA-22901: cannot compare VARRAY or LOB attributes of an object type

Your attempt:

SQL> select id, geom from test group by geom;
select id, geom from test group by geom
       *
ERROR at line 1:
ORA-00979: not a GROUP BY expression

Of course, ID is missing from the GROUP BY clause, so let's add it:

SQL> select id, geom from test group by id, geom;
select id, geom from test group by id, geom
                                       *
ERROR at line 1:
ORA-22901: cannot compare VARRAY or LOB attributes of an object type

So, what now? Use a self-join with SDO_RELATE to find the "duplicates":

SQL> select a.id, a.geom
  2  from test a join test b
  3    on sdo_relate(a.geom, b.geom, 'mask=equal') = 'TRUE'
  4   and a.id < b.id;

ID   GEOM(SDO_GTYPE, SDO_SRID, SDO_POINT(X, Y, Z), SDO_ELEM_INFO, SDO_ORDINATES)
---- -------------------------------------------------------------------------------------
4026 SDO_GEOMETRY(2001, 8307, SDO_POINT_TYPE(16,9176596, 46,2173069, NULL), NULL, NULL)

SQL>

Right; as we already knew, 4026 and 5000 are the same. Because of line 4 (a.id < b.id), 4026 is the one that gets returned.
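One thing to keep in mind: SDO_RELATE is a spatial operator, and Oracle requires the column passed as its first argument to be spatially indexed. The walkthrough doesn't show that setup, so here is a minimal sketch of what it could look like for the TEST table. The index name test_geom_sidx, the dimension bounds and the 0.05 tolerance are assumptions made for illustration, not values taken from the answer above.

```sql
-- Assumed setup for the TEST table (names and values are illustrative).
-- Register the geometry column; skip this if the metadata row already exists.
INSERT INTO user_sdo_geom_metadata (table_name, column_name, diminfo, srid)
VALUES ('TEST', 'GEOM',
        SDO_DIM_ARRAY(
          SDO_DIM_ELEMENT('Longitude', -180, 180, 0.05),   -- assumed bounds and tolerance
          SDO_DIM_ELEMENT('Latitude',   -90,  90, 0.05)),
        8307);                                             -- SRID used by the sample rows
COMMIT;

-- Spatial index on the column used as the first argument of SDO_RELATE.
CREATE INDEX test_geom_sidx ON test (geom)
  INDEXTYPE IS MDSYS.SPATIAL_INDEX;
```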

Now it is a simple task to use the above query as a subquery (or a CTE, or whatever you find appropriate) to fetch the distinct data set:

SQL> with duplicates as
  2    (select a.id, a.geom
  3     from test a join test b
  4       on sdo_relate(a.geom, b.geom, 'mask=equal') = 'TRUE'
  5      and a.id < b.id
  6    )
  7  select t.id, t.geom
  8  from test t
  9  where not exists (select null
 10                    from duplicates d
 11                    where d.id = t.id
 12                   );

ID   GEOM(SDO_GTYPE, SDO_SRID, SDO_POINT(X, Y, Z), SDO_ELEM_INFO, SDO_ORDINATES)
---- -------------------------------------------------------------------------------------
4027 SDO_GEOMETRY(2001, 8307, SDO_POINT_TYPE(16,9184437, 46,2219955, NULL), NULL, NULL)
4028 SDO_GEOMETRY(2001, 8307, SDO_POINT_TYPE(16,9826714, 46,2176214, NULL), NULL, NULL)
5000 SDO_GEOMETRY(2001, 8307, SDO_POINT_TYPE(16,9176596, 46,2173069, NULL), NULL, NULL)

SQL>

Or:

SQL> select t.id, t.geom
  2  from test t
  3  where t.id not in (select a.id
  4                     from test a join test b
  5                       on sdo_relate(a.geom, b.geom, 'mask=equal') = 'TRUE'
  6                      and a.id < b.id
  7                    );

ID   GEOM(SDO_GTYPE, SDO_SRID, SDO_POINT(X, Y, Z), SDO_ELEM_INFO, SDO_ORDINATES)
---- -------------------------------------------------------------------------------------
4027 SDO_GEOMETRY(2001, 8307, SDO_POINT_TYPE(16,9184437, 46,2219955, NULL), NULL, NULL)
4028 SDO_GEOMETRY(2001, 8307, SDO_POINT_TYPE(16,9826714, 46,2176214, NULL), NULL, NULL)
5000 SDO_GEOMETRY(2001, 8307, SDO_POINT_TYPE(16,9176596, 46,2173069, NULL), NULL, NULL)

SQL>
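
Note that both queries keep the row with the highest ID in each group of equal geometries (5000 here), because the a.id < b.id condition marks every row that has an equal geometry with a larger ID. If you'd rather keep the lowest ID, flipping the comparison should do it. A sketch against the same TEST table, not run against your data:

```sql
-- Same idea as above, but a.id > b.id marks every row that has an
-- equal geometry with a smaller ID, so the lowest ID of each group stays.
select t.id, t.geom
from test t
where not exists (select null
                  from test a join test b
                    on sdo_relate(a.geom, b.geom, 'mask=equal') = 'TRUE'
                   and a.id > b.id
                  where a.id = t.id
                 );
```

With the sample rows, that should leave 4026, 4027 and 4028.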

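Nice walkthrough by Littlefoot, +1!
If the table is large, and/or you have to group often, and/or there is any other reason you want to avoid the load of spatial queries, you might try adding a VARCHAR2(128) column to your table and filling it with a 'hash' of the row's geometry: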

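CREATE OR REPLACE FUNCTION hash_noct( ingeom IN mdsys.sdo_geometry )
  RETURN VARCHAR2 DETERMINISTIC
IS
  v_clob  CLOB;
  TYPE v_tbl IS TABLE OF VARCHAR2(256);
  v_col   v_tbl;
  oCTID   VARCHAR2(128);   -- stays NULL when ingeom is NULL
BEGIN
  IF ingeom IS NOT NULL THEN
    dbms_lob.createtemporary(v_clob, TRUE);

    -- Collect the distinct vertices as x||y strings in a fixed order
    -- (y descending, then x ascending), so the result does not depend on
    -- the order in which the vertices are stored.
    -- Note: x and y are concatenated without a separator and converted to
    -- text implicitly, so the string depends on the session NLS settings.
    SELECT t_x || t_y BULK COLLECT INTO v_col
      FROM ( SELECT DISTINCT t.x t_x, t.y t_y
               FROM TABLE(sdo_util.getvertices(ingeom)) t
              ORDER BY 2 DESC, 1 ASC );

    -- Append every vertex string to the temporary CLOB.
    FOR i IN 1 .. v_col.COUNT
    LOOP
      dbms_lob.writeappend(v_clob, LENGTH(v_col(i)), v_col(i));
    END LOOP;

    -- Requires: grant execute on sys.dbms_crypto to <your_schema>;
    -- hash_sh1 (= 3) selects SHA-1, which gives 40 hex characters.
    oCTID := RAWTOHEX( sys.dbms_crypto.hash(v_clob, sys.dbms_crypto.hash_sh1) );
    dbms_lob.freetemporary(v_clob);
  END IF;
  RETURN oCTID;
END hash_noct;
/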

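Then, in a BEFORE INSERT OR UPDATE trigger, just add :NEW.<your_hash_column_name> := hash_noct(:NEW.<your_geom_column_name>);
Every time a geom is inserted or an existing one is updated, the hash column will contain a string like 7F5EF344F7684DF45EB042500C8D234FD4FD4F5F, which you can use for grouping, sorting or whatever.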
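
To make that concrete, here is a rough sketch using the TEST table from the other answer. The column name geom_hash and the trigger name test_geom_hash_trg are made up for the example; adapt them to your own schema.

```sql
-- Hypothetical hash column (name is an assumption).
ALTER TABLE test ADD (geom_hash VARCHAR2(128));

-- Keep the hash in sync whenever a geometry is inserted or changed.
CREATE OR REPLACE TRIGGER test_geom_hash_trg
  BEFORE INSERT OR UPDATE OF geom ON test
  FOR EACH ROW
BEGIN
  :NEW.geom_hash := hash_noct(:NEW.geom);
END;
/

-- One-off backfill for rows that existed before the trigger.
UPDATE test SET geom_hash = hash_noct(geom);
COMMIT;

-- Grouping now works, because the hash is a plain VARCHAR2.
SELECT geom_hash, MIN(id) AS keep_id, COUNT(*) AS cnt
FROM test
GROUP BY geom_hash;

-- One row per distinct hash, geometry included (keeps the lowest ID).
-- Rows with a NULL geometry share a NULL hash and collapse into one group.
SELECT t.id, t.geom
FROM test t
WHERE t.id IN (SELECT MIN(id) FROM test GROUP BY geom_hash);
```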
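PS: The function does not need a spatial index and, as is, it uses the full resolution of your stored coordinates. You should keep in mind the tolerance at which you want to compare geometries.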
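
One possible way to handle tolerance, offered only as a sketch: round the coordinates before they are concatenated, so that vertices differing only beyond the chosen precision produce the same string. The 6 decimal places below are an arbitrary assumption; for longitude/latitude data such as SRID 8307 that is roughly 0.1 m at the equator. The line geometry is a made-up sample value.

```sql
-- Illustration only: vertices of a sample line, rounded to 6 decimals.
-- Inside hash_noct, the same ROUND calls would go around t.x and t.y
-- in the inner SELECT.
SELECT t.id           AS vertex_no,
       ROUND(t.x, 6)  AS x,
       ROUND(t.y, 6)  AS y
FROM TABLE(sdo_util.getvertices(
       SDO_GEOMETRY(2002, 8307, NULL,
                    SDO_ELEM_INFO_ARRAY(1, 2, 1),
                    SDO_ORDINATE_ARRAY(16.91765961, 46.21730694,
                                       16.91844372, 46.22199548)))) t;
```

Rounding is not the same as a true tolerance check: two coordinates that are very close can still fall on different sides of a rounding boundary and end up with different hashes.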