Avoid full table scan with SELECT using PL/SQL table

Test data

CREATE TABLE parent AS ( SELECT ROWNUM AS id, 'XXX' AS dummy FROM dual CONNECT BY ROWNUM <= 1000 );
CREATE UNIQUE INDEX idx_parent ON parent(id);

CREATE TABLE child AS ( SELECT CEIL(ROWNUM/5) AS id, 'XXX' AS dummy FROM dual CONNECT BY ROWNUM <= 5000 );
CREATE INDEX idx_child ON child(id);

EXEC dbms_stats.gather_table_stats(USER, 'parent');
EXEC dbms_stats.gather_table_stats(USER, 'child');

Problem

Even with the CARDINALITY hint, the following query does a full table scan on child (on both 12.1 and 19.0).
Of course, the real query needs some additional data from child.

SELECT child.id
FROM parent
JOIN
(
    SELECT child.id
    FROM child
    GROUP BY child.id
) child ON ( child.id = parent.id )
WHERE parent.id IN ( SELECT /*+ CARDINALITY( tab 1 ) */ COLUMN_VALUE FROM TABLE (sys.odcinumberlist(1) ) tab );
-----------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |            |     1 |    19 |    35   (3)| 00:00:01 |
|*  1 |  HASH JOIN RIGHT SEMI                  |            |     1 |    19 |    35   (3)| 00:00:01 |
|   2 |   COLLECTION ITERATOR CONSTRUCTOR FETCH|            |     1 |     2 |    29   (0)| 00:00:01 |
|   3 |   NESTED LOOPS                         |            |  1000 | 17000 |     6  (17)| 00:00:01 |
|   4 |    VIEW                                |            |  1000 | 13000 |     6  (17)| 00:00:01 |
|   5 |     HASH GROUP BY                      |            |  1000 |  4000 |     6  (17)| 00:00:01 |
|   6 |      TABLE ACCESS FULL                 | CHILD      |  5000 | 20000 |     5   (0)| 00:00:01 |
|*  7 |    INDEX UNIQUE SCAN                   | IDX_PARENT |     1 |     4 |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------

If I replace the WHERE clause with this, both indexes are used as expected:

WHERE parent.id IN ( 1 );
----------------------------------------------------------------------------------
| Id  | Operation           | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |            |     5 |    35 |     2   (0)| 00:00:01 |
|   1 |  NESTED LOOPS       |            |     5 |    35 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN | IDX_PARENT |     1 |     4 |     1   (0)| 00:00:01 |
|   3 |   VIEW              |            |     5 |    15 |     1   (0)| 00:00:01 |
|   4 |    SORT GROUP BY    |            |     5 |    20 |     1   (0)| 00:00:01 |
|*  5 |     INDEX RANGE SCAN| IDX_CHILD  |     5 |    20 |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------

It also works when I remove the GROUP BY.


Any idea how to solve this?

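The problem is that the ID column can contain NULL values. If the column is defined as NOT NULL, the index is used.

A B-tree index does not store entries whose indexed columns are all NULL, but the GROUP BY has to return the NULL group as well, so the index alone cannot answer the query.

When you restrict parent.id to 1, as in your example, the database can use the index for that specific value.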
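
To try the NOT NULL suggestion, something along these lines might be used. This is only a sketch against the test tables above; the answer does not say which ID column is meant, so applying the constraint to child.id is an assumption.

```sql
-- Check whether the id columns are currently declared nullable.
SELECT table_name, column_name, nullable
FROM user_tab_columns
WHERE table_name IN ('PARENT', 'CHILD')
  AND column_name = 'ID';

-- Assumption: child.id is the column that matters for the GROUP BY view.
-- This fails if the column already holds NULL values.
ALTER TABLE child MODIFY ( id NOT NULL );
```

After the change, re-check the execution plan of the original query to see whether IDX_CHILD is now used.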

You can use the MERGE hint to get the desired behavior:

SELECT child.id
FROM parent
JOIN
(
    SELECT /*+ MERGE */ child.id  ---<<<<< merge the subquery
    FROM child
    GROUP BY child.id
) child ON ( child.id = parent.id )
WHERE parent.id IN ( SELECT /*+ CARDINALITY( tab 1 ) */ COLUMN_VALUE FROM TABLE (sys.odcinumberlist(1) ) tab );

The execution plan is as follows:

--------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |            |     5 |    70 |    32   (7)| 00:00:01 |
|   1 |  HASH GROUP BY                            |            |     5 |    70 |    32   (7)| 00:00:01 |
|   2 |   NESTED LOOPS                            |            |     5 |    70 |    31   (4)| 00:00:01 |
|   3 |    NESTED LOOPS                           |            |     1 |    10 |    30   (4)| 00:00:01 |
|   4 |     SORT UNIQUE                           |            |     1 |     2 |    29   (0)| 00:00:01 |
|   5 |      COLLECTION ITERATOR CONSTRUCTOR FETCH|            |     1 |     2 |    29   (0)| 00:00:01 |
|*  6 |     INDEX UNIQUE SCAN                     | IDX_PARENT |     1 |     8 |     0   (0)| 00:00:01 |
|*  7 |    INDEX RANGE SCAN                       | IDX_CHILD  |     5 |    20 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 
   6 - access("PARENT"."ID"=VALUE(KOKBF$))
   7 - access("CHILD"."ID"="PARENT"."ID")
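
A plan like the one above, including the predicate section, can typically be produced with EXPLAIN PLAN and DBMS_XPLAN.DISPLAY. A minimal sketch, assuming the test tables from the question and a session with a plan table available (the default in current Oracle versions):

```sql
-- Store the optimizer's plan for the hinted query in the plan table.
EXPLAIN PLAN FOR
SELECT child.id
FROM parent
JOIN
(
    SELECT /*+ MERGE */ child.id
    FROM child
    GROUP BY child.id
) child ON ( child.id = parent.id )
WHERE parent.id IN ( SELECT /*+ CARDINALITY( tab 1 ) */ COLUMN_VALUE FROM TABLE (sys.odcinumberlist(1) ) tab );

-- Display the most recently explained statement (default format is TYPICAL).
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Costs and row estimates may differ from the output shown here depending on version and statistics.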

My guess is that your child table is so small that the CBO does not consider this plan the best one; but there may be other reasons as well.

Additional notes

There is a substantial difference between the predicates

parent.id IN ( subquery )   and

parent.id IN ( 1 )

In the latter case Oracle can simply push the predicate into the GROUP BY subquery (access("CHILD"."ID"=1)); see also the PUSH_PRED hint.
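
To illustrate what pushing the predicate means here, the literal case behaves roughly as if the query had been written by hand as follows. This is an illustrative rewrite, not the optimizer's actual internal form:

```sql
-- Hand-written approximation of the transformed query for parent.id IN ( 1 ).
SELECT child.id
FROM parent
JOIN
(
    SELECT child.id
    FROM child
    WHERE child.id = 1      -- predicate applied before the GROUP BY
    GROUP BY child.id
) child ON ( child.id = parent.id )
WHERE parent.id = 1;
```

With the filter inside the view, only the five matching child rows have to be read, which is why an INDEX RANGE SCAN on IDX_CHILD becomes attractive.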

In any case, if you 1) know that the subquery returns only one row and 2) help a little with the predicate, the Oracle CBO can get it right without any hint.

The query below is changed slightly according to 1) and 2); see the comments.

SELECT child.id
FROM parent
JOIN
(
    SELECT child.id
    FROM child
    GROUP BY child.id
) child ON ( child.id = parent.id )
WHERE child.id  /* 2) match with *child.id* to help Oracle to unnest */
 =  /* 1) use an equality predicate as there is only one row */
( SELECT /*+ CARDINALITY( tab 1 ) */ COLUMN_VALUE FROM TABLE (sys.odcinumberlist(1) ) tab );

Plan

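--------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------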
|   0 | SELECT STATEMENT                          |            |     5 |    85 |     3   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                             |            |     5 |    85 |     3   (0)| 00:00:01 |
|   2 |   VIEW                                    |            |     5 |    65 |     3   (0)| 00:00:01 |
|   3 |    HASH GROUP BY                          |            |     5 |    25 |     3   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                      | IDX_CHILD  |     5 |    25 |     3   (0)| 00:00:01 |
|   5 |      COLLECTION ITERATOR CONSTRUCTOR FETCH|            |     1 |     2 |    29   (0)| 00:00:01 |
|*  6 |   INDEX UNIQUE SCAN                       | IDX_PARENT |     1 |     4 |     0   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("CHILD"."ID"= (SELECT /*+ OPT_ESTIMATE (TABLE "TAB"@"SEL" ROWS=1.000000 ) */ 
              VALUE(KOKBF$) FROM TABLE() "KOKBF$"))
   6 - access("CHILD"."ID"="PARENT"."ID")
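
Note that condition 1) is essential for the equality form. As a sketch, using a two-element collection purely as a hypothetical counterexample, the same query should fail at run time instead of returning rows:

```sql
-- Hypothetical counterexample: the collection now holds two values,
-- so the scalar subquery returns more than one row.
SELECT child.id
FROM parent
JOIN
(
    SELECT child.id
    FROM child
    GROUP BY child.id
) child ON ( child.id = parent.id )
WHERE child.id =
( SELECT /*+ CARDINALITY( tab 1 ) */ COLUMN_VALUE FROM TABLE (sys.odcinumberlist(1, 2) ) tab );
-- Expected to raise ORA-01427: single-row subquery returns more than one row
```

If the collection can hold several values, the IN form combined with the MERGE hint shown earlier is the variant that still applies.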