Avoid full table scan with SELECT using PL/SQL table
Test data
CREATE TABLE parent AS ( SELECT ROWNUM AS id, 'XXX' AS dummy FROM dual CONNECT BY ROWNUM <= 1000 );
CREATE UNIQUE INDEX idx_parent ON parent(id);
CREATE TABLE child AS ( SELECT CEIL(ROWNUM/5) AS id, 'XXX' AS dummy FROM dual CONNECT BY ROWNUM <= 5000 );
CREATE INDEX idx_child ON child(id);
EXEC dbms_stats.gather_table_stats(USER, 'parent');
EXEC dbms_stats.gather_table_stats(USER, 'child');
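Problem
The following query does a full table scan on child, even though the CARDINALITY hint is taken into account (tested on both 12.1 and 19.0).
Of course, the real query needs some additional data from child.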
SELECT child.id
FROM parent
JOIN
(
SELECT child.id
FROM child
GROUP BY child.id
) child ON ( child.id = parent.id )
WHERE parent.id IN ( SELECT /*+ CARDINALITY( tab 1 ) */ COLUMN_VALUE FROM TABLE (sys.odcinumberlist(1) ) tab );
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 19 | 35 (3)| 00:00:01 |
|* 1 | HASH JOIN RIGHT SEMI | | 1 | 19 | 35 (3)| 00:00:01 |
| 2 | COLLECTION ITERATOR CONSTRUCTOR FETCH| | 1 | 2 | 29 (0)| 00:00:01 |
| 3 | NESTED LOOPS | | 1000 | 17000 | 6 (17)| 00:00:01 |
| 4 | VIEW | | 1000 | 13000 | 6 (17)| 00:00:01 |
| 5 | HASH GROUP BY | | 1000 | 4000 | 6 (17)| 00:00:01 |
| 6 | TABLE ACCESS FULL | CHILD | 5000 | 20000 | 5 (0)| 00:00:01 |
|* 7 | INDEX UNIQUE SCAN | IDX_PARENT | 1 | 4 | 0 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------
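The post does not say how the plans were captured. A minimal sketch for reproducing the plan above, assuming EXPLAIN PLAN together with DBMS_XPLAN.DISPLAY and the test tables created earlier:

```sql
-- Assumption: the plan is captured with EXPLAIN PLAN; the original post
-- does not state which tool produced its output.
EXPLAIN PLAN FOR
SELECT child.id
FROM parent
JOIN
(
  SELECT child.id
  FROM child
  GROUP BY child.id
) child ON ( child.id = parent.id )
WHERE parent.id IN ( SELECT /*+ CARDINALITY( tab 1 ) */ COLUMN_VALUE FROM TABLE (sys.odcinumberlist(1) ) tab );

-- Show the plan that was just explained, including the predicate section.
SELECT * FROM TABLE(dbms_xplan.display);
```

The exact costs and row estimates may differ between versions and statistics, so only the shape of the plan (full scan versus index range scan on child) is worth comparing.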
If I replace the WHERE clause with this one, both indexes are used as expected:
WHERE parent.id IN ( 1 );
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 35 | 2 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 5 | 35 | 2 (0)| 00:00:01 |
|* 2 | INDEX UNIQUE SCAN | IDX_PARENT | 1 | 4 | 1 (0)| 00:00:01 |
| 3 | VIEW | | 5 | 15 | 1 (0)| 00:00:01 |
| 4 | SORT GROUP BY | | 5 | 20 | 1 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN| IDX_CHILD | 5 | 20 | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------
It also works when I remove the GROUP BY.
Any idea how to solve this?
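The problem is that the ID column can contain NULL values. If you define the column as NOT NULL, the index is used.
A single-column B-tree index does not store entries for NULL values, but the GROUP BY has to include those rows, so the index alone cannot answer it.
When you restrict parent.id to 1 as in your example, the database can use the index to look up that concrete value.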
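A minimal sketch of how this suggestion could be tried against the test data, assuming child.id really contains no NULLs (true for the generated rows, but something to verify on real data before adding the constraint):

```sql
-- Assumption: no existing row has a NULL id; otherwise this statement fails.
ALTER TABLE child MODIFY ( id NOT NULL );

-- Refresh statistics so the optimizer sees the current state of the table.
EXEC dbms_stats.gather_table_stats(USER, 'child');

-- Re-check the plan of the original query.
EXPLAIN PLAN FOR
SELECT child.id
FROM parent
JOIN
(
  SELECT child.id
  FROM child
  GROUP BY child.id
) child ON ( child.id = parent.id )
WHERE parent.id IN ( SELECT /*+ CARDINALITY( tab 1 ) */ COLUMN_VALUE FROM TABLE (sys.odcinumberlist(1) ) tab );

SELECT * FROM TABLE(dbms_xplan.display);
```

Whether the optimizer then prefers IDX_CHILD still depends on its cost estimates, so the resulting plan should be checked rather than assumed.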
You can get the desired behavior with the MERGE hint:
SELECT child.id
FROM parent
JOIN
(
SELECT /*+ MERGE */ child.id ---<<<<< merge the subquery
FROM child
GROUP BY child.id
) child ON ( child.id = parent.id )
WHERE parent.id IN ( SELECT /*+ CARDINALITY( tab 1 ) */ COLUMN_VALUE FROM TABLE (sys.odcinumberlist(1) ) tab );
The execution plan is as follows:
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 70 | 32 (7)| 00:00:01 |
| 1 | HASH GROUP BY | | 5 | 70 | 32 (7)| 00:00:01 |
| 2 | NESTED LOOPS | | 5 | 70 | 31 (4)| 00:00:01 |
| 3 | NESTED LOOPS | | 1 | 10 | 30 (4)| 00:00:01 |
| 4 | SORT UNIQUE | | 1 | 2 | 29 (0)| 00:00:01 |
| 5 | COLLECTION ITERATOR CONSTRUCTOR FETCH| | 1 | 2 | 29 (0)| 00:00:01 |
|* 6 | INDEX UNIQUE SCAN | IDX_PARENT | 1 | 8 | 0 (0)| 00:00:01 |
|* 7 | INDEX RANGE SCAN | IDX_CHILD | 5 | 20 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - access("PARENT"."ID"=VALUE(KOKBF$))
7 - access("CHILD"."ID"="PARENT"."ID")
I guess your child table is so small that the CBO does not consider this plan the best one, but there could be other reasons as well.
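Additional remarks
There is a big difference between the predicates
parent.id IN ( subquery )
and
parent.id IN ( 1 )
In the latter case Oracle can simply push the predicate into the GROUP BY subquery (access("CHILD"."ID"=1)); see the PUSH_PRED hint.
But in any case, if you 1) know that the subquery returns only one row and 2) help a little with the predicate, the Oracle CBO gets it right without any hint.
Here is the query, slightly changed according to 1) and 2); see the comments: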
SELECT child.id
FROM parent
JOIN
(
SELECT child.id
FROM child
GROUP BY child.id
) child ON ( child.id = parent.id )
WHERE child.id /* 2) match with *child.id* to help Oracle to unnest */
= /* 1) use equal predicate as there is only one row */
( SELECT /*+ CARDINALITY( tab 1 ) */ COLUMN_VALUE FROM TABLE (sys.odcinumberlist(1) ) tab );
Plan:
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 85 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS | | 5 | 85 | 3 (0)| 00:00:01 |
| 2 | VIEW | | 5 | 65 | 3 (0)| 00:00:01 |
| 3 | HASH GROUP BY | | 5 | 25 | 3 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | IDX_CHILD | 5 | 25 | 3 (0)| 00:00:01 |
| 5 | COLLECTION ITERATOR CONSTRUCTOR FETCH| | 1 | 2 | 29 (0)| 00:00:01 |
|* 6 | INDEX UNIQUE SCAN | IDX_PARENT | 1 | 4 | 0 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("CHILD"."ID"= (SELECT /*+ OPT_ESTIMATE (TABLE "TAB"@"SEL" ROWS=1.000000 ) */
VALUE(KOKBF$) FROM TABLE() "KOKBF$"))
6 - access("CHILD"."ID"="PARENT"."ID")
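The rewrite above relies on assumption 1), that the collection holds exactly one element. A short sketch of what should happen when that assumption is broken; the two-element list is a hypothetical input, not part of the original question:

```sql
-- Hypothetical input: the collection now has two elements (1 and 2).
-- A scalar subquery compared with "=" may return at most one row, so this
-- statement should fail at run time with
-- ORA-01427: single-row subquery returns more than one row.
SELECT child.id
FROM parent
JOIN
(
  SELECT child.id
  FROM child
  GROUP BY child.id
) child ON ( child.id = parent.id )
WHERE child.id
  = ( SELECT COLUMN_VALUE FROM TABLE (sys.odcinumberlist(1, 2) ) tab );
```

If the list can hold several values, the IN form of the predicate is still required, and the MERGE hint shown earlier is the variant in this thread that keeps it.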