Improving the query processing performance of a SQL query with SUM and JOIN

SELECT SUM(C_QUANTITY)
FROM CARS JOIN ORDERS
ON C_ORDERKEY = O_ORDERKEY;

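I have this query, which computes the sum of C_QUANTITY over the joined tables. According to EXPLAIN PLAN, the query cost is 12147. The objective is to improve this SELECT statement by writing a more efficient SELECT statement that returns the same result.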

I have tried:

SELECT SUM(C_QUANTITY)
FROM CARS;

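It returned the same result, but the query cost is exactly the same as for the original. I expected that removing the JOIN would improve the SELECT query.

Is there any way to reduce the cost by modifying only the SELECT statement?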

Edit:

Plan of the original query

PLAN_TABLE_OUTPUT                                                               
--------------------------------------------------------------------------------
Plan hash value: 2287326370                                                     
                                                                                
------------------------------------------------------------------------------- 
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     | 
------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT   |          |     1 |     3 | 12147   (1)| 00:00:01 | 
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          | 
|   2 |   TABLE ACCESS FULL|   CARS   |  1800K|  5273K| 12147   (1)| 00:00:01 | 
------------------------------------------------------------------------------- 

9 rows selected. 

Plan of the second query

PLAN_TABLE_OUTPUT                                                               
--------------------------------------------------------------------------------
Plan hash value: 2287326370                                                     
                                                                                
------------------------------------------------------------------------------- 
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     | 
------------------------------------------------------------------------------- 
|   0 | SELECT STATEMENT   |          |     1 |     3 | 12147   (1)| 00:00:01 | 
|   1 |  SORT AGGREGATE    |          |     1 |     3 |            |          | 
|   2 |   TABLE ACCESS FULL|   CARS   |  1800K|  5273K| 12147   (1)| 00:00:01 | 
------------------------------------------------------------------------------- 

9 rows selected. 
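
For anyone reproducing this, plan output like the two listings above can be generated roughly as follows. This is a minimal sketch that assumes the default plan table is available in the session; the statement being explained is the original query from the question.

```sql
-- Store the plan for the original query in the plan table.
EXPLAIN PLAN FOR
SELECT SUM(C_QUANTITY)
FROM CARS JOIN ORDERS
ON C_ORDERKEY = O_ORDERKEY;

-- Print the most recently explained plan.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Running the same two steps for each candidate rewrite makes it easy to compare the Cost column side by side.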

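I would suggest adding the following index:

CREATE INDEX idx ON CARS (C_ORDERKEY, C_QUANTITY);

Presumably the CARS table is much larger than the ORDERS table. If so, Oracle will likely satisfy the query by scanning ORDERS and can then use the above index for the lookups into CARS. I added the C_QUANTITY column to the end of the index so that it also covers the sum in the select clause.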

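If the two tables cars and orders have no relationship defined between them (no constraints), you get an ordinary join execution plan like the following: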

--------------------------------------------------------------------------------------
| Id  | Operation           | Name   | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |        |     1 |    15 |       |   297   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE     |        |     1 |    15 |       |            |          |
|*  2 |   HASH JOIN         |        |   100K|  1464K|  1664K|   297   (2)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| ORDERS |   100K|   488K|       |    47   (3)| 00:00:01 |
|   4 |    TABLE ACCESS FULL| CARS   |   100K|   976K|       |    62   (2)| 00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("C_ORDERKEY"="O_ORDERKEY")

The table cars is apparently a child table of orders, i.e. you have these constraints:

alter table orders add primary key (O_ORDERKEY);
alter table cars add constraint cars_fk foreign key(C_ORDERKEY) references orders(O_ORDERKEY);
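
Whether these constraints actually exist in your schema can be checked in the data dictionary. The query below is a sketch that assumes both tables live in the current schema; as far as I know, the optimizer generally needs the constraints to be enabled and validated before it will rely on them to drop the join.

```sql
-- 'P' = primary key, 'R' = referential (foreign key) constraint.
SELECT constraint_name, constraint_type, table_name, status, validated
FROM user_constraints
WHERE table_name IN ('CARS', 'ORDERS')
AND constraint_type IN ('P', 'R');
```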

With these constraints in place, Oracle is smart enough to know that it does not need to access the orders table to get the sum:

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |    10 |    63   (4)| 00:00:01 |
|   1 |  SORT AGGREGATE    |      |     1 |    10 |            |          |
|*  2 |   TABLE ACCESS FULL| CARS |   100K|   976K|    63   (4)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - filter("C_ORDERKEY" IS NOT NULL)

Note the filter C_ORDERKEY IS NOT NULL: if the column C_ORDERKEY is nullable, the filter is still required to get the correct sum (those rows would be eliminated by the join).
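
Written out by hand, a join-free statement that matches what this plan does would look like the sketch below. It assumes the foreign key from cars to orders is in place, so that every non-null C_ORDERKEY has exactly one matching row in orders.

```sql
-- Equivalent to the join only under the foreign key assumption above.
SELECT SUM(C_QUANTITY)
FROM CARS
WHERE C_ORDERKEY IS NOT NULL;
```

The plain SELECT SUM(C_QUANTITY) FROM CARS from the question returns the same result only when no row has a NULL C_ORDERKEY.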

If the column never actually contains NULLs, it may make sense to declare it NOT NULL:

 alter table cars modify C_ORDERKEY not null;

After that, you only need to define an index on the C_QUANTITY column to get the optimal plan:

create index car_idx on cars(C_QUANTITY);

---------------------------------------------------------------------------------
| Id  | Operation             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |         |     1 |     5 |    63   (2)| 00:00:01 |
|   1 |  SORT AGGREGATE       |         |     1 |     5 |            |          |
|   2 |   INDEX FAST FULL SCAN| CAR_IDX |   100K|   488K|    63   (2)| 00:00:01 |
---------------------------------------------------------------------------------

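Note that INDEX FAST FULL SCAN reads the index the way a full table scan reads a table (i.e. it reads the index blocks directly, without following the index pointers), so it is much faster than a full table scan, provided the index is smaller than the table.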
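
To see whether the index really is smaller than the table, the segment sizes can be compared. This is a sketch that assumes both segments belong to the current schema and uses the object names from the statements above.

```sql
-- Compare the on-disk size of the table and the index.
SELECT segment_name, segment_type, blocks, bytes
FROM user_segments
WHERE segment_name IN ('CARS', 'CAR_IDX');
```

If the index has about as many blocks as the table, the fast full scan will bring little or no benefit, which would be consistent with the nearly identical costs in the two plans above.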