当一组相关的 Oracle 表都没有时间信息时,按天对它们进行分区
Partitioning a related set of Oracle tables by day when they don't all have Time information
我有一组看起来与此类似的表格:
Time_Table(比较小):
Time (TIMESTAMP)
timeId (NUMBER)
Data... (NUMBER)
表 2(大,每 time_table 行约 30 行):
timeId (NUMBER)
table2Id (NUMBER)
Data... (NUMBER)
Table3(非常大,每个 table2 行大约 10 行,目前在几百天后有 14 亿行):
timeId (NUMBER)
table2Id (NUMBER)
table3Id (NUMBER)
Data... (NUMBER)
我的查询总是至少加入 timeId,并且每个查询被分成几天(10 天读取将导致 10 个较小的查询)。每天都有新数据写入所有表。我们需要从这些表中存储(和查询)多年的数据。
当只能通过 JOIN 获知时间信息时,如何将这些表划分为每日块?我是否应该考虑以不依赖时间的方式进行分区?这可以自动完成,还是必须手动完成?
Oracle 版本 11.2
参考分区在这里可能有所帮助。它允许 child table 的分区方案由 parent table.
确定
架构
--drop table table3;
--drop table table2;
--drop table time_table;
drop table time_table;
create table Time_Table
(
time TIMESTAMP,
timeId NUMBER,
Data01 NUMBER,
constraint time_table_pk primary key (timeId)
)
partition by range (time)
(
partition p1 values less than (date '2000-01-02'),
partition p2 values less than (date '2000-01-03'),
partition p3 values less than (date '2000-01-04')
);
create table table2
(
timeId number,
table2Id number,
Data01 number,
constraint table2_pk primary key (table2ID),
constraint table2_fk foreign key (timeId) references time_table(timeId)
);
create table table3
(
timeId number not null,
table2Id number,
table3Id number,
Data01 number,
constraint table3_pk primary key (table3ID),
constraint table3_fk1 foreign key (timeId) references time_table(timeId),
constraint table3_fk2 foreign key (table2ID) references table2(table2ID)
) partition by reference (table3_fk1);
执行计划
Pstart
和 Pstop
显示大 child table 被正确修剪,即使分区谓词仅设置在小 parent 上table.
explain plan for
select *
from table3
join time_table using (timeId)
where time = date '2000-01-02';
select * from table(dbms_xplan.display);
Plan hash value: 832465087
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 91 | 3 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 1 | 91 | 3 (0)| 00:00:01 | 2 | 2 |
| 2 | NESTED LOOPS | | 1 | 91 | 3 (0)| 00:00:01 | | |
|* 3 | TABLE ACCESS FULL | TIME_TABLE | 1 | 39 | 2 (0)| 00:00:01 | 2 | 2 |
|* 4 | TABLE ACCESS FULL | TABLE3 | 1 | 52 | 1 (0)| 00:00:01 | 2 | 2 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("TIME_TABLE"."TIME"=TIMESTAMP' 2000-01-02 00:00:00')
4 - filter("TABLE3"."TIMEID"="TIME_TABLE"."TIMEID")
Note
-----
- dynamic sampling used for this statement (level=2)
- automatic DOP: skipped because of IO calibrate statistics are missing
警告
参考分区有一些怪癖。它不适用于 11g 中的间隔分区,因此您必须手动定义 parent table 的每个分区。外键也无法禁用,这可能需要修改一些脚本。和任何很少使用的功能一样,它也有一些错误。
drop table time_table;
create table Time_Table
(
time TIMESTAMP,
-- timeId NUMBER, Why you need ID when you have timestamp?????
Data01 NUMBER,
constraint time_table_pk primary key (time) -- not timeID!!!
)
partition by range (time)
(
partition p1 values less than (date '2000-01-02'),
partition p2 values less than (date '2000-01-03'),
partition p3 values less than (date '2000-01-04')
);
create table table2
(
time timestamp not null,
table2ID number,
Data01 number
)
partition by range (time)
(
partition p1 values less than (date '2000-01-02'),
partition p2 values less than (date '2000-01-03'),
partition p3 values less than (date '2000-01-04')
);
create table table3
(
time timestamp not null,
table2Id number,
table3Id number,
Data01 number
)
partition by range (time)
(
partition p1 values less than (date '2000-01-02'),
partition p2 values less than (date '2000-01-03'),
partition p3 values less than (date '2000-01-04')
);
我有一组看起来与此类似的表格:
Time_Table(比较小):
Time (TIMESTAMP)
timeId (NUMBER)
Data... (NUMBER)
表 2(大,每 time_table 行约 30 行):
timeId (NUMBER)
table2Id (NUMBER)
Data... (NUMBER)
Table3(非常大,每个 table2 行大约 10 行,目前在几百天后有 14 亿行):
timeId (NUMBER)
table2Id (NUMBER)
table3Id (NUMBER)
Data... (NUMBER)
我的查询总是至少加入 timeId,并且每个查询被分成几天(10 天读取将导致 10 个较小的查询)。每天都有新数据写入所有表。我们需要从这些表中存储(和查询)多年的数据。
当只能通过 JOIN 获知时间信息时,如何将这些表划分为每日块?我是否应该考虑以不依赖时间的方式进行分区?这可以自动完成,还是必须手动完成?
Oracle 版本 11.2
参考分区在这里可能有所帮助。它允许 child table 的分区方案由 parent table.
确定架构
--drop table table3;
--drop table table2;
--drop table time_table;
drop table time_table;
create table Time_Table
(
time TIMESTAMP,
timeId NUMBER,
Data01 NUMBER,
constraint time_table_pk primary key (timeId)
)
partition by range (time)
(
partition p1 values less than (date '2000-01-02'),
partition p2 values less than (date '2000-01-03'),
partition p3 values less than (date '2000-01-04')
);
create table table2
(
timeId number,
table2Id number,
Data01 number,
constraint table2_pk primary key (table2ID),
constraint table2_fk foreign key (timeId) references time_table(timeId)
);
create table table3
(
timeId number not null,
table2Id number,
table3Id number,
Data01 number,
constraint table3_pk primary key (table3ID),
constraint table3_fk1 foreign key (timeId) references time_table(timeId),
constraint table3_fk2 foreign key (table2ID) references table2(table2ID)
) partition by reference (table3_fk1);
执行计划
Pstart
和 Pstop
显示大 child table 被正确修剪,即使分区谓词仅设置在小 parent 上table.
explain plan for
select *
from table3
join time_table using (timeId)
where time = date '2000-01-02';
select * from table(dbms_xplan.display);
Plan hash value: 832465087
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 91 | 3 (0)| 00:00:01 | | |
| 1 | PARTITION RANGE SINGLE| | 1 | 91 | 3 (0)| 00:00:01 | 2 | 2 |
| 2 | NESTED LOOPS | | 1 | 91 | 3 (0)| 00:00:01 | | |
|* 3 | TABLE ACCESS FULL | TIME_TABLE | 1 | 39 | 2 (0)| 00:00:01 | 2 | 2 |
|* 4 | TABLE ACCESS FULL | TABLE3 | 1 | 52 | 1 (0)| 00:00:01 | 2 | 2 |
-----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("TIME_TABLE"."TIME"=TIMESTAMP' 2000-01-02 00:00:00')
4 - filter("TABLE3"."TIMEID"="TIME_TABLE"."TIMEID")
Note
-----
- dynamic sampling used for this statement (level=2)
- automatic DOP: skipped because of IO calibrate statistics are missing
警告
参考分区有一些怪癖。它不适用于 11g 中的间隔分区,因此您必须手动定义 parent table 的每个分区。外键也无法禁用,这可能需要修改一些脚本。和任何很少使用的功能一样,它也有一些错误。
drop table time_table;
create table Time_Table
(
time TIMESTAMP,
-- timeId NUMBER, Why you need ID when you have timestamp?????
Data01 NUMBER,
constraint time_table_pk primary key (time) -- not timeID!!!
)
partition by range (time)
(
partition p1 values less than (date '2000-01-02'),
partition p2 values less than (date '2000-01-03'),
partition p3 values less than (date '2000-01-04')
);
create table table2
(
time timestamp not null,
table2ID number,
Data01 number
)
partition by range (time)
(
partition p1 values less than (date '2000-01-02'),
partition p2 values less than (date '2000-01-03'),
partition p3 values less than (date '2000-01-04')
);
create table table3
(
time timestamp not null,
table2Id number,
table3Id number,
Data01 number
)
partition by range (time)
(
partition p1 values less than (date '2000-01-02'),
partition p2 values less than (date '2000-01-03'),
partition p3 values less than (date '2000-01-04')
);