Teradata SQL: PDCR Table Join :: Can someone explain the row count disparity?
I suspect this may be part of an older discussion, but I'm opening it as a separate question so it doesn't sprawl into a forum-style thread; full credit to the experts who replied there.
I'm trying to understand why these two queries give slightly different results and, importantly, why one of them misses an important candidate user.
It's a simple report that pulls high-CPU users by database.
Version 2
SELECT b.objectdatabasename,
a.username,
CAST(SUM((((a.AmpCPUTime(DEC(18, 3))) + ZEROIFNULL(a.ParserCPUTime)))) AS DECIMAL(18, 3))
FROM
pdcrinfo.dbqlogtbl a
JOIN
(
SELECT queryid,
logdate,
MIN(objectdatabasename) AS objectdatabasename
FROM pdcrinfo.dbqlobjtbl_hst
WHERE objectdatabasename IN (
SELECT child
FROM dbc.children
WHERE parent = 'findb'
GROUP BY 1
)
GROUP BY 1, 2
) b ON
a.queryid = b.queryid
AND a.loGDATE = b.Logdate
AND a.logdate BETWEEN x AND y
AND b.logdate BETWEEN x AND y
GROUP BY 1,2
This returns 3 more rows than the version below.
Version 1
SELECT b.objectdatabasename,
a.username,
CAST(SUM((((a.AmpCPUTime(DEC(18, 3))) + ZEROIFNULL(a.ParserCPUTime)))) AS DECIMAL(18, 3))
FROM
pdcrinfo.dbqlogtbl a
JOIN
(
SELECT queryid,
logdate,
MIN(objectdatabasename) AS objectdatabasename
FROM pdcrinfo.dbqlobjtbl_hst
GROUP BY 1, 2
) b ON
a.queryid = b.queryid
AND a.loGDATE = b.Logdate
AND a.logdate BETWEEN x AND y
AND b.logdate BETWEEN x AND y
WHERE
b.objectdatabasename IN
(
SELECT child
FROM dbc.children
WHERE parent = 'findb'
GROUP BY 1
)
GROUP BY 1,2
The results look like this:
+------------+-----------+-----------+
| Database | User | Total CPU |
+------------+-----------+-----------+
| FinDB | PSmith | 500,000 |
| FinDB_B | PROgers | 600,000 |
| ClaimDB_CO | BCRPRDUsr | 700,000 |
+------------+-----------+-----------+
Version 1 is the existing version that has been in use all along (a different, less efficient form of it is used), and it misses this user:
FinDB PSmith 500,000
I verified from PSmith's query IDs and log dates that he was indeed using FinDB, but he never makes it into Version #1's listing.
I'm sure I'm missing something basic here; I'm trying to understand what causes the row difference.
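The check on PSmith was along these lines (just a sketch against the same tables and columns used in the queries above, with x and y again standing in for the date bounds):

SELECT o.logdate,
       o.queryid,
       o.objectdatabasename
FROM pdcrinfo.dbqlogtbl a
JOIN pdcrinfo.dbqlobjtbl_hst o
    ON a.queryid = o.queryid
    AND a.logdate = o.logdate
WHERE a.username = 'PSmith'
    AND a.logdate BETWEEN x AND y
GROUP BY 1, 2, 3;
-- lists every database PSmith's queries touched in the window; FinDB shows up here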
Version #1
This returns 3-4 fewer rows than Version 2.
Explain SELECT
b.objectdatabasename ,
a.username ,
CAST( SUM((((a.AmpCPUTime(DEC(18,3)))+ ZEROIFNULL(a.ParserCPUTime)) )) AS DECIMAL(18,3)) (TITLE '')
FROM pdcrinfo.dbqlogtbl a
JOIN
(
SELECT queryid,logdate,
MIN (objectdatabasename) AS objectdatabasename
FROM pdcrinfo.dbqlobjtbl_hst
GROUP BY 1,2 )
b
ON ( a.queryid=b.queryid
AND a.loGDATE=b.Logdate
)
and a.logdate BETWEEN '2016-01-01' AND '2016-01-11'
and b.logdate BETWEEN '2016-01-01' AND '2016-01-11'
where b.objectdatabasename in ( sel child from dbc.children where parent ='findb' group by 1 )
GROUP BY 1,
2
ORDER BY 3 desc , 2 asc, 1 asc;
This query is optimized using type 2 profile insert-sel, profileid
10001.
1) First, we lock PDCRDATA.DBQLObjTbl_Hst for access, and we lock
PDCRDATA.DBQLogTbl_Hst in view pdcrinfo.dbqlogtbl for access.
2) Next, we lock DBC.dbase for access, and we lock DBC.owners for
access.
3) We do an all-AMPs SUM step to aggregate from 11 partitions of
PDCRDATA.DBQLObjTbl_Hst with a condition of (
"(PDCRDATA.DBQLObjTbl_Hst.LogDate >= DATE '2016-01-01') AND
(PDCRDATA.DBQLObjTbl_Hst.LogDate <= DATE '2016-01-11')")
, grouping by field1 ( PDCRDATA.DBQLObjTbl_Hst.QueryID
,PDCRDATA.DBQLObjTbl_Hst.LogDate). Aggregate Intermediate Results
are computed locally, then placed in Spool 3. The input table
will not be cached in memory, but it is eligible for synchronized
scanning. The size of Spool 3 is estimated with low confidence to
be 44,305,297 rows (5,715,383,313 bytes). The estimated time for
this step is 8.52 seconds.
4) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by
way of an all-rows scan into Spool 1 (used to materialize
view, derived table, table function or table operator b)
(all_amps) (compressed columns allowed), which is built
locally on the AMPs. The size of Spool 1 is estimated with
low confidence to be 44,305,297 rows (5,316,635,640 bytes).
The estimated time for this step is 0.78 seconds.
2) We do an all-AMPs RETRIEVE step from DBC.dbase by way of an
all-rows scan with a condition of (
"(SUBSTRING((TRANSLATE((DBC.dbase.DatabaseName )USING
UNICODE_TO_LOCALE WITH ERROR )) FROM (1) FOR (30 ))(CHAR(30),
CHARACTER SET LATIN, NOT CASESPECIFIC))= 'findb '") into Spool
9 (all_amps) (compressed columns allowed), which is
redistributed by the hash code of (DBC.dbase.DatabaseId) to
all AMPs. Then we do a SORT to order Spool 9 by row hash.
The size of Spool 9 is estimated with no confidence to be 348
rows (5,916 bytes). The estimated time for this step is 0.01
seconds.
3) We do an all-AMPs RETRIEVE step from DBC.dbase by way of an
all-rows scan with no residual conditions locking for access
into Spool 10 (all_amps) (compressed columns allowed), which
is redistributed by the hash code of (DBC.dbase.DatabaseId)
to all AMPs. Then we do a SORT to order Spool 10 by row hash.
The size of Spool 10 is estimated with high confidence to be
3,478 rows (361,712 bytes). The estimated time for this step
is 0.01 seconds.
5) We do an all-AMPs JOIN step from Spool 9 (Last Use) by way of a
RowHash match scan, which is joined to DBC.owners by way of a
RowHash match scan with no residual conditions. Spool 9 and
DBC.owners are joined using a merge join, with a join condition of
("DBC.owners.OwnerId = DatabaseId"). The result goes into Spool
11 (all_amps) (compressed columns allowed), which is redistributed
by the hash code of (DBC.owners.OwneeId) to all AMPs. Then we do
a SORT to order Spool 11 by row hash. The size of Spool 11 is
estimated with no confidence to be 10,450 rows (177,650 bytes).
The estimated time for this step is 0.02 seconds.
6) We do an all-AMPs JOIN step from Spool 10 (Last Use) by way of a
RowHash match scan, which is joined to Spool 11 (Last Use) by way
of a RowHash match scan. Spool 10 and Spool 11 are joined using a
merge join, with a join condition of ("OwneeId = DatabaseId").
The result goes into Spool 8 (all_amps), which is redistributed by
the hash code of (SUBSTRING((TRANSLATE((DBC.dbase.DatabaseName
)USING UNICODE_TO_LOCALE WITH ERROR )) FROM (1) FOR (30
))(CHAR(30), CHARACTER SET LATIN, NOT CASESPECIFIC)) to all AMPs.
Then we do a SORT to order Spool 8 by row hash and the sort key in
spool field1 eliminating duplicate rows. The size of Spool 8 is
estimated with no confidence to be 3,478 rows (191,290 bytes).
The estimated time for this step is 0.02 seconds.
7) We do an all-AMPs RETRIEVE step from Spool 8 (Last Use) by way of
an all-rows scan into Spool 12 (all_amps) (compressed columns
allowed), which is duplicated on all AMPs. The size of Spool 12
is estimated with no confidence to be 1,752,912 rows (227,878,560
bytes). The estimated time for this step is 0.06 seconds.
8) We do an all-AMPs JOIN step from Spool 1 (Last Use) by way of an
all-rows scan with a condition of ("(b.LOGDATE <= DATE
'2016-01-11') AND (b.LOGDATE >= DATE '2016-01-01')"), which is
joined to Spool 12 (Last Use) by way of an all-rows scan. Spool 1
and Spool 12 are joined using a inclusion dynamic hash join, with
a join condition of ("OBJECTDATABASENAME = (TRANSLATE((Field_2
)USING LATIN_TO_UNICODE))"). The result goes into Spool 13
(all_amps) (compressed columns allowed), which is redistributed by
the rowkey of (PDCRDATA.DBQLObjTbl_Hst.LOGDATE,
PDCRDATA.DBQLObjTbl_Hst.QUERYID) to all AMPs. Then we do a SORT
to partition Spool 13 by rowkey. The size of Spool 13 is
estimated with no confidence to be 3,865 rows (432,880 bytes).
The estimated time for this step is 0.29 seconds.
9) We do an all-AMPs JOIN step from 11 partitions of
PDCRDATA.DBQLogTbl_Hst in view pdcrinfo.dbqlogtbl by way of a
RowHash match scan with a condition of ("(PDCRDATA.DBQLogTbl_Hst
in view pdcrinfo.dbqlogtbl.LogDate <= DATE '2016-01-11') AND
(PDCRDATA.DBQLogTbl_Hst in view pdcrinfo.dbqlogtbl.LogDate >= DATE
'2016-01-01')"), which is joined to Spool 13 (Last Use) by way of
a RowHash match scan. PDCRDATA.DBQLogTbl_Hst and Spool 13 are
joined using a rowkey-based merge join, with a join condition of (
"(PDCRDATA.DBQLogTbl_Hst.LogDate = LOGDATE) AND
(PDCRDATA.DBQLogTbl_Hst.QueryID = QUERYID)"). The input table
PDCRDATA.DBQLogTbl_Hst will not be cached in memory, but it is
eligible for synchronized scanning. The result goes into Spool 7
(all_amps) (compressed columns allowed), which is built locally on
the AMPs. The size of Spool 7 is estimated with no confidence to
be 3,816 rows (782,280 bytes). The estimated time for this step
is 0.03 seconds.
10) We do an all-AMPs SUM step to aggregate from Spool 7 (Last Use) by
way of an all-rows scan , grouping by field1 (
PDCRDATA.DBQLObjTbl_Hst.Field_4 ,PDCRDATA.DBQLogTbl_Hst.UserName).
Aggregate Intermediate Results are computed globally, then placed
in Spool 15. The size of Spool 15 is estimated with no confidence
to be 3,478 rows (2,472,858 bytes). The estimated time for this
step is 0.02 seconds.
11) We do an all-AMPs RETRIEVE step from Spool 15 (Last Use) by way of
an all-rows scan into Spool 5 (group_amps), which is built locally
on the AMPs. Then we do a SORT to order Spool 5 by the sort key
in spool field1 (SUM((PDCRDATA.DBQLogTbl_Hst.AMPCPUTime
(DECIMAL(18,3)) )+
(ZEROIFNULL(PDCRDATA.DBQLogTbl_Hst.ParserCPUTime
)))(DECIMAL(18,3)), PDCRDATA.DBQLogTbl_Hst.UserName,
PDCRDATA.DBQLObjTbl_Hst.Field_4). The size of Spool 5 is
estimated with no confidence to be 3,478 rows (2,201,574 bytes).
The estimated time for this step is 0.01 seconds.
12) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 5 are sent back to the user as the result of
statement 1. The total estimated time is 9.75 seconds.
Version 2
This one accounts for the missing user.
Explain SELECT
b.objectdatabasename ,
a.username ,
CAST( SUM((((a.AmpCPUTime(DEC(18,3)))+
ZEROIFNULL(a.ParserCPUTime)) )) AS DECIMAL(18,3))
FROM pdcrinfo.dbqlogtbl a
JOIN
(
SELECT queryid,logdate,
MIN (objectdatabasename) AS objectdatabasename
FROM pdcrinfo.dbqlobjtbl_hst
where objectdatabasename in ( sel child from dbc.children where parent ='findb' group by 1 )
GROUP BY 1,2 )
b
ON ( a.queryid=b.queryid
AND a.loGDATE=b.Logdate )
AND a.logdate BETWEEN '2016-01-01' AND '2016-01-11'
AND b.logdate BETWEEN '2016-01-01' AND '2016-01-11'
GROUP BY
1,2
order by
3 desc, 1 asc, 2 asc;
This query is optimized using type 2 profile insert-sel, profileid
10001.
1) First, we lock PDCRDATA.DBQLObjTbl_Hst for access, and we lock
PDCRDATA.DBQLogTbl_Hst in view pdcrinfo.dbqlogtbl for access.
2) Next, we lock DBC.dbase for access, and we lock DBC.owners for
access.
3) We execute the following steps in parallel.
1) We do an all-AMPs RETRIEVE step from DBC.dbase by way of an
all-rows scan with a condition of (
"(SUBSTRING((TRANSLATE((DBC.dbase.DatabaseName )USING
UNICODE_TO_LOCALE WITH ERROR )) FROM (1) FOR (30 ))(CHAR(30),
CHARACTER SET LATIN, NOT CASESPECIFIC))= 'findb '") into Spool
5 (all_amps) (compressed columns allowed), which is
redistributed by the hash code of (DBC.dbase.DatabaseId) to
all AMPs. Then we do a SORT to order Spool 5 by row hash.
The size of Spool 5 is estimated with no confidence to be 348
rows (5,916 bytes). The estimated time for this step is 0.01
seconds.
2) We do an all-AMPs RETRIEVE step from DBC.dbase by way of an
all-rows scan with no residual conditions locking for access
into Spool 6 (all_amps) (compressed columns allowed), which
is redistributed by the hash code of (DBC.dbase.DatabaseId)
to all AMPs. Then we do a SORT to order Spool 6 by row hash.
The size of Spool 6 is estimated with high confidence to be
3,478 rows (361,712 bytes). The estimated time for this step
is 0.01 seconds.
4) We do an all-AMPs JOIN step from Spool 5 (Last Use) by way of a
RowHash match scan, which is joined to DBC.owners by way of a
RowHash match scan with no residual conditions. Spool 5 and
DBC.owners are joined using a merge join, with a join condition of
("DBC.owners.OwnerId = DatabaseId"). The result goes into Spool 7
(all_amps) (compressed columns allowed), which is redistributed by
the hash code of (DBC.owners.OwneeId) to all AMPs. Then we do a
SORT to order Spool 7 by row hash. The size of Spool 7 is
estimated with no confidence to be 10,450 rows (177,650 bytes).
The estimated time for this step is 0.02 seconds.
5) We execute the following steps in parallel.
1) We do an all-AMPs JOIN step from Spool 6 (Last Use) by way of
a RowHash match scan, which is joined to Spool 7 (Last Use)
by way of a RowHash match scan. Spool 6 and Spool 7 are
joined using a merge join, with a join condition of (
"OwneeId = DatabaseId"). The result goes into Spool 4
(all_amps), which is redistributed by the hash code of (
SUBSTRING((TRANSLATE((DBC.dbase.DatabaseName )USING
UNICODE_TO_LOCALE WITH ERROR )) FROM (1) FOR (30 ))(CHAR(30),
CHARACTER SET LATIN, NOT CASESPECIFIC)) to all AMPs. Then we
do a SORT to order Spool 4 by row hash and the sort key in
spool field1 eliminating duplicate rows. The size of Spool 4
is estimated with no confidence to be 3,478 rows (191,290
bytes). The estimated time for this step is 0.02 seconds.
2) We do an all-AMPs RETRIEVE step from 11 partitions of
PDCRDATA.DBQLObjTbl_Hst with a condition of (
"(PDCRDATA.DBQLObjTbl_Hst.LogDate >= DATE '2016-01-01') AND
(PDCRDATA.DBQLObjTbl_Hst.LogDate <= DATE '2016-01-11')") into
Spool 8 (all_amps) (compressed columns allowed), which is
built locally on the AMPs. The input table will not be
cached in memory, but it is eligible for synchronized
scanning. The size of Spool 8 is estimated with high
confidence to be 109,751,471 rows (12,292,164,752 bytes).
The estimated time for this step is 4.29 seconds.
6) We do an all-AMPs RETRIEVE step from Spool 4 (Last Use) by way of
an all-rows scan into Spool 9 (all_amps) (compressed columns
allowed), which is duplicated on all AMPs. The size of Spool 9 is
estimated with no confidence to be 1,752,912 rows (227,878,560
bytes). The estimated time for this step is 0.06 seconds.
7) We do an all-AMPs JOIN step from Spool 8 (Last Use) by way of an
all-rows scan, which is joined to Spool 9 (Last Use) by way of an
all-rows scan. Spool 8 and Spool 9 are joined using a single
partition inclusion hash join, with a join condition of (
"ObjectDatabaseName = (TRANSLATE((Field_2 )USING
LATIN_TO_UNICODE))"). The result goes into Spool 3 (all_amps)
(compressed columns allowed), which is built locally on the AMPs.
The size of Spool 3 is estimated with no confidence to be
36,436,341 rows (4,153,742,874 bytes). The estimated time for
this step is 1.05 seconds.
8) We do an all-AMPs SUM step to aggregate from Spool 3 (Last Use) by
way of an all-rows scan , grouping by field1 (
PDCRDATA.DBQLObjTbl_Hst.QueryID ,PDCRDATA.DBQLObjTbl_Hst.LogDate).
Aggregate Intermediate Results are computed locally, then placed
in Spool 11. The size of Spool 11 is estimated with no confidence
to be 36,436,341 rows (4,700,287,989 bytes). The estimated time
for this step is 3.10 seconds.
9) We do an all-AMPs RETRIEVE step from Spool 11 (Last Use) by way of
an all-rows scan into Spool 1 (used to materialize view, derived
table, table function or table operator b) (all_amps) (compressed
columns allowed), which is built locally on the AMPs. The size of
Spool 1 is estimated with no confidence to be 36,436,341 rows (
4,372,360,920 bytes). The estimated time for this step is 0.65
seconds.
10) We do an all-AMPs RETRIEVE step from Spool 1 (Last Use) by way of
an all-rows scan with a condition of ("(b.LOGDATE <= DATE
'2016-01-11') AND (b.LOGDATE >= DATE '2016-01-01')") into Spool 16
(all_amps) (compressed columns allowed), which is redistributed by
the rowkey of (PDCRDATA.DBQLObjTbl_Hst.QueryID,
PDCRDATA.DBQLObjTbl_Hst.LogDate) to all AMPs. Then we do a SORT
to partition Spool 16 by rowkey. The size of Spool 16 is
estimated with no confidence to be 36,436,341 rows (4,080,870,192
bytes). The estimated time for this step is 3.86 seconds.
11) We do an all-AMPs JOIN step from 11 partitions of
PDCRDATA.DBQLogTbl_Hst in view pdcrinfo.dbqlogtbl by way of a
RowHash match scan with a condition of ("(PDCRDATA.DBQLogTbl_Hst
in view pdcrinfo.dbqlogtbl.LogDate <= DATE '2016-01-11') AND
(PDCRDATA.DBQLogTbl_Hst in view pdcrinfo.dbqlogtbl.LogDate >= DATE
'2016-01-01')"), which is joined to Spool 16 (Last Use) by way of
a RowHash match scan. PDCRDATA.DBQLogTbl_Hst and Spool 16 are
joined using a rowkey-based merge join, with a join condition of (
"(PDCRDATA.DBQLogTbl_Hst.QueryID = QUERYID) AND
(PDCRDATA.DBQLogTbl_Hst.LogDate = LOGDATE)"). The input table
PDCRDATA.DBQLogTbl_Hst will not be cached in memory, but it is
eligible for synchronized scanning. The result goes into Spool 15
(all_amps) (compressed columns allowed), which is built locally on
the AMPs. The size of Spool 15 is estimated with no confidence to
be 35,969,436 rows (7,373,734,380 bytes). The estimated time for
this step is 1.72 seconds.
12) We do an all-AMPs SUM step to aggregate from Spool 15 (Last Use)
by way of an all-rows scan , grouping by field1 (
PDCRDATA.DBQLObjTbl_Hst.ObjectDatabaseName
,PDCRDATA.DBQLogTbl_Hst.UserName). Aggregate Intermediate Results
are computed globally, then placed in Spool 17. The size of Spool
17 is estimated with no confidence to be 6,175,740 rows (
4,390,951,140 bytes). The estimated time for this step is 1.61
seconds.
13) We do an all-AMPs RETRIEVE step from Spool 17 (Last Use) by way of
an all-rows scan into Spool 13 (group_amps), which is built
locally on the AMPs. Then we do a SORT to order Spool 13 by the
sort key in spool field1 (SUM((PDCRDATA.DBQLogTbl_Hst.AMPCPUTime
(DECIMAL(18,3)) )+
(ZEROIFNULL(PDCRDATA.DBQLogTbl_Hst.ParserCPUTime
)))(DECIMAL(18,3)), PDCRDATA.DBQLObjTbl_Hst.ObjectDatabaseName,
PDCRDATA.DBQLogTbl_Hst.UserName). The size of Spool 13 is
estimated with no confidence to be 6,175,740 rows (3,909,243,420
bytes). The estimated time for this step is 0.43 seconds.
14) Finally, we send out an END TRANSACTION step to all AMPs involved
in processing the request.
-> The contents of Spool 13 are sent back to the user as the result
of statement 1. The total estimated time is 16.79 seconds.
In #2 you filter on the database name before the MIN, but in #1 after the MIN.
Suppose a query accessed FinDB and Bla_DB from the hierarchy below: the derived table b in #1 then returns 'Bla_DB' as the MIN (taken over every object the query touched), while in #2 it returns 'FinDB' (taken over only the children of findb):
- dbc
- sysdba
- FinDB
- FinDB_B
- ClaimDB_CO
- ...
- ...
- Bla_DB
- ...
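To see the effect in isolation, here is a minimal self-contained sketch (toy literals standing in for one query's rows in dbqlobjtbl_hst, and a hard-coded IN-list standing in for the dbc.children subquery):

-- Filter AFTER the MIN (the #1 shape; the outer WHERE on
-- b.objectdatabasename is equivalent to this HAVING): the MIN is taken
-- over every object the query touched, so it comes back as 'Bla_DB',
-- and the IN-list then discards the whole query.
SELECT queryid, MIN(objectdatabasename) AS objectdatabasename
FROM (SELECT 1 AS queryid, 'Bla_DB' AS objectdatabasename
      UNION ALL
      SELECT 1, 'FinDB') t
GROUP BY 1
HAVING MIN(objectdatabasename) IN ('FinDB', 'FinDB_B', 'ClaimDB_CO');
-- no rows: the MIN is 'Bla_DB', which is not in the list

-- Filter BEFORE the MIN (the #2 shape): only children of findb survive
-- the WHERE, so the MIN is 'FinDB' and the query is kept.
SELECT queryid, MIN(objectdatabasename) AS objectdatabasename
FROM (SELECT 1 AS queryid, 'Bla_DB' AS objectdatabasename
      UNION ALL
      SELECT 1, 'FinDB') t
WHERE objectdatabasename IN ('FinDB', 'FinDB_B', 'ClaimDB_CO')
GROUP BY 1;
-- one row: (1, 'FinDB')

So any query that also touched a database sorting alphabetically before the FinDB family disappears from #1's report even though it did use FinDB, which is exactly what happened to PSmith.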