Hive explain plan: where to see a full table scan?
How can I tell from Hive EXPLAIN output whether a query performs a full table scan?
The table has 993 rows.
The query is:
explain select latitude,longitude FROM CRIMES WHERE geohash='dp3twhjuyutr'
I have a secondary index on the geohash column.
STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: crimes
            filterExpr: (geohash = 'dp3twhjuyutr') (type: boolean)
            Statistics: Num rows: 993 Data size: 265582 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (geohash = 'dp3twhjuyutr') (type: boolean)
              Statistics: Num rows: 496 Data size: 132657 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: latitude (type: double), longitude (type: double)
                outputColumnNames: _col0, _col1
                Statistics: Num rows: 496 Data size: 132657 Basic stats: COMPLETE Column stats: NONE
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 496 Data size: 132657 Basic stats: COMPLETE Column stats: NONE
                  table:
                      input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
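- The absence of a partition predicate in the plan means a full scan. Of course, this is unrelated to predicate push-down in ORC.
- Check the data size and number of rows reported for each operator.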
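To make the second point concrete, here is a minimal Python sketch that pulls the row count and data size for each operator out of EXPLAIN text. The helper name `operator_stats` and the shortened `PLAN` string are my own, for illustration; the numbers are copied from the plan in the question. Note that with `Column stats: NONE` these figures are estimates, not measured counts.

```python
import re

# Shortened copy of the plan from the question (only the lines we need).
PLAN = """\
TableScan
  alias: crimes
  Statistics: Num rows: 993 Data size: 265582 Basic stats: COMPLETE Column stats: NONE
  Filter Operator
    predicate: (geohash = 'dp3twhjuyutr') (type: boolean)
    Statistics: Num rows: 496 Data size: 132657 Basic stats: COMPLETE Column stats: NONE
"""

STAT = re.compile(r"Statistics: Num rows: (\d+) Data size: (\d+)")


def operator_stats(plan_text):
    """Return (operator, num_rows, data_size) for each operator in the plan text."""
    stats = []
    current = None  # name of the operator the next Statistics line belongs to
    for raw in plan_text.splitlines():
        line = raw.strip()
        match = STAT.search(line)
        if match and current is not None:
            stats.append((current, int(match.group(1)), int(match.group(2))))
        elif line == "TableScan" or line.endswith("Operator"):
            current = line
    return stats


for name, rows, size in operator_stats(PLAN):
    print(f"{name}: rows={rows} size={size}")
```

Here the TableScan operator reports 993 rows, which is the whole table, so every row is read before the Filter Operator is applied.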
The EXPLAIN DEPENDENCY command shows the full input_partitions collection, so you can check exactly what will be scanned.
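EXPLAIN DEPENDENCY returns its result as JSON, so it is easy to inspect programmatically. The sketch below assumes the usual `input_tables` / `input_partitions` layout; the `SAMPLE` string is an illustrative guess at what an unpartitioned `crimes` table in the `default` database might produce, not output taken from the question, and `scanned_inputs` is a hypothetical helper name.

```python
import json

# Illustrative EXPLAIN DEPENDENCY output (assumed, not from the question):
# an unpartitioned table would list no input partitions.
SAMPLE = (
    '{"input_partitions":[],'
    '"input_tables":[{"tablename":"default@crimes","tabletype":"MANAGED_TABLE"}]}'
)


def scanned_inputs(dependency_json):
    """Return the table names and partition names listed in the JSON."""
    dep = json.loads(dependency_json)
    tables = [t["tablename"] for t in dep.get("input_tables", [])]
    partitions = [p["partitionName"] for p in dep.get("input_partitions", [])]
    return tables, partitions


tables, partitions = scanned_inputs(SAMPLE)
print("tables:", tables)
print("partitions:", partitions)
```

For a partitioned table, comparing the listed partitions against the full set of partitions shows whether partition pruning happened or the whole table will be read.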