Insert into Hive table with dynamic partition only writes the first partition to disk, not all of them
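I am trying to write data into a Hive table, but it fails. I get an error at the end mentioning cycle_dt=null, and only one partition is written: the one for the first day.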
set hive.auto.convert.join=true;
set hive.optimize.mapjoin.mapreduce=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
set mapred.map.tasks = 100;
Insert into table dynamic.dynamic_test_avro_v1 partition(cycle_dt)
Select date_time as CYCLE_TS, case when evar1 is not null or length(trim(evar1)) > 0 then cast(unbase64(substring(evar1,6,12)) as string) end NRNM ,
prop14 as state, evar8 as FLOW_TYPE, prop25 as KEY, pagename PAGE_NM,
partition_dt as cycle_dt from source.std_avro_v1 WHERE
(partition_dt = '2016-10-02' AND partition_dt < '2016-10-07')
AND (
evar8='google');
I'm not sure what is happening here. I set a date range and expected each date in that range to be written as its own partition.
In the dynamic partition inserts, users can give partial partition specifications, which means just specifying the list of partition column names in the PARTITION clause. The column values are optional. If a partition column value is given, we call this a static partition, otherwise it is a dynamic partition. Each dynamic partition column has a corresponding input column from the select statement. This means that the dynamic partition creation is determined by the value of the input column. The dynamic partition columns must be specified last among the columns in the SELECT statement and in the same order in which they appear in the PARTITION() clause.
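So in your query, partition_dt supplies the value of the dynamic partition. However, you imposed the constraint (partition_dt = '2016-10-02' AND partition_dt < '2016-10-07'). Since any row that satisfies the equality already satisfies the upper bound, this reduces to partition_dt = '2016-10-02', and only one partition ends up being created.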
You probably want a date range: (partition_dt >= '2016-10-02' AND partition_dt < '2016-10-07')
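A minimal sketch of the corrected statement, assuming the goal is to load the five days from 2016-10-02 through 2016-10-06. It reuses the table and column names from the question, and only the date predicate changes. The GROUP BY check and the SHOW PARTITIONS call are optional verification steps that are not part of the original post.

```sql
-- Dynamic partitioning settings, as in the question.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- Optional check (an addition, not in the original post):
-- list the dates the filter actually matches before inserting.
SELECT partition_dt, count(*) AS row_cnt
FROM source.std_avro_v1
WHERE partition_dt >= '2016-10-02' AND partition_dt < '2016-10-07'
  AND evar8 = 'google'
GROUP BY partition_dt;

-- Same insert as in the question, with '=' replaced by '>=' so the
-- predicate covers a range of dates instead of a single day.
INSERT INTO TABLE dynamic.dynamic_test_avro_v1 PARTITION (cycle_dt)
SELECT date_time AS CYCLE_TS,
       CASE WHEN evar1 IS NOT NULL OR length(trim(evar1)) > 0
            THEN cast(unbase64(substring(evar1, 6, 12)) AS string)
       END AS NRNM,
       prop14 AS state,
       evar8 AS FLOW_TYPE,
       prop25 AS KEY,
       pagename AS PAGE_NM,
       partition_dt AS cycle_dt  -- dynamic partition column comes last
FROM source.std_avro_v1
WHERE partition_dt >= '2016-10-02' AND partition_dt < '2016-10-07'
  AND evar8 = 'google';

-- Optional check: one partition per matched date should now be listed.
SHOW PARTITIONS dynamic.dynamic_test_avro_v1;
```

If the GROUP BY check returns a single date, the source table only holds matching rows for that day, and the insert will still create just one partition.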