Hive Partitioned Table - loading data from one table into a partitioned table in Hive fails with [Error 10044]
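I have a table with 20 columns, and I created another, partitioned table with 2 partition columns. When I try to load data from the 20-column table into the partitioned table, I get an error saying my partitioned table has more columns than the table I am inserting data from.
My create table statement: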
create table flight_data_parquet(
YEAR INT,
FL_DATE STRING,
UNIQUE_CARRIER STRING,
AIRLINE_ID INT,
CARRIER STRING,
TAIL_NUM STRING,
FL_NUM INT,
ORIGIN_AIRPORT_ID INT,
ORIGIN_AIRPORT_SEQ_ID INT,
ORIGIN STRING,
DEST_AIRPORT_ID INT,
DEST_AIRPORT_SEQ_ID INT,
DEST STRING,
DEP_DELAY FLOAT,
ARR_DELAY FLOAT,
CANCELLED TINYINT,
DIVERTED TINYINT,
DISTANCE INT)
partitioned by (Month INT, DAY_OF_MONTH INT) stored AS PARQUET;
Insert statement:
insert into table flight_data_parquet partition(month=1, day_of_month)
select YEAR,FL_DATE,
UNIQUE_CARRIER,
AIRLINE_ID,
CARRIER,
TAIL_NUM,
FL_NUM,
ORIGIN_AIRPORT_ID,
ORIGIN_AIRPORT_SEQ_ID,
ORIGIN,
DEST_AIRPORT_ID,
DEST_AIRPORT_SEQ_ID,
DEST,
DEP_DELAY,
ARR_DELAY,
CANCELLED,
DIVERTED,
DISTANCE, month, day_of_month
from flight_data_v2 where month=1;
The error I get is:
FAILED: SemanticException [Error 10044]: Line 1:18 Cannot insert into target table because column number/types are different 'day_of_month': Table insclause-0 has 19 columns, but query has 20 columns.
hive (flights)>
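In the partition spec partition(month=1, day_of_month), month=1 is a static partition whose value is already specified, so remove month from the select query. Only day_of_month (the dynamic partition) should be in the select. This is also what the error message is counting: the target expects the 18 data columns plus the one dynamic partition column, 19 in total, while the original query supplies 20.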
insert into table flight_data_parquet partition(month=1, day_of_month) -- Month=1 is a static partition
select YEAR,FL_DATE,
UNIQUE_CARRIER,
AIRLINE_ID,
CARRIER,
TAIL_NUM,
FL_NUM,
ORIGIN_AIRPORT_ID,
ORIGIN_AIRPORT_SEQ_ID,
ORIGIN,
DEST_AIRPORT_ID,
DEST_AIRPORT_SEQ_ID,
DEST,
DEP_DELAY,
ARR_DELAY,
CANCELLED,
DIVERTED,
DISTANCE, day_of_month
from flight_data_v2 where month=1;
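If the goal is to load every month in one statement rather than one month at a time, both partition columns can be made dynamic. The following is only a sketch under the assumption that dynamic partitioning is allowed on your cluster; the two `set` lines use Hive's standard dynamic-partition properties, and `nonstrict` mode is needed here because strict mode requires at least one static partition. The dynamic partition columns go last in the select, in the same order as in the partition clause.

```sql
-- Assumption: you are allowed to change these session settings.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;  -- strict mode needs at least one static partition

-- Both partition columns are dynamic, so both appear at the end of the select:
-- 18 data columns + 2 partition columns = 20 columns.
insert into table flight_data_parquet partition(month, day_of_month)
select YEAR, FL_DATE,
UNIQUE_CARRIER,
AIRLINE_ID,
CARRIER,
TAIL_NUM,
FL_NUM,
ORIGIN_AIRPORT_ID,
ORIGIN_AIRPORT_SEQ_ID,
ORIGIN,
DEST_AIRPORT_ID,
DEST_AIRPORT_SEQ_ID,
DEST,
DEP_DELAY,
ARR_DELAY,
CANCELLED,
DIVERTED,
DISTANCE, month, day_of_month
from flight_data_v2;
```

With this form there is no `where month=1` filter, so every month and day found in flight_data_v2 gets its own partition. On a large table that can create many partitions at once, so the static month=1 version above may still be the safer choice.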