How to partition a table by month and day in Hive
I created a table with:
CREATE EXTERNAL TABLE extab (
vendorID string,
orderID string ,
ordertime string
)
location '/common_folder/data'
Then I created a table partitioned by month and day:
CREATE EXTERNAL TABLE part_extab(
vendorID string,
orderID string ,
ordertime string
)
PARTITIONED by (month string, day string)
location '/common_folder/data'
Then I insert the data into the partitioned table:
INSERT OVERWRITE TABLE
select vendorId, orderId, ordertime , month, day
FROM extab
How do I extract the month and day from ordertime?
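Use dynamic partition loading. If ordertime is in a format Hive recognizes as a date (for example yyyy-MM-dd or yyyy-MM-dd HH:mm:ss), the month() and day() functions will work: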
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
INSERT OVERWRITE TABLE part_extab partition (month, day)
select vendorId, orderId, ordertime ,
lpad(month(ordertime),2,0) as month,
lpad(day(ordertime),2,0) as day
FROM extab;
Alternatively, you can use substr() to extract the month and day.
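A minimal sketch of the substr() variant, assuming ordertime is a string that starts with yyyy-MM-dd (for example 2020-03-15 10:30:00). That layout is an assumption, since the question does not show sample data; the offsets would need adjusting for any other format.

```sql
-- Enable dynamic partitioning so Hive derives partition values from the SELECT.
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- Assumption: ordertime looks like 'yyyy-MM-dd ...'.
-- substr() is 1-based: characters 6-7 hold the month, 9-10 hold the day.
-- The partition columns must come last in the SELECT, in PARTITION order.
INSERT OVERWRITE TABLE part_extab PARTITION (month, day)
SELECT vendorID,
       orderID,
       ordertime,
       substr(ordertime, 6, 2) AS month,
       substr(ordertime, 9, 2) AS day
FROM extab;
```

Because substr() copies the characters as they are, the values keep their leading zeros and no lpad() is needed. Unlike month() and day(), it does not validate the date, so malformed rows would end up in oddly named partitions rather than a NULL one.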