Conversion of Parquet file format to sequence file format
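I have a Hive table stored in Parquet format at a location in HDFS. Can I convert the Parquet files at this location to sequence file format and build a Hive table on top of them?
Is there any procedure for doing this conversion?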
Create a new sequence file table and reload the data using insert ... select:
insert into sequence_table
select * from parquet_table;
hive> create table src (i int) stored as parquet;
OK
Time taken: 0.427 seconds
hive> create table trg stored as sequencefile as select * from src;
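To confirm that the new table is really stored as a sequence file, one option is to inspect its storage metadata and compare row counts. A minimal sketch, reusing the src and trg tables from the session above:

```sql
-- For a table stored as sequencefile, the InputFormat line in the output
-- should name org.apache.hadoop.mapred.SequenceFileInputFormat.
describe formatted trg;

-- Both counts should match if the copy was complete.
select count(*) from src;
select count(*) from trg;
```

The create table ... as select form copies the column definitions from src, so no column list is needed for trg.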
For @AndyReddy, the same conversion with partitioned tables:
create table src (i int)
partitioned by (year int,month tinyint,day tinyint)
stored as parquet
;
create table trg (i int)
partitioned by (year int,month tinyint,day tinyint)
stored as sequencefile
;
set hive.exec.dynamic.partition.mode=nonstrict
;
insert into trg partition(year,month,day)
select * from src
;
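The insert above works with select * because Hive returns the partition columns last, which is the order a dynamic-partition insert expects. As a rough check after the load, assuming the src and trg tables defined above, the created partitions and their row counts can be compared:

```sql
-- List the partitions that the dynamic-partition insert created in trg.
show partitions trg;

-- Per-partition row counts; the two result sets should be identical.
select year, month, day, count(*) as cnt from src group by year, month, day;
select year, month, day, count(*) as cnt from trg group by year, month, day;
```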