Load a local CSV file into a Hive Parquet table directly, without resorting to a temp textfile table

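I want to store the data from a .csv file in Hive. Because of the good performance of the Parquet file format, the Hive table should be stored as Parquet. The usual approach is to create a temp table stored as textfile, load the local CSV data into that temp table, and finally create a Parquet table with the same structure and fill it with the SQL statement INSERT INTO TABLE parquet_table SELECT * FROM textfile_table;.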

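But I don't think this temp textfile table should be necessary. So my question is: is there a way to load these local .csv files into a Hive Parquet-format table directly, that is, without resorting to a temp table? Or is there an easier way to accomplish this task?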

As stated in the Hive documentation:

NO verification of data against the schema is performed by the load command.

If the file is in hdfs, it is moved into the Hive-controlled file system namespace.

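In other words, the load command only copies or moves files; it does not convert them, so a CSV file cannot be loaded straight into a Parquet table. You can, however, skip one step by using CREATE TABLE AS SELECT for the Parquet table.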

So you will have 3 steps:

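1. Create a text table that defines the schema
2. Load the data into the text table (a local file is copied into the table's location; a file already in HDFS is moved)
3. CREATE TABLE parquet_table STORED AS PARQUET AS SELECT * FROM textfile_table; (STORED AS PARQUET is supported from Hive 0.13)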
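
The three steps could be sketched in HiveQL roughly as follows. This is a minimal sketch, not a definitive script: the column list (id, name, amount), the comma delimiter, and the path /tmp/data.csv are assumptions made up for illustration, since the question does not show the actual schema or file location. Note that in Hive's CTAS grammar the STORED AS clause is written before AS SELECT.

```sql
-- Step 1: staging table stored as plain text.
-- The columns below are hypothetical; replace them with the real CSV schema.
CREATE TABLE textfile_table (
  id     INT,
  name   STRING,
  amount DOUBLE
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Step 2: copy the local CSV into the staging table's location.
-- No parsing or schema validation happens here.
-- '/tmp/data.csv' is a placeholder path.
LOAD DATA LOCAL INPATH '/tmp/data.csv' INTO TABLE textfile_table;

-- Step 3: create the Parquet table and fill it in one statement (CTAS).
-- The rows are read as text and rewritten as Parquet by this query.
CREATE TABLE parquet_table
STORED AS PARQUET
AS SELECT * FROM textfile_table;

-- Optional cleanup once the Parquet table has been checked.
DROP TABLE textfile_table;
```

If the CSV has a header row, adding TBLPROPERTIES ('skip.header.line.count'='1') to the staging table definition is one way to keep that row out of the data. A simple delimited row format like the one above does not handle quoted fields that contain commas, so files like that would need a CSV SerDe on the staging table instead.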