Insert data into a table using a CSV file in Hive

Question

CREATE TABLE `rk_test22`(
`index` int, 
`country` string, 
`description` string, 
`designation` string, 
`points` int, 
`price` int, 
`province` string, 
`region_1` string, 
`region_2` string, 
`taster_name` string, 
`taster_twitter_handle` string, 
`title` string, 
`variety` string, 
`winery` string)
ROW FORMAT SERDE 
'org.apache.hadoop.hive.serde2.OpenCSVSerde' 
WITH SERDEPROPERTIES ( 
'input.regex'=',(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)') 
STORED AS INPUTFORMAT 
'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
'hdfs://namever/user/hive/warehouse/robert.db/rk_test22'
TBLPROPERTIES (
'COLUMN_STATS_ACCURATE'='true', 
'numFiles'='1', 
'skip.header.line.count'='1', 
'totalSize'='52796693', 
'transient_lastDdlTime'='1516088117');

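I created the Hive table with the command above. Now I want to load the following row (from a CSV file) into the table using the LOAD DATA command. The LOAD DATA command reports status OK, but I cannot see the data in the table.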

0,Italy,"Aromas include tropical fruit, broom, brimstone and dried herb. The palate isn't overly expressive, offering unripened apple, citrus and dried sage alongside brisk acidity.",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco  (Etna),White Blend,Nicosia

Answer 1

If you are loading a CSV file that contains a single line, that line is skipped because of this property: 'skip.header.line.count'='1'
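One way to check this, as a minimal sketch: assuming the CSV file really contains only the data row and no header line, turn off header skipping on the existing table and load the file again. The HDFS path below is hypothetical.

```sql
-- Assumption: the CSV file holds only data rows (no header line).
-- A count of 0 stops Hive from discarding the first line of each file.
ALTER TABLE rk_test22 SET TBLPROPERTIES ('skip.header.line.count'='0');

-- Hypothetical HDFS path; replace it with the real location of the file.
LOAD DATA INPATH '/tmp/wine_row.csv' INTO TABLE rk_test22;

-- If header skipping was the only problem, the row is returned here.
SELECT * FROM rk_test22 LIMIT 5;
```

Alternatively, keep the property as it is and add a header line to the top of the CSV file, so that the header is the line that gets skipped.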

Also, the regular expression should contain one capturing group for each column, as in this answer.
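To illustrate the capturing-group rule, here is a minimal sketch on hypothetical three-column tables (the names `rk_regex_demo` and `rk_csv_demo` and their reduced column lists are not from the question). Note that 'input.regex' is a RegexSerDe property. OpenCSVSerde, which the DDL in the question uses, is configured through `separatorChar`, `quoteChar` and `escapeChar` instead, so the second statement shows that variant.

```sql
-- RegexSerDe: three columns, so the regex has exactly three capturing groups.
-- Columns are declared as string to keep the sketch simple.
-- This simple pattern does not handle commas inside quoted fields.
CREATE TABLE rk_regex_demo (
  `index`   string,
  `country` string,
  `points`  string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  'input.regex' = '([^,]*),([^,]*),([^,]*)')
STORED AS TEXTFILE;

-- OpenCSVSerde: no regex at all; quoted fields may contain the separator.
-- This SerDe exposes every column as string.
CREATE TABLE rk_csv_demo (
  `index`       string,
  `country`     string,
  `description` string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  "separatorChar" = ",",
  "quoteChar"     = "\"",
  "escapeChar"    = "\\")
STORED AS TEXTFILE;
```

For the sample row in the question, whose description field contains commas inside double quotes, the OpenCSVSerde variant is probably the simpler fit; numeric columns can then be cast in a query or a view.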

Why are you providing these settings in the table DDL:

'COLUMN_STATS_ACCURATE'='true'
'numFiles'='1', 
'totalSize'='52796693', 
'transient_lastDdlTime'='1516088117'

All of these should be set automatically after the DDL and ANALYZE.
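As a sketch of that workflow: leave the statistics-related properties out of the CREATE TABLE statement (keeping only properties you control, such as 'skip.header.line.count'), then let Hive gather the statistics once data has been loaded. The table name is the one from the question.

```sql
-- Gather basic table statistics (for example numFiles, totalSize, numRows).
ANALYZE TABLE rk_test22 COMPUTE STATISTICS;

-- Inspect the table parameters that Hive maintains on its own.
DESCRIBE FORMATTED rk_test22;
```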
