Create external table from csv on HDFS, all values come with quotes

I have a csv file on HDFS and I am trying to create an Impala table from it. The table gets created, but every value still includes the double quotes (").

CREATE EXTERNAL TABLE abc.def
(
  name STRING,
  title STRING,
  last STRING,
  pno STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 'hdfs:pathlocation'
TBLPROPERTIES ("skip.header.line.count"="1");

The output is
name title last pno
"abc" "mr" "xyz" "1234"
"rew" "ms" "pre" "654"

I just want to create the table from the csv file without the quotes. Please guide me on where I am going wrong. Regards, R

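One approach is to create a staging table, load the file into it with the quotes still in place, and then use CTAS (CREATE TABLE AS SELECT) to build the final table, cleaning each field with the replace function. For example: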

CREATE TABLE quote_stage(
 id STRING,
 name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
+-----+----------+
| id  | name     |
+-----+----------+
| "1" | "pepe"   |
| "2" | "ana"    |
| "3" | "maria"  |
| "4" | "ramon"  |
| "5" | "lucia"  |
| "6" | "carmen" |
| "7" | "alicia" |
| "8" | "pedro"  |
+-----+----------+
CREATE TABLE t_quote 
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
AS SELECT replace(id,'"','') AS id, replace(name,'"','') AS name FROM quote_stage;
+----+--------+
| id | name   |
+----+--------+
| 1  | pepe   |
| 2  | ana    |
| 3  | maria  |
| 4  | ramon  |
| 5  | lucia  |
| 6  | carmen |
| 7  | alicia |
| 8  | pedro  |
+----+--------+
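
Applied to the table in the question, the same approach might look roughly like the sketch below. The names abc.def_stage and abc.def_clean are made up for illustration, 'hdfs:pathlocation' is the placeholder path from the question, and the backticks around last are only a precaution in case your Impala version treats it as a reserved word.

```sql
-- Hypothetical staging table: same columns and placeholder location as in
-- the question; the values still carry their double quotes at this point.
CREATE EXTERNAL TABLE abc.def_stage (
  name   STRING,
  title  STRING,
  `last` STRING,
  pno    STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 'hdfs:pathlocation'
TBLPROPERTIES ("skip.header.line.count"="1");

-- CTAS: copy the rows into a new (hypothetical) table, removing every
-- double quote from each column.
CREATE TABLE abc.def_clean AS
SELECT
  replace(name,   '"', '') AS name,
  replace(title,  '"', '') AS title,
  replace(`last`, '"', '') AS `last`,
  replace(pno,    '"', '') AS pno
FROM abc.def_stage;
```

If replace is not available in your Impala version, regexp_replace(name, '"', '') should do the same job here. Note that a delimited text table splits on every comma, so a quoted value that itself contains a comma would still end up spread across columns; this sketch does not handle that case.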

Hope this helps.