Create external table from CSV on HDFS, all values come with quotes
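I have a CSV file on HDFS and I am trying to create an Impala table from it. The table gets created, but every value still includes the surrounding double quotes (").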
CREATE external TABLE abc.def
(
name STRING,
title STRING,
last STRING,
pno STRING
)
row format delimited fields terminated by ','
location 'hdfs:pathlocation'
tblproperties ("skip.header.line.count"="1") ;
The output is:
name title last pno
"abc" "mr" "xyz" "1234"
"rew" "ms" "pre" "654"
I just want to create the table from the CSV file without the quotes. Please guide me on where I am going wrong.
Regards,
R
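One approach is to create a staging table, load the quoted file into it, and then use CTAS (CREATE TABLE AS SELECT) to create the final table, cleaning each field with the replace function.
For example: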
CREATE TABLE quote_stage(
id STRING,
name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
+-----+----------+
| id | name |
+-----+----------+
| "1" | "pepe" |
| "2" | "ana" |
| "3" | "maria" |
| "4" | "ramon" |
| "5" | "lucia" |
| "6" | "carmen" |
| "7" | "alicia" |
| "8" | "pedro" |
+-----+----------+
CREATE TABLE t_quote
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
AS SELECT replace(id,'"','') AS id, replace(name,'"','') AS name FROM quote_stage;
+----+--------+
| id | name |
+----+--------+
| 1 | pepe |
| 2 | ana |
| 3 | maria |
| 4 | ramon |
| 5 | lucia |
| 6 | carmen |
| 7 | alicia |
| 8 | pedro |
+----+--------+
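Applied to the table from the question, the same idea could look roughly like the sketch below. It assumes abc.def stays as the external staging table that still holds the quoted values. The name abc.def_clean is hypothetical. The sketch uses regexp_replace with anchors instead of replace, so that only a leading or trailing quote is removed and any quote inside a value is left alone. The backticks around last are a precaution, since it may be treated as a keyword depending on the Impala version.

```sql
-- Sketch only: abc.def_clean is a hypothetical name for the cleaned table.
-- abc.def is the external table from the question (values still quoted).
CREATE TABLE abc.def_clean
STORED AS TEXTFILE
AS SELECT
  regexp_replace(name,   '^"|"$', '') AS name,   -- strip leading/trailing quote
  regexp_replace(title,  '^"|"$', '') AS title,
  regexp_replace(`last`, '^"|"$', '') AS `last`,
  regexp_replace(pno,    '^"|"$', '') AS pno
FROM abc.def;
```

Note that this does not help if a quoted value itself contains a comma, because the staging table has already split the line on every comma before the quotes are stripped.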
Hope this helps.