Load NULL TIMESTAMP with TIME ZONE using COPY FROM in PostgreSQL
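I have a CSV file that I am trying to load into a PostgreSQL 9.2.4 database using the COPY FROM command. In particular, there is a timestamp field that is allowed to be null, but when I load a "null value" (actually just "") I get the following error: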
ERROR: invalid input syntax for type timestamp with time zone: ""
An example of the CSV file looks like this:
id,name,joined
1,"bob","2013-10-02 15:27:44-05"
2,"jane",""
The SQL looks as follows:
CREATE TABLE "users"
(
    "id" BIGSERIAL NOT NULL PRIMARY KEY,
    "name" VARCHAR(255),
    "joined" TIMESTAMP WITH TIME ZONE
);
COPY "users" ("id", "name", "joined")
FROM '/path/to/data.csv'
WITH (
ENCODING 'utf-8',
HEADER 1,
FORMAT 'csv'
);
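According to the documentation, null values should be represented by an empty string that cannot contain the quote character, which in this case is the double quote ("):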
NULL
Specifies the string that represents a null value. The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format. You might prefer an empty string even in text format for cases where you don't want to distinguish nulls from empty strings. This option is not allowed when using binary format.
Note: When using COPY FROM, any data item that matches this string will be stored as a null value, so you should make sure that you use the same string as you used with COPY TO.
I have tried the NULL '' option, but it seems to have no effect. Please advise!
An unquoted empty string works fine:
id,name,joined
1,"bob","2013-10-02 15:27:44-05"
2,"jane",
select * from users;
 id | name |         joined
----+------+------------------------
  1 | bob  | 2013-10-03 03:27:44+07
  2 | jane |
Maybe it would be simpler to use sed to replace "" with an empty string.
The FORCE_NULL option for COPY FROM in Postgres 9.4+ would be the most elegant way to solve your problem. Per documentation:
FORCE_NULL
Match the specified columns' values against the null string, even if it has been quoted, and if a match is found set the value to NULL. In the default case where the null string is empty, this converts a quoted empty string into NULL. This option is allowed only in COPY FROM, and only when using CSV format.
Of course, it converts every matching value in the columns you list.
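As a minimal sketch of how this could look for the table in the question, assuming a server running PostgreSQL 9.4 or later and the same file path and column names the question uses:

```sql
-- Assumes PostgreSQL 9.4+ and the "users" table from the question.
COPY "users" ("id", "name", "joined")
FROM '/path/to/data.csv'
WITH (
    FORMAT csv,
    HEADER true,
    ENCODING 'utf-8',
    FORCE_NULL ("joined")  -- a quoted empty string ("") in "joined" is read as NULL
);
```

Only "joined" is listed here, so a quoted empty string in "name" would still be loaded as an empty string rather than NULL.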
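In older versions, you can COPY to a temporary table with the same table layout, except with data type text for the problem column. Then fix the offending values and INSERT from there: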
- single quotes appear arround value after running copy in postgres 9.2
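The staging approach described above might be sketched roughly as follows. The table name users_staging is hypothetical, and using NULLIF to turn empty strings into NULL is one possible way to "fix" the values, not necessarily what the linked answer does:

```sql
BEGIN;

-- Hypothetical staging table: same layout as "users",
-- but "joined" is text so that a quoted empty string loads without error.
CREATE TEMP TABLE users_staging (
    id     bigint,
    name   varchar(255),
    joined text
) ON COMMIT DROP;

COPY users_staging (id, name, joined)
FROM '/path/to/data.csv'
WITH (FORMAT csv, HEADER true, ENCODING 'utf-8');

-- NULLIF maps '' to NULL before the cast to timestamp with time zone.
INSERT INTO "users" ("id", "name", "joined")
SELECT id, name, NULLIF(joined, '')::timestamptz
FROM users_staging;

COMMIT;
```

Running everything in one transaction means the temporary table is dropped automatically at COMMIT, and a failed cast rolls back the whole load.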
I couldn't get it to work. I ended up using this program:
http://neilb.bitbucket.org/csvfix/
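With it, you can replace empty fields with other values.
For example, in your case column 3 needs to have a timestamp value, so I gave it a fake one, in this case '1900-01-01 00:00:00'. If needed, you can delete or filter out those values after importing the data.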
$CSVFIXHOME/csvfix map -f 3 -fv '' -tv '1900-01-01 00:00:00' -rsep ',' $YOURFILE > $FILEWITHDATES
After that, you can import the newly created file.