Unable to insert numeric values into integer column in postgres

Question

I have already looked at this post, this post, and this post, so please do not mark this as a duplicate.

I have a column of raw data in a pandas DataFrame, called temp_id, as shown below.

Because of the NA values, the column's dtype is float64. In a Jupyter notebook it looks like this:

temp_id
55608.0
55609.0
NaN        
55610.0
NaN        
55611.0

In the CSV file, the same column looks like this:

temp_id
55608
55609
        #empty row indicating NA        
55610
        #empty row indicating NA
55611

I am trying to copy this data into a PostgreSQL table with the following definition. Note that the column is not a primary key and may contain empty rows.

CREATE TABLE temp(
  temp_id integer
  
);

When I try to copy the data, I get the following error:

ERROR:  invalid input syntax for integer: "55608.0"
CONTEXT:  COPY temp, line 2, column temp_id: "55608.0"

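How can I avoid this and insert this data into an integer column in the PostgreSQL table? I ran the import through the CSV import dialog in pgAdmin (the settings I entered there are not reproduced here).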

Answer 1

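The column you are trying to insert contains NaN (or None). The catch is that floating-point types have a special NaN value, but plain integer types do not. So when pandas reads the CSV file, it treats the entire column as floats.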

a = [1, 2, 3.01] # Will be float when read by Pandas.
b = [1, 2, None] # Will be float when read by Pandas.
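
To make the dtype behavior concrete, here is a minimal sketch. The variable names are illustrative and the values are borrowed from the question; it only assumes pandas' default dtype inference.

```python
import pandas as pd

# A column of whole numbers with nothing missing gets an integer dtype.
no_missing = pd.Series([55608, 55609, 55610])

# A single missing value makes pandas fall back to float64,
# because the default integer dtype has no way to store NaN.
with_missing = pd.Series([55608, 55609, None, 55610])

print(no_missing.dtype)
print(with_missing.dtype)   # float64
print(with_missing.tolist())
```

This is why the values reach the database as "55608.0" rather than "55608".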

Solutions

Drop the rows that contain NaN and cast the column to int:

import pandas as pd

df = pd.DataFrame(dict(col=[1, 2, 3, 4, None]))
df = df.dropna()
df = df.astype(int)

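Some SQL databases use "NULL" to represent missing values, but it has to be sent as a string. In the database the column stays an integer column, but it must be nullable.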

import pandas as pd

df = pd.DataFrame(dict(col=[1, 2, 3, 4, None]))
# Note: pandas accepts mixed-type columns; after fillna the column dtype is object.
df = df.fillna('NULL')
df = df.astype(str)
# Strip only a trailing ".0" so that a value such as "1.05" is left untouched.
df['col'] = df['col'].str.replace(r'\.0$', '', regex=True)

Answer 2

The answer is similar to the one posted by @Lukas Thaler, but I had to use astype('Int64') instead of astype(int).

df['temp_id'] = df['temp_id'].astype('Int64')

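This cleanly converts the column containing NA to the Int64 type, and I was able to upload it successfully.

So it is 'Int64' (pandas' nullable integer dtype, with a capital I), not int.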
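
As a rough end-to-end sketch of this approach, the snippet below converts a float column with missing values to the nullable integer dtype and writes it out as CSV. The sample values mirror the question; the in-memory buffer is an assumption made so the example is self-contained, and a real export would pass a file path to to_csv instead.

```python
import io
import pandas as pd

# Same shape as the temp_id column in the question (float64 because of the NaNs).
df = pd.DataFrame({"temp_id": [55608.0, 55609.0, None, 55610.0, None, 55611.0]})

# 'Int64' (capital I) is the nullable integer dtype; NaN becomes <NA>.
df["temp_id"] = df["temp_id"].astype("Int64")

# Write to an in-memory buffer purely for illustration.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

print(df["temp_id"].dtype)  # Int64
print(csv_text)
```

The integers are now written without the trailing ".0". How missing values are written is controlled by the na_rep parameter of to_csv, so it is worth inspecting the output to confirm it matches what the import expects for NULL.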

