Psycopg2 does not auto-generate the id when using copy_from to load a CSV file into a Postgres database

Question

I have a CSV file with several columns:

upc date quantity customer

In my table, physical, there is an id column that is auto-generated for each row:

id upc date quantity customer

When I run my Python script to copy the file into the database, the database appears to interpret upc as the actual id. I get this error message:

Error: value "1111111" is out of range for type integer
CONTEXT:  COPY physical, line 1, column id: "1111111"

I have never tried this before, but I believe this is correct:

def insert_csv(f, table):
    connection = get_postgres_connection()
    cursor = connection.cursor()
    try:
        cursor.copy_from(f, table, sep=',')
        connection.commit()
        return True
    except (psycopg2.Error) as e:
        print(e)
        return False
    finally:
        cursor.close()
        connection.close()

Am I doing something wrong, or do I have to write another script to fetch the last id from the table?

Updated working code:

def insert_csv(f, table, columns):
    connection = get_postgres_connection()
    cursor = connection.cursor()
    try:
        column_names = ','.join(columns)
        query = f'''
            COPY {table}({column_names})
            FROM STDIN (FORMAT CSV)
        '''
        cursor.copy_expert(query, f)
        connection.commit()
        return True
    except (psycopg2.Error) as e:
        print(e)
        return False
    finally:
        cursor.close()
        connection.close()

columns = (
    "upc",
    "date_thru",
    "transaction_type",
    "transaction_type_subtype",
    "country_code",
    "customer",
    "quantity",
    "income_gross",
    "fm_serial",
    "date_usage",
)

with open(dump_file, 'r', newline='', encoding="ISO-8859-1") as f:
    inserted = insert_csv(f, 'physical', columns)

Answer 1

From the COPY documentation:

If a column list is specified, COPY TO copies only the data in the specified columns to the file. For COPY FROM, each field in the file is inserted, in order, into the specified column. Table columns not specified in the COPY FROM column list will receive their default values.

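So the values in the CSV file are assigned to the table's columns from left to right, which is why upc ends up in id. Only columns left out of the column list receive their DEFAULT values. If that is not what you want, then from the copy_from documentation: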

columns – iterable with name of the columns to import. The length and types should match the content of the file to read. If not specified, it is assumed that the entire table matches the file structure.

Create a column list that matches the file structure and omits the id column, which will then be filled from the sequence.
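
To make the left-to-right rule concrete, here is a toy model in plain Python. It is only an illustration, not how PostgreSQL is implemented, and it ignores type checks and field-count errors. The function name map_fields and every sample value other than 1111111 are made up.

```python
# Toy model of how COPY FROM pairs file fields with table columns.
# Column names come from the question; the sample values are hypothetical.
TABLE_COLUMNS = ["id", "upc", "date", "quantity", "customer"]


def map_fields(fields, column_list=None):
    """Pair each field with a target column, in order."""
    targets = column_list if column_list is not None else TABLE_COLUMNS
    return dict(zip(targets, fields))


csv_fields = ["1111111", "2020-01-31", "5", "ACME"]

# No column list: the first field is paired with id.
print(map_fields(csv_fields))

# Column list without id: upc gets the first field and id is not touched,
# so the database is free to fill it with its default.
print(map_fields(csv_fields, ["upc", "date", "quantity", "customer"]))
```

In the first call the upc value lands in id, which matches the error in the question. In the second call id does not appear at all.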

Answer 2

You need to specify the columns to import. From the documentation:

columns – iterable with name of the columns to import. The length and types should match the content of the file to read. If not specified, it is assumed that the entire table matches the file structure.

Your code might look like this:

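import psycopg2

def insert_csv(f, table, columns):
    connection = connect()  # placeholder for however you open the connection
    cursor = connection.cursor()
    try:
        # columns names the table columns that match the file's fields, in order
        cursor.copy_from(f, table, sep=',', columns=columns)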
        connection.commit()
        return True
    except (psycopg2.Error) as e:
        print(e)
        return False
    finally:
        cursor.close()
        connection.close()
        
with open("path_to_my_csv") as file:
    insert_csv(file, "my_table", ("upc", "date", "quantity", "customer"))
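
One caveat, as far as I can tell from the psycopg2 documentation: copy_from sends the data in PostgreSQL's text format with sep as the delimiter, not in CSV format, so quoted fields are not interpreted. The sketch uses only the standard library to show the difference on a made-up line in which the customer name contains a comma. It does not touch the database.

```python
import csv

# Hypothetical CSV line: the customer field contains a comma and is quoted.
line = '1111111,2020-01-31,5,"Smith, John"'

# Splitting on the delimiter alone, which is roughly what a plain
# delimiter-separated format does, breaks the quoted field in two.
naive_fields = line.split(",")
print(len(naive_fields), naive_fields)

# A CSV-aware parser keeps the quoted field together.
csv_fields = next(csv.reader([line]))
print(len(csv_fields), csv_fields)
```

If the file can contain quoted fields like this, running COPY with FORMAT CSV is the safer choice.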

If you have to use copy_expert(), modify your function as follows:

def insert_csv(f, table, columns):
    connection = connect()
    cursor = connection.cursor()
    try:
        column_names = ','.join(columns)
        # COPY ... FROM STDIN reads the rows that the client streams from f
        copy_cmd = f"copy {table}({column_names}) from stdin (format csv)"
        cursor.copy_expert(copy_cmd, f)
        connection.commit()
        return True
    except (psycopg2.Error) as e:
        print(e)
        return False
    finally:
        cursor.close()
        connection.close()
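
The copy_expert() version builds the statement with an f-string, which is fine for names you control. If the table or column names come from somewhere else, a sketch like the following quotes them first. quote_ident, build_copy_sql and copy_csv are hypothetical helpers, not part of psycopg2 (which ships its own psycopg2.sql module for composing statements), and the cursor argument is assumed to be a psycopg2 cursor.

```python
def quote_ident(name):
    """Wrap an identifier in double quotes, doubling any embedded quote."""
    return '"' + name.replace('"', '""') + '"'


def build_copy_sql(table, columns):
    """Build a COPY ... FROM STDIN statement.

    Assumes an unqualified table name: "schema.table" would be quoted
    as one identifier, which is not what you want.
    """
    column_sql = ", ".join(quote_ident(c) for c in columns)
    return f"COPY {quote_ident(table)} ({column_sql}) FROM STDIN (FORMAT CSV)"


def copy_csv(cursor, f, table, columns):
    """Stream the open file f into table through the given cursor."""
    cursor.copy_expert(build_copy_sql(table, columns), f)


print(build_copy_sql("physical", ("upc", "date", "quantity", "customer")))
```

Keep in mind that quoted identifiers are case-sensitive in PostgreSQL, so the names passed in have to match the table definition exactly.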

Tags: python, postgresql, csv, copy, psycopg2