PostGres - Insert list of tuples using execute instead of executemany

Question

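I'm inserting thousands of rows, and timing and speed are very important. Benchmarking showed me that Postgres ingests my rows faster with a single execute() than with executemany().

This works well for me: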

...

def insert(self, table, columns, values):
    conn = self.connectionPool.getconn()
    conn.autocommit = True

    try:
        with conn.cursor() as cursor:
            query = (
                 f'INSERT INTO {table} ({columns}) '
                 f'VALUES {values} '
                 f'ON CONFLICT DO NOTHING;'
             ).replace('[', '').replace(']', '')  # Notice the replace x2 to get rid of the list brackets

            print(query)
            cursor.execute(query)
    finally:
        cursor.close()
        self.connectionPool.putconn(conn)

...

self.insert('types', 'name, created_at', rows)

After the double replace, printing the query returns something like this, and the rows are ingested:

INSERT INTO types (name, created_at) VALUES ('TIMER', '2022-04-09 03:19:49'), ('Sequence1', '2022-04-09 03:19:49') ON CONFLICT DO NOTHING;

Is my approach safe? Is there a more pythonic implementation using execute?

Answer 1

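No, this is not safe, and not even reliable: Python's repr is not compatible with PostgreSQL string syntax (try some values containing single quotes, newlines, or backslashes).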
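To see the mismatch concretely, the short sketch below prints what Python produces for a few awkward strings. The sample values are made up for illustration and do not come from the question's data.

```python
# Illustrative values only.
samples = ["O'Brien", "line1\nline2", "back\\slash"]

for value in samples:
    # repr() follows Python quoting and escaping rules, not PostgreSQL ones.
    print(repr(value))

# Formatting a list of tuples into an f-string goes through the same repr logic.
rows = [("O'Brien", "2022-04-09 03:19:49")]
values_fragment = str(rows).replace("[", "").replace("]", "")
print(values_fragment)
```

A string containing a single quote is rendered with double quotes, which PostgreSQL reads as a quoted identifier rather than a string literal. The two replace calls would also strip any square brackets that appear inside the data itself.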

Consider passing array parameters instead and using UNNEST:
```
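cursor.execute(
    # One array parameter per column; UNNEST zips the arrays back into rows.
    # The alias list names the UNNEST output columns so SELECT can refer to them.
    # If created_at is a timestamp column, the string array may need an
    # explicit cast such as %(created_ats)s::timestamp[].
    "INSERT INTO types (name, created_at)"
    " SELECT name, created_at"
    " FROM UNNEST (%(names)s, %(created_ats)s) AS t (name, created_at)",
    {
        'names': ['TIMER', 'Sequence1', ...],
        'created_ats': ['2022-04-09 03:19:49', ...],
    })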
```
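This is the best solution because the query text does not depend on the parameters: it can be prepared and cached, statistics can easily be grouped per query, the absence of SQL injection vulnerabilities is obvious, and the query can easily be logged without the data.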
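Since the question starts from a list of row tuples, the rows have to be transposed into one list per column before they can be passed as array parameters. A minimal sketch, assuming a hypothetical helper named `rows_to_column_arrays` and the parameter names `names` and `created_ats` used in the UNNEST query:

```python
def rows_to_column_arrays(rows, param_names):
    """Transpose a list of row tuples into a dict with one list per column."""
    for row in rows:
        if len(row) != len(param_names):
            raise ValueError(f"expected {len(param_names)} values, got {len(row)}")
    return {
        name: [row[index] for row in rows]
        for index, name in enumerate(param_names)
    }


rows = [
    ('TIMER', '2022-04-09 03:19:49'),
    ('Sequence1', '2022-04-09 03:19:49'),
]
params = rows_to_column_arrays(rows, ['names', 'created_ats'])
print(params)
```

The resulting dict can be passed as the second argument to `cursor.execute`; psycopg2 adapts Python lists to PostgreSQL arrays.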
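Failing that, build a query that is dynamic only in the number of parameters, e.g. VALUES (%s, %s, ...), (%s, %s, ...), .... Note that PostgreSQL has a parameter limit, so you may need to produce these in batches.
Failing that, use psycopg2.sql.Literal.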
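The placeholder-only approach could be sketched roughly as follows. The helper name `build_insert_batches` and the `max_params` default are assumptions chosen for illustration, not values taken from PostgreSQL or psycopg2. The table and column names are interpolated as-is, so they must come from trusted code, never from user input.

```python
def build_insert_batches(table, columns, rows, max_params=30000):
    """Yield (query, flat_params) pairs whose text varies only in placeholder count."""
    # max_params is an arbitrary illustrative cap on placeholders per statement.
    rows_per_batch = max(1, max_params // len(columns))
    row_placeholder = "(" + ", ".join(["%s"] * len(columns)) + ")"
    column_list = ", ".join(columns)

    for start in range(0, len(rows), rows_per_batch):
        batch = rows[start:start + rows_per_batch]
        values_clause = ", ".join([row_placeholder] * len(batch))
        query = (
            f"INSERT INTO {table} ({column_list}) "
            f"VALUES {values_clause} "
            "ON CONFLICT DO NOTHING"
        )
        # Flatten the row tuples so the values line up with the placeholders.
        flat_params = [value for row in batch for value in row]
        yield query, flat_params


rows = [
    ('TIMER', '2022-04-09 03:19:49'),
    ('Sequence1', '2022-04-09 03:19:49'),
]
single = list(build_insert_batches('types', ['name', 'created_at'], rows))
split = list(build_insert_batches('types', ['name', 'created_at'], rows, max_params=2))
print(single[0][0])
print(len(split))
```

Each pair would then be passed to `cursor.execute(query, flat_params)`, so the data never becomes part of the query text. psycopg2 also ships `psycopg2.extras.execute_values`, which pages a VALUES list in a similar way.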

python

psycopg2