How does a psycopg2 server side cursor operate when itersize is less than the data size and the fetch number is less than itersize?
I have read through the docs and several articles, posts, and threads, but I'm not sure I understand this clearly. Let's assume this scenario:
1. I have a server side cursor.
2. I set the itersize to 1000.
3. I execute a SELECT query which would normally return 10000 records.
4. I use fetchmany to fetch 100 records at a time.
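My question is: how is this done behind the scenes? My understanding is that the query is executed, but only 1000 of the records are read by the server side cursor. The cursor refrains from reading the next 1000 until it has scrolled past the last record of the 1000 it currently holds. Furthermore, the server side cursor keeps those 1000 in the server's memory and scrolls over them 100 at a time, sending them to the client. I'm also curious what the RAM consumption looks like. As I understand it, if executing the full query takes 10000 KB of memory, the server side cursor would consume only 1000 KB on the server, because it reads only 1000 records at a time, and the client side cursor would use 100 KB. Is my understanding correct?
UPDATE
Per the documentation and the discussion we had in the responses, I would expect this code to print a list of 10 items at a time: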
from psycopg2 import connect, extras as psg_extras
with connect(host="db_url", port="db_port", dbname="db_name", user="db_user", password="db_password") as db_connection:
    with db_connection.cursor(name="data_operator",
                              cursor_factory=psg_extras.DictCursor) as db_cursor:
        db_cursor.itersize = 10
        db_cursor.execute("SELECT rec_pos FROM schm.test_data;")
        for i in db_cursor:
            print(i)
            print(">>>>>>>>>>>>>>>>>>>")
However, in each iteration it prints just a single record. The only way I get 10 records is if I use fetchmany:
from psycopg2 import connect, extras as psg_extras
with connect(host="db_url", port="db_port", dbname="db_name", user="db_user", password="db_password") as db_connection:
    with db_connection.cursor(name="data_operator",
                              cursor_factory=psg_extras.DictCursor) as db_cursor:
        db_cursor.execute("SELECT rec_pos FROM schm.test_data;")
        records = db_cursor.fetchmany(10)
        while len(records) > 0:
            # Print the current batch (the original printed an undefined name, i).
            print(records)
            print(">>>>>>>>>>>>>>>>>>>")
            records = db_cursor.fetchmany(10)
Based on these two snippets, my guess is that in the scenario mentioned before, given the code below...
from psycopg2 import connect, extras as psg_extras
with connect(host="db_url", port="db_port", dbname="db_name", user="db_user", password="db_password") as db_connection:
    with db_connection.cursor(name="data_operator",
                              cursor_factory=psg_extras.DictCursor) as db_cursor:
        db_cursor.itersize = 1000
        db_cursor.execute("SELECT rec_pos FROM schm.test_data;")
        records = db_cursor.fetchmany(100)
        while len(records) > 0:
            # Print the current batch (the original printed an undefined name, i).
            print(records)
            print(">>>>>>>>>>>>>>>>>>>")
            records = db_cursor.fetchmany(100)
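... itersize is a server side thing. When the query runs, it sets a limit so that only 1000 records are loaded from the database. fetchmany, on the other hand, is a client side thing. It gets 100 of the 1000 from the server. Each time fetchmany runs, the next 100 are fetched from the server. Once all 1000 on the server side have been scrolled over, the next 1000 are fetched from the database on the server side. But I'm rather confused, because that does not seem to be what the docs imply. Then again... the code seems to imply it.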
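I would spend some time with the "Server side cursors" section of the psycopg2 documentation.
You will find that itersize only applies when you are iterating over the cursor: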
for record in cur:
    print(record)
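Since you are using fetchmany(size=100), you will only be working with 100 rows at a time. The server will not be holding 1000 rows in memory. (I was wrong about that.) If a named cursor is not used, the cursor returns all the rows to the client in memory, and fetchmany() then pulls rows from there in the specified batch size. If a named cursor is used, fetchmany() fetches from the server in that batch size.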
UPDATE. Showing how itersize and fetchmany() work.
Using itersize and fetchmany() with a named cursor:
cur = con.cursor(name='cp')
cur.itersize = 10
cur.execute("select * from cell_per")
for rs in cur:
    print(rs)
cur.close()
#Log
statement: DECLARE "cp" CURSOR WITHOUT HOLD FOR select * from cell_per
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: CLOSE "cp"
cur = con.cursor(name='cp')
cur.execute("select * from cell_per")
cur.fetchmany(size=10)
#Log
statement: DECLARE "cp" CURSOR WITHOUT HOLD FOR select * from cell_per
statement: FETCH FORWARD 10 FROM "cp"
Using fetchmany with an unnamed cursor:
cur = con.cursor()
cur.execute("select * from cell_per")
rs = cur.fetchmany(size=10)
len(rs)
10
#Log
statement: select * from cell_per
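So a named cursor fetches rows from the server in batches: the batch size is itersize when the cursor is iterated over, and size when fetchmany(size=n) is used. A non-named cursor pulls all the rows into client memory and then hands them out from there according to the size set in fetchmany(size=n).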
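The batching described above can be mimicked without a database. The following is only a toy sketch: SimulatedServerCursor is a hypothetical class, not psycopg2's implementation, and its stopping rule (iteration ends when a fetch comes back empty) is an assumption made for illustration. It records the FETCH statements a server would see in each access pattern.

```python
class SimulatedServerCursor:
    """Toy stand-in for a named cursor; NOT psycopg2's actual code."""

    def __init__(self, rows, itersize=2000):
        self._rows = list(rows)   # rows that stay "on the server"
        self._pos = 0             # position of the server-side cursor
        self.itersize = itersize  # only consulted by __iter__
        self.log = []             # statements the server would receive

    def _fetch_forward(self, n):
        # One simulated round trip: ask for the next n rows.
        self.log.append(f"FETCH FORWARD {n}")
        batch = self._rows[self._pos:self._pos + n]
        self._pos += len(batch)
        return batch

    def fetchmany(self, size):
        # Batch size comes from the argument; itersize is ignored here.
        return self._fetch_forward(size)

    def __iter__(self):
        # Rows travel in batches of itersize but are yielded one at a time.
        while True:
            batch = self._fetch_forward(self.itersize)
            if not batch:  # assumed stopping rule: an empty fetch ends iteration
                return
            yield from batch


iterated = SimulatedServerCursor(range(25), itersize=10)
rows = list(iterated)
print(len(rows), iterated.log)

fetched = SimulatedServerCursor(range(25), itersize=10)
fetched.fetchmany(20)
print(fetched.log)
```

In this sketch the client never holds more than one batch at a time, which is the property that matters for memory use with a named cursor.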
Further update.
itersize only applies when you are iterating over the cursor object itself:
cur = con.cursor(name="cp")
cur.itersize = 10
cur.execute("select * from cell_per")
for r in cur:
    print(r)
cur.close()
#Postgres log:
statement: DECLARE "cp" CURSOR WITHOUT HOLD FOR select * from cell_per
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: FETCH FORWARD 10 FROM "cp"
statement: CLOSE "cp"
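In the above, r is a single row taken from each batch of 10 rows that the server side (named) cursor returns. That batch size equals itersize. So when you iterate over the named cursor object itself, all the rows the query specifies are returned by the iterator one at a time; they are merely transferred in batches of itersize.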
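If the goal is to receive a list of rows per loop step while still iterating, the rows can be grouped on the client. A minimal sketch, assuming a hypothetical helper named in_chunks (not part of psycopg2) that works on any iterable; a plain range stands in for the cursor here:

```python
from itertools import islice


def in_chunks(rows, n):
    """Yield lists of up to n items from any iterable, e.g. a cursor."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, n))  # take the next n items, fewer at the end
        if not chunk:
            return
        yield chunk


# A range stands in for the cursor; each loop step receives a list.
for chunk in in_chunks(range(7), 3):
    print(chunk)
```

Used on a named cursor, this changes only how rows are grouped on the client. The transfer from the server would still be governed by itersize.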
Without iterating over the named cursor object, using fetchmany(size=n):
cur = con.cursor(name="cp")
cur.itersize = 10
cur.execute("select * from cell_per")
cur.fetchmany(size=20)
cur.fetchmany(size=20)
cur.close()
#Postgres log:
statement: DECLARE "cp" CURSOR WITHOUT HOLD FOR select * from cell_per
statement: FETCH FORWARD 20 FROM "cp"
statement: FETCH FORWARD 20 FROM "cp"
statement: CLOSE "cp"
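itersize was set, but it has no effect because the named cursor object is not being iterated over. Instead, fetchmany(size=20) has the server side cursor send a batch of 20 records each time it is called.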
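The fetchmany loop from the question can be wrapped in a generator so that each step yields one batch. This is a sketch under assumptions: fetch_batches and StubCursor are hypothetical names, and the stub only imitates the DB-API behaviour of fetchmany returning an empty list once the rows run out.

```python
def fetch_batches(cursor, size):
    """Yield successive fetchmany(size) results until none are left."""
    while True:
        records = cursor.fetchmany(size)
        if not records:  # an empty result means the cursor is exhausted
            return
        yield records


class StubCursor:
    """Hypothetical stand-in exposing only fetchmany, for a dry run."""

    def __init__(self, rows):
        self._rows = list(rows)

    def fetchmany(self, size):
        # Hand out the next `size` rows and keep the remainder.
        batch, self._rows = self._rows[:size], self._rows[size:]
        return batch


for records in fetch_batches(StubCursor(range(5)), 2):
    print(records)
```

With a real named cursor in place of the stub, each step would correspond to one FETCH FORWARD of the requested size, as the log above shows.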