cProfile 是在背叛我吗?
Is cProfile betraying me?
我想知道为什么我的基于 pyzmq
和 protobuf
的消息传递乒乓比预期的慢得多,所以我使用 cProfile
检查你最后找到的脚本post.
protoc --python_out=. rpc.proto
python -m cProfile -o rpc.pstats ./test_rpc.py
returns
3.604 sec for 10000 messages, 360.41us/m, 2774 m/s
和
python -m pstats rpc.pstats
rpc.pstats% sort tottime
rpc.pstats% stats 10
给我(仅针对客户端进程):
619163 function calls (618374 primitive calls) in 3.779 seconds
Ordered by: internal time
List reduced from 580 to 30 due to restriction <30>
ncalls tottime percall cumtime percall filename:lineno(function)
10002 2.658 0.000 2.658 0.000 {method 'recv' of 'zmq.backend.cython.socket.Socket' objects}
10002 0.088 0.000 0.088 0.000 {method 'send' of 'zmq.backend.cython.socket.Socket' objects}
10001 0.060 0.000 3.654 0.000 ./test_rpc.py:36(rpc_set)
10002 0.058 0.000 3.457 0.000 ./test_rpc.py:32(zmq_reply)
80016 0.056 0.000 0.056 0.000 {method 'write' of 'cStringIO.StringO' objects}
10002 0.056 0.000 0.424 0.000 /usr/lib/python2.7/site-packages/google/protobuf/internal/python_message.py:781(InternalSerialize)
30004 0.054 0.000 0.137 0.000 /usr/lib/python2.7/site-packages/google/protobuf/internal/python_message.py:453(setter)
20002 0.051 0.000 0.058 0.000 /usr/lib/python2.7/site-packages/google/protobuf/internal/type_checkers.py:113(CheckValue)
10002 0.050 0.000 0.055 0.000 /usr/lib/python2.7/site-packages/google/protobuf/internal/python_message.py:839(IsInitialized)
10002 0.050 0.000 0.148 0.000 /usr/lib/python2.7/site-packages/google/protobuf/internal/python_message.py:577(ListFields)
...
奇怪的是:pyzmq
的 recv()
/send()
似乎消耗了 2700 毫秒 而 protobuf
仅消耗约 250 毫秒.
但事实并非如此!
如果我去掉 protobuf
部分,同一进程在同一系统上仅消耗大约 1350 毫秒 (-65%)。 (我没有添加 pyzmq
-only 脚本,但您可以只发送几个字节而不是序列化数据)
这些额外的 65% 几乎 100% 归因于 pyzmq
,实际上由 protobuf
消耗。
问题:这是怎么回事?在这种情况下,如何以指向 protobuf
而不是 pyzmq
的方式分析我的脚本?
为了重现日期,您必须安装 protobuf-python
和 python-zmq
。以下是用于此实验的脚本:
test_rpc.py:
import sys
import time
import threading
import subprocess
import zmq
import rpc_pb2 # must be generated first
if '--server' in sys.argv:
print('zmq_protobuf_rpc_server')
context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://*:5556")
request = rpc_pb2.rpc_request()
reply = rpc_pb2.rpc_reply()
while True:
request.ParseFromString(socket.recv())
if request.type == rpc_pb2.rpc_request.SET:
value = request.value
elif request.type == rpc_pb2.rpc_request.GET:
reply.value = "value"
socket.send(reply.SerializeToString())
if request.type == rpc_pb2.rpc_request.QUIT:
break
else:
def zmq_reply(req_msg, rep_msg, socket):
socket.send(req_msg.SerializeToString())
rep_msg.ParseFromString(socket.recv())
def rpc_set(req_msg, rep_msg, socket, name, value):
req_msg.type = rpc_pb2.rpc_request.SET
req_msg.name = name
req_msg.value = value
zmq_reply(req_msg, rep_msg, socket)
def rpc_get(req_msg, rep_msg, socket, name):
req_msg.type = rpc_pb2.rpc_request.GET
req_msg.name = name
zmq_reply(req_msg, rep_msg, socket)
return rep_msg.value
print('zmq_protobuf_rpc')
p = subprocess.Popen([sys.executable, '-u', __file__, '--server'])
context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect("tcp://127.0.0.1:5556") # IPC would be a bit faster
request = rpc_pb2.rpc_request()
reply = rpc_pb2.rpc_reply()
# wait for the server to be responsive
rpc_set(request, reply, socket, 'hello', 'hello')
N = 10000
t = time.time()
for i in range(N):
rpc_set(request, reply, socket, 'name', str(i))
t = time.time() - t
print("%.3f sec for %d messages, %.2fus/m, %d m/s"
% (t, N, t / N * 1000000, N/t))
request.type = rpc_pb2.rpc_request.QUIT
zmq_reply(request, reply, socket)
p.wait()
rpc.proto:
package rpc;
message rpc_request {
enum RpcType {GET = 0; SET = 1; QUIT = 2; }
required RpcType type = 1;
required string name = 2;
optional string value = 3; }
message rpc_reply {
optional string value = 3; }
您想测量 process_time 而不是默认测量的 system_time。这是通过为您的分析器使用自定义计时器功能来完成的,例如time.process_time()。
示例:
import cProfile
import time
import pstats
import io
pr = cProfile.Profile(time.process_time)
pr.enable()
time.sleep(1)
for i in range(0, 1000):
print("hello world")
time.sleep(1)
pr.disable()
s = io.StringIO()
sortby = 'cumulative'
ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
ps.print_stats()
print(s.getvalue())
打印功能消耗最多 process_time,统计数据会显示这一点。
如果您不将 time.process_time() 作为计时器函数传递,统计信息将显示睡眠函数使用了大部分时间。
我想知道为什么我的基于 pyzmq
和 protobuf
的消息传递乒乓比预期的慢得多,所以我使用 cProfile
检查你最后找到的脚本post.
protoc --python_out=. rpc.proto
python -m cProfile -o rpc.pstats ./test_rpc.py
returns
3.604 sec for 10000 messages, 360.41us/m, 2774 m/s
和
python -m pstats rpc.pstats
rpc.pstats% sort tottime
rpc.pstats% stats 10
给我(仅针对客户端进程):
619163 function calls (618374 primitive calls) in 3.779 seconds
Ordered by: internal time
List reduced from 580 to 30 due to restriction <30>
ncalls tottime percall cumtime percall filename:lineno(function)
10002 2.658 0.000 2.658 0.000 {method 'recv' of 'zmq.backend.cython.socket.Socket' objects}
10002 0.088 0.000 0.088 0.000 {method 'send' of 'zmq.backend.cython.socket.Socket' objects}
10001 0.060 0.000 3.654 0.000 ./test_rpc.py:36(rpc_set)
10002 0.058 0.000 3.457 0.000 ./test_rpc.py:32(zmq_reply)
80016 0.056 0.000 0.056 0.000 {method 'write' of 'cStringIO.StringO' objects}
10002 0.056 0.000 0.424 0.000 /usr/lib/python2.7/site-packages/google/protobuf/internal/python_message.py:781(InternalSerialize)
30004 0.054 0.000 0.137 0.000 /usr/lib/python2.7/site-packages/google/protobuf/internal/python_message.py:453(setter)
20002 0.051 0.000 0.058 0.000 /usr/lib/python2.7/site-packages/google/protobuf/internal/type_checkers.py:113(CheckValue)
10002 0.050 0.000 0.055 0.000 /usr/lib/python2.7/site-packages/google/protobuf/internal/python_message.py:839(IsInitialized)
10002 0.050 0.000 0.148 0.000 /usr/lib/python2.7/site-packages/google/protobuf/internal/python_message.py:577(ListFields)
...
奇怪的是:pyzmq
的 recv()
/send()
似乎消耗了 2700 毫秒 而 protobuf
仅消耗约 250 毫秒.
但事实并非如此!
如果我去掉 protobuf
部分,同一进程在同一系统上仅消耗大约 1350 毫秒 (-65%)。 (我没有添加 pyzmq
-only 脚本,但您可以只发送几个字节而不是序列化数据)
这些额外的 65% 几乎 100% 归因于 pyzmq
,实际上由 protobuf
消耗。
问题:这是怎么回事?在这种情况下,如何以指向 protobuf
而不是 pyzmq
的方式分析我的脚本?
为了重现日期,您必须安装 protobuf-python
和 python-zmq
。以下是用于此实验的脚本:
test_rpc.py:
import sys
import time
import threading
import subprocess
import zmq
import rpc_pb2 # must be generated first
if '--server' in sys.argv:
print('zmq_protobuf_rpc_server')
context = zmq.Context()
socket = context.socket(zmq.REP)
socket.bind("tcp://*:5556")
request = rpc_pb2.rpc_request()
reply = rpc_pb2.rpc_reply()
while True:
request.ParseFromString(socket.recv())
if request.type == rpc_pb2.rpc_request.SET:
value = request.value
elif request.type == rpc_pb2.rpc_request.GET:
reply.value = "value"
socket.send(reply.SerializeToString())
if request.type == rpc_pb2.rpc_request.QUIT:
break
else:
def zmq_reply(req_msg, rep_msg, socket):
socket.send(req_msg.SerializeToString())
rep_msg.ParseFromString(socket.recv())
def rpc_set(req_msg, rep_msg, socket, name, value):
req_msg.type = rpc_pb2.rpc_request.SET
req_msg.name = name
req_msg.value = value
zmq_reply(req_msg, rep_msg, socket)
def rpc_get(req_msg, rep_msg, socket, name):
req_msg.type = rpc_pb2.rpc_request.GET
req_msg.name = name
zmq_reply(req_msg, rep_msg, socket)
return rep_msg.value
print('zmq_protobuf_rpc')
p = subprocess.Popen([sys.executable, '-u', __file__, '--server'])
context = zmq.Context()
socket = context.socket(zmq.REQ)
socket.connect("tcp://127.0.0.1:5556") # IPC would be a bit faster
request = rpc_pb2.rpc_request()
reply = rpc_pb2.rpc_reply()
# wait for the server to be responsive
rpc_set(request, reply, socket, 'hello', 'hello')
N = 10000
t = time.time()
for i in range(N):
rpc_set(request, reply, socket, 'name', str(i))
t = time.time() - t
print("%.3f sec for %d messages, %.2fus/m, %d m/s"
% (t, N, t / N * 1000000, N/t))
request.type = rpc_pb2.rpc_request.QUIT
zmq_reply(request, reply, socket)
p.wait()
rpc.proto:
package rpc;
message rpc_request {
enum RpcType {GET = 0; SET = 1; QUIT = 2; }
required RpcType type = 1;
required string name = 2;
optional string value = 3; }
message rpc_reply {
optional string value = 3; }
您想测量 process_time 而不是默认测量的 system_time。这是通过为您的分析器使用自定义计时器功能来完成的,例如time.process_time()。
示例:
import cProfile
import time
import pstats
import io
pr = cProfile.Profile(time.process_time)
pr.enable()
time.sleep(1)
for i in range(0, 1000):
print("hello world")
time.sleep(1)
pr.disable()
s = io.StringIO()
sortby = 'cumulative'
ps = pstats.Stats(pr, stream=s).sort_stats(sortby)
ps.print_stats()
print(s.getvalue())
打印功能消耗最多 process_time,统计数据会显示这一点。 如果您不将 time.process_time() 作为计时器函数传递,统计信息将显示睡眠函数使用了大部分时间。