Server shutdown never finishes
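I have an async gRPC server (gRPC version 1.40.X, Windows 10 x64).
When I try to shut the server down, sometimes it shuts down fine, and sometimes the thread responsible for running gRPC gets stuck at the start of the shutdown procedure. The more clients are spamming the server, the more likely this is to happen.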
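My shutdown procedure:
- Wait for the current Que->AsyncNext to finish.
- Call TryToCancel on every call for which AsyncNotifyWhenDone has been received. This might be the cause of the problem, since I cannot do this for calls whose AsyncNotifyWhenDone has not been received yet. I am not sure how to handle that, because finish() is called after GOT_EVENT.
- Call Server->Shutdown(). This is where it gets stuck, and the thread hangs forever.
- Then Que->Shutdown().
- Then DrainQue(), a synchronous function that goes through all remaining calls via Que->Next.
- Clean up all call data.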
Here is the last trace when it happens:
I0921 08:44:56.768000000 17588 init.cc:167] grpc_init(void)
I0921 08:44:56.768000000 17588 completion_queue.cc:522] grpc_completion_queue_create_internal(completion_type=0, polling_type=0)
I0921 08:44:56.768000000 17588 server.cc:1536] grpc_server_shutdown_and_notify(server=000001538016BEE8, cq=0000015415E29A90, tag=0000004D816FF208)
I0921 08:44:56.768000000 17588 completion_queue.cc:701] cq_end_op_for_next(cq=00000153AE87C900, tag=00000153800F7040, error={"created":"@1632206696.768000000","description":"Server Shutdown","file":"\grpc\src\core\lib\surface\server.cc","file_line":832}, done=00007FFD47DE9440, done_arg=0000015380133F40, storage=0000015380133F68)
I0921 08:44:56.768000000 17588 completion_queue.cc:701] cq_end_op_for_next(cq=00000153AE87C900, tag=00000153800EFE00, error={"created":"@1632206696.768000000","description":"Server Shutdown","file":"\grpc\src\core\lib\surface\server.cc","file_line":832}, done=00007FFD47DE9440, done_arg=0000015380133E80, storage=0000015380133EA8)
I0921 08:44:56.768000000 17588 completion_queue.cc:701] cq_end_op_for_next(cq=00000153AE87C900, tag=00000153800EFC00, error={"created":"@1632206696.768000000","description":"Server Shutdown","file":"\grpc\src\core\lib\surface\server.cc","file_line":832}, done=00007FFD47DE9440, done_arg=0000015380133DC0, storage=0000015380133DE8)
I0921 08:44:56.768000000 17588 chttp2_transport.cc:1752] ipv4:127.0.0.1:56702: Sending goaway err={"created":"@1632206696.768000000","description":"Server shutdown","file":"\grpc\src\core\lib\surface\server.cc","file_line":480,"grpc_status":0}
I0921 08:44:56.768000000 17588 completion_queue.cc:1419] grpc_completion_queue_shutdown(cq=0000015415E29A90)
I0921 08:44:56.768000000 17588 completion_queue.cc:979] grpc_completion_queue_next(cq=0000015415E29A90, deadline=gpr_timespec { tv_sec: 9223372036854775807, tv_nsec: 0, clock_type: 0 }, reserved=0000000000000000)
(The beginning of the paths has been removed to respect my privacy.)
Once the client disconnects completely, it unfreezes and continues:
I0921 08:47:02.540000000 17588 completion_queue.cc:701] cq_end_op_for_next(cq=0000015415E29A90, tag=0000004D816FF208, error="No Error", done=00007FFD47DE9450, done_arg=0000015380127F00, storage=0000015380137F90)
I0921 08:47:02.540000000 17588 completion_queue.cc:701] cq_end_op_for_next(cq=00000153AE87C900, tag=0000015421DDF220, error="No Error", done=00007FFD47DE07B0, done_arg=0000015421DDF3C0, storage=0000015421DDF408)
I0921 08:47:02.540000000 17588 completion_queue.cc:1083] RETURN_EVENT[0000015415E29A90]: OP_COMPLETE: tag:0x4d816ff208 OK
I0921 08:47:02.540000000 17588 completion_queue.cc:979] grpc_completion_queue_next(cq=0000015415E29A90, deadline=gpr_timespec { tv_sec: 9223372036854775807, tv_nsec: 0, clock_type: 0 }, reserved=0000000000000000)
I0921 08:47:02.540000000 17588 completion_queue.cc:1083] RETURN_EVENT[0000015415E29A90]: QUEUE_SHUTDOWN
I0921 08:47:02.540000000 17588 completion_queue.cc:979] grpc_completion_queue_next(cq=0000015415E29A90, deadline=gpr_timespec { tv_sec: 9223372036854775807, tv_nsec: 0, clock_type: 1 }, reserved=0000000000000000)
I0921 08:47:02.540000000 17588 completion_queue.cc:1083] RETURN_EVENT[0000015415E29A90]: QUEUE_SHUTDOWN
I0921 08:47:02.540000000 17588 completion_queue.cc:1425] grpc_completion_queue_destroy(cq=0000015415E29A90)
I0921 08:47:02.540000000 17588 completion_queue.cc:1419] grpc_completion_queue_shutdown(cq=0000015415E29A90)
I0921 08:47:02.540000000 17588 init.cc:213] grpc_shutdown(void)
I0921 08:47:02.540000000 17588 completion_queue.cc:979] grpc_completion_queue_next(cq=00000153AE87C900, deadline=gpr_timespec { tv_sec: 9223372036854775807, tv_nsec: 0, clock_type: 1 }, reserved=0000000000000000)
I0921 08:47:02.540000000 17588 completion_queue.cc:1083] RETURN_EVENT[00000153AE87C900]: OP_COMPLETE: tag:0x153800f7040 ERROR
I0921 08:47:02.540000000 17588 metadata_array.cc:34] grpc_metadata_array_destroy(array=00000153800F7308)
I0921 08:47:02.540000000 17588 completion_queue.cc:979] grpc_completion_queue_next(cq=00000153AE87C900, deadline=gpr_timespec { tv_sec: 9223372036854775807, tv_nsec: 0, clock_type: 1 }, reserved=0000000000000000)
I0921 08:47:02.540000000 17588 completion_queue.cc:1083] RETURN_EVENT[00000153AE87C900]: OP_COMPLETE: tag:0x153800efe00 ERROR
I0921 08:47:02.540000000 17588 completion_queue.cc:979] grpc_completion_queue_next(cq=00000153AE87C900, deadline=gpr_timespec { tv_sec: 9223372036854775807, tv_nsec: 0, clock_type: 1 }, reserved=0000000000000000)
I0921 08:47:02.540000000 17588 completion_queue.cc:1083] RETURN_EVENT[00000153AE87C900]: OP_COMPLETE: tag:0x153800efc00 ERROR
I0921 08:47:02.540000000 17588 completion_queue.cc:1419] grpc_completion_queue_shutdown(cq=00000153AE87C900)
I0921 08:47:02.540000000 17588 completion_queue.cc:979] grpc_completion_queue_next(cq=00000153AE87C900, deadline=gpr_timespec { tv_sec: 9223372036854775807, tv_nsec: 0, clock_type: 1 }, reserved=0000000000000000)
I0921 08:47:02.540000000 17588 completion_queue.cc:1083] RETURN_EVENT[00000153AE87C900]: OP_COMPLETE: tag:0x15421ddf220 OK
I0921 08:47:02.540000000 17588 completion_queue.cc:979] grpc_completion_queue_next(cq=00000153AE87C900, deadline=gpr_timespec { tv_sec: 9223372036854775807, tv_nsec: 0, clock_type: 1 }, reserved=0000000000000000)
I0921 08:47:02.540000000 17588 completion_queue.cc:1083] RETURN_EVENT[00000153AE87C900]: QUEUE_SHUTDOWN
I0921 08:47:02.540000000 17588 metadata_array.cc:34] grpc_metadata_array_destroy(array=00000153800FB660)
I0921 08:47:02.540000000 17588 metadata_array.cc:34] grpc_metadata_array_destroy(array=00000153800F96E0)
I0921 08:47:02.540000000 17588 call.cc:590] grpc_call_unref(c=0000015421DDCAE0)
I0921 08:47:02.540000000 17588 metadata_array.cc:34] grpc_metadata_array_destroy(array=0000015421DDD5C8)
I0921 08:47:02.540000000 17588 metadata_array.cc:34] grpc_metadata_array_destroy(array=00000153800FABE0)
I0921 08:47:02.540000000 17588 metadata_array.cc:34] grpc_metadata_array_destroy(array=00000153800FA160)
I0921 08:47:02.540000000 17588 call.cc:590] grpc_call_unref(c=00000154221F10E0)
I0921 08:47:02.540000000 17588 metadata_array.cc:34] grpc_metadata_array_destroy(array=00000154221F1BC8)
I0921 08:47:02.540000000 17588 init.cc:213] grpc_shutdown(void)
I0921 08:47:03.297000000 28004 server.cc:1550] grpc_server_destroy(server=000001538016BEE8)
I0921 08:47:03.297000000 28004 init.cc:213] grpc_shutdown(void)
I0921 08:47:03.297000000 28004 completion_queue.cc:1425] grpc_completion_queue_destroy(cq=00000153AE87C900)
I0921 08:47:03.297000000 28004 completion_queue.cc:1419] grpc_completion_queue_shutdown(cq=00000153AE87C900)
I0921 08:47:03.297000000 28004 init.cc:213] grpc_shutdown(void)
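This is somewhat related to a previous question of mine, which I thought was resolved.
I don't really understand the trace. It raises more questions than it answers, for example: why does it call grpc_init, why is grpc_shutdown called 3 times, and why does this happen in the automatic shutdown function?
Any help would be appreciated.
Best, Jan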
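After some digging I found a solution. Shutdown() has two overloads, one with a deadline and one without. The one I had been using (without a deadline) waits forever, while the one with a deadline waits only until the deadline.
So my new shutdown looks like this: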
const std::chrono::milliseconds WaitDuration = std::chrono::milliseconds(50);
const std::chrono::time_point<std::chrono::system_clock> Deadline = std::chrono::system_clock::now() + WaitDuration;
Server->Shutdown(Deadline);
The 50 ms value is an arbitrary choice for now, since the documentation gives no recommendation or best practice, so it may change.
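For anyone who wants the whole sequence in one place, here is a rough sketch of how the deadline-based shutdown could be combined with the queue shutdown and drain steps from the question. It is not my actual code: ShutdownAndDrain, FakeServer and FakeQueue are hypothetical names made up for this example, and the fakes only mimic the shape of Shutdown(deadline), Shutdown() and Next(void**, bool*) so that the snippet compiles without gRPC. With the real library, as far as I understand the documentation, pending calls are forcefully cancelled once the deadline expires, and the server has to be shut down before its completion queue.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not part of gRPC). ServerT must provide
// Shutdown(deadline); QueueT must provide Shutdown() and Next(void**, bool*).
// Returns the number of events drained from the queue.
template <class ServerT, class QueueT>
int ShutdownAndDrain(ServerT& server, QueueT& queue,
                     std::chrono::milliseconds grace) {
  // Bounded grace period instead of the no-argument overload that can block.
  const std::chrono::system_clock::time_point deadline =
      std::chrono::system_clock::now() + grace;
  server.Shutdown(deadline);

  // The queue is shut down only after the server.
  queue.Shutdown();

  // Drain until Next() reports that the queue is shut down and empty.
  int drained = 0;
  void* tag = nullptr;
  bool ok = false;
  while (queue.Next(&tag, &ok)) {
    ++drained;  // real code would release the per-call state behind `tag`
  }
  return drained;
}

// Stand-in types so the sketch compiles without gRPC; purely illustrative.
struct FakeServer {
  bool shutdown_called = false;
  void Shutdown(std::chrono::system_clock::time_point /*deadline*/) {
    shutdown_called = true;
  }
};

struct FakeQueue {
  explicit FakeQueue(int n) : pending(n) {}
  int pending;  // events still waiting in the queue
  bool shutdown_called = false;
  void Shutdown() { shutdown_called = true; }
  bool Next(void** tag, bool* ok) {
    if (pending == 0) return false;  // drained: nothing left to return
    --pending;
    *tag = nullptr;
    *ok = false;  // leftover events after a shutdown typically come back not-ok
    return true;
  }
};
```

In the real server the call would be something like ShutdownAndDrain(*Server, *Que, std::chrono::milliseconds(50)), with the 50 ms grace period still being an arbitrary value to tune.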