Performance drop on port from Beast.1.0.0-b66 to Boost.1.67.0.Beast
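I am observing a drastic performance drop (and a sharp increase in CPU consumption) after porting from Beast.1.0.0-b66 (using Boost.1.64.0) to Boost.1.67.0.Beast (i.e. Beast as integrated into Boost). No doubt I am doing something wrong, but I cannot figure out what.
What used to be: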
typedef beast::http::request<beast::http::string_body> BeastHttpRequest;
is now:
namespace http = boost::beast::http;
typedef http::request<http::string_body> BeastHttpRequest;
What used to be:
beast::http::prepare(req);
beast::http::write(stream, req);
is now:
req.prepare_payload();
http::write(stream, req);
Of course, I also had to make a number of other API changes. For example, what used to be:
req.fields.replace(hdrName, hdrValue);
is now:
req.set(hdrName, hdrValue);
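The application works correctly, including the SSL handshake and proxy negotiation, but I have to fix the spike in CPU consumption and the corresponding performance drop. I wonder whether anyone can spot something obvious that I have overlooked.
EDIT: I should mention that I am using a flat_buffer for the SSL stream.
I have had a chance to profile the "before port" and "after port" builds. Here is the "before port" (i.e. good performance) call chain: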
- 34.61% HttpRequest::send
- 32.02% beast::http::write<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> >&>, true, beast::http::string_body,
- 31.90% beast::http::write<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> >&>, true, beast::http::string_bo
- 20.77% beast::http::detail::write_preparation<true, beast::http::string_body, beast::http::basic_fields<std::allocator<char> > >::init
- 18.25% beast::http::detail::write_fields<beast::basic_streambuf<std::allocator<char> >, beast::http::basic_fields<std::allocator<char> > >
- 10.38% beast::write<beast::basic_streambuf<std::allocator<char> >, boost::basic_string_ref<char, std::char_traits<char> > >
- 10.20% beast::detail::write_dynabuf<beast::basic_streambuf<std::allocator<char> >, boost::basic_string_ref<char, std::char_traits<char> > >
+ 3.66% beast::basic_streambuf<std::allocator<char> >::prepare
+ 2.90% beast::basic_streambuf<std::allocator<char> >::commit
+ 1.81% boost::lexical_cast<std::string, boost::basic_string_ref<char, std::char_traits<char> > >
+ 1.47% boost::asio::buffer_copy<beast::basic_streambuf<std::allocator<char> >::mutable_buffers_type>
- 7.51% beast::write<beast::basic_streambuf<std::allocator<char> >, char [3]>
- 7.47% beast::detail::write_dynabuf<beast::basic_streambuf<std::allocator<char> >, 3ul>
+ 3.14% beast::basic_streambuf<std::allocator<char> >::prepare
+ 2.84% beast::basic_streambuf<std::allocator<char> >::commit
+ 1.32% boost::asio::buffer_copy<beast::basic_streambuf<std::allocator<char> >::mutable_buffers_type>
+ 2.04% beast::http::detail::write_start_line<beast::basic_streambuf<std::allocator<char> >, beast::http::basic_fields<std::allocator<char> > >
- 9.60% beast::http::string_body::writer::write<beast::http::detail::writef0_lambda<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_serv
- 9.57% beast::http::detail::writef0_lambda<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> >&>, beast:
- 9.54% boost::asio::write<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> >&>, beast::detail::buffe
- 9.48% boost::asio::write<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> >&>, beast::detail::bu
- 8.55% boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> >&>::write_some<boost::asio::detail::c
- 7.56% boost::asio::ssl::detail::io<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> >, boost::asio::ssl::detail::wr
- 6.98% boost::asio::write<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> >, boost::asio::mutable_buffers_1>
- 6.93% boost::asio::write<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> >, boost::asio::mutable_buffers_1,
- 6.69% boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::stream_socket_service<boost::asio::ip::tcp> >::write_some<boost::asio::detail::consuming_buffer
- 6.69% boost::asio::stream_socket_service<boost::asio::ip::tcp>::send<boost::asio::detail::consuming_buffers<boost::asio::const_buffer, boost::asio::mutable_buffers_1>
- 6.67% boost::asio::detail::reactive_socket_service_base::send<boost::asio::detail::consuming_buffers<boost::asio::const_buffer, boost::asio::mutable_buffers_1> >
- 6.51% boost::asio::detail::socket_ops::sync_send
- 6.32% 0xec6d
- 6.08% system_call_fastpath
- 6.07% sys_sendmsg
- 6.06% __sys_sendmsg
- 5.94% ___sys_sendmsg
+ 5.84% sock_send
And here is the "after port" (i.e. poor performance) call chain:
- 53.77% HttpRequest::send
- 53.32% boost::beast::http::write<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp>&>, true, boost::beast::http::basic_string_body<char, std::char_traits<char>, std::allo
- 53.30% boost::beast::http::write<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp>&>, true, boost::beast::http::basic_string_body<char, std::char_traits<char>, std::a
- 53.14% boost::beast::http::write<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp>&>, true, boost::beast::http::basic_string_body<char, std::char_traits<char>, std
- 53.03% boost::beast::http::write_some<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp>&>, true, boost::beast::http::basic_string_body<char, std::char_traits<ch
- 52.95% boost::beast::http::detail::write_some_impl<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp>&>, true, boost::beast::http::basic_string_body<char, std
- 36.58% boost::beast::http::serializer<true, boost::beast::http::basic_string_body<char, std::char_traits<char>, std::allocator<char> >, boost::beast::http::basic_fields<std::allocator<
- 35.28% boost::beast::http::serializer<true, boost::beast::http::basic_string_body<char, std::char_traits<char>, std::allocator<char> >, boost::beast::http::basic_fields<std::allocat
- 25.93% boost::beast::http::detail::write_some_lambda<boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp>&> >::operator()<boost::beast::detail::buffers
- 25.82% boost::asio::ssl::stream<boost::asio::basic_stream_socket<boost::asio::ip::tcp>&>::write_some<boost::beast::detail::buffers_ref<boost::beast::buffers_prefix_view<boost:
- 25.69% boost::asio::ssl::detail::io<boost::asio::basic_stream_socket<boost::asio::ip::tcp>, boost::asio::ssl::detail::write_op<boost::beast::detail::buffers_ref<boost::beas
- 22.20% boost::asio::write<boost::asio::basic_stream_socket<boost::asio::ip::tcp>, boost::asio::mutable_buffer>
- 22.13% boost::asio::write<boost::asio::basic_stream_socket<boost::asio::ip::tcp>, boost::asio::mutable_buffer, boost::asio::detail::transfer_all_t>
- 21.96% boost::asio::detail::write_buffer_sequence<boost::asio::basic_stream_socket<boost::asio::ip::tcp>, boost::asio::mutable_buffer, boost::asio::mutable_buffer
- 21.47% boost::asio::basic_stream_socket<boost::asio::ip::tcp>::write_some<boost::asio::const_buffers_1>
- 21.34% boost::asio::detail::reactive_socket_service_base::send<boost::asio::const_buffers_1>
- 21.17% boost::asio::detail::socket_ops::sync_send
- 20.45% 0xec6d
- 19.63% system_call_fastpath
- 19.57% sys_sendmsg
- 19.52% __sys_sendmsg
- 19.16% ___sys_sendmsg
+ 18.90% sock_sendmsg
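It should be faster, not slower, because the HTTP algorithms have been optimized. I think I know what is going on. It would help if you could compare the performance of the two versions using a regular socket instead of SSL. boost::asio::ssl::stream has a shortcoming: when writing a buffer sequence of length greater than one, it performs a socket write for each buffer in the sequence instead of coalescing the encrypted buffers into a single write. This has a significant impact on performance, and it is consistent with the share of time spent under sys_sendmsg growing from about 6% to about 19.6% in your two profiles.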
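This really needs to be fixed in Boost.Asio, but as a workaround you can write your own stream wrapper whose write algorithm, when presented with a buffer sequence of length greater than one, creates a new sequence of length one using memcpy and dynamic allocation. I will also open an issue here: https://github.com/boostorg/asio/issues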
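The wrapper idea can be sketched without Boost, which keeps the example self-contained. This is only an illustration of the flattening technique, not Asio or Beast code: `ConstBuffer`, `PerBufferStream`, and `FlatStream` are hypothetical names, and `PerBufferStream` merely imitates a stream that issues one socket write per buffer. A real wrapper would instead model the Asio stream concepts and take Asio buffer sequences.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a (pointer, size) buffer such as asio::const_buffer.
struct ConstBuffer {
    const void* data;
    std::size_t size;
};

// Hypothetical stand-in for a stream that performs one socket write
// per buffer in the sequence (the behaviour described above).
struct PerBufferStream {
    std::size_t socket_writes = 0;  // number of simulated socket writes
    std::string wire;               // bytes "sent" so far

    std::size_t write_some(const std::vector<ConstBuffer>& buffers) {
        std::size_t total = 0;
        for (const ConstBuffer& b : buffers) {
            if (b.size == 0) continue;
            wire.append(static_cast<const char*>(b.data), b.size);
            ++socket_writes;
            total += b.size;
        }
        return total;
    }
};

// Wrapper that flattens a multi-buffer sequence into a single contiguous
// buffer (dynamic allocation + memcpy) before handing it to the next layer.
template <class NextLayer>
class FlatStream {
public:
    explicit FlatStream(NextLayer& next) : next_(next) {}

    std::size_t write_some(const std::vector<ConstBuffer>& buffers) {
        // Sequences of length zero or one need no copy.
        if (buffers.size() <= 1) return next_.write_some(buffers);

        std::size_t total = 0;
        for (const ConstBuffer& b : buffers) total += b.size;

        std::vector<char> flat(total);  // dynamically allocated staging buffer
        std::size_t offset = 0;
        for (const ConstBuffer& b : buffers) {
            if (b.size == 0) continue;
            std::memcpy(flat.data() + offset, b.data, b.size);
            offset += b.size;
        }
        assert(offset == total);

        // Pass a sequence of length one to the next layer.
        std::vector<ConstBuffer> one{ConstBuffer{flat.data(), flat.size()}};
        return next_.write_some(one);
    }

private:
    NextLayer& next_;
};
```

Wrapping a `PerBufferStream` in `FlatStream<PerBufferStream>` and writing a three-buffer sequence (start line, header field, blank line) produces a single simulated socket write instead of three. The cost is one allocation and one copy per call.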
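Why do the two versions of Beast differ? The older version of Beast allocated memory during serialization to hold a linearized copy of the message. The current version uses an algorithm that requires no memory allocation at all, and hands the stream a buffer sequence instead. It is faster in the normal case, but as you have discovered, it can be slower when used with the ssl::stream type.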