Rust vs Go concurrent webserver, why is Rust slow here?
I am running some benchmarks on the multi-threaded webserver example from the Rust book. For comparison I built something similar in Go and benchmarked both with ApacheBench. Even though this is a simple example, the difference is too big: the Go webserver doing the same work is 10x faster. Since I expected Rust to be faster, or at least on the same level, I tried multiple revisions using futures and smol (though my goal is to compare implementations that use only the standard library), but the results were almost identical. Can anyone here suggest changes to the Rust implementation to make it faster without using a huge number of threads?
Here is the code I used: https://github.com/deepu105/concurrency-benchmarks
The tokio-http version is the slowest; the other 3 Rust versions give almost identical results.
Here are the benchmarks:
Rust (with 8 threads; with 100 threads the numbers are closer to Go's):
❯ ab -c 100 -n 1000 http://localhost:8080/
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests
Server Software:
Server Hostname: localhost
Server Port: 8080
Document Path: /
Document Length: 176 bytes
Concurrency Level: 100
Time taken for tests: 26.027 seconds
Complete requests: 1000
Failed requests: 0
Total transferred: 195000 bytes
HTML transferred: 176000 bytes
Requests per second: 38.42 [#/sec] (mean)
Time per request: 2602.703 [ms] (mean)
Time per request: 26.027 [ms] (mean, across all concurrent requests)
Transfer rate: 7.32 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 2 2.9 1 16
Processing: 4 2304 1082.5 2001 5996
Waiting: 0 2303 1082.7 2001 5996
Total: 4 2307 1082.1 2002 5997
Percentage of the requests served within a certain time (ms)
50% 2002
66% 2008
75% 2018
80% 3984
90% 3997
95% 4002
98% 4005
99% 5983
100% 5997 (longest request)
Go:
ab -c 100 -n 1000 http://localhost:8080/
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests
Server Software:
Server Hostname: localhost
Server Port: 8080
Document Path: /
Document Length: 174 bytes
Concurrency Level: 100
Time taken for tests: 2.102 seconds
Complete requests: 1000
Failed requests: 0
Total transferred: 291000 bytes
HTML transferred: 174000 bytes
Requests per second: 475.84 [#/sec] (mean)
Time per request: 210.156 [ms] (mean)
Time per request: 2.102 [ms] (mean, across all concurrent requests)
Transfer rate: 135.22 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 2 1.4 2 5
Processing: 0 203 599.8 3 2008
Waiting: 0 202 600.0 2 2008
Total: 0 205 599.8 5 2013
Percentage of the requests served within a certain time (ms)
50% 5
66% 7
75% 8
80% 8
90% 2000
95% 2003
98% 2005
99% 2010
100% 2013 (longest request)
I only compared your "rustws" and the Go version. In Go you have an unbounded number of goroutines (even though you pin them all to a single CPU core), while in rustws you create a thread pool with 8 threads.
Since your request handler sleeps for 2 seconds on every 10th request, each of the 8 threads can clear about 10 requests per 2-second sleep cycle, capping the rustws version at 8 × 10 requests / 2 s = 40 requests per second, which is exactly what you see in the ab results. Go is not subject to this artificial bottleneck, so it shows you the maximum number of requests it can handle on a single CPU core.
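The bottleneck described above can be reproduced with a small standard-library sketch (this is a toy model, not the author's code): a fixed pool of workers pulls jobs off a channel, and every 10th job sleeps. The sleep is scaled down from 2 s to 10 ms so it finishes quickly; only the ratio matters.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

// Simulate a fixed-size thread pool where every 10th job sleeps for
// `sleep_ms`, mirroring the rustws request handler. Returns wall time.
fn simulate(jobs: usize, workers: usize, sleep_ms: u64) -> Duration {
    let (tx, rx) = mpsc::channel::<usize>();
    let rx = Arc::new(Mutex::new(rx));
    let start = Instant::now();
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || loop {
                // Hold the lock only long enough to take the next job.
                let job = match rx.lock().unwrap().recv() {
                    Ok(j) => j,
                    Err(_) => break, // channel closed: no more jobs
                };
                if job % 10 == 0 {
                    thread::sleep(Duration::from_millis(sleep_ms));
                }
            })
        })
        .collect();
    for j in 1..=jobs {
        tx.send(j).unwrap();
    }
    drop(tx); // close the channel so workers exit
    for h in handles {
        h.join().unwrap();
    }
    start.elapsed()
}

fn main() {
    // 8 workers each clear ~10 jobs per sleep cycle, so throughput caps at
    // roughly workers * 10 jobs per sleep period -- the analogue of
    // 8 threads * 10 requests / 2 s = 40 req/s in the original benchmark.
    let elapsed = simulate(400, 8, 10);
    println!("400 jobs in {:?}", elapsed);
}
```

Scaling the pool (or removing the sleep) raises the cap proportionally, which is why the 100-thread Rust run gets much closer to Go.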
I was finally able to get comparable results in Rust using the async_std lib:
❯ ab -c 100 -n 1000 http://localhost:8080/
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)
Completed 100 requests
Completed 200 requests
Completed 300 requests
Completed 400 requests
Completed 500 requests
Completed 600 requests
Completed 700 requests
Completed 800 requests
Completed 900 requests
Completed 1000 requests
Finished 1000 requests
Server Software:
Server Hostname: localhost
Server Port: 8080
Document Path: /
Document Length: 176 bytes
Concurrency Level: 100
Time taken for tests: 2.094 seconds
Complete requests: 1000
Failed requests: 0
Total transferred: 195000 bytes
HTML transferred: 176000 bytes
Requests per second: 477.47 [#/sec] (mean)
Time per request: 209.439 [ms] (mean)
Time per request: 2.094 [ms] (mean, across all concurrent requests)
Transfer rate: 90.92 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 2 1.7 2 7
Processing: 0 202 599.7 2 2002
Waiting: 0 201 600.1 1 2002
Total: 0 205 599.7 5 2007
Percentage of the requests served within a certain time (ms)
50% 5
66% 6
75% 9
80% 9
90% 2000
95% 2003
98% 2004
99% 2006
100% 2007 (longest request)
Here is the implementation:
use async_std::net::TcpListener;
use async_std::net::TcpStream;
use async_std::prelude::*;
use async_std::task;
use std::fs;
use std::time::Duration;

#[async_std::main]
async fn main() {
    let mut count = 0;
    let listener = TcpListener::bind("127.0.0.1:8080").await.unwrap(); // set listen port
    loop {
        count += 1;
        let count_n = Box::new(count);
        let (stream, _) = listener.accept().await.unwrap();
        task::spawn(handle_connection(stream, count_n)); // spawn a new task to handle the connection
    }
}

async fn handle_connection(mut stream: TcpStream, count: Box<i64>) {
    // Read the first 1024 bytes of data from the stream
    let mut buffer = [0; 1024];
    stream.read(&mut buffer).await.unwrap();
    // add 2 second delay to every 10th request
    if (*count % 10) == 0 {
        println!("Adding delay. Count: {}", count);
        task::sleep(Duration::from_secs(2)).await;
    }
    let contents = fs::read_to_string("hello.html").unwrap(); // read html file
    let response = format!("HTTP/1.1 200 OK\r\n\r\n{}", contents);
    stream.write_all(response.as_bytes()).await.unwrap(); // write_all avoids short writes
    stream.flush().await.unwrap();
}
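One build note: the `#[async_std::main]` attribute macro is gated behind async-std's `attributes` cargo feature, so the dependency entry needs it enabled (the version shown here is an assumption; pin as appropriate):

```toml
[dependencies]
# "attributes" enables the #[async_std::main] macro used in the code
async-std = { version = "1", features = ["attributes"] }
```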