Goroutine in IO wait state for a long time
I have a high-traffic server (more than 800K qps) running go1.7.
From http://urltoserver:debugport/debug/pprof/goroutine?debug=2 I see 8K goroutines, of which almost 1800 have been in IO wait for minutes. One such goroutine's stack is shown below.
goroutine 128328653 [IO wait, 54 minutes]:
net.runtime_pollWait(0x7f0fcc60c378, 0x72, 0x7cb)
/usr/local/go/src/runtime/netpoll.go:160 +0x59
net.(*pollDesc).wait(0xc4231d0a00, 0x72, 0xc42479fa20, 0xc42000c048)
/usr/local/go/src/net/fd_poll_runtime.go:73 +0x38
net.(*pollDesc).waitRead(0xc4231d0a00, 0x92f200, 0xc42000c048)
/usr/local/go/src/net/fd_poll_runtime.go:78 +0x34
net.(*netFD).Read(0xc4231d09a0, 0xc423109000, 0x1000, 0x1000, 0x0, 0x92f200, 0xc42000c048)
/usr/local/go/src/net/fd_unix.go:243 +0x1a1
net.(*conn).Read(0xc4234282b8, 0xc423109000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/usr/local/go/src/net/net.go:173 +0x70
net/http.(*connReader).Read(0xc420449840, 0xc423109000, 0x1000, 0x1000, 0xc422b38b68, 0x100000000, 0xc421810601)
/usr/local/go/src/net/http/server.go:586 +0x144
bufio.(*Reader).fill(0xc422e22360)
/usr/local/go/src/bufio/bufio.go:97 +0x10c
bufio.(*Reader).Peek(0xc422e22360, 0x4, 0x7a066c, 0x4, 0x1, 0x0, 0x0)
/usr/local/go/src/bufio/bufio.go:129 +0x62
net/http.(*conn).readRequest(0xc422b38b00, 0x931fc0, 0xc424d19440, 0x0, 0x0, 0x0)
/usr/local/go/src/net/http/server.go:762 +0xdff
net/http.(*conn).serve(0xc422b38b00, 0x931fc0, 0xc424d19440)
/usr/local/go/src/net/http/server.go:1532 +0x3d3
created by net/http.(*Server).Serve
/usr/local/go/src/net/http/server.go:2293 +0x44d
Has anybody encountered this?
Any pointers are appreciated.
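These are most likely clients that initiated a request but never completed it, or slow clients, and so on.
You should configure the Read/Write timeouts of your server (server.ReadTimeout and server.WriteTimeout, respectively):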
s := new(http.Server)
// ...
s.ReadTimeout = 5 * time.Second
s.WriteTimeout = 5 * time.Second
// ...