Nginx constant writes cause CPU I/O wait

I am running nginx/1.20.1 on a G9 CentOS 7 machine with the following specs to serve static video files:

Nginx config:

user root;
worker_processes auto;
error_log /var/log/nginx/error.log;
pid /run/nginx.pid;

# Load dynamic modules. See /usr/share/nginx/README.dynamic.
include /usr/share/nginx/modules/*.conf;
worker_rlimit_nofile 30000;
events {
    worker_connections 2024;
    use epoll;
}

http {
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile            on;
    directio            16m;
#    output_buffers     2 32m;
#    aio                        threads;
    sendfile_max_chunk 512k;
    tcp_nopush          on;
    tcp_nodelay         on;
    keepalive_timeout   120;
    types_hash_max_size 2048;
    # allow the server to close connection on non responding client, this will free up memory
    reset_timedout_connection on;
    # request timed out -- default 60
    client_body_timeout 60;
    # if client stop responding, free up memory -- default 60
    send_timeout 30;

    include             /etc/nginx/mime.types;
    default_type        application/octet-stream;
    client_max_body_size 200m;

    # Load modular configuration files from the /etc/nginx/conf.d directory.
    # See http://nginx.org/en/docs/ngx_core_module.html#include
    # for more information.
    include /etc/nginx/conf.d/*.conf;

}

conf.d:

server{
    listen 80;
    server_name  mydomain.com;
    charset utf-8;
    sendfile   on;
    tcp_nopush on;

    fastcgi_read_timeout 600;
    client_header_timeout 600;
    client_body_timeout 600;
    client_max_body_size 0;

    access_log /var/log/nginx/static.access_log main;
    error_log  /var/log/nginx/static.error_log error;


    location / {
        proxy_pass http://localhost:7070;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }

    # prevent nginx from serving dotfiles (.htaccess, .svn, .git, etc.)
    location ~ /\. {
        deny all;
        access_log off;
        log_not_found off;
    }


}

server {
    set $base_path "/mypath";
    set $news_video_path "/mypath2";
    listen 7070;
    server_name localhost;
    location ~ /upload/videos/(.*) {
        alias $news_video_path/;
    }

    location ~ /video/(.*) {
        alias $base_path/video/;
    }


    access_log /var/log/nginx/localhost.access_log main;
    error_log  /var/log/nginx/localhost.error_log error;
}

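The problem is that once the nginx process starts, the CPU load average climbs until it reaches 100% usage. I used htop to see which process was consuming the CPU, and there was no such process. I then went to our monitoring dashboard and found that I/O wait was causing the high load average.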

Then I used iotop to see which process was responsible for the I/O wait time.

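Strangely, the Nginx worker processes show a high disk write rate. Sometimes Total DISK WRITE reaches 100 MB/s, but Actual DISK WRITE does not behave the same way. I should also mention that I do not use the Nginx cache, so these writes are not cache-related. Disabling Nginx logging did not help either.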

How can I debug this? Why is nginx writing so much data to disk?

First create the /var/cache/nginx directory and give your nginx system user full read/write access to it, then add these directives to the nginx http {} context:

proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=my_zone:10m max_size=300m inactive=1d;
proxy_cache_key "$scheme$request_method$host$request_uri";

Then add these to the server {} context or to the location {} that you want to serve from the cache:

proxy_cache my_zone;
proxy_cache_valid 200 1d;
proxy_cache_valid 404 302 1m;
proxy_cache_revalidate on;
proxy_cache_bypass $http_cache_control;

proxy_http_version 1.1;

add_header X-Cache-Status $upstream_cache_status;
add_header X-Proxy-Cache  $upstream_cache_status;

I have not tested this, but it should give you the idea; test it before relying on it.
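For reference, the directives above could be combined with the front-end server block from the question roughly as follows. This is an untested sketch, not a drop-in config: the zone name, cache path, and sizes are the ones suggested in this answer, the upstream on localhost:7070 and the server name come from the question, and the empty events block is only there so the fragment is complete. The 300m cache limit is almost certainly too small for large video files and would need to be sized for the actual content.

```nginx
# Sketch only: cache directives from this answer placed into the
# front-end server from the question. Values are assumptions to adjust.
events {}

http {
    # Cache storage and cache key are declared once, in the http context.
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=my_zone:10m
                     max_size=300m inactive=1d;
    proxy_cache_key "$scheme$request_method$host$request_uri";

    server {
        listen 80;
        server_name mydomain.com;

        location / {
            # Upstream taken from the question's config.
            proxy_pass http://localhost:7070;
            proxy_http_version 1.1;
            proxy_set_header Connection "";

            # Serve this location from the zone declared above.
            proxy_cache my_zone;
            proxy_cache_valid 200 1d;
            proxy_cache_valid 404 302 1m;
            proxy_cache_revalidate on;
            proxy_cache_bypass $http_cache_control;

            # Expose the cache result (MISS, HIT, BYPASS, ...) for debugging.
            add_header X-Cache-Status $upstream_cache_status;
        }
    }
}
```

Once a response has been cached, a repeated request for the same URL should come back with X-Cache-Status: HIT, which is a quick way to check that the cache is actually being used.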

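The problem was the missing Nginx multi_accept directive. We serve video files, which are usually large, so while Nginx was busy sending video files to some users it could not respond to new connections.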

Adding multi_accept on to the events block solved the problem.

events {
    worker_connections 1024;
    # let each worker accept all pending connections at once, not one at a time
    multi_accept       on;
    use epoll;
}