Google Cloud SQL + Apache processes never end until server crashes - strace suggests Mysql connections hanging
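Our server hosts a production Drupal site (running on PHP and MySQL).
We recently migrated from a server that had MySQL installed on the machine itself to Google Cloud:
- Server hosted on cloud compute (CentOS 7).
- MySQL database on Cloud SQL.
- PHP connects to Cloud SQL over the Google Cloud internal network (IP 10.XXX.XX.X).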
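Since we moved to this setup, the server has been crashing several times a day, using up all memory and CPU. Each time I have to kill all processes or reboot, and then it happens again.
I have scanned all of the site code for endless loops, hanging external requests, and inefficient code, and I have tuned the Apache configuration (MaxClients etc.), all to no avail.
The only explanation I can think of at the moment is that the connection to Cloud SQL sometimes hangs forever for an unknown reason. A similar question on Stack Overflow suggested that I look at the output of strace: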
sudo strace -tt -T -p18451 &> output
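(18451 is the parent Apache process)
I have never looked at strace output before and have no idea how to interpret it. My guess is that the "select" calls are the connection to MySQL timing out, but I am not sure at all, and I don't know where to start troubleshooting this.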
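One way to test the "connection hangs" hypothesis separately from Apache is to time a plain TCP connect from the web server to the database port. The following is a minimal sketch, not a definitive diagnostic: `probe_tcp` is a hypothetical helper, port 3306 is assumed to be the MySQL port in use, and the probe only checks TCP reachability, not the MySQL handshake or any query.

```python
import socket
import time


def probe_tcp(host, port=3306, timeout=5.0):
    """Try a TCP connect; return (ok, seconds_elapsed, error_text)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError on refusal, DNS failure, or timeout
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start, ""
    except OSError as exc:
        return False, time.monotonic() - start, str(exc)
```

Calling `probe_tcp(host)` with the Cloud SQL private address (elided as 10.XXX.XX.X in this post) at regular intervals would show whether connects ever fail or approach the timeout around the time the server degrades.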
Here is the output of the strace command:
16:06:02.966171 select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=215501}) = 0 (Timeout) <0.215792>
16:06:03.182158 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000029>
16:06:03.182260 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001129>
16:06:04.183521 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000051>
16:06:04.183655 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001018>
16:06:05.184774 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000014>
16:06:05.184849 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001073>
16:06:06.186021 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000015>
16:06:06.186135 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001067>
16:06:07.187298 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000013>
16:06:07.187370 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001078>
16:06:08.188559 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000015>
16:06:08.188649 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001071>
16:06:09.189901 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000015>
16:06:09.189975 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001139>
16:06:10.191208 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000014>
16:06:10.191280 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001115>
16:06:11.192587 socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 13 <0.000061>
16:06:11.192781 getsockopt(13, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0 <0.000039>
16:06:11.192918 setsockopt(13, SOL_SOCKET, SO_SNDBUFFORCE, [8388608], 4) = 0 <0.000028>
16:06:11.193005 getuid() = 0 <0.000027>
16:06:11.193079 geteuid() = 0 <0.000034>
16:06:11.193165 getgid() = 0 <0.000040>
16:06:11.193250 getegid() = 0 <0.000026>
16:06:11.193324 sendmsg(13, {msg_name={sa_family=AF_UNIX, sun_path="/run/systemd/notify"}, msg_namelen=21, msg_iov=[{iov_base="READY=1\nSTATUS=Total requests: 8"..., iov_len=90}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 90 <0.000052>
16:06:11.193474 close(13) = 0 <0.000019>
16:06:11.193538 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000015>
16:06:11.193592 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001148>
16:06:12.194906 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000054>
16:06:12.195043 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001184>
16:06:13.196361 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000053>
16:06:13.196496 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001238>
16:06:14.197866 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000052>
16:06:14.198091 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001100>
16:06:15.199307 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000015>
16:06:15.199434 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001070>
16:06:16.200601 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000015>
16:06:16.200692 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001114>
16:06:17.201993 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000050>
16:06:17.202127 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001141>
16:06:18.203443 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000054>
16:06:18.203577 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001087>
16:06:19.204783 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000039>
16:06:19.204881 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.002977>
16:06:20.207953 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fe2bdabab50) = 19466 <0.001983>
16:06:20.210087 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000015>
16:06:20.210155 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001105>
16:06:21.211422 socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 13 <0.000060>
16:06:21.211561 getsockopt(13, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0 <0.000028>
16:06:21.211696 setsockopt(13, SOL_SOCKET, SO_SNDBUFFORCE, [8388608], 4) = 0 <0.000032>
16:06:21.211787 getuid() = 0 <0.000027>
16:06:21.211861 geteuid() = 0 <0.000027>
16:06:21.211934 getgid() = 0 <0.000026>
16:06:21.212006 getegid() = 0 <0.000026>
16:17:30.461828 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001122>
16:17:31.463080 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000053>
16:17:31.463227 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001112>
16:17:32.464486 socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 13 <0.000059>
16:17:32.464642 getsockopt(13, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0 <0.000045>
16:17:32.464755 setsockopt(13, SOL_SOCKET, SO_SNDBUFFORCE, [8388608], 4) = 0 <0.000028>
16:17:32.464837 getuid() = 0 <0.000026>
16:17:32.464911 geteuid() = 0 <0.000027>
16:17:32.464984 getgid() = 0 <0.000027>
16:17:32.465057 getegid() = 0 <0.000033>
16:17:32.465138 sendmsg(13, {msg_name={sa_family=AF_UNIX, sun_path="/run/systemd/notify"}, msg_namelen=21, msg_iov=[{iov_base="READY=1\nSTATUS=Total requests: 1"..., iov_len=91}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 91 <0.000057>
16:17:32.465255 close(13) = 0 <0.000020>
16:17:32.465315 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000014>
16:17:32.465369 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001095>
16:17:33.466588 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000041>
16:17:33.466705 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001119>
16:17:34.467961 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000053>
16:17:34.468121 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001081>
16:17:35.469339 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000051>
16:17:35.469473 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001145>
16:17:36.470754 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000053>
16:17:36.470892 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001110>
16:17:37.472142 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000051>
16:17:37.472274 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001091>
16:17:38.473492 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000054>
16:17:38.473669 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001139>
16:17:39.474937 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000052>
16:17:39.475070 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001140>
16:17:40.476420 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000081>
16:17:40.476615 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001132>
16:17:41.477915 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000054>
16:17:41.478056 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001159>
16:17:42.479397 socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 13 <0.000059>
16:17:42.479538 getsockopt(13, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0 <0.000029>
16:17:42.479637 setsockopt(13, SOL_SOCKET, SO_SNDBUFFORCE, [8388608], 4) = 0 <0.000029>
16:17:42.479749 getuid() = 0 <0.000027>
16:17:42.479827 geteuid() = 0 <0.000028>
16:17:42.479904 getgid() = 0 <0.000026>
16:17:42.479976 getegid() = 0 <0.000027>
16:17:42.480049 sendmsg(13, {msg_name={sa_family=AF_UNIX, sun_path="/run/systemd/notify"}, msg_namelen=21, msg_iov=[{iov_base="READY=1\nSTATUS=Total requests: 1"..., iov_len=91}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 91 <0.000041>
16:17:42.480153 close(13) = 0 <0.000021>
16:17:42.480216 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000016>
16:17:42.480271 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001094>
16:17:43.481497 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000052>
16:17:43.481645 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001140>
16:17:44.482956 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000052>
16:17:44.483090 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001128>
16:17:45.484323 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000015>
16:17:45.484413 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001156>
16:17:46.485727 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000052>
16:17:46.485861 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001126>
16:17:47.487119 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000062>
16:17:47.487261 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}strace: Process 18451 detached
16:35:54.742455 select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=848318}) = 0 (Timeout) <0.849378>
16:35:55.592217 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000022>
16:35:55.592310 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.000367>
16:35:56.592779 socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 13 <0.000021>
16:35:56.592867 getsockopt(13, SOL_SOCKET, SO_SNDBUF, [212992], [4]) = 0 <0.000011>
16:35:56.592945 setsockopt(13, SOL_SOCKET, SO_SNDBUFFORCE, [8388608], 4) = 0 <0.000012>
16:35:56.592997 getuid() = 0 <0.000011>
16:35:56.593039 geteuid() = 0 <0.000010>
16:35:56.593080 getgid() = 0 <0.000010>
16:35:56.593120 getegid() = 0 <0.000010>
16:35:56.593167 sendmsg(13, {msg_name={sa_family=AF_UNIX, sun_path="/run/systemd/notify"}, msg_namelen=21, msg_iov=[{iov_base="READY=1\nSTATUS=Total requests: 3"..., iov_len=91}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 91 <0.000540>
16:35:56.593854 close(13) = 0 <0.000017>
16:35:56.593918 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000021>
16:35:56.593975 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.008668>
16:35:57.602727 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000023>
16:35:57.602803 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.003837>
16:35:58.606727 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000022>
16:35:58.606801 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.010840>
16:35:59.617729 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000021>
16:35:59.617802 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.004837>
16:36:00.622729 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000022>
16:36:00.622804 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.002846>
16:36:01.625747 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000024>
16:36:01.625831 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.008807>
16:36:02.634734 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000022>
16:36:02.634810 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}strace: Process 18451 detached
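A trace like this can be easier to read once it is summarized per syscall. The sketch below assumes the `strace -tt -T` line format shown above; `summarize_strace` and `is_plain_sleep` are hypothetical helper names. One fact worth keeping in mind while reading the result: `select(0, NULL, NULL, NULL, {timeout})` watches no file descriptors at all, so it can only act as a timed sleep, and `wait4(..., WNOHANG, ...)` polls for exited children without blocking. That suggests these lines are the parent process idling in a once-per-second loop rather than waiting on a MySQL socket; any database traffic would show up in the child httpd processes instead.

```python
import re
from collections import defaultdict

# Matches complete lines produced by `strace -tt -T`, for example:
# 16:06:03.182260 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001129>
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<name>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>\S+).*<(?P<dur>\d+\.\d+)>\s*$"
)


def summarize_strace(lines):
    """Return {syscall: (count, total_seconds)} for complete trace lines."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip truncated lines such as "...strace: Process 18451 detached"
        entry = totals[m.group("name")]
        entry[0] += 1
        entry[1] += float(m.group("dur"))
    return {name: (count, round(total, 6)) for name, (count, total) in totals.items()}


def is_plain_sleep(line):
    """True for select() with nfds=0 and no fd sets, i.e. nothing but a timeout."""
    m = LINE_RE.match(line.strip())
    return (
        bool(m)
        and m.group("name") == "select"
        and m.group("args").startswith("0, NULL, NULL, NULL,")
    )


# A few lines copied from the trace above
SAMPLE = [
    "16:06:03.182158 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000029>",
    "16:06:03.182260 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}) = 0 (Timeout) <1.001129>",
    "16:06:04.183521 wait4(-1, 0x7ffddf9b3adc, WNOHANG|WSTOPPED, NULL) = 0 <0.000051>",
    "16:17:47.487261 select(0, NULL, NULL, NULL, {tv_sec=1, tv_usec=0}strace: Process 18451 detached",
]

for name, (count, total) in sorted(summarize_strace(SAMPLE).items()):
    print(f"{name:8s} calls={count:3d} total={total:.6f}s")
```

Running the same summary over a trace of one of the busy child processes, instead of the parent, is more likely to show where a worker is actually blocked.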
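MySQL instance report:
The MySQL instance runs on Google Cloud's managed MySQL service (called Cloud SQL), so I cannot log in to it over SSH. Below are the instance properties in a screenshot, and the output of the queries that Wilson Hauck requested:
- SHOW GLOBAL STATUS: https://pastebin.com/fPDdjfME
- SHOW GLOBAL VARIABLES: https://pastebin.com/PVfYt7Cc
- SHOW FULL PROCESSLIST: https://pastebin.com/NtyYDEsR
- SHOW FULL PROCESSLIST 2: https://pastebin.com/TYzGjTWB
Below is an image of "top" on the web server, taken at a rare moment of low CPU usage (it is midnight now and the server was rebooted not long ago); it regularly jumps to 100% CPU usage. I know that when I look at this tomorrow morning, the process list will be full of httpd processes that never end, if I am able to run top at all (by then I usually can't).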
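Since top is often unusable by the time the problem shows, a small script that ranks httpd workers from a `ps` snapshot can help pick a child PID to attach strace to. This is only a sketch: `busiest` is a hypothetical helper, it expects text in the column layout of `ps -eo pid,pcpu,rss,comm`, and the sample rows below are made up for illustration (only PIDs 18451 and 19466 appear in the trace above; the CPU and memory figures are invented).

```python
def busiest(ps_output, command="httpd", top=3):
    """Return up to `top` (pid, cpu_percent, rss_kb) rows for `command`, busiest first."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 3)
        if len(parts) != 4 or not parts[0].isdigit():
            continue  # header or malformed line
        try:
            cpu = float(parts[1].replace(",", "."))  # tolerate a decimal comma
            rss = int(parts[2])
        except ValueError:
            continue
        if parts[3].strip() == command:
            rows.append((int(parts[0]), cpu, rss))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:top]


# Illustrative snapshot only; the numbers are not from the real server
SAMPLE_PS = """\
  PID %CPU    RSS COMMAND
18451  0.0  12000 httpd
19466 98.5 410000 httpd
19470 45.2 380000 httpd
  812  0.1   5000 sshd
"""

for pid, cpu, rss in busiest(SAMPLE_PS):
    print(f"pid={pid} cpu={cpu}% rss={rss} kB")
```

Saving such a snapshot from cron every minute would also leave a record of which workers were growing before the server became unresponsive.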
On the web server: ulimit -a:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 29205
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 4096
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
On the web server: iostat -xm 5 3
03-05-20 _x86_64_ (2 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
22,07 0,00 0,80 0,07 0,00 77,07
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0,00 0,11 0,24 0,83 0,01 0,01 37,44 0,01 5,77 11,20 4,18 2,16 0,23
avg-cpu: %user %nice %system %iowait %steal %idle
55,73 0,00 0,91 0,00 0,00 43,36
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00 0,00
avg-cpu: %user %nice %system %iowait %steal %idle
0,10 0,00 0,20 0,00 0,00 99,70
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0,00 0,00 0,00 2,00 0,00 0,03 28,80 0,00 1,40 0,00 1,40 0,20 0,04
- apachectl -D DUMP_RUN_CFG: https://pastebin.com/Y1pXtisa
- apachectl -D DUMP_MODULES: https://pastebin.com/CKFYKtUR
- php -i: https://pastebin.com/PDzH16G4
- httpd -v:
Server version: Apache/2.4.6 (CentOS)
Server built: Apr 2 2020 13:13:23
- httpd.conf: https://pastebin.com/ibeSqwRq
- free -m:
total used free shared buff/cache available
Mem: 7317 1617 3848 49 1851 5352
Swap: 4095 7 4088
Suggestions to consider for the admin.even..com Google Cloud database flags
connect_timeout=20 # from 10 to be more tolerant for connection request
net_read_timeout=90 # from 30 to be more tolerant with read timeout
net_write_timeout=90 # from 60 to be more tolerant with write timeout
slave_net_timeout=30 # from 30 to be more tolerant with slave timeout
log_error=/mysql/logs/mysql-error.log # from stderr (console) for documentation of err's
innodb_lru_scan_depth=100 # from 2048 to reduce 90% of CPU cycles used for function
innodb_fast_shutdown=0 # from 1 to avoid recovery on restart
max_connections=512 # from 4030 to conserve RAM footprint (if possible)
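More tuning opportunities exist. When I looked up Google db-n1-standard-1, it is listed as 1 CPU and 3840 MB RAM, whereas top indicates you have ~8 GB RAM and iostat indicates you have 2 CPUs. Please let us know how it goes in a few days.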
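The suggested flags above all follow a `name=value # from old ...` pattern, so they can be tabulated before being applied. The sketch below is an illustration only: `parse_flags` and `unchanged` are hypothetical helpers, and it does not check whether a managed Cloud SQL instance accepts each flag. It does surface one entry worth double-checking: as written, `slave_net_timeout` goes from 30 to 30.

```python
import re

# name=new_value  # from old_value ...
FLAG_RE = re.compile(r"^(?P<name>\w+)=(?P<new>\S+)\s+#\s+from\s+(?P<old>\S+)")

# Copied verbatim from the suggestions above
SUGGESTED = """\
connect_timeout=20 # from 10 to be more tolerant for connection request
net_read_timeout=90 # from 30 to be more tolerant with read timeout
net_write_timeout=90 # from 60 to be more tolerant with write timeout
slave_net_timeout=30 # from 30 to be more tolerant with slave timeout
log_error=/mysql/logs/mysql-error.log # from stderr (console) for documentation of err's
innodb_lru_scan_depth=100 # from 2048 to reduce 90% of CPU cycles used for function
innodb_fast_shutdown=0 # from 1 to avoid recovery on restart
max_connections=512 # from 4030 to conserve RAM footprint (if possible)
"""


def parse_flags(text):
    """Return a list of (name, old_value, new_value) tuples, in file order."""
    rows = []
    for line in text.splitlines():
        m = FLAG_RE.match(line.strip())
        if m:
            rows.append((m.group("name"), m.group("old"), m.group("new")))
    return rows


def unchanged(rows):
    """Names whose suggested value equals the stated old value."""
    return [name for name, old, new in rows if old == new]


rows = parse_flags(SUGGESTED)
for name, old, new in rows:
    print(f"{name:24s} {old:>8s} -> {new}")
print("no-op entries:", unchanged(rows))
```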