Nginx Amplify fails to report PHP-FPM stats after a few minutes

VPS: DigitalOcean, Amplify: v0.43, Nginx: v1.13.0, PHP-FPM: v7.0.19, OS: CentOS 7

I am trying to enable PHP-FPM metrics in the Nginx Amplify reporting tool. It works for a few minutes, and then, when the service restarts, errors appear in /var/log/amplify-agent/agent.log.

agent.conf (relevant part):

[credentials]
api_key = ******************
hostname =
uuid = *******************
imagename =

[nginx]
user = nginx
stub_status = /nginx_status

[extensions]   
phpfpm = True  

agent.log (errors):

2017-05-30 21:30:48,374 [21034] supervisor running /usr/sbin/nginx -t -c /etc/nginx/nginx.conf
2017-05-30 21:31:18,079 [21034] supervisor failed to find php-fpm bin path, last attempt: "ls -la /proc/24400/exe" failed due to AmplifySubprocessError
2017-05-30 20:37:18,394 [9929] supervisor run failed
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/amplify/agent/managers/abstract.py", line 135, in _run
    self._start_objects()
  File "/usr/lib/python2.7/site-packages/amplify/agent/managers/abstract.py", line 123, in _start_objects
    child_obj.start()
  File "/usr/lib/python2.7/site-packages/amplify/agent/objects/abstract.py", line 149, in start
    context.log.debug('starting object "%s" %s' % (self.type, self.definition_hash))
  File "/usr/lib/python2.7/site-packages/amplify/agent/objects/abstract.py", line 84, in definition_hash
    definition_string = str(map(lambda x: u'%s:%s' % (x, self.definition[x]), sorted(self.definition.keys())))
  File "/usr/lib/python2.7/site-packages/amplify/ext/abstract/object.py", line 47, in definition
    return {'type': self.type, 'local_id': self.local_id, 'root_uuid': self.root_uuid}
  File "/usr/lib/python2.7/site-packages/amplify/agent/objects/abstract.py", line 115, in local_id
    self._local_id = hashlib.sha256('_'.join(self.local_id_args)).hexdigest()
TypeError: sequence item 0: expected string, list found

PHP-FPM /etc/php-fpm.d/www.conf (relevant part):

[www]
user = nginx
group = nginx

listen = /var/run/php-fpm/php-fpm.sock
listen.backlog = 16383
listen.allowed_clients = 127.0.0.1

pm = dynamic
pm.max_children = 25
pm.start_servers = 10
pm.min_spare_servers = 5
pm.max_spare_servers = 10
pm.max_requests = 500

pm.status_path = /php_status

Nginx.conf (relevant part):

user nginx nginx;

As described in the documentation, this works:

$ SCRIPT_NAME=/php_status SCRIPT_FILENAME=/php_status QUERY_STRING= REQUEST_METHOD=GET cgi-fcgi -bind -connect /var/run/php-fpm/php-fpm.sock

Result:

X-Powered-By: PHP/7.0.19
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Cache-Control: no-cache, no-store, must-revalidate, max-age=0
Content-type: text/plain;charset=UTF-8

pool:                 www
process manager:      dynamic
start time:           29/May/2017:15:40:29 +0200
start since:          107193
accepted conn:        806252
listen queue:         0
max listen queue:     0
listen queue len:     0
idle processes:       10
active processes:     14
total processes:      24
max active processes: 25
max children reached: 1840
slow requests:        330
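
For anyone who wants to inspect these counters programmatically, here is a minimal Python sketch that turns the plain-text output above into a dictionary. The function name `parse_fpm_status` and the `SAMPLE` string are made up for illustration; this is not how the Amplify agent parses the page.

```python
def parse_fpm_status(raw):
    """Parse PHP-FPM's plain-text status output into a dict."""
    text = raw.replace("\r\n", "\n")
    # cgi-fcgi prints the response headers first, then a blank line, then the body.
    head, sep, body = text.partition("\n\n")
    if not sep:
        body = head  # no header block present
    stats = {}
    for line in body.splitlines():
        # Split on the first colon only, so "start time" keeps its full value.
        key, colon, value = line.partition(":")
        if not colon:
            continue
        value = value.strip()
        stats[key.strip()] = int(value) if value.isdigit() else value
    return stats


# Shortened copy of the output shown above (hypothetical test input).
SAMPLE = """X-Powered-By: PHP/7.0.19
Content-type: text/plain;charset=UTF-8

pool:                 www
start time:           29/May/2017:15:40:29 +0200
active processes:     14
max children reached: 1840
"""

stats = parse_fpm_status(SAMPLE)
print(stats["pool"], stats["active processes"], stats["max children reached"])
```

Splitting at the first blank line keeps the response headers out of the result, so only the status fields end up in the dictionary.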

I guess the problem is that the processes are restarted after some time and their PIDs change:

supervisor failed to find php-fpm bin path, last attempt: "ls -la /proc/24400/exe" failed due to AmplifySubprocessError

I am not sure, but maybe pm.max_requests = 500 is responsible for this.

Thanks for the report!

There are two issues affecting your system. The first is that we cannot find the bin_path for your phpfpm via ls:

supervisor failed to find php-fpm bin path, last attempt: "ls -la /proc/24400/exe" failed due to AmplifySubprocessError

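This means that the user our agent runs as does not have permission to access the proc filesystem for your phpfpm master process or any of its children. This information is purely metadata and should not affect the running of the agent or its collection of phpfpm metrics from your status page. The information we collect is your bin_path, on which we then run --version to gather version information about your phpfpm for display in your inventory.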
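
To check this by hand, a small Python sketch like the one below can be run as the same user the agent runs as. It reads the `/proc/<pid>/exe` symlink with `os.readlink` instead of shelling out to `ls`, so it is only an approximation of the lookup in the log, not the agent's actual code; `find_bin_path` is a hypothetical name.

```python
import os


def find_bin_path(pid):
    """Return the executable path behind /proc/<pid>/exe, or None if unreadable."""
    try:
        return os.readlink("/proc/%d/exe" % pid)
    except OSError as exc:
        # Typical causes: the process belongs to another user (permission
        # denied) or it has already exited (no such file or directory).
        print("cannot resolve /proc/%d/exe: %s" % (pid, exc.strerror))
        return None


# Example: resolve the current process; substitute a php-fpm PID to test it.
print(find_bin_path(os.getpid()))
```

A "no such file" failure rather than a "permission denied" one would suggest the worker had already exited by the time of the lookup, which would be consistent with workers being recycled.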

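That said, we will be releasing a new agent, 0.44, at the end of this week (Friday, June 16, 2017) or early next week (Monday, June 19, 2017). It includes changes to this workflow that may benefit you.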

The second issue affecting your system is more serious and affects your agent's run time:

2017-05-30 21:31:18,079 [21034] supervisor failed to find php-fpm bin path, last attempt: "ls -la /proc/24400/exe" failed due to AmplifySubprocessError
2017-05-30 20:37:18,394 [9929] supervisor run failed
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/amplify/agent/managers/abstract.py", line 135, in _run
    self._start_objects()
  File "/usr/lib/python2.7/site-packages/amplify/agent/managers/abstract.py", line 123, in _start_objects
    child_obj.start()
  File "/usr/lib/python2.7/site-packages/amplify/agent/objects/abstract.py", line 149, in start
    context.log.debug('starting object "%s" %s' % (self.type, self.definition_hash))
  File "/usr/lib/python2.7/site-packages/amplify/agent/objects/abstract.py", line 84, in definition_hash
    definition_string = str(map(lambda x: u'%s:%s' % (x, self.definition[x]), sorted(self.definition.keys())))
  File "/usr/lib/python2.7/site-packages/amplify/ext/abstract/object.py", line 47, in definition
    return {'type': self.type, 'local_id': self.local_id, 'root_uuid': self.root_uuid}
  File "/usr/lib/python2.7/site-packages/amplify/agent/objects/abstract.py", line 115, in local_id
    self._local_id = hashlib.sha256('_'.join(self.local_id_args)).hexdigest()
TypeError: sequence item 0: expected string, list found
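
The last frame of that traceback is easy to reproduce in isolation: `'_'.join(...)` only accepts strings, so it raises as soon as the first element is itself a list. The sketch below is written for Python 3 (the agent in the log runs on Python 2.7, so the exact wording of the message differs slightly), and the argument values are invented for illustration, since the real contents of `local_id_args` are not shown in the log.

```python
import hashlib


def local_id(args):
    # Same shape as the failing line; Python 3 needs bytes for sha256.
    return hashlib.sha256("_".join(args).encode("utf-8")).hexdigest()


# All items are strings: hashing works.
print(local_id(["/usr/sbin/php-fpm", "www"]))

# First item is a list (hypothetical bad input): join() raises TypeError.
try:
    local_id([["/usr/sbin/php-fpm"], "www"])
except TypeError as exc:
    print("TypeError:", exc)
```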

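I can't say for certain about your system, but other customers have reported this error when our agent finds an orphaned phpfpm pool worker. For example, the output of ps xao pid,ppid,command | grep 'php-fpm[:]' might be: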

[
    '15923     1 php-fpm: master process (/etc/php-fpm.conf)',
    '20704 15923 php-fpm: pool www',
    '20925 15923 php-fpm: pool www',
    '21350 15923 php-fpm: pool www',
    '21385 15923 php-fpm: pool www',
    '21386 15923 php-fpm: pool www',
    '21575 15923 php-fpm: pool www',
    '21699 15923 php-fpm: pool www',
    '21734 15923 php-fpm: pool www',
    '21735 15923 php-fpm: pool www',
    '21781 15923 php-fpm: pool www',
    '21782 15923 php-fpm: pool www',
    '22287 15923 php-fpm: pool www',
    '22330 15923 php-fpm: pool www',
    '22331 15923 php-fpm: pool www',
    '22495 15923 php-fpm: pool www',
    '22654 21386 php-fpm: pool www',  # <---- Orphan?
    ''
]

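We are still investigating what might cause this situation, but for now we have made our agent more robust by handling this case silently. The fix is in agent version 0.44, which, as mentioned above, should be released soon.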

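If you are still interested, I strongly recommend downloading and installing agent 0.44 when it is released. If your system still has problems, feel free to contact us through our customer support channels.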

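I would also suggest checking your own system to see whether you have any "orphaned" pool workers, since they could indicate other problems. But again, our new agent will catch and handle this case appropriately.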
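
One way to run that check is sketched below in Python, assuming the `ps xao pid,ppid,command` output format shown earlier. It treats a pool worker as orphaned when its parent PID is not one of the php-fpm master PIDs. The function names are hypothetical, and this is not the agent's own detection logic.

```python
import subprocess


def find_orphans(ps_lines):
    """Return (pid, ppid, command) for pool workers whose parent is not a master."""
    masters, workers = set(), []
    for line in ps_lines:
        parts = line.split(None, 2)
        # Skip blank lines and the "PID PPID COMMAND" header.
        if len(parts) < 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        pid, ppid, command = int(parts[0]), int(parts[1]), parts[2]
        if command.startswith("php-fpm: master process"):
            masters.add(pid)
        elif command.startswith("php-fpm: pool"):
            workers.append((pid, ppid, command))
    return [w for w in workers if w[1] not in masters]


def live_ps_lines():
    """Collect ps output from the running system (not called in this demo)."""
    out = subprocess.check_output(["ps", "xao", "pid,ppid,command"])
    return out.decode("utf-8", "replace").splitlines()


# Shortened version of the example listing above.
sample = [
    "15923     1 php-fpm: master process (/etc/php-fpm.conf)",
    "20704 15923 php-fpm: pool www",
    "21386 15923 php-fpm: pool www",
    "22654 21386 php-fpm: pool www",
    "",
]
print(find_orphans(sample))  # -> [(22654, 21386, 'php-fpm: pool www')]
```

On a live server, `find_orphans(live_ps_lines())` would apply the same check to the real process table.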

Hope this helps!

Grant (NGINX Amplify agent contributor)