PHP log analysis: fixing the NGINX + PHP-FPM error "failed to ptrace(PEEKDATA) Input/output error"

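Today I looked through PHP's error log (error_log = /usr/local/php/var/log/php-fpm.log) and slow log (slowlog = /usr/local/php/var/log/php-fpm.log.slow) and found the error log full of entries like "ERROR: failed to ptrace(PEEKDATA) pid 4276: Input/output error (5)". To find the cause I searched online and came across the articles quoted below. The author of the first one started checking his logs because his site kept returning "bad gateway" errors, and that is how he found this message. I do not yet know whether the same entries in my log mean that my pages are also returning "bad gateway"; that still needs to be verified.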

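Several sources I checked all say the error is caused by PHP's slow log being enabled. To keep the system from misbehaving, I decided to comment out the following slow log settings: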

request_slowlog_timeout = 30
slowlog = /usr/local/php/var/log/php-fpm.log.slow 

Source 1:

The site keeps showing a "bad gateway" message, on and off. I checked the error_log and found a pile of errors like these:

[29-Mar-2014 22:40:10] ERROR: failed to ptrace(PEEKDATA) pid 4276: Input/output error (5)
[29-Mar-2014 22:53:54] ERROR: failed to ptrace(PEEKDATA) pid 4319: Input/output error (5)
[29-Mar-2014 22:56:30] ERROR: failed to ptrace(PEEKDATA) pid 4342: Input/output error (5)
[29-Mar-2014 22:56:34] ERROR: failed to ptrace(PEEKDATA) pid 4321: Input/output error (5)
[29-Mar-2014 22:56:40] ERROR: failed to ptrace(PEEKDATA) pid 4314: Input/output error (5)
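To get a feel for how often the error occurs, the log lines above can be tallied by hour and by pid. The following is only a minimal sketch: the function name count_ptrace_errors is made up for illustration, and the regular expression assumes the exact line layout shown in the excerpt above.

```python
import re
from collections import Counter

# Matches lines such as:
# [29-Mar-2014 22:40:10] ERROR: failed to ptrace(PEEKDATA) pid 4276: Input/output error (5)
# Assumption: the timestamp layout is exactly the one seen in the excerpt.
PTRACE_ERROR = re.compile(
    r"^\[(?P<date>\d{2}-[A-Za-z]{3}-\d{4}) (?P<hour>\d{2}):\d{2}:\d{2}\] "
    r"ERROR: failed to ptrace\(PEEKDATA\) pid (?P<pid>\d+):"
)


def count_ptrace_errors(lines):
    """Return (errors per 'date hour:00', errors per pid) for the given log lines."""
    per_hour = Counter()
    per_pid = Counter()
    for line in lines:
        match = PTRACE_ERROR.match(line.strip())
        if match is None:
            continue  # not a ptrace(PEEKDATA) error line
        per_hour[f"{match['date']} {match['hour']}:00"] += 1
        per_pid[int(match["pid"])] += 1
    return per_hour, per_pid


if __name__ == "__main__":
    sample = [
        "[29-Mar-2014 22:40:10] ERROR: failed to ptrace(PEEKDATA) pid 4276: Input/output error (5)",
        "[29-Mar-2014 22:53:54] ERROR: failed to ptrace(PEEKDATA) pid 4319: Input/output error (5)",
        "[29-Mar-2014 22:56:30] ERROR: failed to ptrace(PEEKDATA) pid 4342: Input/output error (5)",
        "[29-Mar-2014 22:56:34] ERROR: failed to ptrace(PEEKDATA) pid 4321: Input/output error (5)",
        "[29-Mar-2014 22:56:40] ERROR: failed to ptrace(PEEKDATA) pid 4314: Input/output error (5)",
    ]
    per_hour, per_pid = count_ptrace_errors(sample)
    for hour, count in sorted(per_hour.items()):
        print(hour, count)
```

For a real log, an open file object can be passed in place of the sample list, since the function only iterates over lines.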

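I searched online and found many suggestions. Many people blame rlimit_files (the limit on open files), but that did not seem convincing. In the end I found an English-language answer that looks more plausible.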

http://serverfault.com/questions/406532/i-o-error-with-php5-fpm-ptracepeekdata-failed

It appears you have request_slowlog_timeout enabled. This normally takes any request longer than N seconds, logs that it was taking a long time, then logs a stack trace of the script so you can see what it was doing that was taking so long.

In your case, the stack trace (to determine what the script is doing) is failing. If you’re running out of processes, it is because either:

1. After php-fpm stops the process to trace it, the process fails to resume because of the error tracing it
2. The process is resuming but continues to run forever.

My first guess would be to disable request_slowlog_timeout. Since it’s not working right, it may be doing more harm than good. If this doesn’t fix the issue of running out of processes, then set the php.ini max_execution_time to something that will kill the script for sure.

So it looks like the cause is that I had slowlog turned on and had also set request_slowlog_timeout, so the error is raised while the PHP script is still running.

The fix described above is:

Disable request_slowlog_timeout and slowlog in php-fpm.conf, then adjust the max_execution_time parameter in php.ini.
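Disabling the two directives just means prefixing them with ";", the comment character of the ini syntax that php-fpm.conf uses. As a rough sketch, the helper below does that to the text of a config file. The name comment_out is hypothetical, and pm.max_children appears only as an unrelated line that must stay untouched. The value for max_execution_time is not given in the source, so it is left out here.

```python
import re


def comment_out(config_text, names):
    """Prefix active `name = value` lines for the given directives with ';'."""
    alternation = "|".join(re.escape(name) for name in names)
    # Match only lines that start (after optional indentation) with one of the
    # names followed by '='; lines already starting with ';' do not match.
    pattern = re.compile(
        r"^([ \t]*)(" + alternation + r")([ \t]*=)", re.MULTILINE
    )
    return pattern.sub(r"\1;\2\3", config_text)


if __name__ == "__main__":
    sample = (
        "request_slowlog_timeout = 30\n"
        "slowlog = /usr/local/php/var/log/php-fpm.log.slow\n"
        "pm.max_children = 5\n"  # unrelated directive, illustrative value
    )
    print(comment_out(sample, ["request_slowlog_timeout", "slowlog"]))
```

Running the helper on text it has already processed leaves the text unchanged, so it is safe to apply more than once. It only transforms a string; writing the result back to the file and restarting php-fpm remain manual steps.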

Source 2:

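Recently the server has been returning 502 frequently (an nginx + php + php-fpm setup). While debugging, I found a large number of these entries in the php-fpm log: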

[02-Aug-2014 08:55:24] ERROR: failed to ptrace(PEEKDATA) pid 4046: Input/output error (5)
[02-Aug-2014 08:55:24] NOTICE: finished trace of 4046
[02-Aug-2014 08:55:24] NOTICE: child 4047 stopped for tracing
[02-Aug-2014 08:55:24] NOTICE: about to trace 4047
[02-Aug-2014 08:55:24] ERROR: failed to ptrace(PEEKDATA) pid 4047: Input/output error (5)
[02-Aug-2014 08:55:24] NOTICE: finished trace of 4047
[02-Aug-2014 08:55:24] NOTICE: child 4054 stopped for tracing
[02-Aug-2014 08:55:24] NOTICE: about to trace 4054
[02-Aug-2014 08:55:24] ERROR: failed to ptrace(PEEKDATA) pid 4054: Input/output error (5)
[02-Aug-2014 08:55:24] NOTICE: finished trace of 4054
[02-Aug-2014 08:55:24] NOTICE: child 4076 stopped for tracing
[02-Aug-2014 08:55:24] NOTICE: about to trace 4076
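The excerpt above shows a repeating pattern per child: stopped for tracing, about to trace, the ptrace error, finished trace. A small script can group these events by pid and report which traces failed. This is a sketch only: trace_outcomes is a made-up name, the message patterns are taken from the lines above, and it assumes each pid appears in a single trace (in a long log pids get reused, so separate traces of the same pid would be merged).

```python
import re

# Message fragments as they appear in the excerpt, each capturing the pid.
PATTERNS = [
    ("stopped", re.compile(r"child (\d+) stopped for tracing")),
    ("started", re.compile(r"about to trace (\d+)")),
    ("failed", re.compile(r"failed to ptrace\(PEEKDATA\) pid (\d+)")),
    ("finished", re.compile(r"finished trace of (\d+)")),
]


def trace_outcomes(lines):
    """Map each traced pid to 'failed', 'ok', or 'incomplete'."""
    events = {}  # pid -> set of event names seen for that pid
    for line in lines:
        for name, pattern in PATTERNS:
            match = pattern.search(line)
            if match:
                events.setdefault(int(match.group(1)), set()).add(name)
                break
    outcomes = {}
    for pid, seen in events.items():
        if "failed" in seen:
            outcomes[pid] = "failed"  # a ptrace(PEEKDATA) error was logged
        elif "finished" in seen:
            outcomes[pid] = "ok"  # trace finished with no error logged
        else:
            outcomes[pid] = "incomplete"  # excerpt ends before the trace does
    return outcomes


if __name__ == "__main__":
    sample = [
        "[02-Aug-2014 08:55:24] ERROR: failed to ptrace(PEEKDATA) pid 4046: Input/output error (5)",
        "[02-Aug-2014 08:55:24] NOTICE: finished trace of 4046",
        "[02-Aug-2014 08:55:24] NOTICE: child 4047 stopped for tracing",
        "[02-Aug-2014 08:55:24] NOTICE: about to trace 4047",
        "[02-Aug-2014 08:55:24] ERROR: failed to ptrace(PEEKDATA) pid 4047: Input/output error (5)",
        "[02-Aug-2014 08:55:24] NOTICE: finished trace of 4047",
        "[02-Aug-2014 08:55:24] NOTICE: child 4076 stopped for tracing",
        "[02-Aug-2014 08:55:24] NOTICE: about to trace 4076",
    ]
    for pid, outcome in trace_outcomes(sample).items():
        print(pid, outcome)
```

On the excerpt above, every trace that reaches "finished trace" also logged the ptrace error, which fits the idea that the slow log's stack trace is what fails.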

Comparing the configuration files used in the last two rounds of system tuning showed that the problem comes from these php-fpm settings:

;request_slowlog_timeout = 5s
;slowlog = /usr/local/php/var/log/php-fpm.log.slow

The fix is simply to comment out the settings above, then run /etc/init.d/php-fpm restart for the change to take effect.