I have a PHP cron job that crashes without logging anything to php_errors on Debian. Most of the time (99% of the time), it works fine and I have no problem with it.
But randomly, it just stops and nothing appears in any log on the server. I have seen the issue on 2 different servers (with very similar installs), always when the load increases.
I installed systemd-coredump on the server because I suspected a segfault in one of the PHP libraries (the script is complex and makes a lot of web service calls), but it didn't log anything on the last crash.
Out-of-memory errors are logged properly in php_errors, so that doesn't seem to be the problem.
What can I do to gather logs that would give me a hint about what happens and why my process just stops?
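One thing that could at least be captured is how the process ended. The following is a minimal sketch, not a definitive fix: a hypothetical wrapper function `run_and_log` that runs the cron command and records its exit status. It assumes a POSIX-style shell such as dash or bash, where a status above 128 means the child was killed by signal (status - 128). The `CRON_EXIT_LOG` variable and its default path are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command and log how it ended.
# CRON_EXIT_LOG and its default path are assumptions for this sketch.

run_and_log() {
    log="${CRON_EXIT_LOG:-/tmp/cron_exit.log}"
    "$@"
    status=$?
    now="$(date '+%Y-%m-%d %H:%M:%S')"
    if [ "$status" -gt 128 ]; then
        # In dash and bash, 128 + N means the child was killed by signal N
        echo "$now '$*' killed by signal $((status - 128))" >> "$log"
    elif [ "$status" -ne 0 ]; then
        echo "$now '$*' exited with status $status" >> "$log"
    fi
    return "$status"
}

# Example call from the crontab script (the path is a placeholder):
#   run_and_log php /path/to/cron.php
```

A "killed by signal 9" entry would point toward something external such as the kernel OOM killer, while "signal 11" would support the segfault theory.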
Related
I deployed nginx, php-fpm and PHP 8 on an EC2 / Linux 2 instance (T4g / ARM) to run a PHP application, as I had done for the previous version of this application with PHP 7.
It runs well, except for every first request. Whatever the action (clicking a button, submitting text, etc.), the first request always takes 2.2x minutes, then the following ones run quickly.
The browsers (Firefox and Chrome) just wait for the response, then react normally.
I see nothing from the logs (in particular, the slow log is empty) and the caches seem to work well.
I guess I missed a configuration setting. Based on my reading, I tried many changes to the php-fpm and PHP configuration, without success.
Has anyone already encountered this kind of issue?
Thanks in advance
Fred
Enabling all logs for php-fpm and PHP,
Increasing the memory for the process,
Checking the system parameters (ulimit, etc.),
etc.
You've not provided details of the nginx config, nor the fpm config.
I see nothing from the logs
There's your next issue. The default (combined) log format does not show any timing information. Try adding $upstream_response_time and $request_time to your log format. This should tell you whether the issue is outside your host, between nginx and PHP, or on the PHP side.
You should also be monitoring the load and CPU when those first couple of hits arrive along with the opcache usage.
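Once the timing variables are being logged, the slow requests can be pulled out of the access log. This is only a sketch under assumptions: the `log_format` shown in the comment is an example rather than the poster's configuration, and the hypothetical helper `slow_requests` relies on `$request_time` being the second-to-last field of each line.

```shell
# Assumed nginx log format (an example, not from the original post):
#   log_format timed '$remote_addr [$time_local] "$request" $status '
#                    '$request_time $upstream_response_time';
#   access_log /var/log/nginx/access.log timed;
# With it, the last two fields of every line are request_time and
# upstream_response_time, both in seconds.

slow_requests() {
    # $1: access log path, $2: threshold in seconds (default 1)
    awk -v limit="${2:-1}" '
        # $(NF-1) is request_time; "+ 0" forces a numeric comparison
        ($(NF-1) + 0) >= (limit + 0) { print }
    ' "$1"
}

# Example: slow_requests /var/log/nginx/access.log 60
```

If both values are large on a slow line, the time is being spent on the PHP side. A large request time with a small upstream time points to something between the client and nginx.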
First of all, thanks to @symcbean for your point. It helped me find the script that was taking a long time to render and fix the problem.
The problem was not due to the configuration of nginx, PHP-FPM or PHP. It occurred because of an obscure auto-update parameter of the application running on these components, which forced the application to call a remote server and blocked the rendering.
MySQL 5.1.73
Apache/2.2.15
PHP 5.6.13
CentOS release 6.5
Cakephp 3.1
After about 4 minutes (3 min, 57 seconds) the import process I'm running stops. There are no errors or warnings in any log that I can find. The import process consists of a lot of SQL calls and data processing, nothing too crazy, but it can take about 10 minutes to get through 5500 records if it's doing a full compare for updates.
Firefox: Secure Connection Failed - The connection to the server was reset while the page was loading.
Chrome: ERR_NO RESPONSE
The PHP time limit (set_time_limit) is set to 900 seconds, and it is working: I can set it to 5 seconds and get an error. The limit is not being reached.
I can sleep another controller for 10 minutes, and this error does not happen, indicating that something in the actual program is causing it to fail, and not the hosting service killing the request because it's taking too long (read about VPS doing this to prevent spam).
The php errors are turned all the way up in the php.ini, and just to be sure, in the controller itself.
The import process completes if I reduce the size of the file being imported. If it's just long enough, it will complete AND show the browser message. This indicates to me it's not failing at the same point of execution each time.
I have deleted all the cache and restarted the server.
I do not see any output in the Apache logs other than that the request was made.
I do not see any errors in the MySQL log; however, I don't know if that's because it's not turned on.
The exact same code works on my local host without any issue. It's not a perfect match to the server, but it's close. Ubuntu Desktop vs Centos, php 5.5 vs php 5.6
I have kept an eye on the memory usage and don't see any issues there.
At this point I'm looking for any good suggestions on what else to look at or insights into what could be causing the failure. There are a lot of possible places to look, and without an error, it's really difficult to narrow down where the issue might be. Thanks in advance for any advice!
UPDATE
After taking a closer look at the memory usage during the request, I noticed it was getting much higher than it ideally should.
The httpd (Apache) process gets killed and a new one is spawned. Once the new one runs out of memory, the error shows up on the screen. When I had looked at it previously, it was only at 30%, probably because the old process had just been killed. Watching it the whole way through, I saw it get as high as 80%, which together with the other processes was enough to make the server run out of memory. A killed process can't log anything, hence the lack of errors or warnings. It is interesting to me that the process just starts right back up.
I found a command to show which processes had been killed due to memory which proved very useful:
dmesg | egrep -i 'killed process'
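Watching memory for the whole duration of the request, rather than checking it once, could be scripted roughly as follows. This is a sketch that assumes a Linux system with /proc mounted. The helper names `sample_rss` and `watch_rss` are made up for this example.

```shell
# Print "HH:MM:SS rss_kb" for the process with PID $1.
# Reads VmRSS (resident memory, in kB) from /proc, so it is Linux-only.
sample_rss() {
    status_file="/proc/$1/status"
    [ -r "$status_file" ] || return 1      # process no longer exists
    rss="$(awk '/^VmRSS:/ { print $2 }' "$status_file")"
    [ -n "$rss" ] || return 1              # e.g. kernel threads have no VmRSS
    echo "$(date '+%H:%M:%S') $rss"
}

# Sample every $2 seconds (default 5) until the process disappears.
watch_rss() {
    while sample_rss "$1"; do
        sleep "${2:-5}"
    done
}

# Example: watch_rss <httpd pid> 2 >> /tmp/httpd_rss.log
```

The last line written before the process disappears gives an idea of how large it was when it was killed.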
I had similar problems with DebugKit.
I had a bug in my code, and during the memory peak the context was written as HTML to the error "log".
I need to test the behaviour of a tool I use on my web server, but it works only in case of a server fault. So I need to crash the server somehow. I tested a lot of scripts found on Google, such as infinite loops (while(true)) and some preg_match(...) and str_repeat(...) calls; nothing crashes it. I even tried to retrieve an 8 GB file with no problem; PHP just reports an Internal Server Error. Thanks for any help.
I think it might be possible to get Apache to segfault with mod_php by providing a regex that needs backtracking, setting high PCRE limits and low PHP memory limits. Unfortunately, I can't recall which versions were involved.
Are you sure it's not good enough to just send a kill signal?
--edit--
That is, send a kill signal to your web server. Something similar to killall -9 apache-httpd, or whatever the name of your web server process is. Just check with your admin that this will target the correct processes.
I have tons of scripts which run in the background via cron jobs, plus a lot of frontend activity. Some weeks ago we had a fatal system error where our mounted drives were fried and our cron job got stuck... we had to restart the whole system and even go so far as to restart the rack the old-fashioned way.
The problem is that our Debian instance was partially "kaput".
Some of the files got completely locked and had no permissions at all; when you ran ls /sys/crontab/lock it showed no permissions, but the cron job was still running and causing tons of problems.
The worst part is that PHP was still running and the MySQL server was up and running, even without file system permissions (and was placing orders without the files... BIG problem). Talk about too reliable.
Now my question is: is there a way to detect whether a Debian system is working properly?
Other than trying touch on every script run, or writing locks in the DB.
I've written a PHP script that takes in two CSV files, processes them and returns an HTML table. I developed it on my MacBook running on Apache. When I uploaded the script to our production server, it began having problems. Production is an Ubuntu 10.04 LTS running nginx that forwards requests to Apache/PHP.
I added some debugging statements and tailed the logs so I can see exactly where the script stops executing, but no errors are thrown anywhere. The first file is 1.9 MB and the script processes 366 KB of it before it fails. I've tested this several times and it always fails at the same place. I don't believe it's the file, as it's the same file I used for testing the script and it never had a problem on my MacBook.
I've searched the internet and have increased several timeout parameters in nginx to no avail. I'm not sure where to look or what to look for at this point. Can someone point me in the right direction?
Have you fully turned on error reporting on the server?