I have a ZF2 PHP application which is executed by a bash script every minute. This is running inside an EC2 instance.
Here's my code:
while :
do
    # run the ZF2 console job in the background, then wait for it to finish
    php public/index.php start-processor &
    wait
    # pause for a minute before the next run
    sleep 60
done
[Metrics reading: memory utilization graph of the EC2 instance]
Based on the metrics, memory keeps climbing until it reaches 100% and then drops. Is this normal, or is there really a leak in my application?
I've also checked with htop and it looks fine; the processes don't eat that much memory.
Hope someone could explain what is happening here. Should I worry about this?
Thanks and more power.
It does not look like a memory leak to me; with a leak the used amount would just rise and never go back down, eventually causing your app to crash.
This graph looks very similar to garbage collection as it happens in the JVM. Does your PHP use something like that under the hood? I searched the web, and it looks like PHP 5.3+ has a cycle-collecting GC built in: https://secure.php.net/manual/en/features.gc.php
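If you want to watch that collector at work, here is a minimal, illustrative loop (not your actual start-processor job) that creates reference cycles, drops them, and logs what PHP reports each cycle:

<?php
// Illustrative only: reference cycles cannot be freed by refcounting alone,
// so they pile up until the cycle collector runs and reclaims them.
gc_enable(); // enabled by default in PHP 5.3+, shown here for clarity

while (true) {
    $a = new stdClass();
    $b = new stdClass();
    $a->other = $b;  // create a reference cycle
    $b->other = $a;
    unset($a, $b);

    printf("usage: %d  peak: %d  cycles collected: %d\n",
        memory_get_usage(),
        memory_get_peak_usage(),
        gc_collect_cycles());

    sleep(1);
}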
Related
In CakePHP, there are various systems for managing the queue itself (RabbitMQ, beanstalk, Amazon SQS, dereuromark's cakephp-queue), but all of those seem to require a daemonized worker task. These always-on workers (which have the full power of CakePHP behind them) listen for jobs as they come into the queue, do their processing, then sit idle until the next job comes along.
Currently, I'm using a beanstalk-based queue (linked above), and it's worked okay, but in terms of server resources, it's not particularly efficient. We have memory leaks and have to kill and restart the processes sometimes.
However, now I'm trying to add more different kinds of "tubes" (in beanstalk's parlance), and I'm bumping up against RAM issues on our servers running so many different workers at once. When I spin up all of the different workers I want, I get fatal out-of-memory errors.
I'd rather have something like a "serverless"/Lambda-style setup where the worker is spun up on-demand, does its little job, then terminates itself. Kind of like a cron job calling a CakePHP shell, but with the job data dynamically being populated from the queue.
Does anyone have experience with this kind of setup for queuing? I'm on an AWS-based infrastructure, so anything that uses Amazon services would be especially helpful.
As far as I know, there are only two ways to run PHP: either as a thread inside a web container (Apache, Nginx, CGI) or as a shell process (single-threaded). When you run it on the shell you're stuck with one thread per process.
I know that sucks, but PHP is not the best tool for server workers. A Lambda architecture isn't really going to solve this problem; you're just offloading your multi-threading issues to another host.
At the end of the day, the easiest solution is to just run more PHP processes. If you're having crashes, you need to run PHP inside a shell script that restarts it. It's just the nature of PHP on the command line.
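If you go that route, a minimal sketch of such a shell supervisor might look like this (the worker command and the count are made-up examples):

#!/usr/bin/env bash
# Hypothetical supervisor: keep N single-threaded PHP workers alive,
# restarting each one whenever it exits (crash, fatal error, etc.).
WORKERS=4
for i in $(seq 1 "$WORKERS"); do
    (
        while :; do
            php bin/cake.php queue_worker    # illustrative worker command
            sleep 1                          # brief pause before restarting it
        done
    ) &
done
wait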
But, I will share from my experience what other options you have.
However, now I'm trying to add more different kinds of "tubes" (in beanstalk's parlance), and I'm bumping up against RAM issues on our servers running so many different workers at once. When I spin up all of the different workers I want, I get fatal out-of-memory errors.
Last time I checked, beanstalk was single-threaded, so I don't think it's possible for PHP to spawn multiple workers at once with beanstalk. You have to run one PHP instance which gets a message and works on it. If you want to scale, you have to run multiple PHP instances.
It sounds like your workers either have memory leaks or simply consume a lot of memory. I don't see how this has anything to do with beanstalk. You have to fix your leaks and change your source code to use less memory.
I've had to rewrite PHP code to use a forward-reading XML parser, because the other XML parser would load the entire document into memory. The forward-reading parser used less memory, but it was a pain to rewrite all my code. You have to decide which costs you more: spending more money on RAM or spending time rewriting the code. That's your call.
Memory
PHP comes with a soft limit on memory usage. Even if the host machine has lots of memory, the PHP thread will throw an out-of-memory error when it hits the soft limit. It's something you have to manually change in the php.ini file. Forgive me if you've already done this, but I thought it was worth mentioning.
Increase PHP memory limit in php.ini:
memory_limit = 128M
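You can also raise the limit for a single worker process at runtime, or per run from the command line; a hedged example (512M and the script path are arbitrary):

<?php
// Example only: raise the limit for this one process.
ini_set('memory_limit', '512M');

// Equivalent one-off override from the command line:
//   php -d memory_limit=512M bin/worker.php
echo ini_get('memory_limit'), "\n";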
Disposable Pattern
I solved a lot of my memory leaks using a disposable pattern. It's just a simple interface you use on objects, and then you wrap code in a using() function. I was able to reduce my memory leaks by 99% with this library. (Full disclosure: this is my GitHub library.)
https://github.com/cgTag/php-disposable
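As a rough sketch of the general idea (not necessarily the library's exact API), the pattern looks something like this:

<?php
// Rough sketch only; see the library above for its real API.
interface Disposable
{
    public function dispose();
}

// Run $work with the resource and guarantee cleanup, even if $work throws.
function using(Disposable $resource, callable $work)
{
    try {
        return $work($resource);
    } finally {
        $resource->dispose();
    }
}

class BigReport implements Disposable
{
    private $rows = array();

    public function load()
    {
        $this->rows = range(1, 100000); // stand-in for heavy data
    }

    public function dispose()
    {
        $this->rows = array(); // drop references so the memory can be reclaimed
    }
}

using(new BigReport(), function (BigReport $report) {
    $report->load();
    // ... work with $report here ...
});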
Multi-threaded PHP
There is an open source project that adds multi-thread support to PHP, and it looks to me like a solid library.
https://github.com/krakjoe/pthreads
The project adds multi-thread support to PHP via a native extension that basically creates a new global scope for each thread. This allows you to run a CakePHP shell in each thread, and I think there is an API for thread-to-thread sharing of data (mutexes and things like that).
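A minimal sketch of what a pthreads worker looks like (it requires a thread-safe (ZTS) PHP build with the extension loaded; the job payload here is a made-up placeholder):

<?php
// Requires a ZTS PHP build with the pthreads extension installed.
class QueueWorker extends Thread
{
    private $jobId;

    public function __construct($jobId)
    {
        $this->jobId = $jobId;
    }

    public function run()
    {
        // Each thread starts with its own fresh global scope.
        printf("processing job %d\n", $this->jobId);
    }
}

$threads = array();
foreach (range(1, 4) as $jobId) {
    $threads[$jobId] = new QueueWorker($jobId);
    $threads[$jobId]->start();
}
foreach ($threads as $thread) {
    $thread->join();
}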
Dockerize
I've had some success running Docker just to handle a single CakePHP shell task. This allowed me to quickly scale up by running multiple containers on the same host machine. The memory overhead of the extra containers really wasn't that bad. I don't remember the exact number, but it was less than you might think.
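Roughly, it looked like this (the image name, paths and shell task are all illustrative, not from a real project):

# Build an image containing the app, then run one container per worker/"tube".
docker build -t my-cake-app .

docker run -d --name worker-emails  my-cake-app bin/cake queue_worker emails
docker run -d --name worker-reports my-cake-app bin/cake queue_worker reports

# Scaling up is just starting more containers on the same host.
docker run -d --name worker-emails-2 my-cake-app bin/cake queue_worker emails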
Daemons
They are the tried and tested way of running services on Linux. The only problem here is that it's one PHP thread per daemon, so you have to register multiple daemons to scale up. With that said, this option works well with the multi-thread library above.
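For completeness, a hedged sketch of registering one such daemon with systemd (the unit name, paths and shell command are placeholders):

# /etc/systemd/system/cake-worker@.service -- illustrative template unit
[Unit]
Description=CakePHP queue worker (%i)
After=network.target

[Service]
# One single-threaded PHP process per unit; systemd restarts it if it dies.
ExecStart=/usr/bin/php /var/www/app/bin/cake.php queue_worker %i
Restart=always

[Install]
WantedBy=multi-user.target

Starting one worker per tube is then just systemctl start cake-worker@emails, systemctl start cake-worker@reports, and so on.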
I have a problem with a possible memory leak in my console PHP application. It is created with the Laravel framework, using artisan, and runs on PHP 5.6.2. It's a huge loop that gathers data from a webservice and inserts it into my own database, probably around 300k rows.
For each loop iteration I print out memory usage in the console. The weird thing is that memory_get_usage() and memory_get_usage(true) report that it uses roughly 13 MB of memory, but the PHP process keeps using more and more memory. If I let it run for a few hours it uses almost 1 GB of memory, and the loop keeps going slower and slower.
It will not terminate due to the PHP memory limit, even though it exceeds it by far.
I am trying to figure out why this happens and how this actually works. As I understand it, memory_get_usage reports the memory used by MY script, what I have written. So unsetting variables, cleaning up etc. should not be the problem, right? I have also tried to force garbage collection every ~300 entries, with no luck.
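For reference, the forced collection and logging look roughly like this (simplified; the webservice loop and the insert are stand-ins for the real code):

<?php
$i = 0;
foreach (range(1, 300000) as $row) {     // stand-in for the webservice data
    // ... insert $row into the database ...

    if (++$i % 300 === 0) {
        gc_collect_cycles();
        printf("iteration %d  usage: %d  real usage: %d\n",
            $i, memory_get_usage(), memory_get_usage(true));
    }
}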
Does anyone have some general tips on how I can troubleshoot this? And maybe explain why the memory used by the process differs from what the memory_get_usage function shows :-)
Any help is greatly appreciated.
I'm trying to track down a memory leak in a PHP program (Magento, if it matters). The basic problem seems to be a leak in some object/class that's growing over time. That is, the more information that gets logged to the database, the more memory certain application processes end up using. Magento's a highly abstract system, so it's not always clear what code is being run that's consuming so much memory. That's what I'm trying to track down.
I've been using memory_get_peak_usage at the end of the program bootstrap file to benchmark performance, and have seen a steady growth from 250 MB of peak use to 310 MB of peak use in about a week. I would like to call memory_get_peak_usage intermittently throughout the execution cycle to ask:
What was the peak usage prior to this call? [later in the cycle] What was the peak usage prior to this new call?
The problem I'm running into is that once I call memory_get_peak_usage, any future call returns the same value as the first call, even when I know the peak usage has changed. This leads me to believe that after memory_get_peak_usage is called once, PHP caches the result. I would like to uncache it to perform the testing outlined above.
Can I call memory_get_peak_usage multiple times?
Are there alternatives for profiling the scenario I've described above? Some feature of Xdebug maybe?
Can I call memory_get_peak_usage multiple times?
Not sure on that one.
Are there alternatives for profiling the scenario I've described above? Some feature of Xdebug maybe?
Have a look at the XDebug profiler page. It's been a while since I have profiled an app, but when I did, I followed the write-up and it worked great.
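As a concrete starting point, here is what enabling the profiler looks like for Xdebug 2.x (the output directory is just an example):

xdebug.profiler_enable = 1
xdebug.profiler_output_dir = /tmp/xdebug

For a one-off CLI run you can pass the same settings with -d flags (php -d xdebug.profiler_enable=1 -d xdebug.profiler_output_dir=/tmp/xdebug script.php) and then open the resulting cachegrind.out.* file in a viewer such as KCachegrind or QCacheGrind.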
I have a PHP script that we run every few minutes through a cron entry, and every now and then (about once a week), instead of ending normally, it stays running, eating up 100% of a CPU core (I'm assuming it's looping infinitely).
Looking at the code and "thinking" about it, I can't find any reason for this to happen, but it does. So far, when I get 3 or more of those I kill them, and that solves the CPU issue, but I'd like to do something about this...
Is there any way to dump a process, or attach to it with a debugger so that I can know something, anything about what it's doing? (Just which PHP line it's on would be of huge help). I don't mind if the process dies when I dump, or anything.
This is a PHP script, running from the command line, on a CentOS 5.6 machine, and I'm a big noob when it comes to *nix, so if you can point me to some tutorial for dummies that'd be awesome.
Thank you!
Daniel
There's no way I'm aware of to attach a debugger to a PHP process that hasn't specifically been prepared with a PHP debugging extension (such as Xdebug). However, you may be able to make some guesses as to what's going on using the more general-purpose utility strace, which can deliver a trace of the system calls being run by a process. This will only tell you what system calls are being executed, but that may be enough (depending on the context) to figure out what's going on anyway.
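For example (a hedged sketch; 12345 stands in for whatever PID ps reports for the stuck script):

ps aux | grep '[p]hp'            # find the PID of the runaway php process
sudo strace -p 12345 -s 256      # attach to it; -s prints longer string arguments

Press Ctrl+C to detach; the traced process keeps running.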
I've written a daemon in PHP and want to make sure it doesn't leak memory, as it'll be running 24/7.
Even in its simplest form, memory_get_peak_usage for the daemon reports that the script consumes more memory with each cycle; memory_get_usage, on the other hand, does not grow.
The question is: should I worry? I've stripped the daemon down to the bare basics, but this is still happening. Any thoughts?
#!/usr/bin/php -q
<?php
require_once "System/Daemon.php";

System_Daemon::setOption("appName", "smsd");
System_Daemon::start();

while (!System_Daemon::isDying()) {
    System_Daemon::info("debug: memory_get_peak_usage: " . memory_get_peak_usage());
    System_Daemon::info("debug: memory_get_usage: " . memory_get_usage());
    System_Daemon::iterate(2);
}
FINAL NOTE + CONCLUSION: I ended up writing my own daemon wrapper instead of using PEAR's System_Daemon. Regardless of how I tweaked that library, I could not stop it from leaking memory. Hope this helps someone else.
FINAL NOTE + CONCLUSION 2: my script has been in production for over a week and still has not leaked a single byte of memory. So: writing a daemon in PHP actually seems to be OK, as long as you're very careful about its memory consumption.
I got the same problem. Maybe the best idea is to report a new bug at PEAR.
BTW, code like this doesn't show that memory leak:
#!/usr/bin/php -q
<?php
require_once "System/Daemon.php";

System_Daemon::setOption("appName", "smsd");
System_Daemon::start();

while (!System_Daemon::isDying()) {
    print("debug: memory_get_peak_usage: " . memory_get_peak_usage() . "\n");
    print("debug: memory_get_usage: " . memory_get_usage() . "\n\n");
    System_Daemon::iterate(2);
}
Looks like System_Daemon::info() is the problem.
It turns out file_get_contents was leaking memory. Whenever I disabled that one line, peak memory usage was stable. When I commented it back in, peak memory usage would increase by 32 bytes every iteration.
I replaced the file_get_contents call (used to retrieve the number inside the pid file in /var/run) with fread, and that solved the problem.
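For reference, a minimal sketch of that kind of swap (the pid-file path is illustrative; this is not the actual System_Daemon patch):

<?php
// Before: $pid = file_get_contents('/var/run/smsd.pid');  // grew peak usage each call
// After: read the handful of bytes a pid needs with fopen()/fread()/fclose().
$pidFile = '/var/run/smsd.pid';
$handle  = fopen($pidFile, 'r');
$pid     = trim(fread($handle, 64)); // a pid is only a few characters long
fclose($handle);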
This patch will be part of the next System_Daemon release.
Thanks to whoever (I can't find a matching nick) also reported this bug (#18036); otherwise I'd probably never have known.
Thanks again!
You can try using the new garbage collector in PHP 5.3 to prevent issues with circular references.
gc_enable()
gc_collect_cycles()
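A hedged sketch of wiring those into the daemon loop from the question (illustrative only; how much it helps depends on whether reference cycles are actually the cause):

<?php
require_once "System/Daemon.php";

System_Daemon::setOption("appName", "smsd");
System_Daemon::start();

gc_enable();   // on by default in PHP 5.3+, shown here for clarity
$i = 0;

while (!System_Daemon::isDying()) {
    // ... one unit of daemon work ...

    if (++$i % 100 === 0) {
        // gc_collect_cycles() returns how many reference cycles were freed
        System_Daemon::info("freed cycles: " . gc_collect_cycles());
    }
    System_Daemon::iterate(2);
}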
You should not use PHP to write a daemon. Why? Because PHP is not a language that is sufficiently mature to run for hours, days, weeks or months. PHP is written in C, and all of the magic that it provides has to be handled somewhere. Garbage collection, depending on your version, might or might not work, depending on what extensions you have compiled and used. Yes, if they ship with official releases, they should 'play nice', but do you check to see what release you are using? Are you sure all loaded extensions realize that they might run for more than 10 - 30 seconds? Given that typical execution times are too short to ever expose leaks, are you sure it even works?
I am quite close to going off on a 'don't use regex to parse HTML rant' regarding this, as I see the question creeping up more and more. Twice today that I'm aware of.
Would you use a crowbar as a toothpick? Neither Zend, nor Roadsend, nor PHC is sufficiently mature to handle running for any period of time that could be considered protracted, given the expected life of a PHP process when rendering a web page. Yes, even with the GC facilities provided by a C++-based PHP compiler, it is unwise to write a daemon in PHP.
I hate answers that say "you can't do that with that", but in this case it's true, at least for now.