PHP Out of Memory Exception

I have a PHP program that runs forever (not a web page; a socket server). After processing over 1,000 requests, the program eventually crashes due to an out-of-memory error.
Here is a link to my project.
Here is a link to my program.
I am not sure why this happens. I have tried using the garbage-collection functions in the function that processes requests (onMessage), but it makes no difference. Any suggestions would be appreciated.

By investing a huge amount of effort you may be able to mitigate this for a while, but in the end you will have trouble running a non-terminating PHP application.
Check out "PHP is meant to die". This article discusses PHP's memory handling (among other things) and specifically focuses on why all long-running PHP processes eventually fail. Some excerpts:
There’s several issues that just make PHP the wrong tool for this. Remember, PHP will die, no matter how hard you try. First and foremost, there’s the issue of memory leaks. PHP never cared to free memory once it’s not used anymore, because everything will be freed at the end — by dying. In a continually-running process, that will slowly keep increasing the allocated memory (which is, in fact, wasted memory), until reaching PHP’s memory_limit value and killing your process without a warning. You did nothing wrong, except expecting the process to live forever. Under load, replace the “slowly” part for "pretty quickly".
There’s been improvements in the “don’t waste memory” front. Sadly, they’re not enough. As things get complex or the load increases, it’ll crash.
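If you do want to keep a long-running socket server in PHP, the usual mitigation is to embrace the death: let each worker process exit deliberately and have a supervisor (supervisord, systemd, or similar) restart it. A minimal sketch, where acceptRequest() and handleRequest() are hypothetical placeholders for your server's own loop:

<?php
// Recycle the process before memory_limit is ever reached;
// the supervisor outside PHP is responsible for respawning it.
$maxRequests   = 1000;                  // recycle after N requests
$memoryCeiling = 100 * 1024 * 1024;     // ...or once usage passes ~100 MB
for ($served = 0; $served < $maxRequests; $served++) {
    handleRequest(acceptRequest());     // hypothetical request handling
    gc_collect_cycles();                // reclaim circular references (PHP 5.3+)
    if (memory_get_usage(true) > $memoryCeiling) {
        break;                          // bail out before the OOM kill
    }
}
exit(0);                                // clean exit; supervisor respawns us
?>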


How Apache runs PHP for pages

Imagine there is a PHP page on http://server/page.php.
Clients send 100 requests from their browsers to the server for that page simultaneously.
Does the server run 100 separate processes of php.exe simultaneously?
Does it re-interpret the page.php 100 times?
The answer is highly variable, according to server config.
Let's answer question 1 first:
Does the server run 100 separate processes of php.exe simultaneously?
This depends on the way PHP is installed. If PHP is being run via CGI, then the answer is "Yes, each request calls a separate instance of PHP". If it's being run via an Apache module, then the answer is "No, each request starts a new PHP thread within the Apache executable".
Similar variations will exist for other web servers. Please note that for a Unix/Linux based operating system, running separate copies of the executable for each request is not necessarily a bad thing for performance; the core of the OS is designed such that in many cases, tasks are better done by a number of separate executables rather than one monolithic one.
However, no matter what you do about it, having large numbers of simultaneous requests will drain your server resources and lead to timeouts and errors for your users. This is why it is important for your PHP programs to finish running as quickly as possible. Do not write PHP programs for web consumption that are slow to run; if you're likely to have a lot of traffic, you need to test for performance as much as you do for functionality. Having your programs exit quickly will dramatically reduce the likelihood of having a significant number of simultaneous requests, which will obviously have a big impact on your site's performance.
Now your second question:
Does it re-interpret the page.php 100 times?
For a standard PHP installation, the answer here is "Yes it does, and yes it does have a performance impact."
However, PHP provides several caching solutions designed specifically to mitigate this. The main options are APC and the Zend Cache, either of which can be installed as a standard module. Using these modules means PHP caches the interpreted code, so subsequent calls run much faster.
Zend's opcode cache (Zend OPcache) will be included as part of the standard PHP installation as of the forthcoming PHP 5.5 release.
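Where you control the server, enabling APC is typically just a couple of php.ini lines; a sketch with illustrative values (not tuned recommendations):

; php.ini -- enable the APC opcode cache
extension=apc.so        ; php_apc.dll on Windows
apc.enabled=1
apc.shm_size=64M        ; shared memory reserved for cached opcodes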
Apache 2 has multiple modes of operation (MPMs).
In "prefork" mode (the most commonly used), Apache creates a process for every request, and each process runs its own PHP instance. The config file assigns a maximum number of simultaneous processes (MaxClients in httpd.conf); Apache will only create up to MaxClients processes, which prevents memory exhaustion. Further requests are queued, waiting for a previous request to complete. A typical prefork section looks like the sketch below.
If you do not install an opcode cache extension such as APC, XCache, or eAccelerator, PHP will re-interpret page.php 100 times.
It depends.
There are different ways of setting things up, and things can get quite complex.
The short answer is 'more or less'. A number of Apache processes will be spawned, and the PHP code will be parsed and run.
If you want to avoid the parsing overhead, use an opcode cache. APC (Alternative PHP Cache) is a very popular one. It has a number of neat features which are worth digging into, but even with no configuration beyond installing it, it will ensure that each PHP page is only parsed into opcodes once.
To change how many Apache processes are spawned, you'll most likely be using the prefork MPM. This lets you decide how you want Apache to deal with multiple users.
As general advice, in my experience (small sites, not a huge amount of traffic), installing APC is worth doing; for everything else the defaults are not too bad.
There are a number of answers to this. In general, Apache will create a process for each incoming request, so it is possible that 100 processes are created. However, a process takes time to create, so by the time one process has finished and died, another of those 100 connections may arrive a fraction of a second later (100 connections at exactly the same instant is very rare indeed, unless you're Google).
However, let us imagine that 100 processes really do need to be held in memory simultaneously, but that there is only room for 50 in available server RAM. In that case, 50 connections will be served, and 50 will have to wait for processes to die and be re-spawned. Thus, a random half of those requests will be delayed, though if a spawn-run-die sequence only takes a fraction of a second, they won't have to wait very long. This is why, when improving server capacity, reducing your page load time is as important as adding more RAM: the sooner a process finishes, the sooner a new one can take its place.
One way, incidentally, to reduce load time is to spawn a number of PHP processes and hold them in memory. This is the basis of FastCGI (or fcgid, which is compatible). Rather than creating and killing a process for every request, a process is spawned in memory immediately and is re-used for several requests. For PHP, these are usually configured to die after a certain number of page requests (e.g. 1000), as historically PHP has had quite a lot of memory leaks (the more a process is reused, the worse the leaks get).
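For example, with classic FastCGI the recycle count is usually set through an environment variable in the wrapper script, and php-fpm has an equivalent pool setting (the values are illustrative):

# FastCGI wrapper script: recycle each PHP process after 1000 requests
PHP_FCGI_MAX_REQUESTS=1000
export PHP_FCGI_MAX_REQUESTS

; php-fpm pool configuration equivalent
pm.max_requests = 1000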
You ask if a page is re-interpreted for every request. Normally yes, but if you also run a PHP accelerator, then no: the byte-code that PHP compiles to is cached and reused. Thus, mixing the FastCGI approach with an accelerator can make for a very speedy server indeed. Standard PHP does not come with an accelerator, but Zend OPcache is scheduled for inclusion in the PHP core.

Clearing memory_get_peak_usage Cache

I'm trying to track down a memory leak in a PHP Program (Magento, if it matters). The basic problem seems to be a leak in some object/class that's growing over time. That is, the more information that gets logged to the database, the more memory certain application processes end up using. Magento's a highly abstract system, so it's not always clear what code is being run that's consuming so much memory. That's what I'm trying to track down.
I've been using memory_get_peak_usage at the end of the program's bootstrap file to benchmark performance, and I've seen steady growth from 250 MB of peak use to 310 MB in about a week. I would like to use memory_get_peak_usage intermittently throughout the execution cycle to ask:
What was the peak usage prior to this call? [later in the cycle] What was the peak usage prior to this new call?
The problem I'm running into is, once I call memory_get_peak_usage once, any future call returns the same value as the first call, even when I know the peak usage has changed. This leads me to believe that after memory_get_peak_usage is called once, PHP caches the result. I would like to uncache it to perform the testing outlined above.
Can I call memory_get_peak_usage multiple times?
Are there alternatives for profiling the scenario I've described above? Some feature of Xdebug, maybe?
Can I call memory_get_peak_usage multiple times?
Not sure on that one.
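Note, though, that memory_get_peak_usage reports the high-water mark since the script started, so repeated calls return the same value until a new peak is reached; there is no cache to clear. To measure per-phase growth you can take deltas of memory_get_usage instead. A minimal sketch, where the phase functions are hypothetical placeholders:

<?php
$last = memory_get_usage(true);
runBootstrap();                                  // hypothetical phase
$now = memory_get_usage(true);
printf("bootstrap grew memory by %d bytes\n", $now - $last);
$last = $now;
processOrders();                                 // hypothetical phase
printf("order processing grew memory by %d bytes\n", memory_get_usage(true) - $last);
?>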
Are there alternatives for profiling the scenario I've described above? Some feature of Xdebug, maybe?
Have a look at the Xdebug profiling page. It's been a while since I profiled an app, but when I did, I followed the write-up and it worked great.

Memory leakage in php unrelated to GC?

I have a PHP script which takes an image, processes it, and then writes the new image to file. I'm using Imagick/ImageMagick with PHP 5.3.8 under FastCGI. After reading around, I thought the garbage-collection functions might help, but they haven't stopped PHP's memory usage in top from growing into triple digits. I used to run this script via cron.
<?php
var_dump(gc_enabled()); // true
var_dump(gc_collect_cycles()); // number comes out to 0
?>
Not sure what to do. So far the only thing that keeps PHP in check is doing a 'service php-fpm reload' every hour or so. Would using Imagick as a shared extension instead of a statically compiled one help? Any suggestions or insight are greatly appreciated.
Two options:
Farm out the work through gearman or the like to a script that will die completely. Generally I'll run my workers through a certain number of jobs, then have them die; they'll be restarted by supervisor in my setup, so it's not a problem. The death after N requests just avoids memory issues (see the sketch after these options).
As of 5.4 this might help: http://ca3.php.net/manual/en/function.apache-child-terminate.php
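A sketch of the first option using the PECL gearman extension; doResize() is a hypothetical stand-in for the real job handler:

<?php
// Worker that exits after a fixed job quota; supervisor respawns it.
$worker = new GearmanWorker();
$worker->addServer('127.0.0.1', 4730);        // default gearmand port
$jobsDone = 0;
$worker->addFunction('resize', function ($job) use (&$jobsDone) {
    $jobsDone++;
    return doResize($job->workload());        // hypothetical image handler
});
while ($worker->work() && $jobsDone < 500) {
    // keep serving until the quota is reached
}
exit(0);                                      // clean death avoids leak build-up
?>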
A note about built-in vs. external libraries: I haven't played with this aspect of ImageMagick, but I saw it with GD. You get a much lower memory value from the PHP functions when you're using the external library, but the actual memory usage is nearly equal.
A good start to check for memory leaks is valgrind.
If PHP has lots of available memory to use then it doesn't bother to wipe the memory since it doesn't think it needs to. As it uses more, or if other applications start to use more memory, then it will clear the memory of what it can.
You can force the memory for a variable to be cleared by setting it to NULL, but unset() is recommended; you shouldn't normally need to force PHP to use less memory, as it will clean up by itself.
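A tiny sketch of that pattern; loadHugeDataset() and consume() are hypothetical placeholders:

<?php
$rows = loadHugeDataset();    // hypothetical: returns a large array
consume($rows);               // hypothetical: last use of the data
unset($rows);                 // the memory becomes reclaimable here
gc_collect_cycles();          // optionally force cycle collection (PHP 5.3+)
?>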
But otherwise, a snippet of your code is required to answer your question.

From PHP workers to Python threads

Right now I'm running 50 individual PHP workers (processes, in CLI mode) per machine that are waiting to receive their workload (job). For example, the job of resizing an image: in the workload they receive the image (binary data) and the desired size. The worker does its work and returns the resized image. Then it waits for more jobs (it loops in a smart way). I'm presuming that I have the same executable, libraries, and classes loaded and instantiated 50 times. Am I correct? Because this does not sound very effective.
What I'd like to have now is one process that handles all this work and being able to use all available CPU cores while having everything loaded only once (to be more efficient). I presume a new thread would be started for each job and after it finishes, the thread would stop. More jobs would be accepted if there are less than 50 threads doing the work. If all 50 threads are busy, no additional jobs are accepted.
I am using a lot of libraries (for Memcached, Redis, MogileFS, ...) to have access to all the various components that the system uses and Python is pretty much the only language apart from PHP that has support for all of them.
Can Python do what I want, and will it be faster and more efficient than the current PHP solution?
Most probably, yes. But don't assume you have to do multithreading. Have a look at the multiprocessing module. It already includes a Pool implementation, which is what you could use. And it basically sidesteps the GIL problem (with threads, only one piece of standard Python code can run at any time; that's a very simplified explanation).
It will still fork a process per job, but in a different way than starting it all over again. All the initialisation done and libraries loaded before entering the worker process are inherited in a copy-on-write way. You won't do more initialisation than necessary, and you won't waste memory on the same library/class as long as you don't actually modify it from its pre-pool state.
So yes - looking only at this part, python will be wasting less resources and will use a "nicer" worker-pool model. Whether it will really be faster / less CPU-abusing, is hard to tell without testing, or at least looking at the code. Try it yourself.
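A minimal sketch of that Pool approach; resize_image and the job payloads are hypothetical stand-ins for the real workload:

from multiprocessing import Pool

def resize_image(job):
    data, size = job            # binary image data and the desired size
    # ... real resizing work would happen here ...
    return len(data), size      # placeholder result

if __name__ == '__main__':
    jobs = [(b'fake image bytes', (800, 600))] * 20
    # maxtasksperchild mirrors the "die after N jobs" pattern from PHP land
    pool = Pool(processes=4, maxtasksperchild=50)
    for result in pool.map(resize_image, jobs):
        print(result)
    pool.close()
    pool.join()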
Added: If you're worried about memory usage, python may also help you a bit, since it has a "proper" garbage collector, while in php GC is a not a priority and not that good (and for a good reason too).
Linux has shared libraries, so those 50 php processes use mostly the same libraries.
You don't sound like you even have a problem at all.
"this does not sound very effective." is not a problem description, if anything those words are a problem on their own. Writing code needs a real reason, else you're just wasting time and/or money.
Python is a fine language and won't perform worse than PHP. Python's multiprocessing module will probably help a lot too. But there isn't much to gain if the PHP implementation is not completely insane. So why even bother spending time on it when everything works? That is usually the goal, not a reason to rewrite ...
If you are on a sane operating system then shared libraries should only be loaded once and shared among all processes using them. Memory for data structures and connection handles will obviously be duplicated, but the overhead of stopping and starting the systems may be greater than keeping things up while idle. If you are using something like gearman it might make sense to let several workers stay up even if idle and then have a persistent monitoring process that will start new workers if all the current workers are busy up until a threshold such as the number of available CPUs. That process could then kill workers in a LIFO manner after they have been idle for some period of time.

Best practices to stop memory leaks and improve performance

To put it simply, I am a fairly new PHP coder and I was wondering if anyone could guide me towards the best ways to improve performance in code, as well as stopping those pesky memory leaks. My host is one of those that doesn't have APC or the like installed, so it would all have to be hand-coded -_-
I don't think ordinary memory leaks (like forgetting to dispose of objects or strings) are common in PHP, but resource leaks in general are. I've had issues with:
database connections -- you should really call pg_close/mysql_close/etc. when you're done with the connection. Though I think PHP's connection pooling mitigates this (but can have problems of its own).
Images -- if you use the gd2 extension to open or create images, you need to call imagedestroy() on them, because otherwise they'll occupy memory forever. And images tend to be big in terms of data size (see the sketch below).
Note that if your scripts run as pure CGI (no HTTP server modules), the resources will effectively be cleaned up when the script exits. However, there may still be memory issues during the script's runtime, especially with images, where it's not uncommon to perform many manipulations in a single script execution.
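A short sketch of that pattern with GD; the file names are hypothetical:

<?php
$src   = imagecreatefromjpeg('input.jpg');   // load the source image
$thumb = imagecreatetruecolor(200, 150);     // destination canvas
imagecopyresampled($thumb, $src, 0, 0, 0, 0,
                   200, 150, imagesx($src), imagesy($src));
imagejpeg($thumb, 'thumb.jpg');
imagedestroy($src);                          // free each image resource
imagedestroy($thumb);                        // as soon as it is no longer needed
?>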
In general, PHP scripts can't leak memory in the C sense: the PHP runtime manages all memory for its scripts. The script itself may hold on to memory, but this is reclaimed when the PHP process ends. Since PHP is mainly used for processing HTTP requests, and these generally run for a very short time, leaking a bit of memory along the way is a non-issue. So memory leaks should only really concern you if you use PHP for non-HTTP tasks. Performance should be a bigger concern for you than memory usage. Use a tool such as Xdebug to profile your code.
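For instance, the Xdebug 2 profiler can be switched on from php.ini; a sketch (the output directory is an arbitrary choice):

; php.ini -- enable the Xdebug profiler
zend_extension=xdebug.so
xdebug.profiler_enable=1
xdebug.profiler_output_dir=/tmp/xdebug
; the resulting cachegrind.out.* files open in KCachegrind/QCacheGrind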
