Get full details of what is using memory in PHP - php

I have a PHP script that is failing with a fatal out-of-memory error. It is a script that processes all records in the DB - it works fine up to about 10k records and then hits the memory error.
However, I can't find out what is using up my application's memory.
I've checked the $GLOBALS array and that accounts for maybe 1 MB or so.
I've checked the call stack at various points and have seen nothing unexpected.
The base memory requirement for PHP plus all relevant class files, etc. is about 7MB.
My feeling is that there is probably somewhere in the code that is resulting in variable references persisting - either deliberately (e.g. via a static cache in some class or other) or by mistake (e.g. resource handles not being freed).
Obviously functions like memory_get_usage() can tell me how much memory is used at any given point in the script, and tracking this is a slow but effective way of debugging. However, is there any way of getting details about what is actually using that memory?
Happy to accept answers that use an external tool (e.g. Xdebug), provided they give useful output (i.e. output that identifies the class/variable names rather than using PHP's internal IDs). The output I expect would be something like what you get from var_dump/print_r.
[Note that this question is not about how to debug out-of-memory issues in general, but specifically about whether there is a way to expose the details of memory use when debugging.]

Paul Crovella provided a link, but couldn't be bothered to post it as an answer, so I'm doing it for him.
The php-memprof extension (https://github.com/arnaud-lb/php-memory-profiler) provides a number of tools that can be used to provide exactly the information that the question asks for.
From the readme file:
php-memprof profiles memory usage of PHP scripts, and especially can tell which function has allocated every single byte of memory currently allocated.
Memprof can be enabled during script execution by calling memprof_enable().
Then the memory usage can be dumped by calling one of the memprof_dump_ functions. Both tell which functions allocated all the currently allocated memory.
The original question asked for something like print_r(), and this is provided by the memprof_dump_array() function. However, there are a number of other ways of accessing the memory profile which may be more useful depending on what you are trying to achieve, including dumping the entire memory map in callgrind format, for offline analysis.
As it is a PHP extension, it will require access to php.ini in order to install it, so it may not be suitable for debugging issues on live sites (but nobody does that, right?).
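As a rough sketch of how that might look in a script: the two memprof function names below are taken from the readme excerpt above, while the function_exists() guards and the dummy workload are my own additions. Releases of the extension may differ in how profiling is switched on, so check the readme of the version you install.

```php
<?php
// Sketch only. memprof_enable() and memprof_dump_array() are the names used in
// the readme excerpt above; the function_exists() guards are an addition so the
// script still runs where the extension (or that function) is missing.

$profiling = false;
if (function_exists('memprof_enable')) {
    memprof_enable();          // start recording allocations from here on
    $profiling = true;
}

// Stand-in for the real record-processing loop (hypothetical workload).
$rows = [];
for ($i = 0; $i < 1000; $i++) {
    $rows[] = str_repeat('x', 100);
}

// Nested array of memory usage per function, in the spirit of print_r().
$profile = ($profiling && function_exists('memprof_dump_array'))
    ? memprof_dump_array()
    : null;

if ($profile !== null) {
    print_r($profile);
}
```

Without the extension the script simply prints nothing, which makes it safe to leave the guards in place while you move between machines.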

Related

Tips for following php calls in code base

I am working with ActionScript 3 and I often see server calls that link to PHP files.
var serverCall:ServerCall = new ServerCall("getDeviceFirmwareLog", getDeviceFirmwareLogResponse, getDeviceFirmwareLogResponse, false);
This line calls PHP functions that my IDE cannot search, so I usually grep for the string "getDeviceFirmwareLog". Sometimes that leads me to some PHP that makes other odd calls, which in turn somehow reach the embedded hardware we run. Often, though, grepping for that string returns no results at all, and I can't see how the pieces are connected.
I am much more used to regular code calls and includes that are easier to follow. I've asked some people at work but it seems to get glossed over and I don't want to ask the same question a third time until I've exhausted my other options. I am wondering if there are any general debugging / code following tips for this kind of a setup that could help me understand what is going on in my codebase.
Thanks in advance.
Without intimate knowledge of your environment, I'd say it appears ServerCall is a custom socket class that calls external functions, with n number of arguments.
getDeviceFirmwareLog would therefore be the function being called, and would be a native function to the API of the hardware (not PHP); this is why you wouldn't be able to find it with a grep search.
Consequently, unless it's rigged with event listeners, ServerCall would populate with the requested data asynchronously (which would likely still fire an event when the request completed).
As you're working with both Flash and PHP, it appears as though you might be testing this through a browser. If so, you could always try the native debugging tools in your browser (F12).
The PHP portion is harder because it's server-side scripting. However, take a look at the Eclipse plugin PDT, which offers debugging facilities for PHP code.

PHP Session-like storage global across all users

What is a good method to retain a small piece of data across multiple calls of a PHP script, in such a way that any call of the script can access and modify its value - no matter who calls the script?
I mean something similar to $_SESSION variable, except session is strictly per-client; one client can't access another client's session. This one would be the same, no matter who accesses it.
Losing the value or having it corrupted (e.g. through race conditions of two scripts launched at once) is not a big problem - if it reads correctly 90% of the time it's satisfactory. OTOH, while I know I could just use a simple file on disk, I'd still prefer a RAM-based solution, not just for speed, but this running from not very wear-proof flash, endless writes would be bad.
Take a look at shared memory functions. There are two libraries that can be used to access shared memory:
Semaphores
Shared Memory
For storing binary data or one huge string, the Shared Memory library is better, whereas the Semaphores library provides convenient functions to store multiple variables of different types (at the cost of some overhead, which can be quite significant, especially for a lot of small variables such as booleans).
If that is too complex, and/or you aren't worried about performance, you could just store the data in files (after all, PHP's internal session management uses files, too).
A good alternative to using a database would be memcache!
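For the shared memory route, something along these lines might do, assuming the System V shared memory functions (shm_attach() and friends) are compiled into your PHP. The helper name, the key 0x5eed and the slot number are arbitrary choices for illustration. There is no locking here, which matches the "correct 90% of the time" requirement; a semaphore would be needed for anything stricter.

```php
<?php
// Sketch: a counter shared by every request, kept in a System V shared memory
// segment. shared_counter_increment(), the key and the slot are made-up values.

function shared_counter_increment(int $key = 0x5eed): ?int
{
    if (!function_exists('shm_attach')) {
        return null;                       // sysvshm support not available
    }
    $shm = shm_attach($key, 4096, 0600);   // create or open a 4 KiB segment
    if ($shm === false) {
        return null;                       // segment could not be attached
    }

    $slot  = 1;                            // integer id of the stored variable
    $value = shm_has_var($shm, $slot) ? (int) shm_get_var($shm, $slot) : 0;
    $value++;

    shm_put_var($shm, $slot, $value);      // value now lives in RAM, not on disk
    shm_detach($shm);                      // detach; the segment itself stays

    return $value;
}

$first  = shared_counter_increment();
$second = shared_counter_increment();
```

The segment survives until it is removed or the machine reboots, so the value is visible to whichever request attaches next.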

Persistent Objects in Wordpress/PHP

I would like to create a set of persistent objects that load their state from the database and are then persisted in memory for Wordpress/PHP page loads to use as cached memory objects. I would imagine an interface for these objects to include:
initialise() - load state from database and perform any other initialisation functions needed prior to servicing requests
getter_foo() - a series of getter methods for PHP code to call for memory cached responses
getter_bar() - a series of getter methods for PHP code to call for memory cached responses
update() - called by time or event driven processes that ask the object to go back to the database and refresh its state
The two tricks I suspect are:
Have the main PHP process alloc and hold the memory reference for these objects so that they remain pinned to memory across web transactions/requests without needing to reinitialise each time against the database
Have a mechanism that allows the transactional processes to obtain a pointer to these objects.
Are there any examples of solutions that do this? I've been programming for years but am very new to both Wordpress and PHP so maybe this is quite straight forward. Not sure. In any event, I do recognise that technical solutions like redis and memcached might achieve similar goals but in a less elegant and non-contextual way. That said, if there's no easy way to do this I'm happy to use the 80/20 rule. :^)
It's not possible to store data in memory during one request and then read it back from memory during another request using nothing but plain PHP. Sure, the PHP process uses memory, but as soon as your request is finished that memory is released, which means that a second request cannot access it again.
What you are hinting at is called caching. Simply put, caching means that you save the output of an expensive transaction for later re-use, to save on the cost of that transaction. What you then use as a backend to store that output is up to you or what you have available. If you want to keep it in RAM, you would need something like Memcached. You could also store it in a regular file, but that is slower because the disk has to be accessed.
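To make the idea concrete, here is a minimal sketch of the file-backed variant. The cached() helper and its file naming are invented for this example and are not a WordPress or PHP API; swapping the file calls for a Memcached client would give the RAM-backed version.

```php
<?php
// Sketch of the caching idea: compute once, reuse on later requests.
// cached() and the file naming scheme are illustrative assumptions.

function cached(string $key, int $ttl, callable $compute)
{
    $file = sys_get_temp_dir() . '/cache_' . md5($key) . '.ser';

    // Serve the stored copy while it is younger than $ttl seconds.
    if (is_file($file) && filemtime($file) + $ttl > time()) {
        $data = @unserialize((string) file_get_contents($file));
        if ($data !== false) {
            return $data;   // cache hit (a stored literal false counts as a miss)
        }
    }

    // Cache miss: do the expensive work and store the result for next time.
    $value = $compute();
    file_put_contents($file, serialize($value), LOCK_EX);

    return $value;
}
```

A call such as cached('site_settings', 300, $loadFromDatabase) would then hit the database at most once every five minutes, with 'site_settings' and $loadFromDatabase standing in for whatever your objects actually load.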

Performance impact when require-ing a file many times vs reusing it

Think of PHP templating.
I was recently contemplating whether it makes sense to read a template file once, storing it in memory, and then parsing it (replace placeholders with values, e.g.) rather than require-ing that file as many times as you need it. A usage scenario would be a list with list items templated as separate files. The first thoughts I had were inclined towards the former solution, because I reckon replacing values would be an easier operation than requiring the file from the file system. Later, however, I realized that pretty much all hard disk drives (or other storage, for that matter) have their own caching, and requiring the same file over and over, will not result in it being re-read each time, but rather re-served from the cache.
Any thoughts are appreciated.
I assume by "disk cache" you're actually referring to the page cache? Wikipedia: Page Cache
If so I wouldn't really be inclined to trust something like this with the performance of my application. Don't forget the page cache only uses UNUSED memory and will happily spit it back out when needed.
I would be inclined to use something like APC as an object cache. This has the great side effect of not having to actually rewrite any of your code, as it's all done behind the scenes. Another possibility would be to just assign your template to a variable and constantly reuse that. Or, if you wanted to, you could even use Memcache, although that kind of thing is more useful for caching database results or large datasets.
Sorry for the slightly incoherent ramblings...
I was recently contemplating
That's quite wrong of you.
Groundless contemplation seldom does any good and will most likely get you into trouble.
Instead of contemplating, one has to do profiling.
Not to measure the effect of changes, as H Hatfeld said, but to determine whether any changes are needed at all. Most of the time it turns out that you were barking up the wrong tree.
Profiling is what makes you bark up the right one.
whether it makes sense to read a template file once, storing it in memory, and then parsing it
For high-load (or bloated) projects it does.
PHP already has such a feature, called a bytecode cache. There are plenty of them on the market; at our company we are using eAccelerator.
But most of the time the default parsing on every request is enough.
You are absolutely right about the filesystem cache and about parsing being blazingly fast, much faster than the usual application logic, which is what has to be optimized in the first place.
Every time you include a file, PHP has to parse it. This penalty can be offset using an opcode cache like APC. If your templates don't contain any PHP (which it sounds like they don't), I would recommend loading the template into memory once and then re-using it as needed.
Another thing to keep in mind when looking to optimize your code is to make sure you can measure the change. Use something like Xdebug to profile your code and measure what effect your changes are having.
Edit
Since the files do currently contain PHP, take a look at this question/answer. I would recommend putting a function in the file so that it only needs to be loaded once, but can be called multiple times with different parameters.
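A small sketch of the "load once, call many times" idea follows. The render_item() name and the {{placeholder}} syntax are assumptions made for this example; in real use the template string would come from a single file_get_contents() call, or the function itself would live in the file you require once.

```php
<?php
// Sketch: keep the template in memory and fill it repeatedly with strtr().
// render_item() and the {{name}} placeholder syntax are illustrative only.

function render_item(string $template, array $values): string
{
    $pairs = [];
    foreach ($values as $name => $value) {
        // Escape values so they are safe to drop into HTML.
        $pairs['{{' . $name . '}}'] = htmlspecialchars((string) $value, ENT_QUOTES);
    }
    return strtr($template, $pairs);   // replaces all placeholders in one pass
}

// Stand-in for a template read from disk once per request.
$template = '<li>{{title}} ({{price}})</li>';

$items = [
    ['title' => 'Tea', 'price' => 3],
    ['title' => 'Fish & chips', 'price' => 9],
];

$html = '';
foreach ($items as $row) {
    $html .= render_item($template, $row);   // no further file access here
}
```

Whether this actually beats repeated require calls behind an opcode cache is exactly the kind of thing to confirm with a profiler rather than assume.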

Have PHP dump heap on OutOfMemory exception

I am currently debugging a script that constantly runs into OutOfMemory exceptions. It is run as a cronjob and usually runs fine, but when the cronjob hasn't run for a while (for whatever reason) the script has to handle too many queued-up elements and runs into an OutOfMemory exception.
From examining the code I was not able to spot the problem. I believe one of the iterative function calls might leak memory, but I am not sure which one and where.
Is there an option to get PHP to dump the heap, when an OutOfMemory exception occurs? I might be able to spot the problem from there (most likely).
While I was not able to find a "dump heap on Exception" option, I did find get_defined_vars(), which is basically a heap dump if called from the global scope. Using this I was able to see that there were hundreds (actually thousands) of still-referenced database rows hanging around in memory. This was due to a MySQL result resource that was never freed, somewhere in the infamous function that caused the leak. I found it and fixed it. It runs well now.
Well, the easiest approach would be to put a try-catch block around the part of your script where the error might occur and dump the stack in the catch part. The problem might be that the machine won't be able to react because the memory is full and the script terminates. I do not know whether it helps to discard some variables to free up enough memory to output some data.
EDIT: For this purpose use the PHP function debug_backtrace(). This will give you a stack trace, so finding the error is much more likely, provided the machine is still up.
Just don't load all objects into memory together; read them as you process them?
I've had lots of problems with SimpleXML and memory leaks. They are a pain to track down... it took me days to figure out that SimpleXML was causing them and then to fix them.
As far as I know, you can programmatically set a handler for OOM :)
Also, PHP's functions for displaying memory info fail to detect such memory leaks; I had scripts eating up ~1 GB of RAM while PHP's functions reported only 100 MB used :)
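One way such an OOM handler might be set up is sketched below. Hitting the memory limit raises a fatal error rather than something a try-catch block can intercept, so the sketch uses a shutdown function together with error_get_last(). The is_memory_exhausted() helper and the 64 KiB reserve are assumptions for illustration.

```php
<?php
// Sketch: react to "Allowed memory size ... exhausted" at shutdown.
// is_memory_exhausted() is a hypothetical helper name.

function is_memory_exhausted(?array $error): bool
{
    return $error !== null
        && $error['type'] === E_ERROR
        && strpos($error['message'], 'Allowed memory size') === 0;
}

// Reserve a little memory up front so the handler has room to run.
$reserve = str_repeat('x', 64 * 1024);

register_shutdown_function(function () use (&$reserve) {
    $reserve = null;                 // release the reserve before doing any work
    $error = error_get_last();       // last error, or null on a clean exit
    if (is_memory_exhausted($error)) {
        error_log(sprintf(
            'Out of memory in %s:%d (peak %d bytes)',
            $error['file'],
            $error['line'],
            memory_get_peak_usage(true)
        ));
    }
});
```

This only tells you where the limit was hit, not what was holding the memory, so it complements rather than replaces a dump of the defined variables.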
This is as good of a 'heap dump' as I'm able to quickly write in PHP. I take the defined variables and functions, then sort by their serialized length. Serialized length isn't a 100% reliable method for getting a variable's size, but it's pretty good, and generally useful for determining which objects are your memory hogs:
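// Map each defined variable (and the lists of function names) to its serialized length.
$memmap = array_map(function ($var) { return strlen(serialize($var)); },
    array_merge(get_defined_functions(), get_defined_vars()));
arsort($memmap);   // largest first, keys preserved
var_dump($memmap);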
You may want to tweak the callback function a bit if you'd like your results to be more verbose, or to recurse through the defined variables.
I've never seen PHP provide a native facility for this but a few other things might exist:
Try: https://github.com/mcfunley/php-heap/blob/master/php-heap.py
It could also be possible to write an extension to achieve the same.
