I came across PHP's memory_get_usage() and memory_get_peak_usage().
The problem is that I found that these two functions do not provide the real memory used by the current script.
My test script is:
<?php
echo memory_get_usage();
echo '<br />';
$a = str_repeat('hello', 100000);
echo '<br />';
echo memory_get_usage();
echo '<br />';
echo memory_get_peak_usage();
?>
Which returns:
355120
5355216
5356008
What do you understand from this?
The first value is taken before str_repeat() is executed, so I would expect it to be 0.
The second is taken after the call, and it's OK for it to be greater than 0, but not that big.
The third is the "peak" value, and it's slightly greater than the second, as I think it should be the highest value reached at any instant during processing.
So do you think that the real value of the current script's memory consumption should be like this:
memory_usage = the second memory usage - the first memory usage
peak_memory_usage = the third (peak_usage) - the first memory usage
which gives:
1) 5355216 - 355120 = 5000096 bytes
2) 5356008 - 355120 = 5000888 bytes
If this is how it works, I assume that the first 355120 bytes are memory allocated by the whole system, i.e. used by Apache and other modules, since the first value never changes when you increase or decrease the number of repeats in str_repeat(). Only the two values after the call increase or decrease, but they never get smaller than the first value.
According to the php manual, memory_get_usage returns the amount of memory allocated to php, not necessarily the amount being used.
Ok, your first assertion that the first memory_get_usage() should be 0 is wrong. According to PHP's documentation:
Returns the amount of memory, in bytes, that's currently being allocated to your PHP script.
Your script is running, therefore it must have some memory allocated to it. The first call informs you of how much that is.
Your second assertion that str_repeat() should not use that much memory is not looking at the whole picture.
You have the string "hello" (which uses 5 bytes) repeated 100,000 times, for a total of 500,000 bytes...minimum. The question is, how did PHP perform this action? Did they use code such as this? (pseudocode):
s = ""
for(i=0; i<100000; i++)
s += "hello"
This code would require that you reallocate a new string for each iteration of the for loop. Now I can't pretend to say that I know how PHP implements str_repeat(), but you have to be extremely careful with how you use memory to keep memory usage down. From the appearance of things, they did not manage memory in that function as well as they could have.
Third, the difference between the peak memory usage and current memory usage likely comes from the stack that was necessary to make the function call to str_repeat(), as well as any local variables necessary within that function. The memory was probably reclaimed when the function returned.
Finally, Apache runs in a different process and we are dealing with virtual memory. Nothing that Apache does will affect the result of memory_get_usage() as processes do not "share" virtual memory.
In my case (PHP 5.3.3 on Mac OS X 10.5) your script prints:
323964
824176
824980
Now, the difference between the second measurement and the first gives 500212, which is very close to the length of "hello" (5) times 100,000. So I would say no surprises here. The peak is a bit greater because of some temporary allocations when evaluating these statements.
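The subtraction described here can be wrapped in a small helper. This is only a sketch, assuming PHP 7 or later; measureStrRepeat() is a made-up name, and the overhead on top of the raw string length varies by PHP version and platform.

```php
<?php
// Hypothetical helper: reports how much memory_get_usage() grows
// while a string built by str_repeat() is held in a variable.
function measureStrRepeat(string $chunk, int $times): array
{
    $before = memory_get_usage();
    $s = str_repeat($chunk, $times);
    $after = memory_get_usage();

    return [
        'string_length' => strlen($s),
        'delta'         => $after - $before,
        'peak'          => memory_get_peak_usage(),
    ];
}

$r = measureStrRepeat('hello', 100000);
echo "length: {$r['string_length']}\n";
echo "delta:  {$r['delta']}\n";
echo "peak:   {$r['peak']}\n";
```

The delta should come out a little above the string length, and the peak is a script-wide figure, so it also includes the baseline that was already allocated before the call.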
(Your other questions are answered already)
Here is a small test script:
$strings = array('<big string here (2 Mb)');
$arr = array();
//--> memory usage here is 17.1Mb (checked by pmap)
echo memory_get_usage();//0.5Mb
//(i know, that other 16.6Mb of memory used by process are php libraries)
for ($i = 0; $i < 20; ++$i)
{
    $strings_local = array_merge($strings, array($i));
    $arr[$i] = $strings_local;
    unset($strings_local);
}
//--> memory usage here is 20.3Mb (checked by pmap)
echo memory_get_usage();//3.7Mb
//so, here its all ok, 17.1+3.2 = 20.3Mb
for ($i = 0; $i < 20; ++$i)
{
    unset($arr[$i]);
}
//--> memory usage here is 20.3Mb (checked by pmap)
//BUT?? i UNSET this variables...
echo memory_get_usage();//0.5Mb
So it seems that PHP does not free memory, even if you unset() your variable. How can I free memory after unset()?
PHP has a garbage collector that takes care of memory management for you, and this affects the memory usage of the process in several ways.
First, when you inspect the memory usage of a process from outside the process, memory that PHP considers freed may not have been released back to the OS. This is an optimization: it reduces the overhead of continuous frees and allocations, which happen more easily in GC’d languages because the allocation procedure is not visible to the actual program.
For that reason, even if you call gc_collect_cycles() by hand, the memory may not be returned to the OS at all but rather reused for future allocations. This makes PHP report a smaller memory usage than the process really uses, because of some early, large reservation that never gets handed back to the OS.
Second, due to the nature of garbage collection, memory may not be freed immediately after the program stops using it. Calling gc_collect_cycles() forces that memory to be freed immediately, but this should normally be unnecessary, and it does not help if your script has a logical memory leak (or if something in PHP itself leaks).
To find out what is going on, line-by-line inspection (for example with Xdebug’s function trace) gives you better insight into how PHP (or rather, your program) sees the memory usage.
Combining that with line-by-line inspection from outside the process (for example your pmap commands) tells you whether PHP actually frees any memory at any point after reserving it.
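To see the gap between what PHP considers in use and what it still holds from the OS, you can print both figures around an unset(). A minimal sketch, assuming PHP 7 or later; report() is a made-up helper name, and whether the "real" figure drops after unset() depends on your system and PHP version.

```php
<?php
// Hypothetical helper: prints internal usage (memory in use by the script)
// next to real usage (memory the allocator has obtained from the OS).
function report(string $label): int
{
    $internal = memory_get_usage(false);
    $real     = memory_get_usage(true);
    echo "$label: internal=$internal real=$real\n";
    return $internal;
}

$start = report('start');

// Build the data at run time so it really lives on the heap.
$arr = [];
for ($i = 0; $i < 20; ++$i) {
    $arr[$i] = str_repeat('x', 100000) . $i;
}
$held = report('after fill');

unset($arr);
gc_collect_cycles(); // only matters for cyclic references; harmless here
$freed = report('after unset');
```

Comparing these lines with pmap output taken at the same points shows whether a drop in the internal figure is matched by a drop in the process size.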
I am trying to create a 2D array in PHP with a size of 2000x2000 (4 million entries). It seems that I run out of memory here, but the manner in which the error is appearing is confusing me.
When I define the array and fill it initially using the array_fill command, and initialize each position in the array (matrix) with 0 there is no problem.
However if I try iterating over the array and fill each position with 0, it runs out of memory.
I would assume that once I run array_fill it allocates the memory at that point, and it should not run out of memory in the loop.
Of course, this is just a simplified version of the code. In my actual application I will be using the X & Y coordinates to lookup value from another table, process it, and then store it in my matrix. These will be floating point values.
Can somebody help shed some light on this, please? Is there some other way I should be doing this?
Thank you!
<?php
// Set error reporting.
error_reporting(E_ALL);
ini_set('display_errors', TRUE);
ini_set('display_startup_errors', TRUE);
// Define Matrix dimensions.
define("MATRIX_WIDTH", 2000+1);
define("MATRIX_HEIGHT", 2000+1);
// Setup array for matrix and initialize it.
$matrix = array_fill(0,MATRIX_HEIGHT,array_fill(0,MATRIX_WIDTH,0));
// Populate each matrix point with calculated value.
for ($y_cood = 0; $y_cood < MATRIX_HEIGHT; $y_cood++) {
    // Debugging statement to see where the script stops running.
    if (($y_cood % 100) == 0) { print("Y=$y_cood<br>"); flush(); }
    for ($x_cood = 0; $x_cood < MATRIX_WIDTH; $x_cood++) {
        $fill_value = 0;
        $matrix[$y_cood][$x_cood] = $fill_value;
    }
}
// count($matrix) is the number of rows (height); count($matrix[0]) is the row length (width).
print("Matrix height: ".count($matrix)."<br>");
print("Matrix width: ".count($matrix[0])."<br>");
?>
I would assume that once I run array_fill it allocates the memory at that point, and it should not run out of memory in the loop.
Yes... and no. Allocating memory and executing the program code are (usually) two different things.
The memory allocated to a program/process is usually divided in two: heap and stack. When you "allocate memory" (in the sense you used in your question), this occurs in the heap. When you execute program code, the stack is also used. The two are not completely separate, since you may push references (pointers to the heap) onto the stack and pop them off again.
The thing is that the heap and the stack share part of the memory (allocated to that process), and usually one grows from higher addresses toward lower ones while the other grows from lower addresses toward higher ones, so there is a "floating" border between them. As soon as the two meet at that border, you are "out of memory".
So, in your case, when you create and fill your array (matrix) you use memory for 2001 x 2001 integers. If an integer requires 32 bits (4 bytes), that is 2001 x 2001 x 4 bytes = 4004001 x 4 bytes = 16016004 bytes, or about 16 MB. Treat this as a lower bound: PHP stores every array element together with type and hash-table bookkeeping, so the real cost per element is considerably higher than 4 bytes.
When executing the code, the stack's being filled with the (local) variables - loop condition variable, loop counter and all the other variables.
You should also not forget that the PHP (library) code should also be loaded in the memory, so depending on the value you have set as memory_limit in your configuration, you may quickly run out of memory.
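Another factor worth checking, which is not mentioned above: PHP arrays are copy-on-write, so array_fill() with an array as the fill value can store the same inner array for every row, and a row is only duplicated when the loop first writes to it. The sketch below uses a smaller 200 x 200 matrix so that it runs quickly; the variable names are mine and the exact numbers depend on the PHP version.

```php
<?php
// Smaller matrix than in the question, so the script runs anywhere.
$size = 200;

$before = memory_get_usage();
$matrix = array_fill(0, $size, array_fill(0, $size, 0));
$afterFill = memory_get_usage();

// Writing to a row forces PHP to give that row its own copy.
for ($y = 0; $y < $size; $y++) {
    for ($x = 0; $x < $size; $x++) {
        $matrix[$y][$x] = 0;
    }
}
$afterLoop = memory_get_usage();

echo 'array_fill cost: ' . ($afterFill - $before) . " bytes\n";
echo 'loop cost:       ' . ($afterLoop - $afterFill) . " bytes\n";
```

If the loop cost is much larger than the array_fill cost, that would explain why the script only runs out of memory inside the loop.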
Suppose a multidimensional associative array, when printed as text with print_r(), produces a 470 KiB file. Is it reasonable to assume that the variable in question takes up half a MiB of server memory per instance if it is different for each user? And therefore, if 1000 users hit the server at the same time, will almost half a GiB of memory be consumed?
Thanks.
There is an excellent article on this topic at IBM:
http://www.ibm.com/developerworks/opensource/library/os-php-v521/
UPDATE
The original page was taken down; for now the JP version is still available at https://www.ibm.com/developerworks/jp/opensource/library/os-php-v521/
The basic takeaway from it is that you can use memory_get_usage() to check how much memory your script currently occupies:
// This is only an example, the numbers below will differ depending on your system
echo memory_get_usage() . "\n"; // 36640
$a = str_repeat("Hello", 4242);
echo memory_get_usage() . "\n"; // 57960
unset($a);
echo memory_get_usage() . "\n"; // 36744
Also, you can check the peak memory usage of your script with memory_get_peak_usage().
As an answer to your questions: print_r() is a representation of data which is bloated with text and formatting. The occupied memory itself will be less than the number of characters of print_r(). How much depends on the data. You should check it like in the example above.
Whatever result you get, it will be for each user executing the script, so yes - if 1000 users are requesting it at the same time, you will need that memory.
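To compare the two figures for your own data, you can measure the memory delta and the print_r() length side by side. A rough sketch, assuming PHP 7 or later; buildSample() and its contents are invented stand-ins for your real array, so the ratio you get will differ.

```php
<?php
// Build a nested array at run time (hypothetical sample data) and return it.
function buildSample(int $rows): array
{
    $data = [];
    for ($i = 0; $i < $rows; $i++) {
        $data["user_$i"] = ['id' => $i, 'name' => "name_$i", 'tags' => [$i, $i + 1]];
    }
    return $data;
}

$before = memory_get_usage();
$sample = buildSample(1000);
$inMemory = memory_get_usage() - $before;

// Length of the textual dump, for comparison with the in-memory size.
$printed = strlen(print_r($sample, true));

echo "memory_get_usage() delta: $inMemory bytes\n";
echo "print_r() length:         $printed bytes\n";
```

Multiplying the measured delta by the expected number of concurrent requests gives a first estimate of the total, on top of the per-process baseline.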
I'm trying to track the memory usage of a script that processes URLs. The basic idea is to check that there's a reasonable buffer before adding another URL to a cURL multi handler. I'm using a 'rolling cURL' concept that processes a URL's data while the multi handler is running. This means I can keep N connections active by adding a new URL from a pool each time an existing URL is processed and removed.
I've used memory_get_usage() with some positive results. Adding the real_usage flag helped (not really clear on the difference between 'system' memory and 'emalloc' memory, but system shows larger numbers). memory_get_usage() does ramp up as URLs are added then down as the URL set is depleted. However, I just exceeded the 32M limit with my last memory check being ~18M.
I poll the memory usage each time cURL multi signals a request has returned. Since multiple requests may return at the same time, there's a chance a bunch of URLs returned data at the same time and actually jumped the memory usage that 14M. However, if memory_get_usage() is accurate, I guess that's what's happening.
[Update: I should have run more tests before asking, I guess. I increased PHP's memory limit (but left the 'safe' amount the same in the script), and the memory usage as reported did jump from below my self-imposed limit of 25M to over 32M. Then, as expected, it slowly ramped down as URLs were no longer added. But I'll leave the question up: Is this the right way to do this?]
Can I trust memory_get_usage() in this way? Are there better alternative methods for getting memory usage (I've seen some scripts parse the output of shell commands)?
real_usage works this way:
Zend's memory manager does not use system malloc for every block it needs. Instead, it allocates a big block of system memory (in increments of 256K, can be changed by setting environment variable ZEND_MM_SEG_SIZE) and manages it internally. So, there are two kinds of memory usage:
How much memory the engine took from the OS ("real usage")
How much of this memory was actually used by the application ("internal usage")
Either one of these can be returned by memory_get_usage(). Which one is more useful for you depends on what you are looking into. If you're looking into optimizing specific parts of your code, "internal" might be more useful for you. If you're tracking memory usage globally, "real" would be of more use. memory_limit limits the "real" number, so as soon as all blocks permitted by the limit have been taken from the system and the memory manager can't allocate a requested block, the allocation fails. Note that "internal" usage in this case might be less than the limit, but the allocation could still fail because of fragmentation.
Also, if you are using some external memory tracking tool, you can set the environment variable USE_ZEND_ALLOC=0, which would disable the above mechanism and make the engine always use malloc(). This would have much worse performance but allows you to use malloc-tracking tools.
See also an article about this memory manager, it has some code examples too.
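Since memory_limit is compared against the "real" number, a buffer check like the one in the question could be sketched as follows. This assumes PHP 7 or later; iniSizeToBytes() and hasHeadroom() are made-up helper names, and the 8 MB reserve is an arbitrary example value.

```php
<?php
// Convert a php.ini shorthand size such as "32M" to bytes.
// Returns -1 for "-1", which means "no limit".
function iniSizeToBytes(string $value): int
{
    $value = trim($value);
    if ($value === '-1') {
        return -1;
    }
    $number = (int) $value; // "32M" casts to 32
    switch (strtoupper(substr($value, -1))) {
        case 'G': return $number * 1024 * 1024 * 1024;
        case 'M': return $number * 1024 * 1024;
        case 'K': return $number * 1024;
        default:  return $number;
    }
}

// Hypothetical guard: true if at least $reserve bytes remain below memory_limit.
function hasHeadroom(int $reserve): bool
{
    $limit = iniSizeToBytes((string) ini_get('memory_limit'));
    if ($limit < 0) {
        return true; // unlimited
    }
    return $limit - memory_get_usage(true) >= $reserve;
}

var_dump(hasHeadroom(8 * 1024 * 1024));
```

Because several responses can arrive between two checks, the reserve needs to cover the largest burst you expect, not just one response.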
I also assume memory_get_usage() is safe but I guess you can compare both methods and decide for yourself, here is a function that parses the system calls:
function Memory_Usage($decimals = 2)
{
    $result = 0;
    if (function_exists('memory_get_usage'))
    {
        $result = memory_get_usage() / 1024;
    }
    else
    {
        if (function_exists('exec'))
        {
            $output = array();
            if (substr(strtoupper(PHP_OS), 0, 3) == 'WIN')
            {
                exec('tasklist /FI "PID eq ' . getmypid() . '" /FO LIST', $output);
                $result = preg_replace('/[\D]/', '', $output[5]);
            }
            else
            {
                exec('ps -eo%mem,rss,pid | grep ' . getmypid(), $output);
                $output = explode(' ', $output[0]);
                $result = $output[1];
            }
        }
    }
    return number_format(intval($result) / 1024, $decimals, '.', '');
}
Use Xdebug, which was recently (January 29th) updated to include memory profiling information. It keeps track of function calls and how much memory they consume. This gives you a very insightful view into your code and, at the very least, makes you aware of where the problems are.
The documentation is helpful, but essentially you install it, enable profiling with xdebug.profiler_enable = 1, set the output directory with xdebug.profiler_output_dir=/some/path, and give the output to a tool such as qcachegrind, which does the heavy lifting and lets you see the results visually.
Well, I have never really had a memory problem with my PHP scripts, so I do not think I can be of much help finding the cause of the problem. What I can recommend is that you get a PHP accelerator: you will notice a serious performance increase, and memory usage will decline. Here is a list of accelerators and an article comparing a few of them (3x better performance with any of them):
Wikipedia List
Benchmark
The benchmarks are 2 years old but you get the idea of the performance increases.
If you have to, you can also increase your memory limit in PHP if you are still having problems even with the accelerator. Open up your php.ini and find:
memory_limit = 32M;
and just increase it a little.
Is there a function in PHP (or a PHP extension) to find out how much memory a given variable uses? sizeof just tells me the number of elements/properties.
memory_get_usage helps in that it gives me the memory size used by the whole script. Is there a way to do this for a single variable?
Note that this is on a development machine, so loading extensions or debug tools is feasible.
There's no direct way to get the memory usage of a single variable, but as Gordon suggested, you can use memory_get_usage. That will return the total amount of memory allocated, so you can use a workaround and measure usage before and after to get the usage of a single variable. This is a bit hacky, but it should work.
$start_memory = memory_get_usage();
$foo = "Some variable";
echo memory_get_usage() - $start_memory;
Note that this is in no way a reliable method: you can't be sure that nothing else touches memory while the variable is being assigned, so this should only be used as an approximation.
You can actually turn that into a function by creating a copy of the variable inside the function and measuring the memory used. I haven't tested this, but in principle I don't see anything wrong with it:
function sizeofvar($var) {
    $start_memory = memory_get_usage();
    $tmp = unserialize(serialize($var));
    return memory_get_usage() - $start_memory;
}
You probably need a memory profiler. I have gathered information from SO and copied the most important parts here, which may help you as well.
As you probably know, Xdebug dropped the memory profiling support since the 2.* version. Please search for the "removed functions" string here: http://www.xdebug.org/updates.php
Removed functions
Removed support for Memory profiling as that didn't work properly.
Other Profiler Options
php-memory-profiler
https://github.com/arnaud-lb/php-memory-profiler. This is what I've done on my Ubuntu server to enable it:
sudo apt-get install libjudy-dev libjudydebian1
sudo pecl install memprof
echo "extension=memprof.so" | sudo tee /etc/php5/mods-available/memprof.ini
sudo php5enmod memprof
sudo service apache2 restart
And then in my code:
<?php
memprof_enable();
// do your stuff
memprof_dump_callgrind(fopen("/tmp/callgrind.out", "w"));
Finally open the callgrind.out file with KCachegrind
Using Google gperftools (recommended!)
First of all install the Google gperftools by downloading the latest package here: https://code.google.com/p/gperftools/
Then as always:
sudo apt-get update
sudo apt-get install libunwind-dev -y
./configure
make
make install
Now in your code:
memprof_enable();
// do your magic
memprof_dump_pprof(fopen("/tmp/profile.heap", "w"));
Then open your terminal and launch:
pprof --web /tmp/profile.heap
pprof will open a new window in your existing browser session showing the profile as a graph.
Xhprof + Xhgui (the best in my opinion to profile both cpu and memory)
With Xhprof and Xhgui you can profile the cpu usage as well or just the memory usage if that's your issue at the moment.
It's a very complete solution: it gives you full control, and the logs can be written either to MongoDB or to the filesystem.
For more details see here.
Blackfire
Blackfire is a PHP profiler by SensioLabs, the Symfony2 guys https://blackfire.io/
If you use puphpet to set up your virtual machine you'll be happy to know it's supported ;-)
Xdebug and tracing memory usage
Xdebug 2 is an extension for PHP. Xdebug allows you to log all function calls, including parameters and return values, to a file in different formats. There are three output formats: one is meant as a human-readable trace, another is more suited to computer programs as it is easier to parse, and the last one uses HTML to format the trace. You can switch between the different formats with the setting. An example would be available here
forp
forp is a simple, non-intrusive, production-oriented PHP profiler. Some of its features are:
measurement of time and allocated memory for each function
CPU usage
file and line number of the function call
output as Google's Trace Event format
caption of functions
grouping of functions
aliases of functions (useful for anonymous functions)
DBG
DBG is a full-featured PHP debugger, an interactive tool that helps you debug PHP scripts. It works on a production and/or development web server and allows you to debug your scripts locally or remotely, from an IDE or console. Its features are:
Remote and local debugging
Explicit and implicit activation
Call stack, including function calls, dynamic and static method calls, with their parameters
Navigation through the call stack with ability to evaluate variables in corresponding (nested) places
Step in/Step out/Step over/Run to cursor functionality
Conditional breakpoints
Global breakpoints
Logging for errors and warnings
Multiple simultaneous sessions for parallel debugging
Support for GUI and CLI front-ends
IPv6 and IPv4 networks supported
All data transferred by debugger can be optionally protected with SSL
No, there is not. But you can serialize($var) and check the strlen of the result for an approximation.
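A minimal sketch of that approximation; approxSize() is a made-up name, and the result is the length of the serialized text, not the real in-memory footprint.

```php
<?php
// Rough size estimate: length of the serialized form of a value.
function approxSize($var): int
{
    return strlen(serialize($var));
}

echo approxSize([1, 2, 3]) . "\n"; // 30, the length of a:3:{i:0;i:1;i:1;i:2;i:2;i:3;}
echo approxSize('hello') . "\n";   // 12, the length of s:5:"hello";
```

This is mostly useful for comparing variables against each other, since the serialized length and the real memory use grow roughly together.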
In reply to Tatu Ulmanen's answer:
It should be noted, that $start_memory itself will take up memory (PHP_INT_SIZE * 8).
So the whole function should become:
function sizeofvar($var) {
    $start_memory = memory_get_usage();
    $var = unserialize(serialize($var));
    return memory_get_usage() - $start_memory - PHP_INT_SIZE * 8;
}
Sorry to add this as an extra answer, but I can not yet comment on an answer.
Update: The *8 is not definite. It apparently can depend on the PHP version and possibly on 64/32 bit.
You can't retrospectively calculate the exact footprint of a variable, as two variables can share the same allocated space in memory.
Let's try to share memory between two arrays: we see that allocating the second array costs half the memory of the first one. When we unset the first one, nearly all the memory is still used by the second one.
echo memory_get_usage()."\n"; // <-- 433200
$c=range(1,100);
echo memory_get_usage()."\n"; // <-- 444348 (+11148)
$d=array_slice($c, 1);
echo memory_get_usage()."\n"; // <-- 451040 (+6692)
unset($c);
echo memory_get_usage()."\n"; // <-- 444232 (-6808)
unset($d);
echo memory_get_usage()."\n"; // <-- 433200 (-11032)
So we can't conclude that the second array uses half the memory, as that becomes false when we unset the first one.
For a full view about how the memory is allocated in PHP and for which use, I suggest you to read the following article: How big are PHP arrays (and values) really? (Hint: BIG!)
The Reference Counting Basics page in the PHP documentation also has a lot of information about memory use and about reference counts on shared data segments.
The different solutions exposed here are good for approximations but none can handle the subtle management of PHP memory.
calculating newly allocated space
If you want the newly allocated space after an assignment, then you have to use memory_get_usage() before and after the allocation, as using it with a copy does give you an erroneous view of the reality.
// open output buffer
echo "Result: ";
// call every function once
range(1,1); memory_get_usage();
echo memory_get_usage()."\n";
$c=range(1,100);
echo memory_get_usage()."\n";
Remember that if you want to store the result of the first memory_get_usage(), the variable must already exist beforehand, and memory_get_usage() must have been called once before, as must every other function you use.
If you want to echo as in the example above, your output buffer must already be open, so that the memory needed to open it is not counted.
calculating required space
If you want to rely on a function to calculate the required space to store a copy of a variable, the following code takes care of different optimizations:
<?php
function getMemorySize($value) {
    // existing variable with integer value so that the next line
    // does not add memory consumption when initiating $start variable
    $start = 1;
    $start = memory_get_usage();
    // json functions report lower byte consumption than serialize
    $tmp = json_decode(json_encode($value));
    return memory_get_usage() - $start;
}
// open the output buffer, and call the function a first time
echo ".\n";
getMemorySize(NULL);
// test inside a function in order to not care about memory used
// by the addition of the variable name to the $GLOBALS array
function test() {
    // call the function name once
    range(1,1);
    // we will compare the two values (see comment above about initialization of $start)
    $start = 1;
    $start = memory_get_usage();
    $c = range(1,100);
    echo memory_get_usage()-$start."\n";
    echo getMemorySize($c)."\n";
}
test();
// same result, this works fine.
// 11044
// 11044
Note that the size of the variable name matters in the memory allocated.
Check your code!!
A variable has a basic size defined by the inner C structure used in the PHP source code. This size does not fluctuate in the case of numbers. For strings, it would add the length of the string.
typedef union _zvalue_value {
    long lval;                /* long value */
    double dval;              /* double value */
    struct {
        char *val;
        int len;
    } str;
    HashTable *ht;            /* hash table value */
    zend_object_value obj;
} zvalue_value;
If we do not take the initialization of the variable name into account, we already know how much a variable uses (in case of numbers and strings):
44 bytes in the case of numbers
+ 24 bytes in the case of strings
+ the length of the string (including the final NUL character)
(those numbers can change depending on the PHP version)
You have to round up to a multiple of 4 bytes due to memory alignment. If the variable is in the global space (not inside a function), it will also allocate 64 more bytes.
So if you want to use one of the code samples on this page, you have to check, using some simple test cases (strings or numbers), that the result matches these figures, taking into account every one of the points in this post ($GLOBALS array, first function call, output buffer, ...).
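The rule of thumb above can be written down as a small calculation. This is only a sketch: the constants are the figures quoted in this post (which, as noted, change with the PHP version), estimateStringBytes() is a made-up name, and intdiv() assumes PHP 7 or later.

```php
<?php
// Figures taken from the post above; they differ on other PHP versions.
const BASE_BYTES      = 44; // any number or string
const STRING_BYTES    = 24; // extra for strings
const GLOBAL_BYTES    = 64; // extra when the variable lives in the global scope
const ALIGNMENT_BYTES = 4;

// Hypothetical helper applying the post's rule of thumb to a string value.
function estimateStringBytes(string $s, bool $isGlobal = false): int
{
    $raw = BASE_BYTES + STRING_BYTES + strlen($s) + 1; // +1 for the trailing NUL
    // Round up to the next multiple of the alignment.
    $aligned = intdiv($raw + ALIGNMENT_BYTES - 1, ALIGNMENT_BYTES) * ALIGNMENT_BYTES;
    return $aligned + ($isGlobal ? GLOBAL_BYTES : 0);
}

echo estimateStringBytes('hello') . "\n";       // 44 + 24 + 6 = 74, rounded up to 76
echo estimateStringBytes('hello', true) . "\n"; // 76 + 64 = 140
```

Comparing such an estimate with a measured memory_get_usage() difference for the same simple string is one way to run the sanity check described above.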
See:
memory_get_usage() — Returns the amount of memory allocated to PHP
memory_get_peak_usage() — Returns the peak of memory allocated by PHP
Note that this won't give you the memory usage of a specific variable though. But you can put calls to these function before and after assigning the variable and then compare the values. That should give you an idea of the memory used.
You could also have a look at the PECL extension Memtrack, though the documentation is a bit lacking, if not to say, virtually non-existent.
You could opt for calculating memory difference on a callback return value. It's a more elegant solution available in PHP 5.3+.
function calculateFootprint($callback) {
    $startMemory = memory_get_usage();
    $result = call_user_func($callback);
    return memory_get_usage() - $startMemory;
}
$memoryFootprint = calculateFootprint(
    function() {
        return range(1, 1000000);
    }
);
echo ($memoryFootprint / (1024 * 1024)) . ' MB' . PHP_EOL;
I had a similar problem, and the solution I used was to write the variable to a file then run filesize() on it. Roughly like this (untested code):
function getVariableSize ( $foo )
{
    $tmpfile = "temp-" . microtime(true) . ".txt";
    file_put_contents($tmpfile, $foo);
    $size = filesize($tmpfile);
    unlink($tmpfile);
    return $size;
}
This solution isn't terribly fast because it involves disk IO, but it should give you something much more exact than the memory_get_usage tricks. It just depends upon how much precision you require.
The following script shows total memory usage of a single variable.
function getVariableUsage($var) {
    $total_memory = memory_get_usage();
    $tmp = unserialize(serialize($var));
    return memory_get_usage() - $total_memory;
}
$var = "Hey, what's you doing?";
echo getVariableUsage($var);
Check this out
http://www.phpzag.com/how-much-memory-do-php-variables-use/
Never tried, but Xdebug traces with xdebug.collect_assignments may be enough.