I have a simple object thing that is able to have children of the same type.
This object has a toHTML method, which does something like:
$html = '<div>' . $this->name . '</div>';
$html .= '<ul>';
foreach($this->children as $child)
$html .= '<li>' . $child->toHTML() . '</li>';
$html .= '</ul>';
return $html;
The problem is that when the object is complex, like lots of children with children with children etc, memory usage skyrockets.
If I simply print_r the multidimensional array that feeds this object I get like 1 MB memory usage, but after I convert the array to my object and do print $root->toHtml() it takes 10 MB !!
How can I fix this?
====================================
Made a simple class that is similar to my real code (but smaller):
class obj{
protected $name;
protected $children = array();
public function __construct($name){
$this->name = $name;
}
public static function build($name, $array = array()){
$obj = new self($name);
if(is_array($array)){
foreach($array as $k => $v)
$obj->addChild(self::build($k, $v));
}
return $obj;
}
public function addChild(self $child){
$this->children[] = $child;
}
public function toHTML(){
$html = '<div>' . $this->name . '</div>';
$html .= '<ul>';
foreach($this->children as $child)
$html .= '<li>' . $child->toHTML() . '</li>';
$html .= '</ul>';
return $html;
}
}
And tests:
$big = array_fill(0, 500, true);
$big[5] = array_fill(0, 200, $big);
print_r($big);
// memory_get_peak_usage() shows 0.61 MB
$root = obj::build('root', $big);
// memory_get_peak_usage() shows 18.5 MB wtf lol
print $root->toHTML();
// memory_get_peak_usage() shows 24.6 MB
The problem is that you're buffering all the data in memory, which you don't actually need to do, as you're just outputting the data, rather than actually processing it.
Rather than buffering everything in memory, if all you want to do is output it you should just output it to wherever it's going to:
public function toHTMLOutput($outputStream){
fwrite($outputStream, '<div>' . $this->name . '</div>');
fwrite($outputStream, '<ul>');
foreach($this->children as $child){
fwrite($outputStream, '<li>');
$child->toHTMLOutput($outputStream);
fwrite($outputStream, '</li>');
}
fwrite($outputStream, '</ul>');
}
$stdout = fopen('php://stdout', 'w');
$root->toHTMLOutput($stdout); // writes directly to the stream; there is no return value to print
or if you want to save the output to a file
$file = fopen('htmloutput.html', 'w');
$root->toHTMLOutput($file);
fclose($file);
Obviously I've only implemented it for the toHTML() function, but the same principle should be applied to the build function, which could let you skip a separate toHTML function altogether.
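As a rough sketch of what "skipping toHTML altogether" could look like, the nested array can be written to the stream directly, without ever building the obj tree. The function name renderArray() is made up for this illustration, and it assumes the same markup as the toHTML() method in the question:

```php
<?php
// Hypothetical helper (not from the original answer): renders the nested
// array straight to a stream, so neither an object tree nor one big HTML
// string has to exist at any point.
function renderArray(string $name, $data, $out): void
{
    fwrite($out, '<div>' . $name . '</div>');
    fwrite($out, '<ul>');
    if (is_array($data)) {
        foreach ($data as $key => $value) {
            fwrite($out, '<li>');
            renderArray((string) $key, $value, $out);
            fwrite($out, '</li>');
        }
    }
    fwrite($out, '</ul>');
}

// Usage with a small array. php://memory is used here only so the result
// can be inspected; for real output you would pass php://output or a file.
$out = fopen('php://memory', 'w+');
renderArray('root', ['a' => true, 'b' => ['c' => true]], $out);
rewind($out);
$html = stream_get_contents($out);
fclose($out);
```

The recursion depth equals the nesting depth of the array, and the only data held per level is the current key and value.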
Introduction
Since you are still going to output the HTML, there is no need to hold it in memory first.
Here is a simple class that:
Builds a menu from a multidimensional array
Is memory efficient because it uses an Iterator
Can write to a socket, stream, file, array, Iterator, etc.
Example
$it = new ListBuilder(new RecursiveArrayIterator($big));
// Use Echo
$m = memory_get_peak_usage();
$it->display();
printf("%0.5fMB\n", (memory_get_peak_usage() - $m) / (1024 * 1024));
Output
0.03674MB
Other Output Interfaces
$big = array_fill(0, 500, true);
$big[5] = array_fill(0, 200, $big);
Simple Compare
// Use Echo
$m = memory_get_peak_usage();
$it->display();
$responce['echo'] = sprintf("%0.5fMB\n", (memory_get_peak_usage() - $m) / (1024 * 1024));
// Output to Stream or File eg ( Socket or HTML file)
$m = memory_get_peak_usage();
$it->display(fopen("php://output", "w"));
$responce['stream'] = sprintf("%0.5fMB\n", (memory_get_peak_usage() - $m) / (1024 * 1024));
// Output to ArrayIterator
$m = memory_get_peak_usage();
$it->display($array = new ArrayIterator());
$responce['iterator'] = sprintf("%0.5fMB\n", (memory_get_peak_usage() - $m) / (1024 * 1024));
// Output to Array
$m = memory_get_peak_usage();
$it->display($array = []);
$responce['array'] = sprintf("%0.5fMB\n", (memory_get_peak_usage() - $m) / (1024 * 1024));
echo "\n\nResults \n";
echo json_encode($responce, 128);
Output
Results
{
"echo": "0.03684MB\n",
"stream": "0.00081MB\n",
"iterator": "32.04364MB\n",
"array": "0.00253MB\n"
}
Class Used
class ListBuilder extends RecursiveIteratorIterator {
protected $pad = "\t";
protected $o;
public function beginChildren() {
$this->output("%s<ul>\n", $this->getPad());
}
public function endChildren() {
$this->output("%s</ul>\n", $this->getPad());
}
public function current() {
$this->output("%s<li>%s</li>\n", $this->getPad(1), parent::current());
return parent::current();
}
public function getPad($n = 0) {
return str_repeat($this->pad, $this->getDepth() + $n);
}
function output() {
$args = func_get_args();
$format = array_shift($args);
$var = vsprintf($format, $args);
switch (true) {
case $this->o instanceof ArrayIterator :
$this->o->append($var);
break;
case is_array($this->o) || $this->o instanceof ArrayObject :
$this->o[] = $var;
break;
case is_resource($this->o) && (get_resource_type($this->o) === "file" || get_resource_type($this->o) === "stream") :
fwrite($this->o, $var);
break;
default :
echo $var;
break;
}
}
function display($output = null) {
$this->o = $output;
$this->output("%s<ul>\n", $this->getPad());
foreach($this as $v) {
}
$this->output("%s</ul>\n", $this->getPad());
}
}
Conclusion
As you can see, looping with an iterator is fast, but storing the values in an iterator or object might not be that memory efficient.
The total number of elements in your array is a little over 100000.
Each element of your array is just one byte (boolean), so over 100000 elements take about 100000 bytes ~ 0.1 MB.
Each of your objects is ~100 bytes, which gives 100*100000 = 10000000 bytes ~ 10 MB.
But you have ~18 MB, so where do the other 8 come from?
If You run this code
<?php
$c = 0; //we use this to count object instances
class obj{
protected $name;
protected $children = array();
public static $c=0;
public function __construct($name){
global $c;
$c++;
$this->name = $name;
}
public static function build($name, $array = array()){
global $c;
$b = memory_get_usage();
$obj = new self($name);
$diff = memory_get_usage()-$b;
echo $c . ' diff ' . $diff . '<br />'; //display change in allocated size
if(is_array($array)){
foreach($array as $k => $v)
$obj->addChild(self::build($k, $v));
}
return $obj;
}
public function addChild(self $child){
$this->children[] = $child;
}
public function toHTML(){
$html = '<div>' . $this->name . '</div>';
$html .= '<ul>';
foreach($this->children as $child)
$html .= '<li>' . $child->toHTML() . '</li>';
$html .= '</ul>';
return $html;
}
}
$big = array_fill(0, 500, true);
$big[5] = array_fill(0, 200, $big);
$root = obj::build('root', $big);
You will notice that the change is constant, except for the objects created as the
1024th, 2048th, 4096th...
I don't have a link to any article or manual page about it, but my guess is that PHP holds references to each created object in an array with an initial size of 1024. When this array is full, its size is doubled to make space for new objects.
If you take the difference for, say, the 2048th object, subtract the size of an object (the constant value you see on the other lines) and divide by 2048, you will always get 32, which I take to be the size in bytes of one slot in that array.
So for 100000 objects this array has grown to 131072 elements.
131072*32 = 4194304 B = 4 MB
These calculations are only approximate, but I think they answer your question about what takes so much memory.
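The arithmetic above can be reproduced with a few lines. This is only a sketch: the 100-byte object size and the 32-byte slot size are the estimates from this answer, not measured values, and the object count is derived from the test data in the question.

```php
<?php
// Objects created by obj::build('root', $big) for the question's test data:
// the root, 500 first-level children, 200 children under key 5, and
// 500 leaves under each of those 200.
$objects = 1 + 500 + 200 + 200 * 500;

// Estimated memory for the objects themselves (~100 bytes each, an estimate).
$objectBytes = $objects * 100;

// Smallest table size that holds all objects if the table starts at 1024
// slots and doubles every time it fills up (the guess made above).
$slots = 1024;
while ($slots < $objects) {
    $slots *= 2;
}
$tableBytes = $slots * 32; // 32 bytes per slot, as observed above

printf(
    "%d objects, ~%.1f MB of objects, %d slots, ~%.1f MB of table\n",
    $objects,
    $objectBytes / 1048576,
    $slots,
    $tableBytes / 1048576
);
```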
To answer how to keep memory low: avoid using objects for large sets of data.
Obviously objects are nice and stuff, but primitive data types are faster and smaller.
Maybe you can make it work with one object containing an array with the data. It is hard to propose an alternative without more info about these objects and what methods/interface they require.
One thing that might be catching you is that you might be getting close to blowing your stack because of recursion. It might make sense in this case to create a rendering function that deals with the tree as a whole to render instead of relying on recursion to do the rendering for you. For informative topics on this see tail call recursion and tail call optimization.
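A minimal sketch of rendering "the tree as a whole" without recursive calls is shown here. It keeps its own stack of pending work instead of using the call stack. The name renderIterative() is hypothetical, and to stay self-contained it walks the nested array rather than obj instances:

```php
<?php
// Hypothetical iterative renderer: produces the same markup as toHTML()
// but uses an explicit stack instead of recursive method calls.
function renderIterative(string $rootName, $rootData, $out): void
{
    // Each stack entry is either ['node', name, data] or ['text', literal].
    $stack = [['node', $rootName, $rootData]];
    while ($stack) {
        $item = array_pop($stack);
        if ($item[0] === 'text') {
            fwrite($out, $item[1]);
            continue;
        }
        [, $name, $data] = $item;
        fwrite($out, '<div>' . $name . '</div><ul>');
        // Push the closing tag first so it is written after all children.
        $stack[] = ['text', '</ul>'];
        if (is_array($data)) {
            // Push children in reverse so they are popped in original order.
            foreach (array_reverse($data, true) as $key => $value) {
                $stack[] = ['text', '</li>'];
                $stack[] = ['node', (string) $key, $value];
                $stack[] = ['text', '<li>'];
            }
        }
    }
}

$out = fopen('php://memory', 'w+');
renderIterative('root', ['a' => true, 'b' => ['c' => true]], $out);
rewind($out);
$html = stream_get_contents($out);
fclose($out);
```

For the test data in the question the tree is only a few levels deep, so this mainly matters for much deeper structures.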
To stick with your code's current structure and dodge a lot of the resource problems that you are likely facing the simplest solution may be to simply pass in the html string as a reference like:
class obj{
protected $name;
protected $children = array();
public function __construct($name){
$this->name = $name;
}
public static function build($name, $array = array()){
$obj = new self($name);
if(is_array($array)){
foreach($array as $k => $v)
$obj->addChild(self::build($k, $v));
}
return $obj;
}
public function addChild(self $child){
$this->children[] = $child;
}
public function toHTML(&$html = ""){
$html .= '<div>' . $this->name . '</div>';
$html .= '<ul>';
foreach($this->children as $child){
$html .= '<li>';
$child->toHTML($html); // appends to $html through the reference; nothing is returned
$html .= '</li>';
}
$html .= '</ul>';
}
}
This will keep you from hauling around a bunch of duplicate partial tree renders while the recursive calls are resolving.
As for the actual build of the tree, I think a lot of the memory usage is just the price of playing with data that big. Your options there are either to render the output directly instead of building up a hierarchical model just to render it, or to employ some sort of caching strategy that caches copies of the object tree or of the rendered HTML, depending on how the data is used within your site. If you have control of the inbound data, invalidating the relevant cache keys can be added to that workflow to keep the cache from getting stale.
Related
I need to turn each end-point in a multi-dimensional array (of any dimension) into a row containing all the descendant nodes using PHP. In other words, I want to resolve each complete branch in the array. I am not sure how to state this more clearly, so maybe the best way is to give an example.
If I start with an array like:
$arr = array(
'A'=>array(
'a'=>array(
'i'=>1,
'j'=>2),
'b'=>3
),
'B'=>array(
'a'=>array(
'm'=>4,
'n'=>5),
'b'=>6
)
);
There are 6 end points, namely the numbers 1 to 6, in the array and I would like to generate the 6 rows as:
A,a,i,1
A,a,j,2
A,b,3
B,a,m,4
B,a,n,5
B,b,6
Each row contains full path of descendants to the end-point. As the array can have any number of dimensions, this suggested a recursive PHP function and I tried:
function array2Rows($arr, $str='', $out='') {
if (is_array($arr)) {
foreach ($arr as $att => $arr1) {
$str .= ((strlen($str)? ',': '')) . $att;
$out = array2Rows($arr1, $str, $out);
}
echo '<hr />';
} else {
$str .= ((strlen($str)? ',': '')) . $arr;
$out .= ((strlen($out)? '<br />': '')) . $str;
}
return $out;
}
The function was called as follows:
echo '<p>'.array2Rows($arr, '', '').'</p>';
The output from this function is:
A,a,i,1
A,a,i,j,2
A,a,b,3
A,B,a,m,4
A,B,a,m,n,5
A,B,a,b,6
Which apart from the first value is incorrect because values on some of the nodes are repeated. I have tried a number of variations of the recursive function and this is the closest I can get.
I will welcome any suggestions for how I can get a solution to this problem and apologize if the statement of the problem is not very clear.
You were so close with your function... I took your function and modified it slightly as follows:
function array2Rows($arr, $str='', $csv='') {
$tmp = $str;
if (is_array($arr)) {
foreach ($arr as $att => $arr1) {
$tmp = $str . ((strlen($str)? ', ': '')) . $att;
$csv = array2Rows($arr1, $tmp, $csv);
}
} else {
$tmp .= ((strlen($str)? ', ': '')) . $arr;
$csv .= ((strlen($csv)? '<br />': '')) . $tmp;
}
return $csv;
}
The only difference is the introduction of a temporary variable $tmp to ensure that you don't change the $str value before the recursion function is run each time.
The output from your function becomes:
A, a, i, 1
A, a, j, 2
A, b, 3
B, a, m, 4
B, a, n, 5
B, b, 6
This is a nice function, I can think of a few applications for it.
The reason that you are repeating the second to last value is that in your loop you are appending the key to $str before running the function on the next array, so every later sibling inherits it. Something like this would work better:
function array2Rows($arr, &$out=[], $row = []) {
if (is_array($arr)) {
foreach ($arr as $key => $newArray) {
if (is_array($newArray)) {
array2Rows($newArray, $out, array_merge($row, [$key])); //If the current value is an array, process it with its key added to a copy of the row, so sibling keys do not accumulate
} else { //The current value is not an array
$out[] = implode(',',array_merge($row,[$key,$newArray])); //Add the current key and value to the row and write to the output
}
}
}
return $out;
}
This is lightly optimized and utilizes a reference to hold the full output. I've also changed this to use and return an array rather than strings. I find both of those changes to make the function more readable.
If you wanted this to return a string formatted similarly to the one that you have in your function, replace the last line with
return implode('<br>', $out);
Alternatively, you could do that when calling, which would be what I would call "best practice" for something like this; e.g.
$result = array2Rows($arr);
echo implode('<br>', $result);
Note, since this uses a reference for the output, this also works:
array2Rows($arr, $result);
echo implode('<br>', $result);
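For comparison, the same rows can be produced without writing the recursion by hand, using the SPL iterators. This is a sketch of an alternative technique rather than a change to the functions above. It relies on RecursiveIteratorIterator visiting only the leaves by default and on getSubIterator() giving access to the key at each level:

```php
<?php
$arr = [
    'A' => ['a' => ['i' => 1, 'j' => 2], 'b' => 3],
    'B' => ['a' => ['m' => 4, 'n' => 5], 'b' => 6],
];

$rows = [];
// The default mode is LEAVES_ONLY, so the loop only visits the end points.
$it = new RecursiveIteratorIterator(new RecursiveArrayIterator($arr));
foreach ($it as $leaf) {
    $path = [];
    // Collect the key at every level from the root down to the current leaf.
    for ($depth = 0; $depth <= $it->getDepth(); $depth++) {
        $path[] = $it->getSubIterator($depth)->key();
    }
    $path[] = $leaf;
    $rows[] = implode(',', $path);
}

echo implode("\n", $rows), "\n";
```

This prints the six rows A,a,i,1 through B,b,6, one per line.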
I have a recursive function in php which gets from a database a folder tree. Each folder has an id, a name and a parent id.
function show_subfolders($parent=0, $indent=0) {
$indent++;
$folders = sql_to_assoc("SELECT * FROM `folders` WHERE 'parent' = ".$parent.";");
foreach($folders as $folder) {
echo ' '.$folder['naam'].' <br>';
show_subfolders($folder['id'], $indent);
}
}
show_subfolders();
I expect that the variable $indent tells us the level of nestedness of the recursive function, but it is not.. it just counts the number of calls. I hope it is clear that I want to know the 'generation' of each child-element.
Try taking the $indent variable outside of the function scope. Also, after you finish traversing a node's (folder's) contents you go back up a level, so at some point you should do a $indent--;
$indent = 0;
function show_subfolders($parent = 0){
// give this function access to $indent
//you could also use a class var $this->indent if you make this into a class method
global $indent;
// backticks make `parent` a column name; in single quotes it would be a string literal
$folders = sql_to_assoc("SELECT * FROM `folders` WHERE `parent` = ".(int)$parent.";");
foreach($folders as $folder) {
echo str_repeat (' ', $indent).' '.$folder['naam'].' <br>';
$indent++;
show_subfolders($folder['id']);
$indent--;
}
}
Also added the str_repeat function so that your links are 'indented' when rendered in the browser. Although a better approach would be to draw the links in a <ul> list, which will allow you to control the visual indentation with CSS. That would make it:
$indent = 0;
function show_subfolders($parent = 0){
// give this function access to $indent
//you could also use a class var $this->indent if you make this into a class method
global $indent;
// backticks make `parent` a column name; in single quotes it would be a string literal
$folders = sql_to_assoc("SELECT * FROM `folders` WHERE `parent` = ".(int)$parent.";");
if (count($folders)){
echo '<ul>';
foreach($folders as $folder) {
echo '<li> '.$folder['naam'].' </li>';
$indent++;
show_subfolders($folder['id']);
$indent--;
}
echo '</ul>';
}
}
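If you would rather avoid the global, the depth can also travel as a parameter that each call passes on as $depth + 1. The sketch below shows that idea with an in-memory array standing in for the `folders` table; the sample rows and the function name listSubfolders() are made up for the illustration.

```php
<?php
// Hypothetical in-memory stand-in for the `folders` table.
$folders = [
    ['id' => 1, 'naam' => 'docs',    'parent' => 0],
    ['id' => 2, 'naam' => 'reports', 'parent' => 1],
    ['id' => 3, 'naam' => '2023',    'parent' => 2],
    ['id' => 4, 'naam' => 'images',  'parent' => 0],
];

// Returns [depth, name] pairs. $depth is passed by value, so every call
// level sees its own copy and no global counter or $indent-- is needed.
function listSubfolders(array $folders, int $parent = 0, int $depth = 0): array
{
    $result = [];
    foreach ($folders as $folder) {
        if ($folder['parent'] === $parent) {
            $result[] = [$depth, $folder['naam']];
            $children = listSubfolders($folders, $folder['id'], $depth + 1);
            $result = array_merge($result, $children);
        }
    }
    return $result;
}

$listing = listSubfolders($folders);
foreach ($listing as [$depth, $name]) {
    echo str_repeat('  ', $depth), $name, "\n";
}
```

Here 'docs' and 'images' come out at depth 0, 'reports' at depth 1 and '2023' at depth 2.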
I'm using a custom read filter to read files in chunks:
class chunkReadFilter implements PHPExcel_Reader_IReadFilter{
private $start_row, $end_row, $chunk_size;
public function __construct($chunk_size, $start_row=1){
$this->chunk_size = $chunk_size;
$this->start_row = $start_row;
$this->end_row = $start_row+$chunk_size-1;
}
public function moveCursor(){
$this->start_row += $this->chunk_size;
$this->end_row += $this->chunk_size;
}
public function readCell($column, $row, $worksheetName = ''){
return $row>=$this->start_row && $row<=$this->end_row;
}
}
My problem is that I'm not sure how to detect when I've finished. Examples and documentation always hard-code a maximum row:
for ($startRow = 2; $startRow <= 65536; $startRow += $chunkSize) {
}
The PHPExcel_Worksheet::getHighestRow() and PHPExcel_Worksheet::getHighestDataRow() methods seem to work on filtered data (kind of). For instance, in a 200 row file:
If I read rows from 100 to 120 I get 120
If I attempt to read rows from 300 to 320 I get 1 :-?
What's the best way to stop the loop?
The best way to stop the loop is to know how many rows you should be reading in the first place.
There is a helper method in every Reader that will provide some basic meta data about the file without needing to load it all.
Before starting your loop:
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
$worksheetData = $objReader->listWorksheetInfo($inputFileName);
echo '<h3>Worksheet Information</h3>';
echo '<ol>';
foreach ($worksheetData as $worksheet) {
echo '<li>', $worksheet['worksheetName'], '<br />';
echo 'Rows: ', $worksheet['totalRows'],
' Columns: ', $worksheet['totalColumns'], '<br />';
echo 'Cell Range: A1:',
$worksheet['lastColumnLetter'], $worksheet['totalRows'];
echo '</li>';
}
echo '</ol>';
This is documented in section 7 of the User documentation for Reading Spreadsheet files, and in Examples/Reader/exampleReader19.php
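Once the row count is known, it can replace the hard-coded 65536 as the loop bound. The sketch below only shows the range arithmetic and makes no PHPExcel calls; chunkRanges() is a hypothetical helper, and the total would come from $worksheet['totalRows'] as in the listing above.

```php
<?php
// Hypothetical helper: yields [startRow, endRow] pairs that cover rows
// $firstRow..$totalRows in steps of $chunkSize, clamping the last chunk.
function chunkRanges(int $totalRows, int $chunkSize, int $firstRow = 2): Generator
{
    for ($start = $firstRow; $start <= $totalRows; $start += $chunkSize) {
        yield [$start, min($start + $chunkSize - 1, $totalRows)];
    }
}

// With 200 rows (as in the question) and chunks of 80 rows:
$ranges = iterator_to_array(chunkRanges(200, 80), false);
foreach ($ranges as [$start, $end]) {
    echo "rows $start-$end\n";
}
```

For this input the chunks are rows 2-81, 82-161 and 162-200, and the loop ends without probing past the last row.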
The best way to loop through the cells is using getRowIterator and getCellIterator:
$rows = $sheet->getRowIterator();
foreach ($rows as $r => $row) {
$cells = $row->getCellIterator();
foreach ($cells as $c => $cell) {
$value = $cell->getValue();
}
}
I'm using a debugging aid in an application that uses var_dump() with output buffering to capture variables and display them. However, I'm running into an issue with large objects that end up using up too much memory in the buffer.
function getFormattedOutput(mixed $var) {
if (isTooLarge($var)) {
return 'Too large! Abort!'; // What a solution *might* look like
}
ob_start();
var_dump($var); // Fatal error: Allowed memory size of 536870912 bytes exhausted
$data = ob_get_clean();
// Return the nicely-formated data to use later
return $data;
}
Is there a way I can prevent this? Or a work-around to detect that it's about to output a gigantic amount of info for a particular variable? I don't really have control which variables get passed into this function. It could be any type.
As all the others are mentioning, what you ask is impossible. The only thing you can do is try to handle it as well as possible.
What you can try is to split it up into smaller pieces and then combine it. I've created a little test to try and get the memory error. Obviously a real world example might behave differently, but this seems to do the trick.
<?php
define('mem_limit', return_bytes(ini_get('memory_limit'))); //allowed memory
/*
SIMPLE TEST CLASS
*/
class test { }
$loop = 260;
$t = new Test();
for ($x=0;$x<=$loop;$x++) {
$v = 'test'.$x;
$t->$v = new Test();
for ($y=0;$y<=$loop;$y++) {
$v2 = 'test'.$y;
$t->$v->$v2 = str_repeat('something to test! ', 200);
}
}
/* ---------------- */
echo saferVarDumpObject($t);
function varDumpToString($v) {
ob_start();
var_dump($v);
$content = ob_get_contents();
ob_end_clean();
return $content;
}
function saferVarDumpObject($var) {
if (!is_object($var) && !is_array($var))
return varDumpToString($var);
$content = '';
foreach($var as $v) {
$content .= saferVarDumpObject($v);
}
//adding these smaller pieces to a single var works fine.
//returning the complete larger piece gives memory error
$length = strlen($content);
$left = mem_limit-memory_get_usage(true);
if ($left>$length)
return $content; //enough memory left
echo "WARNING! NOT ENOUGH MEMORY<hr>";
if ($left>100) {
return substr($content, 0, $left-100); //100 is a margin I choose, return everything you have that fits in the memory
} else {
return ""; //return nothing.
}
}
function return_bytes($val) {
$val = trim($val);
$last = strtolower($val[strlen($val)-1]);
$val = (int) $val; // drop the unit suffix before doing arithmetic on the value
switch($last) {
// The 'G' modifier is available since PHP 5.1.0
case 'g':
$val *= 1024;
case 'm':
$val *= 1024;
case 'k':
$val *= 1024;
}
return $val;
}
?>
UPDATE
The version above still has some errors. I recreated it to use a class and added some other functions:
Check for recursion
Fix for single large attribute
Mimic var_dump output
trigger_error on warning to be able to catch/hide it
As shown in the comments, the resource identifier for a class is different from the output of var_dump. As far as I can tell the other things are equal.
<?php
/*
RECURSION TEST
*/
class sibling {
public $brother;
public $sister;
}
$brother = new sibling();
$sister = new sibling();
$brother->sister = $sister;
$sister->brother = $brother;
Dump::Safer($brother);
//simple class
class test { }
/*
LARGE TEST CLASS - Many items
*/
$loop = 260;
$t = new Test();
for ($x=0;$x<=$loop;$x++) {
$v = 'test'.$x;
$t->$v = new Test();
for ($y=0;$y<=$loop;$y++) {
$v2 = 'test'.$y;
$t->$v->$v2 = str_repeat('something to test! ', 200);
}
}
//Dump::Safer($t);
/* ---------------- */
/*
LARGE TEST CLASS - Large attribute
*/
$a = new Test();
$a->t2 = new Test();
$a->t2->testlargeattribute = str_repeat('1', 268435456 - memory_get_usage(true) - 1000000);
$a->smallattr1 = 'test small1';
$a->smallattr2 = 'test small2';
//Dump::Safer($a);
/* ---------------- */
class Dump
{
private static $recursionhash;
private static $memorylimit;
private static $spacing;
private static $mimicoutput = true;
final public static function MimicOutput($v) {
//show results similar to var_dump or without array/object information
//defaults to similar as var_dump and cancels this on out of memory warning
self::$mimicoutput = $v===false ? false : true;
}
final public static function Safer($var) {
//set defaults
self::$recursionhash = array();
self::$memorylimit = self::return_bytes(ini_get('memory_limit'));
self::$spacing = 0;
//echo output
echo self::saferVarDumpObject($var);
}
final private static function saferVarDumpObject($var) {
if (!is_object($var) && !is_array($var))
return self::Spacing().self::varDumpToString($var);
//recursion check (only objects have a hash; spl_object_hash() does not accept arrays)
if (is_object($var)) {
$hash = spl_object_hash($var);
if (!empty(self::$recursionhash[$hash])) {
return self::Spacing().'*RECURSION*'.self::Eol();
}
self::$recursionhash[$hash] = true;
}
//create a similar output as var dump to identify the instance
$content = self::Spacing() . self::Header($var);
//add some spacing to mimic vardump output
//Perhaps not the best idea because the idea is to use as little memory as possible.
self::$spacing++;
//Loop trough everything to output the result
foreach($var as $k=>$v) {
$content .= self::Spacing().self::Key($k).self::Eol().self::saferVarDumpObject($v);
}
self::$spacing--;
//decrease spacing and end the object/array
$content .= self::Spacing().self::Footer().self::Eol();
//adding these smaller pieces to a single var works fine.
//returning the complete larger piece gives memory error
//length of string and the remaining memory
$length = strlen($content);
$left = self::$memorylimit-memory_get_usage(true);
//enough memory left?
if ($left>$length)
return $content;
//show warning
trigger_error('Not enough memory to dump "'.get_class($var).'" memory left:'.$left, E_USER_WARNING);
//stop mimic output to prevent fatal memory error
self::MimicOutput(false);
if ($left>100) {
return substr($content, 0, $left-100); //100 is a margin I chose, return everything you have that fits in the memory
} else {
return ""; //return nothing.
}
}
final private static function Spacing() {
return self::$mimicoutput ? str_repeat(' ', self::$spacing*2) : '';
}
final private static function Eol() {
return self::$mimicoutput ? PHP_EOL : '';
}
final private static function Header($var) {
//the resource identifier for an object is WRONG! It's always 1 because you are passing around parts and not the actual object. Haven't found a fix yet
return self::$mimicoutput ? (is_array($var) ? 'array('.count($var).')' : 'object('.get_class($var).')#'.intval($var).' ('.count((array)$var).')') . ' {'.PHP_EOL : '';
}
final private static function Footer() {
return self::$mimicoutput ? '}' : '';
}
final private static function Key($k) {
return self::$mimicoutput ? '['.(gettype($k)=='string' ? '"'.$k.'"' : $k ).']=>' : '';
}
final private static function varDumpToString($v) {
ob_start();
var_dump($v);
$length = strlen($v);
$left = self::$memorylimit-memory_get_usage(true);
//enough memory left with some margin?
if ($left-100>$length) {
$content = ob_get_contents();
ob_end_clean();
return $content;
}
ob_end_clean();
//show warning
trigger_error('Not enough memory to dump "'.gettype($v).'" memory left:'.$left, E_USER_WARNING);
if ($left>100) {
$header = gettype($v).'('.strlen($v).')';
return $header . substr($v, 0, $left - strlen($header)); //keep only the leading part that still fits
} else {
return ""; //return nothing.
}
}
final private static function return_bytes($val) {
$val = trim($val);
$last = strtolower($val[strlen($val)-1]);
$val = (int) $val; // drop the unit suffix before doing arithmetic on the value
switch($last) {
// The 'G' modifier is available since PHP 5.1.0
case 'g':
$val *= 1024;
case 'm':
$val *= 1024;
case 'k':
$val *= 1024;
}
return $val;
}
}
?>
Well, if the physical memory is limited (you see the fatal error:)
Fatal error: Allowed memory size of 536870912 bytes exhausted
I would suggest to do the output buffering on disk (see callback parameter on ob_start). Output buffering works chunked, that means, if there still is enough memory to keep the single chunk in memory, you can store it into a temporary file.
// handle output buffering via callback, set chunksize to one kilobyte
ob_start($output_callback, $chunk_size = 1024);
However you must keep in mind that this will only prevent the fatal error while buffering. If you now want to return the buffer, you still need to have enough memory or you return the file-handle or file-path so that you can also stream the output.
However, you can then use that file to obtain the size in bytes needed. The overhead for PHP strings is not much IIRC, so if there is still enough memory free for the file size this should work well. You can subtract an offset to leave a little room and play it safe. Just experiment a little to see what works.
Some Example code (PHP 5.4):
<?php
/**
* @link http://stackoverflow.com/questions/5446647/how-can-i-use-var-dump-output-buffering-without-memory-errors/
*/
class OutputBuffer
{
/**
* @var int
*/
private $chunkSize;
/**
* @var bool
*/
private $started;
/**
* @var SplFileObject
*/
private $store;
/**
* @var bool Set Verbosity to true to output analysis data to stderr
*/
private $verbose = true;
public function __construct($chunkSize = 1024) {
$this->chunkSize = $chunkSize;
$this->store = new SplTempFileObject();
}
public function start() {
if ($this->started) {
throw new BadMethodCallException('Buffering already started, can not start again.');
}
$this->started = true;
$result = ob_start(array($this, 'bufferCallback'), $this->chunkSize);
$this->verbose && file_put_contents('php://stderr', sprintf("Starting Buffering: %d; Level %d\n", $result, ob_get_level()));
return $result;
}
public function flush() {
$this->started && ob_flush();
}
public function stop() {
if ($this->started) {
ob_flush();
$result = ob_end_flush();
$this->started = false;
$this->verbose && file_put_contents('php://stderr', sprintf("Buffering stopped: %d; Level %d\n", $result, ob_get_level()));
}
}
private function bufferCallback($chunk, $flags) {
$chunkSize = strlen($chunk);
if ($this->verbose) {
$level = ob_get_level();
$constants = ['PHP_OUTPUT_HANDLER_START', 'PHP_OUTPUT_HANDLER_WRITE', 'PHP_OUTPUT_HANDLER_FLUSH', 'PHP_OUTPUT_HANDLER_CLEAN', 'PHP_OUTPUT_HANDLER_FINAL'];
$flagsText = '';
foreach ($constants as $i => $constant) {
if ($flags & ($value = constant($constant)) || $value == $flags) {
$flagsText .= (strlen($flagsText) ? ' | ' : '') . $constant . "[$value]";
}
}
file_put_contents('php://stderr', "Buffer Callback: Chunk Size $chunkSize; Flags $flags ($flagsText); Level $level\n");
}
if ($flags & PHP_OUTPUT_HANDLER_FINAL) {
return TRUE;
}
if ($flags & PHP_OUTPUT_HANDLER_START) {
$this->store->fseek(0, SEEK_END);
}
$chunkSize && $this->store->fwrite($chunk);
if ($flags & PHP_OUTPUT_HANDLER_FLUSH) {
// there is nothing to do
}
if ($flags & PHP_OUTPUT_HANDLER_CLEAN) {
$this->store->ftruncate(0);
}
return "";
}
public function getSize() {
$this->store->fseek(0, SEEK_END);
return $this->store->ftell();
}
public function getBufferFile() {
return $this->store;
}
public function getBuffer() {
$array = iterator_to_array($this->store);
return implode('', $array);
}
public function __toString() {
return $this->getBuffer();
}
public function endClean() {
return ob_end_clean();
}
}
$buffer = new OutputBuffer();
echo "Starting Buffering now.\n=======================\n";
$buffer->start();
foreach (range(1, 10) as $iteration) {
$string = "fill{$iteration}";
echo str_repeat($string, 100), "\n";
}
$buffer->stop();
echo "Buffering Results:\n==================\n";
$size = $buffer->getSize();
echo "Buffer Size: $size (string length: ", strlen($buffer), ").\n";
echo "Peeking into buffer: ", var_dump(substr($buffer, 0, 10)), ' ...', var_dump(substr($buffer, -10)), "\n";
Output:
STDERR: Starting Buffering: 1; Level 1
STDERR: Buffer Callback: Chunk Size 1502; Flags 1 (PHP_OUTPUT_HANDLER_START[1]); Level 1
STDERR: Buffer Callback: Chunk Size 1503; Flags 0 (PHP_OUTPUT_HANDLER_WRITE[0]); Level 1
STDERR: Buffer Callback: Chunk Size 1503; Flags 0 (PHP_OUTPUT_HANDLER_WRITE[0]); Level 1
STDERR: Buffer Callback: Chunk Size 602; Flags 4 (PHP_OUTPUT_HANDLER_FLUSH[4]); Level 1
STDERR: Buffer Callback: Chunk Size 0; Flags 8 (PHP_OUTPUT_HANDLER_FINAL[8]); Level 1
STDERR: Buffering stopped: 1; Level 0
Starting Buffering now.
=======================
Buffering Results:
==================
Buffer Size: 5110 (string length: 5110).
Peeking into buffer: string(10) "fill1fill1"
...string(10) "l10fill10\n"
When you install xdebug you can limit how deep var_dump follows objects. In some software products you might encounter a kind of recursion, which bloats the output of var_dump.
Other than that, you could raise the memory limit.
See http://www.xdebug.org/docs/display
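As a sketch, the limits documented on that page can also be set at runtime. This assumes the Xdebug extension is loaded and that its var_dump() overloading is active; the values are arbitrary examples.

```php
<?php
// Assumes Xdebug is installed. Without it, ini_set() simply returns false
// for these keys and var_dump() behaves as usual. Values are examples only.
ini_set('xdebug.var_display_max_depth', '3');     // nesting levels shown
ini_set('xdebug.var_display_max_children', '64'); // elements or properties shown per level
ini_set('xdebug.var_display_max_data', '256');    // characters shown per string
```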
I'm sorry, but I think there is no solution for your problem. You are asking for the determination of a size to prevent the memory allocation for that size. PHP can't give you an answer about "how much memory will it consume", as the ZVAL structs are created at the time of usage in PHP. Please refer to Programming PHP - 14.5. Memory Management for an overview of PHP's memory allocation internals.
You gave the correct hint "there can be anything in it" and this is the problem from my point of view. There is an architectural problem that leads to the case you describe. And I think you try to solve it on the wrong end.
For example: you can start with a switch for each type in php and try to set limits for each size. This lasts as long as nobody comes to the idea of changing the memory limit within the process.
Xdebug is a good solution as it keeps your application from exploding because of a (possibly non-business-critical) log function, and it is a bad solution as you should not activate xdebug in production.
I think that a memory exception is the correct behavior and you should not try to work around it.
[rant]If the one who dumps a 50 megabytes or more string does not care about his/her app behavior, he/she deserves to suffer from it ;)[/rant]
I do not believe that there is any way to determine how much memory a specific function will eventually take up. One thing you can do is use memory_get_usage() to check how much memory the script is currently taking right before $largeVar is set, then compare it with the amount after. This will give you a good idea of the size of $largeVar, and you can run trials to determine what a maximum acceptable size limit would be before you exit gracefully.
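A minimal sketch of that before/after measurement follows. The payload is arbitrary; in the real code $largeVar would be whatever gets assigned between the two calls.

```php
<?php
// Measure how much memory the script uses before and after the variable
// is set; the difference approximates what the variable costs.
$before = memory_get_usage();
$largeVar = str_repeat('x', 1024 * 1024); // roughly 1 MiB of string data
$after = memory_get_usage();

$delta = $after - $before;
printf("largeVar costs about %d bytes\n", $delta);
```

Note that this measures the variable itself, not the size of the var_dump() output it would produce, so the acceptable limit still has to be found by trial.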
You could also reimplement the var_dump() function yourself. Have the function walk through the structure and echo the resulting content as it is generated, or store it in a temp file, rather than storing a gigantic string in memory. This will allow you to get the same desired result, but without the memory problems you are encountering.
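A rough sketch of such a reimplementation might look like this. The name streamDump() and its limits are invented for the example, the output format only loosely resembles var_dump(), and objects are walked from the outside, so only public properties show up.

```php
<?php
// Hypothetical streaming dump: writes each piece to $out as soon as it is
// produced, stops descending at $maxDepth and truncates long strings.
function streamDump($var, $out, int $maxDepth = 3, int $maxString = 200, int $depth = 0): void
{
    $pad = str_repeat('    ', $depth);
    if (is_array($var) || is_object($var)) {
        $label = is_array($var) ? 'array' : 'object(' . get_class($var) . ')';
        if ($depth >= $maxDepth) {
            // The depth limit also stops cyclic object graphs from looping forever.
            fwrite($out, $pad . $label . " ...\n");
            return;
        }
        fwrite($out, $pad . $label . " {\n");
        foreach ($var as $key => $value) {
            fwrite($out, $pad . '  [' . $key . "] =>\n");
            streamDump($value, $out, $maxDepth, $maxString, $depth + 1);
        }
        fwrite($out, $pad . "}\n");
    } elseif (is_string($var)) {
        $suffix = strlen($var) > $maxString ? '...' : '';
        fwrite($out, $pad . 'string(' . strlen($var) . ') "' . substr($var, 0, $maxString) . '"' . $suffix . "\n");
    } elseif (is_scalar($var) || $var === null) {
        fwrite($out, $pad . var_export($var, true) . "\n");
    } else {
        fwrite($out, $pad . gettype($var) . "\n"); // e.g. resources
    }
}

// Usage: php://memory is used so the result can be inspected here; a temp
// file or php://output would avoid holding the text in memory at all.
$out = fopen('php://memory', 'w+');
streamDump(['a' => 1, 'b' => ['c' => str_repeat('x', 500)]], $out, 3, 10);
rewind($out);
$text = stream_get_contents($out);
fclose($out);
```

Because every string is cut at $maxString and nesting at $maxDepth, the size of the output is bounded by the number of elements visited rather than by the size of the data.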
I found a couple of questions (this one and this question) related to the SPL iterators, but I'm not sure if they're helpful in my case, as I'm using a rather high level extension of the RecursiveIteratorIterator; the DirectoryTreeIterator.
Could somebody perhaps show me how to alter the DirectoryTreeIterator or how to sort the returned array per directory after it has been outputted by the iterator?
A method of sorting the files correctly directly on the Apache server is also an option for me, if it's possible using .htaccess for example.
This is the code of DirectoryTreeIterator from the SPL:
/** @file directorytreeiterator.inc
 * @ingroup Examples
 * @brief class DirectoryTreeIterator
 * @author Marcus Boerger
 * @date 2003 - 2005
 *
 * SPL - Standard PHP Library
 */
/** @ingroup Examples
 * @brief DirectoryIterator to generate ASCII graphic directory trees
 * @author Marcus Boerger
 * @version 1.1
 */
class DirectoryTreeIterator extends RecursiveIteratorIterator
{
    /** Construct from a path.
     * @param $path directory to iterate
     */
    function __construct($path) {
        parent::__construct(
            new RecursiveCachingIterator(
                new RecursiveDirectoryIterator($path, RecursiveDirectoryIterator::KEY_AS_FILENAME
                ),
                CachingIterator::CALL_TOSTRING|CachingIterator::CATCH_GET_CHILD
            ),
            parent::SELF_FIRST
        );
    }
    /** @return the current element prefixed with ASCII graphics
     */
    function current() {
        $tree = '';
        for ($l=0; $l < $this->getDepth(); $l++) {
            $tree .= $this->getSubIterator($l)->hasNext() ? ' ' : ' ';
        }
        return $tree . ($this->getSubIterator($l)->hasNext() ? ' ' : ' ')
            . $this->getSubIterator($l)->__toString();
    }
    /** Aggregates the inner iterator
     */
    function __call($func, $params) {
        return call_user_func_array(array($this->getSubIterator(), $func), $params);
    }
}
To clarify: I'm using the code above because it fits my needs exactly. I want to generate a recursive directory tree prefixed by whitespace, whereas the original code example by Marcus Boerger adds some ASCII elements. The problem is that I don't have control over the sorting of files and directories. I would like the directory tree to appear like this:
dir001
subdir001
subdir002
subfile001.jpg
file001.png
file002.png
file003.png
dir002
apple.txt
bear.txt
contact.txt
dir003
[...]
Instead, the listing returned by the iterator isn't sorted at all, and it shows me something like this:
dir002
bear.txt
apple.txt
contact.txt
dir001
subdir001
subdir002
subfile001.jpg
file002.png
file001.png
file003.png
dir003
[...]
So I guess the solution I'm looking for is some way to call a sort method every time a subdirectory is indexed and added to the directory tree.
I hope I've made it somewhat clearer, as a nonnative speaker it's sometimes hard to put thoughts into coherent sentences (or even words for that matter).
Well, I'm not sure where you got that class from, but it's doing some pretty messed up things (including a few bugs to say the least). And while it uses SPL, it's not an SPL class.
Now, I'm not 100% sure what you mean by "sort", but assuming you're talking about a natural sort, why not just flatten an array, and then sort it?
$it = new RecursiveTreeIterator(
    new RecursiveDirectoryIterator($dir),
    RecursiveTreeIterator::BYPASS_KEY,
    CachingIterator::CALL_TOSTRING
);
$files = iterator_to_array($it);
natsort($files);
echo implode("\n", $files);
or
$it = new RecursiveIteratorIterator(
new RecursiveDirectoryIterator($dir),
RecursiveIteratorIterator::SELF_FIRST
);
$files = iterator_to_array($it);
$files = array_map(function($file) { return (string) $file; }, $files);
natsort($files);
echo implode("\n", $files);
Edit: Based on your edit, here's how I would solve it:
function BuildTree($it, $separator = ' ', $level = '') {
    $results = array();
    foreach ($it as $file) {
        // Skip the "." and ".." pseudo-entries.
        if (in_array($file->getBasename(), array('.', '..'))) {
            continue;
        }
        $tmp = $level . $file->getBasename();
        if ($it->hasChildren()) {
            // Recurse into the subdirectory with a deeper indent.
            $newLevel = $level . $separator;
            $tmp .= "\n" . BuildTree($it->getChildren(), $separator, $newLevel);
        }
        $results[] = $tmp;
    }
    // Natural sort of this level; each entry carries its own subtree with it.
    natsort($results);
    return implode("\n", $results);
}
$it = new RecursiveDirectoryIterator($dir);
$tree = BuildTree($it);
It's a pretty simple recursive parser, and does the natural sort on each level.
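If directories should be listed before files at each level, which is how the desired listing in the question appears to read (an assumption), a plain natsort() will not do that, because it mixes both kinds by name. A minimal sketch of a variant follows. It swaps in scandir() with a custom comparison instead of the SPL iterators, and the name `sortedTreeLines` and the demo directory layout are made up for this example.

```php
<?php
// Hypothetical helper, not part of SPL: returns the tree below $dir as lines,
// directories first, each level sorted with a natural, case-insensitive compare.
function sortedTreeLines(string $dir, string $separator = ' ', string $level = ''): array
{
    $found = scandir($dir);
    if ($found === false) {
        return []; // unreadable directory: contribute nothing
    }
    $entries = array_diff($found, ['.', '..']);
    usort($entries, function ($a, $b) use ($dir) {
        $aIsDir = is_dir($dir . '/' . $a);
        $bIsDir = is_dir($dir . '/' . $b);
        if ($aIsDir !== $bIsDir) {
            return $aIsDir ? -1 : 1; // directories before files
        }
        return strnatcasecmp($a, $b);
    });

    $lines = [];
    foreach ($entries as $entry) {
        $lines[] = $level . $entry;
        $path = $dir . '/' . $entry;
        // Do not follow symlinks, to avoid walking in circles.
        if (is_dir($path) && !is_link($path)) {
            $lines = array_merge($lines, sortedTreeLines($path, $separator, $level . $separator));
        }
    }
    return $lines;
}

// Demo on a throwaway directory so the example is safe to run anywhere.
$root = sys_get_temp_dir() . '/tree_demo_' . uniqid();
mkdir($root . '/dir002', 0777, true);
mkdir($root . '/dir001/subdir001', 0777, true);
touch($root . '/dir002/bear.txt');
touch($root . '/dir002/apple.txt');
touch($root . '/dir001/file002.png');
touch($root . '/dir001/file001.png');
$lines = sortedTreeLines($root);
echo implode("\n", $lines), "\n";
```

Returning an array of lines rather than one joined string keeps the sort per level and leaves the final formatting to the caller.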
Don't know about SPL iterators, but for your iterator you should put the items in an array, then sort them and add them to $tree. I modified the function current but didn't test it:
function current()
{
    $tree = '';
    $treeitems = array();
    for ($l=0; $l < $this->getDepth(); $l++) {
        //NOTE: On this line I think you have an error in your original code:
        // This ? ' ' : ' ' is strange
        $treeitems[] = $this->getSubIterator($l)->hasNext() ? ' ' : ' ';
    }
    // Sort the collected items, then append them to $tree.
    sort($treeitems);
    foreach ($treeitems as $treeitem) {
        $tree .= $treeitem;
    }
    return $tree . ($this->getSubIterator($l)->hasNext() ? ' ' : ' ')
        . $this->getSubIterator($l)->__toString();
}