I have the following recursive function which works... up until a point. Then the script asks for more memory once the queries exceed about 100, and when I add more memory, the script typically just dies (I end up with a white screen on my browser).
public function returnPArray($parent=0, $depth=0, $orderBy = 'showOrder ASC'){
    $query = mysql_query("SELECT *, UNIX_TIMESTAMP(lastDate) AS whenTime
                          FROM these_pages
                          WHERE parent = '".$parent."' AND deleted = 'N' ORDER BY ".$orderBy."");
    $rows = mysql_num_rows($query);
    while($row = mysql_fetch_assoc($query)){
        // This uses my class and places the content in an array.
        MyClass::$_navArray[] = array(
            'id'     => $row['id'],
            'parent' => $row['parent']
        );
        MyClass::returnPArray($row['id'], ($depth+1));
    }
    $i++;
}
Can anyone help me make this query less resource intensive? Or find a way to free up memory between calls... somehow.
The white screen is likely because of a stack overflow. Do you have a row where the parent_id is its own id? Try adding AND id != '".(int)$parent."' to the WHERE clause to prevent that kind of bug from creeping in...
EDIT: To account for circular references, try modifying the assignment to something like:
while($row = mysql_fetch_assoc($query)){
    if (isset(MyClass::$_navArray[$row['id']])) continue;
    MyClass::$_navArray[$row['id']] = array(
        'id'     => $row['id'],
        'parent' => $row['parent']
    );
    MyClass::returnPArray($row['id'], ($depth+1));
}
Shouldn't you stop the recursion at some point? (I guess you do need to return from the method when the number of rows is 0.) From the code you posted I see endless recursive calls to returnPArray.
Let me ask you this... are you just trying to build out a tree of pages? If so, is there some point along the hierarchy that you can call an ultimate parent? I've found that when storing trees in a db, storing the ultimate parent id in addition to the immediate parent makes it much faster to get back, as you don't need any recursion or iteration against the db.
It is a bit of denormalization, but just a small bit, and it's better to denormalize than to recurse or iterate against the db.
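For example, here is a minimal sketch of that denormalized lookup, assuming a hypothetical root_id column that stores the ultimate parent of every page (the $rootId variable is likewise illustrative): the whole tree comes back in one query, with no recursion.

// hypothetical: every row stores the id of its tree's top-most ancestor in root_id
$query = mysql_query("SELECT id, parent, root_id
                      FROM these_pages
                      WHERE root_id = '".(int)$rootId."' AND deleted = 'N'
                      ORDER BY parent ASC, showOrder ASC");
while ($row = mysql_fetch_assoc($query)) {
    MyClass::$_navArray[$row['id']] = array(
        'id'     => $row['id'],
        'parent' => $row['parent']
    );
}
mysql_free_result($query);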
If your needs are more complex, it may be better to retrieve more of the tree than you need and use app code to iterate through to get just the nodes/rows you need. Most application code is far superior to any DB at iteration/recursion.
Most likely you're overloading on active query result sets. If, as you say, you're getting about 100 iterations deep into the recursion, that means you've got 100 queries/resultsets open. Even if each query only returns one row, the whole resultset is kept open until the second fetch call (which would return false). You never get back to any particular level to do that second call, so you just keep firing off new queries and opening new result sets.
If you're going for a simple breadcrumb trail, with a single result needed per tree level, then I'd suggest not doing a while() loop to iterate over the result set. Fetch the record for each particular level, then close the resultset with mysql_free_result(), THEN do the recursive call.
Otherwise, try switching to a breadth-first query method, and again, free the result set after building each tree level.
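As a rough sketch of the breadcrumb-style approach described above (fetch one row per level, free the result set, then recurse), with a hypothetical buildCrumb() method name but the question's table and columns:

public static function buildCrumb($parent = 0, $depth = 0){
    $query = mysql_query("SELECT id, parent FROM these_pages
                          WHERE parent = '".(int)$parent."' AND deleted = 'N'
                          ORDER BY showOrder ASC LIMIT 1");
    $row = mysql_fetch_assoc($query);
    mysql_free_result($query);      // close the result set BEFORE recursing
    if (!$row) {
        return;                     // no child found at this level: stop the recursion
    }
    MyClass::$_navArray[$row['id']] = array(
        'id'     => $row['id'],
        'parent' => $row['parent']
    );
    self::buildCrumb($row['id'], $depth + 1);
}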
Why are you using a recursive function? When I look at the code, it looks as though you're simply creating a table which will contain both the child and parent ID of all records. If that's what you want as a result then you don't even need recursion. A simple select, not filtering on parent_id (but probably ordering on it) will do, and you only iterate over it once.
The following will probably return the same results as your current recursive function :
public function returnPArray($orderBy = 'showOrder ASC'){
    $query = mysql_query("SELECT *, UNIX_TIMESTAMP(lastDate) AS whenTime
                          FROM these_pages
                          WHERE deleted = 'N' ORDER BY parent ASC, ".$orderBy."");
    $rows = mysql_num_rows($query);
    while($row = mysql_fetch_assoc($query)){
        // This uses my class and places the content in an array.
        MyClass::$_navArray[] = array(
            'id'     => $row['id'],
            'parent' => $row['parent']
        );
    }
}
I'd suggest getting all rows in one query and building up the tree structure in pure PHP:
$nodeList = array();
$tree = array();
$query = mysql_query("SELECT *, UNIX_TIMESTAMP(lastDate) AS whenTime
                      FROM these_pages WHERE deleted = 'N' ORDER BY ".$orderBy."");
while($row = mysql_fetch_assoc($query)){
    $nodeList[$row['id']] = array_merge($row, array('children' => array()));
}
mysql_free_result($query);

foreach ($nodeList as $nodeId => &$node) {
    if (!$node['parent'] || !array_key_exists($node['parent'], $nodeList)) {
        $tree[] = &$node;
    } else {
        $nodeList[$node['parent']]['children'][] = &$node;
    }
}
unset($node);
unset($nodeList);
Adjust as needed.
There are a few problems.
1. You already noticed the memory problem. You can set unlimited memory by using ini_set('memory_limit', -1).
2. The reason you get a white screen is that the script exceeds the max execution time and you either have display_errors turned off or error_reporting set to 0. You can set unlimited execution time by using set_time_limit(0).
3. Even with "unlimited" memory and "unlimited" time, you are still obviously constrained by the limits of your server and your own precious time. The algorithm and data model that you have selected will not scale well, and if this is meant for a production website, then you have already blown your time and memory budget.
The solution to #3 is to use a better data model which supports a more efficient algorithm.
Your function is named poorly, but I'm guessing it means to "return an array of all parents of a particular page".
If that's what you want to do, then check out Modified Pre-order Tree Traversal as a strategy for more efficient querying. This behavior is already built into some frameworks, such as Doctrine ORM, which makes it particularly easy to use.
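As a rough illustration of the nested set idea, assuming hypothetical lft/rgt columns maintained on the pages table and an illustrative $pageId variable, fetching a page's full ancestor path becomes a single non-recursive query:

// hypothetical lft/rgt columns as used by modified pre-order tree traversal
$query = mysql_query("SELECT parent.id, parent.parent
                      FROM these_pages AS node
                      JOIN these_pages AS parent
                        ON node.lft BETWEEN parent.lft AND parent.rgt
                      WHERE node.id = '".(int)$pageId."' AND parent.deleted = 'N'
                      ORDER BY parent.lft");
while ($row = mysql_fetch_assoc($query)) {
    // each row is one ancestor of $pageId, from the root down to the page itself
}
mysql_free_result($query);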
As the title suggests, I am looking for some explanation of which of these methods would be better performance-wise. I would think that #2 would be better because it uses only one loop instead of a while loop plus a for loop. Is there even any point in using method #1?
Method 1:
// assign data from each row to $users[]
while($row = mysqli_fetch_assoc($result))
{
    $users[] = $row;
}

// and later on use a for loop to access the data in $users[]
for($i = 0; $i < count($users); $i++)
{
    echo $users[$i]['name'] . "<br>";
}
Method 2:
// only using a while loop
while ($row = mysqli_fetch_assoc($result))
{
    echo $row['name'] . "<br>";
}
The second method is better if by "performs better" you mean more optimal: it's less intense on memory and CPU. The first method makes your script save all the needed data from the database into an array, so you need some memory to "cache" it, and it also takes a little CPU time to write it out and then read it back. Of course, those aren't big numbers.
But while the second method performs better in terms of pure optimization, the first method is better if you want to use the fetched data to do something more interesting with it. If you are, for example, just printing records from the DB and then forgetting them, the extra resources are wasted.
It depends a lot on what you really want in your application. If you read about mysqli_fetch_assoc you will learn that the function already returns an associative array for each row, so an extra loop just to build one is unnecessary; when a plain one-dimensional pass over the rows is all you need, the second method is the best option, because the function already does the work for you.
The first method is preferable if you want to order your array more specifically, build something multidimensional, or turn it into an object.
PS: If your application is fully OOP, it's good practice to use the first method and put that data into a private property inside the class.
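A minimal sketch of that last suggestion, with a hypothetical UserRepository class, assuming $db is an open mysqli connection and a users table with a name column:

class UserRepository
{
    // rows fetched from the database are cached here (method 1)
    private $users = array();

    public function load(mysqli $db)
    {
        $result = $db->query("SELECT name FROM users");
        while ($row = $result->fetch_assoc()) {
            $this->users[] = $row;
        }
        $result->free();
    }

    public function printNames()
    {
        foreach ($this->users as $user) {
            echo $user['name'] . "<br>";
        }
    }
}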
I've made a script that pretty much loads a huge array of objects from a MySQL database, and then loads a huge (but smaller) list of objects from the same MySQL database.
I want to iterate over each list to check for irregular behaviour, using PHP. But every time I run the script it takes forever to execute (so far I haven't seen it complete). Are there any optimizations I can make so it doesn't take this long to execute? There are roughly 64,150 entries in the first list, and about 1,748 entries in the second list.
This is what the code generally looks like in pseudo code.
// an array of size 64000 containing objects in the form of {"id": 1, "unique_id": "kqiweyu21a)_"}
$items_list = [];

// an array of size 5000 containing objects in the form of {"inventory": "a long string that might have the unique_id", "name": "SomeName", "id": 1}
$user_list = [];
Up until this point the results are instant... But when I do this it takes forever to execute, seems like it never ends...
foreach($items_list as $item)
{
    foreach($user_list as $user)
    {
        if(strpos($user["inventory"], $item["unique_id"]) !== false)
        {
            echo("Found a version of the item");
        }
    }
}
Note that the echo should rarely happen... The issue isn't with MySQL, as the $items_list and $user_list arrays populate almost instantly; it only starts to take forever when I try to iterate over the lists...
With over 100 million iterations (64,150 × 1,748), adding a break will help somewhat, even though the match rarely happens...
foreach($items_list as $item)
{
    foreach($user_list as $user)
    {
        if(strpos($user["inventory"], $item["unique_id"]) !== false){
            echo("Found a version of the item");
            break;   // stop scanning users once a match is found for this item
        }
    }
}
Alternative solution 1, with PHP 5.6: you could also use pthreads and split your big array into chunks to pool them across threads... combined with the break, this will certainly improve it.
Alternative solution 2: use PHP 7; the performance improvements around array manipulation and loops are BIG.
Also try to sort your arrays before the loop. It depends on what you are looking for, but very often sorting the arrays beforehand limits the loop time as much as possible when the condition is found.
Your example is almost impossible to reproduce. You need to provide an example that can be replicated: the two loops as given, if they are only accessing an array, will complete extremely quickly, i.e. 1-2 seconds. This means that either the strings you're searching are kilobytes or larger (not provided in the question) or something else is happening, e.g. a database access or something like that, while the loops are running.
You can let SQL do the searching for you. Since you don't share the columns you need I'll only pull the ones I see.
SELECT i.unique_id, u.inventory
FROM items i, users u
WHERE LOCATE(i.unique_id, u.inventory)
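A rough sketch of running that from PHP, reusing the illustrative table and column names from the SQL above and assuming $db is a mysqli connection:

$result = $db->query("SELECT i.unique_id, u.inventory
                      FROM items i, users u
                      WHERE LOCATE(i.unique_id, u.inventory) > 0");
while ($row = $result->fetch_assoc()) {
    // every row returned is already a match, so no nested PHP loops are needed
    echo "Found a version of item " . $row['unique_id'];
}
$result->free();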
I need help finding a workaround for hitting the memory_limit. My limit is 128MB; from the database I'm getting about 80k rows, and the script stops at around 66k. Thanks for the help.
Code:
$possibilities = [];
foreach ($result as $item) {
    $domainWord = str_replace("." . $item->tld, "", $item->address);
    for ($i = 0; $i + 2 < strlen($domainWord); $i++) {
        $tri = $domainWord[$i] . $domainWord[$i + 1] . $domainWord[$i + 2];
        if (array_key_exists($tri, $possibilities)) {
            $possibilities[$tri] += 1;
        } else {
            $possibilities[$tri] = 1;
        }
    }
}
Your bottleneck, given your algorithm, is most probably not the database query but the $possibilities array you're building.
If I read your code correctly, you get a list of domain names from the database. From each of the domain names you strip off the top-level-domain at the end first.
Then you walk character-by-character from left to right of the resulting string and collect triplets of the characters from that string, like this:
example.com => ['exa', 'xam', 'amp', 'mpl', 'ple']
You store those triplets in the keys of the array, which is a nice idea, and you also count them, which doesn't have much effect on the memory consumption. However, my guess is that the sheer number of possible triplets (for 26 letters and 10 digits that is 36^3 = 46,656 possibilities, each taking at least 3 bytes just for the key inside the array, plus whatever overhead PHP adds around it) takes quite a lot of your memory limit.
Probably someone will tell you how PHP uses memory with its database cursors; I don't know, but you can use one trick to profile your memory consumption.
Put calls to memory_get_usage():
- before and after each iteration, so you'll know how much memory was used on each cursor advancement,
- before and after each addition to $possibilities.
And just print them right away, so you'll be able to run your code and see in real time what uses your memory and how seriously.
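A minimal sketch of that kind of instrumentation, with the variable names simply mirroring the question's code:

$possibilities = [];
$before = memory_get_usage();
foreach ($result as $item) {
    $domainWord = str_replace("." . $item->tld, "", $item->address);
    for ($i = 0; $i + 2 < strlen($domainWord); $i++) {
        $tri = $domainWord[$i] . $domainWord[$i + 1] . $domainWord[$i + 2];
        if (isset($possibilities[$tri])) {
            $possibilities[$tri] += 1;
        } else {
            $possibilities[$tri] = 1;
        }
    }
    // how much memory this single row added, and the running total
    echo (memory_get_usage() - $before) . " bytes for this row, " . memory_get_usage() . " total\n";
    $before = memory_get_usage();
}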
Also, try to unset the $item after each iteration. It may actually help.
Knowledge of the specific database access library you are using to obtain the $result iterator would help immensely.
Given the tiny (pretty useless) code snippet you've provided I want to provide you with a MySQL answer, but I'm not certain you're using MySQL?
But:
- Optimise your table.
- Use EXPLAIN to optimise your query. Rewrite your query to put as much of the logic as possible in the query rather than in the PHP code.
Edit: if you're using MySQL, then prepend EXPLAIN before your SELECT keyword and the result will show you an explanation of how the query you give MySQL actually turns into results.
Do not call PHP's strlen() function on every pass of the loop, as this is inefficient - instead you can test by treating the string as an array of character values, thus:
for ($i = 0; !empty($domainWord[$i+2]); $i++) {
In your MySQL (if that's what you're using), add a LIMIT clause that will break the query into 3 or 4 chunks, say of 25k rows per chunk, which will fit comfortably into your observed operating capacity of 66k rows. Burki had this good idea.
At the end of each chunk, clean up all the strings and restart, wrapped in a loop:
$z = 0;
while ($z < 4){
    // grab the next chunk of data from the database; preserve only your output
    $z++;
}
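A rough sketch of that chunked loop, with hypothetical table and column names (domains, address, tld) and assuming $db is a mysqli connection, just to make the idea concrete:

$possibilities = [];
$chunkSize = 25000;

for ($chunk = 0; $chunk < 4; $chunk++) {
    $offset = $chunk * $chunkSize;
    $result = $db->query("SELECT address, tld FROM domains
                          LIMIT " . $offset . ", " . $chunkSize);
    while ($row = $result->fetch_assoc()) {
        $domainWord = str_replace("." . $row['tld'], "", $row['address']);
        for ($i = 0; $i + 2 < strlen($domainWord); $i++) {
            $tri = $domainWord[$i] . $domainWord[$i + 1] . $domainWord[$i + 2];
            $possibilities[$tri] = isset($possibilities[$tri]) ? $possibilities[$tri] + 1 : 1;
        }
    }
    $result->free();   // release this chunk's rows before grabbing the next one
}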
But probably more important than any of these is to provide enough detail in your question!
- What is the data you want to get?
- What are you storing your data in?
- What are the criteria for finding the data?
These answers will help people far more knowledgeable than me to show you how to properly optimise your database.
I'm trying to program a utility that handles a large volume of data, and memory is a factor. Unfortunately, each time this set of loops runs it eats approx. 14MB of memory, because it is executed thousands of times, even with the unset() calls (and yes, I'm aware they do not clean up memory entirely, which is partly why I'm asking the question). I'm wondering if there is an easier way to do this. Current working code:
$qr = array();
foreach($XML->row as $row)
{
    $ra = array();
    foreach($row as $key => $value)
    {
        $ra[$key] = $value[0];
        unset($key, $value);
    }
    $qr[] = $ra;
    unset($row, $ra);
}
unset($XML);
return $qr;
Another attempt was to do this, but it lags out. Anybody know what I'm doing wrong?
$qr = array();
while(list(, $row) = each($XML->row))
{
    $ra = array();
    while(list($key, $value) = each($row))
    {
        $ra[$key] = $value[0];
        unset($key, $value);
    }
    $qr[] = $ra;
    unset($row, $ra);
}
unset($XML);
return $qr;
Basically in the first loop, I'm just trying to do a basic array/object iteration. In the 2nd loop, I'm trying to go through each array value and get the 1st element while maintaining object/array index association. It seems I originally wrote it like this due to it being the only thing that worked (because it's looping through a SimpleXML Object). Any tips on speeding this thing up or figuring out how to make it not eat memory would be appreciated.
I'm looking for solutions for garbage collection or more efficient code. I do not plan on replacing SimpleXML as there is no need for it. More clearly, I'm looking for:
- A way to iterate the SimpleXML object without needing the inner loop (which is only there because I'm doing $value[0]; why is that necessary?)
- A way which is more efficient (either speed- or memory-wise) for iterating through the data
If you want to use less memory, I recommend you start looking at a SAX parser. Here is an example. It is more difficult to develop a parser with SAX, but it's more efficient than SimpleXML, and you can parse big XML files with it.
Your memory load is high because SimpleXML loads the entire document into memory when parsing. So your unset() calls just decrement the reference count, and because the data still persists in memory it isn't freed. This is a consequence of working with SimpleXML: the benefit of which is that the document is in memory and represented as a PHP object.
If you want to reduce your memory usage, you need to use something else like XMLReader or XML Parser. These are SAX-based, or event-based, which means they won't load the XML file into memory, but will walk the tree one element at a time. Since you don't appear to be using something like XPath, this is your better choice.
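Here is a minimal XMLReader sketch, assuming the elements are named row (mirroring $XML->row) and using a hypothetical data.xml file name; it walks the document one element at a time instead of loading it all:

$reader = new XMLReader();
$reader->open('data.xml');                      // hypothetical file name

$qr = array();
while ($reader->read()) {
    if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'row') {
        // parse only the current <row> element, not the whole document
        $row = simplexml_load_string($reader->readOuterXml());
        $ra = array();
        foreach ($row as $key => $value) {
            $ra[$key] = (string)$value;         // cast instead of $value[0]
        }
        $qr[] = $ra;
    }
}
$reader->close();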
That's not how you access data from a SimpleXML object. I see you are using index [0] to get the string contents of each part of the object and treating it as an array. It's not an array, it's an object. This is how you should access string data... Example: http://php.net/manual/en/simplexml.examples-basic.php#example-5095
Something like this will do the trick:
$qr = array();
foreach($XML->row as $row)
{
    $ra = array();
    $ra['name'] = (string)$row->name;
    $ra['name2'] = (string)$row->name2;
    // Add a line for each element name, etc...
    $qr[] = $ra;
    unset($row, $ra);
}
unset($XML);
return $qr;
It will also get rid of your inner loop and save you memory.
I have code in php such as the following:
while($r = mysql_fetch_array($q))
{
    // Do some stuff
}
where $q is a query retrieving a set of group members. However, certain groups have their members saved in memcached, and that memcached value is stored in an array as $mem_entry. To run through that, I'd normally do the following:
foreach($mem_entry as $k => $r)
{
    // Do some stuff
}
Here's the problem. I don't want to have two blocks of identical code (the //do some stuff section) nested in two different loops just because in one case I have to use mysql for the loop and the other memcached. Is there some way to toggle starting off the loop with the while or foreach? In other words, if $mem_entry has a non-blank value, the first line of the loop will be foreach($mem_entry as $k => $r), or if it's empty, the first line of the loop will be while($r = mysql_fetch_array($q))
Edit
Well, pretty much a few seconds after I wrote this I ended up coming up with the solution. Figured I'd leave this up for anyone else that might come upon this problem. I first set the value of $members to the memcached value. If that's blank, I run the MySQL query and use a while loop to transfer all the records to an array called $members. I then initiate the loop using foreach($members as $k => $r). Basically, I'm using a foreach loop every time, but the value of $members is set differently based on whether or not a value for it exists in memcached.
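Roughly, that approach looks like this sketch (variable names taken from the question):

$members = $mem_entry;                     // whatever came back from memcached

if (empty($members)) {
    // nothing cached for this group, so fall back to MySQL
    $members = array();
    while ($r = mysql_fetch_array($q)) {
        $members[] = $r;
    }
}

foreach ($members as $k => $r) {
    // Do some stuff
}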
Why not just refactor out doSomeStuff() as a function which gets called from within each loop? Yes, you'll need to see if this results in a performance hit, but unless that's significant, this is a simple approach to avoiding code repetition.
If there's a way to toggle as you suggest, I don't know of it.
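For illustration, a sketch of that refactoring; doSomeStuff() is just a placeholder name, as in the answer:

function doSomeStuff($r)
{
    // Do some stuff with one member record
}

if (!empty($mem_entry)) {
    foreach ($mem_entry as $k => $r) {
        doSomeStuff($r);
    }
} else {
    while ($r = mysql_fetch_array($q)) {
        doSomeStuff($r);
    }
}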
Not the ideal solution, but I will give you my 2 cents. The ideal would have been to call a function, but if you don't want to do that then you can try something like this:
if(!isset($mem_entry)){
    $mem_entry = array();
    while($r = mysql_fetch_array($q))
    {
        $mem_entry[] = $r;
    }
}
The idea is to use just the foreach loop to do the actual work: if there is nothing in memcached, fill your $mem_entry array with the rows from MySQL and then feed it to your foreach loop.