PHP array optimization for 80k rows - php

I need help finding a workaround for hitting the memory_limit. My limit is 128 MB; the database query returns about 80k rows, and the script dies at around 66k. Thanks for any help.
Code:
$possibilities = [];
foreach ($result as $item) {
    $domainWord = str_replace("." . $item->tld, "", $item->address);
    for ($i = 0; $i + 2 < strlen($domainWord); $i++) {
        $tri = $domainWord[$i] . $domainWord[$i + 1] . $domainWord[$i + 2];
        if (array_key_exists($tri, $possibilities)) {
            $possibilities[$tri] += 1;
        } else {
            $possibilities[$tri] = 1;
        }
    }
}

Your bottleneck, given your algorithm, is most likely not the database query but the $possibilities array you're building.
If I read your code correctly, you get a list of domain names from the database. From each of the domain names you strip off the top-level-domain at the end first.
Then you walk character-by-character from left to right of the resulting string and collect triplets of the characters from that string, like this:
example.com => ['exa', 'xam', 'amp', 'mpl', 'ple']
You store those triplets as keys of the array, which is a nice idea, and you also count them, which doesn't add to the memory consumption. However, my guess is that the sheer number of possible triplets, which for 26 letters and 10 digits is 36^3 = 46656, eats a fair bite out of your memory limit: each entry needs at least 3 bytes just for the key itself, plus PHP's per-element bookkeeping overhead, which is considerably larger.
Probably someone will tell you how PHP uses memory with its database cursors; I don't know, but there is a trick you can use to profile your memory consumption.
Put calls to memory_get_usage():
- before and after each iteration, so you know how much memory each cursor advancement costs,
- before and after each addition to $possibilities.
And just print the values right away, so you can run your code and watch in real time what uses your memory, and how seriously.
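As a sketch of that profiling (the two sample rows below are illustrative stand-ins for the database result set, not from the question):

```php
<?php
// Profiling sketch: print memory usage after every row so you can see
// which step eats the memory. Sample rows simulate the database result.
$result = [
    (object) ['address' => 'example.com', 'tld' => 'com'],
    (object) ['address' => 'stackoverflow.com', 'tld' => 'com'],
];

$possibilities = [];
$before = memory_get_usage();

foreach ($result as $i => $item) {
    $domainWord = str_replace('.' . $item->tld, '', $item->address);
    for ($j = 0; $j + 2 < strlen($domainWord); $j++) {
        $tri = substr($domainWord, $j, 3);
        $possibilities[$tri] = ($possibilities[$tri] ?? 0) + 1;
    }
    // cost of this cursor advancement
    $after = memory_get_usage();
    printf("row %d: +%d bytes, %d bytes total\n", $i, $after - $before, $after);
    $before = $after;
    unset($item); // release the row, as suggested below
}
```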
Also, try to unset($item) at the end of each iteration; it may actually help.
Knowing which database access library you use to obtain the $result iterator would help immensely.

Given the tiny (pretty useless) code snippet you've provided, I'd like to give you a MySQL answer, but I'm not certain you're using MySQL. Still:
- Optimise your table.
- Use EXPLAIN to optimise your query, and move as much of the logic as possible into the query rather than the PHP code.
edit: if you're using MySQL, prepend EXPLAIN before your SELECT keyword and the result will show you an explanation of how MySQL actually turns your query into results.
- Avoid calling strlen() in the loop condition, where it runs on every iteration; cache the length in a variable, or test the offset directly by treating the string as an array of characters. Use isset() rather than empty() for this, since a literal '0' character would make empty() return true and end the loop early:
for ($i = 0; isset($domainWord[$i + 2]); $i++) {
In your MySQL (if that's what you're using), add a LIMIT clause to break the query into 3 or 4 chunks of, say, 25k rows each, which will fit comfortably under the 66k rows where you currently run out of memory. Burki had this good idea.
At the end of each chunk, free the strings you no longer need, then loop round for the next chunk:
$z = 0;
while ($z < 4) {
    // grab a chunk of data from the database; preserve only your output
    $z++;
}
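A fuller sketch of the chunked approach. The fetchChunk() helper, the table name domains, and the sample addresses are all assumptions for illustration; in real code fetchChunk() would run SELECT address, tld FROM domains LIMIT offset, size against your database:

```php
<?php
// Chunking sketch with a stand-in data source, so only one chunk of rows
// is held in memory at a time; only the counts survive between chunks.
function fetchChunk(int $offset, int $size): array
{
    static $all = null; // simulates the full table
    if ($all === null) {
        $all = [];
        foreach (['example.com', 'stackoverflow.com', 'php.net'] as $addr) {
            $all[] = (object) ['address' => $addr, 'tld' => substr(strrchr($addr, '.'), 1)];
        }
    }
    return array_slice($all, $offset, $size);
}

$possibilities = [];
$chunkSize = 2; // 25000 in the real script

for ($offset = 0; ; $offset += $chunkSize) {
    $rows = fetchChunk($offset, $chunkSize);
    if (!$rows) {
        break; // no more chunks
    }
    foreach ($rows as $item) {
        $domainWord = str_replace('.' . $item->tld, '', $item->address);
        for ($i = 0; $i + 2 < strlen($domainWord); $i++) {
            $tri = substr($domainWord, $i, 3);
            $possibilities[$tri] = ($possibilities[$tri] ?? 0) + 1;
        }
    }
    unset($rows); // drop the chunk before fetching the next one
}
```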
But probably more important than any of these: provide enough detail in your question!
- What is the data you want to get?
- What are you storing your data in?
- What are the criteria for finding the data?
These answers will help people far more knowledgeable than me show you how to properly optimise your database.

Related

array_count_values used with array_filter reaching allowed memory size

I need advice on how to rewrite a piece of PHP code so that it uses memory more efficiently.
Code description: I am fetching comma-separated values (like a,b,c,d) from MySQL and appending each of them to a new comma-separated string in a while loop. After that I use array_count_values(array_filter(explode(',', $string))) to get the number of times each distinct value occurs in the string.
The code itself works. However, for massive arrays PHP returns Fatal error: Allowed memory size of 524288000 bytes exhausted, and I need to find a solution.
Note: I would like to avoid raising memory_limit in php.ini or via ini_set(). What I am looking for is a solution that gives the same result without running into memory issues. I was thinking I could record the value counts inside the loop, but somehow I cannot find the right way to do it.
Any suggestions?
$options = '';
while (!$rs_results->EOF) {
    $options .= $rs_results->fields['options'] . ",";
    $rs_results->MoveNext();
}
$arrOptions = array_count_values(array_filter(explode(',', $options)));
Why not just count while you go?
$options = [];
while (!$rs_results->EOF) {
    if ($rs_results->fields['options']) {
        foreach (explode(',', $rs_results->fields['options']) as $value) {
            if (!isset($options[$value])) $options[$value] = 0;
            ++$options[$value];
        }
    }
    $rs_results->MoveNext(); // don't forget to advance, or the loop never ends
}
Obligatory observation that storing data this way is generally a bad idea. If you had a normalised structure you could just COUNT them in MySQL.
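For anyone doubting the two approaches are equivalent, here is a self-contained sketch (the rows are simulated in a plain array) showing that counting as you go produces exactly the same result as array_count_values(), without building the giant intermediate string:

```php
<?php
// Simulated rows as they would come from MySQL: each 'options' field is a
// comma-separated list of values.
$rows = [['options' => 'a,b,c'], ['options' => 'b,c,d'], ['options' => ''], ['options' => 'c']];

// Memory-hungry original: build one big string, then explode and count it.
$string = '';
foreach ($rows as $row) {
    $string .= $row['options'] . ',';
}
$viaString = array_count_values(array_filter(explode(',', $string)));

// Memory-friendly version: count each value as it streams past.
$options = [];
foreach ($rows as $row) {
    if ($row['options']) {
        foreach (explode(',', $row['options']) as $value) {
            if (!isset($options[$value])) {
                $options[$value] = 0;
            }
            ++$options[$value];
        }
    }
}
// Both produce ['a' => 1, 'b' => 2, 'c' => 3, 'd' => 1].
```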

How do I efficiently run a PHP script that doesn't take forever to execute in a WAMP environment?

I've made a script that loads a huge array of objects from a MySQL database, and then loads a huge (but smaller) list of objects from the same database.
I want to iterate over each list to check for irregular behaviour, using PHP. But every time I run the script it takes forever to execute (so far I haven't seen it complete). Are there any optimizations I can make so it doesn't take this long? There are roughly 64150 entries in the first list, and about 1748 entries in the second list.
This is what the code generally looks like in pseudo code.
// an array of size 64150 containing objects in the form {"id": 1, "unique_id": "kqiweyu21a)_"}
$items_list = [];
// an array of size 1748 containing objects in the form {"inventory": "a long string that might have the unique_id", "name": "SomeName", "id": 1}
$user_list = [];
Up until this point the results are instant... But when I do this it takes forever to execute, seems like it never ends...
foreach ($items_list as $item)
{
    foreach ($user_list as $user)
    {
        if (strpos($user["inventory"], $item["unique_id"]) !== false)
        {
            echo("Found a version of the item");
        }
    }
}
Note that the echo should rarely happen. The issue isn't with MySQL, as $items_list and $user_list populate almost instantly; it only starts to take forever when I iterate over the lists.
With over 100 million iterations (64150 × 1748 ≈ 112M), adding a break will help somewhat, even though the match rarely happens...
foreach ($items_list as $item)
{
    foreach ($user_list as $user)
    {
        if (strpos($user["inventory"], $item["unique_id"]) !== false) {
            echo("Found a version of the item");
            break;
        }
    }
}
Alternative solution 1, with PHP 5.6: you could use pthreads and split your big array into chunks to pool them across threads; combined with the break, this should improve things further.
Alternative solution 2: use PHP 7; the performance improvements for array manipulation and loops are big.
Also try sorting your arrays before the loop. It depends on what you are looking for, but very often sorting first shortens the loop considerably when the condition is found early.
Your example is almost impossible to reproduce as stated. You need to provide an example that can be replicated: the two loops as given, if they only access arrays, complete extremely quickly (1-2 seconds). That means either the strings you're searching are kilobytes or larger (not stated in the question), or something else is happening while the loops run, e.g. database access.
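If the inventory string is actually a delimited list of ids (an assumption; the question doesn't say), you can drop the quadratic scan entirely: index every token from every inventory once, then check each item with an O(1) hash lookup. Sample data below stands in for the database rows:

```php
<?php
// Sketch under the assumption that "inventory" is a comma-separated list
// of unique_ids. Sample data simulates the two lists from the question.
$items_list = [
    ['id' => 1, 'unique_id' => 'kqiweyu21a'],
    ['id' => 2, 'unique_id' => 'zzz_not_there'],
];
$user_list = [
    ['id' => 7, 'name' => 'SomeName', 'inventory' => 'abc123,kqiweyu21a,def456'],
];

// Build a set of every token seen in any inventory: O(total tokens).
$seen = [];
foreach ($user_list as $user) {
    foreach (explode(',', $user['inventory']) as $token) {
        $seen[$token] = true;
    }
}

// Each item is now a single hash lookup instead of a scan over all users.
$found = [];
foreach ($items_list as $item) {
    if (isset($seen[$item['unique_id']])) {
        $found[] = $item['unique_id'];
    }
}
```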
You can let SQL do the searching for you. Since you don't share the columns you need I'll only pull the ones I see.
SELECT i.unique_id, u.inventory
FROM items i, users u
WHERE LOCATE(i.unique_id, u.inventory)

How do I pre-allocate memory for an array in PHP?

How do I pre-allocate memory for an array in PHP? I want to pre-allocate space for 351k longs. The function works when I don't use the array, but if I try to save long values in the array, it fails. If I try a simple test loop to fill 351k values with range(), it works. I suspect the array is causing memory fragmentation and then running out of memory.
In Java, I can use ArrayList al = new ArrayList(351000);.
I saw array_fill and array_pad but those initialize the array to specific values.
Solution:
I used a combination of answers. Kevin's answer worked alone, but I was hoping to prevent problems in the future too as the size grows.
ini_set('memory_limit', '512M');
$foundAdIds = new \SplFixedArray(100000); // google doesn't return deleted ads; must keep track and assume everything else was deleted
$foundAdIdsIndex = 0;
// $foundAdIds = array();
$result = $gaw->getAds(function ($googleAd) use ($adTemplates, &$foundAdIds, &$foundAdIdsIndex) { // use a callback to avoid keeping everything in memory
    if ($foundAdIdsIndex >= $foundAdIds->count()) {
        $foundAdIds->setSize((int) ($foundAdIds->count() * 1.10)); // grow the array by 10%
    }
    $foundAdIds[$foundAdIdsIndex++] = $googleAd->ad->id; // save ids so we know which ones not to mark deleted
    // $foundAdIds[] = $googleAd->ad->id;
});
PHP has a fixed-size array class, SplFixedArray:
$array = new SplFixedArray(3);
$array[1] = 'test1';
$array[0] = 'test2';
$array[2] = 'test3';
foreach ($array as $k => $v) {
    echo "$k => $v\n";
}
$array[] = 'fails'; // throws a RuntimeException: SplFixedArray does not support appending
gives
0 => test2
1 => test1
2 => test3
As other people have pointed out, you can't do this in PHP (well, you can create an array of fixed length, but that's not really what you need). What you can do, however, is increase the amount of memory available to the process.
ini_set('memory_limit', '1024M');
Put that at the top of your PHP script and you should be OK. You can also set this in the php.ini file. This does not allocate 1 GB of memory to PHP up front; rather, it allows PHP to expand its memory usage up to that point.
A couple of things to point out though:
This might not be allowed on some shared hosts
If you're using this much memory, you might need to have a look at how you're doing things and see if they can be done more efficiently
Look out for opportunities to clear out unneeded resources (do you really need to keep hold of $x that contains a huge object you've already used?) using unset($x);
The quick answer is: you can't.
PHP is quite different from Java.
You can create an array with specific values, as you said, but then you already know those values. You can 'fake' pre-allocation by filling it with null values, but that amounts to the same thing, to be honest.
So unless you want to create one with array_fill and null (which feels like a hack to me), you just can't.
(You might want to check your reasoning about the memory. Are you sure this isn't an XY problem? Since memory is limited by a single number (maximum usage), I don't think fragmentation has much effect. Check what is actually taking your memory rather than going down this road.)
The closest you will get is using SplFixedArray. It doesn't preallocate the memory needed to store the values (because you can't pre-specify the type of values used), but it preallocates the array slots and doesn't need to resize the array itself as you add values.
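To see the difference for yourself, here is a small measurement sketch; the absolute byte counts vary by PHP version and platform, so treat the printed numbers as indicative only:

```php
<?php
// Compare memory used by 100k integer slots in a plain array vs SplFixedArray.
$n = 100000;

$before = memory_get_usage();
$plain = [];
for ($i = 0; $i < $n; $i++) {
    $plain[$i] = $i;
}
$plainBytes = memory_get_usage() - $before;

$before = memory_get_usage();
$fixed = new SplFixedArray($n);
for ($i = 0; $i < $n; $i++) {
    $fixed[$i] = $i;
}
$fixedBytes = memory_get_usage() - $before;

printf("plain array: %d bytes, SplFixedArray: %d bytes\n", $plainBytes, $fixedBytes);
```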

Need a code snippet for backward paging

Hi guys, I'm in a bit of a fix here. I know how easy it is to build simple pagination links for dynamic pages, whereby you can navigate between partial sets of records from SQL queries. However, my situation is as below:
Consider that I wish to paginate between records listed in a flat file. I have no problem with the retrieval, nor with the pagination itself, assuming the flat file is a CSV file with the first field as an id and new records on new lines.
However, I need a pagination system which paginates backwards, i.e. I want the LAST entry in the file to appear first, and so forth. Since I don't have the power of SQL to help me here I'm kind of stuck; all I have is a fixed sequence which needs to be paginated. Also note that the id in the first field is not necessarily numeric, so forget about numeric sorting here.
I basically need a way to loop through the file backwards and paginate it as such.
How can I do that in PHP? I just need the code to loop through and paginate, i.e. how to tell which is the offset, which is the current page, etc.
I'm assuming you have a well-formed document with delimiters.
$array = explode("<>", $source); //parse data into an array
$backward = array_reverse($array); //entire array is reversed - last elements are now first
Use this code as a jumping-off point.
$records = file('filedata.csv');
$recordsInOrder = array_reverse($records);
$first = 5;
$last = 10;
for ($x = $first; $x <= $last; $x++) {
    $viewTheseResults[] = $recordsInOrder[$x];
}
You can use an offset to determine the starting and ending keys in the array similar to how you would if you were pulling the data from a database.
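Putting that together, a page-number-driven sketch; the function name, the page size of 10, and the stand-in record list are assumptions for illustration (in the real script $records would come from file('filedata.csv') and $page from a request parameter):

```php
<?php
// Backward pagination sketch: the last line of the file comes first.
function paginateReversed(array $records, int $page, int $perPage = 10): array
{
    $reversed = array_reverse($records);  // last record first
    $offset = ($page - 1) * $perPage;     // page 1 => offset 0
    return array_slice($reversed, $offset, $perPage);
}

$records = range(1, 25);                        // stand-in for the file lines
$totalPages = (int) ceil(count($records) / 10); // 3 pages for 25 records

$pageOne = paginateReversed($records, 1);   // newest 10 records: 25..16
$pageThree = paginateReversed($records, 3); // oldest remainder: 5..1
```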

PHP Change Array Over and Over

I have an array
$num_list = array(42=>'0',44=>'0',46=>'0',48=>'0',50=>'0',52=>'0',54=>'0',56=>'0',58=>'0',60=>'0');
and I want to change specific values as I go through a loop
while (list($pq, $oin) = mysql_fetch_row($result2)) {
    $num_list[$oin] = $pq;
}
So I want to change, for example, the value at key 58 to 403 rather than 0.
However, I always end up with just the last change and none of the earlier ones, so it always comes out as something like
0,0,0,0,0,0,0,0,0,403
rather than
14,19,0,24,603,249,0,0,0,403
How can I do this so it doesn't overwrite the earlier values?
Thanks
Well, you explicitly coded that each entry should be replaced with the value from the database (even when it is "0").
You could replace values with non-zero values only:
while (list($pq, $oin) = mysql_fetch_row($result2)) {
    if ($pq !== "0") $num_list[$oin] = $pq;
}
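A self-contained illustration of that fill-in behaviour, with the ($pq, $oin) pairs that mysql_fetch_row() would return simulated as a plain array:

```php
<?php
// Simulated ($pq, $oin) result rows: value 14 for key 42, 19 for 44, etc.
$rows = [['14', 42], ['19', 44], ['0', 46], ['24', 48], ['403', 58]];

$num_list = array(42 => '0', 44 => '0', 46 => '0', 48 => '0', 50 => '0',
                  52 => '0', 54 => '0', 56 => '0', 58 => '0', 60 => '0');

foreach ($rows as [$pq, $oin]) {
    if ($pq !== "0") {      // skip zero values so they don't clobber anything
        $num_list[$oin] = $pq;
    }
}
// $num_list now holds '14' at 42, '19' at 44, '403' at 58, and '0' elsewhere.
```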
I may not understand you fully; I thought you were asking only for one specific key. Check this:
while (list($pq, $oin) = mysql_fetch_row($result2)) {
    if ($oin == 58) {
        $num_list[$oin] = $pq;
    }
}
In my simulated tests (although you are very scarce with information), your code works well and produces the result you want. Check the second query parameter that you put into the array, namely $pq; that's where you should be getting 0,0,0,0,0...403. Another possibility is that your $oin numbers are not present among the $num_list keys.
I tested your code with the mysqli driver, but the resource extraction with fetch_row is the same.
Bear in mind one more thing: if your query returns more records than $num_list has elements and the $oin numbers are not unique, your $num_list may easily be overwritten by subsequent rows, and it may also pick up a lot of unwanted extra elements.
Always try to provide the wider context of your problem; there can be many ways to solve it, and help will arrive sooner.
