Memcache alternatives with more control - PHP

My new PHP application could be sped up with some caching of MySQL results.
I have limited experience with memcached, but I don't think it can do what I require.
As I am working on a multi-user application I would like to be able to delete several stored values at once without removing everything.
So I might store:
account_1.value_a = foo
account_1.value_b = bar
account_2.value_a = dog
account_2.value_b = cat
Is there a caching system that would allow me to delete based on a wildcard (or similar method) such as "delete account_1.*" leaving me with:
account_1.value_a = <unset>
account_1.value_b = <unset>
account_2.value_a = dog
account_2.value_b = cat
Thanks,
Jim

Not really, but you can fake it by using version numbers in your keys.
For example, if you use keys like this:
{entitykey}.{version}.{fieldname}
So now your account_1 object keys would be:
account_1.1.value_a
account_1.1.value_b
When you want to remove account_1 from the cache, just increment the version number for that object. Now your keys will be:
account_1.2.value_a
account_1.2.value_b
You don't even need to delete the original cached values - they will fall out of the cache automatically since you'll no longer be using them.
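As a minimal sketch of this versioning approach with the Memcached extension (the server address and the entityVersion / versionedKey helpers are only illustrative, not part of any library):
$memcached = new Memcached();
$memcached->addServer('127.0.0.1', 11211);

// Fetch the current version for an entity, initialising it to 1 on first use.
function entityVersion(Memcached $m, $entityKey)
{
    $version = $m->get("version.$entityKey");
    if ($version === false) {
        $version = 1;
        $m->set("version.$entityKey", $version);
    }
    return (int) $version;
}

// Build a key of the form {entitykey}.{version}.{fieldname}.
function versionedKey(Memcached $m, $entityKey, $field)
{
    return $entityKey . '.' . entityVersion($m, $entityKey) . '.' . $field;
}

// Store values for account_1 under the current version.
$memcached->set(versionedKey($memcached, 'account_1', 'value_a'), 'foo');
$memcached->set(versionedKey($memcached, 'account_1', 'value_b'), 'bar');

// "delete account_1.*": bump the version so the old keys are never read again
// and simply fall out of the cache on their own.
$memcached->set('version.account_1', entityVersion($memcached, 'account_1') + 1);

// This now misses, because the key contains the new version number.
var_dump($memcached->get(versionedKey($memcached, 'account_1', 'value_a')));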

This might help: memcache and wildcards

An open-source module to get tags for keys in memcache, and more:
http://github.com/jamm/memory/

Scache (http://scache.nanona.fi) has nested keyspaces, so you could store data on subkeys and expire the parent when needed.

Memcached delete by tag can be done like this. Searching and deleting 100,000 keys is quite fast, but performance should be monitored on much larger caches.
Before PHP 8.0:
$tag = "account_1";
$cached_keys = $this->memcached->getAllKeys();
foreach ($cached_keys as $key) {
    if (substr($key, 0, strlen($tag)) === $tag) {
        $this->memcached->delete($key);
    }
}
PHP 8.0 and later:
$tag = "account_1";
$cached_keys = $this->memcached->getAllKeys();
foreach ($cached_keys as $key) {
    if (str_starts_with($key, $tag)) {
        $this->memcached->delete($key);
    }
}

Related

Caching SQL Lookups & Retrieving Data

I'm trying to set up caching of postcode lookups, which appends each resulting lookup to a text file using the following:
file_put_contents($cache_file, $postcode."\t".$result."\n", FILE_APPEND);
I'd like to be able to check this file before running a query, which I have done using this:
if (strpos(file_get_contents($cache_file), $postcode) !== false) {
    // Run function
}
What I'd like to do, is search for the $postcode with in the text file (as above) and return the data one tab over ($result).
Firstly, is this possible?
Secondly, is this even a good way to cache SQL lookups?
1) Yes, it's possible. The easiest way would be to store the lookup data in an array and write/read it to and from a file with serialize / unserialize:
$lookup_codes = array(
    '10101' => 'data postcode 1 ...',
    '10102' => 'data postcode 2 ...',
    // ...
);
file_put_contents($cache_file, serialize($lookup_codes));
$lookup_codes = unserialize(file_get_contents($cache_file));
$postcode = '10101';
if (array_key_exists($postcode, $lookup_codes)) {
    // ... is available
}
2) is the far more interesting question. It really depends on your data: the structure, the amount, and so on.
In my opinion, caching adds more complexity to your application, so avoid it if possible :-)
You could try to:
Optimize your SQL query or database structure to speed up requests for postcode data. Databases are normally quite fast, and they are made for exactly this kind of use case.
I'm not sure which database you are running, but for MySQL look into SELECT optimization, or search for INDEX as a keyword; a suitable index can speed up queries considerably.
file_get_contents is really fast, but if you are changing the file often, look into other ways of caching, such as Memcached, to keep the data in memory (see the sketch below).
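As a rough sketch of the Memcached idea, assuming a memcached server on localhost and an existing lookup_postcode_from_db() function standing in for your SQL query (both are assumptions for illustration, not real parts of your code):
$memcached = new Memcached();
$memcached->addServer('127.0.0.1', 11211);   // assumed local memcached server

function cached_postcode_lookup(Memcached $m, $postcode)
{
    $key = 'postcode_' . $postcode;          // illustrative key prefix
    $result = $m->get($key);
    if ($m->getResultCode() === Memcached::RES_SUCCESS) {
        return $result;                      // cache hit
    }

    // Cache miss: fall back to your existing SQL lookup (placeholder function).
    $result = lookup_postcode_from_db($postcode);
    if ($result !== false) {
        $m->set($key, $result, 3600);        // cache the result for an hour
    }
    return $result;
}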

How do I efficiently run a PHP script so it doesn't take forever to execute in a WAMP environment?

I've made a script that loads a huge array of objects from a MySQL database and then loads a huge (but smaller) list of objects from the same database.
I want to iterate over each list to check for irregular behaviour, using PHP. But every time I run the script it takes forever to execute (so far I haven't seen it complete). Are there any optimizations I can make so it doesn't take this long? There are roughly 64,150 entries in the first list and about 1,748 entries in the second list.
This is what the code generally looks like in pseudocode:
// an array of size 64000 containing objects in the form of {"id": 1, "unique_id": "kqiweyu21a)_"}
$items_list = [];
// an array of size 5000 containing objects in the form of {"inventory": "a long string that might have the unique_id", "name": "SomeName", "id": 1}
$user_list = [];
Up until this point the results are instant... But when I do this it takes forever to execute, seems like it never ends...
foreach($items_list as $item)
{
    foreach($user_list as $user)
    {
        if(strpos($user["inventory"], $item["unique_id"]) !== false)
        {
            echo("Found a version of the item");
        }
    }
}
Note that the echo should rarely happen. The issue isn't with MySQL, as the $items_list and $user_list arrays populate almost instantly; it only starts to take forever when I try to iterate over the lists...
With roughly 112 million iterations (64,150 × 1,748), adding a break will help somewhat, even though the match rarely happens:
foreach($items_list as $item)
{
    foreach($user_list as $user)
    {
        if(strpos($user["inventory"], $item["unique_id"]) !== false){
            echo("Found a version of the item");
            break;
        }
    }
}
Alternative solution 1, with PHP 5.6: you could also use pthreads and split your big array into chunks to pool them across threads; combined with the break, this should improve things.
Alternative solution 2: use PHP 7; the performance improvements for array manipulation and loops are big.
Also try sorting your arrays before the loop. It depends on what you are looking for, but sorting beforehand very often lets the loop stop as early as possible once the condition is found.
Your example is almost impossible to reproduce. You need to provide an example that can be replicated: the two loops as given, if they only access the arrays, will complete extremely quickly (1-2 seconds). This means that either the strings you are searching are kilobytes or larger (not shown in the question), or something else, such as database access, is happening while the loops are running.
You can let SQL do the searching for you. Since you don't share the columns you need, I'll only pull the ones I see.
SELECT i.unique_id, u.inventory
FROM items i, users u
WHERE LOCATE(i.unique_id, u.inventory)

Select all key data for a specific pattern from Redis using PHP

I am using PHPRedis for this.
I need to create a script that copies all of the keys matching the pattern mobile* from one Redis host (host1) to another (host2).
I have got this working by selecting all keys from host1 with the pattern mobile*, then looping over each of these keys, using get to return the data, and finally setting the key on host2 with set:
$auKeys = $redis->keys("mobile*");
foreach ($auKeys as $key) {
    $data = $redis->get($key);
    $redis2->set($key, $data, 6000);
    echo $key;
}
The problem is this takes around 5 minutes - I need to get it down to 2-3 minutes. Is there another way to do this?
The simplest route to better SET performance is to pipeline the commands and hit the Redis server once to execute all of them, instead of one round trip per key.
https://github.com/phpredis/phpredis/issues/251
$pipeline = $redis->multi(Redis::PIPELINE);
// put result in our shared list
foreach ($items as $item) {
    $pipeline->sAdd($key, $item);
}
$ret = $pipeline->exec();
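Adapted to the question's copy loop, a rough sketch might pipeline the reads from host1 and the writes to host2 in the same way (variable names follow the question; in practice you would probably also process the keys in batches rather than pipelining them all at once):
$auKeys = $redis->keys("mobile*");

// Queue all GETs against host1 and send them in one round trip.
$pipe = $redis->multi(Redis::PIPELINE);
foreach ($auKeys as $key) {
    $pipe->get($key);
}
$values = $pipe->exec();

// Queue all SETs against host2 the same way, keeping the 6000-second TTL.
$pipe2 = $redis2->multi(Redis::PIPELINE);
foreach ($auKeys as $i => $key) {
    $pipe2->set($key, $values[$i], 6000);
}
$pipe2->exec();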
At the same time, there are also libraries out there if you are seeking a different way to translate commands to the Redis protocol.
redis bulk import using --pipe
Typically, it's best to avoid KEYS in production code. It's preferable to modify the application that's writing the keys to keep a list of keys in use, where possible, or to use the newer SCAN operation.
In this case you revealed that KEYS wasn't taking a long time (it will once you have a very large key space; will the number of keys grow over time?), so the slow performance is due to all the network round trips, one per GET. Pipelines are indeed a great way of grouping operations to avoid round trips.
In this case I suggest using MGET to fetch all the values in one network operation and MSET to write them back in one network operation, as in the sketch below.
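A rough sketch of that, combined with the SCAN suggestion above (the batch size of 1000 is arbitrary; note that MSET does not apply a TTL, so the 6000-second expiry from the question would need a separate EXPIRE per key):
$redis->setOption(Redis::OPT_SCAN, Redis::SCAN_RETRY);
$it = NULL;
while ($keys = $redis->scan($it, 'mobile*', 1000)) {
    // One round trip to read this batch of values from host1 ...
    $values = $redis->mget($keys);
    // ... and one round trip to write them all to host2.
    $redis2->mset(array_combine($keys, $values));
}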

How do I pre-allocate memory for an array in PHP?

How do I pre-allocate memory for an array in PHP? I want to pre-allocate space for 351k longs. The function works when I don't use the array, but if I try to save long values in the array, then it fails. If I try a simple test loop to fill up 351k values with a range(), it works. I suspect that the array is causing memory fragmentation and then running out of memory.
In Java, I can use ArrayList al = new ArrayList(351000);.
I saw array_fill and array_pad but those initialize the array to specific values.
Solution:
I used a combination of answers. Kevin's answer worked alone, but I was hoping to prevent problems in the future too as the size grows.
ini_set('memory_limit', '512M');
$foundAdIds = new \SplFixedArray(100000); # google doesn't return deleted ads. must keep track and assume everything else was deleted.
$foundAdIdsIndex = 0;
// $foundAdIds = array();
$result = $gaw->getAds(function ($googleAd) use ($adTemplates, &$foundAdIds, &$foundAdIdsIndex) { // use a callback to avoid keeping everything in memory
    if ($foundAdIdsIndex >= $foundAdIds->count()) $foundAdIds->setSize((int) ($foundAdIds->count() * 1.10)); // grow the array
    $foundAdIds[$foundAdIdsIndex++] = $googleAd->ad->id; # save ids to know which to not set deleted
    // $foundAdIds[] = $googleAd->ad->id;
    // ...
});
PHP has a fixed-size array class, SplFixedArray:
$array = new SplFixedArray(3);
$array[1] = 'test1';
$array[0] = 'test2';
$array[2] = 'test3';
foreach ($array as $k => $v) {
    echo "$k => $v\n";
}
$array[] = 'fails'; // appending is not allowed and throws a RuntimeException
gives
0 => test2
1 => test1
2 => test3
As other people have pointed out, you can't do this in PHP (well, you can create an array of fixed length, but that's not really what you need). What you can do, however, is increase the amount of memory for the process.
ini_set('memory_limit', '1024M');
Put that at the top of your PHP script and you should be OK. You can also set this in the php.ini file. This does not allocate 1GB of memory to PHP, but rather allows PHP to expand its memory usage up to that point.
A couple of things to point out though:
This might not be allowed on some shared hosts
If you're using this much memory, you might need to have a look at how you're doing things and see if they can be done more efficiently
Look out for opportunities to clear out unneeded resources (do you really need to keep hold of $x that contains a huge object you've already used?) using unset($x);
The quick answer is: you can't
PHP is quite different from Java.
You can make an array with specific values as you said, but you already know about them. You can 'fake' it by filling it with null values, but that's about the same to be honest.
So unless you want to just create one with array_fill and null (which is a hack in my head), you just can't.
(You might want to check your reasoning about the memory. Are you sure this isn't an XY problem? As memory is limited by a number (maximum usage), I don't think fragmentation would have much effect. Check what is actually using your memory rather than going down this road.)
The closest you will get is using SplFixedArray. It doesn't preallocate the memory needed to store the values (because you can't pre-specify the type of values used), but it preallocates the array slots and doesn't need to resize the array itself as you add values.
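A small sketch illustrating the difference (the element count matches the question; the exact byte counts will vary by PHP version and platform):
$n = 351000;

// Regular PHP array: storage grows dynamically as elements are added.
$before = memory_get_usage();
$plain = [];
for ($i = 0; $i < $n; $i++) {
    $plain[$i] = $i;
}
echo "array:         " . (memory_get_usage() - $before) . " bytes\n";
unset($plain);

// SplFixedArray: slots for all 351k elements are allocated up front.
$before = memory_get_usage();
$fixed = new SplFixedArray($n);
for ($i = 0; $i < $n; $i++) {
    $fixed[$i] = $i;
}
echo "SplFixedArray: " . (memory_get_usage() - $before) . " bytes\n";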

Handling large datasets with PHP/Drupal

I have a report page that deals with ~700k records from a database table. I can display this on a web page using paging to break up the results. However, my export-to-PDF/CSV functions rely on processing the entire data set at once, and I'm hitting my 256MB memory limit at around 250k rows.
I don't feel comfortable increasing the memory limit, and I haven't got the ability to use MySQL's SELECT ... INTO OUTFILE to just serve a pre-generated CSV. However, I can't really see a way of serving up large data sets with Drupal using something like:
$form = array();
$table_headers = array();
$table_rows = array();
$data = db_query("a query to get the whole dataset");
while ($row = db_fetch_object($data)) {
    $table_rows[] = $row->some_attribute;
}
$form['report'] = array('#value' => theme('table', $table_headers, $table_rows));
return $form;
Is there a way of getting around what is essentially appending to a giant array of arrays? At the moment I don't see how I can offer any meaningful report pages with Drupal due to this.
Thanks
With such a large dataset, I would use Drupal's Batch API, which allows time-intensive operations to be broken into batches. It is also better for users, because it will give them a progress bar with some indication of how long the operation will take.
Start the batch operation by opening a temporary file, then append new records to it on each batch run until done. The final page can do the final processing to deliver the data as CSV or convert it to PDF. You'd probably want to add some cleanup afterwards as well.
http://api.drupal.org/api/group/batch/6
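As a rough sketch of what that could look like with the Drupal 6 Batch API (the mymodule_* callbacks, the {report_data} table, the batch size of 1000 and the temporary file name are all placeholders, not real Drupal functions):
function mymodule_report_page() {
  $batch = array(
    'title' => t('Generating report'),
    'operations' => array(array('mymodule_report_batch_step', array())),
    'finished' => 'mymodule_report_batch_finished', // placeholder: serve or convert the finished file here
  );
  batch_set($batch);
  return batch_process('mymodule/report/download');
}

function mymodule_report_batch_step(&$context) {
  if (!isset($context['sandbox']['progress'])) {
    $context['sandbox']['progress'] = 0;
    $context['sandbox']['max'] = db_result(db_query('SELECT COUNT(*) FROM {report_data}'));
    $context['sandbox']['file'] = file_directory_temp() . '/report.csv';
  }
  // Append the next slice of rows to the temporary CSV file.
  $result = db_query_range('SELECT * FROM {report_data}', $context['sandbox']['progress'], 1000);
  $fh = fopen($context['sandbox']['file'], 'a');
  while ($row = db_fetch_object($result)) {
    fputcsv($fh, (array) $row);
    $context['sandbox']['progress']++;
  }
  fclose($fh);
  // Tell the Batch API how far along we are; 1 means finished.
  $context['finished'] = empty($context['sandbox']['max'])
    ? 1
    : $context['sandbox']['progress'] / $context['sandbox']['max'];
}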
If you are generating PDF or CSV you shouldn't use the Drupal native functions. What about writing to the output file inside your while loop? This way, only one row of the result set is in memory at a given time.
At the moment you store everything in the array $table_rows.
Can't you flush at least parts of the report while you're reading it from the database (e.g. every so many lines) in order to free some of the memory? I can't see why it should only be possible to write the CSV in one go; something like the sketch below should work.
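For example, a minimal sketch of streaming the CSV row by row instead of building $table_rows (Drupal 6 style db_query/db_fetch_object as in the question; the query is the placeholder from it):
drupal_set_header('Content-Type: text/csv');
drupal_set_header('Content-Disposition: attachment; filename="report.csv"');
$out = fopen('php://output', 'w');  // stream straight to the response
$data = db_query("a query to get the whole dataset");
while ($row = db_fetch_object($data)) {
  fputcsv($out, (array) $row);      // only one row in memory at a time
}
fclose($out);
exit;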
I don't feel comfortable increasing the memory limit
Increasing the memory limit doesn't mean that every PHP process will use that amount of memory. However, you could exec the CLI version of PHP with a custom memory limit - but that's not the right solution either....
and I haven't got the ability to use MySQL's save into outfile to just serve a pre-generated CSV
Then don't save it all in an array - write each line to the output buffer as you fetch it from the database (IIRC the entire result set is buffered outside the limited PHP memory). Or write it directly to a file, then do a redirect when the file is completed and closed.
C.
You should include paging with pager_query and break the results into 50-100 rows per page. That should help a lot. You say you want to use paging, but I don't see it in the code.
Check this out: http://api.drupal.org/api/function/pager_query/6
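A rough sketch of that in Drupal 6, reusing the placeholder query and attribute from the question:
$table_rows = array();
// pager_query() adds the LIMIT/OFFSET for the current page automatically.
$data = pager_query("a query to get the whole dataset", 50);
while ($row = db_fetch_object($data)) {
  $table_rows[] = $row->some_attribute;
}
$form['report'] = array('#value' => theme('table', $table_headers, $table_rows) . theme('pager', NULL, 50));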
Another thing to keep in mind is that in PHP 5 (before 5.3), assigning an array to a new variable or passing it to a function copies the array and does not create a reference. You may be creating many copies of the same data, and if none are unset or go out of scope, they cannot be garbage collected to free up memory. Where possible, using references to perform operations on the original array can save memory:
function doSomething($arg){
    foreach($arg AS $var)
        // a new copy is created here internally: 3 copies of data exist
        $internal[] = doSomethingToValue($var);
    return $internal;
    // $arg goes out of scope and can be garbage collected: 2 copies exist
}
$var = array();
// a copy is passed to function: 2 copies of data exist
$var2 = doSomething($var);
// $var2 will be a reference to the same object in memory as $internal,
// so only 2 copies still exist
If $var is set to the return value of the function, the old value can be garbage collected, but not until after the assignment, so more memory will still be needed for a brief time.
function doSomething(&$arg){
    foreach($arg AS &$var)
        // operations are performed on original array data:
        // only two copies of an array element exist at once, not the whole array
        $var = doSomethingToValue($var);
    unset($var); // not needed here, but good practice in large functions
}
$var = array();
// a reference is passed to function: 1 copy of data exists
doSomething($var);
The way I approach such huge reports is to generate them with the PHP CLI / Java / C++ / C# (i.e. from crontab) and to use MySQL's unbuffered query option.
Once the file/report creation is done on the disk, you can give a link to it...
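A minimal sketch of that with mysqli from a CLI script (credentials, table name and output path are placeholders; MYSQLI_USE_RESULT streams rows from the server instead of buffering the whole result set in PHP memory):
<?php
// Run from the command line, e.g. via crontab.
$mysqli = new mysqli('localhost', 'user', 'pass', 'mydb'); // assumed credentials

$out = fopen('/tmp/report.csv', 'w');

// Unbuffered query: rows are fetched from the server one at a time.
$result = $mysqli->query('SELECT * FROM report_table', MYSQLI_USE_RESULT);
while ($row = $result->fetch_assoc()) {
    fputcsv($out, $row);   // only one row in memory at a time
}
$result->free();
fclose($out);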
