I'm using PDO to connect to the database in a system where I want to implement memcached.
I don't know what keys to use for caching the results, because I can't get the final query string from PDO (due to prepared statements).
Any good ideas for solving this?
Thanks in advance.
If you're just going to cache query results directly, keyed on the query string, MySQL's query cache already does this for you. Don't reinvent the wheel. The one potential difference is that MySQL's query cache is aggressively invalidated so that stale (out-of-date, incorrect) data is never returned; depending on how you handle invalidation, your strategy may further reduce database load, but at the cost of regularly serving stale data.
Additionally, you won't really be able to selectively expire your various cache keys when updates happen (how would you know which query strings should be expired when an insert/update runs?); as a result you'll just have to set a short expiration time (probably in seconds), to minimize the amount of time you're serving stale data. This will probably mean a low cache hit rate. In the end, the caching strategy you describe is simple to implement, but it's not very effective.
Make sure to read the "Generic Design Approaches" section of the memcached FAQ. A good caching strategy deletes/replaces cached data immediately when updates occur -- this allows you to cache data for hours/days/weeks, and simultaneously never serve out-of-date data to users.
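For example, a minimal sketch of that delete-on-update approach (assuming the PHP Memcached extension; the `user:{id}` key scheme and the users table are made up for illustration, not taken from the original post):

```php
// Sketch only: invalidate the cached copy immediately when the row changes,
// so long TTLs never serve stale data. The "user:{id}" key scheme and the
// users table are illustrative assumptions.
function updateUserEmail(PDO $db, Memcached $m, $id, $email)
{
    $stmt = $db->prepare("UPDATE users SET email = :email WHERE id = :id");
    $stmt->execute(array(':email' => $email, ':id' => $id));

    // Delete (or overwrite) the cache entry as part of the same operation,
    // so the next read repopulates the cache with fresh data.
    $m->delete("user:" . $id);
}
```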
Here is an interesting tutorial that might be helpful: http://techportal.inviqa.com/2009/02/16/getting-started-with-memcached/
I guess you can automate the process by implementing a function like this:
function query($name, $sql, $params, $db, $cache) {
    $result = $cache->get($name);
    if ($result === false) {
        $stmt = $db->prepare($sql);
        $stmt->execute($params);
        $result = $stmt->fetchAll(PDO::FETCH_ASSOC);
        $cache->add($name, $result);
    }
    return $result;
}
I have many users polling my PHP script on an Apache server, and the MySQL query they run is:
SELECT * FROM `table` WHERE `id` > 'example number'
where the example number can vary from user to user, but has a known lower bound which is updated every 10 minutes.
The server is polled twice a second by each user.
Can memcache be used? It's not crucial that the user sees the most up-to-date information; if it's a second or so behind, that's fine.
The site has 200 concurrent users at peak times. It's hugely inefficient and costing a lot of resources.
To give an accurate answer, I would need more information:
whether the query is pulling personalised information;
whether the polling request carries the 'example number' along with it.
Looking at the way you have structured your question, it doesn't seem like the user is polling for any personalised information, so I assume the 'example number' comes as part of the polling request.
I agree with @roberttstephens and @Filippos Karapetis that the ideal solutions would be:
Redis
NoSQL
Tuning MySQL
Memcache
But as you already have the application out there in the wild, implementing the above solutions has a cost, so here are the practical steps I would recommend.
Add indexes to your table for the relevant columns. [The first thing to check/do.]
Enable MySQL query caching.
Use a reverse proxy, e.g. Varnish. [This assumes the 'example number' comes as part of the request.]
It intercepts requests before they even hit your application server, so neither the MySQL query nor the memcache/Redis lookup happens at all.
Make sure you set specific cache headers on the response so that Varnish caches it.
So, of the 200 concurrent requests, if 100 of them are querying for the same number, Varnish takes the hit. [It is the same advantage that memcache can also offer.]
Implementation-wise it doesn't cost much in terms of development/testing effort.
I understand this is not the answer to the exact question, but I am sure it could solve your problem.
If the 'example number' doesn't come as part of the request, and you have to fetch it from the DB [by looking at the user table, maybe], then @roberttstephens' approach is the way to go. Just to give you the exact picture, I have refactored the code a little:
$m = new Memcached();
$m->addServer('localhost', 11211);

$inputNumber = 12345;
$cacheKey = "poll:" . $inputNumber;
$result = $m->get($cacheKey);
if ($result) {
    return unserialize($result);
}

$sth = $dbh->prepare("SELECT column1, column2 FROM poll WHERE id = :id");
$sth->execute(array(':id' => $inputNumber));
$poll_results = $sth->fetch(PDO::FETCH_OBJ);
$m->set($cacheKey, serialize($poll_results));
In my opinion, you're trying to use the wrong tool for the job here.
memcached is a key/value storage, so you can make it store and retrieve several values with a given set of keys very quickly. However, you don't seem to know the keys you want in advance, since you're looking for all records where the id is GREATER THAN a number, rather than a collection of IDs. So, in my opinion, memcached won't be appropriate to use in your scenario.
Here are your options:
Option 1: keep using MySQL and tune it properly
MySQL is quite fast if you tune it properly. You can:
add the appropriate indexes to each table
use prepared statements, which can help performance-wise in your case, as users are doing the same query over and over with different parameters
use query caching
Here's a guide with some hints on MySQL tuning, and mysqltuner, a Perl script that can guide you through the options needed to optimize your MySQL database.
Option 2: Use a more advanced key-value storage
There are alternatives to memcached, with the most known one being redis. redis does allow more flexibility, but it's more complex than memcached. For your scenario, you could use the redis zrange command to retrieve the results you want - have a look at the available redis commands for more information.
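As a hedged sketch of how that could look with the phpredis extension (the `poll:records` key name is made up for illustration, and this assumes records are indexed in a sorted set with the row id as the score):

```php
// Sketch only: assumes the phpredis extension and a Redis server on localhost.
// Each record is stored in a sorted set with its id as the score, so the
// "id greater than N" query becomes a ZRANGEBYSCORE lookup.
$redis = new Redis();
$redis->connect('127.0.0.1', 6379);

// On insert, index the record by id (done wherever rows are written):
$redis->zAdd('poll:records', $rowId, serialize($row));

// To fetch all records with id strictly greater than $exampleNumber;
// "(" marks an exclusive lower bound in Redis range syntax:
$records = $redis->zRangeByScore('poll:records', '(' . $exampleNumber, '+inf');
foreach ($records as $serialized) {
    $row = unserialize($serialized);
    // ... use $row
}
```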
Option 3: Use a document storage NoSQL database
You can use a document storage NoSQL database, with the most known example being MongoDB.
You can use more complex queries in MongoDB (e.g. use operators, like "greater than", which you require) than you can do in memcached. Here's an example of how to search through results in a mongo collection using operators (check example 2).
Have a look at the PHP MongoDB manual for more information.
Also, this SO question is an interesting read regarding document storage NoSQL databases.
You can absolutely use memcached to cache the results. You could instead create a cache table in mysql with less effort.
In either case, you would need to create an id for the cache, and retrieve the results based on that id. You could use something like entity_name:entity_id, or namespace:entity_name:entity_id, or whatever works for you.
Keep in mind, memcached is another service running on the server. You have to install it, set it up to start on reboot (or you should at least), allocate memory, etc. You'll also need php-memcached.
With that said, please view the php documentation on memcached. http://php.net/manual/en/memcached.set.php . Assuming your poll id is 12345, you could use memcached like so.
<?php
// Get your results however you normally would.
$sth = $dbh->prepare("SELECT column1, column2 FROM poll WHERE id = 12345");
$sth->execute();
$poll_results = $sth->fetch(PDO::FETCH_OBJ);
// Set up memcached. It should obviously be installed, configured, and running by now.
$m = new Memcached();
$m->addServer('localhost', 11211);
$m->set('poll:12345', serialize($poll_results));
This example doesn't have any error checking or anything, but this should explain how to do it. I also don't have a php, mysql, or memcached instance running right now, so the above hasn't been tested.
I am trying to understand (and probably deploy) memcached in our env.
We have 4 web servers on loadbalancer running a big web app developed in PHP. We are already using APC.
I want to understand how memcached works; or maybe I just don't understand how caching works in general.
We have some complex dynamic queries that combine several tables to pull data. Each time, the data is going to be from different client databases and data keeps changing. From my understanding, if some data is stored in cache, and if the request is same next time, the same data is returned. (Or I may be completely wrong here).
How does this whole memcache thing (or, for that matter, any caching) work?
Cache, in general, is a very fast key/value storage engine where you can store values (usually serialized) by a predetermined key, so you can retrieve the stored values by the same key.
In relation to MySQL, you would write your application code in such a way, that you would check for the presence of data in cache, before issuing a request to the database. If a match was found (matching key exists), you would then have access to the data associated to the key. The goal is to not issue a request to the more costly database if it can be avoided.
An example (demonstrative only):
$cache = new Memcached();
$cache->addServer('servername', 11211);
$myCacheKey = 'my_cache_key';
$row = $cache->get($myCacheKey);
if ($row === false) {
    // Issue painful query to mysql
    $sql = "SELECT * FROM table WHERE id = :id";
    $stmt = $dbo->prepare($sql);
    $stmt->bindValue(':id', $someId, PDO::PARAM_INT);
    $stmt->execute();
    $row = $stmt->fetch(PDO::FETCH_OBJ);
    $cache->set($myCacheKey, serialize($row));
} else {
    // For subsequent calls, the data is pulled from cache and the
    // query is skipped altogether
    $row = unserialize($row);
}
// Now I have access to $row, where I can do what I need to
var_dump($row);
Check out PHP docs on memcached for more info, there are some good examples and comments.
There are several examples on how memcache works. Here is one of the links.
Secondly, Memcache can work with or without MySQL.
It caches your PHP objects; whether they come from MySQL or anywhere else, if it's a PHP object, it can be stored in memcache.
APC gives you some more functionality than memcache. Besides storing/caching PHP objects, it also caches the compiled opcodes of your PHP files, so they don't go through the process of being loaded into memory and compiled on every request; the already-compiled opcode is run directly from memory.
If your data keeps changing (between requests), then caching is futile, because that data is going to be stale. But most of the time (I bet even in your case) multiple requests to the database return the same data set, in which case an in-memory cache is very useful.
P.S: I did a quick google search and found this video about memcached which has rather good quality => http://www.bestechvideos.com/2009/03/21/railslab-scaling-rails-episode-8-memcached. The only problem could be that it talks about Ruby On Rails(which I also don't use that much, but is very easy to understand). Hopefully it is going to help you grasp the concept a little better.
I would like to know if it's possible to store a "resource" in memcache. I'm currently trying the following code, but apparently it's not correct:
$result = mysql_query($sSQL);
$memcache->set($key, $result, 0, $ttl);
return $result;
I have to disagree with zerkms. Just because MySQL has a caching system (actually, it has several), doesn't mean that there's no benefit to optimizing your database access. MySQL's Query Cache is great, but it still has limitations:
it's not suitable for large data sets
queries have to be identical (character for character)
it does not support prepared statements or queries using user-defined functions, temporary tables, or tables with column-level privileges
cache results are cleared every time the table is modified, regardless of whether the result set is affected
unless it resides on the same machine as the web server it still incurs unnecessary network overhead
Even with a remote server, Memcached is roughly 23% faster than MQC. And using APC's object cache, you can get up to a 990% improvement over using MQC alone.
So there are plenty of reasons to cache database result sets outside of MySQL's Query Cache. After all, you cache result data locally in a PHP variable when you need to access it multiple times in the same script. So why wouldn't you extend this across multiple requests if the result set doesn't change?
And just because the server is fast enough doesn't mean you shouldn't strive to write efficient code. It's not like it takes that much effort to cache database results—especially when accelerators like APC and Memcached were designed for this exact purpose. (And I wouldn't dismiss this question as such a "strange idea" when some of the largest sites on the internet use Memcached in conjunction with MySQL.)
That said, zerkms is correct in that you have to fetch the results first, then you can cache the data using APC or Memcached. There is however another option to caching query results manually, which is to use the Mysqlnd query result cache plugin. This is a client-side cache of MySQL query results.
The Mysqlnd query result cache plugin lets you transparently cache your queries using APC, Memcached, sqlite, or a user-specified data source. However, this plugin currently shares the same limitation as MQC in that prepared statements can't be cached.
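As a rough sketch of how the plugin is used (assuming the PECL mysqlnd_qc extension is installed and PHP is built against mysqlnd; the connection details are placeholders):

```php
// Sketch only: with mysqlnd_qc loaded, a query can opt in to client-side
// caching via the qc=on SQL hint; the TTL comes from the mysqlnd_qc.ttl
// ini setting. The hint must be the very first thing in the query string,
// and (per the limitation above) this works with query(), not prepare().
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
$stmt = $pdo->query("/*qc=on*/SELECT id, name FROM users");
$rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
```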
Why do you need to? MySQL has its own performant query cache.
But if you still want to follow your strange idea, you need to fetch all the data into an array (with mysql_fetch_assoc or whatever) and after that store that array in memcached.
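A minimal sketch of fetching the rows into an array first, reusing the question's variables ($sSQL, $key, $ttl, $memcache) and its legacy mysql_* calls:

```php
// Sketch only: a MySQL result resource cannot be stored in memcache,
// but the rows fetched from it can. (mysql_* is shown to match the
// question; it is deprecated, and PDO or mysqli is preferred today.)
$result = mysql_query($sSQL);
$rows = array();
while ($row = mysql_fetch_assoc($result)) {
    $rows[] = $row;
}
$memcache->set($key, $rows, 0, $ttl); // arrays are serialized automatically
return $rows;
```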
I am currently working on a PHP/MySQL project with the AbleDating system. My customer is worried about server load, so he asked me to use "caching" as much as I could; specifically, to cache MySQL queries and some HTML zones.
Is it possible to cache only some HTML zones with PHP? If so, how can I do this?
For MySQL caching, is it just an option to enable, or must I change something in the code?
Thanks!
MySql caching basically just caches resultsets against SQL issued to the database: if the SQL statement/query is in the cache, then the resultset gets returned without any work being done by the database engine. There is thus a certain amount of overhead in maintaining accuracy (i.e. the DB must track changes and flush cache entries accordingly).
Compare this to other DBs such as Oracle, where the caching mechanism can take into account placeholders (bound variables) and omits a "hard" parse (i.e. checking if the SQL is valid etc.) if the SQL plan is found in the SQL common cache.
If you find yourself repeatedly submitting identical SQL to the database, then caching may make a substantial difference. If this is not case, you may even find that the additional overhead cancels out any benefit. But you won't know for sure until you have some metrics from your system (i.e. profiling your SQL, analysing the query logs etc.)
Sure, caching is very important.
I myself use a PHP cacher called minicache; have a look:
http://code.google.com/p/minicache/
memcached is a great way to cache anything (PHP, mysql results, whatever) in memory.
Couple it with an easy to use caching library like Zend_Cache and it makes caching a cinch:
$frontendOptions = array(
    'lifetime' => 60, // Seconds to cache
    'automatic_serialization' => true
);
$cache = Zend_Cache::factory('Core',
                             'Memcached',
                             $frontendOptions);
if (!$nasty_big_long_result = $cache->load('my_nasty_large_result')) {
    $stmt = $db->query('SELECT * FROM huge_table');
    $nasty_big_long_result = array();
    foreach ($stmt->fetchAll() as $row) {
        $nasty_big_long_result[] = $row;
    }
    $cache->save($nasty_big_long_result, 'my_nasty_large_result');
}
We use memcache basically as an after thought to just cache query results.
Invalidation is a nightmare due to the way it was implemented. We have since learned some techniques with memcache through reading the mailing list, for example the trick that allows group invalidation of a bunch of keys. For those who know it, skip the next paragraph.
For those who don't know and are interested, the trick is adding a sequence number to your keys and storing that sequence number in memcache. Then every time before you do your "get" you grab the current sequence number and build your keys around that. Then, to invalidate the whole group you just increment that sequence number.
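A minimal sketch of that sequence-number trick (assuming the PHP Memcached extension; the 'users' group name and the load_profile_from_db() helper are made up for illustration):

```php
// Sketch only: group invalidation via a namespace sequence number.
// $m is a connected Memcached instance; "users" is an illustrative group.

// Fetch the current sequence number for the group (initialise if missing).
$seq = $m->get('seq:users');
if ($seq === false) {
    $seq = 1;
    $m->set('seq:users', $seq);
}

// Build every key in the group around the sequence number.
$key = "users:{$seq}:profile:42";
$profile = $m->get($key);
if ($profile === false) {
    $profile = load_profile_from_db(42); // hypothetical loader
    $m->set($key, $profile);
}

// To invalidate the whole group, just bump the sequence number: all keys
// built with the old number become unreachable and expire naturally.
$m->increment('seq:users');
```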
So anyway, I'm currently revising our model to implement this.
My question is..
We didn't know about this pattern, and I'm sure there are others we don't know about. I've searched and haven't been able to find any design patterns on the web for implementing memcache, best practices, etc.
Can someone point me to something like this or even just write up an example? I would like to make sure we don't make a beginners mistake in our new refactoring.
One point to remember with object caching is that it's just that - a cache of objects/complex structures. A lot of people make the mistake of hitting their caches for straightforward, efficient queries, which incurs the overhead of a cache check/miss, when the database would have obtained the result far faster.
This piece of advice is one I've taken to heart since it was taught to me; know when not to cache, that is, when the overhead cancels out the perceived benefits. I know it doesn't answer the specific question here, but I thought it was worth pointing out as a general hint.
What rob is saying is good advice. From my experience, there are two common ways to identify and invalidate cache records: unique identification and tag-based identification. These are usually combined to form a complete solution in which:
A cache record is assigned a unique identifier (which usually depends somehow on the data that it caches) and optionally any number of tags.
Cache records are recalled by their unique identifier.
Cache records can be invalidated by their unique identifier (one at a time), or by any tag they are tagged with (possibly invalidating multiple records at the same time).
This is relatively simple to implement and generally works very well. I have yet to come across a system that needed more, though there are probably some edge cases out there that require specific solutions.
I use the Zend Cache component (you don't have to use the entire framework, just the Zend Cache part if you want). It abstracts some of the caching details; it supports grouping cache entries by 'tags', though that feature is not supported by the memcache backend, so I've rolled my own support for 'tags' with relative ease. The pattern I use for functions that access the cache (generally in my model) is:
public function getBySlug($slug, $ignoreCache = false)
{
    if($ignoreCache || !$result = $this->cache->load('someKeyBasedOnQuery'))
    {
        $select = $this->select()
                       ->where('slug = ?', $slug);
        $result = $this->fetchRow($select);
        try
        {
            $this->cache->save($result, 'someKeyBasedOnQuery');
        }
        catch(Zend_Exception $error)
        {
            //log exception
        }
    }
    else
    {
        $this->registry->logger->info('someKeyBasedOnQuery came from cache');
    }
    return $result;
}
Basing the cache key on a hash of the query means that if another developer bypasses my models, or uses another function elsewhere that does the same thing, it's still pulled from cache. Generally I tag each cache entry with a couple of generated tags (the name of the table is one, and the name of the function is the other). So by default, on insert, delete and update, our code invalidates the cached items tagged with that table. All in all, caching is pretty automatic in our base code, and developers can be confident that caching 'just works' in our projects. (A great side effect of using tagging is that we have a page offering granular cache clearing/management, with options to clear the cache by model function or by table.)
We also store the query results from our database (PostgreSQL) in memcache, and we use triggers on the tables to invalidate the cache; there are several APIs out there for this (e.g. pgmemcache; I think MySQL has something like that too, but I don't know for sure). The benefit is that the database itself (via triggers) can handle the invalidation of data on changes (update, insert, delete), so you don't need to write all that logic into your application.
mysqlnd_qc, which hooks in at the level where query results are returned from the database, automatically caches result sets from MySQL. It is FANTASTIC and automatic.