MongoDB as MySQL cache

MongoDB as MySQL cache - php

I just had this idea and thinks it's a good solution for this problem but I ask if there are some downsides to this method. I have a webpage that often queries database, as much as 3-5 queries per page load. Each query is making a dozen(literally) joins and then each of these queries results are used for another queries to construct PHP objects. Needless to say the load times are ridiculous even on the cloud but it's the way it works now.
I thought about storing the already constructed objects as JSON, or in MongoDB - BSON format. Will it be a good solution to use MongoDB as a cache engine of this type? Here is the example of how I think it will work:
When the user opens the page, if there is no data in Mongo with the proper ID, the queries to MySQL fire, each returning data that is being converted to a properly constructed object. The object is sent to the views and is converted to JSON and saved in Mongo.
If there was data in Mongo with the corresponding ID, it is being sent to PHP and converted.
When some of the data changes in MySQL (administrator edits/deletes content) a delete function is fired that will delete the edited/deleted object in MongoDB as well.
Is it a good way to use MongoDB? What are the downsides of this method? Is it better to use Redis for this task? I also need NoSQL for other elements of the project, that's why I'm considering to use one of these two instead of memcache.
MongoDB as a cache for frequent joins and queries from MySQL has some information, but it's totally irrelevant.

I think you would be better off using memcached or Redis to cache the query results. MongoDB is more of a full database than a cache. While both memcached and Redis are optimized for caching.
However, you could implement your cache as a two-level cache. Memcached, for example, does not guarantee that data will stay in the cache. (it might expire data when the storage is full). This makes it hard to implement a system for tags (so, for example, you add a tag for a MySQL table, and then you can trigger expiration for all query results associated with that table). A common solution for this, is to use memcached for caching, and a second slower, but more reliable cache, which should be faster than MySQL though. MongoDB could be a good candidate for that (as long as you can keep the queries to MongoDB simple).

Well you can go with Memcached or Redis for caching objects. Mongodb can be also used as a cache. I use mongodb for caching aggregation results, since it has advantage of wide range of queries as well unlike Memcached.
For example, in a tagging application, if I have to display page count corresponding to each tag, it scans whole table for a group by query. So I have a cronjob which computes that group by query and cache the aggregation result in Mongo. This works perfectly well for me in production. You can do this for countless other complex computations as well.
Also mongodb capped collections and TTL collections are perfect for caching.

Related

How to store PHP Trie for all later uses?

I'm designing an application in PHP which involves Trie data structure.
For time efficient prefix search, I'm using Trie.
I'm constructing the Trie using records from the database.
Now, the database has millions of records. So it is not feasible to everytime create the Trie and then search in it, for every new user request.
Instead can I create the Trie only once and somehow store this information, such that it does not have to be re-created for every new user request, and then searching can be immediately done. Is there somehow I can cache the created Trie (not just for one user session, but for all user requests) using PHP?
Any help would be much appreciated.

You have a couple of standard options.
Cache the database result in memory, using a simple cache like memcached
Cache using Redis, perhaps taking advantage of some of its extra features. This might involve a process where you load the data into a structure in REDIS and have your trie search code work against Redis directly rather than the database result set.
In either case, you are going to cache the result for some period of time that is acceptable, and since the database result will be in memory in some form, there is no load placed on the RDBMS.
In your related question, you indicated that he raw serialized form of the variable would be about 200mb in size. That is well within the max object size (512mb) for Redis, but could be problematic for memcached. I personally use Redis for most app server caching these days.

Difference between mysql cache and memcached

I have 1 mysql table which is controlled strictly by admin. Data entry is very low but query is high in that table. Since the table will not change content much I was thinking to use mysql query cache with PHP but got confused (when i googled about it) with memcached.
What is the basic difference between memcached and mysqlnd_qc ?
Which is most suitable for me as per below condition ?
I also intend to extend the same for autcomplete box, which will be suitable in such case ?
My queries will return less than 30 rows mostly of very few bytes data and will have same SELECT queries. I am on a single server and no load sharing will be done. Thankyou in advance.

If your query is always the same, i.e. you do SELECT title, stock FROM books WHERE stock > 5 and your condition never changes to stock > 6 etc., I would suggest using MySQL Query Cache.
Memcached is a key-value store. Basically it can cache anything if you can turn it into key => value. There are a lot of ways you can implement caching with it. You could query your 30 rows from database, then cache it row by row but I don't see a reason to do that here if you're returning the same set of rows over and over. The most basic example I can think of for memcached is:
// Run the query
$result = mysql_query($con, "SELECT title, stock FROM books WHERE stock > 5");
// Fetch result into one array
$rows = mysqli_fetch_all($result);
// Put the result into memcache.
$memcache_obj->add('my_books', serialize($rows), false, 30);
Then do a $memcache_obj->get('my_books'); and unserialize it to get the same results.
But since you're using the same query over and over. Why add the complication when you can let MySQL handle all the caching for you? Remember that if you go with memcached option, you need to setup memcached server as well as implementing logic to check if the result is already in cache or not, or if the records have been changed in the database.
I would recommend using MySQL query cache over memcached in this case.
One thing you need to be careful with MySQL query cache, though, is that your query must be exactly the same, no extra blank spaces, comments whatsoever. This is because MySQL does no parsing to determine compare the query string from cache at all. Any extra character somewhere in the query means a different query.
Peter Zaitsev explained very well about MySQL Query Cache at http://www.mysqlperformanceblog.com/2006/07/27/mysql-query-cache/, worth taking a look at it. Make sure you don't need anything that MySQL Query Cache does not support as Peter Zaitsev mentioned.

If the queries run fast enough and does not really slows your application, do not cache it. With a table this small, MySQL will keep it in it's own cache. If your application and database are on the same server, the benefit will be very small, maybe even not measurable at all.
So, for your 3rd question, it also depends on how you query the underlying tables. Most of the time, it is sufficient to let MySQL cache it internally. An other approach is to generate all the possible combinations and store these, so mysql does not need to compute the matching rows and returns the right one straight away.
As a general rule: build your application without caching and only add caches for things that do not change often if a) the computation for the resultset is complex and timeconsuming or b) you have multiple application instances calling the database over a network. In those cases caching results in better performance.
Also, if you run PHP in a web server like Apache, caching inside your program does not add much benefit as it only uses the cache for the current page. An external cache (like memcache)- is then needed to cache over multiple results.

What is the basic difference between memcached and mysqlnd_qc ?
There is rather nothing common at all between them
Which is most suitable for me as per below condition ?
mysql query cache
I also intend to extend the same for autcomplete box, which will be suitable in such case ?
Sphinx Search

Pulling very large # of rows very often -- do I need memcached here?

I have about 10 tables with ~10,000 rows each which need to be pulled very often.
For example, list of countries, list of all schools in the world, etc.
PHP can't persist this stuff in memory (to my knowledge) so I would have to query the server for a SELECT * FROM TABLE every time. Should I use memcached here? At first though it's a clear absolutely yes, but at second thought, wouldn't mysql already be caching for me and this would be almost redundant?
I don't have too much understanding of how mysql caches data (or if it even does cache entire tables).

You could use MySQL query cache, but then you are still using DB resources to establish the connection and execute the query. Another option is opcode caching if your pages are relatively static. However I think memcached is the most flexible solution. For example if you have a list of countries which need to be accessed from various code-points within your application, you could pull the data from the persistent store (mysql), and store them into memcached. Then the data is available to any part of your application (including batch processes and cronjobs) for any business requirement.

I'd suggest reading up on the MySQL query cache:
http://dev.mysql.com/doc/refman/5.6/en/query-cache.html

You do need some kind of a cache here, certainly; layers of caching within and surrounding the database are considerably less efficient than what memcached can provide.
That said, if you're jumping to the conclusion that the Right Thing is to cache the query itself, rather than to cache the content you're generating based on the query, I think you're jumping to conclusions -- more analysis is needed.
What data, other than the content of these queries, is used during output generation? Would a page cache or page fragment cache (or caching reverse-proxy in front) make more sense? Is it really necessary to run these queries "often"? How frequently does the underlying data change? Do you have any kind of a notification event when that happens?
Also, SELECT * queries without a WHERE clause are a "code smell" (indicating that something probably is being done the Wrong Way), especially if not all of the data pulled is directly displayed to the user.

How to cache a mysql_query using memcache?

I would like to know if it's possible to store a "ressource" within memcache, I'm currently trying the following code but apparently it's not correct:
$result = mysql_query($sSQL);
$memcache->set($key, $result, 0, $ttl);
return $result;

I have to disagree with zerkms. Just because MySQL has a caching system (actually, it has several), doesn't mean that there's no benefit to optimizing your database access. MySQL's Query Cache is great, but it still has limitations:
it's not suitable for large data sets
queries have to be identical (character for character)
it does not support prepared statements or queries using user-defined functions, temporary tables, or tables with column-level privileges
cache results are cleared every time the table is modified, regardless of whether the result set is affected
unless it resides on the same machine as the web server it still incurs unnecessary network overhead
Even with a remote server, Memcached is roughly 23% faster than MQC. And using APC's object cache, you can get up to a 990% improvement over using MQC alone.
So there are plenty of reasons to cache database result sets outside of MySQL's Query Cache. After all, you cache result data locally in a PHP variable when you need to access it multiple times in the same script. So why wouldn't you extend this across multiple requests if the result set doesn't change?
And just because the server is fast enough doesn't mean you shouldn't strive to write efficient code. It's not like it takes that much effort to cache database results—especially when accelerators like APC and Memcached were designed for this exact purpose. (And I wouldn't dismiss this question as such a "strange idea" when some of the largest sites on the internet use Memcached in conjunction with MySQL.)
That said, zerkms is correct in that you have to fetch the results first, then you can cache the data using APC or Memcached. There is however another option to caching query results manually, which is to use the Mysqlnd query result cache plugin. This is a client-side cache of MySQL query results.
The Mysqlnd query result cache plugin lets you transparently cache your queries using APC, Memcached, sqlite, or a user-specified data source. However, this plugin currently shares the same limitation as MQC in that prepared statements can't be cached.

Why do you need so? Mysql has its own performant query cache
but if you still want to follow your strange idea - you need to fetch all the data into array (with mysql_fetch_assoc or whatever) and after that store that array into the memcached.

What to use instead of memcache

I'm using memcache for caching (obviously) which is great. But I'm also using it as a cross-request/process data store. For instance I have a web chat on one of my pages and I use memcache to store the list of online users in it. Works great but it bothers me that if I have to flush the whole memcache server (for whatever reason) I loose the online list. I also use it to keep record of views for some content (I then periodically update the actual rows in the DB), and if I clear the cache I loose all data about views (from the last write to db).
So what I'm asking is this: what should I use instead of memcache for this kind of things? It needs to be fast and preferably store it's data in memory. I think some noSQL product would be a good fit here, but I've no idea which one. I'd like to use something that I could use for other use cases in the future, analytics come to mind (what are users searching the most for instance).
I'm using PHP so it has to have good bindings for it.

Redis! It's like memcache on steroids (the good kind). Here are some drivers.
Redis is an advanced key-value store. It is similar to memcached but the dataset is not volatile, and values can be strings, exactly like in memcached, but also lists, sets, and ordered sets. All this data types can be manipulated with atomic operations to push/pop elements, add/remove elements, perform server side union, intersection, difference between sets, and so forth. Redis supports different kind of sorting abilities.

You could try memcachedb. It uses exactly the same protocol as memcache, but it's persistent store.
You could also try cassandra or redis

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.