Is there a simple way to cache MySQL queries in PHP or failing that, is there a small class set that someone has written and made available that will do it? I can cache a whole page but that won't work as some data changes but some do not, I want to cache the part that does not.
This is a great overview of how to cache queries in MySQL:
The MySQL Query Cache
You can use Zend Cache to cache results of your queries among other things.
I think the query cache size is 0 by default, which is off.
Edit your my.cnf file to give it at least a few megabytes.
No PHP changes necessary :)
It may be complete overkill for what you're attempting, but have a look at eAccelerator or memcache. If you have queries that will change regularly and queries that won't, you may not want all of your db queries cached for the same length of time by mysql.
Caching engines like the above allow you to decide, on a query-by-query basis, how long the data should be cached for. So say you've data in your header that will change infrequently, you can check if it's currently in the cache - if so, return it, otherwise do the query, and put it into cache with a lifetime of N, so for the next N seconds every page load will pull the data from cache without going near MySQL.
You're then free to pull your other data "live" from the db as and when required, by-passing the cache.
I would recommend the whole page caching route. If some of the data changes, simply place tokens/placeholders in place of the dynamic data. Cache the entire page with those tokens in place, then post process the tokens for the cached data for the tokens. Thus you now have a cached page that contains dynamic content.
Related
Currently i m using shared hosting domain for my site .But we have currently near about 11,00,000 rows in one of the tables.So its taking a lot of time to load the webpage.So we want to implement the database caching techniques like APC or memcache for our site.But in shared domain we dont have those facilities available,we have only eaccelerator.But eaccelerator does not cache db calls,If i m not wrong.So considering all these points we want to move to VPS and in this case.which database caching technique we need to use APC or memcache to decrease the page load time...Please guide on VPS and better caching technique of two
we have similar website and we use APC
APC will cache the opcode as well the html that is generated. This helps to avoid unrequired hits to the page
you should also enable caching on mysql to cache results of your query
I had a task where i needed to fetch rows from a database table that had more than 100.000 record. it was a scrollable page. So what i did was to fetch the first 50 records and cache the next 50 in the first call. and on scroll down events i wrote an ajax request to check if the data is available in cache; if not i fetched it from the database and also cached the next 50. It worked pretty well and solved the inconvenient load time.
if you have a similar scenario you might benefit from this approach.
ps: I used memcache.
From your comment I take it you're doing a LIKE %..% query and want to paginate the result. First of all, investigate whether FULLTEXT indices are an option for you, as they should perform better. If that's not an option, you can add a simple cache like so:
Treat each unique search term as an id, i.e. if in your URL you have ..?search=foobar, then "foobar" is the id of the result set. Keep that in all your links, e.g. ..?search=foobar&page=2.
If the result set does not yet exist (see below), create it:
Query the database with your slow query.
Get all the results into an array. Don't overdo it, you don't want to be storing hundreds of megabytes.
Create a unique filename per query, e.g. sha1($query), or maybe sha1(strtolower($query)).
serialize the data and store it in the file.
Get the data from the file, unserialize it, display the portion of the array corresponding to the requested page.
Occasionally, delete old cached results. You can do that with something like if (rand(0, 100) == 1) .., which will run the cleanup job every 100 queries on average. Strike a balance between server load and data freshness. Cache invalidation is a topic whole books can be written about, BTW.
That's a simple poor man's cache implementation. It's not great, but if you have absolutely nothing else to work with, it's better than running slow queries over and over.
APC is Alternative PHP Cache and works only with PHP. Whereas Memcahced will work independently with any language.
Users in my webgame are having certain player information cached in the $_SESSION of PHP.
Each time they load the game it checks if the session exists, if not they get the player information from a MySQL database and then it gets stored in the $_SESSION.
Now my problem is, what if the player information gets updated by another process or player? They can't update the $_SESSION cache of the other player.
I know memcached is most probably the solution for this, but I'm not sure if I should take the time for something like this. $_SESSION cache is doing well for me, except for this.
I was thinking about creating a MySQL table for it which get read at every request and if there's a record for the player that it recreates the cache.
One other solution would be to create a file in a directory with the id of the player in the name of the file. Every request PHP will check with file_exist if it should clear the cache or not.
What would you guys do? It gets executed every request so it's pretty important to get this optimized.
From a design standpoint alone I'd avoid the file_exists and directory approach. Sure 'file_exists' is fast, but it won't scale well... What happens if a use changes their name?
If you're using APC (and you should) you could APC's user memory cache. As long as you're on a single server it should give you similar performance benifits as memcached without the need for a separate memory caching server process. If a user entry changes frequently, you could run into fragmemntation issues with APC though. In that case, time to bite the bullet and go with memcached--you can even store your session data in memcached for a performance boost.
Also, neither APC or your file_exists solution will scale to multiple load balanced servers--you'd need a DB solution or memcached for that.
The way you exposed it, is not about how fast is one vs the other, the SESSION approach is just not valid because of your concurrency issue.
If your data can change concurrently, then your data storage needs to be able to handle that concurrency and whatever caching layer you want to use needs to behave accordingly to the nature of your problem.
If it is only about cache, and you dont want to install memcache(d), you can go with a mysql table in memory. It is not as fast as memcached, but still a fine solution. And make sure to create proper indexes on all your tables (maybe that is the better solution, no cache, just select it from your table).
CREATE TABLE t (i INT) ENGINE = MEMORY;
I have about 10 tables with ~10,000 rows each which need to be pulled very often.
For example, list of countries, list of all schools in the world, etc.
PHP can't persist this stuff in memory (to my knowledge) so I would have to query the server for a SELECT * FROM TABLE every time. Should I use memcached here? At first though it's a clear absolutely yes, but at second thought, wouldn't mysql already be caching for me and this would be almost redundant?
I don't have too much understanding of how mysql caches data (or if it even does cache entire tables).
You could use MySQL query cache, but then you are still using DB resources to establish the connection and execute the query. Another option is opcode caching if your pages are relatively static. However I think memcached is the most flexible solution. For example if you have a list of countries which need to be accessed from various code-points within your application, you could pull the data from the persistent store (mysql), and store them into memcached. Then the data is available to any part of your application (including batch processes and cronjobs) for any business requirement.
I'd suggest reading up on the MySQL query cache:
http://dev.mysql.com/doc/refman/5.6/en/query-cache.html
You do need some kind of a cache here, certainly; layers of caching within and surrounding the database are considerably less efficient than what memcached can provide.
That said, if you're jumping to the conclusion that the Right Thing is to cache the query itself, rather than to cache the content you're generating based on the query, I think you're jumping to conclusions -- more analysis is needed.
What data, other than the content of these queries, is used during output generation? Would a page cache or page fragment cache (or caching reverse-proxy in front) make more sense? Is it really necessary to run these queries "often"? How frequently does the underlying data change? Do you have any kind of a notification event when that happens?
Also, SELECT * queries without a WHERE clause are a "code smell" (indicating that something probably is being done the Wrong Way), especially if not all of the data pulled is directly displayed to the user.
I have a personal caching class, which can be seen here ( based off WordPress' ):
http://pastie.org/988427
I recently learned about memcache and it said to memcache EVERYTHING:
http://highscalability.com/blog/2010/5/17/7-lessons-learned-while-building-reddit-to-270-million-page.html
My first thought was just to keep my class with the current functions and make it use memcache instead -- is there any downside to doing this?
The main difference I see is that memcache stays on with the server from page to page, while mine is for 1 page load. The problem I see arising, and this is with any system, is that they're dynamic. They change all the time. Whether its search results, visible products, etc. etc. If it's all cached, won't the create a problem?
Is there a way to handle this? Obviously if something is bringing back the same results everytime it would be cached, but that's why I was doing it on a per page load basis. I'm sure there is a way to handle this, or is the cache time usually set between 5 minutes and an hour?
You certainly need a good caching strategy to avoid problems with stale data. With dynamic data and using memcached, you would have to delete cache entries on certain data updates. You can't just rely on cache entries to time out. With memcached you can cache just parts of your dynamic content for a specific page generation. If you want to cache complete html documents, I would recommend using a reverse proxy like varnish (http://varnish-cache.org/).
First of all, the website I run is hosted and I don't have access to be able to install anything interesting like memcached.
I have several web pages displaying HTML tables. The data for these HTML tables are generated using expensive and complex MySQL queries. I've optimized the queries as far as I can, and put indexes in place to improve performance. The problem is if I have high traffic to my site the MySQL server gets hammered, and struggles.
Interestingly - the data within the MySQL tables doesn't change very often. In fact it changes only after a certain 'event' that takes place every few weeks.
So what I have done now is this:
Save the HTML table once generated to a file
When the URL is accessed check the saved file if it exists
If the file is older than 1hr, run the query and save a new file, if not output the file
This ensures that for the vast majority of requests the page loads very fast, and the data can at most be 1hr old. For my purpose this isn't too bad.
What I would really like is to guarantee that if any data changes in the database, the cache file is deleted. This could be done by finding all scripts that do any change queries on the table and adding code to remove the cache file, but it's flimsy as all future changes need to also take care of this mechanism.
Is there an elegant way to do this?
I don't have anything but vanilla PHP and MySQL (recent versions) - I'd like to play with memcached, but I can't.
Ok - serious answer.
If you have any sort of database abstraction layer (hopefully you will), you could maintain a field in the database for the last time anything was updated, and manage that from a single point in your abstraction layer.
e.g. (pseudocode): On any update set last_updated.value = Time.now()
Then compare this to the time of the cached file at runtime to see if you need to re-query.
If you don't have an abstraction layer, create a wrapper function to any SQL update call that does this, and always use the wrapper function for any future functionality.
There are only two hard things in
Computer Science: cache invalidation
and naming things.
—Phil Karlton
Sorry, doesn't help much, but it is sooooo true.
You have most of the ends covered, but a last_modified field and cron job might help.
There's no way of deleting files from MySQL, Postgres would give you that facility, but MySQL can't.
You can cache your output to a string using PHP's output buffering functions. Google it and you'll find a nice collection of websites explaining how this is done.
I'm wondering however, how do you know that the data expires after an hour? Or are you assuming the data wont change that dramatically in 60 minutes to warrant constant page generation?