Within a PHP/MySQL system we have a number of configuration values (approx. 200). These are mostly booleans or ints and store things such as the number of results per page and whether pages have 2 or 3 columns. They are all kept in a single MySQL table, and we use a single function to return each value as it is requested; on certain page loads there can be up to around 100 requests to this config table. With the number of sites using this system, that means potentially thousands of requests per second to retrieve these values. The question is whether this method makes sense, or whether it would be preferable to perform a single query per page, store all the configs in an array, and retrieve them from there instead.
Use a cache such as memcache, APC, or any other. Load the settings once, cache them, and share them between sessions with a singleton object.
Even if MySQL's query cache saves you the query, it's a waste of time and resources to hit the database over and over. Instead, on any request that modifies the values, invalidate the in-memory cache so it is reloaded the next time someone requests a value from it.
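A minimal sketch of that pattern, assuming APC's apc_* functions are available and a hypothetical config(name, value) table; the connection details are placeholders:

    <?php
    class Config
    {
        private static $instance;
        private $values = array();

        public static function get($name)
        {
            if (self::$instance === null) {
                self::$instance = new self();
            }
            $values = self::$instance->values;
            return isset($values[$name]) ? $values[$name] : null;
        }

        private function __construct()
        {
            $values = apc_fetch('site_config');
            if ($values === false) {
                // One query loads all ~200 values instead of one query each.
                $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
                $values = array();
                foreach ($pdo->query('SELECT name, value FROM config') as $row) {
                    $values[$row['name']] = $row['value'];
                }
                apc_store('site_config', $values, 300); // shared for 5 minutes
            }
            $this->values = $values;
        }

        // Call this from whatever admin page modifies the values.
        public static function invalidate()
        {
            apc_delete('site_config');
            self::$instance = null;
        }
    }

    $perPage = Config::get('results_per_page');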
If you enable MySQL query cache, the query that selects your values will be cached in memory, and MySQL will give an instant answer from memory unless the query or data in the underlying tables are changed.
This is excellent both for performance and for manageability.
The query results can be reused between sessions: if you have 1,000 sessions, you don't need to keep 1,000 copies of your data.
You might want to consider using memcache for this. I think it would be faster than multiple DB queries (even with query caching on), and you won't need a database connection to get them.
You might want to consider just loading them from a flat file into memory; this also gives you the opportunity to version-control your config values.
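For instance, a minimal sketch using parse_ini_file; the file name and keys are made up:

    <?php
    // config.ini lives in version control and might contain lines like:
    //   results_per_page = 25
    //   use_three_columns = 1
    $config = parse_ini_file(__DIR__ . '/config.ini');

    $perPage = (int) $config['results_per_page'];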
I would definitely recommend memcache for this. We have a similar setup and it has noticeably brought resource usage down on that server.
Currently I'm using a shared hosting account for my site, but one of the tables now has around 1,100,000 rows, so the pages take a long time to load. We want to implement database caching such as APC or memcache, but those aren't available on shared hosting; we only have eAccelerator, and eAccelerator does not cache DB calls, if I'm not wrong. Considering all this, we want to move to a VPS. Which database caching technique should we use, APC or memcached, to decrease the page load time? Please advise on the VPS and on the better of the two.
We have a similar website and we use APC.
APC will cache the opcode as well as the HTML that is generated. This helps avoid unnecessary work on repeated hits to the same page.
You should also enable the MySQL query cache to cache the results of your queries.
I had a task where I needed to fetch rows from a database table that had more than 100,000 records, displayed as a scrollable page. What I did was fetch the first 50 records and cache the next 50 in the first call; on scroll-down events an AJAX request checked whether the data was already in the cache, and if not, fetched it from the database and also cached the following 50. It worked pretty well and solved the inconvenient load time.
If you have a similar scenario, you might benefit from this approach.
PS: I used memcache.
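A rough sketch of that fetch-50 / prefetch-next-50 approach with the legacy Memcache extension; the table, key names, and lifetimes are made up:

    <?php
    function fetchRows(PDO $pdo, $offset, $size)
    {
        $stmt = $pdo->prepare('SELECT * FROM records ORDER BY id LIMIT ?, ?');
        $stmt->bindValue(1, (int) $offset, PDO::PARAM_INT);
        $stmt->bindValue(2, (int) $size, PDO::PARAM_INT);
        $stmt->execute();
        return $stmt->fetchAll(PDO::FETCH_ASSOC);
    }

    function getPage(PDO $pdo, Memcache $mc, $page, $size = 50)
    {
        $rows = $mc->get("rows_page_$page");
        if ($rows === false) {
            $rows = fetchRows($pdo, $page * $size, $size);
            $mc->set("rows_page_$page", $rows, 0, 600); // 10-minute lifetime
        }
        // Warm the cache for the next page so the AJAX scroll
        // handler finds it already there.
        $next = $page + 1;
        if ($mc->get("rows_page_$next") === false) {
            $mc->set("rows_page_$next", fetchRows($pdo, $next * $size, $size), 0, 600);
        }
        return $rows;
    }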
From your comment I take it you're doing a LIKE %..% query and want to paginate the result. First of all, investigate whether FULLTEXT indices are an option for you, as they should perform better. If that's not an option, you can add a simple cache like so:
Treat each unique search term as an id, i.e. if in your URL you have ..?search=foobar, then "foobar" is the id of the result set. Keep that in all your links, e.g. ..?search=foobar&page=2.
If the result set does not yet exist (see below), create it:
Query the database with your slow query.
Get all the results into an array. Don't overdo it, you don't want to be storing hundreds of megabytes.
Create a unique filename per query, e.g. sha1($query), or maybe sha1(strtolower($query)).
Serialize the data and store it in the file.
Get the data from the file, unserialize it, and display the portion of the array corresponding to the requested page.
Occasionally, delete old cached results. You can do that with something like if (rand(1, 100) == 1) .., which will run the cleanup job once per 100 queries on average. Strike a balance between server load and data freshness. Cache invalidation is a topic whole books can be written about, BTW.
That's a simple poor man's cache implementation; a sketch of it follows. It's not great, but if you have absolutely nothing else to work with, it's better than running slow queries over and over.
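Here's a minimal sketch of the steps above; the cache directory, table, and page size are assumptions:

    <?php
    function searchCached(PDO $pdo, $term, $page, $perPage = 20)
    {
        $dir = '/tmp/search_cache';
        if (!is_dir($dir)) {
            mkdir($dir, 0777, true);
        }
        $file = $dir . '/' . sha1(strtolower($term)); // unique file per query

        if (is_file($file)) {
            $results = unserialize(file_get_contents($file));
        } else {
            $stmt = $pdo->prepare('SELECT * FROM articles WHERE body LIKE ?');
            $stmt->execute(array('%' . $term . '%')); // the slow query
            $results = $stmt->fetchAll(PDO::FETCH_ASSOC);
            file_put_contents($file, serialize($results));
        }

        // Occasionally purge cache files older than an hour.
        if (rand(1, 100) == 1) {
            foreach (glob($dir . '/*') as $old) {
                if (filemtime($old) < time() - 3600) {
                    unlink($old);
                }
            }
        }

        // Return only the requested page of the result set.
        return array_slice($results, ($page - 1) * $perPage, $perPage);
    }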
APC is the Alternative PHP Cache and works only with PHP, whereas Memcached works independently with any language.
I have about 10 tables with ~10,000 rows each which need to be pulled very often.
For example, list of countries, list of all schools in the world, etc.
PHP can't persist this stuff in memory (to my knowledge), so I would have to query the server with a SELECT * FROM table every time. Should I use memcached here? At first thought it's a clear, absolute yes, but on second thought, wouldn't MySQL already be caching this for me, making it almost redundant?
I don't have too much understanding of how mysql caches data (or if it even does cache entire tables).
You could use the MySQL query cache, but then you are still using DB resources to establish the connection and execute the query. Another option is opcode caching if your pages are relatively static. However, I think memcached is the most flexible solution. For example, if you have a list of countries that needs to be accessed from various code points within your application, you can pull the data from the persistent store (MySQL) once and store it in memcached. Then the data is available to any part of your application (including batch processes and cron jobs) for any business requirement.
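A minimal sketch of that read-through pattern with the Memcached extension; the table and key names are assumptions:

    <?php
    function getCountries(PDO $pdo, Memcached $mc)
    {
        $countries = $mc->get('countries');
        if ($countries === false) {
            $countries = $pdo->query('SELECT id, name FROM countries')
                             ->fetchAll(PDO::FETCH_ASSOC);
            $mc->set('countries', $countries, 86400); // refresh daily
        }
        return $countries;
    }

    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
    $mc  = new Memcached();
    $mc->addServer('127.0.0.1', 11211);

    $countries = getCountries($pdo, $mc);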
I'd suggest reading up on the MySQL query cache:
http://dev.mysql.com/doc/refman/5.6/en/query-cache.html
You do need some kind of a cache here, certainly; layers of caching within and surrounding the database are considerably less efficient than what memcached can provide.
That said, if you're concluding that the Right Thing is to cache the query itself, rather than the content you're generating based on the query, you're jumping to conclusions -- more analysis is needed.
What data, other than the content of these queries, is used during output generation? Would a page cache or page fragment cache (or caching reverse-proxy in front) make more sense? Is it really necessary to run these queries "often"? How frequently does the underlying data change? Do you have any kind of a notification event when that happens?
Also, SELECT * queries without a WHERE clause are a "code smell" (indicating that something probably is being done the Wrong Way), especially if not all of the data pulled is directly displayed to the user.
I'm concerned about my page loading speed; I know there are a lot of factors that affect page load time.
Is it faster to retrieve records (categories) from an array instead of the DB?
Thanks
It is faster to keep it all in PHP until you have an absurd number of records and use up your RAM.
BUT both of these things are super fast. Selecting a handful of records from a single table that has an index should take less than a millisecond. Are you sure you know the source of your web page's slowness?
I would be a little bit cautious about having your data in your code. It will make your system less maintainable. How will users change categories?
This gets back to deciding whether you want your site static or dynamic.
Yes, of course retrieving data from an array is much faster than retrieving it from a database. But arrays and databases usually have totally different use cases: data in an array is static (you type the values into code or a separate file and can't modify them at runtime), while data in a database is dynamic.
Yes, it's probably faster to have an array of your categories directly in your PHP script, especially if you need all the categories on every page load. This makes it possible for APC to cache the array (if you have APC running), and also lessen the traffic to/from the database.
But is this where your bottleneck is? It seems to me as the categories should have been cached in the query cache and therefore be easily retrieved. If this is not your biggest bottleneck, chances are you won't see any decrease in loading times. Make sure to profile your application to find the large bottlenecks or you will waste your time on getting only small performance gains.
If you store categories in a database, you have to connect to the database, prepare a SQL statement, send it to the server, fetch the result set, and (probably) store the results in an array. (But you'll probably already have a connection to the database anyway, and hardware and software is designed to do this kind of work quickly.)
Storing and retrieving categories from a database trades speed for maintenance. Your data is always up to date; it might take a little longer to get it.
You can also store categories as constants or as literals in an array assignment (see the sketch after the quote below). It would be smart to generate the constants or the array literals from data stored in the database, but you might not have to do that on every page load. If "categories" doesn't change much, you might be able to get away with generating the code once or twice a day, plus whenever someone adds a category. It depends on your application.
Storing and retrieving categories from an array trades maintenance for speed. Your data loads a little faster; it might be incomplete.
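A minimal sketch of that generation step, to be run from cron or an "add category" hook; the table, paths, and credentials are placeholders:

    <?php
    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
    $categories = $pdo->query('SELECT id, name FROM categories')
                      ->fetchAll(PDO::FETCH_KEY_PAIR);

    // Write the data out as a PHP array literal.
    $code = "<?php\nreturn " . var_export($categories, true) . ";\n";
    file_put_contents(__DIR__ . '/categories.php', $code);

    // Pages then load the generated file (APC caches the compiled opcode):
    //   $categories = require __DIR__ . '/categories.php';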
The unsatisfying answer is that you're not going to be able to tell how different storage and page generation strategies affect page loading speed until you test them. And even testing isn't that easy, because the effect of changing server and database parameters can be, umm, surprising.
(You can also generate static pages from the database using php. I suggest you test some static pages to give you an idea of "best case" performance.)
I'm creating a web app using PHP and PostgreSQL. The database is too big and some kinds of searches take a lot of time to process.
Users can wait on the first search, but it's painful when they paginate. Can I store the result set in session vars? It would make pagination much easier to deal with.
Yes, of course you can, but with a lot of different result sets, loading that amount of data on every session_start can become cumbersome. Also, the same search could be done by more than one user, resulting in duplicate (time-consuming) retrievals, large sessions (every page load takes longer), and more storage space.
If the data is safe from alteration enough to store in a session, look at caches like APC or Memcached to store the result with a meaningful key. That way, the load is lifted from the database, and searches can be shared amongst users.
If you have a fixed number of items per page, you could even consider storing the different 'pages' with a pre- or postfix in the key, so you don't have to select a subset of the result every time (i.e. store with keys like 'search:foo=bar|1', 'search:foo=bar|2', etc.).
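A minimal sketch of that keying scheme, here with APC; runSlowSearch() is a hypothetical stand-in for the expensive search query:

    <?php
    function getSearchPage($term, $page)
    {
        $key = 'search:' . $term . '|' . $page; // e.g. search:foo=bar|2
        $rows = apc_fetch($key);
        if ($rows === false) {
            $rows = runSlowSearch($term, $page); // hypothetical helper
            apc_store($key, $rows, 900); // shared by all users for 15 minutes
        }
        return $rows;
    }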
Is there a simple way to cache MySQL queries in PHP or, failing that, a small class set that someone has written and made available that will do it? I can cache a whole page, but that won't work, as some data changes and some does not; I want to cache the part that does not.
This is a great overview of how to cache queries in MySQL:
The MySQL Query Cache
You can use Zend Cache to cache results of your queries among other things.
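A minimal sketch with Zend Framework 1's Zend_Cache; the lifetime, cache dir, and query are illustrative:

    <?php
    require_once 'Zend/Cache.php';
    require_once 'Zend/Db.php';

    $cache = Zend_Cache::factory(
        'Core',                 // frontend: generic data cache
        'File',                 // backend: store entries on disk
        array('lifetime' => 600, 'automatic_serialization' => true),
        array('cache_dir' => '/tmp/')
    );

    $db = Zend_Db::factory('Pdo_Mysql', array(
        'host' => 'localhost', 'username' => 'user',
        'password' => 'pass', 'dbname' => 'app',
    ));

    // Load from cache, falling back to the database on a miss.
    if (($rows = $cache->load('recent_posts')) === false) {
        $rows = $db->fetchAll('SELECT * FROM posts ORDER BY created DESC LIMIT 10');
        $cache->save($rows, 'recent_posts');
    }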
I think query_cache_size is 0 by default, which means the cache is off.
Edit your my.cnf file to give it at least a few megabytes.
No PHP changes necessary :)
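For example (the values are illustrative):

    # my.cnf -- enable and size the MySQL query cache
    [mysqld]
    query_cache_type = 1
    query_cache_size = 16M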
It may be complete overkill for what you're attempting, but have a look at eAccelerator or memcache. If you have some queries that change regularly and others that don't, you may not want all of your DB queries cached for the same length of time by MySQL.
Caching engines like the above let you decide, on a query-by-query basis, how long the data should be cached. So say you have data in your header that changes infrequently: you can check whether it's currently in the cache and, if so, return it; otherwise do the query and put the result into the cache with a lifetime of N, so for the next N seconds every page load will pull the data from the cache without going near MySQL.
You're then free to pull your other data "live" from the db as and when required, by-passing the cache.
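A minimal sketch of the per-query lifetime idea, again with the legacy Memcache extension; the helper name and lifetimes are made up:

    <?php
    function cachedQuery(PDO $pdo, Memcache $mc, $sql, $ttl)
    {
        $key = 'q_' . sha1($sql);
        $rows = $mc->get($key);
        if ($rows === false) {
            $rows = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);
            $mc->set($key, $rows, 0, $ttl);
        }
        return $rows;
    }

    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
    $mc  = new Memcache();
    $mc->connect('127.0.0.1', 11211);

    // Header data that changes infrequently: cache for an hour.
    $menu = cachedQuery($pdo, $mc, 'SELECT * FROM menu_items', 3600);

    // Volatile data: by-pass the cache and query live.
    $live = $pdo->query('SELECT * FROM activity_log')->fetchAll(PDO::FETCH_ASSOC);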
I would recommend the whole-page caching route. If some of the data changes, simply place tokens/placeholders where the dynamic data would go. Cache the entire page with those tokens in place, then post-process the cached page, replacing the tokens with the dynamic data. You now have a cached page that contains dynamic content.
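A minimal sketch of that technique with APC; the token syntax and the renderHomePage() and countCartItems() helpers are made up:

    <?php
    $page = apc_fetch('page_home');
    if ($page === false) {
        // Expensive render, with the dynamic spots left as tokens,
        // e.g. "Hello {{USERNAME}}, you have {{CART_COUNT}} items."
        $page = renderHomePage(); // hypothetical helper
        apc_store('page_home', $page, 600);
    }

    // Post-process the tokens with per-request data on every hit.
    $username = isset($_SESSION['username']) ? $_SESSION['username'] : 'guest';
    $page = str_replace('{{USERNAME}}', htmlspecialchars($username), $page);
    $page = str_replace('{{CART_COUNT}}', (string) countCartItems(), $page);
    echo $page;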