I have a PHP page on my website that runs over 100 MySQL queries. All the queries are different, and all are plain SELECT queries against multiple tables. On average, the page takes about 5 seconds to load, and I want to improve this time.
What optimization methods do I have? I did some research and looked into memcache, but I don't know how it works, what it can do, or whether it applies to my situation, so help would be appreciated.
I was also thinking of a query caching program, but I don't know of any I can use.
Any help?
There are a number of options for MySQL.
The first is to set up the query cache in your MySQL config (available up to MySQL 5.7; it was removed in 8.0). If your program is SELECT-heavy, try turning low-priority-updates on. This gives SELECT statements higher priority on the server than INSERT/DELETE/UPDATE statements.
Changing MySQL's use of memory might be a good idea, especially if you use a lot of JOIN statements - I usually set the join_buffer_size to about 8M.
From a PHP point-of-view, try caching results.
Edit: the class down the forum page that Suresh Kamrushi posted is a nice way of caching in PHP.
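To make "caching results" a bit more concrete, here is a minimal sketch of a result cache with a time-to-live, in plain PHP. The class name QueryResultCache and its remember() method are hypothetical and are not the class mentioned above. The array only lives for one request; in practice you would probably swap it for a shared store such as memcached so the cache survives between page loads.

```php
<?php
// Hypothetical helper: caches the result of an expensive callback under a key.
final class QueryResultCache
{
    /** @var array<string, array{expires: float, value: mixed}> */
    private array $entries = [];

    /**
     * Return the cached value for $key, or call $loader, store its result and return it.
     */
    public function remember(string $key, int $ttlSeconds, callable $loader)
    {
        $now = microtime(true);
        if (isset($this->entries[$key]) && $this->entries[$key]['expires'] > $now) {
            return $this->entries[$key]['value']; // cache hit: no query is run
        }
        $value = $loader(); // cache miss: the real query would run here
        $this->entries[$key] = ['expires' => $now + $ttlSeconds, 'value' => $value];
        return $value;
    }
}

// Usage: the closure stands in for a real SELECT.
$cache = new QueryResultCache();
$calls = 0;
$loadUsers = function () use (&$calls) {
    $calls++;
    return [['id' => 1, 'name' => 'alice']];
};
$first  = $cache->remember('users:all', 60, $loadUsers);
$second = $cache->remember('users:all', 60, $loadUsers); // served from the cache
```

The second call never reaches the loader, which is the whole point: a page that repeats the same SELECT only pays for it once per TTL window.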
Below are some points which might be useful to optimize your page load:
MySQL:
Enable Query Cache
Select with only specific columns, avoid select * from syntax
Avoid correlated subqueries
Use Indexing
Avoid too many queries. If possible then try to use joins/unions
PHP:
Use singleton methodology to avoid multiple database instances
If possible, try calculation work in SQL as well.
HTML:
CDN to load images/js/css in parallel
Sprite images
JS include in footer
I originally had a page that ran 60 MySQL queries, which was obviously flawed. The page took a couple of seconds to load. So I changed the code to one MySQL query and used PHP sessions/arrays to arrange the 60 results. The page now loads much faster, almost instantly, but I'm wondering: is this approach better than the MySQL one, design-wise? I have an incrementing session key that is set in a while loop (60 iterations); each session entry holds an array, which I then sort.
Both are bad, as yoda said.
You have to move in a completely different direction:
Sensibly reduce number of queries. There is nothing actually bad in having 60 queries, and a page could have it and still load in a fraction of second. But it would be wise to remove unnecessary ones.
Optimize query runtime. Determine which query runs slowly and optimize it by running DESC on the query (or rather EXPLAIN EXTENDED followed by SHOW WARNINGS), adding indexes and so on.
It's impossible to say more about such a vague question without a single example query.
Reduce number of queries.
Optimize queries for performance.
Use memcache or APC instead of sessions. Sessions are not made for this purpose.
I want to make sure AJAX responses from dynamic JSON pages do not slow down the server when the SQL queries take too long. I'm using PHP and MySQL with Apache2. My idea was to reduce the execution time of these pages, either with an inline ini_set() call or with set_time_limit(). Is this effective? Are there any alternatives, or a MySQL syntax equivalent for limiting query time?
These pages are used, for example, with jQuery UI autosuggestions and similar features, which are better off not working at all if they are going to slow down the server.
If it makes sense for your application, then go ahead and set_time_limit with the desired max execution time. However, it most likely makes more sense to tweak your queries and introduce caching of query results.
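One thing worth knowing: according to the PHP manual, on non-Windows systems set_time_limit() does not count time spent waiting on database queries, so on its own it will not stop a slow query. If you are on MySQL 5.7.8 or later (an assumption; older versions do not support it), the server itself can abort a long-running SELECT through the MAX_EXECUTION_TIME optimizer hint, given in milliseconds. A minimal sketch; the helper name with_max_execution_time is made up for illustration:

```php
<?php
/**
 * Add a MAX_EXECUTION_TIME optimizer hint to a SELECT statement.
 * Assumes MySQL 5.7.8+; the hint only applies to SELECT statements.
 */
function with_max_execution_time(string $selectSql, int $milliseconds): string
{
    if ($milliseconds <= 0) {
        throw new InvalidArgumentException('Timeout must be a positive number of milliseconds');
    }
    $count = 0;
    $hint = 'SELECT /*+ MAX_EXECUTION_TIME(' . $milliseconds . ') */';
    // Replace only the leading SELECT keyword (case-insensitive).
    $result = preg_replace('/^\s*SELECT\b/i', $hint, $selectSql, 1, $count);
    if ($count !== 1) {
        throw new InvalidArgumentException('Only SELECT statements are supported');
    }
    return $result;
}

// Example: give an autosuggest lookup at most half a second on the server.
$sql = with_max_execution_time('SELECT name FROM users WHERE name LIKE ?', 500);
```

When the limit is hit, MySQL returns an error for that statement, so the autosuggest endpoint can catch it and send back an empty list instead of tying up the server.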
The memory_get_usage function can be used to find out how much memory your script is using once it has fetched the query results.
As you said, you can set a time limit. But how will this improve your code?
If your MySQL query is going to take 5 minutes and you set the time limit to 2 minutes, what will happen?
The main thing is optimizing the MySQL query itself.
If you are going to fetch a huge amount of data,
try to fetch it in blocks:
set a limit so you fetch 1000 rows, then the next 1000.
Use indexing.
Optimize your joins if you are joining tables.
You can also use stored procedures if that works for your application.
MySQL 5 has stored procedures.
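The "fetch in blocks" advice above can be sketched with PDO and a generator. The function name fetch_in_blocks, the items table and its id column are all hypothetical. The demo uses an in-memory SQLite database purely so that it runs without a server; the same LIMIT/OFFSET query works on MySQL.

```php
<?php
/**
 * Yield rows from $table in blocks of $blockSize, ordered by the `id` column.
 * $table is interpolated into the SQL, so it must be a trusted, hard-coded name.
 */
function fetch_in_blocks(PDO $pdo, string $table, int $blockSize = 1000): Generator
{
    $stmt = $pdo->prepare("SELECT * FROM {$table} ORDER BY id LIMIT :limit OFFSET :offset");
    for ($offset = 0; ; $offset += $blockSize) {
        $stmt->bindValue(':limit', $blockSize, PDO::PARAM_INT);
        $stmt->bindValue(':offset', $offset, PDO::PARAM_INT);
        $stmt->execute();
        $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
        if ($rows === []) {
            return; // no more data
        }
        yield $rows;
        if (count($rows) < $blockSize) {
            return; // that was the last, partial block
        }
    }
}

// Demo data: 2500 rows in an in-memory SQLite table (assumption for runnability).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE items (id INTEGER PRIMARY KEY, label TEXT)');
$pdo->beginTransaction();
$insert = $pdo->prepare('INSERT INTO items (label) VALUES (?)');
for ($i = 1; $i <= 2500; $i++) {
    $insert->execute(["item $i"]);
}
$pdo->commit();

$blockSizes = [];
foreach (fetch_in_blocks($pdo, 'items', 1000) as $block) {
    $blockSizes[] = count($block); // process each block here instead of holding all rows
}
```

Only one block is held in memory at a time. Note that large OFFSET values get slower, because the server still has to walk past the skipped rows; for very deep paging, filtering on the last seen id (WHERE id > ?) tends to scale better.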
So I have a site that I am working on that has been touched by many developers over time, and as new features arose, the developers at the time felt it necessary to just add another query to get the data they needed. This leaves me with a PHP page that is slow and runs maybe 70 queries. Some of the queries are in the actual PHP file for this page, and some are scattered throughout many different functions. Now I have the responsibility of speeding up the page to meet certain requirements. I am seeking the best course of action.
Is there a way to print all the queries that are running, other than going through the file and finding each and every one?
Should I cache the queries that are slow using memcached?
Does anyone have an idea that would help me speed up the page?
Is there a plugin or tool to analyze the queries? I am using YSlow, and there is nothing there for looking at queries.
Something I do is to have a my_mysql_query(...) function that behaves as mysql_query(...) but which I can then tailor to log out the execution time together with the text of the query. MySQL can log slow queries with very little fiddling - see here.
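A rough sketch of such a wrapper, using PDO rather than the old mysql_query() (which no longer exists in PHP 7 and later). The function name my_logged_query and the $queryLog array are hypothetical, and the in-memory SQLite connection is only there so the example runs without a server.

```php
<?php
/**
 * Run a query through PDO and record its text and duration in $log.
 */
function my_logged_query(PDO $pdo, string $sql, array $params, array &$log): array
{
    $start = microtime(true);
    $stmt = $pdo->prepare($sql);
    $stmt->execute($params);
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
    $log[] = ['sql' => $sql, 'seconds' => microtime(true) - $start];
    return $rows;
}

$pdo = new PDO('sqlite::memory:'); // stand-in for the real MySQL connection
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$queryLog = [];
$rows = my_logged_query($pdo, 'SELECT 42 AS answer', [], $queryLog);

// At the end of the request, write the slowest queries first.
usort($queryLog, function ($a, $b) {
    return $b['seconds'] <=> $a['seconds'];
});
foreach ($queryLog as $entry) {
    error_log(sprintf('%.4fs  %s', $entry['seconds'], $entry['sql']));
}
```

Routing every query on the page through one function like this also answers the "print all the queries" question: the log at the end of the request is the full list, with timings.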
If there is no central query method that is called to run each query, then the only option is to look for each query and find where it is in the code. Otherwise, you could go to that query function and print each query that runs through it.
Using cache will depend on how often the data changes. If it changes frequently, it may not give you any performance boost to cache it.
One idea to help you speed up the page is to do the following:
group like queries into the same query and use the data in multiple parts
consider breaking the page into multiple locations
Depending on the database you are using, there are analysis functions that will help you optimize your queries. For example, you can use EXPLAIN with MySQL (http://dev.mysql.com/doc/refman/5.0/en/explain.html). You may also want to consider consulting a DBA on the issue.
Good luck.
Recently I've been working on quite a big project with PHP + MySQL, and now I'm concerned about my MySQL setup. What should I do to make it as optimal as possible? Tell me everything you know; I'll be very grateful.
Second question: I use one MySQL query per page load to fetch information. It's quite a big query, because I take information from a few tables with a join. Maybe I should do something else?
Thank you.
Some top tips from MySQL Performance tips forge
Specific Query Performance:
Use EXPLAIN to profile the query execution plan
Use Slow Query Log (always have it on!)
Don't use DISTINCT when you have or could use GROUP BY
Insert performance
1. Batch INSERT and REPLACE
2. Use LOAD DATA instead of INSERT
LIMIT m,n may not be as fast as it sounds
Don't use ORDER BY RAND() if you have > ~2K records
Use SQL_NO_CACHE when you are SELECTing frequently updated data or large sets of data
Avoid wildcards at the start of LIKE queries
Avoid correlated subqueries in SELECT and WHERE clauses (try to avoid IN)
Scaling Performance Tips:
Use benchmarking
Isolate workloads: don't let administrative work (e.g. backups) interfere with customer performance.
Debugging sucks, testing rocks!
As your data grows, indexing may change (cardinality and selectivity change). Structuring may want to change. Make your schema as modular as your code. Make your code able to scale. Plan and embrace change, and get developers to do the same.
Network Performance Tips:
Minimize traffic by fetching only what you need.
1. Paging/chunked data retrieval to limit
2. Don't use SELECT *
3. Be wary of lots of small quick queries if a longer query can be more efficient
Use multi_query if appropriate to reduce round-trips
Use stored procedures to avoid bandwidth wastage
OS Performance Tips:
Use proper data partitions
1. For Cluster. Start thinking about Cluster before you need them
Keep the database host as clean as possible. Do you really need a windowing system on that server?
Utilize the strengths of the OS
Pare down cron scripts
Create a test environment
Learn to use the explain tool.
Three things:
Joins are not necessarily suboptimal. Oftentimes schemata that use joins will be faster than those that achieve the same but avoid table joins. The important thing is to know that your joins are optimal. EXPLAIN is very helpful but you also need to know how indexes work.
If you're grabbing data from the DB on every page hit, consider whether a caching system would work for you. If so, check out PHP memcache and memcached. It's easy to use in PHP and very fast. It's popular for a reason.
Back to MySQL: make sure your key buffer is sized correctly. You can also think about using dedicated key buffers for critical indices that should remain in cache. Read about CACHE INDEX and LOAD INDEX INTO CACHE. See also here.
"...because I take information from a few tables with a join"
Joins, even "big" joins aren't bad. Just be sure that you have good indexes.
Also note that performance with a couple of records is a lot different than performance with hundreds of thousands of records, so test accordingly.
For performance, this book is good: High Performance MySQL. The associated blog is good too.
My 2 cents: enable log_slow_queries with long_query_time set below 2 seconds and use mysqlsla (get it from hackmysql.com) to analyse the 'slow' queries. This way you can drill down into the slower queries as they come along.
(mysqlsla can also benefit from the log-queries-not-using-indexes option)
On hackmysql.com there's also a script called 'mysqlreport' that gives estimates of how your installation is running (once it has been running for a while) and gives pointers on where to tune your setup more precisely.
Being perfect is a bit of a challenge and not the first target to set yourself.
Enable mysql logging of all queries, and write some code which parses the log files and removes any literal values from the SQL statements.
e.g. changes
SELECT * FROM atable WHERE something=5 AND other='splodgy';
and
SELECT * FROM atable WHERE something=1 AND other='zippy';
to something like:
SELECT * FROM atable WHERE something=:1 AND other=:2;
(Sorry, I've not got my code which does this to hand - but it's not rocket science)
Then shove the re-written log into a table so you can prioritize your performance fixes based on length and frequency of execution.
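Since the original code isn't to hand, here is one possible sketch of the literal-stripping step in PHP. The function name normalize_query is hypothetical, and the regex only handles single-quoted strings and plain numbers, which is enough for the two example queries above but not for every SQL dialect quirk.

```php
<?php
/**
 * Replace string and numeric literals with numbered placeholders (:1, :2, ...)
 * so that queries differing only in their values collapse to the same text.
 */
function normalize_query(string $sql): string
{
    $n = 0;
    // Either a single-quoted string (allowing \' and '' escapes) or a bare number.
    $pattern = "/'(?:[^'\\\\]|\\\\.|'')*'|\\b\\d+(?:\\.\\d+)?\\b/";
    return preg_replace_callback($pattern, function ($match) use (&$n) {
        $n++;
        return ':' . $n;
    }, $sql);
}

$logged = [
    "SELECT * FROM atable WHERE something=5 AND other='splodgy';",
    "SELECT * FROM atable WHERE something=1 AND other='zippy';",
];

// Count how often each normalized statement appears, most frequent first.
$counts = array_count_values(array_map('normalize_query', $logged));
arsort($counts);
```

Both logged statements collapse to the same normalized text, so $counts ends up with a single entry whose count is 2. Add the execution times from the log alongside the counts and you have the prioritized list described above.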
I am not a professional programmer, so I can't be sure about this. How many MySQL queries do your scripts send per page, and what is the optimal number of queries? For example, Stack Overflow's homepage lists questions and shows the authors of those questions. Does Stack Overflow send a MySQL query for each question to get the author's information, or does it send one query, get all the user data, and match it with the questions?
I like to keep mine under 8.
Seriously though, that's pretty meaningless. If hypothetically there was a reason for you to have 800 queries in a page, then you could go ahead and do it. You'll probably find that the number of queries per page simply depends on what you're doing, though in normal circumstances I'd be surprised to see over 50 (though these days, it can be hard to realise just how many you're doing if you are abstracting your DB calls away).
Slow queries matter more
I used to be frustrated at a certain PHP based forum software which had 35 queries in a page and ran really slow, but that was a long time ago and I know now that the reason that particular installation ran slow had nothing to do with having 35 queries in a page. For example, only one or two of those queries took most of the time. It just had a couple of really slow queries, that were fixed by well-placed indexes.
I think that identifying and fixing slow queries should come before identifying and eliminating unnecessary queries, as it can potentially make a lot more difference.
Consider even that three fast queries might be significantly quicker than one slow query - number of queries does not necessarily relate to speed.
I have one page (which is actually kind of a test case/diagnostic tool designed to be run only by an admin) which has over 800 queries but it runs in a matter of seconds. I guess they are all really simple queries.
Try caching
There are various ways to cache parts of your application which can really cut down on the number of queries you do, without reducing functionality. Libraries like memcached make this trivially easy these days and yet run really fast. This can also help improve performance a lot more than reducing the number of queries.
If queries are really unnecessary, and the performance really is making a difference, then remove/combine them
Just consider looking for slow queries and optimizing them, or caching their results, first.
Don't focus on the number of queries. This is not a useful metric. Instead, you need to look at a few other things:
how many queries are duplicated?
how many queries have intersecting datasets? or are a subset of another?
how long do they take to run? have you profiled the common ones to check indices?
how many are unnecessarily complex?
Numerous times I've seen three simpler queries together execute in a tenth of the time of one complex one that returned the same information. By the same token, SQL is powerful, but don't go mad trying to do something in SQL that would be easier and simpler in a loop in PHP.
how much progressive processing are you doing?
If you can't avoid longer queries with large datasets, try to re-arrange the algorithm so that you can process the dataset as it comes from the database. This lets you use an unbuffered query in MySQL, which improves your memory usage. And if you can produce output whilst you're doing this, you can improve your page's perceived speed by providing the first output sooner.
how much can you cache some of this data? Even caching it for a few seconds can help immensely.
There really is no optimal number of queries. Obviously, the fewer queries you make, the better.
If you are using some kind of ORM like Hibernate, Propel, Doctrine, etc., it will generate queries differently than if you were to write the SQL by hand. So if StackOverflow uses an ORM, they might have more than one query accessing the questions and the users that created the questions. Or they might just use a join with straight SQL.
It really depends on the technology you are using and what it actually does behind the scenes to generate the SQL.
Things you should be researching to understand this better:
Lazy loading
Object Relational Mapping
I recently started refactoring some older code of mine and I realised that I had used a lot of queries inside loops because back then I didn't know how to write SQL queries with subqueries and joins, etc. So I went and integrated these nested queries into one query so I could retrieve all the data at once and then loop over it in a nested way.
In some cases this made the page load significantly faster.
Ergo: It's definitely worth learning about the possibilities of SQL so you can start doing more with SQL and less with PHP.
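As an illustration of that refactoring, the sketch below shows a query-inside-a-loop next to the equivalent single JOIN. The users and questions tables and their columns are invented for the example, and the in-memory SQLite database is only used so the code runs without a MySQL server.

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (u_id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec('CREATE TABLE questions (q_id INTEGER PRIMARY KEY, u_id INTEGER, title TEXT)');
$pdo->exec("INSERT INTO users VALUES (1, 'ann'), (2, 'bob')");
$pdo->exec("INSERT INTO questions VALUES (10, 1, 'First'), (11, 2, 'Second'), (12, 1, 'Third')");

// Before: one query for the list plus one extra query per row.
$slow = [];
$lookup = $pdo->prepare('SELECT name FROM users WHERE u_id = ?');
$questions = $pdo->query('SELECT q_id, u_id, title FROM questions ORDER BY q_id')
                 ->fetchAll(PDO::FETCH_ASSOC);
foreach ($questions as $q) {
    $lookup->execute([$q['u_id']]);
    $slow[] = $q['title'] . ' by ' . $lookup->fetchColumn();
}

// After: a single JOIN returns the same data in one round trip.
$fast = [];
$sql = 'SELECT q.title, u.name
        FROM questions q
        JOIN users u ON u.u_id = q.u_id
        ORDER BY q.q_id';
foreach ($pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC) as $row) {
    $fast[] = $row['title'] . ' by ' . $row['name'];
}
```

With three questions the loop version runs four queries and the JOIN version runs one; with sixty rows on a page the difference is sixty-one round trips versus one.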
I would not say that there is an optimal number of queries to be on any given script, but rather you have a goal when optimising; ordinarily time is the main concern, among other things.
If time is the only concern, you could optimise your queries so that several of them together execute in less time than a single other query would.
This is how I view optimisation: I have an objective, so how do I best achieve it? Is there any information that you can cache? Based on your indexes, would a particular order of filters in your query perform better?
My point, optimisation is best done on both the Db end and the application end.
You may want to read more on database optimisation.
As few as you need and no more. There is no rule of thumb here. Some websites require a lot of db access and others don't.
SO actually has only a few DB calls if it's written as I think. On a page like this one, an answer to a question, there would be:
1) session verification, if you are logged in.
2) current user info, to get the user bar at the top of the screen and your medal count.
3) get the question info as well as the questioner's/last editor's info.
4) retrieve a count of tags used in this question.
5) select all responses and responder data in one shot.
And that's about it. The fun part is how much is keyed off the question:
-- this returns one row per revision
select q.*, u.name, u.u_id, u.points, u.gmedal, u.smedals, u.bmedals
from questions q left outer join users u on q.u_id = u.u_id
where q.q_id = :q_id;
-- this is used to display the tags below the question and the tag counts on the right
select t.name, count(*)
from tags t left join tags q on q.tagid = t.tagid
where t.q_id = :q_id
group by t.name;
-- this can also get multiple revisions
select a.*, u.name, u.u_id, u.points, u.gmedal, u.smedals, u.bmedals
from answers a left outer join users u on a.u_id = u.u_id
where a.q_id = :q_id;
This assumes that the various counts (vote-ups, favored question) are cached on the table as well as stored separately.
The optimal number is as many as you need to display the information the user expects. I always try to keep it in the single digits. For information that takes a few queries, but rarely changes, I cache the results in a generic cache table so it only takes one query. Store it as a serialized array to retain an easy to access structure.
When I first installed WordPress, I was appalled that the base install did over 20 queries! Plugins would increase that number (some by quite a bit). But with caching, that could be reduced to zero (SuperCache). If your content changes every 10 minutes, why generate it dynamically every hit?
At the very extreme is a platform like Facebook, where every page is unique content, customized to the user viewing it. You have to query every time.
But regardless, I rarely see the need to hit double digits query counts.
0 would be optimal if you are prioritizing speed.