I originally had a page that ran 60 MySQL queries, which was obviously flawed. The page took a couple of seconds to load. So I changed the code to one MySQL query and used PHP sessions/arrays to arrange the 60 results. The page now loads much faster, almost instantly, but I'm wondering: is this approach better than the MySQL one, design-wise? I have an incrementing session variable that is set in a while loop (60 iterations); each one holds an array, which I then sort.
Both are bad, as Yoda would say.
You have to move in a completely different direction:
Sensibly reduce the number of queries. There is nothing inherently bad about having 60 queries; a page could run that many and still load in a fraction of a second. But it would be wise to remove unnecessary ones.
Optimize query runtime. Determine which query runs slowly and optimize it, using EXPLAIN (or rather EXPLAIN EXTENDED + SHOW WARNINGS; see the sketch below), adding indexes and so on.
It's impossible to say more for such a vague question without a single example query.
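As a generic illustration of the EXPLAIN step only, here is a minimal sketch; the table, column, and query are made up:

```php
<?php
// Hypothetical example: check how MySQL executes a suspect query.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

$sql  = "SELECT * FROM articles WHERE author_id = 42";   // a slow-query candidate
$plan = $pdo->query("EXPLAIN " . $sql)->fetchAll(PDO::FETCH_ASSOC);

foreach ($plan as $row) {
    // type=ALL with a large rows estimate usually means a missing index.
    printf("table=%s type=%s key=%s rows=%s\n",
           $row['table'], $row['type'], $row['key'] ?: 'NULL', $row['rows']);
}
```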
Reduce the number of queries.
Optimize queries for performance.
Use memcache or APC instead of sessions. Sessions are not made for this purpose.
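As a rough illustration of the memcache idea (a sketch only; the server address, cache key, query, and TTL are assumptions):

```php
<?php
// Sketch: cache a query result in memcached instead of the session.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

$key  = 'page:top60';                    // hypothetical cache key
$rows = $mc->get($key);

if ($rows === false) {                   // cache miss: hit the database once
    $pdo  = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');
    $rows = $pdo->query("SELECT id, title FROM items ORDER BY title LIMIT 60")
                ->fetchAll(PDO::FETCH_ASSOC);
    $mc->set($key, $rows, 300);          // keep for 5 minutes
}
```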
So I've been developing an application that needs to compare a lot of data. At most (I'm assuming) about 28,800 rows will need to be compared against another table at once.
So what I'm doing right now is making 28,800 AJAX calls to a database, and it works, but it's obviously not the fastest procedure in the world.
Would it be faster to iterate over an array of, say, about 80,000 elements 28,800 times and just make 2 DB calls?
I'm assuming it is not, but just figured I'd ask to be sure.
For a little background knowledge, check out an earlier question I posted:
How to avoid out of memory error in a browser due to too many ajax calls
Thank you!
Unless there is some very specific reason you are making that many calls, it sounds like there must be an easier way to retrieve all this data, have the database do some of the work, or rethink what is happening.
Can you write a database script that is passed some params and does the comparison?
Chances are the database can work faster than looping over all that data 28,800 times, as the database uses other methods for comparing and retrieving data. You are, in a way, comparing brute-force looping to a database engine.
Can you write a query that does the comparison?
It sounds like you are pulling the data and then doing more work on it; can you pull the data you need with a single query?
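For instance, a single query can often do the whole comparison in one round trip (a sketch only; the table and column names are invented):

```php
<?php
// Sketch: let MySQL do the comparison instead of 28,800 separate AJAX calls.
$pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

// Find rows in `readings` that have no matching row in `reference_values`.
$sql = "SELECT r.id, r.value
          FROM readings r
          LEFT JOIN reference_values v ON v.value = r.value
         WHERE v.value IS NULL";

$missing = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);
echo count($missing) . " rows had no match\n";
```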
The general rule that I have found is that MySQL is massively fast at simple queries, especially when indexed, and abominably slow at complex ones involving multiple joins or nested subqueries.
This doesn't actually answer your question, though, because it's too vague. MySQL retrieval performance depends critically on which indexes are applied, and also on the complexity of the query.
So there are no general rules. You must try both and see which is better.
Likewise, JavaScript performance varies wildly between browsers.
A final comment: some things that are simple to express in procedural code, like checking the result of one query to shape the next, are massively easier to code than working out the optimal SQL syntax to combine them into a single query. And even then, you may find that the MySQL server builds massive temporary tables to do it.
I have a PHP page on my website that uses over 100 MySQL queries. All the queries are different, and all of them are just SELECT queries from multiple tables. On average, the page takes about 5 seconds to load, and I wish to improve this time.
What methods of optimization do I have? I did some research and took a look at memcache (I don't know how it works, what it can do, or whether it applies to my situation, so help would be appreciated), but as I said, I don't know if it is applicable to my situation.
I was also thinking of a query caching program, but I don't know of any I can use.
Any help?
There are a number of options for MySQL.
The first is to set up the query cache in your MySQL config. If your program is SELECT-heavy, try setting low-priority-updates to on. This gives SELECT statements higher priority on the server, and INSERT/DELETE/UPDATE statements lower priority.
Changing MySQL's use of memory might be a good idea, especially if you use a lot of JOIN statements - I usually set the join_buffer_size to about 8M.
From a PHP point-of-view, try caching results.
Edit: the class down the forum page that Suresh Kamrushi posted is a nice way of caching in PHP.
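One very simple way of caching results on the PHP side is sketched below; it assumes APC is installed, and the key scheme and 60-second TTL are arbitrary:

```php
<?php
// Sketch: cache an expensive SELECT's result in APC for a short time.
function cached_query(PDO $pdo, $sql, $ttl = 60) {
    $key  = 'q:' . md5($sql);
    $rows = apc_fetch($key, $hit);
    if (!$hit) {
        $rows = $pdo->query($sql)->fetchAll(PDO::FETCH_ASSOC);
        apc_store($key, $rows, $ttl);
    }
    return $rows;
}
```

On newer setups, apcu_fetch()/apcu_store() work the same way.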
Below are some points which might be useful to optimize your page load:
MySQL:
Enable Query Cache
Select only the specific columns you need; avoid SELECT *
Avoid correlated subqueries
Use indexing
Avoid too many queries; if possible, use joins/unions
PHP:
Use a singleton pattern to avoid multiple database instances (see the sketch after this list)
If possible, push calculation work into SQL as well
HTML:
Use a CDN to load images/JS/CSS in parallel
Sprite images
Include JS in the footer
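A minimal sketch of the singleton point from the PHP list above (the class name, credentials, and query are placeholders):

```php
<?php
// Sketch: one shared PDO instance for the whole request.
class Db
{
    private static $pdo = null;

    public static function get()
    {
        if (self::$pdo === null) {
            self::$pdo = new PDO(
                'mysql:host=localhost;dbname=app;charset=utf8',
                'user', 'pass',
                array(PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION)
            );
        }
        return self::$pdo;
    }
}

// Every part of the page reuses the same connection:
$rows = Db::get()->query("SELECT id, title FROM articles LIMIT 10")->fetchAll();
```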
My question really revolves around the repetitive use of a large amount of data.
I have about 50MB of data that I need to cross-reference repeatedly during a single PHP page execution. This task is most easily solved by using SQL queries with table joins. The problem is the sheer volume of data that I need to process in a very short amount of time and the number of queries required to do it.
What I am currently doing is dumping the relevant part of each table (usually in excess of 30% or 10k rows) into an array and looping. The table joins are always on a single field, so I built a really basic 'index' of sorts to identify which rows are relevant.
The system works. It's been in my production environment for over a year, but now I'm trying to squeeze even more performance out of it. On one particular page I'm profiling, the second-highest total time is attributed to the increment line that loops through these arrays. Its hit count is 1.3 million, for a total execution time of 30 seconds. This represents the work that would have been performed by about 8,200 SQL queries to achieve the same result.
What I'm looking for is anyone else who has run into a situation like this. I really can't believe that I'm anywhere near the first person to have large amounts of data that need to be processed in PHP.
Thanks!
Thank you very much to everyone who offered advice here. It looks like there isn't really a silver bullet here like I was hoping. I think what I'm going to end up doing is using a mix of MySQL memory tables and some version of a paged memcache.
This depends closely on what you are doing with the data, but I found that using unique-value columns as array keys speeds things up a lot when you are trying to look up a row given a certain value in a column.
This is because PHP uses a hash table to store array keys for fast lookups. It's hundreds of times faster than iterating over the array or using array_search.
But without seeing a code example it's hard to say.
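For illustration, a minimal sketch of the idea (the column and variable names are hypothetical):

```php
<?php
// Sketch: index rows by a unique column once, then look them up via the key.
$rows = array(
    array('id' => 17, 'name' => 'foo'),
    array('id' => 42, 'name' => 'bar'),
    // ... thousands more rows loaded from the database
);

// Build the "index": unique column value => row.
$byId = array();
foreach ($rows as $row) {
    $byId[$row['id']] = $row;
}

// A key lookup uses PHP's hash table instead of scanning the whole array.
$row = isset($byId[42]) ? $byId[42] : null;
```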
Added from comment:
The next step is to use some kind of in-memory database. You can use MEMORY tables in MySQL, or SQLite. It also depends on how much of your running environment you control, because those methods need more memory than a shared hosting provider would usually allow. They would probably also simplify your code, thanks to grouping, sorting, aggregate functions, etc.
Well, I'm looking at a similar situation in which I have a large amount of data to process, and a choice between doing as much as possible via MySQL queries or off-loading it to PHP.
So far, my experience has been this:
PHP is a lot slower than using MySQL queries.
MySQL query speed is only acceptable if I cram the logic into a single call, as the latency between calls is severe.
I'm particularly shocked by how slow PHP is for looping over an even modest amount of data. I keep thinking/hoping I'm doing something wrong...
I want to make sure AJAX responses from dynamic JSON pages do not slow down the server when the SQL queries take too long. I'm using PHP and MySQL with Apache2. My idea was to use ini_set() inline to reduce the maximum execution time of these pages, or to use the set_time_limit() method. Is this effective? Are there any alternatives, or a MySQL syntax equivalent for limiting query time?
These are used, for example, with jQuery UI autosuggestions and similar features, which are better off not working at all than slowing down the server.
If it makes sense for your application, then go ahead and set_time_limit with the desired max execution time. However, it most likely makes more sense to tweak your queries and introduce caching of query results.
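A hedged sketch of both sides: capping PHP's execution time with set_time_limit(), and, on MySQL 5.7.8+, capping a single SELECT with the MAX_EXECUTION_TIME optimizer hint (in milliseconds). The table, column, and request parameter are made up; note that on Unix, time spent waiting on the database does not count toward PHP's limit:

```php
<?php
// Sketch: keep an autosuggest endpoint from tying up the server.
set_time_limit(2);   // PHP side: abort the script after ~2 seconds of script time

$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

// MySQL side (5.7.8+): abort this SELECT if it runs longer than 500 ms.
$stmt = $pdo->prepare(
    "SELECT /*+ MAX_EXECUTION_TIME(500) */ name
       FROM products
      WHERE name LIKE CONCAT(?, '%')
      LIMIT 10"
);
$stmt->execute(array($_GET['term']));
echo json_encode($stmt->fetchAll(PDO::FETCH_COLUMN));
```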
The memory_get_usage() function can be used to find out how much memory your script, including the fetched query results, is using.
As you said, you can set a time limit. But how will this improve your code?
If your MySQL query is going to take 5 minutes and you set the time limit to 2 minutes, what will happen?
The main thing is optimizing the MySQL query itself.
If you are going to fetch huge amounts of data:
try to fetch it in blocks;
set a LIMIT, e.g. fetch 1,000 rows, then the next 1,000 (see the sketch below);
use indexing;
optimize your joins if you are joining tables.
You can also use a stored procedure if that works for your application.
MySQL 5 has stored procedures.
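A sketch of the "fetch in blocks" idea; the table name, batch size of 1,000, and keyset pagination on an auto-increment id are all assumptions:

```php
<?php
// Sketch: process a large table in 1,000-row chunks, keyed on the id column.
$pdo    = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$lastId = 0;

do {
    $stmt = $pdo->prepare(
        "SELECT id, payload FROM big_table WHERE id > ? ORDER BY id LIMIT 1000"
    );
    $stmt->execute(array($lastId));
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

    foreach ($rows as $row) {
        // ... process one row ...
        $lastId = $row['id'];
    }
} while (count($rows) === 1000);
```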
Okay, so I'm sure plenty of you have built crazy database intensive pages...
I am building a page that I'd like to pull all sorts of unrelated database information from. Here are some sample different queries for this one page:
article content and info
IF the author is a registered user, their info
UPDATE the article's view counter
retrieve comments on the article
retrieve information for the authors of the comments
if the reader of the article is signed in, query for info on them
etc...
I know these are basically going to be lightning quick, and that I could combine some, but I wanted to make sure that this isn't abnormal.
How many fairly normal and un-heavy queries would you limit yourself to on a page?
As many as needed, but not more.
Really: don't worry about optimization (right now). Build it first, measure performance second, and IFF there is a performance problem somewhere, then start with optimization.
Otherwise, you risk spending a lot of time on optimizing something that doesn't need optimization.
I've had pages with 50 queries on them without a problem. A fast query to a non-large (ie, fits in main memory) table can happen in 1 millisecond or less, so you can do quite a few of those.
If a page loads in less than 200 ms, you will have a snappy site. A big chunk of that is being used by latency between your server and the browser, so I like to aim for < 100ms of time spent on the server. Do as many queries as you want in that time period.
The big bottleneck is probably going to be the amount of time you have to spend on the project, so optimize for that first :) Optimize the code later, if you have to. That being said, if you are going to write any code related to this problem, write something that makes it obvious how long your queries are taking. That way you can at least find out you have a problem.
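One simple way to make query times visible is a tiny timing wrapper (a sketch; where you log is up to you):

```php
<?php
// Sketch: log how long each query takes so slow ones stand out.
function timed_query(PDO $pdo, $sql) {
    $start = microtime(true);
    $stmt  = $pdo->query($sql);
    $ms    = (microtime(true) - $start) * 1000;
    error_log(sprintf("%.1f ms  %s", $ms, $sql));
    return $stmt;
}
```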
I don't think there is any one correct answer to this. I'd say as long as the queries are fast, and the page follows a logical flow, there shouldn't be any arbitrary cap imposed on them. I've seen pages fly with a dozen queries, and I've seen them crawl with one.
Every query requires a round-trip to your database server, so the cost of many queries grows larger with the latency to it.
If it runs on the same host there will still be a slight speed penalty, not only because there is a socket between your application and the server, but also because the server has to parse your query, build the response, check access, and handle whatever other overhead comes with SQL servers.
So in general it's better to have less queries.
You should try to do as much as possible in SQL, though: don't get stuff as input for some algorithm in your client language when the same algorithm could be implemented without hassle in SQL itself. This will not only reduce the number of your queries but also help a great deal in selecting only the rows you need.
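For instance, rather than pulling every row out just to sum it up in PHP, let the database aggregate (a sketch; the table and columns are made up):

```php
<?php
// Sketch: aggregate in SQL instead of looping over rows in PHP.
$pdo = new PDO('mysql:host=localhost;dbname=shop', 'user', 'pass');

// One query returning exactly the totals we need, keyed by customer_id.
$totals = $pdo->query(
    "SELECT customer_id, SUM(amount) AS total
       FROM orders
      GROUP BY customer_id"
)->fetchAll(PDO::FETCH_KEY_PAIR);
```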
Piskvor's answer still applies in any case.
WordPress, for instance, can run up to 30 queries per page. There are several things you can use to reduce the load on MySQL, one of them being memcache, but for now, since as you say it will be straightforward, just make sure all the data you pull is properly indexed in MySQL and don't worry much about the number of queries.
If you're using a framework (CodeIgniter, for example) you can generally pull page creation times and check what's slowing your site down.
As others have said, there is no single number. Whenever possible, use SQL for what it was built for and retrieve sets of data together.
Generally, an indication that you may be doing something wrong is when you have SQL inside a loop.
When possible, use joins to retrieve data that belongs together instead of sending several statements (see the sketch below).
Always try to make sure your statements retrieve exactly what you need with no extra fields/rows.
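A sketch of the "join instead of a query per row" point, using the article/comments scenario from the question (the schema is hypothetical):

```php
<?php
// Sketch: fetch comments and their authors in one query
// instead of one query per comment.
$pdo = new PDO('mysql:host=localhost;dbname=blog', 'user', 'pass');

$sql = "SELECT c.id, c.body, u.username
          FROM comments c
          JOIN users u ON u.id = c.user_id
         WHERE c.article_id = ?";

$stmt = $pdo->prepare($sql);
$stmt->execute(array(123));            // hypothetical article id
$comments = $stmt->fetchAll(PDO::FETCH_ASSOC);
```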
If you need the queries, you should just use them.
What I always try to do is have them all executed at once in the same place, so that there is no need for different parts of the page (if they're separated...) to make database connections. I figure it's more efficient to store everything in variables than to have every part of a page connect to the database.
In my experience, it is better to make two queries and post-process the results than to make one query that you don't have to post-process but that takes ten times longer to run. That said, it is also better not to repeat queries if you already have the result, and there are many different ways this can be achieved.
But all of that is oriented around performance optimization. So unless you really know what you're doing (hint: most people in this situation don't), just make the queries you need for the data you need and refactor it later.
I think that you should be limiting yourself to as few queries as possible. Try to combine queries to multitask and save time.
Premature optimisation is a problem, as people have mentioned before, but that's when you're crapping up your code just to make it run 'fast'. People take this 'maxim' too far, though.
If you want to design with scalability in mind, just make sure whatever you do to load data is sufficiently abstracted and that calls are centralized; this will make it easier when you need to implement a shared-memory cache, as you'll only have to change a few things in a few places.
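A sketch of what "sufficiently abstracted and centralized" might look like; the function name, query map, and schema are made up:

```php
<?php
// Sketch: route all data loading through one function, so adding a shared
// cache later means changing this one place instead of every page.
function load_data(PDO $pdo, $name, array $params = array()) {
    static $queries = array(
        'article'  => "SELECT * FROM articles WHERE id = ?",
        'comments' => "SELECT * FROM comments WHERE article_id = ?",
    );

    // A shared-memory cache (APC, memcached, ...) could be checked here first.

    $stmt = $pdo->prepare($queries[$name]);
    $stmt->execute($params);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Usage:
$pdo      = new PDO('mysql:host=localhost;dbname=blog', 'user', 'pass');
$article  = load_data($pdo, 'article',  array(123));
$comments = load_data($pdo, 'comments', array(123));
```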