php and MySQL: 2 requests or 1 request? - php

I'm building a wepage in php using MySQL as my database.
Which way is faster?
2 requests to MySQL with the folling query.
SELECT points FROM data;
SELECT sum(points) FROM data;
1 request to MySQL. Hold the result in a temporary array and calcuale the sum in php.
$data = SELECT points FROM data;
EDIT -- the data is about 200-500 rows

It's really going to depend on a lot of different factors. I would recommend trying both methods and seeing which one is faster.

Since Phill and Kibbee have answered this pretty effectively, I'd like to point out that premature optimization is a Bad Thing (TM). Write what's simplest for you and profile, profile, profile.

How much data are we talking about? I'd say MySQL is probably faster at doing those kind of operations in the majority of cases.
Edit: with the kind of data that you're talking about, it probably won't make masses of difference. But databases tend to be optimised for those kind of queries, whereas PHP isn't. I think the second DB query is probably worth it.

If you want to do it in one line, use a running total like this:
SET #total=0;
SELECT points, #total:=#total+points AS RunningTotal FROM data;

I wouldn't worry about it until I had an issue with performance.

If you go with two separate queries, you need to watch out for the possibility of the data changing between getting the rows & getting their sum. Until there's an observable performance problem, I'd stick to doing my own summation to keep the page consistent.

The general rule of thumb for efficiency with mySQL is to try to minimize the number of SQL requests. Every call to the database adds overhead and is "expensive" in terms of time required.
The optimization done by mySQL is quite good. It can take very complex requests with many joins, nestings and computations, and make it run efficiently.
But it can only optimize individual requests. It cannot check the relationship between two different SQL statements and optimize between them.
In your example 1, the two statements will make two requests to the database and the table will be scanned twice.
Your example 2 where you save the result and compute the sum yourself would be faster than 1. This would only be one database call, and looping through the data in PHP to get the sum is faster than a second call to the database.

Just for the fun of it.
SELECT COUNT(points) FROM `data`
UNION
SELECT points FROM `data`
The first row will be the total, the next rows will be the data.
NOTE: Union can be slow, but its an option.
Could also do more fun and this supports you sorting the rows.
SELECT 'total' AS name, COUNT(points) FROM `data`
UNION
SELECT 'points' AS name, points FROM `data`
Then selecting through PHP
while($row = mysql_fetch_assoc($query))
{
if($row["data"] == "points")
{
echo $row["points"];
}
if($row["data"] == "total")
{
echo "Total is: ".$row["points"];
}
}

You can use union like this:
(select points, null as total from data) union (select null, sum(points) from data group by points);
The result will look something like this:
point total
2 null
5 null
...
null 7
you can figure out how to handle it.

do it the mySQL way. let the database manager do its work.
mySQL is optimized for such tasks

Related

Why mysql query takes long before received results?

I have these 2 tables with thousands of data and i will look for matches 1:1 on both tables. I have the query below, but when the limit is already beyond 1000 the return starts to slow down, it took almost an hour before I receive a result, I am using my local xampp as database for this and have a 4G(3.4G usable) RAM PC.
Is there any other way to enhance and make the query faster?
Thank you in advance for those who will help.
SELECT a.rNum,
a.cDate,
a.cTime,
a.aNumber,
a.bNumber,
a.duration,
a.tag,
a.aNumber2,
a.bNumber2,
'hasMatch',
a.concatDate,
a.timeMinutes
FROM tableOne a
INNER JOIN
tableTwo b ON a.aNumber2 = b.aNumber2
AND a.bNumber2 = b.bNumber2
WHERE a.hasMatch = 'valid'
AND (a.duration - b.duration) <= 3
AND (a.duration - b.duration) >= -3
AND TIMEDIFF(a.concatDate,b.concatDate) <= 3
AND TIMEDIFF(a.concatDate,b.concatDate) >= -3
LIMIT 0,100;
The obvious question is: do you have indexes?
As written, the best indexes to try on on tableOne(aNumber1, aNumber2) and tableTwo(bNumber1, bNumber2).
If you provide sample data, desired results, and an explanation of what you are trying to do, there might be further suggestions.
You can try many steps in a nutshell
First perform an EXPLAIN statement and act accordingly
Make sure the compared attributes are of equal length
Make your keys as short as possible
Consider adding indexes where you need too, (in the WHERE clause if the values are suitable for indexing, and you perform this query very often)
Make an ANALYZE TABLE
Avoid any redundant data or with NULL values as much as possible

Joining a count query mysql for performance

Have searched but can't find an answer which suits the exact needs for this mysql query.
I have the following quires on multiple tables to generate "stats" for an application:
SELECT COUNT(id) as count FROM `mod_**` WHERE `published`='1';
SELECT COUNT(id) as count FROM `mod_***` WHERE `published`='1';
SELECT COUNT(id) as count FROM `mod_****`;
SELECT COUNT(id) as count FROM `mod_*****`;
pretty simple just counts the rows sometimes based on a status.
however in the pursuit of performance i would love to get this into 1 query to save resources.
I'm using php to fetch this data with simple mysql_fetch_assoc and retrieving $res[count] if it makes a difference (pro isn't guaranteed, so plain old mysql here).
The overhead of sending a query and getting a single-row response is very small.
There is nothing to gain here by combining the queries.
If you don't have indexes yet an INDEX on the published column will greatly speed up the first two queries.
You can use something like
SELECT SUM(published=1)
for some of that. MySQL will take the boolean result of published=1 and translate it to an integer 0 or 1, which can be summed up.
But it looks like you're dealing with MULTIPLE tables (if that's what the **, *** etc... are), in which case you can't really. You could use a UNION query, e.g.:
SELECT ...
UNION ALL
SELECT ...
UNION ALL
SELECT ...
etc...
That can be fired off as one single query to the DB, but it'll still execute each sub-query as its own query, and simply aggregate the individual result sets into one larger set.
Disagreeing with #Halcyon I think there is an appreciable difference, especially if the MySQL server is on a different machine, as every single query uses at least one network packet.
I recommend you UNION the queries with a marker field to protect against the unexpected.
As #Halcyon said there is not much to gain here. You can anyway do several UNIONS to get all the result in one query

Speeding up responses from a database query

I am running a select * from table order by date desc query using php on a mysql db server, where the table has a lot of records, which slows down the response time.
So, is there any way to speed up the response. If indexing is the answer, what all columns should I make indexes.
An index speeds up searching when you have a WHERE clause or do a JOIN with fields you have indexed. In your case you don't do that: You select all entries in the table. So using an index won't help you.
Are you sure you need all of the data in that table? When you later filter, search or aggregate this data in PHP, you should look into ways to do that in SQL so that the database sends less data to PHP.
you need to use caching system.
the best i know Memcache It's really great to speed up your application and it's not using database at all.
Simple answer: you can't speed anything up using software.
Reason: you're selecting entire contents of a table and you said it's a large table.
What you could do is cache the data, but not using Memcache because it's got a limit on how much data it can cache (1 MB per key), so if your data exceeds that - good luck using Memcache to cache a huge result set without coming up with an efficient scheme of maintaining keys and values.
Indexing won't help because you haven't got a WHERE clause, what could happen is that you can speed up the order by clause slightly. Use EXPLAIN EXTENDED before your query to see how much time is being spent in transmitting the data over the network and how much time is being spent in retrieving and sorting the data from the query.
If your application requires a lot of data in order for it to work, then you have these options:
Get a better server that can push the data faster
Redesign your application because if it requires so much data in order to run, it might not be designed with efficiency in mind
Optimizing Query is a big topic and beyond the scope this question
here are some highlight that will boost you select statement
Use proper Index
Limit the number records
use the column name that you require (instead writing select * from table use select col1, col2 from table)
to limit query for large offset is little tricky in mysql
this select statement for large offset will be slow because it have to process large set of data
SELECT * FROM table order by whatever LIMIT m, n;
to optimize this query here is simple solution
select A.* from table A
inner join (select id from table order by whatever limit m, n) B
on A.id = B.id
order by A.whatever

Querying multiple tables and only returning a single array?

Hey all, I am wondering what the best way to do the following is:
Query up to 5 tables (depends on user input) pull back the results and return a single array. Lets say I query table A and have the results stored as a result handle (return from the pg_query() fn), shall I go ahead and convert those to an array, and then continue appending the results from the subsequent queries to that? If so can I get away with simply using the array_merge() function?
Sounds like you want a UNION.
use a union:
SELECT col1, col2
FROM a
UNION
SELECT col1, col2
FROM b
Using a UNION might be the answer only if your results are going to be low. UNION's are a little taxing on the DB server and some DB's have a hard time optimizing them. By using a Store Procedure or Making 5 Queries then doing an array merge, will allow the DB server to cache some results and improve DB performance.
If the result set is going to be low and there are not going to be much calls, then using a UNION should be fine.

Optimizing a PHP page: MySQL bottleneck

I have a page that is taking 37 seconds to load. While it is loading it pegs MySQL's CPU usage through the roof. I did not write the code for this page and it is rather convoluted so the reason for the bottleneck is not readily apparent to me.
I profiled it (using kcachegrind) and find that the bulk of the time on the page is spent doing MySQL queries (90% of the time is spent in 25 different mysql_query calls).
The queries take the form of the following with the tag_id changing on each of the 25 different calls:
SELECT * FROM tbl_news WHERE news_id
IN (select news_id from
tbl_tag_relations WHERE tag_id = 20)
Each query is taking around 0.8 seconds to complete with a few longer delays thrown in for good measure... thus the 37 seconds to completely load the page.
My question is, is it the way the query is formatted with that nested select that is causing the problem? Or could it be any one of a million other things? Any advice on how to approach tackling this slowness is appreciated.
Running EXPLAIN on the query gives me this (but I'm not clear on the impact of these results... the NULL on primary key looks like it would be bad, yes? The number of results returned seems high to me as well as only a handful of results are returned in the end):
1 PRIMARY tbl_news ALL NULL NULL NULL NULL 1318 Using where
2 DEPENDENT SUBQUERY tbl_tag_relations ref FK_tbl_tag_tags_1 FK_tbl_tag_tags_1 4 const 179 Using where
I'e addressed this point in Database Development Mistakes Made by AppDevelopers. Basically, favour joins to aggregation. IN isn't aggregation as such but the same principle applies. A good optimize will make these two queries equivalent in performance:
SELECT * FROM tbl_news WHERE news_id
IN (select news_id from
tbl_tag_relations WHERE tag_id = 20)
and
SELECT tn.*
FROM tbl_news tn
JOIN tbl_tag_relations ttr ON ttr.news_id = tn.news_id
WHERE ttr.tag_id = 20
as I believe Oracle and SQL Server both do but MySQL doesn't. The second version is basically instantaneous. With hundreds of thousands of rows I did a test on my machine and got the first version to sub-second performance by adding appropriate indexes. The join version with indexes is basically instantaneous but even without indexes performs OK.
By the way, the above syntax I use is the one you should prefer for doing joins. It's clearer than putting them in the WHERE clause (as others have suggested) and the above can do certain things in an ANSI SQL way with left outer joins that WHERE conditions can't.
So I would add indexes on the following:
tbl_news (news_id)
tbl_tag_relations (news_id)
tbl_tag_relations (tag_id)
and the query will execute almost instantaneously.
Lastly, don't use * to select all the columns you want. Name them explicitly. You'll get into less trouble as you add columns later.
The SQL Query itself is definitely your bottleneck. The query has a sub-query in it, which is the IN(...) portion of the code. This is essentially running two queries at once. You can likely halve (or more!) your SQL times with a JOIN (similar to what d03boy mentions above) or a more targeted SQL query. An example might be:
SELECT *
FROM tbl_news, tbl_tag_relations
WHERE tbl_tag_relations.tag_id = 20 AND
tbl_news.news_id = tbl_tag_relations.news_id
To help SQL run faster you also want to try to avoid using SELECT *, and only select the information you need; also put a limiting statement at the end. eg:
SELECT news_title, news_body
...
LIMIT 5;
You also will want to look into the database schema itself. Make sure you are indexing all of the commonly referred to columns so that the queries will run faster. In this case, you probably want to check your news_id and tag_id fields.
Finally, you will want to take a look at the PHP code and see if you can make one single all-encompassing SQL query instead of iterating through several seperate queries. If you post more code we can help with that, and it will probably be the single greatest time savings for your posted problem. :)
If I understand correctly, this is just listing the news stories for a specific set of tags.
First of all, you really shouldn't
ever SELECT *
Second, this can probably be
accomplished within a single query,
thus reducing the overhead cost of
multiple queries. It seems like it
is getting fairly trivial data so
it could be retrieved within a
single call instead of 20.
A better approach to using IN might be to use a JOIN with a WHERE condition instead. When using an IN it will basically be a lot of OR statements.
Your tbl_tag_relations should definitely have an index on tag_id
select *
from tbl_news, tbl_tag_relations
where
tbl_tag_relations.tag_id = 20 and
tbl_news.news_id = tbl_tag_relations.news_id
limit 20
I think this gives the same results, but I'm not 100% sure. Sometimes simply limiting the results helps.
Unfortunately MySQL doesn't do very well with uncorrelated subqueries like your case shows. The plan is basically saying that for every row on the outer query, the inner query will be performed. This will get out of hand quickly. Rewriting as a plain old join as others have mentioned will work around the problem but may then cause the undesired affect of duplicate rows.
For instance the original query would return 1 row for each qualifying row in the tbl_news table but this query:
SELECT news_id, name, blah
FROM tbl_news n
JOIN tbl_tag_relations r ON r.news_id = n.news_id
WHERE r.tag_id IN (20,21,22)
would return 1 row for each matching tag. You could stick DISTINCT on there which should only have a minimal performance impact depending on the size of the dataset.
Not to troll too badly, but most other databases (PostgreSQL, Firebird, Microsoft, Oracle, DB2, etc) would handle the original query as an efficient semi-join. Personally I find the subquery syntax to be much more readable and easier to write, especially for larger queries.

Categories