Mysql Comparison of 'WHERE' methods - php

Is there a difference in speed between these two WHERE clauses? Is the index still used for the columns in the second one?
1. SELECT * FROM TableName WHERE col1 = 'a' AND col2 = 'b' AND col3='c'
2. SELECT * FROM TableName WHERE (col1,col2,col3) = ('a','b','c')
The table is defined with:
PRIMARY KEY (col1,col2,col3)
Thanks

There shouldn't be, but you can use EXPLAIN to find out in the context of your database.

Use EXPLAIN to determine the execution plan for the queries.
If EXPLAIN shows that they are the same, then the only time difference possible would be the parse time of the query string, which is insignificant compared to running the query.
Since you said EXPLAIN shows the same plan, just pick whichever one you prefer; it won't matter.
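As a rough illustration of the "compare the plans" advice above, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (so it uses SQLite's EXPLAIN QUERY PLAN rather than MySQL's EXPLAIN, and needs SQLite 3.15 or later for row values); the payload column and the sample rows are made up for the example. On a real MySQL server you would simply prefix each query with EXPLAIN and compare the output.

```python
import sqlite3

# SQLite is only a stand-in for MySQL here; the sample data is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableName ("
    " col1 TEXT, col2 TEXT, col3 TEXT, payload TEXT,"
    " PRIMARY KEY (col1, col2, col3))"
)
conn.executemany(
    "INSERT INTO TableName VALUES (?, ?, ?, ?)",
    [("a", "b", "c", "hit"), ("a", "b", "d", "miss"), ("x", "y", "z", "miss")],
)

separate = "SELECT * FROM TableName WHERE col1 = 'a' AND col2 = 'b' AND col3 = 'c'"
row_value = "SELECT * FROM TableName WHERE (col1, col2, col3) = ('a', 'b', 'c')"


def plan(sql):
    """Return the human-readable detail text of each plan step."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


rows_separate = conn.execute(separate).fetchall()
rows_row_value = conn.execute(row_value).fetchall()

# Print both plans side by side so they can be compared by eye.
print(plan(separate))
print(plan(row_value))
print(rows_separate == rows_row_value)
```

Whatever the plans look like on your server, both forms should at least return the same rows, which is what the last line checks.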

You will get more information by using
EXPLAIN EXTENDED

Related

Using IN clause vs. multiple SELECTs

I was wondering which of these would be faster (performance-wise) to query (on MySQL 5.x CentOS 5.x if this matters):
SELECT * FROM table_name WHERE id=1;
SELECT * FROM table_name WHERE id=2;
.
.
.
SELECT * FROM table_name WHERE id=50;
or...
SELECT * FROM table_name WHERE id IN (1,2,...,50);
I have around 50 ids to query for. I know usually DB connections are expensive, but I've seen the IN clause isn't so fast either [sometimes].
I'm pretty sure the second option gives you the best performance; one query, one result. You have to start looking for > 100 items before it may become an issue.
See also the accepted answer from here: MySQL "IN" operator performance on (large?) number of values
IMHO you should try it and measure the response time: IN should give you better performance...
Anyway if your ids are sequential you could try
SELECT * FROM table_name WHERE id BETWEEN 1 AND 50
Here is another post where they discuss the performance of using OR vs IN: IN vs OR in the SQL WHERE Clause
You suggested using multiple queries, but using OR would also work.
The 2nd will be faster, because resources are consumed each time a query is parsed and each time PHP communicates with MySQL to send a query and wait for its result. If your data is sequential and starts at 1, you can also just do
SELECT * FROM table_name WHERE id <= 50;
I was researching this after experimenting with 3000+ values in an IN clause. It turned out to be many times faster than individual SELECTs, since the column referenced in the IN was not keyed. My guess is that in my case it only needed to build a temporary index for that column once instead of 3000 separate times.
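Following the "try it and measure" advice in the answers above, a measurement could be sketched like this. It assumes Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, and the val column and the 1000 sample rows are invented for the example. Because SQLite runs in-process, there is no network round trip, so the absolute timings say nothing about a real MySQL server; only the shape of the comparison carries over.

```python
import sqlite3
import time

# In-memory SQLite stand-in; the table contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [(i, f"row {i}") for i in range(1, 1001)],
)
ids = list(range(1, 51))

# Variant 1: one SELECT per id.
start = time.perf_counter()
one_by_one = []
for i in ids:
    one_by_one.extend(
        conn.execute("SELECT * FROM table_name WHERE id = ?", (i,)).fetchall()
    )
t_separate = time.perf_counter() - start

# Variant 2: a single SELECT with an IN list built from placeholders.
placeholders = ", ".join("?" for _ in ids)
start = time.perf_counter()
in_rows = conn.execute(
    f"SELECT * FROM table_name WHERE id IN ({placeholders}) ORDER BY id", ids
).fetchall()
t_in = time.perf_counter() - start

print(f"{len(ids)} separate queries: {t_separate:.6f}s")
print(f"one IN query: {t_in:.6f}s")
```

Building the IN list from placeholders rather than from concatenated values keeps the query parameterized even when the list of ids comes from user input.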

Array in php mysql query

When I use arrays in a MySQL query, it's really slow. Are there any tricks that make this faster?
e.g:
SELECT *
FROM posts
WHERE type IN ('1','2','5')
ORDER BY id ASC
takes much longer than other queries.
If type is integer type, then remove apostrophes:
WHERE type IN (1, 2, 5)
If not, change the type column's data type to integer.
There are a number of things that can speed up queries. I'd consider looking into the following:
Normalizing. Perhaps the structure of your tables isn't the most efficient?
Indexing. This will improve query times if you KNOW what you want to search on.
EXPLAIN will tell you why your query is slow. Based on that you can make adjustments to your query or your table structure to solve the problem. In this case, use it like this:
EXPLAIN SELECT * FROM posts WHERE type IN (1,2,5) ORDER BY id ASC
Here you need to look at two things.
First, ('1','2','5') contains strings, not integers; it should be (1,2,5).
Second, you can add an index on the type field.

Optimizing SQL query

I have to get all entries in database that have a publish_date between two dates. All dates are stored as integers because dates are in UNIX TIMESTAMP format...
The following query works perfectly but it takes "too long". It returns all entries made between 10 and 20 days ago.
SELECT * FROM tbl_post WHERE published < (UNIX_TIMESTAMP(NOW())-864000)
AND published> (UNIX_TIMESTAMP(NOW())-1728000)
Is there any way to optimize this query? If I am not mistaken, it is calling NOW() and UNIX_TIMESTAMP on every entry. I thought that saving the result of these 2 repeating functions into a MySQL @var would make the comparison much faster, but it didn't. The 2nd code I ran was:
SET @TenDaysAgo = UNIX_TIMESTAMP(NOW())-864000;
SET @TwentyDaysAgo = UNIX_TIMESTAMP(NOW())-1728000;
SELECT * FROM tbl_post WHERE fecha_publicado < @TenDaysAgo
AND fecha_publicado > @TwentyDaysAgo;
Another confusing thing was that PHP can't run the above query through mysql_query(); ?!
Please, if you have any comments on this problem it will be more than welcome :)
Luka
Be sure to have an index on published, and make sure it is being used.
EXPLAIN SELECT * FROM tbl_post WHERE published < (UNIX_TIMESTAMP(NOW())-864000) AND published> (UNIX_TIMESTAMP(NOW())-1728000)
should be a good start to see what's going on in the query. To add an index:
ALTER TABLE tbl_post ADD INDEX (published)
PHP's mysql_query function (assuming that's what you're using) can only accept one query per string, so it can't execute the three queries that you have in your second query.
I'd suggest moving that stuff into a stored procedure and calling that from PHP instead.
As for the optimization, setting those variables is about as optimized as you're going to get for your query. You need to make the comparison for every row, and setting a variable provides the quickest access time to the lower and upper bounds.
One improvement in the indexing of the table, rather than in the query itself, would be to cluster the index around fecha_publicado so that MySQL can handle the query for that range of values efficiently. You could do this by making fecha_publicado the PRIMARY KEY of the table, provided its values are unique.
The obvious things to check are, is there an index on the published date, and is it being used?
The way to optimize would be to partition the table tbl_post on the published key according to date ranges (weekly seems appropriate to your query). This is a feature that is available for MySQL, PostgreSQL, Oracle, Greenplum, and so on.
This will allow the query optimizer to restrict the query to a much narrower dataset.
I agree with BraedenP that a stored procedure would be appropriate here. If you can't use one or really don't want to, you can always generate the dates on the PHP side, but they might not match the database exactly unless the two clocks are synced.
It would likely also be quick as 3 separate queries: query for the begin date, query for the end date, then use those values as input to your target query.
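The idea of computing the two bounds once on the application side and passing them in as parameters could look roughly like the sketch below. It assumes Python's built-in sqlite3 module as a stand-in for PHP plus MySQL; the function name posts_between, the index name idx_published, and the three sample rows are hypothetical.

```python
import sqlite3
import time

DAY = 86400  # seconds in one day


def posts_between(conn, now, newest_days=10, oldest_days=20):
    """Return posts published between oldest_days and newest_days ago."""
    upper = now - newest_days * DAY  # computed once, not per row
    lower = now - oldest_days * DAY
    return conn.execute(
        "SELECT id, published FROM tbl_post"
        " WHERE published > ? AND published < ?"
        " ORDER BY published",
        (lower, upper),
    ).fetchall()


# Hypothetical sample data in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_post (id INTEGER PRIMARY KEY, published INTEGER)")
conn.execute("CREATE INDEX idx_published ON tbl_post (published)")
now = int(time.time())
conn.executemany(
    "INSERT INTO tbl_post VALUES (?, ?)",
    [(1, now - 5 * DAY), (2, now - 15 * DAY), (3, now - 25 * DAY)],
)

rows = posts_between(conn, now)
print(rows)
```

Because the bounds arrive as plain integers in a single statement, this also sidesteps the one-query-per-call limit of mysql_query mentioned above.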

How to Requery a query?

Consider "Query1", which is quite time-consuming. "Query1" is not static; it depends on the $language_id parameter, which is why I cannot save it on the server.
I would like to query the result of "Query1" with another query statement, which I expect to be fast. I see perhaps 2 ways:
$result = mysql_query("SELECT * FROM raw_data_tbl WHERE ((ID=$language_id) AND (age>13))");
then what? here I want to take result and requery it with something like:
$result2 = mysql_query('SELECT * FROM $result WHERE (Salary>1000)');
Is it possible to create something like a variable-based MySQL query directly on the server side and somehow pass the variable $language_id to it? The second query would then query that query :-)
Thanks...
No, there is no such thing as your second idea.
For the first idea, though, I would go with a single query :
select *
from raw_data
where id = $language_id
and age > 13
and Salary > 1000
Provided you have set the right indexes on your table, this query should be pretty fast.
Here, considering the where clause of that query, I would at least go with an index on these three columns :
id
age
Salary
This should speed things up quite a bit.
For more information on indexes and query optimization, take a look at:
Chapter 7. Optimization
7.3.1. How MySQL Uses Indexes
12.1.11. CREATE INDEX Syntax
With the use of sub queries you can take advantage of MySQL's caching facilities.
SELECT * FROM raw_data_tbl WHERE (ID='eng') AND (age>13);
... and after this:
SELECT * FROM (SELECT * FROM raw_data_tbl WHERE (ID='eng') AND (age>13)) AS t WHERE salary > 1000;
But this is only beneficial in some very rare circumstances.
With the right indexes your query will run fast enough without the need of trickery. In your case:
CREATE INDEX filter1 ON raw_data_tbl (ID, age, salary);
Although the best solution would be to just add conditions from your second query to the first one, you can use temporary tables to store temporary results. But it would still be better if you put that in a single query.
You could also use subqueries, like SELECT * FROM (SELECT * FROM table WHERE ...) AS t WHERE .... Note that MySQL requires an alias for the derived table.
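The recommendation that runs through these answers, a single query with all three conditions backed by a composite index, might be sketched as follows. It assumes Python's built-in sqlite3 module as a stand-in for PHP plus MySQL; the function name filtered, the name column, and the sample rows are hypothetical, while the index definition mirrors the CREATE INDEX statement quoted above.

```python
import sqlite3

# In-memory SQLite stand-in; the rows are made up for the example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE raw_data_tbl (ID TEXT, age INTEGER, salary INTEGER, name TEXT)"
)
conn.execute("CREATE INDEX filter1 ON raw_data_tbl (ID, age, salary)")
conn.executemany(
    "INSERT INTO raw_data_tbl VALUES (?, ?, ?, ?)",
    [
        ("eng", 30, 2000, "kept"),
        ("eng", 12, 5000, "too young"),
        ("eng", 40, 500, "low salary"),
        ("ger", 30, 2000, "other language"),
    ],
)


def filtered(conn, language_id, min_age=13, min_salary=1000):
    """Apply all three conditions in one parameterized query."""
    return conn.execute(
        "SELECT name FROM raw_data_tbl"
        " WHERE ID = ? AND age > ? AND salary > ?",
        (language_id, min_age, min_salary),
    ).fetchall()


result = filtered(conn, "eng")
print(result)
```

Passing the language id as a parameter keeps the query "dynamic" without storing anything on the server and without interpolating the value into the SQL string.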

php and MySQL: 2 requests or 1 request?

I'm building a webpage in PHP using MySQL as my database.
Which way is faster?
2 requests to MySQL with the following queries:
SELECT points FROM data;
SELECT sum(points) FROM data;
1 request to MySQL. Hold the result in a temporary array and calculate the sum in PHP.
$data = SELECT points FROM data;
EDIT -- the data is about 200-500 rows
It's really going to depend on a lot of different factors. I would recommend trying both methods and seeing which one is faster.
Since Phill and Kibbee have answered this pretty effectively, I'd like to point out that premature optimization is a Bad Thing (TM). Write what's simplest for you and profile, profile, profile.
How much data are we talking about? I'd say MySQL is probably faster at doing those kind of operations in the majority of cases.
Edit: with the kind of data that you're talking about, it probably won't make masses of difference. But databases tend to be optimised for those kind of queries, whereas PHP isn't. I think the second DB query is probably worth it.
If you want to do it in one line, use a running total like this:
SET @total=0;
SELECT points, @total:=@total+points AS RunningTotal FROM data;
I wouldn't worry about it until I had an issue with performance.
If you go with two separate queries, you need to watch out for the possibility of the data changing between getting the rows & getting their sum. Until there's an observable performance problem, I'd stick to doing my own summation to keep the page consistent.
The general rule of thumb for efficiency with MySQL is to try to minimize the number of SQL requests. Every call to the database adds overhead and is "expensive" in terms of time required.
The optimization done by MySQL is quite good. It can take very complex requests with many joins, nestings and computations, and make them run efficiently.
But it can only optimize individual requests. It cannot check the relationship between two different SQL statements and optimize between them.
In your example 1, the two statements will make two requests to the database and the table will be scanned twice.
Your example 2 where you save the result and compute the sum yourself would be faster than 1. This would only be one database call, and looping through the data in PHP to get the sum is faster than a second call to the database.
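The single-request approach can be sketched as below, with a second SUM query included only to check that both routes agree. It assumes Python's built-in sqlite3 module as a stand-in for PHP plus MySQL, and the three sample values are invented.

```python
import sqlite3

# In-memory SQLite stand-in with hypothetical sample values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (points INTEGER)")
conn.executemany("INSERT INTO data VALUES (?)", [(2,), (5,), (7,)])

# One request: fetch the rows once and total them on the client side.
points = [row[0] for row in conn.execute("SELECT points FROM data")]
client_total = sum(points)

# Second request, shown only for comparison with the client-side total.
db_total = conn.execute("SELECT SUM(points) FROM data").fetchone()[0]

print(points, client_total, db_total)
```

Summing the rows that were actually fetched also keeps the displayed total consistent with the displayed rows, which is the concern raised earlier about data changing between two separate queries.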
Just for the fun of it.
SELECT SUM(points) FROM `data`
UNION ALL
SELECT points FROM `data`
The first row will be the total, the next rows will be the data.
NOTE: A union can be slow, but it's an option.
For more fun, you could also label the rows, which lets you tell them apart even after sorting.
SELECT 'total' AS name, SUM(points) AS points FROM `data`
UNION ALL
SELECT 'points' AS name, points FROM `data`
Then selecting through PHP
while($row = mysql_fetch_assoc($query))
{
    // "name" tells the two kinds of rows apart; "points" holds the value
    if($row["name"] == "points")
    {
        echo $row["points"];
    }
    if($row["name"] == "total")
    {
        echo "Total is: ".$row["points"];
    }
}
You can use union like this:
(select points, null as total from data) union all (select null, sum(points) from data);
The result will look something like this:
points  total
2       null
5       null
...
null    7
You can then handle the two kinds of rows in PHP.
Do it the MySQL way. Let the database manager do its work.
MySQL is optimized for such tasks.
