Is UNION faster than running separate queries? - php

I have 7 tables that I could UNION on (with a limit of 30)
OR
should I do 7 separate queries (with a limit of 30) and trace through them using PHP?
Which way is faster? More optimal? In the second way I would have to trace through part of the 7 queries simultaneously and find the top 30 I need.

What are your needs?
As @chris wrote before, this may help you:
Complex SQL (Maybe Outer Joins)
select * from (select ... from ... order ... limit 10) a
union all
select * from (select ... from ... order ... limit 10) b
order by ... limit 10
As far as I know (checked on a DB with 50 million rows), it's faster than not using the derived queries.
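To make the two options concrete, here is a small sketch in Python, with an in-memory SQLite database standing in for MySQL. The table names t1..t3, the score column and the limit of 5 are all made up for illustration. It runs the UNION ALL form with a per-table LIMIT, then the "separate queries, merged in application code" form, and checks that both return the same top rows. It says nothing about which one is faster on a real server; that still needs EXPLAIN and timing on your own data.

```python
import heapq
import itertools
import sqlite3

LIMIT = 5
TABLES = ["t1", "t2", "t3"]  # hypothetical table names

conn = sqlite3.connect(":memory:")
for i, name in enumerate(TABLES):
    conn.execute(f"CREATE TABLE {name} (id INTEGER PRIMARY KEY, score INTEGER)")
    # Scores are distinct across tables, so the final ordering has no ties.
    conn.executemany(
        f"INSERT INTO {name} (score) VALUES (?)",
        [(n * len(TABLES) + i,) for n in range(20)],
    )

# Approach 1: one UNION ALL query; each branch keeps only its own top rows.
branches = " UNION ALL ".join(
    f"SELECT * FROM (SELECT score FROM {name} ORDER BY score DESC LIMIT {LIMIT})"
    for name in TABLES
)
union_rows = [
    row[0]
    for row in conn.execute(f"{branches} ORDER BY score DESC LIMIT {LIMIT}")
]

# Approach 2: one query per table, then merge the sorted results client-side.
per_table = [
    [
        row[0]
        for row in conn.execute(
            f"SELECT score FROM {name} ORDER BY score DESC LIMIT {LIMIT}"
        )
    ]
    for name in TABLES
]
merged_rows = list(
    itertools.islice(heapq.merge(*per_table, reverse=True), LIMIT)
)

print(union_rows)
print(merged_rows)
```

The per-table LIMIT is what keeps both forms cheap: no branch ever hands over more rows than the final result could use.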

Before making a decision, you need at least to run both kinds of queries with MySQL's EXPLAIN and analyze the results. Something like this:
EXPLAIN SELECT f1, f2, f3 FROM t1
UNION ALL
SELECT f1, f2, f3 FROM t2;

It depends. If each query produces unique results, UNION ALL is better: you save server round trips, and you can sort the result after the union is performed, e.g.:
select column1 alias1, column2 alias2 from table x where ...
UNION ALL
select column3 alias1, column2 alias2 from table y where ...
...
order by 1
Sorry for my English

Related

combine ORDER BY in UNION query

How do I combine ORDER BY in a UNION query?
I tried this and got an error:
SELECT country.country_name AS res
FROM countries AS country
WHERE (lower (country.country_name) LIKE '%".$_POST['query']."%')
ORDER BY country.lang = '".$_POST['lang']."'
UNION
SELECT sec.loc AS res
FROM itin_secs AS sec
WHERE sec.loc LIKE '%".$_POST['query']."%'
I would recommend:
SELECT res
FROM (SELECT c.country_name AS res, 1 AS priority, c.lang
FROM countries AS c
WHERE lower(c.country_name) LIKE '%".$_POST['query']."%'
UNION ALL -- recommended instead of `UNION`
SELECT sec.loc AS res, 2, NULL
FROM itin_secs AS sec
WHERE sec.loc LIKE '%".$_POST['query']."%'
) x
ORDER BY priority, lang = '".$_POST['lang']."';
This query is quite explicit about the final ordering. It also uses UNION ALL, so the query does not incur the overhead of removing duplicates.
As a secondary issue (I mean, the original query doesn't work), you should parameterize the query. You should really start by writing parameterized queries, so they are natural for anything you want to do.
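As a rough illustration of what "parameterized" means here, the sketch below uses Python's sqlite3 module with an in-memory database in place of PHP and MySQL. The sample rows and the search function are invented for the example. In PHP the same idea would be expressed with PDO or MySQLi prepared statements. The user input is passed as bound values and never spliced into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (country_name TEXT, lang TEXT);
    CREATE TABLE itin_secs (loc TEXT);
    INSERT INTO countries VALUES
        ('France', 'fr'), ('France', 'en'), ('Finland', 'en');
    INSERT INTO itin_secs VALUES ('Frankfurt'), ('Oslo');
""")


def search(conn, query, lang):
    """Hypothetical helper: countries first, then itinerary locations."""
    pattern = f"%{query.lower()}%"
    sql = """
        SELECT res, lang
        FROM (SELECT c.country_name AS res, 1 AS priority, c.lang AS lang
              FROM countries AS c
              WHERE lower(c.country_name) LIKE ?
              UNION ALL
              SELECT sec.loc, 2, NULL
              FROM itin_secs AS sec
              WHERE sec.loc LIKE ?) x
        ORDER BY priority, lang = ?
    """
    # The three placeholders are filled from this tuple, in order.
    return conn.execute(sql, (pattern, pattern, lang)).fetchall()


results = search(conn, "Fra", "en")
# A classic injection attempt is treated as plain text and matches nothing.
injection = search(conn, "' OR '1'='1", "en")
print(results)
print(injection)
```

Note that a comparison such as lang = ? sorts false (0) before true (1), so rows in the requested language come last within their priority group; add DESC to that term if they should come first.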

PHP MySQL newest records from multiple tables

I'm trying to select the latest 10 records from multiple tables (ORDER BY date). For example, 8 of the newest records might be in one table and 2 in another (10 rows in total). Is there a way to select those 10 records?
SELECT *
FROM
( SELECT * FROM x
UNION ALL
SELECT * FROM y
) n
ORDER
BY date DESC
LIMIT 10;
You can maybe use:
(SELECT column_name(s)
FROM table1
ORDER BY date LIMIT 0,8)
UNION ALL
(SELECT column_name(s)
FROM table2
ORDER BY date LIMIT 0,2);
SELECT * FROM (
SELECT some_data AS alias1, date_field AS mydate
FROM table1
UNION ALL
SELECT datazzz AS alias1, another_datefield AS mydate
FROM table2
) AS combined
ORDER BY mydate DESC LIMIT 10
Syntax might need a little bit of tweaking, but that's the gist of it.
Specifically, you need to select whatever data you want out of each of the tables and then use aliases to make sure they have the same column names (otherwise they can't be returned in the same result set). Then after that you need to order by the common date field.

Insert lots of rows with only a number

What is the fastest way to create 899 rows in a table, using only the number? The column isn't autoincrement.
Currently I create a query like this:
$a1=range(100,999);
$a1=implode('),(',$a1);
$a1='INSERT INTO groups (val) VALUES('.$a1.')';
it gives a huge query like this:
INSERT INTO groups (val) VALUES(100),(101),(102),(103),(104),(105),(106),(107),(108),
(109),(110),(111),(112),(113),(114),(115),(116),(117),(118),(119),(120),(121),(122),
(123),(124),(125), etc etc etc....
I wonder if there is a faster and nicer way to do this?
I don't think you have a faster way of doing that. Look at the MySQL documentation:
The time required for inserting a row is determined by the following
factors, where the numbers indicate approximate proportions:
Connecting: (3)
Sending query to server: (2)
Parsing query: (2)
Inserting row: (1 × size of row)
Inserting indexes: (1 × number of indexes)
Closing: (1)
This does not take into consideration the initial overhead to open
tables, which is done once for each concurrently running query.
The size of the table slows down the insertion of indexes by log N,
assuming B-tree indexes.
You can use the following methods to speed up inserts:
If you are inserting many rows from the same client at the same time,
use INSERT statements with multiple VALUES lists to insert several
rows at a time. This is considerably faster (many times faster in some
cases) than using separate single-row INSERT statements. If you are
adding data to a nonempty table, you can tune the
bulk_insert_buffer_size variable to make data insertion even faster.
See Section 5.1.4, “Server System Variables”.
With one query you save the Connecting, Sending query to server and Closing steps, plus MySQL optimizes your query.
Also, if you're only inserting around 1000 rows with so little data, the insertion is very fast, so I wouldn't be worried about performance in this case.
For a range of numbers a smaller query can be used if you want:-
INSERT INTO groups (val)
SELECT Hundreds.a * 100 + Tens.a * 10 + Units.a AS aNumber
FROM
(SELECT 0 AS a UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) Hundreds,
(SELECT 0 AS a UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) Tens,
(SELECT 0 AS a UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) Units
HAVING aNumber BETWEEN 100 AND 999
Not sure this saves you anything much though.
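The digit cross join can be tried out quickly with the sketch below, which uses Python's sqlite3 module and an in-memory database rather than MySQL. The digits CTE replaces the three UNION subqueries, and a WHERE clause replaces HAVING; both are adaptations for SQLite, not the original statement. It confirms that the join yields every number from 100 through 999 exactly once, which is 900 values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "groups" is quoted because it is a keyword in some SQL dialects.
conn.execute('CREATE TABLE "groups" (val INTEGER)')

conn.execute("""
    WITH digits(a) AS (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9))
    INSERT INTO "groups" (val)
    SELECT h.a * 100 + t.a * 10 + u.a
    FROM digits AS h, digits AS t, digits AS u
    WHERE h.a * 100 + t.a * 10 + u.a BETWEEN 100 AND 999
""")

vals = [row[0] for row in conn.execute('SELECT val FROM "groups" ORDER BY val')]
print(len(vals), vals[0], vals[-1])
```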

MySQL 'select WHERE like x OR y' multiple results for one row

Alright, I tried to word the title as well as possible. Here's what I'm looking for...let's say I've got a row with an ID of 3 in a table called 'table' with a 'col1' value of "apple,potato,carrot,squash" that I want to search.
I want to be able to do a search something like this:
SELECT * FROM table WHERE col1 LIKE '%potato%' OR col1 LIKE '%apple%';
...and I want it to result in two separate results for the row with the ID of 3.
I could parse out the results with PHP obviously, but it seems a lot more efficient to just get the results exactly as I want them directly from MySQL. Is there a way to do this?
(Note that this is not a homework assignment or anything, I'm just trying to be as generic as possible for the sake of the example)
You're nullifying the use of indexes with the LIKE '%substring%' query. Using UNION ALL with multiple queries would work, and it's simple. However, one drawback to that method is that MySQL will have to scan all the rows in the database for each subquery.
So, for a query like the following, assuming 1000 records:
SELECT * FROM table WHERE col1 LIKE '%potato%'
UNION ALL
SELECT * FROM table WHERE col1 LIKE '%apple%'
MySQL will have to scan through 2000 records (1000 * 2). Then, you have to process the results, when really, you just want a count. For three search types, it's 3000, etc. It doesn't scale well.
Instead, both for performance, and for simplicity (in processing the results), you can have MySQL do the work all at once with the CASE and SUM statements:
SELECT SUM(CASE
WHEN t.col1 LIKE '%potato%' THEN 1
ELSE 0
END) AS numPotatoes,
SUM(CASE
WHEN t.col1 LIKE '%apple%' THEN 1
ELSE 0
END) AS numApples
FROM table t
This allows MySQL to scan through all the records just once and return your actual counts.
If you really need it to return 2 results then you could do something like:
SELECT * FROM table WHERE col1 LIKE '%potato%'
UNION ALL
SELECT * FROM table WHERE col1 LIKE '%apple%'
Here is a way to formulate the query in a "general" way, making it easier to add in new comparisons:
SELECT t.*
FROM table t join
(select '%potato%' as str union all
select '%apple%'
) comp
on t.col1 like comp.str;
That said, I would suggest the following variant:
SELECT t.*
FROM table t join
(select 'potato' as item union all
select 'apple'
) comp
on find_in_set(comp.item, t.col1) > 0
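Here is a small sketch of that last variant in Python with sqlite3. SQLite has no FIND_IN_SET, so the join condition wraps both sides in commas and uses instr instead; that substitution, the items table name and the second sample row are assumptions made for the example. It returns one result row per matching list item, and unlike LIKE '%apple%' it does not match 'pineapple'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, col1 TEXT);
    INSERT INTO items VALUES
        (3, 'apple,potato,carrot,squash'),
        (4, 'pineapple,leek');
""")


def rows_per_match(conn, wanted):
    """Return one (id, item) row for every wanted item found in col1.

    Expects a non-empty list of search terms.
    """
    values = ", ".join("(?)" for _ in wanted)
    sql = f"""
        WITH comp(item) AS (VALUES {values})
        SELECT t.id, comp.item
        FROM items AS t
        JOIN comp
          ON instr(',' || t.col1 || ',', ',' || comp.item || ',') > 0
        ORDER BY t.id, comp.item
    """
    return conn.execute(sql, list(wanted)).fetchall()


matches = rows_per_match(conn, ["potato", "apple"])
print(matches)
```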

How to quickly SELECT 3 random records from a 30k MySQL table with a where filter by a single query?

Well, this is a very old question that has never gotten a real solution. We want 3 random rows from a table with about 30k records. The table is not that big from MySQL's point of view, but it is representative if it holds the products of a store. Random selection is useful when, for example, a webpage presents 3 random products. We would like a single SQL string solution that meets these conditions:
In PHP, the recordset returned by PDO or MySQLi must have exactly 3 rows.
They have to be obtained by a single MySQL query, without using a stored procedure.
The solution must be quick: on a busy apache2 server, for example, the MySQL query is in many situations the bottleneck. So it has to avoid temporary table creation, etc.
The 3 records must not be contiguous, i.e., they must not be adjacent to one another.
The table has the following fields:
CREATE TABLE Products (
ID INT(8) NOT NULL AUTO_INCREMENT,
Name VARCHAR(255) default NULL,
HasImages INT default 0,
...
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The WHERE constraint is Products.HasImages=1, which fetches only records that have images available to show on the webpage. About one-third of the records meet the condition HasImages=1.
In search of perfection, we first set aside the existing solutions, which have drawbacks:
I. The basic solution using ORDER BY RAND() is too slow, but guarantees 3 truly random records on each query:
SELECT ID, Name FROM Products WHERE HasImages=1 ORDER BY RAND() LIMIT 3;
*CPU about 0.10s, scanning 9690 rows because of the WHERE clause (Using where; Using temporary; Using filesort) on a Debian Squeeze dual-core Linux box. That is not so bad, but it does not scale to a bigger table because a temporary table and filesort are used, and the first query took 8.52s on the test Windows 7 MySQL system. With such poor performance, it is best avoided for a webpage, isn't it?
II. The bright solution by riedsio using JOIN ... RAND(), from "MySQL select 10 random rows from 600K rows fast", adapted here, is only valid for a single random record, because the following query almost always returns contiguous records. In effect, it gets a random run of 3 records with consecutive IDs:
SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT (RAND() * (SELECT MAX(ID) FROM Products)) AS ID)
AS t ON Products.ID >= t.ID
WHERE (Products.HasImages=1)
ORDER BY Products.ID ASC
LIMIT 3;
*CPU about 0.01 - 0.19s, scanning 3200, 9690, 12000 rows or so randomly, but mostly 9690 records, Using where.
III. The best solution seems to be the following one using WHERE ... RAND(), seen on "MySQL select 10 random rows from 600K rows fast" and proposed by bernardo-siu:
SELECT Products.ID, Products.Name FROM Products
WHERE ((Products.Hasimages=1) AND RAND() < 16 * 3/30000) LIMIT 3;
*CPU about 0.01 - 0.03s, scanning 9690 rows, Using where.
Here 3 is the number of rows wanted, 30000 is the record count of the Products table, and 16 is an experimental coefficient that enlarges the selection in order to guarantee that 3 records are selected. I don't know on what basis the factor 16 is an acceptable approximation.
So in the majority of cases we get 3 random records, and it's very quick, but it's not guaranteed: sometimes the query returns only 2 rows, sometimes even no record at all.
All three methods above scan every record of the table that meets the WHERE clause, here 9690 rows.
A better SQL String?
Ugly, but quick and random. Can become very ugly very fast, especially with tuning described below, so make sure you really want it this way.
(SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT RAND()*(SELECT MAX(ID) FROM Products) AS ID) AS t ON Products.ID >= t.ID
WHERE Products.HasImages=1
ORDER BY Products.ID
LIMIT 1)
UNION ALL
(SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT RAND()*(SELECT MAX(ID) FROM Products) AS ID) AS t ON Products.ID >= t.ID
WHERE Products.HasImages=1
ORDER BY Products.ID
LIMIT 1)
UNION ALL
(SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT RAND()*(SELECT MAX(ID) FROM Products) AS ID) AS t ON Products.ID >= t.ID
WHERE Products.HasImages=1
ORDER BY Products.ID
LIMIT 1)
First row appears more often than it should
If you have big gaps between IDs in your table, rows right after such gaps will have a bigger chance of being fetched by this query. In some cases, they will appear significantly more often than they should. This cannot be solved in general, but there's a fix for a common particular case: when there's a gap between 0 and the first existing ID in a table.
Instead of subquery (SELECT RAND()*<max_id> AS ID) use something like (SELECT <min_id> + RAND()*(<max_id> - <min_id>) AS ID)
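The size of the gap effect is easy to work out. The sketch below is plain Python, not something from the answer itself: it computes, for a made-up list of IDs, the exact probability that each row is the first one with ID >= RAND()*MAX(ID). It ignores the HasImages filter and assumes the random value is uniform between 0 and the largest ID.

```python
from fractions import Fraction


def pick_probabilities(ids):
    """Chance that each ID is the first one >= a uniform value in [0, max)."""
    ids = sorted(ids)
    max_id = ids[-1]
    probs = {}
    prev = 0
    for row_id in ids:
        # A row is hit by every random value between the previous ID and
        # its own, so its chance is proportional to the gap in front of it.
        probs[row_id] = Fraction(row_id - prev, max_id)
        prev = row_id
    return probs


probs = pick_probabilities([1, 2, 3, 100])
for row_id, p in probs.items():
    print(row_id, p)
```

With IDs 1, 2, 3 and 100, the row right after the gap is returned 97 times out of 100, while each of the others is returned once in 100.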
Remove duplicates
The query, if used as is, may return duplicate rows. It is possible to avoid that by using UNION instead of UNION ALL. This way duplicates will be merged, but the query no longer guarantees to return exactly 3 rows. You can work around that too, by fetching more rows than you need and limiting the outer result like this:
(SELECT ... LIMIT 1)
UNION (SELECT ... LIMIT 1)
UNION (SELECT ... LIMIT 1)
...
UNION (SELECT ... LIMIT 1)
LIMIT 3
There's still no guarantee that 3 rows will be fetched, though. It just makes it more likely.
SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT (RAND() * (SELECT MAX(ID) FROM Products)) AS ID) AS t ON Products.ID >= t.ID
WHERE (Products.HasImages=1)
ORDER BY Products.ID ASC
LIMIT 3;
Of course the above gives "near" contiguous records: you are feeding it the same ID every time, without much regard to the seed of the rand function.
This should give more "randomness":
SELECT Products.ID, Products.Name
FROM Products
INNER JOIN (SELECT (ROUND((RAND() * (max-min))+min)) AS ID) AS t ON Products.ID >= t.ID
WHERE (Products.HasImages=1)
ORDER BY Products.ID ASC
LIMIT 3;
Where max and min are two values you choose, let's say, for example's sake:
max = select max(id)
min = 225
This statement executes really fast (19 ms on a 30k records table):
$db = new PDO('mysql:host=localhost;dbname=database;charset=utf8', 'username', 'password');
$stmt = $db->query("SELECT p.ID, p.Name, p.HasImages
FROM (SELECT @count := COUNT(*) + 1, @limit := 3 FROM Products WHERE HasImages = 1) vars
STRAIGHT_JOIN (SELECT t.*, @limit := @limit - 1 FROM Products t WHERE t.HasImages = 1 AND (@count := @count -1) AND RAND() < @limit / @count) p");
$products = $stmt->fetchAll(PDO::FETCH_ASSOC);
The idea is to "inject" a new column with randomized values, and then sort by this column. Generating and sorting by this injected column is way faster than the "ORDER BY RAND()" command.
There "might" be one caveat: You have to include the WHERE query twice.
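Read literally, the query keeps two running counters, the rows still to be scanned and the rows still wanted, and keeps each row with probability wanted / remaining. That is the classic one-pass selection sampling scheme. Below is a minimal Python sketch of that reading; the function name and the sample data are made up, and it is an interpretation of the SQL rather than a translation of it. As far as I know, MySQL's documentation also cautions that the evaluation order of user variables assigned and read in the same statement is not guaranteed, so the SQL version relies on behaviour that may change between versions.

```python
import random


def selection_sample(rows, k, rng=random):
    """Pick exactly min(k, len(rows)) rows in a single pass."""
    remaining = len(rows)        # rows not yet looked at
    needed = min(k, remaining)   # rows still to be picked
    picked = []
    for row in rows:
        # Keep this row with probability needed / remaining. When the two
        # are equal the probability is 1, so the target count is always met;
        # when needed reaches 0 the probability is 0.
        if rng.random() < needed / remaining:
            picked.append(row)
            needed -= 1
        remaining -= 1
    return picked


# 9690 stands in for the number of rows with HasImages = 1.
sample = selection_sample(list(range(1, 9691)), 3, random.Random(42))
print(sample)
```

The whole filtered set is still scanned once, but no temporary table or sort is needed, and the picks come back in scan order.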
What about creating another table containing only the items with images? This table will be much lighter, as it will contain only one-third of the items in the original table!
------------------------------------------
|ID | Item ID (on the original table)|
------------------------------------------
|0 | 0 |
------------------------------------------
|1 | 123 |
------------------------------------------
.
.
.
------------------------------------------
|10 000 | 30 000 |
------------------------------------------
You can then generate three random IDs in the PHP part of the code and just fetch them from the database.
I've been testing the following bunch of SQLs on a 10M-record, poorly designed database.
SELECT COUNT(ID)
INTO @count
FROM Products
WHERE HasImages = 1;
PREPARE random_records FROM
'(
SELECT * FROM Products WHERE HasImages = 1 LIMIT ?, 1
) UNION (
SELECT * FROM Products WHERE HasImages = 1 LIMIT ?, 1
) UNION (
SELECT * FROM Products WHERE HasImages = 1 LIMIT ?, 1
)';
SET @l1 = FLOOR(RAND() * @count);
SET @l2 = FLOOR(RAND() * @count);
SET @l3 = FLOOR(RAND() * @count);
EXECUTE random_records USING @l1
, @l2
, @l3;
DEALLOCATE PREPARE random_records;
It took almost 7 minutes to get the three results. But I'm sure its performance will be much better in your case. If you are looking for better performance, though, I suggest the following statements, which took less than 30 seconds to get the job done for me (on the same database).
SELECT COUNT(ID)
INTO @count
FROM Products
WHERE HasImages = 1;
PREPARE random_records FROM
'SELECT * FROM Products WHERE HasImages = 1 LIMIT ?, 1';
SET @l1 = FLOOR(RAND() * @count);
SET @l2 = FLOOR(RAND() * @count);
SET @l3 = FLOOR(RAND() * @count);
EXECUTE random_records USING @l1;
EXECUTE random_records USING @l2;
EXECUTE random_records USING @l3;
DEALLOCATE PREPARE random_records;
Bear in mind that both of these command sequences require the MySQLi driver in PHP if you want to execute them in one go. Their only difference is that the latter one requires calling MySQLi's next_result method to retrieve all three results.
My personal belief is that this is the fastest way to do this.
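For comparison, here is the random-offset idea sketched in Python with sqlite3 and an in-memory table instead of PHP and MySQL. The sample data and the random_products function are invented for the example. Offsets are drawn without repetition from 0 to count - 1, so every offset points at an existing row and no row can come back twice. Each lookup still has to skip over the preceding rows, so the cost grows with the offset.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products ("
    "ID INTEGER PRIMARY KEY, Name TEXT, HasImages INTEGER DEFAULT 0)"
)
# Every third product gets images, roughly like the one-third in the question.
conn.executemany(
    "INSERT INTO Products (Name, HasImages) VALUES (?, ?)",
    [(f"product {n}", 1 if n % 3 == 0 else 0) for n in range(300)],
)


def random_products(conn, k, rng=random):
    """Fetch k distinct random rows that have images, one offset at a time."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM Products WHERE HasImages = 1"
    ).fetchone()
    offsets = rng.sample(range(count), min(k, count))
    rows = []
    for offset in offsets:
        rows.append(
            conn.execute(
                "SELECT ID, Name FROM Products WHERE HasImages = 1 "
                "ORDER BY ID LIMIT 1 OFFSET ?",
                (offset,),
            ).fetchone()
        )
    return rows


picked = random_products(conn, 3, random.Random(7))
print(picked)
```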
On the off-chance that you're willing to accept an 'outside the box' type of answer, I'm going to repeat what I said in some of the comments.
The best way to approach your problem is to cache your data in advance (be that in an external JSON or XML file, or in a separate database table, possibly even an in-memory table).
This way you can schedule your performance-hit on the products table to times when you know the server will be quiet, and reduce your worry about creating a performance hit at "random" times when the visitor arrives to your site.
I'm not going to suggest an explicit solution, because there are far too many possibilities on how to build a solution. However, the answer suggested by @ahmed is not silly. If you don't want to create a join in your query, then simply load more of the data that you require into the new table instead.
