This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
How to request a random row in SQL?
I have table:
name,age,school
jane,15,zu
peter,16,zu
john,15,stu
Tomas,15,kul
viera,17,stu
tibor15,zu
I want select from this table 1 person (randomly) per school
select * from table group by school order by rand()
It's terribly innefficient since it's ORDERing the entire table randomly, and then GROUPing on a potentially huge unindexed temporary table, but this works.
SELECT *
FROM (
SELECT *
FROM my_table
ORDER BY RAND()
)a
GROUP BY school
It would likely be more wise to break it down.
One query to retrieve a list of
schools.
For each school, one query to get a random student for that school.
See this post for some slick tricks on how to get single random values efficiently.
If you are using php (as the tag suggests), run SELECT * FROM table; then generate a random number based on the number of results in the query. For example:
i = floor(random()*query.length);
Then seek to the record indicated seek(i) and viola, you have a random entry =)
You'll have to use your own knowledge/documentation for the exact syntax, I haven't touched PHP in a while, but the logic is sound.
Gary
Related
This question already has answers here:
Is storing a delimited list in a database column really that bad?
(10 answers)
Closed 4 years ago.
I have an issue with SQL IN query: I'm storing multiple employee IDs in a table, separated with commas for each task. When I try to fetch a task with an IN query, I'm not getting the row which contains the employee IDs.
My query is:
select
t.*,
e.emp_name,
d.department_name
from
task as t
LEFT JOIN employee as e on(e.emp_id=t.assign_to)
LEFT JOIN department as d on(d.depart_id=e.depart_id)
where
t.task_status='PENDING'
AND t.created_by!='31'
AND t.assign_to IN ('31')
order by
t.task_id DESC
The stored value in database
IN doesn't work like that
Example if your data looks like:
ManagerName, ManagerOf
'John', 'Steve,Sarah,Dave'
You can't do:
SELECT * FROM managers WHERE 'sarah' IN ManagerOf
IN is best conceived as an extension of OR:
SELECT * FROM managers WHERE managerof IN ('Sarah','Steve')
--is the same as:
SELECT * FROM managers WHERE
managerof = 'Sarah' OR
managerof = 'Steve'
There would be as many OR clauses as there are items in the IN list.
Hopefully this shows you why the database doesn't return you any results.. Because a value of Steve,Sarah,Dave is not equal to either Sarah or Steve. The database doesn't look at the commas and say "oh, that's a list" - it's just a single string of characters that happens to have a comma character every now and then
There are nasty quick hacks to so-so achieve what you want, using LIKE and string concat but they aren't worthy of an answer, to be honest
You need to change your database structure to include a new table that tracks the task id and the employee(s) id it is/was assigned to. After doing that, you'll be able to use IN on the employee id column as you're expecting to with this query
This question already has answers here:
MySQL: Fastest way to count number of rows
(14 answers)
Closed 8 years ago.
I want to count all of rows in table that match to my condition as fast as mysql can do.
So, i have four SQL and want you to explain all of them, how is it different for each SQL?
and which is fastest or best for query times and server performance? or it has another way that can better than these. Thank you.
select count(*) as total from table where my_condition
select count(1) as total from table where my_condition
select count(primary_key) as total from table where my_condition
or
select * from table where my_condition
//and then use mysql_num_rows()
First of all don't even think about doing the last one! It literally means select all the data I have in the database, return it to the client and the client will count them!
Second, you need to understand that if you're counting by column name, count will not return null values, for example: select count(column1) from table_name; if there is a row where column1 is null, it will not count it, so be careful.
select count(*), select count(1), select count(primary_key) are all equal and you'll see no difference whatsoever.
I have to select 4 rows randomly from a column.
Is is better to generate randomly 4 id and to perform 4 requests 'select column from database where id = ... '
Or to select all the rows in one request and to choose after?
If you are capable of generating random existing id's, I think the best approach is to use a clause like where id in (id1, id2, id3, id4). This will result in getting 4 records in one query, so no unnecessary query's or records are fetched.
As told before, where id in (id1, id2, id3, id4) is the fastest way from the MySQL perspective. How ever, you will need some logic in the application generating those IDs : All 4 IDs shall exist, be randomly distributed, and you want to avoid duplicates. In worst case you will be retrieving a list of all existend IDs with a huge query, extracting 4 random values, and querying again.
With all that logic to be done, it can be wise to move selection into MySQL:
SELECT * FROM foobar
ORDER BY RAND()
LIMIT 4;
You must understand that this is slow in mysql, but you have a speed gain in the application logic and can be sure to get random values equally seed all over your table.
EDIT:
The comment asks if PHP is fasten in this task then MySQL. Answer is no.
It is not done by "using rand". You need to have an array containing all those IDs in PHP. That is a huge query, lots of TCP traffic, huge array to be buildt in php, huge btree to be buildt by zend engine. Then, with the IDs, you must fire a second query to get the rows for those IDs.
Although the RAND() function may be slow, so far I have not had significant problems with speed. MY strategy is actually to join the database back to a query of itself returning a list of random IDs with a limit.
SELECT *
FROM table AS t1
JOIN (
SELECT rowID
FROM table
ORDER BY RAND()
LIMIT 4
) AS t2
WHERE t1.rowID = t2.rowID
There is also a more robust solution that exist - try checking out this question (asked in 2010).
I'm trying to get 4 random results from a table that holds approx 7 million records. Additionally, I also want to get 4 random records from the same table that are filtered by category.
Now, as you would imagine doing random sorting on a table this large causes the queries to take a few seconds, which is not ideal.
One other method I thought of for the non-filtered result set would be to just get PHP to select some random numbers between 1 - 7,000,000 or so and then do an IN(...) with the query to only grab those rows - and yes, I know that this method has a caveat in that you may get less than 4 if a record with that id no longer exists.
However, the above method obviously will not work with the category filtering as PHP doesn't know which record numbers belong to which category and hence cannot select the record numbers to select from.
Are there any better ways I can do this? Only way I can think of would be to store the record id's for each category in another table and then select random results from that and then select only those record ID's from the main table in a secondary query; but I'm sure there is a better way!?
You could of course use the RAND() function on a query using a LIMIT and WHERE (for the category). That however as you pointed out, entails a scan of the database which takes time, especially in your case due to the volume of data.
Your other alternative, again as you pointed out, to store id/category_id in another table might prove a bit faster but again there has to be a LIMIT and WHERE on that table which will also contain the same amount of records as the master table.
A different approach (if applicable) would be to have a table per category and store in that the IDs. If your categories are fixed or do not change that often, then you should be able to use that approach. In that case you will effectively remove the WHERE from the clause and getting a RAND() with a LIMIT on each category table would be faster since each category table will contain a subset of records from your main table.
Some other alternatives would be to use a key/value pair database just for that operation. MongoDb or Google AppEngine can help with that and are really fast.
You could also go towards the approach of a Master/Slave in your MySQL. The slave replicates content in real time but when you need to perform the expensive query you query the slave instead of the master, thus passing the load to a different machine.
Finally you could go with Sphinx which is a lot easier to install and maintain. You can then treat each of those category queries as a document search and let Sphinx randomize the results. This way you offset this expensive operation to a different layer and let MySQL continue with other operations.
Just some issues to consider.
Working off your random number approach
Get the max id in the database.
Create a temp table to store your matches.
Loop n times doing the following
Generate a random number between 1 and maxId
Get the first record with a record Id greater than the random number and insert it into your temp table
Your temp table now contains your random results.
Or you could dynamically generate sql with a union to do the query in one step.
SELECT * FROM myTable WHERE ID > RAND() AND Category = zzz LIMIT 1
UNION
SELECT * FROM myTable WHERE ID > RAND() AND Category = zzz LIMIT 1
UNION
SELECT * FROM myTable WHERE ID > RAND() AND Category = zzz LIMIT 1
UNION
SELECT * FROM myTable WHERE ID > RAND() AND Category = zzz LIMIT 1
Note: my sql may not be valid, as I'm not a mySql guy, but the theory should be sound
First you need to get number of rows ... something like this
select count(1) from tbl where category = ?
then select a random number
$offset = rand(1,$rowsNum);
and select a row with offset
select * FROM tbl LIMIT $offset, 1
in this way you avoid missing ids. The only problem is you need to run second query several times. Union may help in this case.
For MySQl you can use
RAND()
SELECT column FROM table
ORDER BY RAND()
LIMIT 4
This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
MySQL select 10 random rows from 600K rows fast
I need to pull from a table a random id. At this moment only a few records exists but with time they will grow. What methods of getting this id are in Php or MySql, and what are the trade offs, consequences between them. One last thing i need speed and performance.
select * from YOUR_TABLE order by rand() limit 1
You can achieve this direct in your SQL:
SELECT `idfield` FROM `table` ORDER BY RAND() LIMIT 0,1;
Also see here for some alternatives.
This will be a simple means of execution than building the randomisation in your PHP and then passing to mySQL, though the link above details the merits of the various approaches.
SELECT id FROM table ORDER BY RAND() LIMIT 1
What this does is basically: take the table, order the records by a random number, and get the first record's ID.
There is already a lot of questions and answer about that:
Random Records Mysql PHP
mysql query with random and desc
If you have less that 500,000 records, the RAND() is perfectly fine.
SELECT * FROM tbl_name ORDER BY RAND() LIMIT 1
Well you can obtain a random number with RAND(), but you would still have to call that method/function until you actually get something (you can have 1,2,5,6,100 as id`s due to deletion).
there is a post on this http://jan.kneschke.de/projects/mysql/order-by-rand/
Procedures can be usefull (but said to be slow...):
1 var to get the maximum value
then produce a random within 1 and that interval (or get the minimum)
then query