For example, say you wanted a random result every 10 minutes. Is there a way to achieve this with ORDER BY RAND()?
$fetch = mysqli_query($conn, "
SELECT *
FROM food
JOIN food_images ON food.size = food_images.size
ORDER BY RAND()
");
I am also using a JOIN and am worried that this might affect the results. Thank you!
I don't have a MySQL server in front of me so most of this is a guess, but you might try as follows:
You can generate a number that changes only once every ten minutes by taking the system time in seconds, dividing by the number of seconds in ten minutes, and then casting to an integer:
$seed = (int) (time() / 600);
Then pass this value to MySQL's RAND() function as a parameter to seed the RNG, and you should get a repeatable sequence that changes every ten minutes:
$stmt = mysqli_prepare($conn, 'SELECT ... ORDER BY RAND(?)');
mysqli_stmt_bind_param($stmt, 'i', $seed);
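To finish the sketch (still a guess on my part, and assuming the mysqlnd driver so that mysqli_stmt_get_result() is available), you would then execute the statement and fetch rows as usual:
mysqli_stmt_execute($stmt);
$result = mysqli_stmt_get_result($stmt); // needs mysqlnd; otherwise use mysqli_stmt_bind_result()
while ($row = mysqli_fetch_assoc($result)) {
    // same seed => same ordering for the whole ten-minute window
}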
You can do it as:
SELECT *, RAND(FLOOR(TIME_TO_SEC(CURRENT_TIME()) / 600)) AS ord
FROM food
JOIN food_images ON food.size = food_images.size
ORDER BY ord
The parameter of the RAND() function is the seed. Wrapped in FLOOR(), the expression inside it changes only once every 10 minutes, so the ordering stays the same within each 10-minute window.
You can use the MySQL Event Scheduler. As described in the documentation:
you are creating a named database object containing one or more SQL statements to be executed at one or more regular intervals, beginning and ending at a specific date and time
And since I guess you are using PHP, you can use cron jobs too: see Managing Cron Jobs With PHP.
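As a rough sketch of the Event Scheduler route (the event name, the current_pick holding table and the food.id column are all made up here, and the scheduler itself must be enabled, e.g. SET GLOBAL event_scheduler = ON), something like this could re-pick a row every 10 minutes:
mysqli_query($conn, "
    CREATE EVENT repick_random_food
    ON SCHEDULE EVERY 10 MINUTE
    DO
        UPDATE current_pick
        SET food_id = (SELECT food.id FROM food ORDER BY RAND() LIMIT 1)
");
// Pages then just read the stored pick, e.g.:
// SELECT * FROM food JOIN food_images ON food.size = food_images.size
// WHERE food.id = (SELECT food_id FROM current_pick)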
I have 2 tables - info and comments.
info has 270,000 rows, while comments has only 100 rows.
I run a php script that selects the data from the database and encodes it to json format.
The PHP script also limits the response to only the first 10 rows, and both PHP files, Info.php and Comments.php, are exactly the same except for the names.
So why is it that the table with 270,000 rows takes way more time to load than the one with 100 rows, when only the first 10 rows are printed?
comments takes 1 second while info takes 10 seconds.
This is the PHP query code, pretty simple:
Info.php: $query = "SELECT * FROM info ORDER BY id LIMIT 10;";
Comments.php: $query = "SELECT * FROM comments ORDER BY id LIMIT 10;";
For testing purposes, they both have the same columns and the same data; the only difference is the number of rows. So I tested the times with PHP:
Info.php:
select from database time: 0.6090 seconds
time taken to decode JSON: 6.4736 seconds
while Comments.php results:
select from database time: 0.7309 seconds
time taken to decode JSON: 1.7178 seconds
Thanks
It might be because of MySQL's clause execution order: the MySQL engine has to sort all the rows in the table by the id column first, and only after that limit the result to 10 rows. Take a look at this answer.
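One way to check that theory (just a sketch, assuming a mysqli connection in $conn; adapt it to whatever DB layer Info.php actually uses) is to run EXPLAIN on the slow query and look at the key and Extra columns:
$res = mysqli_query($conn, "EXPLAIN SELECT * FROM info ORDER BY id LIMIT 10");
while ($row = mysqli_fetch_assoc($res)) {
    print_r($row); // "Using filesort" in Extra means the whole table is sorted before the LIMIT
}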
I need to run queries with several conditions which will result in a large dataset. While all the conditions are straightforward, I need advice regarding 2 issues in terms of speed optimization:
1) If I need to run those queries between 1st Apr and 20th June of each year for the last 10 years, I know of 2 options:
a. Run the query 10 times
$year = 2015;
$start_month_date = "-04-01";
$end_month_date = "-06-20";
for ($i = 0; $i < 10; $i++) {
    $start = $year.$start_month_date;
    $end = $year.$end_month_date;
    $result = mysql_query("....... WHERE .... AND `event_date` BETWEEN '$start' AND '$end'");
    // PUSH THE RESULT TO AN ARRAY
    $year = $year - 1;
}
b. Run the query a single time; however, the query will compare by DayOfYear (hence each date has to be converted to DayOfYear by the query)
$start = Date("z", strtotime("2015-04-01")) + 1;
$end = Date("z", strtotime("2015-06-20")) + 1;
$result = mysql_query("....... WHERE .... AND DAYOFYEAR(`event_date`) BETWEEN $start AND $end");
I am aware of the 1-day difference in day count between leap years and other years, but I can live with that. My sense is that 1.b is more optimized; I just want to verify.
2) I have a large query with 2 subqueries. When I want to limit the result by date, should I put the conditions inside or outside the subqueries?
a. Inside the subqueries, which means the condition has to be evaluated twice
SELECT X.a, X.b, Y.c FROM
(SELECT * FROM mytable WHERE `event_date` BETWEEN '$startdate' AND '$enddate' AND `case` = 'AAA' AND .......) X,
(SELECT * FROM mytable WHERE `event_date` BETWEEN '$startdate' AND '$enddate' AND `case` = 'BBB' AND .......) Y
WHERE X.`event_date` = Y.`event_date` AND ........... ORDER BY `event_date`
b. Outside the subqueries, which means the condition is evaluated once, but the join is over a larger dataset (for which I need to set SQL_BIG_SELECTS = 1)
SELECT X.a, X.b, Y.c FROM
(SELECT * FROM mytable WHERE `case` = 'AAA' AND .......) X,
(SELECT * FROM mytable WHERE `case` = 'BBB' AND .......) Y
WHERE X.`event_date` = Y.`event_date` AND X.`event_date` BETWEEN '$startdate' AND '$enddate' AND ........... ORDER BY `event_date`
Again, in my opinion 2.a is more optimized, but I'm requesting your advice.
Thanks
(1) Running the queries 10 times with event_date BETWEEN $start AND $end will be faster when the SQL engine can take advantage of an index on event_date. This could be significant, but it depends on the rest of the query.
Also, because you are ordering the entire data set, running 10 queries is likely to be a bit faster. That's because sorting is O(n log(n)), meaning that it takes longer to sort larger data sets. As an example, sorting 100 rows might take X time units. Sorting 1000 rows might take X * 10 * log(10) time units. But, sorting 100 rows 10 times takes just X * 10 (this is for explanatory purposes).
(2) Don't use subqueries if you can avoid them in MySQL. The subqueries are materialized, which adds additional overhead. Plus, they then prevent the use of indexes. If you need to use subqueries, filter the data as much as possible in the subquery. This reduces the data that needs to be stored.
I assume you have lots of rows over 10 years; otherwise this wouldn't be much of an issue.
Now the best bet is to run EXPLAIN on the different queries you plan to use; that will probably tell you which indexes they can use, as currently we don't know them (you didn't post the structure of the table).
1.b uses a function in the WHERE clause, so it will be terrible: it won't be able to use an index on the date (assuming there is one), so it will read the entire table.
One thing that you could do is ask the database to combine the result sets of the 10 queries using UNION. MySQL would combine the results instead of PHP (see https://dev.mysql.com/doc/refman/5.0/en/union.html).
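A rough sketch of that idea (the table name events is a placeholder for your real table, the other conditions are omitted, and I'm keeping the question's mysql_* API even though mysqli or PDO would be preferable today):
$year = 2015;
$parts = array();
for ($i = 0; $i < 10; $i++) {
    $start = $year . "-04-01";
    $end   = $year . "-06-20";
    $parts[] = "(SELECT * FROM events WHERE `event_date` BETWEEN '$start' AND '$end')";
    $year--;
}
// One round trip instead of ten; each branch can still use an index on event_date
$result = mysql_query(implode(" UNION ALL ", $parts));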
2 - As Gordon said, filter the data as much as possible. However, instead of trying the options blindly, you can use EXPLAIN and the database will help you decide which one makes the most sense.
I have a table called 'Visits' that stores MAC addresses along with their timestamps. My task is to check each day's MAC addresses against those of the previous days and, if found, mark them as 'Repeat' in today's records with the number of visits made so far (excluding today).
I have written the following PHP code that does the job nicely but the problem is that today it's taking 586.4 seconds to execute (checking 1,500 MACs against 70,000 from the previous 40 days) and it will surely become worse with each passing day.
$STH = $DBH->prepare("SELECT DISTINCT MAC FROM `Visits` WHERE TimeStamp=:TimeStamp");
$STH->bindParam(':TimeStamp', $unixDataDate);
$STH->execute();
while ($r = $STH->fetch(PDO::FETCH_ASSOC)) {
$MAC=$r['MAC'];
$STH2 = $DBH->prepare("SELECT COUNT(ID) FROM `Visits` WHERE MAC=:MAC AND TimeStamp<:TimeStamp");
$STH2->bindParam(':MAC', $MAC);
$STH2->bindParam(':TimeStamp', $unixDataDate);
$STH2->execute();
$prevVisits=$STH2->fetchColumn();
if ($prevVisits>0) {
$STH3 = $DBH->prepare("UPDATE `Visits` SET RepeatVisitor=:RepeatVisitor WHERE MAC=:MAC AND TimeStamp=:TimeStamp");
$STH3->bindParam(':RepeatVisitor', $prevVisits);
$STH3->bindParam(':MAC', $MAC);
$STH3->bindParam(':TimeStamp', $unixDataDate);
$STH3->execute();
}
}
I have tried several ways to construct a single query that does this job and compared execution times, but I couldn't get the same results. Any help as to whether it's possible to do this task in one inexpensive query, and how to write it, would be greatly appreciated.
I assume that Visits.TimeStamp is a date
UPDATE Visits
SET RepeatVisitor =
(SELECT COUNT(*) FROM Visits as v2 WHERE v2.MAC = Visits.MAC AND v2.TimeStamp != Visits.TimeStamp AND v2.TimeStamp > '[40 days ago generated in PHP]')
WHERE Visits.TimeStamp = '[Yesterday generated in PHP]'
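As a hedged sketch of how that could be run from the question's PDO code: MySQL usually refuses to select directly from the table being updated (error 1093), so this variation joins the update against an aggregated derived table instead. It assumes $unixDataDate holds today's timestamp value as in the question, and that subtracting 40 * 86400 is the right way to step back 40 days for your TimeStamp column.
$cutoff = $unixDataDate - 40 * 86400; // 40 days back, assuming Unix-timestamp values
$STH = $DBH->prepare("
    UPDATE `Visits`
    JOIN (
        SELECT MAC, COUNT(*) AS prev_visits
        FROM `Visits`
        WHERE TimeStamp < :before AND TimeStamp >= :cutoff
        GROUP BY MAC
    ) AS prev ON prev.MAC = Visits.MAC
    SET Visits.RepeatVisitor = prev.prev_visits
    WHERE Visits.TimeStamp = :today
");
$STH->execute(array(
    ':before' => $unixDataDate,
    ':cutoff' => $cutoff,
    ':today'  => $unixDataDate,
));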
Have you placed indexes on both the MAC and TimeStamp columns? If not, these should speed things up considerably.
Also, move the prepare statements outside of the while loop; you're preparing the same query over and over, which rather misses the point. Prepare once, execute often.
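For example, a sketch of the question's loop with the two inner statements prepared once up front (the rest of the logic is unchanged):
$STH2 = $DBH->prepare("SELECT COUNT(ID) FROM `Visits` WHERE MAC=:MAC AND TimeStamp<:TimeStamp");
$STH3 = $DBH->prepare("UPDATE `Visits` SET RepeatVisitor=:RepeatVisitor WHERE MAC=:MAC AND TimeStamp=:TimeStamp");
while ($r = $STH->fetch(PDO::FETCH_ASSOC)) {
    $MAC = $r['MAC'];
    $STH2->execute(array(':MAC' => $MAC, ':TimeStamp' => $unixDataDate));
    $prevVisits = $STH2->fetchColumn();
    $STH2->closeCursor(); // free the result so the statement can be re-executed
    if ($prevVisits > 0) {
        $STH3->execute(array(':RepeatVisitor' => $prevVisits, ':MAC' => $MAC, ':TimeStamp' => $unixDataDate));
    }
}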
I have this table:
person_id int(10) pk
fid bigint(20) unique
points int(6) index
birthday date index
4 FK columns int(6)
ENGINE = MyISAM
Important info: the table contains over 8 million rows and is fast growing (1.5M a day at the moment)
What I want: to select 4 random rows in a certain range when I order the table on points
How I do it now: in PHP I pick a random range; let's say this gives me 20% as the low end and 30% as the high end. Next I COUNT(*) the number of rows in the table. Then I determine the lowest row number: table count / 100 * low range, and the same for the high range. Then I calculate a random row number using rand(lowest_row, highest_row), which gives me a row number within the range. And at last I select the random row by doing:
SELECT * FROM `persons` WHERE points > 0 ORDER BY points desc LIMIT $random_offset, 1;
The points > 0 is in the query since I only want random rows with at least 1 point.
The above query takes about 1.5 seconds to run, but since I need 4 rows it takes over 6 seconds, which is too slow for me. I figured the ORDER BY points takes the most time, so I was thinking about making a VIEW of the table, but I have really no experience with views. What do you think: is a view a good option, or are there better solutions?
ADDED:
I forgot to say that it is important that all rows have the same chance of being selected.
Thanks, I appreciate all the help! :)
Kevin
Your query is slow, and will keep getting slower, because using LIMIT with a large offset here forces a full table sort and then a scan up to the offset to get the result. Instead you should do this on the PHP end of things as well (this kind of 'abuse' of LIMIT is part of the reason it's non-standard SQL; MSSQL and Oracle, for example, do not support it).
First ensure there's an index on points. This will make select max(points), min(points) from persons a query that'll return instantly. Next you can determine from those 2 results the points range, and use rand() to determine 4 points in the requested range. Then repeat for each result:
SELECT * FROM persons WHERE points < $myValue ORDER BY points DESC LIMIT 1
Since it only has to retrieve one row, and can determine which one via the index, this'll be in the milliseconds execution time as well.
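Put together in PHP, that could look roughly like this (a sketch assuming mysqli and a connection in $conn; note that picking uniformly within the points range is not quite the same as picking uniformly among rows, which the ADDED note asks for):
list($min, $max) = mysqli_fetch_row(mysqli_query($conn, "SELECT MIN(points), MAX(points) FROM persons WHERE points > 0"));
$picked = array();
for ($i = 0; $i < 4; $i++) {
    $value = rand($min, $max); // a random value somewhere in the points range
    $res = mysqli_query($conn, "SELECT * FROM persons WHERE points <= $value AND points > 0 ORDER BY points DESC LIMIT 1");
    $picked[] = mysqli_fetch_assoc($res);
}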
Views aren't going to do anything to help your performance here. My suggestion would be to simply run:
SELECT * FROM `persons` WHERE points BETWEEN ? AND ?
Make sure you have an index on points. Also, you SHOULD replace * with only the fields you are concerned about, if applicable. Here, of course, ? represents the upper and lower bounds for your search.
You can then determine the number of rows returned in the result set using mysqli_num_rows() (or similar based on your DB library of choice).
You now have the total number of rows that meet your criteria. You can easily then calculate 4 random numbers within the range of results and use mysqli_data_seek() or similar to go directly to the record at the random offset and get the values you want from it.
Putting it all together:
$result = mysqli_query($db_conn, $sql); // here $sql is your SQL query with the bounds already filled in (or use a prepared statement to bind them)
$num_records = 4; // your number of records to return
$num_rows = mysqli_num_rows($result);
$rows = array();
for ($i = 0; $i < $num_records; $i++) {
$random_offset = rand(0, $num_rows - 1);
mysqli_data_seek($result, $random_offset);
$rows[] = mysqli_fetch_object($result);
}
mysqli_free_result($result);
I have a table from which I want to pick one row and show it to the user. Every week I want the website to automatically pick another row at random. So, basically, I want to get a new result every week, not every time a user visits the page.
I am using this code right now :
$res = mysql_query("SELECT COUNT(*) FROM fruit");
$row = mysql_fetch_array($res);
$offset = rand(0, $row[0]-1);
/* the first three lines to pick a row randomly from the table */
$res = mysql_query("SELECT * FROM fruit LIMIT $offset, 1");
$row = mysql_fetch_assoc($res);
This code gets a new result every time the user visits the page, and after every refresh another random row gets chosen. I want to make it update every week so that the result is the same for every user. Is there a PHP command that does that? If so, how does it work?
My suggestion would be as follows:
Store the random result id and a timestamp in some other kind of persistent storage (file, DB table, etc.).
Set up a cron job or other automated task to update the record above weekly. If you don't have access to such solutions, you could write code to do it on each page load and check against the timestamp column. However, that's pretty inefficient.
Yes, there is. Use the date function in PHP and write each week and the corresponding row to a file using fwrite. Then, using an if statement, check whether it is a new week: if it is, get a new random row, write it to the file and return it; if it isn't, return the same one for that week.
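A minimal sketch of that idea (the cache file name weekly_pick.json is made up, file_put_contents is used instead of fwrite for brevity, and the mysql_* offset query is reused from the question):
$cacheFile = 'weekly_pick.json';
$currentWeek = date('oW'); // ISO year + week number, e.g. "202514"
$cache = file_exists($cacheFile) ? json_decode(file_get_contents($cacheFile), true) : null;
if ($cache === null || $cache['week'] !== $currentWeek) {
    // New week: pick a fresh random row and remember it
    $count = mysql_result(mysql_query("SELECT COUNT(*) FROM fruit"), 0);
    $offset = rand(0, $count - 1);
    $row = mysql_fetch_assoc(mysql_query("SELECT * FROM fruit LIMIT $offset, 1"));
    $cache = array('week' => $currentWeek, 'row' => $row);
    file_put_contents($cacheFile, json_encode($cache));
}
$row = $cache['row']; // the same row for every visitor until the week changes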
A cron job is the best solution. Create a script weeklynumber.php, much like what you have already, that generates an entry. After this, go to your console and open your crontab file using crontab -e.
In here, you may add
0 0 * * 0 php /path/to/weeklynumber.php
This means that every Sunday at 0:00, php /path/to/weeklynumber.php is executed.
But all of this assumes you're on UNIX and that you are able to create cron jobs. If not, here's another solution: hash the week number and year, and use that to generate the weekly number.
// Get the current week and year
$week = date('Wy');
// Get the MD5 hash of this
$hash = md5($week);
// Get the amount of records in the table
$count = mysql_result(mysql_query("SELECT COUNT(*) FROM fruit"),0);
// Convert the MD5 hash to an integer
$num = base_convert($hash, 16, 10);
// Use the last 6 digits of the number and take modulus $count
$num = substr($num,-6) % $count;
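To actually fetch the weekly row, $num can then be used as the offset in the same kind of query the question already uses:
// Fetch the row at the deterministic weekly offset
$row = mysql_fetch_assoc(mysql_query("SELECT * FROM fruit LIMIT $num, 1"));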
Note that the above will only keep the same row for the whole week as long as the number of records in your table doesn't change.
And finally, just a little note on your current method. Instead of counting rows, getting a random number from PHP, and asking your DBMS to return the row at that offset, it can all be done with a single query:
SELECT * FROM fruit ORDER BY RAND() LIMIT 1