Assign Random Order to 10000+ records

I am trying to assign a fixed random order to a table of more than 10,000 records. I have a function that takes a start date and adds 1 second for each consecutive record, assigned randomly, so that I can then sort by the randomly assigned date. My function worked fine with 50 records, but fails with 10000+ records.
It sets correct dates for about 9000 records, but 1146 records get assigned 0 (1969-12-31 19:00:00). Any help getting this or something similar to work would be appreciated.
function randomize(){
    $count = $this->Application->find('count');
    $order = range(0, $count-1); // Array of numbers 0 to count-1
    $startDate = strtotime('December 13, 2011 0:00:00');
    shuffle($order); // scramble array of numbers
    $Applications = $this->Application->find('all');
    set_time_limit(0);
    foreach($Applications as $app){
        $this->Application->id = $app['Application']['id'];
        $this->Application->saveField('order', date('Y-m-d H:i:s', $startDate + $order[$this->Application->id]));
    }
    set_time_limit(30);
}
Update: I am using MySQL database but need a permanent state for 1 randomization, not repeated randomization as per ORDER BY RAND(). I also updated the code (see above) to reduce overhead, and increased memory in php.ini from 128M to 256M. With the code change the bad dates are no longer 0 but the same as $startDate indicating it may be an issue with the $order array of numbers.

Questions:
Are you sure you're using the proper date format there?
Why a start date to randomize? Take the date as a fixed number, as you're doing (X in this case): If you do X + givenOrderNumber for each record then the order will be defined by givenOrderNumber... so why the unnecessary addition?
I've got the query I understand you're looking for here:
set @num = 0;
select *,
    date_add('2011-12-13 00:00:00', interval @num := @num + 1 second) as newOrder
from table1
order by newOrder
It sorts the records by a date which is incremented by one each time. Now if you want to use the application id cheat:
select *,
date_add('2011-12-13 00:00:00', interval id second) as newOrder
from table1
order by newOrder
However, whether this is useful or not for you... it seems to be unnecessary.
Hope this helps.

I think this is a roundabout way to do it. Can you tell us which type of database you are using? Some databases, like MySQL, support random ordering directly, so you won't need to do all this to get a random order...
In MySQL it looks like this:
SELECT * FROM tbl_name ORDER BY RAND();
I would also listen to juhana's comment: integers are easier to work with than dates, and you can use BIGINT if you are going to have a huge number of records...
Still, I would first check whether your database supports random ordering, so you do less work :D
Hope this helps you :D
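
A likely cause of the rows left at $startDate in the question is that the shuffled array is indexed by record id rather than by loop position: any id at or above the record count has no entry in $order. Below is a minimal sketch of a position-based assignment that also uses plain integers instead of dates. The function name and the sample ids are hypothetical, and the database write is left out.

```php
<?php
// Hypothetical helper: map each record id to a unique random position.
// $ids may contain gaps or values larger than count($ids).
function assignRandomOrder(array $ids): array
{
    $positions = range(0, count($ids) - 1); // 0 .. count-1
    shuffle($positions);                    // one-off randomization

    $orderById = [];
    // Index the shuffled array by loop position, not by record id.
    foreach (array_values($ids) as $i => $id) {
        $orderById[$id] = $positions[$i];
    }
    return $orderById;
}

// Made-up ids with gaps, as left behind by deleted rows.
$ids = [3, 7, 8, 15, 10042];
$orderById = assignRandomOrder($ids);
```

Each value would then be stored once, for example with saveField('order', $orderById[$id]), and the stored column gives the same order every time it is used for sorting.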

Related

More efficient way to perform multiple MySQL queries in PHP

I have an online store with thousands of orders and I'm writing some queries to figure out how much money each supplier (brand) has made on the site. I have the following queries, one for every month of the selected year:
$filterJan = "$filterYear-01";
$queryJan = "SELECT price, quantity FROM order_items WHERE productID='$productID' AND timestamp LIKE '%$filterJan%' LIMIT 10000";
$suppliersQueryFilter = mysql_query($queryJan, $connection) or die(mysql_error());
while($rowF = mysql_fetch_assoc($suppliersQueryFilter)) {
$price = $rowF["price"]*$rowF["quantity"];
$totalJan = $totalJan+$price;
}
** and so on for each month **
It takes ages to load (we're talking over 60 seconds at least) and I know it is not efficient in any shape or form. For each month these queries are searching through thousands of records. Is there a more efficient way or writing this to:
a) Reduce the amount of code to maybe 1 query
b) Make it more efficient to increase loading times
$filterYear contains a year, like 2009.
So what this query does is it selects how much money has been made for each month for a selected year (which is assigned to $filterYear). So the result it generates is a table with Jan, Feb, March... with how much money has been made each month, so £2345, £2101, etc...

You should be storing your timestamp as an actual mysql datetime value, which would make things like
WHERE timestamp BETWEEN $initialtime AND $finaltime
GROUP BY YEAR(timestamp), MONTH(timestamp)
trivially possible. That'd reduce your multiple essentially identical repeated queries to just one single query.
You can use derived values for this, but it'll be less efficient than using a native datetime field:
GROUP BY SUBSTR(timestamp, 1, 4), SUBSTR(timestamp, 6, 2)

For best performance, you'd want to submit a query something like this to the database:
SELECT DATE_FORMAT(i.timestamp,'%Y-%m') AS `month`
     , SUM(i.price*i.quantity) AS `total`
  FROM order_items i
 WHERE i.productID = 'foo'
   AND i.timestamp >= '2013-01-01'
   AND i.timestamp < '2013-01-01' + INTERVAL 12 MONTH
 GROUP
    BY DATE_FORMAT(i.timestamp,'%Y-%m')
(This assumes that the timestamp column is MySQL datatype TIMESTAMP, DATETIME or DATE)
Using the deprecated mysql_ interface, you want to avoid SQL Injection vulnerabilities using the mysql_real_escape_string function. (A better option would be to use the mysqli or PDO interface, and use a prepared statement with bind placeholders.)
We want the predicates on the timestamp to be on the BARE column, so MySQL can make use of an available suitable index for a range scan operation, rather than requiring a full scan of every row in the table.
We also want to use the power of the server to quickly derive a total, and return just the total, rather than retrieving every flipping row, and processing each of those rows individually (RBAR = row by agonizing row)
The GROUP BY clause and the SUM() aggregate function are tailor made to suit this result.
With mysql_ interface, the query would look something like this:
$querytext = "
SELECT DATE_FORMAT(i.timestamp,'%Y-%m') AS `month`
     , SUM(i.price*i.quantity) AS `total`
  FROM order_items i
 WHERE i.productID = '" . mysql_real_escape_string($thisProductID) . "'
   AND i.timestamp >= '" . mysql_real_escape_string($filterYear) . "-01-01'
   AND i.timestamp < '" . mysql_real_escape_string($filterYear) . "-01-01' +
       INTERVAL 12 MONTH
 GROUP BY DATE_FORMAT(i.timestamp,'%Y-%m')";
# for debugging
#echo $querytext;
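
The prepared-statement alternative mentioned above might look like the following sketch. It only builds the SQL text and the values to bind, using the quantity column name from the question. The function name is hypothetical, and the PDO calls in the closing comment assume an already open connection in $pdo.

```php
<?php
// Hypothetical helper: SQL text plus bound values for one year's monthly totals.
function monthlyTotalsQuery(string $productId, int $year): array
{
    $sql = "SELECT DATE_FORMAT(i.timestamp, '%Y-%m') AS `month`,
                   SUM(i.price * i.quantity) AS `total`
              FROM order_items i
             WHERE i.productID = :productID
               AND i.timestamp >= :yearStart
               AND i.timestamp <  :yearEnd
             GROUP BY DATE_FORMAT(i.timestamp, '%Y-%m')";

    // Half-open range: from Jan 1 of $year up to (not including) Jan 1 of the next year.
    $params = [
        ':productID' => $productId,
        ':yearStart' => sprintf('%04d-01-01', $year),
        ':yearEnd'   => sprintf('%04d-01-01', $year + 1),
    ];
    return [$sql, $params];
}

[$sql, $params] = monthlyTotalsQuery('foo', 2013);

// With a PDO connection (assumed, not created here):
// $stmt = $pdo->prepare($sql);
// $stmt->execute($params);
// $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
```

Because the user-supplied values travel separately from the SQL text, no escaping function is needed, and the predicates stay on the bare timestamp column.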

Show all results from database where mm/dd/yy date is "today" or greater

I am using HTML input type="date" to allow users to input appointment dates.
Now I want to query the database and show all appointments that are "today" and in the future.
Not dates that have already passed.
Here is my SQL Script
$today = date('d-m-Y');
$sql = "SELECT *
FROM `client1`
WHERE `client` = '$customer'
AND DATEDIFF('$today', `date`) >= 0
ORDER BY `id` DESC";
Can someone guide me as to how I can achieve this?
I have seen several directions online but I want to have the sorting done at the moment of query.

I have solved the issue!
My date() format was incorrect because HTML input type="date" inserts YYYY-MM-DD into the database =/
$today = date('d-m-Y');
should be
$today = date('Y-m-d');
My operator >= should have been <= to show today and future dates.
Thanks everyone for the help. I should have tried fixing it for 5 more minutes before posting.

Why are you using PHP to compare dates in the database? I assume it's a date field, so you can let MySQL do it for you:
SELECT *
FROM `client1`
WHERE `client` = '$customer'
AND DATEDIFF(date_format(now(), '%Y/%m/%d'), `date`) <= 0
ORDER BY `id` DESC

None of the responses have specified sargable predicates. If you perform an operation on a column in the where clause, there is no discernible stopping point.
where ... some_function( some_field ) = some_constant_value ...
Even if some_field is indexed, a complete table scan must be performed because there is no way to know if the output of the operation is also ordered.
From my understanding the date column is in a sortable form -- either a date field or a string in lexically sortable format 'yyyy-mm-dd'. That being the case, don't do any operation on it.
where ... some_field >= now() ...
Thus the system can use the result of now() as a target value to find exactly where in the index to start looking. It knows it can ignore all the rows with indexed values "down" from the target value. It has to look only at rows with indexed values at or "up" from the target value. That is, it performs an index seek to the correct starting point and proceeds from there. This could mean totally bypassing many, many rows.
Or, to put it bluntly, ditch the datediff and do a direct comparison.
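
Why the bare comparison helps can be illustrated outside the database. In this sketch a sorted PHP array stands in for the index. It is an analogy only, not a description of MySQL internals, and the function name and dates are made up. A binary search finds the first entry at or after the target in a handful of steps, which only works because the stored values themselves are in order.

```php
<?php
// Return the position of the first value >= $target in a sorted array,
// or count($sorted) if there is none.
function firstAtOrAfter(array $sorted, string $target): int
{
    $lo = 0;
    $hi = count($sorted);
    while ($lo < $hi) {
        $mid = intdiv($lo + $hi, 2);
        if (strcmp($sorted[$mid], $target) < 0) {
            $lo = $mid + 1; // everything up to $mid is too early
        } else {
            $hi = $mid;     // $mid may be the answer
        }
    }
    return $lo;
}

// 'yyyy-mm-dd' strings sort lexically in date order.
$dates = ['2015-01-03', '2015-02-10', '2015-02-11', '2015-03-01', '2015-04-20'];
$start = firstAtOrAfter($dates, '2015-02-11');
$upcoming = array_slice($dates, $start); // entries from the target date onward
```

If every value first had to be passed through a function, the results would not be known to be ordered, and the only option would be to visit every entry.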

Getting temperature difference between intervals

My question is more "theoretical" than practical. In other words, I'm not really looking for particular code, but more for advice about how to approach it. I've been thinking about it for some time but cannot come up with a feasible solution.
So basically, I have a MySQL database that saves weather information from my weather station.
Column one contains the date and time of the measurement (DATETIME field), then there is a whole range of other columns like temp, humidity etc. The one I am interested in now is the temperature. The data is sorted by date and time ascending, meaning the most recent value is always inserted at the end.
Now, what I want to do is use a PHP script to connect to the db, find temperature changes within a certain interval and then find the maximum. For example, let's say I choose an interval of 3 h. Then I would like to find the time, across all the values, where there was the most significant temperature change within those 3 h (or 5 h, 1 day etc.).
The problem is that I don't really know how to do this. If I just read the values from the db, I get them one by one, but I can't think of a way of getting the value that is, let's say, 3 h before the current one. With that it would be easy: just subtract the two and take the date from the datetime field at that time. But how do I get the values that are those 3 h apart? It cannot simply be a fixed number of rows back, because the data is not saved at regular intervals (they range between 5 and 10 minutes), so 3 h in the past could be a varying number of rows.
Any ideas how this could be done?
Thanks a lot

Not terribly hard actually. So I would assume it's a two column table with time and temp fields, where time is a DATETIME field
SELECT MAX(temp) FROM records
WHERE time >= "2013-10-14 12:00:00" and time <= "2013-10-14 15:00:00"

SELECT t1.*, ABS(t1.temperature - t2.temperature) AS temp_change
FROM tablename t1
JOIN tablename t2
    ON t2.timecolumn <= (t1.timecolumn - INTERVAL 3 HOUR)
LEFT JOIN tablename t3
    ON t3.timecolumn <= (t1.timecolumn - INTERVAL 3 HOUR)
    AND t3.timecolumn > t2.timecolumn
WHERE
    t3.some_non_nullable_column IS NULL
ORDER BY ABS(t1.temperature - t2.temperature) DESC
LIMIT 1;
The table is joined to itself twice. t2 is guaranteed to be the closest record that is 3 hours or more before t1, because the LEFT JOIN to t3 together with the IS NULL test rules out any record lying between t2 and that 3-hour cutoff. (The alias is temp_change because CHANGE is a reserved word in MySQL.) With the proper indexes and a limited amount of data (where limited is in the eye of the beholder) this could be quite performant. However, if you need a lot of these queries on a big dataset, this is a prime candidate for denormalization, where you create a table which also stores the calculated offsets compared to the previous entry.
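
The same comparison can also be done in PHP after fetching the rows once in time order. This is a sketch under assumptions: the function name, the array keys 'time' (Unix timestamp) and 'temp', and the sample readings are all hypothetical. Because the rows are sorted, the pointer to the "reading at least 3 h earlier" only ever moves forward, so the whole pass is linear even with irregular spacing.

```php
<?php
// Hypothetical helper: largest absolute temperature change between a reading
// and the latest reading taken at least $intervalSeconds before it.
// $readings must be a list sorted by 'time' ascending.
function largestChange(array $readings, int $intervalSeconds): ?array
{
    $best = null;
    $j = -1; // index of the latest reading old enough for the current one
    foreach ($readings as $i => $current) {
        $cutoff = $current['time'] - $intervalSeconds;
        // Advance $j while the next reading is still at or before the cutoff.
        while ($j + 1 < $i && $readings[$j + 1]['time'] <= $cutoff) {
            $j++;
        }
        if ($j < 0) {
            continue; // nothing is old enough yet
        }
        $change = abs($current['temp'] - $readings[$j]['temp']);
        if ($best === null || $change > $best['change']) {
            $best = [
                'time'   => $current['time'],
                'from'   => $readings[$j]['time'],
                'change' => $change,
            ];
        }
    }
    return $best;
}

// Made-up readings at irregular spacing (seconds since an arbitrary start).
$readings = [
    ['time' => 0,     'temp' => 10.0],
    ['time' => 3600,  'temp' => 11.0],
    ['time' => 7200,  'temp' => 15.0],
    ['time' => 11100, 'temp' => 12.0],
    ['time' => 14400, 'temp' => 18.5],
    ['time' => 18600, 'temp' => 9.0],
];
$best = largestChange($readings, 3 * 3600);
```

For this sample the largest change is between the readings at 3600 and 14400 seconds (11.0 to 18.5).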

How to update/Insert random dates in SQL within a specified Date Range

Please forgive me. I am an absolute newbie and I need help with this table in phpmyadmin
My Table has the following columns:
Primary_ID, Begin_Date, End_Date, Timestamp
How do I update in phpmyadmin, selected rows with randomly generated begin_dates and timestamp within a specified date range (eg: 30 days in a month).
E.g of desired outcome
Primary_id--- Begin_Date -------------Timestamp
1.------------2008-09-02--------------2008-09-02 21:48:09
2.------------2008-09-03--------------2008-09-03 15:19:01
3.------------2008-09-14--------------2008-09-14 01:23:12
4.------------2008-09-27--------------2008-09-27 19:03:59
Date Range between 2008-09-01 and 2008-09-31.
Time is variable 24 hrs
I am a newbie, so a syntax that will work in phpmyadmin will help greatly.
We are making a presentation for a gym site with 500 members, but the added member values all have the same begin date and time. I am trying to separate them into different monthly registrations in the database, e.g. 50 people registered in August on different days and at different times, 35 people in October, etc. Hope it is clearer now. Thanks.
When I try one of the below answers, I get this error: #1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '$randomDate = rand(1,31)' at line 1. So ideally, a code I can copy and paste into phpmyadmin with minimal editing will be appreciated. In sequence if possible. For a total dummy to understand and execute.

I'd start with something like this. A bunch of these can be combined, but I split it up so you can see what I'm doing.
To get random numbers, you can use rand(). Get one for the date, hour, minute, and second
$randomDate = rand(1,30); // September has 30 days
$randomHour = rand(0,23); // hours run from 0 to 23
$randomMinute = rand(0,59);
$randomSecond = rand(0,59);
You will want leading zeros (03 instead of 3) so you can use str_pad to add them, if required
$randomDate = str_pad($randomDate, 2, '0',STR_PAD_LEFT);
//The '2' is how many characters you want total
//The '0' is what will be added to the left if the value is short a character
Do the same with all your other random values.
Just because I like neat queries, you should make up your final update strings next.
$newDate = '2008-09-'.$randomDate;
$newTime = $randomHour.':'.$randomMinute.':'.$randomSecond;
Now I don't know how you're determining which rows you want to update, so I will leave that up to you. For an example, I will show you a query if you wanted to do this with Primary_id 3:
$x = mysql_query("UPDATE yourTable SET Begin_Date=\"$newDate\", Timestamp=\"$newDate $newTime\" WHERE Primary_id = 3");

Something like:
insert into myTable (begin_date) values (date_add('2008-09-01', INTERVAL FLOOR(RAND()*30) DAY))
That should create a new row with a random begin_date in September 2008 (FLOOR keeps the offset between 0 and 29 days).
update myTable set Timestamp = date_add(begin_date, INTERVAL FLOOR(RAND()*1440) MINUTE)
Then that one should set the timestamp to a random minute of that day.
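
For anyone generating the values in PHP rather than in SQL, a single random timestamp inside the range avoids assembling day, hour, minute and second separately. This is a minimal sketch and not something to paste into phpMyAdmin. The function name is hypothetical, and UTC is assumed so the result does not depend on the server's time zone.

```php
<?php
// Hypothetical helper: random 'Y-m-d H:i:s' value with $start <= value < $end (UTC).
function randomDateTime(string $start, string $end): string
{
    $utc  = new DateTimeZone('UTC');
    $from = (new DateTimeImmutable($start, $utc))->getTimestamp();
    $to   = (new DateTimeImmutable($end, $utc))->getTimestamp();
    // random_int() is inclusive at both ends, so stop one second before $end.
    $ts = random_int($from, $to - 1);
    return (new DateTimeImmutable('@' . $ts))->setTimezone($utc)->format('Y-m-d H:i:s');
}

$timestamp = randomDateTime('2008-09-01 00:00:00', '2008-10-01 00:00:00');
$beginDate = substr($timestamp, 0, 10); // date part, e.g. for Begin_Date
```

Passing the first day of the following month as the exclusive end means the month length never has to be hard-coded.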

MYSQL most recent time

Is there a way to grab the most recent time of day. If your data in the database is formatted like this 07:00AM and 08:00pm and 12:00pm. Sorta like max(). But for the time. In a Mysql query.
Thanks
Eric

It would be best to store it in another format rather than as text. Or at least store it in 24 hour format, then a simple sort would work. You can convert it to 12-hour format when you display the data to the user.
But assuming you can't change your database schema, try this:
SELECT *
FROM your_table
ORDER BY STR_TO_DATE(your_time, '%h:%i%p') DESC
LIMIT 1
Note that this won't be able to use an index to perform the sorting.

You should try STR_TO_DATE() instead if you're using a string. If your times are always formatted as hh:mmAMPM, you can use:
MAX(STR_TO_DATE(YourTimeField,'%h:%i%p'))
This converts your string to a time, without any need to split it up by substring or anything, so MySQL would then see 09:07AM as 09:07:00 and 02:35PM as 14:35:00, and then would easily be able to determine the MAX of it.
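
The same conversion can be sketched in PHP for cases where the values are already in an array. The function name is hypothetical, and the input format (two-digit hour, minutes, AM/PM in either case) is assumed from the question.

```php
<?php
// Hypothetical helper: latest time of day among strings like '07:00AM' or '08:00pm'.
function latestTime(array $times): ?string
{
    $latest = null;
    $latestKey = '';
    foreach ($times as $time) {
        // '!' resets unparsed fields; 'h:iA' is 12-hour hour, minutes, AM/PM.
        $parsed = DateTimeImmutable::createFromFormat('!h:iA', strtoupper($time));
        if ($parsed === false) {
            continue; // skip values that do not match the expected format
        }
        $key = $parsed->format('H:i'); // 24-hour form compares correctly as text
        if ($latest === null || strcmp($key, $latestKey) > 0) {
            $latest = $time;
            $latestKey = $key;
        }
    }
    return $latest;
}

$latest = latestTime(['07:00AM', '08:00pm', '12:00pm']);
```

With the three values from the question, the comparison keys are 07:00, 20:00 and 12:00, so the original string '08:00pm' is returned.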

Assuming you are dealing with a DATETIME field in your MySQL, you can use this query to get the max time per day:
SELECT DATE(YourDateField), MAX(TIME(YourDateField)) FROM YourTable
GROUP BY DATE(YourDateField)
When you are dealing with a VARCHAR field, you can try a hack like this:
SELECT YourDateField, SUBSTRING(MAX(CONCAT(
    CASE WHEN YourTimeField LIKE '%AM%' THEN '0' ELSE '1' END,
    REPLACE(YourTimeField, '12:', '00:')
)), 2)
FROM YourTable
GROUP BY YourDateField

You can just sort.
select time_column from table order by time_column desc limit 1;
