More efficient way to perform multiple MySQL queries in PHP - php

I have an online store with thousands of orders and I'm writing some queries to figure out how much money each supplier (brand) has made on the site. I have the following queries, one for every month of the selected year:
$filterJan = "$filterYear-01";
$queryJan = "SELECT price, quantity FROM order_items WHERE productID='$productID' AND timestamp LIKE '%$filterJan%' LIMIT 10000";
$suppliersQueryFilter = mysql_query($queryJan, $connection) or die(mysql_error());
while($rowF = mysql_fetch_assoc($suppliersQueryFilter)) {
$price = $rowF["price"]*$rowF["quantity"];
$totalJan = $totalJan+$price;
}
** and so on for each month **
It takes ages to load (over 60 seconds at least) and I know it is not efficient in any shape or form. For each month these queries search through thousands of records. Is there a more efficient way of writing this to:
a) Reduce the amount of code to maybe 1 query
b) Make it more efficient to reduce loading times
$filterYear contains a year, like 2009.
So what this query does is it selects how much money has been made for each month for a selected year (which is assigned to $filterYear). So the result it generates is a table with Jan, Feb, March... with how much money has been made each month, so £2345, £2101, etc...

You should be storing your timestamp as an actual MySQL datetime value, which would make things like
WHERE timestamp BETWEEN $initialtime AND $finaltime
GROUP BY YEAR(timestamp), MONTH(timestamp)
trivially possible. That would reduce your multiple, essentially identical queries to a single query.
You can use derived values for this, but it'll be less efficient than using a native datetime field:
GROUP BY SUBSTR(timestamp, 1, 4), SUBSTR(timestamp, 6, 2)

For best performance, you'd want to submit a query something like this to the database:
SELECT DATE_FORMAT(i.timestamp,'%Y-%m') AS `month`
, SUM(i.price*i.quantity) AS `total`
FROM order_items i
WHERE i.productID = 'foo'
AND i.timestamp >= '2013-01-01'
AND i.timestamp < '2013-01-01' + INTERVAL 12 MONTH
GROUP BY DATE_FORMAT(i.timestamp,'%Y-%m')
(This assumes that the timestamp column is MySQL datatype TIMESTAMP, DATETIME or DATE)
With the deprecated mysql_ interface (removed in PHP 7), avoid SQL injection vulnerabilities by escaping values with the mysql_real_escape_string function. (A better option would be to use the mysqli or PDO interface, and use a prepared statement with bind placeholders.)
We want the predicates on the timestamp to be on the BARE column, so MySQL can make use of an available suitable index for a range scan operation, rather than requiring a full scan of every row in the table.
We also want to use the power of the server to quickly derive a total, and return just the total, rather than retrieving every flipping row, and processing each of those rows individually (RBAR = row by agonizing row)
The GROUP BY clause and the SUM() aggregate function are tailor made to suit this result.
With mysql_ interface, the query would look something like this:
$querytext = "
SELECT DATE_FORMAT(i.timestamp,'%Y-%m') AS `month`
, SUM(i.price*i.quantity) AS `total`
FROM order_items i
WHERE i.productID = '" . mysql_real_escape_string($productID) . "'
AND i.timestamp >= '" . mysql_real_escape_string($filterYear) . "-01-01'
AND i.timestamp < '" . mysql_real_escape_string($filterYear) . "-01-01' +
INTERVAL 12 MONTH
GROUP BY DATE_FORMAT(i.timestamp,'%Y-%m')";
# for debugging
#echo $querytext;
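The single-query idea can be tried end to end without a MySQL server. What follows is only a minimal sketch in Python, using the standard-library sqlite3 module as a stand-in for MySQL. The sample rows are made up, the function name monthly_totals is hypothetical, and SQLite's strftime('%Y-%m', ...) plays the role that DATE_FORMAT(..., '%Y-%m') plays in MySQL. The range predicates stay on the bare timestamp column and the values are bound as parameters instead of being concatenated into the SQL text.

```python
import sqlite3


def monthly_totals(conn, product_id, year):
    """Return {'YYYY-MM': total} for one product and one year, using a single query."""
    start = f"{int(year):04d}-01-01"
    end = f"{int(year) + 1:04d}-01-01"  # half-open upper bound: first day of the next year
    rows = conn.execute(
        """
        SELECT strftime('%Y-%m', timestamp) AS month,
               SUM(price * quantity)         AS total
        FROM order_items
        WHERE productID = ?
          AND timestamp >= ?
          AND timestamp <  ?
        GROUP BY month
        ORDER BY month
        """,
        (product_id, start, end),
    ).fetchall()
    return dict(rows)


# Assumption: an in-memory SQLite table that mimics the order_items table from the question
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_items (productID TEXT, price REAL, quantity INTEGER, timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO order_items VALUES (?, ?, ?, ?)",
    [
        ("foo", 10.0, 2, "2013-01-15 09:30:00"),
        ("foo", 5.0, 1, "2013-01-20 12:00:00"),
        ("foo", 7.5, 4, "2013-02-03 18:45:00"),
        ("foo", 99.0, 1, "2014-01-01 00:00:00"),  # outside the selected year
        ("bar", 50.0, 1, "2013-01-10 10:00:00"),  # different product
    ],
)
totals = monthly_totals(conn, "foo", 2013)
print(totals)  # {'2013-01': 25.0, '2013-02': 30.0}
```

Months with no orders simply do not appear in the result, so the code that renders the Jan to Dec table would need to fill in zeros for missing keys.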

Related

Best way to run a SQL query for different dates

I have a PHP application that records the sessions of various devices connected to a server. The database has a table session, with the columns device_id, start_date, end_date. To know the number of devices connected on a given date, I can use this query:
SELECT COUNT(DISTINCT device_id)
FROM session
WHERE :date >= start_date AND (:date <= end_date OR end_date IS NULL)
where :date is passed as a parameter to the prepared statement.
This works fine, but if I want to know the number of devices for every day of the year, that makes 365 queries to run, and I'm afraid things could get very slow. It doesn't feel right to iterate over the dates in PHP; it seems to me that there should be a better way to do this, with a single query to the database.
Is it possible to do this with a single query?
Would it actually be faster than iterating over the dates in PHP and running multiple queries?
EDIT to answer the comments:
I do want the number for each separate day (to draw a graph, for example), not just the sum.
The datatype is DATE.
If I understand correctly then you first need a table of dates, something like:
create table dates(dt date);
insert into dates(dt) values
('2001-01-01'),
('2001-01-02'),
...
('2100-12-31')
And use a query like so:
select dates.dt, count(distinct session.device_id)
from dates
join session on start_date <= dates.dt and (dates.dt <= end_date or end_date is null)
-- change to left join to include zero counts
where dates.dt >= :date1 and dates.dt <= :date2
group by dates.dt
PS: since you mentioned charts I might add that it is possible to avoid the table of dates. However, the result will only contain dates on which the count of devices changed. Chart APIs usually accept this kind of data but still create data points for all dates in between.
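The calendar-table join can be checked on a small data set. This is a rough sketch in Python with the standard-library sqlite3 module standing in for the real database; the six-day date range and the sample sessions are invented for illustration. It uses the LEFT JOIN variant so that days without any connected device show up with a count of 0, and COUNT(DISTINCT ...) so that a device with two overlapping sessions is counted once, as in the original single-date query.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dates (dt TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE session (device_id INTEGER, start_date TEXT, end_date TEXT)")

# Fill the calendar table for a small illustrative range (2021-01-01 .. 2021-01-06)
first = date(2021, 1, 1)
conn.executemany(
    "INSERT INTO dates (dt) VALUES (?)",
    [((first + timedelta(days=i)).isoformat(),) for i in range(6)],
)

# Made-up sessions; end_date NULL means the device is still connected
conn.executemany(
    "INSERT INTO session VALUES (?, ?, ?)",
    [
        (1, "2021-01-01", "2021-01-02"),
        (1, "2021-01-02", "2021-01-03"),  # same device, overlaps on 2021-01-02
        (2, "2021-01-05", None),
        (3, "2021-01-06", "2021-01-06"),
    ],
)

counts = conn.execute(
    """
    SELECT dates.dt, COUNT(DISTINCT session.device_id)
    FROM dates
    LEFT JOIN session
      ON session.start_date <= dates.dt
     AND (dates.dt <= session.end_date OR session.end_date IS NULL)
    WHERE dates.dt >= ? AND dates.dt <= ?
    GROUP BY dates.dt
    ORDER BY dates.dt
    """,
    ("2021-01-01", "2021-01-06"),
).fetchall()

for day, devices in counts:
    print(day, devices)
```

One round trip returns one row per day, which is the shape a chart needs; whether it beats 365 separate prepared-statement calls on a given server is something to measure rather than assume.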

SQL select with DATE in WHERE, what is faster?

I am querying a table that has a datetime column holding only a year and a month.
The day is always 01 and the time is 00:00:00.
When selecting data with a PHP query, which is faster?
$date = "2020-04";
$query = "SELECT * FROM table WHERE datum LIKE ?",$date ;
or
$date = "2020-04";
$rok = substr($date,0,4);
$mesic = substr($date,5,2);
$query = "SELECT * FROM table WHERE YEAR(datum) = ? AND MONTH(datum) = ?",$rok,$mesic;
The table contains hundreds of thousands of rows.
We always used to have the rule:
"Avoid functions in the WHERE clause".
The background is that the database server has to scan the whole table to calculate the function result for each row (even if it only scans the index).
So it cannot use the keys efficiently!
LIKE is faster!
If you match on the beginning of the key (as you write), it can also use the key efficiently.
I would recommend:
$date = "2020-04";
$sql = "SELECT * FROM table WHERE datum = concat(?, '-01')", $date;
The basic idea is to not apply functions (either date functions or pattern matching) to the column being searched, as this prevents the database from taking full advantage of an index on that column (and it involves unnecessary additional computation for each and every row). Instead, you can easily generate the exact value to search for by concatenating the variable in the query.
In the more typical case where datum had real day and time components, and you want to filter on a given month, you would do:
SELECT *
FROM table
WHERE datum >= concat(?, '-01') AND datum < concat(?, '-01') + interval 1 month
Note that this assumes that you are using MySQL, because the syntax suggests it. Other databases have equivalent ways to build dates from strings.
Neither. In both cases you have a function call on the datum column. With YEAR() and MONTH() it is obvious. With LIKE you are converting to a string. Both impede the optimizer and prevent the use of indexes.
The correct structure would be:
where datum >= ? and
datum < ? + interval 1 month -- syntax might vary depending on the database
where ? are parameter place-holders for the beginning of the month. I would suggest that you construct this in the application as a date constant or a string constant of the canonical form YYYY-MM-DD.
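Building the month boundaries in the application, as suggested above, might look like the following. This is a small sketch in Python; the helper name month_bounds is hypothetical, and the table and column names are taken from the question. Both ends of the half-open range are computed up front, so the query needs no date arithmetic at all and compares the bare datum column against two constants.

```python
from datetime import date


def month_bounds(year_month):
    """Turn 'YYYY-MM' into (first day of that month, first day of the next month)."""
    year, month = (int(part) for part in year_month.split("-"))
    start = date(year, month, 1)
    # December rolls over into January of the following year
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    return start.isoformat(), end.isoformat()


start, end = month_bounds("2020-04")
# Half-open range: start is included, end is excluded
sql = "SELECT * FROM `table` WHERE datum >= ? AND datum < ?"
params = (start, end)
print(sql, params)
```

The strings come out in the canonical YYYY-MM-DD form, so they can be passed straight to the placeholders of a prepared statement.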

DISTINCT datetime in SQL query

I am graphing data from a large database (150K+ rows) and some data points have identical timestamps with different price values.
For example:
time => 1502050000
price => 1
time => 1502050000 // identical timestamp
price => 1.1
In the SQL query, I want to ignore duplicate timestamps. I have found that DISTINCT will likely do the job, but I'm stuck with applying this to a datetime field in my database. Here is my working SQL query which pulls in duplicate timestamps.
SELECT time, price FROM price_table WHERE time >= '" . $data_from . "' ORDER BY time ASC
My goal is to get unique timestamps only, and avoid querying duplicates.
You can use aggregation:
SELECT time, MIN(price) as price
FROM price_table
WHERE time >= '" . $data_from . "'
GROUP BY time
ORDER BY time ASC;
Of course, this raises the question of which price you want when there are duplicates.
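To see what the aggregation does with duplicate timestamps, here is a short sketch in Python with the standard-library sqlite3 module as a stand-in database. The rows are invented, the time column is stored as an integer as in the example values from the question, and the lower bound is passed as a bound parameter rather than concatenated into the query string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE price_table (time INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO price_table VALUES (?, ?)",
    [
        (1502049000, 0.9),  # before the requested start, filtered out
        (1502050000, 1.0),
        (1502050000, 1.1),  # identical timestamp, different price
        (1502050060, 1.2),
    ],
)

data_from = 1502050000
rows = conn.execute(
    """
    SELECT time, MIN(price) AS price
    FROM price_table
    WHERE time >= ?
    GROUP BY time
    ORDER BY time ASC
    """,
    (data_from,),
).fetchall()
print(rows)  # [(1502050000, 1.0), (1502050060, 1.2)]
```

Swapping MIN for MAX or AVG changes which price represents a duplicated timestamp; the grouping itself stays the same.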

Get all rows from a specific month and year

I have a PHP script that always queries all the data from a database table, and it's getting pretty slow. I really just need the data of a specific month and year.
Is there a simple way to get only those entries? For example, everything from February 2013?
The column that stores the dates in my table is of type datetime, if that applies to the solution.
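You can add that condition in the WHERE clause of your select statement. I would recommend using the BETWEEN operator with two dates. Because the column is a datetime, give the upper bound an end-of-day time; a bare '2013-02-28' is read as midnight and would skip the rest of that day:
SELECT myColumns
FROM myTable
WHERE dateColumn BETWEEN '2013-02-01 00:00:00' AND '2013-02-28 23:59:59';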
If you mean to say you want everything beginning with February 2013, you can do so using the greater than or equal to operator:
SELECT myColumns
FROM myTable
WHERE dateColumn >= '2013-02-01';
EDIT
While the above are my preferred methods, I would like to add for completeness that MySQL also offers functions for grabbing specific parts of a date. If you wanted to create a parameterized query where you could pass in the month and year as integers (instead of a start and end date), you could adjust your query like this:
SELECT myColumns
FROM myTable
WHERE MONTH(dateColumn) = 2 AND YEAR(dateColumn) = 2013;
Here is a whole bunch of helpful date and time functions.
You should index the datetime field for added efficiency and then use BETWEEN in your SQL. This lets the MySQL engine skip the records you are not interested in instead of reading them all.

Assign Random Order to 10000+ records

I am trying to assign a defined random order to a table of more than 10,000 records. I have a function that takes a start date and adds 1 second for each consecutive date, assigned randomly. Then I could sort by the randomly assigned date. My function worked fine with 50 records, but fails with more than 10,000 records.
It sets correct dates for about 9000 records, but 1146 records get assigned 0 (1969-12-31 19:00:00). Any help getting this or something similar to work would be appreciated.
function randomize(){
$count = $this->Application->find('count');
$order = range(0, $count-1); // Array of numbers 0 to count-1
$startDate = strtotime('December 13, 2011 0:00:00');
shuffle($order); // scramble array of numbers
$Applications = $this->Application->find('all');
set_time_limit(0);
foreach($Applications as $app){
$this->Application->id = $app['Application']['id'];
$this->Application->saveField('order', date('Y-m-d H:i:s', $startDate + $order[$this->Application->id]));
}
set_time_limit(30);
}
Update: I am using MySQL database but need a permanent state for 1 randomization, not repeated randomization as per ORDER BY RAND(). I also updated the code (see above) to reduce overhead, and increased memory in php.ini from 128M to 256M. With the code change the bad dates are no longer 0 but the same as $startDate indicating it may be an issue with the $order array of numbers.
Questions:
Are you sure you're using the proper date format there?
Why a start date to randomize? Take the date as a fixed number, as you're doing (X in this case): If you do X + givenOrderNumber for each record then the order will be defined by givenOrderNumber... so why the unnecessary addition?
I've got the query I understand you're looking for here:
set @num = 0;
select *,
date_add('2011-12-13 00:00:00', interval @num := @num + 1 second) as newOrder
from table1
order by newOrder
It sorts the records by a date which is incremented by one each time. Now if you want to use the application id cheat:
select *,
date_add('2011-12-13 00:00:00', interval id second) as newOrder
from table1
order by newOrder
However, whether this is useful or not for you... it seems to be unnecessary.
Hope this helps.
I think this is a weird way to do it. Can you tell us which type of database you are using? Some databases, like MySQL, support random ordering, so you won't need to do all this to get a random order...
In MySQL it looks like this:
SELECT * FROM tbl_name ORDER BY RAND();
I would also follow juhana's comment: using integers is easier and better than using dates, and you may use bigint if you are going to have a huuuuuge amount of records...
Still, I would first check whether your database supports random ordering, so you do less work :D
Hope this helps you :D
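One possible explanation for the failed records, not confirmed anywhere in this thread, is that the shuffled $order array only has keys 0 to count-1 while it is looked up by record id: any id above count-1 (for example after rows were deleted) finds no entry and falls back to the start date. A permanent random order can avoid that by pairing ids with shuffled positions by list position rather than by id value. The sketch below shows the idea in Python; the function name assign_random_order and the sample ids are made up, and writing the positions back to the table is left out.

```python
import random


def assign_random_order(ids, seed=None):
    """Map every id to a unique position 0..len(ids)-1 in random order."""
    positions = list(range(len(ids)))
    random.Random(seed).shuffle(positions)
    # Pair by list position, not by id value, so gaps or large ids
    # can never point outside the shuffled range
    return dict(zip(ids, positions))


# Hypothetical ids with gaps, including one far larger than the row count
ids = [3, 7, 8, 15, 10042]
order = assign_random_order(ids, seed=1)

for record_id, position in order.items():
    print(record_id, position)
```

Storing the integer position directly in an order column, as suggested in the comments, gives the same permanent randomization without converting to dates.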
