Convert complex query from Laravel to SQL - php

I'm trying to move some statistics logic entirely to SQL to vastly improve performance, however it's quite complex and I'm not sure how to achieve this using procedures/functions in SQL - or whether it's even possible.
Stats table looks like this:
Row in the table looks like this:
This is the code I'm trying to convert. $this->stats is an Eloquent Collection of all the rows in the table:
return $this->stats
->groupBy(function ($stat) {
return $stat->created_at->format('Y-m-d');
})
->map(function ($stats) {
$times = [];
for ($hour = 0; $hour < 24; $hour++) {
$thisHour = $stats->filter(function ($stat) use ($hour) {
return (int) $stat->created_at->format('H') === $hour;
});
$times[$hour] = $thisHour->isNotEmpty()
? $thisHour->sum(function ($stat) {
return $stat->data->count;
}) : 0;
}
return $times;
});
This outputs something like this:
{
"2018-12-20": {
0: 54,
1: 87,
2: 18,
3: 44,
4: 35,
...
}
}
So it's grouped by date, and each date contains 0-23 (the hours of the day) with the corresponding value (in this case the summation of the row's data->count property).
In a different query I've been able to get the summation of the data->count property by using this:
SELECT SUM(data->>"$.count") AS value
So is this even possible to do in SQL? I'm imagining a date column plus one column per hour (hour_0, hour_1, etc.), with the values underneath.
Any help would be greatly appreciated!

The first step I would take to understand how Laravel is doing it would be enabling query logging and dumping the query:
DB::connection()->enableQueryLog();
// ... do your query
$queries = DB::getQueryLog();
That will allow you to recreate the Eloquent query in plain SQL. From there, you can work on optimizing and condensing.
https://laravel.com/docs/5.0/database#query-logging

UPDATED
As you have now included the table structure, the following query should give you the required results in tabular format. You will need to parse the results to convert them to JSON.
Updated SQL
SELECT DATE_FORMAT(stats.updated_at, "%Y-%m-%d") AS `date`,
HOUR(stats.updated_at) AS `hour`,
COUNT(stats.id) AS record_count
FROM stats
GROUP BY `date`, `hour`
ORDER BY 1 ASC, 2 ASC
The resulting tabular data would look similar to this.
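As a rough sketch of the "parse the results" step, the tabular rows could be pivoted in PHP into the date => hour => value shape from the question. The helper name pivotHourlyRows and the $valueKey parameter are hypothetical; the sketch assumes each row exposes date, hour and record_count columns like the query above, and it fills hours that have no row with 0.

```php
<?php
// Hypothetical helper (not part of Laravel): pivots flat rows of
// (date, hour, value) into date => [0..23 => total].
function pivotHourlyRows(iterable $rows, string $valueKey = 'record_count'): array
{
    $result = [];
    foreach ($rows as $row) {
        $row = (array) $row; // accept stdClass rows as well as plain arrays
        $date = $row['date'];
        if (!isset($result[$date])) {
            // Start every date with all 24 hours set to 0.
            $result[$date] = array_fill(0, 24, 0);
        }
        $result[$date][(int) $row['hour']] += (int) $row[$valueKey];
    }
    return $result;
}
```

Passing the result to json_encode() then gives one entry per date, each holding 24 hourly totals.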

Related

SQL fill in empty months if data isn't present in Laravel 8 query

I'm using the query builder in my Laravel 8 project to create a monthly sum of all of the deleted users in my application, I'm then outputting two items to use as part of a graph, total and date.
This works well, but, if a month didn't have any data then it would skip straight onto the next month, e.g:
2021-01
2021-04
2021-05
How can I modify the query to add all of the months, from a given start date, up until "now" and effectively add blank values for those months that don't have data?
My current query is:
$data = User::selectRaw('DATE_FORMAT(created_at, "%Y-%m") as date, COUNT(*) as total')
->groupByRaw('DATE_FORMAT(created_at, "%Y-%m")')
->withTrashed()
->whereNotNull('deleted_at')
->get();
And I'm thinking of calculating the start by doing something like this:
$user = User::orderBy('created_at', 'asc')->first();
$start = $user->created_at;
$data = User::selectRaw('DATE_FORMAT(created_at, "%Y-%m") as date, COUNT(*) as total')
->groupByRaw('DATE_FORMAT(created_at, "%Y-%m")')
->withTrashed()
->whereNotNull('deleted_at')
->get();
$end = Carbon::now()->endOfMonth();
I'm not sure how to get it into the query, though.
The problem here is that groups originate from the rows, not the other way around. A group will not exist unless a row exists to be included within the group. You only see "missing" months because, in your mind, there are months between January and April.
I'd recommend doing it in post-processing, because any clever attempts to create phantom rows so that groups appear will inevitably be more complicated and more frustrating to maintain.
It may feel clunky, but looping through months from Carbon and adding values to your query result will work fine. Plus, you don't need to rely on $start from your first user result, you can set it yourself.
$start = Carbon::today()->subYear(); // Use any start date, even include it from user input (like a datepicker).
$end = Carbon::today(); // Use any end date, though it won't be useful any later than today.
$loop = $start->copy();
while ($loop->lessThanOrEqualTo($end)) {
// Compare against the month currently being visited ($loop), not the end date.
$exists = $data->first(function($item) use ($loop) {
return $item->date == $loop->format('Y-m');
});
if (!$exists) {
$row = new stdClass();
$row->date = $loop->copy()->format('Y-m');
$row->total = 0;
$data->push($row);
}
$loop->addMonth(); // This keeps the loop going.
}
This accomplishes what you want and doesn't get into any N+1 issues.
Edit: Added example below in re-usable function.
function fillEmptyMonths(Collection $data, Carbon $start, Carbon $end): Collection
{
$loop = $start->copy();
// Compute the month span once rather than re-evaluating it on every iteration.
$totalMonths = $start->diffInMonths($end);
for ($months = 0; $months <= $totalMonths; $months++) {
if ($data->where('date', '=', $loop->format('Y-m'))->isEmpty()) {
$row = new stdClass();
$row->date = $loop->copy()->format('Y-m');
$row->total = 0;
$data->push($row);
}
$loop->addMonth();
}
return $data;
}
You could also expand this to take another parameter that defines the increment (and pass it "month", "day", "year", etc.). But if you are only using month, this should work.

count of subquery in doctrine - querybuilder

I have simple query:
$this->qb->select('l.value')
->addSelect('count(l) AS cnt')
->addSelect('hour(l.time) AS date_hour')
->from(Logs::class, 'l')
->where('l.header = :header')
->groupBy('l.value')
->addGroupBy('date_hour')
->setParameter('header', 'someheader')
This code selects three columns, has one condition and groups by two columns.
I want to get the number of records this query returns. Of course, I don't want to download all the records and check the size of the downloaded data.
Question:
How can I rebuild this query to get the result from the database via getSingleScalarResult()?
I think you should use the id column for the count here:
->addSelect('count(l) AS cnt')
Something like this, although if you show your entity I can suggest the right solution:
$this->qb->select('l')
->addSelect('count(l.id) AS cnt')
->addSelect('hour(l.time) AS date_hour')
->from(Logs::class, 'l')
->where('l.header = :header')
->groupBy('l.value')
->addGroupBy('date_hour')
->setParameter('header', 'someheader');
$count = $this->qb->getQuery()->getSingleScalarResult();
Since GROUP BY + COUNT + getSingleScalarResult does not seem to work together in DQL queries, I guess your best option is to count afterwards:
$sub
->select('l.id')
->addSelect('l.value as hidden val')
->addSelect('hour(l.time) AS hidden date_hour')
->from(Logs::class, 'l')
->where('l.header = :header')
->groupBy('val')
->addGroupBy('date_hour')
->setParameter('header', 'someheader')
;
$subIds = array_column($sub->getQuery()->getResult(), 'id');
$count = count($subIds);

Time Calculations with MySQL

I'm writing a time logging programme for a client who is a piano tuner, and I've written the following PHP code to give a record a status of 'to do':
$last_tuned = '2017-01-05';
$tuning_period = 3;
$month_last_tuned = date('Y-m', strtotime(date('Y-m-d', strtotime($last_tuned))));
$next_tuning = date('Y-m', strtotime($month_last_tuned.(' +'.$tuning_period.' months')));
if (time() > strtotime($next_tuning.' -1 months')) {
if (time() > strtotime($next_tuning)) {
return 'late';
} else {
return 'upcoming';
}
}
As you can see, the $last_tuned variable is of the date(YYYY-MM-DD) format. This is then converted to a (YYYY-MM) format.
Once converted, a number of months equal to $tuning_period is added to the $month_last_tuned variable, giving us a month and year value for when we need to add a new record.
If the current time (found with time()) is greater than the $next_tuning variable - 1 month, it returns that the task is upcoming. If it's after the $next_tuning variable, it returns that the task is late.
I now have to write a MySQL query to list the items that would return as upcoming or late.
How would I write this in MySQL? I'm not very good with MySQL functions, and some help would be much appreciated.
My attempt at the logic is:
SELECT * FROM records
// The next lines are to get the most recent month_last_tuned value and add the tuning_period variable
WHERE
NOW() > (SELECT tuning_date FROM tunings ORDER BY tuning_date ASC LIMIT 1)
+
(SELECT tuning_period FROM records WHERE records.id = INITIAL CUSTOMER ID)
I know that that is completely wrong. The logic is pretty much there though.
My database schema is as follows:
I expect the rows returned from the query to be on par with the 'late' or 'upcoming' values in the PHP code above. This means that the rows returned will be within 1 month of their next tuning date (calculated from the last tuning plus the tuning period).
Thanks!
You'd probably be better off with using the DateTime object instead of manipulating date strings.
$last_tuned = '2017-01-05';
$tuning_period = 3; // months
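// The leading "!" resets the time to 00:00:00 instead of using the current time.
$dt_last_tuned = DateTimeImmutable::createFromFormat('!Y-m-d',$last_tuned);
// Build the interval from $tuning_period rather than hard-coding 3 months.
$dt_next_tuning = $dt_last_tuned->add(new DateInterval('P'.$tuning_period.'M'));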
$dt_now = new DateTimeImmutable();
$dt_tuning_upcoming = $dt_next_tuning->sub(new DateInterval('P1M'));
if( $dt_now > $dt_next_tuning) {
return 'late';
}
if( $dt_now > $dt_tuning_upcoming) {
return 'upcoming';
}
You can also use these DateTime objects in your MySQL queries, by building the query and passing through something like $dt_next_tuning->format('Y-m-d H:i:s'); as needed.
Given your table structure, however, it may be easier to just get all the relevant records and process them. It's a little difficult to tell exactly how the pieces fit together, but generally speaking MySQL shouldn't be used for "processing" stuff.
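For illustration, the question's month-based rule could be wrapped in a small function. This is only a sketch: tuningStatus is a hypothetical name, the current time is passed in as $now so the logic can be checked with fixed dates, and it assumes the last tuning date is truncated to the first day of its month, as the original date('Y-m') code does.

```php
<?php
// Hypothetical helper mirroring the question's logic; returns
// 'late', 'upcoming', or null when the tuning is not due yet.
function tuningStatus(string $lastTuned, int $periodMonths, DateTimeImmutable $now): ?string
{
    // Truncate the last tuning date to midnight on the first day of its month.
    $monthLastTuned = (new DateTimeImmutable($lastTuned))
        ->modify('first day of this month')
        ->setTime(0, 0);

    // Next tuning month, and the point one month earlier where "upcoming" starts.
    $nextTuning = $monthLastTuned->add(new DateInterval('P' . $periodMonths . 'M'));
    $upcomingFrom = $nextTuning->sub(new DateInterval('P1M'));

    if ($now > $nextTuning) {
        return 'late';
    }
    if ($now > $upcomingFrom) {
        return 'upcoming';
    }
    return null;
}
```

With a last tuning of 2017-01-05 and a 3-month period, the next tuning month starts on 2017-04-01, so dates in March count as upcoming and dates after 1 April as late.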

Database query using CodeIgniter returning null values into array?

I'm pulling numerical data from the database based on a date selected by the user. For some reason, when I select my function to pull all of today's data (there is none), it still forms an array of null values.
I'm thinking the problem lies in either the select_sum() or where() functions. Here's my CI query:
$this->db
->select_sum('column1')
->select_sum('column2')
->select_sum('column3')
->where('created_at >=', date('Y-m-d'));
$query = $this->db->get('table_name');
And here is my foreach loop that pulls all of the selected data into an array to be used throughout the page:
$popular_items_array = array();
foreach ($query->result() as $key => $row)
{
if ($row == NULL) {
echo "Error populating table.";
} else {
$popular_items_array = $row;
}
}
To take a look at the data, I then did:
echo json_encode($popular_items_array);
Which turns out showing this:
{"column1":null,"column2":null,"column3":null}
If I select a different time frame (where data actually exists by the set date) that same JSON echo will display the existing data. The thing I'm not understanding is, why is the query returning anything at all? Why isn't it just failing? And how can I run a check / loop that will catch this problem, and display an error message letting the user know that no data exists on that date?
If you'd prefer to get no record back at all you could amend your code to be:
$this->db
->select_sum('column1')
->select_sum('column2')
->select_sum('column3')
->where('created_at >=', date('Y-m-d'))
->having('count(*) > 0');
$query = $this->db->get('table_name');
This is how aggregate functions (such as sum(), avg(), max() and others) work: they return the aggregated value of the result set. If the result set is empty, they aggregate that empty set and still return a single row, with NULL for functions such as sum(). This causes a lot of confusion and bug reports, but it is the intended behaviour; a more detailed explanation can be found at dba.stackexchange.com
You can use COALESCE to substitute something more useful for the NULL values, or you can add a count() so you can tell how many rows were used to generate the sum()s.
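On the PHP side, one way to catch this case is to check whether every aggregate in the returned row is NULL. This is a minimal sketch with a hypothetical helper name, isEmptyAggregateRow; note that it cannot tell "no rows matched" apart from "rows matched, but every summed value was NULL", which is where an extra count() column would help.

```php
<?php
// Hypothetical helper: returns true when every aggregate value in the row
// is NULL, which is what SUM() produces when no rows matched.
// Accepts the stdClass rows from result() as well as plain arrays.
function isEmptyAggregateRow($row): bool
{
    foreach ((array) $row as $value) {
        if ($value !== null) {
            return false;
        }
    }
    return true;
}
```

Inside the foreach loop from the question, this check could replace the $row == NULL comparison, which never matches because a row object is always returned.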
Strangely enough, if you add a group by it will work as you would expect (at least with mysql):
SELECT sum(id) sum_id FROM `users` WHERE 0 # => array( array('sum_id' => null) )
But:
SELECT sum(id) FROM `users` WHERE 0 GROUP BY null # => array()

Is there a short and sweet function toString a single jSON object or similar in my example?

I have converted a PHP array into a single selection in a Codeigniter PHP function like so...
function check_week($week_array)
{
$sql = "SELECT X_id FROM products WHERE date_sub(curdate(), INTERVAL 1 DAY) <= updated_at;";
$query = $this->db->query($sql, $week_array);
$week = $query->result_array();
$weeks = json_encode($week[array_rand($week)]);
return $weeks;
}
and I get a return of ...
{"X_id":"XXX1AXPJV6"}
I have already narrowed this down to one id, so there is no need for a loop; I just need the id in one simple move (so I just want XXX1AXPJV6 as a variable). Also, I did try keeping this in PHP, but CodeIgniter was finicky about allowing any conversion to string because the call to this model comes from a library file.
btw, my 1 DAY interval is for testing, it will be 7
An attempt at using...
$weeks2 = $weeks[0]['X_id'];
return $weeks2;
...gets error "Cannot use string offset as an array in..."
If I understand the question correctly,
$weeks = json_encode($week[array_rand($week)]);
should be
$weeks = reset($week[array_rand($week)]); // returns the value of the first element in the array
hope that helps.
If you only need one random row, your SQL should retrieve only one random row.
function check_week($week_array)
{
$sql = "SELECT X_id FROM products WHERE DATE_SUB(CURDATE(), INTERVAL 1 DAY) <= updated_at ORDER BY RAND() LIMIT 1;";
$query = $this->db->query($sql, $week_array);
$week = $query->row_array();
return $week['X_id']; // plain string; json_encode() would wrap it in quotes
}
Note the changes in the query, as well as the use of row_array() which returns a single key => value array, instead of result_array() which returns an array of arrays.
For what it's worth, you could've gotten the result you need by altering this line to:
$weeks = $week[array_rand($week)]['X_id'];
But the above is still a more suitable solution. Don't retrieve lots of records if you only need one.
Also, what is the $week_array parameter for? You are using it as a query binding, but there are no ? places for the bindings to go in your query, making it pointless.
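To illustrate the difference between the JSON string and the plain value, here is a small sketch that works on an array shaped like the output of result_array(). The helper name pickRandomId is hypothetical, and the function takes the rows as an argument so it can be tried without a database.

```php
<?php
// Hypothetical helper: $rows mimics result_array(), i.e. a list of
// associative arrays such as [['X_id' => 'XXX1AXPJV6'], ...].
function pickRandomId(array $rows): ?string
{
    if ($rows === []) {
        return null; // nothing matched the query
    }
    $row = $rows[array_rand($rows)];
    return $row['X_id']; // plain string, no JSON encoding
}
```

Calling json_encode() on that plain string would give it back wrapped in double quotes, and calling it on the whole row gives the {"X_id":"..."} string from the question, which is why indexing into the encoded result fails.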
