Mysql search query - php

I am developing a car rental site. I have two tables test_tbl_cars and test_reservations.
I am using the search query (cribbed from Jon Kloske in "How do I approach this PHP/MYSQL query?"):
$sql = mysql_query("SELECT
test_tbl_cars.*,
SUM(rental_start_date <= '$ddate' AND rental_end_date >= '$adate') AS ExistingReservations
FROM test_tbl_cars
LEFT JOIN test_reservations USING (car_id)
GROUP BY car_id
HAVING ExistingReservations = 0");
This gives me excellent search results but the test_tbl_cars table contains many cars which in any given search returns several of the same car model as being available.
How can I filter the query return such that I get one of each model available?

Use the DISTINCT clause:
$sql = mysql_query("SELECT
DISTINCT test_tbl_cars.model, test_tbl_cars.*,
SUM(rental_start_date <= '$ddate' AND rental_end_date >= '$adate') AS ExistingReservations
FROM test_tbl_cars
LEFT JOIN test_reservations USING (car_id)
GROUP BY car_id
HAVING ExistingReservations = 0");

Awww, should have tagged me, I only saw this now, over a year later! You've probably already figured out how to work this by now, but I'll take a crack at it anyway for completeness' sake and because I don't think most of the answers here are doing what you want.
So the problem you are having is that in the other question each room had a unique ID and it was unique rooms people were interested in booking. Here, you're extending the concept of a bookable item to a pool of items of a particular class (in this case, model of car).
There may be a way to do this without subqueries, but by far the easiest way is to take the original idea from my other answer and extend it by wrapping it in another query that does the grouping into models (and as you'll see shortly, we get a bunch of other useful stuff for free out of doing this).
So, firstly, let's start by getting the list of cars with counts of conflicting reservations (as per the update to my other answer):
(I'll use your query for these examples as a starting point, but note you really should use prepared statements or at the very least escaping functions supplied by your DB driver for the two parameters you're passing)
SELECT car_id, model_id, SUM(IF(rental_id IS NULL, 0, rental_start_date <= '$ddate' AND rental_end_date >= '$adate')) AS ConflictingReservations
FROM test_tbl_cars
LEFT JOIN test_reservations USING (car_id)
GROUP BY car_id
This will return one row per car_id giving you the model number, and the number of reservations that conflict with the date range you've specified (0 or more).
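As a rough illustration of the prepared-statement advice above, here is a minimal sketch that binds the two dates as parameters instead of splicing them into the SQL string. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, made-up sample rows, and a hypothetical helper name (conflicts_per_car); CASE replaces MySQL's IF() because SQLite spells it differently.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; all rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test_tbl_cars (car_id INTEGER PRIMARY KEY, model_id INTEGER);
    CREATE TABLE test_reservations (
        rental_id INTEGER PRIMARY KEY,
        car_id INTEGER,
        rental_start_date TEXT,
        rental_end_date TEXT
    );
    INSERT INTO test_tbl_cars VALUES (1, 10), (2, 10), (3, 20);
    INSERT INTO test_reservations VALUES
        (1, 1, '2024-05-01', '2024-05-05'),
        (2, 1, '2024-05-04', '2024-05-08'),
        (3, 3, '2024-06-01', '2024-06-03');
""")


def conflicts_per_car(conn, adate, ddate):
    """Return {car_id: number of reservations overlapping [adate, ddate]}."""
    sql = """
        SELECT car_id,
               SUM(CASE WHEN rental_id IS NULL THEN 0
                        ELSE rental_start_date <= ? AND rental_end_date >= ?
                   END) AS ConflictingReservations
        FROM test_tbl_cars
        LEFT JOIN test_reservations USING (car_id)
        GROUP BY car_id
    """
    # The driver binds the values; they are never concatenated into the SQL.
    return dict(conn.execute(sql, (ddate, adate)).fetchall())


result = conflicts_per_car(conn, adate="2024-05-03", ddate="2024-05-06")
print(result)
```

In PHP the same idea would go through the prepared-statement support of whichever driver you use; only the binding mechanism changes, not the query.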
Now at this stage if we were asking about individual cars (rather than just models of cars available) we could restrict and order the results with "HAVING ConflictingReservations = 0 ORDER BY model_id" or something.
But, if we want to get a list of the availability of ~models~, we need to perform a further grouping of these results to get the final answer:
SELECT model_id, COUNT(*) AS TotalCars, SUM(ConflictingReservations = 0) AS FreeCars, CAST(IFNULL(GROUP_CONCAT(IF(ConflictingReservations = 0, car_id, NULL) ORDER BY car_id ASC), '') AS CHAR) AS FreeCarsList
FROM (
SELECT car_id, model_id, SUM(IF(rental_id IS NULL, 0, rental_start_date <= '$ddate' AND rental_end_date >= '$adate')) AS ConflictingReservations
FROM test_tbl_cars
LEFT JOIN test_reservations USING (car_id)
GROUP BY car_id
) AS CarReservations
GROUP BY model_id
You'll notice all we're doing is grouping the original query by model_id, and then using aggregate functions to get us the model_id, a count of total cars we have of this model, a count of free cars of this model we have which we achieve by counting all the times a car has zero ConflictingReservations, and finally a cute little bit of SQL that returns a comma separated list of the car_ids of the free cars (in case that was also needed!)
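To make the two-level grouping concrete, here is a small sketch run against made-up data. It assumes Python's sqlite3 module in place of MySQL, and it adds a HAVING filter so that only models with at least one free car come back (that filter is my addition, aimed at the original question). The comma-separated list of free car_ids is left out because ordered GROUP_CONCAT is spelled differently across databases.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test_tbl_cars (car_id INTEGER PRIMARY KEY, model_id INTEGER);
    CREATE TABLE test_reservations (
        rental_id INTEGER PRIMARY KEY,
        car_id INTEGER,
        rental_start_date TEXT,
        rental_end_date TEXT
    );
    INSERT INTO test_tbl_cars VALUES (1, 10), (2, 10), (3, 20), (4, 30);
    INSERT INTO test_reservations VALUES
        (1, 1, '2024-05-01', '2024-05-05'),
        (2, 4, '2024-05-02', '2024-05-10'),
        (3, 3, '2024-06-01', '2024-06-03');
""")

sql = """
    SELECT model_id,
           COUNT(*) AS TotalCars,
           SUM(ConflictingReservations = 0) AS FreeCars
    FROM (
        -- Inner query: one row per car with its count of conflicts.
        SELECT car_id, model_id,
               SUM(CASE WHEN rental_id IS NULL THEN 0
                        ELSE rental_start_date <= :ddate
                             AND rental_end_date >= :adate
                   END) AS ConflictingReservations
        FROM test_tbl_cars
        LEFT JOIN test_reservations USING (car_id)
        GROUP BY car_id
    ) AS CarReservations
    GROUP BY model_id
    -- Keep only models that still have at least one free car.
    HAVING SUM(ConflictingReservations = 0) > 0
    ORDER BY model_id
"""

available_models = conn.execute(
    sql, {"adate": "2024-05-03", "ddate": "2024-05-06"}
).fetchall()
print(available_models)
```

Each returned row is (model_id, TotalCars, FreeCars), so the search page can show one entry per model and still say how many cars of that model are left.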
A quick word on performance: all the left joins, group bys, and subqueries could make this query very slow indeed. The good news is the outer group by should only have to process as many rows as you have cars, so it shouldn't be slow until you end up with a very large number of cars. The inner query, however, joins two tables (which can be done quite quickly with indexes) and then groups by the entire set, performing functions on each row. This could get quite slow, particularly as the number of reservations and cars increases. To alleviate this you could use where clauses on the inner query and combine that with appropriate indexes to reduce the number of items you are inspecting. There are also other tricks you can use to move the comparison of the start and end dates into the join condition, but that's a topic for another day :)
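The answer leaves the join-condition trick unexplained, so the following is only one possible reading of it, not the author's own query. The overlap test moves into the ON clause, so non-overlapping reservations never reach the GROUP BY, and COUNT(rental_id) counts what is left. It assumes sqlite3 as a stand-in for MySQL, invented rows, and a hypothetical index name.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; rows and index name are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test_tbl_cars (car_id INTEGER PRIMARY KEY, model_id INTEGER);
    CREATE TABLE test_reservations (
        rental_id INTEGER PRIMARY KEY,
        car_id INTEGER,
        rental_start_date TEXT,
        rental_end_date TEXT
    );
    -- An index that lets the join look up reservations per car and date.
    CREATE INDEX idx_res_car_dates
        ON test_reservations (car_id, rental_start_date, rental_end_date);
    INSERT INTO test_tbl_cars VALUES (1, 10), (2, 10), (3, 20);
    INSERT INTO test_reservations VALUES
        (1, 1, '2024-05-01', '2024-05-05'),
        (2, 1, '2024-05-04', '2024-05-08'),
        (3, 3, '2024-06-01', '2024-06-03');
""")

sql = """
    SELECT c.car_id, COUNT(r.rental_id) AS ConflictingReservations
    FROM test_tbl_cars AS c
    LEFT JOIN test_reservations AS r
           ON r.car_id = c.car_id
          AND r.rental_start_date <= :ddate
          AND r.rental_end_date >= :adate
    GROUP BY c.car_id
    ORDER BY c.car_id
"""

conflicts = conn.execute(
    sql, {"adate": "2024-05-03", "ddate": "2024-05-06"}
).fetchall()
print(conflicts)
```

COUNT ignores NULLs, so a car with no overlapping reservation still appears with a count of 0 thanks to the LEFT JOIN. Whether this is actually faster on your data is something to confirm with EXPLAIN.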
And finally, as always, if there's incorrect edge cases, mistakes, wrong syntax, whatever - let me know and I'll edit to correct!


Mysql fetch from last to first - [many records]

I want to fetch records from MySQL starting from the last to the first, with LIMIT 20. My database has over 1M records. I am aware of ORDER BY, but when I use it, loading 20 records takes forever and I have no idea why. I think MySQL fetches all the records before ordering.
SELECT bookings.created_at, bookings.total_amount,
passengers.name, passengers.id_number, payments.amount,
passengers.ticket_no,bookings.phone,bookings.source,
bookings.destination,bookings.date_of_travel FROM bookings
INNER JOIN passengers ON bookings.booking_id = passengers.booking_id
INNER JOIN payments on payments.booking_id = bookings.booking_id
ORDER BY bookings.booking_id DESC LIMIT 10
I suppose if you execute the query without the order by the time would be satisfactory?
You might try to create an index on the column you are ordering by:
create index idx_bookings_booking_id on bookings(booking_id)
You can try to find out complexity of the Query using
EXPLAIN SELECT bookings.created_at, bookings.total_amount,
passengers.name, passengers.id_number, payments.amount,
passengers.ticket_no,bookings.phone,bookings.source,
bookings.destination,bookings.date_of_travel FROM bookings
INNER JOIN passengers ON bookings.booking_id = passengers.booking_id
INNER JOIN payments on payments.booking_id = bookings.booking_id
ORDER BY bookings.booking_id DESC LIMIT 10
then check the proper index has been created on the table
SHOW INDEX FROM `db_name`.`table_name`;
If the index is not there, create the proper index on each table involved.
Please add to this if anything is missing.
The index lookup table needs to be able to reside in memory, if I'm not mistaken (filesort is much slower than in-mem lookup).
Use small index / column size
To double the capacity, use UNSIGNED columns if you need no negative values.
Tune sort_buffer_size and read_rnd_buffer_size (maybe better on connection level, not global)
See https://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html , particularly regarding using EXPLAIN and maybe trying another execution plan strategy.
You seem to need another workaround like materialized views.
Tell me if this sounds like it:
Create another table like the booking table e.g. CREATE TABLE booking_short LIKE booking. Though you only need the booking_id column
And check your code for where exactly you create booking orders, e.g. where you first insert into booking. SELECT COUNT(*) FROM booking_short. If it is >20, delete the first record. Insert the new booking_id.
You can select the ID and join from there before joining for more details with the rest of the tables.
You won't need limit or sorting.
Of course, this needs heavy documentation to avoid maintenance problems.
Either that or https://stackoverflow.com/a/5912827/6288442
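The idea of selecting the IDs first and joining afterwards can also be tried without a helper table. The sketch below picks the newest booking_id values in a small subquery and only then joins the detail tables. It assumes sqlite3 as a stand-in for MySQL, a cut-down set of columns, and generated sample rows. Note one behavioural difference from the original query: the LIMIT here applies to bookings, not to joined rows, so a booking with several passengers would return several rows.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; columns are reduced for brevity.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bookings (booking_id INTEGER PRIMARY KEY, total_amount REAL);
    CREATE TABLE passengers (booking_id INTEGER, name TEXT);
    CREATE TABLE payments (booking_id INTEGER, amount REAL);
""")
# Generate 50 sample bookings, each with one passenger and one payment.
for i in range(1, 51):
    conn.execute("INSERT INTO bookings VALUES (?, ?)", (i, i * 10.0))
    conn.execute("INSERT INTO passengers VALUES (?, ?)", (i, f"passenger {i}"))
    conn.execute("INSERT INTO payments VALUES (?, ?)", (i, i * 10.0))

sql = """
    SELECT b.booking_id, b.total_amount, p.name, pay.amount
    FROM (
        -- Walk the primary key backwards and stop after 10 ids.
        SELECT booking_id FROM bookings ORDER BY booking_id DESC LIMIT 10
    ) AS latest
    INNER JOIN bookings AS b ON b.booking_id = latest.booking_id
    INNER JOIN passengers AS p ON p.booking_id = b.booking_id
    INNER JOIN payments AS pay ON pay.booking_id = b.booking_id
    ORDER BY b.booking_id DESC
"""

rows = conn.execute(sql).fetchall()
latest_ids = [row[0] for row in rows]
print(latest_ids)
```

For this to help, passengers.booking_id and payments.booking_id need their own indexes, which is the same advice as above.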

MySQL SUM() giving incorrect total

I am developing a php/mysql database. I have two tables - matters and actions.
Amongst other fields the matter table contains 'matterid' 'fixedfee' and 'fee'. Fixed fee is Y or N and the fee can be any number.
For any matter there can be a number of actions. The actions table contains 'actionid' 'matterid' 'advicetime' 'advicefee'. The advicetime is how long the advice goes on for (in decimal format) and advicefee is a number. Thus, to work out the cost of the advice for a matter I use SUM(advicetime*advicefee).
What I wish to do is to add up all of the 'fee' values when 'fixedfee'=Y and also the sum of all of the SUM(advicetime*advicefee) values for all of these matters.
I have tried using:
SELECT
SUM(matters.fee) AS totfixed,
SUM(advicetime*advicefee) AS totbills,
FROM matters
INNER JOIN actions
ON matters.matterid = actions.matterid
WHERE fixedfee = 'Y'
but this doesn't work as (I think) it is adding up the matters.fee for every time there is an action. I have also tried making it
SUM(DISTINCT matters.fee) AS totfixed
but this doesn't work as I think it seems to be missing out any identical fees (and there are several matters which have the same fixed fee).
I am fairly new to this so any help would be very welcome.
but this doesn't work as (I think) it is adding up the matters.fee for every time there is an action. I have also tried making it ...
You're experiencing an aggregate fanout issue. This happens whenever the primary table in a select query has fewer rows than a secondary table to which it is joined. The join results in duplicate rows, so when aggregate functions are applied, they act on the extra rows.
Here the primary table refers to the one where aggregate functions are applied. In your example,
* SUM(matters.fee) >> aggregation on table matters.
* SUM(advicetime*advicefee) >> aggregation on table actions
* fixedfee='Y' >> where condition on table matters
To avoid the fanout issue:
* Always apply the aggregates to the most granular table in a join.
* Unless two tables have a one-to-one relationship, don't apply aggregate functions on fields from both tables.
* Obtain your aggregates separately through different subqueries and then combine the result. This can be done in a SQL statement, or you can export the data and then do it.
Query 1:
SELECT SUM(fee) AS totfixed
FROM matters
WHERE fixedfee='Y'
Query 2:
SELECT SUM(actions.advicetime*actions.advicefee) AS totbills
FROM matters
JOIN actions ON matters.matterid = actions.matterid
WHERE matters.fixedfee = 'Y'
Query 1 & Query 2 don't suffer from fanout. At this point you can export them both and deal with the result in php. Or you can combine them in SQL:
SELECT query_2.totbills, query_1.totfixed
FROM (SELECT SUM(fee) AS totfixed
FROM matters
WHERE fixedfee='Y') query_1,
(SELECT SUM(actions.advicetime*actions.advicefee) AS totbills
FROM matters
JOIN actions ON matters.matterid = actions.matterid
WHERE matters.fixedfee = 'Y') query_2
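To see the fanout in numbers, here is a small sketch that runs the joined query from the question and the combined query above against the same made-up rows. It assumes Python's sqlite3 module in place of MySQL; the sample matters and actions are invented for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; all rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE matters (matterid INTEGER PRIMARY KEY, fixedfee TEXT, fee INTEGER);
    CREATE TABLE actions (
        actionid INTEGER PRIMARY KEY,
        matterid INTEGER,
        advicetime REAL,
        advicefee INTEGER
    );
    INSERT INTO matters VALUES (1, 'Y', 100), (2, 'Y', 100), (3, 'N', 500);
    INSERT INTO actions VALUES
        (1, 1, 2.0, 50),   -- matter 1 has two actions
        (2, 1, 1.0, 50),
        (3, 2, 0.5, 40),   -- matter 2 has one action
        (4, 3, 1.0, 10);   -- matter 3 is not fixed fee
""")

# Joined query from the question: matter 1's fee is counted once per action.
naive = conn.execute("""
    SELECT SUM(matters.fee) AS totfixed,
           SUM(advicetime * advicefee) AS totbills
    FROM matters
    INNER JOIN actions ON matters.matterid = actions.matterid
    WHERE fixedfee = 'Y'
""").fetchone()

# Each aggregate is computed in its own subquery, then combined.
combined = conn.execute("""
    SELECT query_2.totbills, query_1.totfixed
    FROM (SELECT SUM(fee) AS totfixed
          FROM matters
          WHERE fixedfee = 'Y') query_1,
         (SELECT SUM(actions.advicetime * actions.advicefee) AS totbills
          FROM matters
          JOIN actions ON matters.matterid = actions.matterid
          WHERE matters.fixedfee = 'Y') query_2
""").fetchone()

print("joined totfixed:", naive[0])       # inflated by the duplicate row
print("separate totfixed:", combined[1])  # each matter counted once
```

With these rows the two fixed-fee matters are worth 200 in total, while the joined query reports 300 because matter 1 appears twice. totbills is the same in both, since it is aggregated on the most granular table.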
Finally, SUM(DISTINCT ...) is accepted by MySQL, but it is not the right tool here: it adds each distinct value only once, so several matters that share the same fixed fee are counted as one. That is why the following gives too small a total:
SUM(DISTINCT matters.fee) AS totfixed

Database Group By error

I've been working with Mysql for a while, but this is the first time I've encountered this problem.
The thing is that I have a select query...
SELECT
transactions.inventoryid,
inventoryName,
inventoryBarcode,
inventoryControlNumber,
users.nombre,
users.apellido,
transactionid,
transactionNumber,
originalQTY,
updateQTY,
finalQTY,
transactionDate,
transactionState,
transactions.observaciones
FROM
transactions
LEFT JOIN
inventory ON inventory.inventoryid = transactions.inventoryid
LEFT JOIN
users ON transactions.userid = users.userid
GROUP BY
transactions.transactionNumber
ORDER BY
transactions.inventoryid
But the GROUP BY is eliminating 2 values from the QUERY.
In this case, when I output:
foreach($inventory->inventory as $values){
$transactionid[] = $values['inventoryid'];
}
It returns:
2,3,5
If I eliminate the GROUP BY Statement it returns
2,3,4,5,6
Which is the output I need for this particular case.
The question is:
Is there a reason for this to happen?
If I'm grouping by a transaction and that was supposed to affect the query, wouldn't it then return only 1 value?
Maybe I'm over thinking this one or been working too long on the code that I don't see the obvious flaw in my logic. But if someone can lend me a hand I would appreciate it.
In standard SQL you can only SELECT columns which are
contained in the GROUP BY clause
(or) aggregate expressions, like MAX() or COUNT().
You need to consult the MySQL documentation (MySQL Handling of GROUP BY) to see how MySQL treats columns that are neither in the GROUP BY clause nor aggregated; that will tell you what happens here.
Do you need more information?
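A small sketch of what is probably going on: if several rows share a transactionNumber, GROUP BY collapses them into one row per number, and the non-grouped columns come from an arbitrary row of each group. The data below is hypothetical and merely chosen to match the symptoms in the question. SQLite (via Python's sqlite3) is used because, like MySQL with ONLY_FULL_GROUP_BY disabled, it accepts non-aggregated columns in a grouped query.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (
        transactionid INTEGER PRIMARY KEY,
        transactionNumber TEXT,
        inventoryid INTEGER
    );
    INSERT INTO transactions VALUES
        (1, 'T-1', 2),
        (2, 'T-2', 3),
        (3, 'T-2', 4),   -- shares a transactionNumber with the row above
        (4, 'T-3', 5),
        (5, 'T-3', 6);   -- shares a transactionNumber with the row above
""")

# Without GROUP BY every row is returned.
ungrouped = [row[0] for row in conn.execute(
    "SELECT inventoryid FROM transactions ORDER BY inventoryid")]

# With GROUP BY there is one row per transactionNumber; which inventoryid
# represents each group is not something to rely on.
grouped = conn.execute("""
    SELECT transactionNumber, inventoryid
    FROM transactions
    GROUP BY transactionNumber
""").fetchall()

# Counting rows per group shows which numbers hide more than one row.
per_number = conn.execute("""
    SELECT transactionNumber, COUNT(*)
    FROM transactions
    GROUP BY transactionNumber
    ORDER BY transactionNumber
""").fetchall()

print(ungrouped)
print(len(grouped), "groups")
print(per_number)
```

Running the COUNT(*) query against the real table would show whether two of the transaction numbers really do cover more than one inventory item.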

MySQL Query Optimization - Random Record

I'm having a terrible time with a MySQL query. I've spent most of my weekend and most of my day today attempting to make this query run a bit faster. I've made it considerably faster, but I know I can make it better.
SELECT m.id,other_fields,C.contacts_count FROM marketingDatabase AS m
LEFT OUTER JOIN
(SELECT COUNT(*) as contacts_count, rid
FROM contacts
WHERE status = 'Active' AND install_id = 'XXXX' GROUP BY rid) as C
ON C.rid = m.id
WHERE (RAND()*2612<50)
AND do_not_call != 'true'
AND `ACTUAL SALES VOLUME` >= '800000'
AND `ACTUAL SALES VOLUME` <= '1200000'
AND status = 'Pending'
AND install_id = 'XXXXX'
ORDER BY RAND()
I have an index on 'install_id', 'category' and 'status' but the EXPLAIN shows it was sorting based on 9100 rows.
My Explain is here:
https://s3.amazonaws.com/jas-so-question/Screen+Shot+2012-03-13+at+12.34.04+AM.png
Anybody have any suggestions on what I can do to make this a bit faster? The entire point of the query is to select a random record from an account's records (install_id) that matches certain criteria like sales volume, status and do_not_call. I'm currently gathering 25 records and caching it (using PHP) so I only have to run this query once every 25 requests, but I'm already dealing with thousands of requests per day. It currently takes 0.2 seconds to run. I realize that by using ORDER BY RAND() I'm already taking a major performance hit, but it's just sorting 25 rows.
Thanks in advance for the help.
EDIT: I forgot to mention that the 'contact_sort' index is on the 'contacts' table, and indexes install_id, status, and rid. (rid references the record ID in marketingDatabase, so it knows which record a contact belongs to.)
EDIT 2: The 2612 number in the query represents the number of rows in marketingDatabase that match the criteria (install_id, status, actual sales volume, etc.)
Since I do not see your index definitions, I am not sure they are correct. The query would benefit from the following indexes:
a composite index (install_id, status, rid) on the contacts
a composite index (install_id, status, `ACTUAL SALES VOLUME`) on marketingDatabase
I played around with a few queries, and I don't think you'll ever be able to get an indexed query to work with RAND(), especially when you're using it in both a WHERE clause and an ORDER BY clause. If at all possible, I'd introduce the random element in my PHP logic, and probably look at whether two simple queries made more sense than one fairly complex one. Added to that, you have a LEFT OUTER JOIN on a random result set, which may also be increasing the amount of work that has to be done a lot.
In summary, my guess would be - rewrite to exclude RAND, see if you can get rid of the LEFT OUTER JOIN. Two straightforward indexed queries with a bit of PHP in between may be a lot better.
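One way to read this suggestion is sketched below: a plain indexed query fetches the ids that match the criteria, the application picks a random sample, and a second query loads only the chosen rows. The sketch is in Python with sqlite3 rather than PHP with MySQL, and the helper name (random_records), the simplified column name sales_volume, and the generated rows are all assumptions made for the example.

```python
import random
import sqlite3

# In-memory SQLite stands in for MySQL; schema and rows are simplified.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE marketingDatabase (
        id INTEGER PRIMARY KEY,
        install_id TEXT,
        status TEXT,
        do_not_call TEXT,
        sales_volume INTEGER
    )
""")
conn.executemany(
    "INSERT INTO marketingDatabase VALUES (?, 'A', 'Pending', 'false', ?)",
    [(i, 700000 + i * 5000) for i in range(1, 201)],
)


def random_records(conn, install_id, low, high, k=25):
    """Pick up to k random matching rows without RAND() in the SQL."""
    # Query 1: only the ids, filtered by indexable conditions.
    ids = [row[0] for row in conn.execute(
        """SELECT id FROM marketingDatabase
           WHERE install_id = ? AND status = 'Pending'
             AND do_not_call != 'true'
             AND sales_volume BETWEEN ? AND ?""",
        (install_id, low, high))]
    # The random element lives in application code.
    chosen = random.sample(ids, min(k, len(ids)))
    if not chosen:
        return []
    # Query 2: load just the chosen rows by primary key.
    placeholders = ",".join("?" * len(chosen))
    return conn.execute(
        "SELECT id, sales_volume FROM marketingDatabase "
        f"WHERE id IN ({placeholders})",
        chosen,
    ).fetchall()


records = random_records(conn, "A", 800000, 1200000)
print(len(records))
```

The contact counts could then be fetched with a third query restricted to the chosen ids, which avoids the LEFT OUTER JOIN over the whole candidate set.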

Mysql Unique Query

I have a programme listing database with all the information needed for one programme packed into one table (I should have split programmes and episodes into their own tables). Since there are multiple episodes for any given show, I wish to display the main page with just the title names, in ascending order and filtered by a chosen letter. I know how to do the basic query, but this is all I know:
SELECT DISTINCT title FROM programme_table WHERE title LIKE '$letter%'
I know that works; I use it. But I am using dynamic image loading that requires a series number to return the full image, so how do I get the title to be distinct but also load the series number for that title?
I hope I have been clear.
Thanks for any help
Paul
You can replace the DISTINCT keyword with a GROUP BY clause.
SELECT
title
, series_number
FROM
programme_table
WHERE title LIKE '$letter%'
GROUP BY
title
, series_number
There are currently two other valid options:
The option suggested by Mohammad is to use a HAVING clause instead of the WHERE clause; this is actually less optimal:
The WHERE clause is used to restrict records, and is also used by the query optimizer to determine which indexes and tables to use. HAVING is a "filter" on the grouped result set, and is applied after GROUP BY (but before ORDER BY), so MySQL cannot use it to optimize the query.
So HAVING is a lot less optimal and you should only use it when you cannot use WHERE to get your results.
quosoo points out that the DISTINCT keyword applies to all listed columns in the query. This is true, but generally people do not recommend it. The MySQL optimizer produces the same query for both, so in most cases there is no performance difference, although in some specific cases there is one.
Update
Although MySQL does apply the same optimization to both queries, there is actually a difference: when DISTINCT is used in combination with a LIMIT clause, MySQL stops as soon as it finds enough unique rows. So
SELECT DISTINCT
title
, series_number
FROM
programme_table
WHERE
title LIKE '$letter%'
is actually the best option.
select title,series_number from programme_table group by title,series_number having title like '$letter%';
The DISTINCT keyword actually works on a list of columns, so if you just add the series to your query it should return a set of unique title, series combinations:
SELECT DISTINCT title, series FROM programme_table WHERE title LIKE '$letter%'
Hey, thanks for that, but I have about 1000 entries with the same series, so it would single out the series as well, which makes about 999 programmes useless, and they do not show.
I did, however, find a way to make the title unique and still show the series number:
SELECT * FROM four a INNER JOIN (SELECT title, MIN(series) AS MinPid FROM four WHERE title LIKE '$letter%' GROUP BY title) b ON a.title = b.title AND a.series = b.MinPid
Hopefully it helps anyone in the future and thank you for the replies :)
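If only the title and one series number per title are needed, a plain GROUP BY with MIN() may be enough. The sketch below is a minimal illustration under that assumption, using Python's sqlite3 module as a stand-in for MySQL, invented rows, and the table name programme_table from the question rather than the table called four in the query above. The letter is bound as a parameter instead of being interpolated into the SQL.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE programme_table (title TEXT, series INTEGER, episode INTEGER);
    INSERT INTO programme_table VALUES
        ('Alpha', 1, 1),
        ('Alpha', 1, 2),
        ('Alpha', 2, 1),
        ('Apple', 3, 1),
        ('Beta',  1, 1);
""")

letter = "A"
# One row per title, carrying the lowest series number for that title.
titles = conn.execute(
    """SELECT title, MIN(series) AS series
       FROM programme_table
       WHERE title LIKE ?
       GROUP BY title
       ORDER BY title""",
    (letter + "%",),
).fetchall()
print(titles)
```

The self-join version above is still the one to use when the other columns of the matching row are needed, but it returns one row per episode of the lowest series, so it can yield several rows for the same title.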
