SQL Joining 2 similar tables - php

I have two tables which are used to store details of different types of events.
The tables are almost identical, with fields such as date, duration etc. I'm trying to perform a join on the two tables and sort them by the mutual field 'date'.
I know it would be simpler to add a 'type' field to a single table and store everything in one place, but unfortunately the nature of the CMS I am using does not allow this.
Is there a way to perform this simply? The following query returns no results.
$sql = "SELECT * FROM Events_One a, Events_Two b WHERE a.Date > now() OR b.Date > now() ORDER BY Date ASC LIMIT ".$limit;

You should look at the UNION statement. It does exactly what you need.
SELECT columns FROM t1
UNION
SELECT columns FROM t2
gives you one set, which you can later filter or sort by whatever you want.
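A minimal sketch of the idea applied to the question's tables. It runs against an in-memory SQLite database from Python rather than MySQL, so date('now') stands in for now(), and the Title and Duration columns plus the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; table names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Events_One (Title TEXT, Date TEXT, Duration INTEGER);
    CREATE TABLE Events_Two (Title TEXT, Date TEXT, Duration INTEGER);
    INSERT INTO Events_One VALUES ('Concert', '2999-03-01', 120), ('Old gig', '2000-01-01', 90);
    INSERT INTO Events_Two VALUES ('Lecture', '2999-01-15', 60), ('Workshop', '2999-06-30', 240);
""")


def upcoming_events(conn, limit):
    """Return future events from both tables as one list sorted by date."""
    sql = """
        SELECT Title, Date, Duration FROM Events_One WHERE Date > date('now')
        UNION ALL
        SELECT Title, Date, Duration FROM Events_Two WHERE Date > date('now')
        ORDER BY Date ASC
        LIMIT ?
    """
    # The limit is bound as a parameter instead of being concatenated in.
    return conn.execute(sql, (limit,)).fetchall()


rows = upcoming_events(conn, 2)
```

The ORDER BY and LIMIT at the end apply to the combined result, not just to the second SELECT.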

If you have a table for each event type, with identical fields for each one, I'd say that you should reconsider your design. That's breaking normalization rules. It's a poor design because it forces you to add another table every time a new event comes along. It's better if you can add a new event by adding data, like a new type value, to an existing table.

If you want to combine two similar tables, you can use UNION or UNION ALL.
UNION: it stacks the rows of two SELECTs into one result (unlike a join, which combines columns). Both SELECTs must return the same number of columns with compatible data types. Only distinct rows are kept.
UNION ALL: almost the same as UNION, but there is no distinct operation, so it keeps all the rows.
select * FROM Events_One a WHERE a.Date > now()
union
select * FROM Events_Two b WHERE b.Date > now()
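To see the difference between the two, here is a small sketch run against in-memory SQLite from Python (not MySQL). The Title column and the sample rows are invented; the same event is deliberately stored in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Events_One (Title TEXT, Date TEXT);
    CREATE TABLE Events_Two (Title TEXT, Date TEXT);
    INSERT INTO Events_One VALUES ('Open day', '2999-05-01');
    INSERT INTO Events_Two VALUES ('Open day', '2999-05-01'), ('Fair', '2999-07-01');
""")

query = (
    "SELECT Title, Date FROM Events_One "
    "{op} "
    "SELECT Title, Date FROM Events_Two "
    "ORDER BY Date, Title"
)

# UNION removes the row that appears in both tables.
distinct_rows = conn.execute(query.format(op="UNION")).fetchall()
# UNION ALL keeps every row, including the duplicate.
all_rows = conn.execute(query.format(op="UNION ALL")).fetchall()
```

If the two tables can never hold identical rows, UNION ALL gives the same result without the cost of the duplicate check.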

Related

query group by with order by

I need to create a query with GROUP BY and ORDER BY, and I don't know how to do it.
The query should return one record per existing device_serial_number: the one with the newest date.
So I would like to get ids 591 and 592.
The solution can be in SQL, or ideally in Symfony, through the query builder etc.
There are many ways to accomplish what you want.
First Way
The oldest way to select first, best, worst, whatever within a group is with a correlated subquery:
Select * from mytable o
Where created_at = (
Select max(created_at)
from mytable i
Where i.device_serial_number = o.device_serial_number
)
Second Way
Use a subselect to find the latest date for each device, then join back to the original table to filter:
Select a.*
From mytable a Inner Join
(Select device_serial_number, max(created_at) as latedate
From mytable b
Group By device_serial_number
) b
On a.device_serial_number=b.device_serial_number
And a.created_at=b.latedate
Third way
Use a window function to rank order all the dates and then pick the number one ranking.
Select * From (
Select *
, rank() Over (Partition By device_serial_number Order by created_at desc) as myrank
From mytable
) ranked
Where myrank=1
Notice that while these 3 solutions use different aspects of SQL, they all have a common analytical approach. They are all two step processes whose first (inner) part involves finding the most recent created_at date for each device_serial_number and then reapplying that result back to the original table in the second (outer) part.
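As a rough check that the approaches agree, the sketch below runs them against in-memory SQLite from Python. The table layout and the rows are assumptions (only the ids 591 and 592 come from the question), and the window-function variant is skipped on SQLite builds older than 3.25.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, device_serial_number TEXT, created_at TEXT);
    INSERT INTO mytable VALUES
        (589, 'A1', '2020-01-01 10:00:00'),
        (590, 'B2', '2020-01-02 09:00:00'),
        (591, 'A1', '2020-01-03 08:00:00'),
        (592, 'B2', '2020-01-04 07:00:00');
""")

# First way: correlated subquery.
correlated = """
    SELECT id FROM mytable AS o
    WHERE created_at = (SELECT MAX(created_at) FROM mytable AS i
                        WHERE i.device_serial_number = o.device_serial_number)
    ORDER BY id
"""
# Second way: aggregate per device, then join back.
join_back = """
    SELECT a.id FROM mytable AS a
    INNER JOIN (SELECT device_serial_number, MAX(created_at) AS latedate
                FROM mytable GROUP BY device_serial_number) AS b
      ON a.device_serial_number = b.device_serial_number AND a.created_at = b.latedate
    ORDER BY a.id
"""
# Third way: rank rows per device and keep rank 1.
windowed = """
    SELECT id FROM (
        SELECT id, RANK() OVER (PARTITION BY device_serial_number
                                ORDER BY created_at DESC) AS myrank
        FROM mytable
    ) AS ranked
    WHERE myrank = 1
    ORDER BY id
"""

results = {
    name: [row[0] for row in conn.execute(sql)]
    for name, sql in [("correlated", correlated), ("join_back", join_back)]
}
# Window functions need SQLite 3.25 or newer, so guard that variant.
if sqlite3.sqlite_version_info >= (3, 25, 0):
    results["windowed"] = [row[0] for row in conn.execute(windowed)]
```

All three return more than one row per device if two rows share the same latest created_at, so a tie-breaker (for example the id) may be needed on real data.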

Displaying a large amount of data in paging table without heavily impacting DB

The current implementation is a single complex query with multiple joins and temporary tables, but it is putting too much stress on my MySQL server and takes 30+ seconds to load the table. The data is retrieved by PHP via a JavaScript Ajax call and displayed on a webpage. Here are the tables involved:
Table: table_companies
Columns: company_id, ...
Table: table_manufacture_line
Columns: line_id, line_name, ...
Table: table_product_stereo
Columns: product_id, line_id, company_id, assembly_datetime, serial_number, ...
Table: table_product_television
Columns: product_id, line_id, company_id, assembly_datetime, serial_number, warranty_expiry, ...
A single company can have 100k+ items split between the two product tables. The product tables are unioned and filtered by the line_name, then ordered by assembly_datetime and limited depending on the paging. The datetime value is also reliant on timezone and this is applied as part of the query (another JOIN + temp table). line_name is also one of the returned columns.
I was thinking of splitting the line_name filter out from the product union query. Essentially I'd determine the ids of the lines that correspond to the filter, then do a UNION query with a WHERE condition WHERE line_id IN (<results from previous query>). This would cut out the need for joins and temp tables, and I can apply the line_name to line_id and timezone modification in PHP, but I'm not sure this is the best way to go about things.
I have also looked at potentially using Redis, but the large number of individual products is leading to a similarly long wait time when pushing all of the data to Redis via PHP (20-30 seconds), even if it is just pulled in directly from the product tables.
Is it possible to tweak the existing queries to increase the efficiency?
Can I push some of the handling to PHP to decrease the load on the SQL server? What about Redis?
Is there a way to architect the tables better?
What other solution(s) would you suggest?
I appreciate any input you can provide.
Edit:
Existing query:
SELECT line_name,CONVERT_TZ(datetime,'UTC',timezone) datetime,... FROM (SELECT line_name,datetime,... FROM ((SELECT line_id,assembly_datetime datetime,... FROM table_product_stereos WHERE company_id=# ) UNION (SELECT line_id,assembly_datetime datetime,... FROM table_product_televisions WHERE company_id=# )) AS union_products INNER JOIN table_manufacture_line USING (line_id)) AS products INNER JOIN (SELECT timezone FROM table_companies WHERE company_id=# ) AS tz ORDER BY datetime DESC LIMIT 0,100
Here it is formatted for some readability.
SELECT line_name,CONVERT_TZ(datetime,'UTC',tz.timezone) datetime,...
FROM (SELECT line_name,datetime,...
FROM (SELECT line_id,assembly_datetime datetime,...
FROM table_product_stereos WHERE company_id=#
UNION
SELECT line_id,assembly_datetime datetime,...
FROM table_product_televisions
WHERE company_id=#
) AS union_products
INNER JOIN table_manufacture_line USING (line_id)
) AS products
INNER JOIN (SELECT timezone
FROM table_companies
WHERE company_id=#
) AS tz
ORDER BY datetime DESC LIMIT 0,100
IDs are indexed; Primary keys are the first key for each column.
Let's build this query up from its component parts to see what we can optimize.
Observation: you're fetching the 100 most recent rows from the union of two large product tables.
So, let's start by trying to optimize the subqueries fetching stuff from the product tables. Here is one of them.
SELECT line_id,assembly_datetime datetime,...
FROM table_product_stereos
WHERE company_id=#
But look, you only need the 100 newest entries here. So, let's add
ORDER BY assembly_datetime DESC
LIMIT 100
to this query. Also, you should put a compound index on this table as follows. This will allow both the WHERE and ORDER BY lookups to be satisfied by the index.
CREATE INDEX id_date ON table_product_stereos (company_id, assembly_datetime)
All the same considerations apply to the query from table_product_televisions. Order it by the time, limit it to 100, and index it.
If you need to apply other selection criteria, you can put them in these inner queries. For example, in a comment you mentioned a selection based on a substring search. You could do this as follows
SELECT t.line_id,t.assembly_datetime datetime,...
FROM table_product_stereos AS t
JOIN table_manufacture_line AS m ON m.line_id = t.line_id
AND m.line_name LIKE '%test'
WHERE company_id=#
ORDER BY assembly_datetime DESC
LIMIT 100
Next, you are using UNION to combine those two query result sets into one. UNION has the function of eliminating duplicates, which is time-consuming. (You know you don't have duplicates, but MySQL doesn't.) Use UNION ALL instead.
Putting this all together, the innermost sub query becomes this. We have to wrap up the subqueries because SQL is confused by UNION and ORDER BY clauses at the same query level.
SELECT * FROM (
SELECT line_id,assembly_datetime datetime,...
FROM table_product_stereos
WHERE company_id=#
ORDER BY assembly_datetime DESC
LIMIT 100
) AS st
UNION ALL
SELECT * FROM (
SELECT line_id,assembly_datetime datetime,...
FROM table_product_televisions
WHERE company_id=#
ORDER BY assembly_datetime DESC
LIMIT 100
) AS tv
That gets you 200 rows. It should get those rows fairly quickly.
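200 rows are guaranteed to be enough to give you the 100 most recent items after you do your outer ORDER BY ... LIMIT operation, because every row in the overall top 100 is also in the top 100 of its own table. That operation only has to crunch 200 rows, not 100K+, so it will be far faster. For later pages, each inner query needs a LIMIT of offset + 100 rows, with the offset itself applied only in the outer query.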
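Here is a small sketch that checks this claim on synthetic data. It uses in-memory SQLite from Python instead of MySQL, only the columns needed for the comparison, and made-up timestamps; the second index name id_date_tv is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_product_stereos (
        product_id INTEGER PRIMARY KEY, company_id INTEGER, assembly_datetime TEXT);
    CREATE TABLE table_product_televisions (
        product_id INTEGER PRIMARY KEY, company_id INTEGER, assembly_datetime TEXT);
    CREATE INDEX id_date ON table_product_stereos (company_id, assembly_datetime);
    CREATE INDEX id_date_tv ON table_product_televisions (company_id, assembly_datetime);
""")

# Synthetic rows: 500 per table for company 1, all with distinct timestamps.
stereos = [(1, f"2020-01-01 00:{i // 60:02d}:{i % 60:02d}") for i in range(0, 1000, 2)]
tvs = [(1, f"2020-01-01 00:{i // 60:02d}:{i % 60:02d}") for i in range(1, 1000, 2)]
conn.executemany(
    "INSERT INTO table_product_stereos (company_id, assembly_datetime) VALUES (?, ?)", stereos)
conn.executemany(
    "INSERT INTO table_product_televisions (company_id, assembly_datetime) VALUES (?, ?)", tvs)

PAGE = 100

# Sort the whole union, then cut it down to one page.
full_sort = """
    SELECT assembly_datetime FROM (
        SELECT assembly_datetime FROM table_product_stereos WHERE company_id = :c
        UNION ALL
        SELECT assembly_datetime FROM table_product_televisions WHERE company_id = :c
    ) ORDER BY assembly_datetime DESC LIMIT :n
"""
# Take the newest rows from each table first, then sort only those.
limited_first = """
    SELECT assembly_datetime FROM (
        SELECT * FROM (SELECT assembly_datetime FROM table_product_stereos
                       WHERE company_id = :c ORDER BY assembly_datetime DESC LIMIT :n)
        UNION ALL
        SELECT * FROM (SELECT assembly_datetime FROM table_product_televisions
                       WHERE company_id = :c ORDER BY assembly_datetime DESC LIMIT :n)
    ) ORDER BY assembly_datetime DESC LIMIT :n
"""
params = {"c": 1, "n": PAGE}
slow = conn.execute(full_sort, params).fetchall()
fast = conn.execute(limited_first, params).fetchall()
```

Both queries return the same page; the second one only ever sorts 2 x PAGE rows in the outer step.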
Finally wrap up this query in your outer query material. Join the table_manufacture_line information, and fix up the timezone.
If you do the indexing and the ORDER BY ... LIMIT operation earlier, this query should become very fast.
The comment dialog in your question indicates to me that you may have multiple product types, not just two, and that you have complex selection criteria for your paged display. Using UNION ALL on large numbers of rows slams performance: it converts multiple indexed tables into an internal list of rows that simply can't be searched efficiently.
You really should consider putting your two kinds of product data in a single table instead of having to UNION ALL multiple product tables. The setup you have now is inflexible and won't scale up easily. If you structure your schema with a master product table and perhaps some attribute tables for product-specific information, you will find yourself much happier two years from now. Seriously. Please consider making the change.
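One possible shape for such a schema, sketched with in-memory SQLite from Python. The names table_product, product_type and table_television_attributes are hypothetical, as are the sample rows; only the column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical merged schema: one row per product, whatever its type.
    CREATE TABLE table_product (
        product_id INTEGER PRIMARY KEY,
        product_type TEXT NOT NULL,          -- 'stereo', 'television', ...
        line_id INTEGER NOT NULL,
        company_id INTEGER NOT NULL,
        assembly_datetime TEXT NOT NULL,
        serial_number TEXT NOT NULL
    );
    -- Type-specific columns live in a side table keyed by product_id.
    CREATE TABLE table_television_attributes (
        product_id INTEGER PRIMARY KEY REFERENCES table_product (product_id),
        warranty_expiry TEXT
    );
    CREATE INDEX product_company_date ON table_product (company_id, assembly_datetime);

    INSERT INTO table_product VALUES
        (1, 'stereo', 10, 1, '2020-01-01 08:00:00', 'S-001'),
        (2, 'television', 11, 1, '2020-01-02 08:00:00', 'T-001'),
        (3, 'television', 11, 2, '2020-01-03 08:00:00', 'T-002');
    INSERT INTO table_television_attributes VALUES (2, '2023-01-02'), (3, '2023-01-03');
""")

# One indexed table to page through; no UNION needed.
page = conn.execute("""
    SELECT p.product_type, p.serial_number, tv.warranty_expiry
    FROM table_product AS p
    LEFT JOIN table_television_attributes AS tv ON tv.product_id = p.product_id
    WHERE p.company_id = ?
    ORDER BY p.assembly_datetime DESC
    LIMIT 100
""", (1,)).fetchall()
```

Adding a new product type then means inserting rows with a new product_type value, plus an attribute table only if that type has extra columns.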
Remember: Index fast, data slow. Use joins over nested queries. Nested queries return all of the data fields whereas joins just consider the filters (which should all be indexed - make sure there's an index on table_product_*.line_id; it cannot be a unique one, since many products share a line). It's been a while but I'm pretty sure you can join "ON company_id=#" which should cut down the results early on.
In this case, all of the results refer to the same company (or a much smaller subset) so it makes sense to run that query separately (and it makes the query more maintainable).
So your data source would be:
(SELECT ml.line_name, prod.assembly_datetime, ...
FROM table_product_stereos AS prod
INNER JOIN table_manufacture_line AS ml ON prod.line_id = ml.line_id AND prod.company_id=#)
UNION
(SELECT ml.line_name, prod.assembly_datetime, ...
FROM table_product_televisions AS prod
INNER JOIN table_manufacture_line AS ml ON prod.line_id = ml.line_id AND prod.company_id=#)
In each branch you can select whichever prod.* or ml.* fields you need.
PHP is not a solution at all...
Redis can be a solution.
But the main thing I would change is the index creation for the tables (add the missing indexes)... If you're running into temp tables, you didn't create the indexes well for the tables. And 100k rows is not much at all.
But I can't help you without the table creation statements as well as the queries you run.
Make sure your "where part" is part of your btree index from left to right.

mySQL logic: output of table join not correct

I am trying to query two tables: finished_events and flagged_events. First of all, I need everything related to the company_id, so:
SELECT *
FROM finished_events
WHERE company_id=$id
ORDER by schedule, timestamp
I then changed this to:
SELECT * FROM finished_events
INNER JOIN flagged_events
ON finished_events.company_id=flagged_events.company_id
WHERE finished_events.company_id=$id
ORDER by finished_events.schedule, finished_events.timestamp
I have tried using FULL JOIN, LEFT JOIN, and RIGHT JOIN, all unsuccessfully. Specifically, what I want to get is the combined result of the following code:
$sql = "SELECT *
FROM finished_events
WHERE company_id=$id
ORDER by schedule, time_stamp";
$flagged_sql = "SELECT *
FROM flagged_events
WHERE company_id=$id
ORDER by schedule, time_stamp";
The tables are a bit different, so UNION won't work here. I can post dummy database entries, but this won't be of much help as I need everything from both tables. The two links between the tables are the company_id and schedule columns. Essentially, what is going on behind the scenes is that timestamps are put into a different table, which I then process into either finished_events or flagged_events. A flagged event needs the user to do something about it until it becomes a finished event. This script generates the data for the GUI, which is why I need to query both tables and create an associative array of customer details, then an array of events (from these two tables). Creating the associative array is no problem; I just need this query to return all the events and order them correctly. Let me know if you need anything specific to solve this one, thanks :)
EDIT
SQL Fiddle: http://sqlfiddle.com/#!2/d4c30/1
This almost fixes it but is not quite right: it repeats entries at the bottom.
If I understood correctly, this may be useful for you:
SELECT a.* FROM (
SELECT *, 'finished' as event_type FROM finished_events
UNION
SELECT *, 'flagged' as event_type FROM flagged_events) a
ORDER BY a.schedule, a.time_stamp
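Since the question says the two tables are not identical, SELECT * in both branches may fail on a column mismatch. A hedged variant is to list the columns explicitly and pad the missing ones with NULL. The sketch below uses in-memory SQLite from Python; the duration and reason columns and all rows are invented, and it also adds the company filter from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical columns: each table has one field the other lacks.
    CREATE TABLE finished_events (company_id INTEGER, schedule TEXT, time_stamp TEXT, duration INTEGER);
    CREATE TABLE flagged_events (company_id INTEGER, schedule TEXT, time_stamp TEXT, reason TEXT);
    INSERT INTO finished_events VALUES
        (7, 'mon', '2014-01-06 09:00', 30), (8, 'mon', '2014-01-06 09:30', 45);
    INSERT INTO flagged_events VALUES
        (7, 'mon', '2014-01-06 08:00', 'late'), (7, 'tue', '2014-01-07 08:00', 'missing');
""")


def events_for_company(conn, company_id):
    """Return one company's finished and flagged events as a single ordered list."""
    sql = """
        SELECT 'finished' AS event_type, schedule, time_stamp, duration, NULL AS reason
        FROM finished_events WHERE company_id = :id
        UNION ALL
        SELECT 'flagged' AS event_type, schedule, time_stamp, NULL AS duration, reason
        FROM flagged_events WHERE company_id = :id
        ORDER BY schedule, time_stamp
    """
    return conn.execute(sql, {"id": company_id}).fetchall()


events = events_for_company(conn, 7)
```

The event_type column lets the GUI code tell the two kinds of rows apart after they have been merged.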

How can I get this database to order before the GROUP BY [duplicate]

This question already has answers here:
MySQL Order before Group by
(10 answers)
Closed 9 years ago.
I made a website for golf scorecards. The page I am working on is the player's profile. When you access a player's profile, it shows each course in order of last played (DESC). Except the order of last played is jumbled by the query below: when it GROUPs, it takes the earliest date rather than the most recent.
After the grouping is done, it correctly shows them in order (DESC)... just the wrong order, because each course is grouped on its earliest date_of_game (ASC) rather than its latest (DESC). Hope this isn't too confusing. Thank you.
$query_patrol321 = "SELECT t1.*,t2.* FROM games t1 LEFT JOIN scorecards t2 ON t1.game_id=t2.game_id WHERE t2.player_id='$player_id' GROUP BY t1.course_id ORDER BY t1.date_of_game DESC";
$result_patrol321 = mysql_query($query_patrol321) or die ("<br /><br />There's an error in the MySQL-query: ".mysql_error());
while ($row_patrol321 = mysql_fetch_array($result_patrol321)) {
$player_id_rank = $row_patrol321["player_id"];
$course_id = $row_patrol321["course_id"];
$game_id = $row_patrol321["game_id"];
$top_score = $row_patrol321["total_score"];
Try to remove the GROUP BY-clause from the query. You should use GROUP BY only when you have both normal columns and aggregate functions (min, max, sum, avg, count) in your SELECT. You have just normal columns.
The fact that it shows the grouping result in ASC order is a coincidence because that is the order of their insertion. In contrast to other RDBMS like MS SQL Server, MySQL allows you to add non-aggregated columns to a GROUPed query. This non-standard behavior creates the confusion you're seeing. If this were not MySQL, you'd need to define the aggregation for all your selected columns given the grouping.
MySQL's behavior is (I believe) to take the first row matching the GROUP for non-aggregated columns. I would advise against doing this.
Even though you're aggregating, you're not ORDERing by the aggregated column.
So what you want to do is ORDER BY the MAX date DESC.
In this way, you are ordering by the latest date per course (your grouping criteria).
SELECT
t1.* -- It would be better if you actually listed the aggregations you wanted
,t2.* -- Which columns do you really want?
FROM
games t1
LEFT JOIN
scorecards t2
ON t2.game_id = t1.game_id
WHERE
t2.player_id='$player_id'
GROUP BY
t1.course_id
ORDER BY
MAX(t1.date_of_game) DESC
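A sketch of the same ordering with every selected column either grouped or aggregated, run against in-memory SQLite from Python. The sample rows are invented, MIN(total_score) is just one example of an aggregate you might want, and the player id is bound as a parameter rather than interpolated into the string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE games (game_id INTEGER PRIMARY KEY, course_id INTEGER, date_of_game TEXT);
    CREATE TABLE scorecards (game_id INTEGER, player_id INTEGER, total_score INTEGER);
    INSERT INTO games VALUES (1, 100, '2013-05-01'), (2, 200, '2013-06-01'), (3, 100, '2013-07-01');
    INSERT INTO scorecards VALUES (1, 42, 90), (2, 42, 85), (3, 42, 88);
""")


def courses_by_last_played(conn, player_id):
    """Return (course_id, last_played, best_score) per course, most recent first."""
    sql = """
        SELECT t1.course_id,
               MAX(t1.date_of_game) AS last_played,
               MIN(t2.total_score) AS best_score
        FROM games AS t1
        JOIN scorecards AS t2 ON t2.game_id = t1.game_id
        WHERE t2.player_id = ?
        GROUP BY t1.course_id
        ORDER BY last_played DESC
    """
    return conn.execute(sql, (player_id,)).fetchall()


courses = courses_by_last_played(conn, 42)
```

Course 100 was played twice, and it sorts by its later date, which is what the profile page needs.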
If you want the maximum date, then insert logic to get it. Don't depend on the ordering of columns or on undocumented MySQL features. MySQL explicitly discourages the use of non-aggregated columns in the group by when the values are not identical:
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. This means that the preceding query is legal in MySQL. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. (From the MySQL documentation on GROUP BY extensions.)
How do you do what you want? The following query finds the most recent date on each course and just uses that -- and no group by:
SELECT t1.*, t2.*
FROM games t1 LEFT JOIN
scorecards t2
ON t1.game_id=t2.game_id
WHERE t2.player_id='$player_id' and
t1.date_of_game in (select MAX(date_of_game)
from games g join
scorecards ss
on g.game_id = ss.game_id and
ss.player_id = '$player_id'
where t1.course_id = g.course_id
)
GROUP BY t1.course_id
ORDER BY t1.date_of_game DESC
If game_id is auto incrementing, you can use that instead of date_of_game. This is particularly important if two games can be on the same course on the same date.

combining data from two tables to count contacts, trying to use union

I have two tables. One tracks Part Shipments and the other tracks System shipments.
I am trying to count the customer contacts in each table with the result showing me the total customer contacts for both parts and systems combined.
I am trying to use UNION, and I would guess from my results that I am doing this all wrong. My results end up with two entries per customer: Cust A will have a total of 9 and then another entry of 1. So I am guessing the customer contacts are not merged and I just get a union of both results.
The Code I am using.
SELECT Count(part_shipment.Customer_Station_ID) AS Contact,
part_shipment.Customer_Station_ID AS Customer
FROM part_shipment
GROUP BY part_shipment.Customer_Station_ID
UNION
SELECT Count(system_shipments.Customer_Station_ID) AS Contact,
system_shipments.Customer_Station_ID AS Customer
FROM system_shipments
GROUP BY system_shipments.Customer_Station_ID
ORDER BY Contact DESC
You can't do it like that. UNION just takes the rows from the first query and the rows from the second query and "displays" them one after another.
To work with the result of a UNION you need a derived table (a table created from a query).
SELECT *
FROM (
SELECT col1, col2
FROM someTable
UNION
SELECT col1, col2
FROM otherTable
) AS combined
GROUP BY is allowed inside the selects that make up the UNION, but each one then groups only its own table, so a customer present in both tables still gets two rows.
Have you tried using GROUP BY and SUM on the results of the UNION query?
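That suggestion could look roughly like this. The sketch runs on in-memory SQLite from Python with invented rows; the alias Total_Contacts and the derived-table name combined are assumptions, and UNION ALL is used so that two identical (customer, count) rows are not collapsed into one before summing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part_shipment (Customer_Station_ID TEXT);
    CREATE TABLE system_shipments (Customer_Station_ID TEXT);
    INSERT INTO part_shipment VALUES ('Cust A'), ('Cust A'), ('Cust A'), ('Cust B');
    INSERT INTO system_shipments VALUES ('Cust A'), ('Cust B'), ('Cust B');
""")

# Count per table inside the derived table, then add the counts up per customer.
totals = conn.execute("""
    SELECT Customer, SUM(Contact) AS Total_Contacts
    FROM (
        SELECT Customer_Station_ID AS Customer, COUNT(*) AS Contact
        FROM part_shipment GROUP BY Customer_Station_ID
        UNION ALL
        SELECT Customer_Station_ID AS Customer, COUNT(*) AS Contact
        FROM system_shipments GROUP BY Customer_Station_ID
    ) AS combined
    GROUP BY Customer
    ORDER BY Total_Contacts DESC, Customer
""").fetchall()
```

Each customer now appears once, with the part and system contacts added together.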
