First of all, I have no control over the database structure, etc.
I need to use PHP to retrieve records by state and name, but all the data is spread across multiple tables. Basically, I need to understand how to combine these two queries so they don't run so slowly.
First I get my record IDs in PHP. (Let's assume $query returns an array of IDs.)
table_products = A bunch of products
$query = "SELECT id,name FROM table_products WHERE name = '".$name."';";
Then I need to iterate through these records (NOTE: there can be A LOT) and figure out where these IDs reside in another two tables that hold the location information of where they could be.
table_places = a table with a bunch of locations
link_table = Contains the relationships between product and location
$state = "somestate";
foreach($query as $row)
{
$query_two = "SELECT table_places.name, table_places.id, table_places.state, link_table.place_id, link_table.product_id
FROM table_places
INNER JOIN link_table
ON table_places.id = link_table.place_id
WHERE table_places.state = '".$state."' AND link_table.product_id = '".$row->id."';";
}
God I hope this made sense. I am no query guru so if I could get assistance in optimizing this to run faster, I would be grateful.
The pain is here:
foreach($query as $row) <<--- you are pinging the DB to death.
Combine the two queries:
SELECT
pl.name, pl.id, pl.state,
l.place_id, l.product_id,
pr.name
FROM table_places pl
INNER JOIN link_table l ON (pl.id = l.place_id)
INNER JOIN table_products pr ON (l.product_id = pr.id)
WHERE pr.name = '$name'
AND pl.state = '$state'
ORDER BY pr.name, pl.state
Make sure you put indexes on all fields used in the ON clauses and the where clauses.
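If you run this from PHP, a minimal sketch with a mysqli prepared statement would be (assuming your connection is a mysqli object in $conn; adapt if you use PDO):
// One round trip to the database instead of one query per product row.
$sql = "SELECT pl.name AS place_name, pl.id AS place_id, pl.state,
               l.product_id, pr.name AS product_name
        FROM table_places pl
        INNER JOIN link_table l ON pl.id = l.place_id
        INNER JOIN table_products pr ON l.product_id = pr.id
        WHERE pr.name = ? AND pl.state = ?
        ORDER BY pr.name, pl.state";

$stmt = $conn->prepare($sql);
$stmt->bind_param('ss', $name, $state);   // binding also protects against SQL injection
$stmt->execute();
$result = $stmt->get_result();            // requires the mysqlnd driver; otherwise use bind_result()

while ($row = $result->fetch_assoc()) {
    // $row['product_name'], $row['place_name'], $row['state'], ...
}
$stmt->close();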
You can join more than two tables in one query. An untested query would be something like:
SELECT * FROM table_products AS prod LEFT JOIN (link_table AS link, table_places AS places)
ON (prod.id = link.product_id AND places.id = link.place_id)
WHERE some_field = 'some_value'
This will definitely give you a performance boost, as one query is usually a lot faster than one query per record (which is what you have now: you loop through the records and hit the DB once per record).
The answer is simple: no control over the database structure means no way to optimize, since indexing is the cornerstone of query optimization and you obviously need access to the table structure to add an index.
I'm currently running this query. However, when run outside phpMyAdmin it causes a 504 timeout error. I suspect it has to do with how efficiently the query returns or accesses its rows.
I'm not extremely experienced with MySQL and so this was the best I could do:
SELECT
s.surveyId,
q.cat,
SUM((sac.answer_id*q.weight))/SUM(q.weight) AS score,
user.division_id,
user.unit_id,
user.department_id,
user.team_id,
division.division_name,
unit.unit_name,
dpt.department_name,
team.team_name
FROM survey_answers_cache sac
JOIN surveys s ON s.surveyId = sac.surveyid
JOIN subcluster sc ON s.subcluster_id = sc.subcluster_id
JOIN cluster c ON sc.cluster_id = c.cluster_id
JOIN user ON user.user_id = sac.user_id
JOIN questions q ON q.question_id = sac.question_id
JOIN division ON division.division_id = user.division_id
LEFT JOIN unit ON unit.unit_id = user.unit_id
LEFT JOIN department dpt ON dpt.department_id = user.department_id
LEFT JOIN team ON team.team_id = user.team_id
WHERE c.cluster_id=? AND sc.subcluster_id=? AND s.active=0 AND s.prepare=0
GROUP BY user.team_id, s.surveyId, q.cat
ORDER BY s.surveyId, user.team_id, q.cat ASC
The problem I get with this query is that when it returns a correct result it runs quickly (let's say +-500ms), but when the result has twice as many rows it takes more than 5 minutes and then causes a 504 timeout.
The other problem is that I didn't create this database myself, so I didn't set the indexes myself. I'm thinking of improving these, and therefore I ran the EXPLAIN command.
I see a lot of primary keys and a couple of composite (multi-column) indexes, but I'm not sure whether this would affect the performance this greatly.
EDIT: This piece of code takes up all the execution time:
$start_time = microtime(true);
$stmt = $conn->query($query); //query is simply the query above.
while ($row = $stmt->fetch_assoc()){
$resultSurveys["scores"][] = $row;
}
$stmt->close();
$end_time = microtime(true);
$duration = $end_time - $start_time; //value typically the execution time #reallyHigh...
So my question: Is it possible to (greatly?) improve the performance of the query by altering the database keys or should I divide my query into multiple smaller queries?
You can try something like this (although it's not practical for me to test it):
SELECT
sac.surveyId,
q.cat,
SUM((sac.answer_id*q.weight))/SUM(q.weight) AS score,
user.division_id,
user.unit_id,
user.department_id,
user.team_id,
division.division_name,
unit.unit_name,
dpt.department_name,
team.team_name
FROM survey_answers_cache sac
JOIN
(
SELECT
s.surveyId,
sc.subcluster_id
FROM
surveys s
JOIN subcluster sc ON s.subcluster_id = sc.subcluster_id
JOIN cluster c ON sc.cluster_id = c.cluster_id
WHERE
c.cluster_id=? AND sc.subcluster_id=? AND s.active=0 AND s.prepare=0
) AS v ON v.surveyid = sac.surveyid
JOIN user ON user.user_id = sac.user_id
JOIN questions q ON q.question_id = sac.question_id
JOIN division ON division.division_id = user.division_id
LEFT JOIN unit ON unit.unit_id = user.unit_id
LEFT JOIN department dpt ON dpt.department_id = user.department_id
LEFT JOIN team ON team.team_id = user.team_id
GROUP BY user.team_id, v.surveyId, q.cat
ORDER BY v.surveyId, user.team_id, q.cat ASC
So I hope I didn't mess anything up.
Anyway, the idea is that in the inner query you select only the rows you need, based on your WHERE condition. This creates a smaller temporary table, as it only pulls two fields, both ints.
Then in the outer query you join to the tables that you actually pull the rest of the data from, and order and group there. This way you are sorting and grouping a smaller dataset, and your WHERE clause can run in the most optimal way.
You may even be able to omit some of these tables, as you're only pulling data from a few of them, but without seeing the full schema and how it's related that's hard to say.
But just generally speaking, this part (the sub-query):
SELECT
s.surveyId,
sc.subcluster_id
FROM
surveys s
JOIN subcluster sc ON s.subcluster_id = sc.subcluster_id
JOIN cluster c ON sc.cluster_id = c.cluster_id
WHERE
c.cluster_id=? AND sc.subcluster_id=? AND s.active=0 AND s.prepare=0
is what is directly affected by your WHERE clause. So we can optimize this part on its own, then use it to join to the rest of the data you need.
An example of removing tables can easily be deduced from the above; consider this:
SELECT
s.surveyId,
sc.subcluster_id
FROM
surveys s
JOIN subcluster sc ON s.subcluster_id = sc.subcluster_id
WHERE
sc.cluster_id=? AND sc.subcluster_id=? AND s.active=0 AND s.prepare=0
The cluster table c is never used to pull data from, only in the WHERE clause. So this:
JOIN cluster c ON sc.cluster_id = c.cluster_id
WHERE
c.cluster_id=?
is the same as, or equivalent to, this:
WHERE
sc.cluster_id=?
And therefore we can eliminate that join completely.
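On the PHP side, running the rewritten query is a small change to your existing snippet. A minimal sketch, assuming a mysqli connection in $conn and that cluster_id and subcluster_id are integers ($clusterId and $subclusterId are made-up variable names):
$stmt = $conn->prepare($query);               // $query = the rewritten SQL above
$stmt->bind_param('ii', $clusterId, $subclusterId);
$stmt->execute();
$result = $stmt->get_result();                // requires the mysqlnd driver

$resultSurveys = array("scores" => array());
while ($row = $result->fetch_assoc()) {
    $resultSurveys["scores"][] = $row;
}
$stmt->close();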
The EXPLAIN result is showing signs of problems:
Using temporary; Using filesort: the ORDER BY needs to create temporary tables to do the sorting.
On the 3rd row, for the user table, type is ALL and key and ref are NULL: this means it needs to scan the whole table each time to retrieve results.
Suggestions:
Add indexes on user.cluster_id and on all fields involved in the ORDER BY and GROUP BY clauses. Keep in mind that the user table seems to be in a different database (cross-database query).
Add indexes on the user columns involved in JOINs.
Add an index on s.surveyId (a rough DDL sketch follows after this list).
If possible, keep the same sequence for GROUP BY and ORDER BY clauses
According to the accepted answer in this question, move the JOIN on the user table to the first position in the join order.
Carefully read this official documentation. You may need to optimize the server configuration.
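For illustration only, the kind of DDL those indexing suggestions translate to might look like this (index and column names are assumptions; check your actual schema and skip anything already covered by a primary key):
ALTER TABLE `user` ADD INDEX idx_user_cluster (cluster_id);
ALTER TABLE `user` ADD INDEX idx_user_org (division_id, unit_id, department_id, team_id);
ALTER TABLE surveys ADD INDEX idx_surveys_surveyid (surveyId);
ALTER TABLE survey_answers_cache ADD INDEX idx_sac_survey_user (surveyid, user_id, question_id);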
PS: query optimization is an art that requires patience and hard work. No silver bullet for that.
Welcome to the fine art of optimizing MySQL!
I think the problem happens when you add this:
JOIN user ON user.cluster_id = sc.subcluster_id
JOIN survey_answers_cache sac ON (sac.surveyId = s.surveyId AND sac.user_id = user.user_id)
The extra condition sac.user_id = user.user_id can easily be inconsistent.
Can you try doing a second join with the user table?
P.S. Can you add a SHOW CREATE TABLE?
So, I am trying to select some data from 4 tables using a query I have attempted to throw together.
SELECT *
FROM cards
LEFT JOIN cards_viewers ON cards.card_id = cards_viewers.card_id
(SELECT *
FROM folders
WHERE folder_id = cards.card_folderID)
(SELECT user_firstName,
user_lastName,
user_avatar
FROM user_data
WHERE user_id = cards_viewers.user_id)
WHERE cards_viewers.user_id = '.$u_id.'
ORDER BY cards.card_lastUpdated DESC
Basically, the query selects data from the four tables depending on the user_id in the user_data table. I have attempted to initially fetch all the data from the cards and cards_viewers tables, and have gone on to use this data to select values from the other tables (user_data and folders).
The query is wrong, I know that. I have learnt the majority of basic MySQL, but I am still struggling with more complex queries like the one I am trying to write now. What query can I use to select the data I want?
Links to any documentation on the relevant parts of queries would prove very useful in helping me learn how to create queries in the future, rather than just relying on StackOverflow.
Many thanks.
You don't need "MULTI-WHERE" but multiple joins; you just need to keep joining until you have pulled in all the tables you need.
Here's an example:
SELECT *
FROM cards LEFT JOIN cards_viewers
ON cards.card_id = cards_viewers.card_id
LEFT JOIN folders
ON folders.folder_id = cards.card_folderID
LEFT JOIN user_data
ON user_data.user_id = cards_viewers.user_id
WHERE cards_viewers.user_id = '.$u_id.'
ORDER BY cards.card_lastUpdated DESC
To customize the fields you get, just replace * with the names of the fields you want, being careful about ambiguous column names.
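For example (the column list here is just a guess based on your query; pick whatever you actually need, and consider a ? placeholder for the user id if you switch to prepared statements):
SELECT cards.card_id,
       cards.card_lastUpdated,
       folders.folder_id,
       user_data.user_firstName,
       user_data.user_lastName,
       user_data.user_avatar
FROM cards
LEFT JOIN cards_viewers ON cards.card_id = cards_viewers.card_id
LEFT JOIN folders ON folders.folder_id = cards.card_folderID
LEFT JOIN user_data ON user_data.user_id = cards_viewers.user_id
WHERE cards_viewers.user_id = ?
ORDER BY cards.card_lastUpdated DESC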
For further information check MySql Joins. Hope this helped you :)
I am running this query
$sql6 = "SELECT RECIPE.Price, RECIPE.Name FROM ORDERRECI, ORDERS, RECIPE WHERE ORDERRECI.OrderID = $orderID AND ORDERRECI.RecipeID = RECIPE.RecipeID";
$results = mysqli_query($con, $sql6);
while ($row=mysqli_fetch_assoc($results))
{
$calcC+= $row['Price'];
echo $calcC.$row['Name']."<br />";
}
return $calcC;
The query runs fine and I'm getting the right values, but I'm getting them 9 times (I checked via the echo). I checked the database and they are only in there once. I tried using DISTINCT, but because the customer can pick the same side multiple times, that would give an inaccurate result (tested). Can anyone explain why? Would using a join help? My teacher favors WHERE (I don't know why), but that's why I use it.
EDIT:
RECIPE and ORDERS have a many-to-many relationship, and ORDERRECI is the junction (referential) table. I am trying to calculate the total cost of an order. I just tried an inner join, but it still duplicated, and this time it duplicated 14 times.
FROM ORDERRECI, ORDERS, RECIPE: here you are doing a Cartesian product of the 3 tables, which means each row from each table will be paired with each row from every other table in the FROM list. I don't know what your goal was with the query, but that is what's happening.
Try adding this to your query
GROUP BY RECIPE.Name
For each row in ORDERRECI you create a merged row with each one of the rows in ORDERS and the same goes for RECIPE.
You should use LEFT JOIN. Read about SQL join types here
I ended up with this statement. I needed the general join of the tables, then used WHERE to narrow it down. Thanks for the help in figuring it out.
$sql = "SELECT RECIPE.Price FROM ORDERRECI INNER JOIN ORDERS ON ORDERRECI.OrderID = ORDERS.OrderID INNER JOIN RECIPE ON ORDERRECI.RecipeID = RECIPE.RecipeID WHERE ORDERS.OrderID = $orderID";
I have a recordset based on a view in MySQL that I use to return search results, but it is painfully slow (consistently 21 seconds!). A similar search in the same environment takes under a second.
I fear that it is the view that is slowing things down since I have four left joins and one subquery in there to make related data available in the search.
Is there any general guidance for speeding up a query when using a view? I have researched indexing, but it seems that views cannot be indexed in MySQL.
Thanks in advance for any suggestions.
The code to create my view:
CREATE VIEW vproducts2 AS
SELECT products2.productid, products2.active, products2.brandid,
products2.createddate, products2.description, products2.inventorynum,
products2.onhold, products2.price, products2.refmodnum, products2.retail,
products2.sefurl, products2.series, products2.sold,
`producttype`.`type` AS type, categories.category AS category,
`watchbrands`.`brand` AS brand, productfeatures.productfeaturevalue AS size,
(SELECT productimages.image
FROM productimages
WHERE productimages.productid = products2.productid
LIMIT 1
) AS pimage
FROM products2
LEFT JOIN producttype ON producttype.typeid = products2.typeid
LEFT JOIN categories ON categories.categoryid = products2.categoryid
LEFT JOIN watchbrands ON watchbrands.brandid = products2.brandid
LEFT JOIN productfeatures ON productfeatures.productid = products2.productid
AND productfeatures.featureid = 1
You need to ensure that you have indexes on the underlying tables, not on the view; the view will then use those indexes.
The first index that screams out is on productimages(productid, image). This will speed up the subquery in the SELECT clause.
You should also have primary key indexes for what look like primary keys on all the tables . . . categories(categoryid), producttype(typeid), watchbrands(brandid), and (I think) productfeatures(productid, featureid).
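For illustration only, the corresponding DDL might look roughly like this (index names are made up; skip any column that is already a primary key, and if image is a TEXT column give it a prefix length such as image(100)):
CREATE INDEX idx_productimages_product ON productimages (productid, image);
CREATE INDEX idx_producttype_typeid ON producttype (typeid);
CREATE INDEX idx_categories_categoryid ON categories (categoryid);
CREATE INDEX idx_watchbrands_brandid ON watchbrands (brandid);
CREATE INDEX idx_productfeatures_feature ON productfeatures (productid, featureid);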
I'm looping through records in a table, and in each loop it opens another table to pull data from it. I need to ORDER the main loop ($rs) by a field from the 2nd table.
I'm somewhat of a beginner to PHP and SQL, so go easy on me :P
code example:
$sql_result = mysql_query("SELECT * FROM table1", $db); // i want this to ORDER BY data from table2
while($rs = mysql_fetch_array($sql_result)) {
$sql_result2 = mysql_query("SELECT * FROM table2 WHERE id='$rs[data]'", $db);
$rs2 = mysql_fetch_array($sql_result2);
//get data from table 2
}
Looping through a resultset and executing a second query for each row is bad practice and causes unnecessary load on the database.
It looks like you need an INNER JOIN:
SELECT t1.*
FROM table1 t1
INNER JOIN table2 t2 ON t2.id = t1.data
ORDER BY t2.field
This way, you can get all the data with one single query.
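In the mysql_* style of your snippet (kept here only to match your code; mysqli or PDO would be preferable today), that single query looks roughly like this, with t2.field standing in for whatever column you want to order by:
$sql_result = mysql_query(
    "SELECT t1.*, t2.field
     FROM table1 t1
     INNER JOIN table2 t2 ON t2.id = t1.data
     ORDER BY t2.field", $db);

while ($rs = mysql_fetch_array($sql_result)) {
    // $rs now holds the table1 columns plus t2.field, already in the right order
}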
EDIT:
Generally you should try to avoid returning all columns (SELECT *) and select only the columns that you really need:
SELECT t1.id, t2.field1, t1.field2, ...
This will not only reduce the load on the database (less data to select and transfer over the network), but it will also help to eliminate duplicate column names, which cause problems when you want to display them (as you found out yourself).
You can do two things to avoid this problem:
if both tables have columns with identical names and identical content (e.g. if these columns are used to join the tables), just select one of them (no matter which one, because they are identical).
if both tables have columns with identical names and different content and you need both of them, the easiest way would be to give one of them an alias:
SELECT t1.id, t2.id AS AnotherId, ...
This will cause the second column to be named "AnotherId", so you can get it with $rs['AnotherId'].
Honestly, I'm no PHP guru, so I don't know whether PHP understands $rs['table.field'] as well, but there are enough data access technologies that have problems when a query returns two columns with identical names, so aliasing duplicate columns can never be wrong.
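A quick sketch of how that plays out on the PHP side (again in mysql_* style to match your code, with t2.field as a made-up column):
$result = mysql_query(
    "SELECT t1.id, t2.id AS AnotherId, t2.field
     FROM table1 t1
     INNER JOIN table2 t2 ON t2.id = t1.data
     ORDER BY t2.field", $db);

while ($row = mysql_fetch_array($result)) {
    echo $row['id'];        // id from table1
    echo $row['AnotherId']; // id from table2, available thanks to the alias
}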