Hey everyone. I'm having a bit of trouble running a query / php combination efficiently. I seem to be just looping over too many result sets in inner loops in my php. I'm sure there is a more efficient way of doing this. Any help very much appreciated.
I've got a table that holds 3500 recipes ([recipe]):
rid | recipe_name
And another table that holds 600 different ingredients ([ingredients])
iid | i_name
Each recipe has x number of ingredients associated to it, and I use a nice joining table to create the association ([recipe_ingredients])
uid | rid | iid
(where uid is just a unique id for the table)
For example:
rid: 1 | recipe_name: Lemon Tart
.....
iid: 99 | i_name: lemon curd
iid: 154 | i_name: flour
.....
1 | 1 | 99
2 | 1 | 154
The query I'm trying to run lets the user enter the ingredients they have, and it tells them everything they can make with those ingredients. A recipe doesn't have to use all of the user's ingredients, but the user does need to have every ingredient the recipe calls for.
For instance, if I had flour, egg, salt, milk and lemon curd I could make 'Pancakes' and 'Lemon Tart' (if we assume lemon tart has no other ingredients :)), but couldn't make 'Risotto' (as I didn't have any rice, or anything else that's needed in it).
In my PHP I have an array containing all the ingredients the user has. At the moment the way I'm running this is going through every recipe (loop 1) and then checking all ingredients in that recipe to see if each ingredient is contained in my ingredients array (loop 2). As soon as it finds an ingredient in the recipe that isn't in my array, it says "no" and goes on to the next recipe. If every ingredient is found, it stores the rid in a new array that I use later to display the results.
But if we look at the efficiency of that, if I assume 3500 recipes and I've got 40 ingredients in my array, the worst case scenario is it running through 3500 x 40n comparisons, where n = number of ingredients in the recipe. The best case is still 3500 x 40 (the first ingredient of every recipe isn't found, so it exits straight away).
I think my whole approach to this is wrong, and I think there must be some clever sql that I'm missing here. Any thoughts? I can always build up an sql statement from the ingredient array I have ......
Thanks a lot in advance, much appreciated
I'd suggest storing the count of the number of ingredients for the recipe in the recipe table, just for efficiency's sake (it will make the query quicker if it doesn't have to calculate this information every time). This is denormalization, which is bad for data integrity but good for performance. You should be aware that this can cause data inconsistencies if recipes are updated and you are not careful to make sure the number is updated in every relevant place. I've assumed you've done this with the new column set as ing_count in the recipe table.
Make sure you escape the values for NAME1, NAME2, etc. if they come from user input (or, better, bind them as query parameters); otherwise you are at risk of SQL injection.
select recipe.rid, recipe.recipe_name, recipe.ing_count, count(ri.iid) as ing_match_count
from recipe_ingredients ri
inner join (select iid from ingredients where i_name = 'NAME1' or i_name = 'NAME2' or i_name = 'NAME3') ing
on ri.iid = ing.iid
inner join recipe
on recipe.rid = ri.rid
group by recipe.rid, recipe.recipe_name, recipe.ing_count
having ing_match_count = recipe.ing_count
If you don't want to store the recipe count, you could do something like this:
select recipe.rid, recipe.recipe_name, count(*) as ing_count, count(ing.iid) as ing_match_count
from recipe
inner join recipe_ingredients ri
on recipe.rid = ri.rid
left outer join (select iid from ingredients where i_name = 'NAME1' or i_name = 'NAME2' or i_name = 'NAME3') ing
on ri.iid = ing.iid
group by recipe.rid, recipe.recipe_name
having ing_match_count = ing_count
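To try the idea end to end, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL/PHP (that swap is mine, so it can run anywhere). The table and column names come from the question; the Pancakes and Risotto rows and the makeable_recipes helper are made up for illustration. It compares each recipe's total ingredient count with the number of ingredients the user has, and binds the names as parameters instead of escaping them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipe (rid INTEGER PRIMARY KEY, recipe_name TEXT);
    CREATE TABLE ingredients (iid INTEGER PRIMARY KEY, i_name TEXT);
    CREATE TABLE recipe_ingredients (uid INTEGER PRIMARY KEY, rid INTEGER, iid INTEGER);
    -- Lemon Tart rows follow the question; the other recipes are invented sample data.
    INSERT INTO recipe VALUES (1, 'Lemon Tart'), (2, 'Pancakes'), (3, 'Risotto');
    INSERT INTO ingredients VALUES
        (99, 'lemon curd'), (154, 'flour'), (1, 'egg'), (2, 'milk'), (3, 'salt'), (4, 'rice');
    INSERT INTO recipe_ingredients (rid, iid) VALUES
        (1, 99), (1, 154),
        (2, 154), (2, 1), (2, 2), (2, 3),
        (3, 4), (3, 3);
""")


def makeable_recipes(conn, have):
    """Return the names of recipes whose every ingredient is in `have`."""
    have = list(have)
    if not have:
        return []
    placeholders = ", ".join("?" for _ in have)
    sql = f"""
        SELECT r.recipe_name
        FROM recipe r
        JOIN recipe_ingredients ri ON ri.rid = r.rid
        LEFT JOIN ingredients i
               ON i.iid = ri.iid AND i.i_name IN ({placeholders})
        GROUP BY r.rid, r.recipe_name
        -- COUNT(*) is the recipe's total, COUNT(i.iid) only the owned ingredients
        HAVING COUNT(*) = COUNT(i.iid)
        ORDER BY r.recipe_name
    """
    return [row[0] for row in conn.execute(sql, have)]


print(makeable_recipes(conn, ["flour", "egg", "salt", "milk", "lemon curd"]))
```

Only the placeholders are interpolated into the SQL string; the ingredient names themselves are passed separately, which is what keeps user input out of the statement.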
You could use an "IN"-type query:
select recipe.rid, count(recipe_ingredients.iid) as cnt
from recipe
left join recipe_ingredients on recipe.rid = recipe_ingredients.rid
where recipe_ingredients.iid in (the, list, of, ingredient, ids, the, user, has)
group by recipe.rid
having cnt > some_threshold_amount
order by cnt desc
Doing this off the top of my head, but basically: pull out any recipes where at least one of the user-provided ingredients is listed, count the matches per recipe, keep only the recipes where more than a threshold number of ingredients matched, and sort by that count.
I've probably got the threshold bit wrong. Because the where clause filters rows before grouping, cnt counts only the ingredients that are in the user's list, not the recipe's total, so on its own it can't tell you whether the recipe is complete. Still, the rest of the query should be a good start for what you need.
Question: why isn't your query directly sql?
You can optimize by eliminating the recipes that can't match:
first, eliminate the recipes that have more ingredients than the user has
then make a recursive greedy pass:
pick the first rid|iid row
if the iid is in the user's ingredients, continue
if not, remove all the rows with that rid from the recipe_ingredients table => new_table
restart using new_table | stop when new_table count = 0
It should have the best statistical results.
Hope it helped
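For what it's worth, the same elimination idea can be sketched in application code with sets. This is not the recursive version described above but a simpler stand-in: skip recipes that need more ingredients than the user owns, then do a subset test. The makeable function and the sample data are hypothetical; the dict maps rid to the set of iid values that recipe needs.

```python
def makeable(recipes, have):
    """recipes: dict of rid -> set of iid; have: iterable of iid the user owns."""
    have = set(have)
    result = []
    for rid, needed in recipes.items():
        # Cheap elimination first: a recipe needing more ingredients than
        # the user owns can never be fully covered.
        if len(needed) > len(have):
            continue
        # Subset test: every needed ingredient must be owned.
        if needed <= have:
            result.append(rid)
    return result


# Invented sample data: recipe 3 needs ingredient 4, which the user lacks.
recipes = {1: {99, 154}, 2: {154, 1, 2, 3}, 3: {4, 3}}
print(makeable(recipes, [99, 154, 1, 2, 3]))
```

Set membership is a hash lookup, so each recipe costs roughly its own ingredient count rather than that count multiplied by the 40 owned ingredients.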
Something like this:
SELECT r.*, COUNT(ri.iid) AS count FROM recipe r
INNER JOIN recipe_ingredients ri ON r.rid = ri.rid
INNER JOIN ingredients i ON i.iid = ri.iid
WHERE i.i_name IN ('milk', 'flour')
GROUP BY r.rid
HAVING count = 2
It's pretty easy to understand: count holds the number of ingredients from the list (milk, flour) that were matched for each recipe. If count equals the number of ingredients in the WHERE clause (in this case 2), the recipe is returned. Note that this returns recipes that contain every listed ingredient; it does not check that the recipe needs nothing else.
SELECT irl.ingredient_amount, r.*, i.thumbnail
FROM recipes r
LEFT JOIN recipe_images i ON (i.recipe_id = r.recipe_id)
LEFT JOIN ingredients_recipes_link irl ON (irl.recipe_id = r.recipe_id)
WHERE irl.recipe_id IN (
SELECT recipe_id
FROM `ingredients_recipes_link`
WHERE ingredient_id IN (24, 21, 22)
GROUP BY recipe_id
HAVING COUNT(*) = 3
)
GROUP BY r.recipe_id
I have the following tables
ea_users
id
first_name
last_name
email
password
id_roles
ea_user_cfields
id
c_id = custom field ID
u_id = user ID
data
ea_customfields
id
name = name of custom field
description
I want to get all users which have a certain role, but I also want to retrieve all the custom fields per user. This is for the backend of my software where all the ea_users and custom fields should be shown.
I tried the following, but for each custom field, it duplicates the same user
$this->db->join('(SELECT GROUP_CONCAT(data) AS custom_data, id AS dataid, u_id, c_id
FROM ea_user_cfields userc
GROUP BY id) AS tt', 'tt.u_id = ea.id','left');
$this->db->join('(SELECT GROUP_CONCAT(name) AS custom_name, id AS customid
FROM ea_customfields AS cf
GROUP BY id) AS te', 'tt.c_id = te.customid','left');
$this->db->where('id_roles', $customers_role_id);
return $this->db->get('ea_users ea')->result_array();
The problem is that you haven't quite understood how a join works.
It's OK to get duplicates in the select when you have a one-to-many relation.
In a few words, your case: the engine fetches data from table "A" (ea_users), then JOINs another table "B" (ea_user_cfields) according to the conditions. If you have a one-to-many relation between the tables (meaning that one record from table "A", let's call it A1, can have several related rows in table "B", let's call them B1.1, B1.2, B1.3 and B1.4), it will join these records and put the join result in memory. So in memory you would see something like
| FromTable A | FromTableB |
| A1 | B1.1 |
| A1 | B1.2 |
| A1 | B1.3 |
| A1 | B1.4 |
If you have 10 records in table "B" related to one record in table "A", the data from table "A" is copied into memory 10 times during fetching, and then returned to you.
Depending on the join type, rows with no related records can be skipped entirely (INNER JOIN) or filled up with NULLs (LEFT JOIN or RIGHT JOIN), etc.
When you think about JOINs, imagine joining a few big tables on paper yourself. You would always need to mark somehow which data comes from which table in order to work with it later, so it's quite logical to write row "A1" from table "A" as many times as needed to fill the empty spaces whenever you find a matching record in table "B". Otherwise you would have something like this on your paper:
| FromTable A | FromTableB |
| A1 | B1.1 |
| | B1.2 |
| | B1.3 |
| | B1.4 |
Yes, it looks OK even when the "FromTable A" column contains empty cells, as long as you have 5-10 records and can easily work with them (for example, you can sort them in your head: you just need to imagine what belongs in each empty space, but for that you need to remember the order in which you wrote the data on the paper). But let's assume you have 100-1000 records. If you can still sort that easily, let's make things more complicated and say that values in table "A" can themselves be empty, etc. That's why it is simpler for the MySQL engine to repeat the data from table "A" many times.
Basically, I always go back to this kind of example: imagine how you would join huge tables on paper, or how you would select something from them and then sort it, how you would look through the tables, and so on.
GROUP_CONCAT, grouping
The next mistake: you haven't quite understood how GROUP_CONCAT works.
The thing is that, as a first step, the MySQL engine builds the row set in memory, applying the WHERE conditions, evaluating subqueries and appending all joins. Once that is loaded, it performs the GROUPing: for each group it selects from the temporary table all rows related to, say, "A1", and then applies the aggregate function to the selected rows. GROUP_CONCAT means we want to concatenate the values of the selected group, so we would see something like "B1.1,B1.2,B1.3,B1.4". That's it in a few words, but I hope it helps a little.
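Applied to the tables from the question, grouping by user and concatenating per group might look like the sketch below. It runs on Python's sqlite3 module, not MySQL, so the concatenation syntax differs (MySQL would use CONCAT() and a SEPARATOR clause). The sample rows and the users_with_fields helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ea_users (id INTEGER PRIMARY KEY, first_name TEXT, id_roles INTEGER);
    CREATE TABLE ea_customfields (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ea_user_cfields (id INTEGER PRIMARY KEY, c_id INTEGER, u_id INTEGER, data TEXT);
    -- Invented sample data: Ann has two custom fields, Bob has none.
    INSERT INTO ea_users VALUES (1, 'Ann', 3), (2, 'Bob', 3), (3, 'Cy', 1);
    INSERT INTO ea_customfields VALUES (1, 'city'), (2, 'vip');
    INSERT INTO ea_user_cfields (c_id, u_id, data) VALUES
        (1, 1, 'Paris'), (2, 1, 'yes'), (1, 3, 'Oslo');
""")

# Group by the user, not by the custom-field row id, so each user collapses
# into a single row with all "name=data" pairs concatenated.
SQL = """
    SELECT u.id, u.first_name,
           GROUP_CONCAT(cf.name || '=' || uc.data, '; ') AS custom_fields
    FROM ea_users u
    LEFT JOIN ea_user_cfields uc ON uc.u_id = u.id
    LEFT JOIN ea_customfields cf ON cf.id = uc.c_id
    WHERE u.id_roles = ?
    GROUP BY u.id, u.first_name
    ORDER BY u.id
"""


def users_with_fields(conn, role_id):
    """One row per user with the given role; custom_fields is None if there are none."""
    return conn.execute(SQL, (role_id,)).fetchall()


for row in users_with_fields(conn, 3):
    print(row)
```

The order of the pairs inside the concatenated string is not guaranteed here; MySQL's GROUP_CONCAT accepts an ORDER BY inside the call if that matters.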
I googled a sample table structure so you can try some queries there:
http://www.mysqltutorial.org/tryit/query/mysql-left-join/#1
And here is an example of how GROUP_CONCAT works; try executing this query there:
SELECT
c.customerNumber, c.customerName, GROUP_CONCAT(orderNumber) AS allOrders
FROM customers c
LEFT JOIN orders o ON (c.customerNumber = o.customerNumber)
GROUP BY 1,2
;
You can compare the results with the previous one.
The power of GROUP BY is in the aggregate functions you can use with it, for example COUNT(), MAX(), GROUP_CONCAT() and many others.
Or an example of fetching a count (try executing it):
SELECT c.customerName, COUNT(o.orderNumber) AS ordersCount
FROM customers AS c
LEFT JOIN orders AS o ON (c.customerNumber = o.customerNumber)
GROUP BY 1
;
So my opinion:
It's simpler and better to solve this on the client side or in the backend, after fetching, because from the MySQL engine's point of view a response with duplicated column values is absolutely correct. Of course, you can also solve it using grouping with concatenation, for example, but I have a feeling that for your task that overcomplicates the logic.
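A minimal sketch of the "solve it after fetching" option, in Python rather than the PHP from the question: take the duplicated rows a plain LEFT JOIN returns and fold them into one entry per user. The row shape (user id, first name, field name, data) and the group_custom_fields helper are assumptions for illustration.

```python
def group_custom_fields(rows):
    """Fold joined rows (user_id, first_name, field_name, data) into one dict per user."""
    users = {}
    for user_id, first_name, field_name, data in rows:
        user = users.setdefault(
            user_id,
            {"id": user_id, "first_name": first_name, "custom_fields": {}},
        )
        # A LEFT JOIN yields NULL (None) field columns for users without custom fields.
        if field_name is not None:
            user["custom_fields"][field_name] = data
    return list(users.values())


# Invented rows, shaped like the duplicated result described above.
joined_rows = [
    (1, "Ann", "city", "Paris"),
    (1, "Ann", "vip", "yes"),
    (2, "Bob", None, None),
]
print(group_custom_fields(joined_rows))
```

The duplication stays in the query result, where it is harmless, and the per-user structure is built once in a single pass.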
PS.
"GROUP BY 1" means that I want to group by column 1 of the select list, so after selecting the data into memory MySQL groups all rows by the first column. It's better not to use this notation in production. In the first query above it's the same as "GROUP BY c.customerNumber".
PPS. I also read comments like "use DISTINCT", etc.
To use DISTINCT or ordering functions, you need to understand how they work, because incorrect usage can remove data from your result (same with GROUP BY, INNER JOIN, etc.). At first glance your code might work fine, but it can cause logic bugs, which are the hardest to track down later.
Moreover, DISTINCT will not help you when you have a one-to-many relation (as in your particular case). You can try executing these queries:
SELECT
c.customerName, orderNumber AS nr
FROM customers c
INNER JOIN orders o ON (c.customerNumber = o.customerNumber)
WHERE c.customerName='Alpha Cognac'
;
SELECT
DISTINCT(c.customerName), orderNumber AS nr
FROM customers c
INNER JOIN orders o ON (c.customerNumber = o.customerNumber)
WHERE c.customerName='Alpha Cognac'
;
The result should be the same: the customer name is duplicated and each row has a different order number.
And an example of how to lose data with an incorrect query ;):
SELECT
c.customerName, orderNumber AS nr
FROM customers c
INNER JOIN orders o ON (c.customerNumber = o.customerNumber)
WHERE c.customerName='Alpha Cognac'
GROUP BY 1
;
I have a table with the following schema in MySQL
Recipe_Quantity_ID,Recipe_ID, Recipe_Quantity, Ingredients_ID, Ingredient_Measurement_ID
The sample data can be found in this SQL Fiddle http://sqlfiddle.com/#!2/d05fe .
I want to search the table for one or more given Ingredients_ID values and return the Recipe_ID of every recipe that has those ingredients.
I do this by using this SQL
select Recipe_ID
from recipe_quantity
group by Recipe_ID
having count(*) = {$ar_rows}
and sum(Ingredients_ID in ({$ids})) = {$ar_rows}
which may translate to
select Recipe_ID
from recipe_quantity
group by Recipe_ID
having count(*) = 4
and sum(Ingredients_ID in (8,5,45,88)) = 4
To search for fewer Ingredients_ID values I drop the last ID, and repeat until I reach a single ingredient ID. Of course, with this technique it's not possible to search all the combinations, e.g. 8,5,45,85 | 8,45,85 | 45,85 | 5,45,85 etc.
Any ideas how I can search for all the combinations that may be true? Thanks in advance.
My understanding is that you want to get all recipes for which you already have every ingredient you need. You don't need to use all the ingredients you have, but you don't want to have to go shopping.
Correct me if I am wrong, but I don't think there is a recipe in your data that fits your ingredients list, so I have used other ingredients. Note that ingredients 13 and 36 won't be used.
You should be able to replace the lists in the brackets with another select statement that gets the ingredients you have (select ingredients_id from ingredients_owned); it isn't good to have to spell them out each time.
select distinct c.Recipe_id
from
(select Recipe_ID
from recipe_quantity
where Ingredients_ID in (5,6,1,11,8,12,13,36, 81,82,62,73,35)) c
left join (select Recipe_ID
from recipe_quantity
where Ingredients_ID not in (5,6,1,11,8,12,13,36, 81,82,62,73,35)) x
on c.Recipe_id = x.Recipe_id
where x.Recipe_id is null
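An alternative closer to the query in the question, sketched under assumptions: instead of requiring count(*) to equal the number of IDs supplied, compare each recipe's row count with the number of its rows that are in the owned list. That covers every combination in one pass. The sketch uses Python's sqlite3 module instead of MySQL, a trimmed-down recipe_quantity table with made-up rows, and a hypothetical recipes_fully_covered helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recipe_quantity (
        Recipe_Quantity_ID INTEGER PRIMARY KEY,
        Recipe_ID INTEGER,
        Ingredients_ID INTEGER
    );
    -- Invented sample data, not the rows from the SQL Fiddle.
    INSERT INTO recipe_quantity (Recipe_ID, Ingredients_ID) VALUES
        (1, 8), (1, 5),
        (2, 8), (2, 45), (2, 88),
        (3, 5), (3, 99);
""")


def recipes_fully_covered(conn, owned_ids):
    """Return Recipe_IDs whose ingredients are all contained in owned_ids."""
    owned_ids = list(owned_ids)
    if not owned_ids:
        return []
    placeholders = ", ".join("?" for _ in owned_ids)
    sql = f"""
        SELECT Recipe_ID
        FROM recipe_quantity
        GROUP BY Recipe_ID
        -- the IN test yields 1 or 0 per row, so the SUM counts owned ingredients
        HAVING COUNT(*) = SUM(Ingredients_ID IN ({placeholders}))
        ORDER BY Recipe_ID
    """
    return [row[0] for row in conn.execute(sql, owned_ids)]


print(recipes_fully_covered(conn, [8, 5, 45, 88]))
```

Owning extra ingredients never disqualifies a recipe here, because the comparison is against the recipe's own size rather than the size of the owned list.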
How about something like this?
select Recipe_ID, group_concat(Ingredients_ID), count(*) as ingredients
from recipe_quantity
where Ingredients_ID IN (8,5,45,88)
group by Recipe_ID
having ingredients > 0
order by ingredients desc
Instead of grouping all recipe ingredients and then filtering out the ones that don't include the ingredients you're looking for, I match only the entries in recipe_quantity that match the ingredients in the first place. I use a group_concat so you can see the set of ingredients that match.
This orders by the number of ingredients that match, but still preserves partial matches on one or more ingredients. You change the 0 to the minimum number of ingredients to match.
That title is really not useful, but it's a complex question (in my head, maybe) ... anywho...
Say I have a MySQL table of Countries (A-Z all countries in the world) with id & name
Then I have a table where I am tracking which countries a user has been to: Like so:
Country Table
id name
1 india
2 luxembourg
3 usa
Visited Table
id user_id country_id
1 1 1
2 1 3
Now here's what I want to do, when I present the form to add to the list of visited countries I want country.id 1 & 3 to be excluded from the query result.
I know I can filter this using PHP ... which is something I have done in the past ... but surely there must be a way to structure a query in such a way that 1 & 3 are excluded from the returned results, like:
SELECT *
FROM `countries`
WHERE `id`!= "SELECT `country_id`
FROM `visited`
WHERE `user_id`='1'"
I suspect it has something to do with JOIN statements but I can't quite figure it out.
Bonus gratitude if someone can point me in the right direction with Laravel.
Thank you all :)
Is this what you want?
select c.*
from countries c left join
visited v
on c.id = v.country_id and v.user_id = 1
where v.country_id is null;
You can also express this as a not in or not exists, but the left join method typically has pretty good performance.
The left outer join keeps all records in the first table regardless of whether or not the on clause evaluates to true. If there are no matches in the second table, then the columns are populated with NULL values. The where clause simply chooses these records -- the ones that do not match.
Here is another way of expressing this that you might find easier to follow:
select c.*
from countries c
where not exists (select 1 from visited v where c.id = v.country_id and v.user_id = 1)
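Here is a small runnable check of both forms, sketched with Python's sqlite3 module instead of MySQL. The countries and visited rows are the ones from the question, plus one made-up visit by a second user to show why the user filter belongs in the ON clause or the subquery. The unvisited helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE visited (id INTEGER PRIMARY KEY, user_id INTEGER, country_id INTEGER);
    INSERT INTO countries VALUES (1, 'india'), (2, 'luxembourg'), (3, 'usa');
    -- Rows 1 and 2 are from the question; row 3 (user 2) is invented.
    INSERT INTO visited VALUES (1, 1, 1), (2, 1, 3), (3, 2, 2);
""")

LEFT_JOIN_SQL = """
    SELECT c.name
    FROM countries c
    LEFT JOIN visited v ON c.id = v.country_id AND v.user_id = ?
    WHERE v.country_id IS NULL
    ORDER BY c.id
"""

NOT_EXISTS_SQL = """
    SELECT c.name
    FROM countries c
    WHERE NOT EXISTS (
        SELECT 1 FROM visited v
        WHERE v.country_id = c.id AND v.user_id = ?
    )
    ORDER BY c.id
"""


def unvisited(conn, user_id, sql=LEFT_JOIN_SQL):
    """Names of the countries the given user has not visited yet."""
    return [row[0] for row in conn.execute(sql, (user_id,))]


print(unvisited(conn, 1))
print(unvisited(conn, 1, NOT_EXISTS_SQL))
```

Both statements should agree for any user; which one performs better depends on the engine and the indexes, so it is worth checking the query plan on real data.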
You can use your query like this.
SELECT c.*
FROM `countries` c LEFT JOIN `visited` v ON c.id = v.country_id
AND v.`user_id` = 1
WHERE v.`country_id` IS NULL
This is how a LEFT JOIN operates: I'm selecting all rows from the countries table, whether or not they have a match in the visited table based on the ID of the country.
So it will bring you this group
from country from visited
1 1
2 no row
3 3
So with the where condition (v.country_id is null) I'm saying: I only want the rows that, after this left join, are in the countries table but not in the visited table, so it brings me id 2. The user_id = 1 condition has to go in the ON clause rather than the WHERE clause: for unmatched rows v.user_id is NULL, so filtering on it in WHERE would discard exactly the rows we want.
SELECT * FROM countries
WHERE countries.id NOT IN ( SELECT country_id FROM visited WHERE user_id = 1 )
If I understand correctly, maybe you need something like this?
So, I have a table named clients, another one known as orders and other two, orders_type_a and orders_type_b.
What I'm trying to do is create a query that returns the list of all clients, and for each client it must return the number of orders based on this client's id and the amount of money this customer already spent.
And... I have no idea how to do that. I know the logic behind this, but can't find out how to translate it into a MySQL query.
I have a basic-to-thinkimgoodbutimnot knowledge of MySQL, but this situation has got me really confused.
Here is an image to better illustrate the process I'm trying to do:
Useful extra information:
Each orders row has only one type (which is A or B)
Each orders row can have multiple orders_type_X rows (where X is A or B)
orders relates to clients through the column client_id
orders_type_X relates to orders through the column order_id
Today this is done by running a query to retrieve the clients, and then for each row returned the code (PHP) runs another query to retrieve the orders and yet another one to retrieve the values. So basically for each row returned from the first query there are two more queries inside it. Needless to say, this is a horrible approach, the performance sucks, and that's the reason why I want to change it.
UPDATE with table columns:
clients:
id | name | phone
orders:
id | client_id | date
orders_type_a:
id | order_id | number_of_items | price_of_single_item
orders_type_b:
id | order_id | number_of_shoes_11 | number_of_shoes_12 | number_of_shoes_13 | price_of_single_shoe
For any extra info needed, just ask.
If I understand you correctly, you are looking for something like this?
select c.*, SUM(COALESCE(oa.value, 0) + COALESCE(ob.value, 0)) as total
from clients c
inner join orders o on o.client_id = c.id
left join orders_type_a oa on oa.id = o.order_type_id AND o.type = 'A'
left join orders_type_b ob on ob.id = o.order_type_id AND o.type = 'B'
group by c.id
I do not know your actual field names, but this returns the information on each client plus a single field 'total' that contains the sum of the values of all the orders of both type A and type B. The type tables are left-joined because each order matches only one of them. You might have to tweak the various names to get it to work, but does this get you in the right direction?
Erik's answer is on the right track. However, since there could be multiple orders_type_a and orders_type_b records for each order, it is a little more complex:
SELECT c.id, c.name, c.phone, SUM(x.total) as total
FROM clients c
INNER JOIN orders o
ON o.client_id = c.id
INNER JOIN (
SELECT order_id, SUM(number_of_items * price_of_single_item) as total
FROM orders_type_a
GROUP BY order_id
UNION ALL
SELECT order_id, SUM((number_of_shoes_11 + number_of_shoes_12 + number_of_shoes_13) * price_of_single_shoe) as total
FROM orders_type_b
GROUP BY order_id
) x
ON x.order_id = o.id
GROUP BY c.id
;
I'm making a few assumptions about how to calculate the total based on the columns in the orders_type_x tables.
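As a sanity check, here is a sketch of that approach running on Python's sqlite3 module instead of MySQL, with the table columns from the question and made-up rows. Two things are my additions rather than part of the answer above: it uses LEFT JOINs so clients without orders still appear, and it returns the order count the question asked for. The client_totals helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clients (id INTEGER PRIMARY KEY, name TEXT, phone TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, client_id INTEGER, date TEXT);
    CREATE TABLE orders_type_a (id INTEGER PRIMARY KEY, order_id INTEGER,
        number_of_items INTEGER, price_of_single_item INTEGER);
    CREATE TABLE orders_type_b (id INTEGER PRIMARY KEY, order_id INTEGER,
        number_of_shoes_11 INTEGER, number_of_shoes_12 INTEGER,
        number_of_shoes_13 INTEGER, price_of_single_shoe INTEGER);
    -- Invented sample data; Cy has no orders at all.
    INSERT INTO clients VALUES (1, 'Ann', '111'), (2, 'Bob', '222'), (3, 'Cy', '333');
    INSERT INTO orders VALUES (1, 1, '2020-01-01'), (2, 1, '2020-01-02'), (3, 2, '2020-01-03');
    INSERT INTO orders_type_a (order_id, number_of_items, price_of_single_item)
        VALUES (1, 2, 10), (1, 1, 5), (3, 4, 3);
    INSERT INTO orders_type_b (order_id, number_of_shoes_11, number_of_shoes_12,
        number_of_shoes_13, price_of_single_shoe) VALUES (2, 1, 0, 2, 20);
""")

SQL = """
    SELECT c.id, c.name,
           COUNT(DISTINCT o.id) AS order_count,
           COALESCE(SUM(x.total), 0) AS total
    FROM clients c
    LEFT JOIN orders o ON o.client_id = c.id
    LEFT JOIN (
        -- one row per order: the detail rows are summed before joining
        SELECT order_id, SUM(number_of_items * price_of_single_item) AS total
        FROM orders_type_a
        GROUP BY order_id
        UNION ALL
        SELECT order_id,
               SUM((number_of_shoes_11 + number_of_shoes_12 + number_of_shoes_13)
                   * price_of_single_shoe) AS total
        FROM orders_type_b
        GROUP BY order_id
    ) x ON x.order_id = o.id
    GROUP BY c.id, c.name
    ORDER BY c.id
"""


def client_totals(conn):
    """Rows of (client id, name, number of orders, total spent)."""
    return conn.execute(SQL).fetchall()


for row in client_totals(conn):
    print(row)
```

Summing inside the subquery first is what keeps the outer join from multiplying rows when an order has several detail lines.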
OK, for this question I need an idea for an algorithm for sorting results.
Let me explain the problem:
I have different dish recipes (more than 2000), each consisting of different ingredients.
I have 3 tables:
dish
id | name
ingredient
id | name
dish_ingredient
id | id_dish | id_ingredient
The application I made lets users select the ingredients they have, and the app shows them matching dish recipes sorted by the number of ingredients they have. Now I would like an algorithm where users select ingredients and assign them some kind of "weight".
An example of how it works now: if the user selects beef, carrots, onion, salt and pepper, the algorithm looks for recipes that have these ingredients (not necessarily all of them) and sorts them by the ingredients the user has. So the first recipe in this sorting could match just salt and otherwise need flour and eggs (a recipe for pancakes), while at the end of the list there could be recipes with beef, which are more relevant for the user.
So my idea is that the user could put weights on their ingredients so that the search algorithm gives them more suitable recipes. If you do not understand what I want, you can see the application at www.mizicapogrnise.si (translate it from Slovenian to your language) to see how the app works now.
Thanks for all your help.
The basic query to count the ingredients is:
select di.id_dish, count(*) as NumIngredients
from dish_ingredient di join
ingredient i
on i.id = di.id_ingredient
group by di.id_dish;
You can modify this to weight the ingredients by introducing a case expression into the query:
select di.id_dish, count(*) as NumIngredients,
sum(case when i.name = 'beef' then 2.0
when i.name = 'salt' then 0.5
. . .
end) as SumWeights
from dish_ingredient di join
ingredient i
on i.id = di.id_ingredient
group by di.id_dish
order by SumWeights desc;
The . . . is not part of the SQL syntax. You have to fill it in with similar lines wherever it occurs.
An alternative formulation is to put the information in a subquery. The subquery contains the weight for each ingredient. This is then joined to the dish_ingredient table to get the weight for each ingredient in each dish. The outer query aggregates the results by dish:
select di.id_dish, count(*) as NumIngredients,
sum(w.weight) as SumWeights
from dish_ingredient di join
ingredient i
on i.id = di.id_ingredient left outer join
(select 'beef' as name, 2.0 as weight union all
select 'salt', 0.5 union all
. . .
) w
on i.name = w.name
group by di.id_dish
order by SumWeights desc;
Both of these methods require modifying the query itself to pass the weight information to the database.
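One way to build that query from user input without pasting values into the SQL is to generate the weights subquery from placeholders. Below is a minimal sketch on Python's sqlite3 module rather than MySQL, using the dish, ingredient and dish_ingredient tables from the question with made-up rows. The rank_dishes helper and the weights dict are hypothetical, and it uses an inner join to the weights, so dishes with no weighted ingredient are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dish (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ingredient (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dish_ingredient (id INTEGER PRIMARY KEY, id_dish INTEGER, id_ingredient INTEGER);
    -- Invented sample data.
    INSERT INTO dish VALUES (1, 'Beef stew'), (2, 'Pancakes'), (3, 'Salted carrots');
    INSERT INTO ingredient VALUES
        (1, 'beef'), (2, 'carrots'), (3, 'salt'), (4, 'flour'), (5, 'eggs');
    INSERT INTO dish_ingredient (id_dish, id_ingredient) VALUES
        (1, 1), (1, 2), (1, 3),
        (2, 4), (2, 5), (2, 3),
        (3, 2), (3, 3);
""")


def rank_dishes(conn, weights):
    """weights: dict of ingredient name -> weight. Returns (dish name, score), best first."""
    if not weights:
        return []
    # One "(?, ?)" pair per weighted ingredient; the values are bound, never interpolated.
    values = ", ".join("(?, ?)" for _ in weights)
    params = [part for item in weights.items() for part in item]
    sql = f"""
        WITH w(name, weight) AS (VALUES {values})
        SELECT d.name, SUM(w.weight) AS sum_weights
        FROM dish d
        JOIN dish_ingredient di ON di.id_dish = d.id
        JOIN ingredient i ON i.id = di.id_ingredient
        JOIN w ON w.name = i.name
        GROUP BY d.id, d.name
        ORDER BY sum_weights DESC, d.name
    """
    return conn.execute(sql, params).fetchall()


print(rank_dishes(conn, {"beef": 2.0, "carrots": 1.0, "salt": 0.5}))
```

With these weights a dish that matches only salt sinks to the bottom, which is the behaviour the question was after.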