MYSQL JOIN Conditional to check integer against comma delimited list - php

I have a SQL SELECT statement in which I'm using 3 tables.
I'm using INNER JOINs to join the tables, but I've run into an issue because two of the columns that I'd like to base the join condition on have different data types:
One is an integer, the id of the products table, shown below as p.id.
The other is a comma-delimited string of these ids in the orders table. Customers can order more than one product at a time, so the product ids are stored as a comma-delimited list.
Here's how far I've gotten with the SQL:
"SELECT o.transaction_id, o.payment_status, o.payment_amount, o.product_id, o.currency, o.payment_method, o.payment_time, u.first_name, u.last_name, u.email, p.title, p.description, p.price
FROM orders AS o
INNER JOIN products AS p ON ( NEED HELP HERE--> p.id IN o.product_id comma delimited list)
INNER JOIN users AS u ON ( o.user_id = u.id )
WHERE user_id = '39'
ORDER BY payment_time DESC
LIMIT 1";
Perhaps I could use a regex? Currently the comma-delimited list reads as '2,1,3', but its length isn't limited, so I need a condition that checks whether my product id (p.id) is in the list stored in o.product_id.

What you have is a perfect example of a one-to-many relationship, where one order has several items attached to it. You should have a link table like
order_product, which connects an orderid to a productid and where you can also store data specific to the relationship between the two (when the item was added, the quantity, etc.).
Then you join through this table and have the same field types everywhere.
A simple example:
select
/* list of products */
from
`order` o, -- ORDER is a reserved word in MySQL, so the table name is quoted
order_product op,
product p
where
o.id = 20
and o.id = op.orderid
and op.productid = p.id

This is one of those very common nightmares when working with a legacy database.
The rule is simple: never ever store multiple values in one table column. This is what first normal form requires.
But how do you deal with that in an existing DB?
The good thing™
If you have the opportunity to refactor your DB, extract the "comma separated values" into their own table. See http://sqlfiddle.com/#!2/0f547/1 for a basic example of how to do that.
Then to query the tables you will have to use a JOIN as explained in elanoism's answer.
The bad thing™
If you can't or don't want to do that, you probably have to rely on the FIND_IN_SET function.
SELECT * FROM bad WHERE FIND_IN_SET(target_value, comma_separated_values) > 0;
See http://sqlfiddle.com/#!2/29eba/2
BTW, why is this the bad thing™? Because, as you can see, it is not easy to write queries against multi-valued columns. Probably more important, you cannot use an index on such a column and, as a consequence, cannot easily perform join operations or enforce referential integrity.
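To show what the FIND_IN_SET join from the question could look like, here is a hedged sketch. SQLite has no FIND_IN_SET, so a small Python function is registered under that name as a stand-in for MySQL's built-in; in MySQL itself only the SQL text would be needed. The helper names and the sample rows are assumptions made for the demo.

```python
import sqlite3


def find_in_set(needle, csv_list):
    """Stand-in for MySQL's FIND_IN_SET: 1-based position in the list, 0 if absent."""
    if needle is None or csv_list is None:
        return None  # MySQL returns NULL when either argument is NULL
    if csv_list == "":
        return 0
    items = csv_list.split(",")
    needle = str(needle)
    return items.index(needle) + 1 if needle in items else 0


def build_csv_demo():
    """Orders keep product ids as a comma-delimited string (invented sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.create_function("FIND_IN_SET", 2, find_in_set)
    conn.executescript(
        """
        CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE orders (transaction_id TEXT, user_id INTEGER, product_id TEXT);
        INSERT INTO products VALUES (1, 'Lamp'), (2, 'Desk'), (3, 'Chair'), (12, 'Rug');
        INSERT INTO orders VALUES ('T1', 39, '2,1,3'), ('T2', 39, '12');
        """
    )
    return conn


def ordered_products(conn, user_id):
    """Join products to orders by testing membership in the comma-delimited list."""
    return conn.execute(
        """
        SELECT o.transaction_id, p.title
        FROM orders AS o
        INNER JOIN products AS p ON FIND_IN_SET(p.id, o.product_id) > 0
        WHERE o.user_id = ?
        ORDER BY o.transaction_id, FIND_IN_SET(p.id, o.product_id)
        """,
        (user_id,),
    ).fetchall()
```

Because whole list items are compared, the list '12' matches product 12 only, not products 1 or 2, which is the usual pitfall of a naive LIKE or regex test.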
The so-so thing™
As a final note, if the set of possible values is small (fewer than 65), an alternative approach would be to change the column type to a SET().

Related

MYSQL/PHP: Query with multiple joins

I am trying to get the following query to return the right information but it is returning different number of values for fields from the same table.
My data schema has 3 tables:
airlines
id|descript|userid
travelers
id|airid|ftrav|ltrav|active
travair
id|travid|airid
Here is the query:
SELECT a.id as `aid`,a.descript,userid,
group_concat(t.id) as `tids`,
group_concat(t.ftrav) as `ftravs`,
group_concat(IFNULL(t.ltrav,'')) as `ltravs`,
group_concat(t.active) as `tactives`,
ta.airid
FROM airlines `a`
LEFT JOIN travair `ta`
ON a.id = ta.airid
LEFT JOIN travelers `t`
ON ta.travid = t.id
WHERE a.userid='$userid'
GROUP BY a.id
Basically, I am trying to query the airlines table to get airlines but also pull the travelers for each of the airlines by way of the ta table which joins the two.
However, the group_concat fields all have different numbers of values in them. In the actual table, I have largely eliminated missing values, so that would not account for the differences in the number of elements. There seems to be something wrong with the query.
Can anyone spot my error? I have been struggling with this for a couple of days.
In general, the problem with aggregation and joins is that the joins produce Cartesian products for matching keys. In English, this means that you are joining along multiple dimensions and getting all combinations of different items for the same user.
What can you do? The quick-and-dirty solution is to use the distinct keyword:
SELECT a.id as `aid`,a.descript,userid,
group_concat(distinct t.id) as `tids`,
group_concat(distinct t.ftrav) as `ftravs`,
group_concat(distinct coalesce(t.ltrav, '')) as `ltravs`,
group_concat(distinct t.active) as `tactives`
A more scalable solution is to pre-aggregate each dimension to get the list along each dimension.
Note: It is possible that distinct will not work in your case, if you happen to want all lists to be the same length.
GROUP_CONCAT, like most aggregation functions, ignores NULL values. Since you are using left joins, any GROUP_CONCATs on fields from tables on the right side of those joins may have null values for some of the pre-aggregation result rows.
Edit: If you want to synthesize "results" for the lacking data, you can aggregate calculated values instead; you've actually already done so once with this bit...
group_concat(IFNULL(t.ltrav,'')) as `ltravs`
.... you can just take it a bit further (make the lack of that data a bit more obvious) with something like this:
GROUP_CONCAT(IFNULL(theField, '[Not Recorded]')) AS theList
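The effect of NULL values on the list lengths can be reproduced with a small sketch. It uses SQLite through Python's sqlite3 module, whose group_concat also skips NULLs, as a stand-in for MySQL; the schema follows the question, while the helper names and sample rows are invented.

```python
import sqlite3


def build_travel_demo():
    """One airline with three travelers, one of whom has no last name recorded."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE airlines (id INTEGER PRIMARY KEY, descript TEXT, userid INTEGER);
        CREATE TABLE travelers (id INTEGER PRIMARY KEY, ftrav TEXT, ltrav TEXT);
        CREATE TABLE travair (id INTEGER PRIMARY KEY, travid INTEGER, airid INTEGER);
        INSERT INTO airlines VALUES (1, 'Demo Air', 7);
        INSERT INTO travelers VALUES (1, 'Ann', 'Lee'), (2, 'Bob', NULL), (3, 'Cy', 'Ray');
        INSERT INTO travair VALUES (1, 1, 1), (2, 2, 1), (3, 3, 1);
        """
    )
    return conn


def list_lengths(conn, userid):
    """Return {airline id: (first names, last names, last names with placeholder)} as counts."""
    rows = conn.execute(
        """
        SELECT a.id,
               group_concat(t.ftrav),
               group_concat(t.ltrav),
               group_concat(IFNULL(t.ltrav, '[Not Recorded]'))
        FROM airlines AS a
        LEFT JOIN travair AS ta ON a.id = ta.airid
        LEFT JOIN travelers AS t ON ta.travid = t.id
        WHERE a.userid = ?
        GROUP BY a.id
        """,
        (userid,),
    ).fetchall()

    def count(value):
        # group_concat yields NULL (None) when every input was NULL
        return 0 if value is None else len(value.split(","))

    return {aid: (count(f), count(l), count(p)) for aid, f, l, p in rows}
```

With the sample data the plain ltrav list is one element shorter than the ftrav list, while the IFNULL variant keeps both lists the same length.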

Excess columns in the query result

I'm trying to write a simple interface for a list of companies using MySQL and PHP. So, I want to fetch some information from my database.
Here are my tables:
companies_data - only for system information.
corporate_data - here I want to keep information about big companies.
individual_data - and here I want to keep information about little companies.
So, here are the tables.
And here is the query that I've written:
SELECT
a.id,
a.user_id,
a.added,
a.`status`,
a.company_id,
a.company_type,
a.deposit,
a.individual_operations_cache,
a.corporate_operations_cache,
a.physical_operations_cache,
b.full_name,
b.tax_number,
b.address,
b.statement_date,
b.psrn,
c.full_name,
c.tax_number,
c.address,
c.statement_date,
c.psrn
FROM
companies_data a
LEFT OUTER JOIN corporate_data b
ON (a.company_id = b.id) AND a.company_type = 0
LEFT OUTER JOIN individual_data c
ON (a.company_id = c.id) AND a.company_type = 1
WHERE
a.user_id = 3
This is just the code for a test, I'll expand it soon.
As you see, I've got a result with extra fields like %field_name%1, %another_field_name%1 and so on. Of course this is not a MySQL error (I got what I asked for), but I'd like to remove these fields. Is that possible, or must I convert this output on the application side?
Those %field_name%1, %another_field_name%1 columns are visible because you are selecting them in your query:
b.full_name,
b.tax_number,
b.address,
b.statement_date,
b.psrn,
c.full_name,
c.tax_number,
c.address,
c.statement_date,
c.psrn
When you select fields with the same name from different tables, the result column names come with a numeric suffix (field1, field2, fieldn...) so that you can tell which table each field comes from.
If you want to avoid these names, you can use aliases as follows:
[...]
b.full_name as corporate_full_name,
[...]
Probably, if all the common fields hold the same values, you won't need to show them all, so just remove them from the select.
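The effect of aliasing can be checked by looking at the column names the driver reports. This is a rough sketch using SQLite through Python's sqlite3 module rather than MySQL; the tables are cut down to a few columns, and the individual_full_name alias and the helper name are assumptions. How duplicate names are displayed (for example with a numeric suffix) depends on the client, but the aliases give each column a unique name either way.

```python
import sqlite3

SCHEMA = """
CREATE TABLE companies_data (id INTEGER PRIMARY KEY, company_id INTEGER, company_type INTEGER);
CREATE TABLE corporate_data (id INTEGER PRIMARY KEY, full_name TEXT);
CREATE TABLE individual_data (id INTEGER PRIMARY KEY, full_name TEXT);
"""

JOINS = """
FROM companies_data a
LEFT OUTER JOIN corporate_data b ON a.company_id = b.id AND a.company_type = 0
LEFT OUTER JOIN individual_data c ON a.company_id = c.id AND a.company_type = 1
"""

# Same-named columns from two tables, without aliases.
PLAIN = "SELECT a.id, b.full_name, c.full_name" + JOINS

# Each same-named column gets its own alias.
ALIASED = (
    "SELECT a.id, b.full_name AS corporate_full_name, "
    "c.full_name AS individual_full_name" + JOINS
)


def result_column_names(sql):
    """Return the column names reported for a query run against an empty demo schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    names = [col[0] for col in conn.execute(sql).description]
    conn.close()
    return names
```

With aliases, application code can also address the values by name without ambiguity.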
Hope this is useful for you.
Br.

Using multiple inner joins

I have four tables:
users, orders, orders_product and products.
They are connected to each other by foreign key
user tables contains: id, name, email and username.
product table contains: id, product_name, product_description and product_price
orders table contains: id, u_id(foreign key).
orders_product table contains: id, product_id(foreign key), order_id(foreign key).
Now I was trying to fetch the name of a user with the total price of a particular order that he has placed.
The furthest I could get was something like this:
SELECT prod.order_id,
SUM(product_price) AS Total
FROM products
INNER JOIN
(SELECT orders.id AS order_id,
orders_product.product_id
FROM orders
INNER JOIN orders_product ON orders.id = orders_product.order_id
WHERE order_id=1) AS prod ON products.id = prod.product_id;
It showed me the total price of a particular order. Now I have two questions:
Is that query correct? It looks like a very long query. Can the same result be achieved with a shorter one?
How do I fetch the name of a user along with the total price of a particular order that he has placed?
Hi, some additions to @Gordon Linoff's answer:
Your query seems OK.
Storing the price in orders_product would bring some benefits: aggregation becomes simpler, and a later change to a product's price will not affect existing orders.
Your query is correct for one order, but it can be improved:
Don't use a subquery unless necessary. In MySQL this introduces additional overhead.
You are only looking at one order, which seems on the light side. You should remove the where clause.
You should be using a group by because you want aggregation.
You need to join in the user table to get the name.
I also added table aliases (abbreviations for table names). This makes the query a bit more readable:
SELECT u.name, SUM(p.product_price) as Total
FROM orders_product op INNER JOIN
orders o
ON o.id = op.order_id INNER JOIN
products p
ON p.id = op.product_id INNER JOIN
users u
ON o.u_id = u.id
WHERE op.order_id = 1
GROUP BY u.name;
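A small runnable sketch of this query, using SQLite through Python's sqlite3 module instead of MySQL. The schema follows the question (orders.u_id is the foreign key to users); the helper names and sample rows are invented, and prices are integers to keep the arithmetic exact.

```python
import sqlite3


def build_shop_demo():
    """Four tables as described in the question, filled with invented sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE products (id INTEGER PRIMARY KEY, product_name TEXT, product_price INTEGER);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, u_id INTEGER REFERENCES users(id));
        CREATE TABLE orders_product (
            id INTEGER PRIMARY KEY,
            product_id INTEGER REFERENCES products(id),
            order_id INTEGER REFERENCES orders(id)
        );
        INSERT INTO users VALUES (1, 'Alice'), (2, 'Bob');
        INSERT INTO products VALUES (1, 'Pen', 2), (2, 'Book', 10), (3, 'Bag', 25);
        INSERT INTO orders VALUES (1, 1), (2, 2);
        INSERT INTO orders_product VALUES (1, 1, 1), (2, 2, 1), (3, 3, 2), (4, 1, 2);
        """
    )
    return conn


def order_total(conn, order_id):
    """Return (user name, total price) for one order, or None if the order has no items."""
    return conn.execute(
        """
        SELECT u.name, SUM(p.product_price) AS total
        FROM orders_product AS op
        INNER JOIN orders AS o ON o.id = op.order_id
        INNER JOIN products AS p ON p.id = op.product_id
        INNER JOIN users AS u ON u.id = o.u_id
        WHERE op.order_id = ?
        GROUP BY u.name
        """,
        (order_id,),
    ).fetchone()
```

Dropping the WHERE clause and grouping by the order id and user name instead would give one total per order.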
Your SQL is wrong, because you want a result specific to a user, but your SQL is specific to an order and will give the result for one order. Please make it user-specific by filtering on the user name or whatever is unique.

Many to many relation, IN operator and possibility of improper results

I have a problem creating an optimal SQL query. I have a private message system where a user can send a single message to many users or groups of users. Recipients are stored in a single text column (don't ask me why; I wasn't responsible for that design) like this:
[60,63,103,68]
Additionally, I've added a new text column holding the group each user belongs to, so the groups are stored in the database as:
[55,11,11,0]
Now I want to get all users (receivers) and their groups. I have a table that holds the relation between user and group id. The problem is that a single user can belong to multiple groups; for example, user 60 can be in groups 55 and 11. I would like to do it in the most efficient way (there can be 50+ receivers stored in the column...), so I can write a query like this:
SELECT u.name, u.last_name, g.group_name
FROM
user u
LEFT JOIN
group g ON u.id = g.user_id
WHERE
u.id IN (".$users.") and
g.id IN (".$groups.")
Unfortunately the group name returned by the query might not be the right one, i.e. the one connected with the group ID I placed in WHERE. I could write a PHP foreach and fetch each user and his group using the IDs I have:
foreach($user as $key => $single)
{
$sql = "...
where u.id = $single AND g.id = $group[$key] ";
}
but I think this is a very bad way. Is there any way to get each user and the specified group in a single query?
Since users and groups are only linked by their ordinal positions in the list, you need to make use of that.
The quick and dirty method would be to unnest() in parallel:
SELECT u.name, u.last_name, g.group_name
FROM (
SELECT unnest(string_to_array('3,1,2', ',')::int[]) AS usr_id -- users
, unnest(string_to_array('10,11,12', ',')::int[]) AS grp_id -- groups
) sel
JOIN usr_grp ug USING (usr_id, grp_id)
JOIN usr u USING (usr_id)
JOIN grp g USING (grp_id);
Note how I replaced identifiers that are SQL key words, like user and group.
-> SQLfiddle
This way, elements with the same ordinal positions in the array (converted from a comma-separated list) form a row. Both arrays need to have the same number of elements or the operation will result in a Cartesian product instead. That should be the case here, according to your description. Add code to verify if that condition might be violated.
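The same pairing by ordinal position, together with the length check suggested above, can also be done on the application side before querying. A minimal Python sketch, assuming the two columns hold bracketed lists exactly like the ones shown in the question (which happen to parse as JSON arrays); the function name is made up.

```python
import json


def pair_by_position(users_text, groups_text):
    """Pair user ids with group ids by their position in the two stored lists."""
    users = json.loads(users_text)
    groups = json.loads(groups_text)
    if len(users) != len(groups):
        # Unequal lengths mean the positional link between the lists is broken.
        raise ValueError(
            f"{len(users)} users but {len(groups)} groups; cannot pair by position"
        )
    return list(zip(users, groups))
```

The resulting (user id, group id) pairs could then be bound into a single query, for instance through a row-value comparison such as (usr_id, grp_id) IN (...), instead of running one query per pair.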
Cleaner alternatives
While the above works reliably, it is a non-standard Postgres feature of SRF (set returning functions) which is frowned upon by some.
There are cleaner ways to do it. And the upcoming version 9.4 of Postgres will ship a new feature: WITH ORDINALITY, allowing for much cleaner code. This related answer demonstrates both:
PostgreSQL unnest() with element number

How to reduce the number of queries in a normalized database?

Imagine a table for articles. In addition to the main query:
SELECT * From articles WHERE article_id='$id'
We also need several other queries to get
SELECT * FROM users WHERE user_id='$author_id' // Taken from main query
SELECT tags.tag
FROM tags
INNER JOIN tag_map
ON tags.tag_id=tag_map.tag_id
WHERE article_id='$id'
and several more queries for categories, similar articles, etc
Question 1: Is it best to perform these queries separately with PHP and handle the results, or is there a way to combine them?
Question 2: In the absence of many-to-many relationships (e.g. one tag, category and author for every article, identified by tag_id, category_id and author_id), what is the best (fastest) way to retrieve data from the tables?
If all the relationships are one-to-many, you could quite easily retrieve all this data in one query such as
SELECT
[fields required]
FROM
articles a
INNER JOIN
users u ON a.author_id=u.user_id
INNER JOIN
tag_map tm ON tm.article_id=a.article_id
INNER JOIN
tags t ON t.tag_id=tm.tag_id
WHERE
a.article_id='$id'
This would usually be faster than the three separate queries, as long as your tables are indexed correctly, since MySQL is built to do this! It would save two round trips to the database and the associated overhead.
You can merge in the user in the first query:
SELECT a.*, u.*
FROM articles a
JOIN users u ON u.user_id = a.author_id
WHERE a.article_id='$id';
You could do the same with the tags, but that would introduce some redundancy in the answer, because there are obviously multiple tags per article. May or may not be beneficial.
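To make the redundancy concrete: joining the tags returns one row per tag, with the article and author columns repeated on every row, and the application then collapses those rows. A hedged sketch with SQLite through Python's sqlite3 module standing in for MySQL; the title and name columns, the helper names and the sample rows are assumptions.

```python
import sqlite3


def build_article_demo():
    """Normalized article/user/tag tables with invented sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE articles (article_id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
        CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, tag TEXT);
        CREATE TABLE tag_map (article_id INTEGER, tag_id INTEGER);
        INSERT INTO users VALUES (1, 'Ann');
        INSERT INTO articles VALUES (1, 'Joins 101', 1), (2, 'Untagged', 1);
        INSERT INTO tags VALUES (1, 'mysql'), (2, 'joins');
        INSERT INTO tag_map VALUES (1, 1), (1, 2);
        """
    )
    return conn


def article_with_tags(conn, article_id):
    """Fetch article, author and tags in one round trip, then collapse the rows."""
    rows = conn.execute(
        """
        SELECT a.title, u.name, t.tag
        FROM articles AS a
        INNER JOIN users AS u ON u.user_id = a.author_id
        LEFT JOIN tag_map AS tm ON tm.article_id = a.article_id
        LEFT JOIN tags AS t ON t.tag_id = tm.tag_id
        WHERE a.article_id = ?
        ORDER BY t.tag
        """,
        (article_id,),
    ).fetchall()
    if not rows:
        return None
    title, author, _ = rows[0]  # title and author repeat on every row
    return {
        "title": title,
        "author": author,
        "tags": [tag for _, _, tag in rows if tag is not None],
    }
```

The LEFT JOINs keep articles that have no tags at all; with INNER JOINs such an article would disappear from the result.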
In the absence of many-to-many relationships, this would do the job in one fell swoop and would be superior in any case:
SELECT *
FROM users u
JOIN articles a ON a.author_id = u.user_id
JOIN tags t USING (tag_id) -- I assume a column articles.tag_id in this case
WHERE a.article_id = '$id';
You may want to be more selective about which columns to return. If tags are not guaranteed to exist, make the second JOIN a LEFT JOIN.
You could add an appropriately denormalized view over your normalized tables where each record contains all the data you need. Or you could encapsulate the SQL calls in stored procedures and call these procs from your code, which should aid performance. Prove both out and get the hard figures; it is always better to make decisions based on evidence rather than ideas. :)
