I have a problem using WHERE in my SELECT query.
The DB table contains a field that stores comma-separated data like "1,2,3".
I don't know how to check whether the field contains 1, 2 or 3. The usual query would be
"SELECT * FROM ".$table." WHERE name = '".$val."'".
But of course this only finds entries that equal $val, so if I search for "1" it only returns the rows containing exactly "1", not "1,2,3" or "1,2". Is there a way to do that?
Thanks guys
You should fix your data structure. There are lots of good reasons to avoid storing lists of numbers as strings:
Values should be stored as their correct data types. Numbers are not strings.
SQL has pretty poor string processing functionality.
SQL has this great data structure for storing lists. It is called a table.
If the numbers refer to another table, then you cannot declare proper foreign key relationships.
Sometimes, we are stuck with other people's really bad design decisions. In those cases, MySQL offers find_in_set():
where find_in_set(1, name) > 0
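To make the matching rule concrete, here is a minimal PHP sketch of what FIND_IN_SET() computes. The helper name findInSet is hypothetical and only mimics the MySQL function for illustration; it is not a replacement for running the check in the database.

```php
<?php
// Hypothetical helper (not a built-in): mimics MySQL's FIND_IN_SET() for
// illustration. Returns the 1-based position of $needle in the
// comma-separated $list, or 0 when it is not an element of the list.
function findInSet(string $needle, string $list): int
{
    if ($list === '') {
        return 0;
    }
    $position = array_search($needle, explode(',', $list), true);
    return $position === false ? 0 : $position + 1;
}

// Whole elements match, wherever they sit in the list.
var_dump(findInSet('1', '1,2,3') > 0);    // bool(true)
var_dump(findInSet('3', '1,2,3') > 0);    // bool(true)

// A plain substring test would wrongly report a hit here, because "1"
// occurs inside "11" and "21"; element-wise matching does not.
var_dump(strpos('11,21', '1') !== false); // bool(true)
var_dump(findInSet('1', '11,21') > 0);    // bool(false)
```

This is also why a pattern like name LIKE '%1%' is not a safe substitute for find_in_set().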
I have been working on a couple of PHP/MySQL projects where all relationships are stored as comma separated strings.
For example a common relationship would be like
(in pseudocode)
table people
id - integer
name - string
age - integer
teams - string (CSV of integers, e.g. '1,3,9,21')
table teams
name - String
id - integer
Managing relationships becomes a hassle.
To get all teams for a person:
$person = 'SELECT * FROM People WHERE id= x';
Then in PHP I have been doing something like
$person['teams'] = SELECT * FROM teams WHERE id IN ($person['teams']);
As I was writing this I realized I could probably combine them in a MySQL query, something like:
SELECT
people.id,
people.name,
people.teams,
teams.name
FROM people
JOIN teams ON FIND_IN_SET(teams.id, people.teams)
WHERE people.id = x
With this type of setup I find myself using FIND_IN_SET pretty frequently.
So finally, my question is: is there a performance benefit to creating relationships like this?
In my experience so far, FIND_IN_SET has usually been doing a full table scan. If there is no performance benefit, in which instances is it beneficial to use a comma-separated list of integers? It seems that the MySQL designers had something in mind when creating FIND_IN_SET.
You're right, FIND_IN_SET() cannot make use of an index, so it causes a full table scan. Technically, that function is a bogus operation for a relational database, but no doubt there was a lot of demand for it so MySQL implemented it.
Storing data in a comma-separated list is an example of denormalization. Any departure from normalized design can give a performance boost for one type of query, but usually at the expense of all other types of queries against the same data.
For example, if you store players and their teams as a comma-separated list, it makes it very easy to get the list of teams for a given player, without doing a join. That's a performance improvement. But fetching the details for a given player's teams is much more difficult. Likewise searching for all players on a given team.
Use comma-separated lists only if that list is treated as a discrete "black box" piece of data. I.e. your application needs to fetch that list as a whole item, but never a subset of the list, and you never need to write SQL to use elements in that list for searching, joining, sorting, subtotals, etc.
See also my answer to Is storing a delimited list in a database column really that bad?
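As a rough sketch of what moving away from the comma-separated column involves, the following PHP expands each person's teams string into one (person_id, team_id) pair per membership, which is the shape a normalized junction table would hold. The sample rows, the function name toJunctionRows and the column names person_id / team_id are assumptions made for illustration.

```php
<?php
// Hypothetical rows, shaped like the question's `people` table.
$people = [
    ['id' => 1, 'teams' => '1,3,9,21'],
    ['id' => 2, 'teams' => '3'],
    ['id' => 3, 'teams' => ''],
];

// Expand each comma-separated `teams` value into one row per membership.
function toJunctionRows(array $people): array
{
    $rows = [];
    foreach ($people as $person) {
        // PREG_SPLIT_NO_EMPTY drops empty entries, so '' yields no rows.
        $teamIds = preg_split('/\s*,\s*/', trim($person['teams']), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($teamIds as $teamId) {
            $rows[] = [
                'person_id' => (int) $person['id'],
                'team_id'   => (int) $teamId,
            ];
        }
    }
    return $rows;
}

$rows = toJunctionRows($people);
echo count($rows), "\n"; // 5
```

With the data in that shape, "all players on team 3" becomes an ordinary equality lookup on team_id, which can be indexed, instead of a FIND_IN_SET() scan.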
A table scan can never be considered a benefit.
Moreover, it breaks normal form (http://en.wikipedia.org/wiki/Database_normalization), as far as I remember from school.
I think it's good practice to index all primary/foreign key columns to get a performance benefit.
The only idea I would have in such a situation is to politely ask the architect of the project what the idea behind the solution was, and to explain to him/her the performance disaster behind it :)
I wonder if it is possible to query a specific part of a comma separated string, something like the following:
$results = mysql_query("SELECT * FROM table1 WHERE $pid=table1.recordA[2] ",$con);
$pid is a number
and recordA contains data like
34,9008,606,,416,2
where I want to check the third part (606).
Thank you in advance
Having comma-separated lists, or any delimited data, within a MySQL field is frowned upon and is, to all intents and purposes, bad practice.
Rather than querying an element of a delimited list within a MySQL field, consider breaking the field out into its own table and then creating an adjacency list to create a 1:many relationship between table1 and its associated variables.
If you are committed to this route, the simplest method would be to use PHP to manage it, as MySQL has very few tools (beyond regex / text searches) to drill down to the data you want to extract. $results = explode(',',$query); would create an array of your variables from the returned field, allowing you to run as many conditional checks against it as needed.
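A minimal sketch of that PHP-side check, assuming the field value has already been fetched into a string. The helper name partEquals is hypothetical; positions are 1-based to match the question's "third part".

```php
<?php
// Hypothetical helper: does the N-th (1-based) comma-separated part of
// $csv equal $pid? Empty parts (as in "606,,416") never match.
function partEquals(string $csv, int $position, int $pid): bool
{
    $parts = explode(',', $csv);
    return isset($parts[$position - 1])
        && $parts[$position - 1] === (string) $pid;
}

$recordA = '34,9008,606,,416,2'; // sample value from the question

var_dump(partEquals($recordA, 3, 606)); // bool(true)
var_dump(partEquals($recordA, 4, 0));   // bool(false), the part is empty
var_dump(partEquals($recordA, 9, 2));   // bool(false), no ninth part
```

Note that this filters rows after they have been fetched, so every candidate row still has to travel from the database to PHP.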
However, consider adding this to your 'need to rewrite / rethink' list. A relational table structure would allow you to query the database for $pid's value directly, as it would be contained within its own field and linked
If the delimited variable list is of indeterminate length, or the relationships between the variables are hierarchical, you'd be better off searching Stack Overflow for information on Directed Acyclic Graphs in MySQL to find a better solution to the problem.
Without knowing the nature or the intended purpose for this script I can't answer in any more detail. I hope this has helped a little.
How about this:
SELECT * FROM table1 WHERE FIND_IN_SET({$pid}, recordA) = 3
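Note that FIND_IN_SET() returns the position of the first match only, and it cannot use an index on recordA, so this scans the whole table. I love normalization as much as the next guy, but sometimes breaking it up is just more trouble than it's worth ;)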
So I've got this form with an array of checkboxes to search for an event. When you create an event, you choose one or more of the checkboxes and then the event gets created with these "attributes". What is the best way to store it in a MySQL database if I want to filter results when searching for these events? Would creating several columns with boolean values be the best way? Or possibly a new table with the checkbox values only?
I'm pretty sure serializing is out of the question because I wouldn't be able to query the serialized string for whether the checkbox was ticked or not, right?
Thanks
You can use the set datatype or a separate table that you join. Either will work.
I would not do a bunch of columns though.
You can search the set easily using FIND_IN_SET(), but it's not indexed, so it depends on how many rows you expect (up to a few thousand is probably OK - it's a very fast search).
The normal solution is a separate table with one column being the ID of the event, and the second column being the attribute using the enum datatype (don't use text, it's slower).
Create separate columns, or store them all in one column using a bitmask.
One way would be to create a new table with a column for each checkbox, as already described by others. I'll not add to that.
However, another way is to use a bitmask. You have just one column, myCheckboxes, and store the values as an int. Then in the code you have constants, or another appropriate way, to store the correlation between each checkbox and its bit. For example:
CHECKBOX_ONE 1
CHECKBOX_TWO 2
CHECKBOX_THREE 4
CHECKBOX_FOUR 8
...
CHECKBOX_NINE 256
Remember to always use the next power of two for new values, otherwise you'll get values that overlap.
So, if the first two checkboxes have been checked, you should have 3 as the value of myCheckboxes for that row. If you have ONE and FOUR checked, you'd have 9 as the value of myCheckboxes, etc. When you want to see which rows have, say, checkboxes ONE, THREE and NINE checked, your query would be like:
SELECT * FROM myTable where myCheckboxes & 1 AND myCheckboxes & 4 AND myCheckboxes & 256;
This query will return only rows having all these checkboxes marked as checked.
You should also use bitwise operations when storing and reading the data.
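A minimal PHP sketch of those bitwise operations, using the constant values listed above. The helper names isChecked and checkedNames are hypothetical and exist only for this illustration.

```php
<?php
// One bit per checkbox, matching the values listed in the answer.
const CHECKBOX_ONE   = 1;   // 1 << 0
const CHECKBOX_TWO   = 2;   // 1 << 1
const CHECKBOX_THREE = 4;   // 1 << 2
const CHECKBOX_FOUR  = 8;   // 1 << 3
const CHECKBOX_NINE  = 256; // 1 << 8

// Hypothetical helper: true when every bit in $flags is set in $mask.
function isChecked(int $mask, int $flags): bool
{
    return ($mask & $flags) === $flags;
}

// Hypothetical helper: decode a stored value back into readable names.
function checkedNames(int $mask): array
{
    $all = [
        'CHECKBOX_ONE'   => CHECKBOX_ONE,
        'CHECKBOX_TWO'   => CHECKBOX_TWO,
        'CHECKBOX_THREE' => CHECKBOX_THREE,
        'CHECKBOX_FOUR'  => CHECKBOX_FOUR,
        'CHECKBOX_NINE'  => CHECKBOX_NINE,
    ];
    return array_keys(array_filter($all, function ($bit) use ($mask) {
        return ($mask & $bit) !== 0;
    }));
}

// Storing: combine the ticked boxes with bitwise OR.
$mask = CHECKBOX_ONE | CHECKBOX_THREE | CHECKBOX_NINE;
echo $mask, "\n"; // 261

// Reading: test one box, or several at once.
var_dump(isChecked($mask, CHECKBOX_THREE));               // bool(true)
var_dump(isChecked($mask, CHECKBOX_ONE | CHECKBOX_FOUR)); // bool(false)

// Unticking a box: clear its bit with AND NOT.
$mask &= ~CHECKBOX_THREE;
echo $mask, "\n"; // 257
```

A decoding helper like checkedNames() is one way to soften the readability problem when someone does need to inspect a stored value.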
This is a very efficient approach when it comes to speed. You have just a single column, probably just a smallint, and your searches are pretty fast. This can make a big difference if you have several different collections of checkboxes that you want to store and search through. However, it makes the values harder to understand. If you see the value 261 in the DB, it is not easy for a human to immediately see that it means checkboxes ONE, THREE and NINE have been checked, whereas separate columns for each checkbox are much easier to read. This normally is not an issue, because humans don't need to manually poke around in the database, but it's worth mentioning.
From the coding perspective it's not much of a difference, but you'll have to be careful not to corrupt the values, because it is far easier to mess up a single int than data stored in separate columns. So test carefully when adding new stuff. All that said, the speed and low-memory benefits can be very big if you have a ton of different collections.
I'm trying to figure out the best way to store multiple entries in a database and get them back out: explode, split, or preg_split. What I need is for a user to type IDs like "101,102,103" into a text field in a form, either to send a message to several users or to share data with several users, and for the PHP code to be smart enough to pick out each ID by splitting on the ",". I know this is asking a lot, but I need help from people more skilled in this area. I need to know how to make the PHP code grab the IDs and use them in functions, for example reading "101,102,103" from a database cell and then using those IDs to fetch other information stored in the database.
How can I achieve this? An example would be very helpful.
Thanks
If I understand your question correctly and you're dealing with comma-delimited strings of ID numbers, it would probably be simplest to keep them in this format, because you can use the string directly in your SQL statement when querying the database.
I'm assuming that you want to run a SELECT query to grab the users whose IDs have been entered, correct? You'd want to use a SELECT ... WHERE IN ... type of statement, like this:
// Get the ids the user submitted
$ids = $_POST['ids'];
// perform some sanitizing of $ids here to make sure
// you're not vulnerable to an SQL injection
$sql = "SELECT * FROM users WHERE ID IN ($ids)";
// execute your SQL statement
Alternatively, you could use explode to create an array of each individual ID, and then loop through so you could do some checking on each value to make sure it's correct, before using implode to concatenate them back together into a string that you can use in your SELECT ... WHERE IN ... statement.
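The explode / check / implode route can be sketched as follows. The helper name parseIdList is hypothetical. Instead of gluing the ids back into the SQL string, this version generates one "?" placeholder per id, on the assumption that the query will be run as a prepared statement (for example through PDO or mysqli).

```php
<?php
// Hypothetical helper: turn raw input like "101, 102,103" into a clean
// list of integer ids, discarding anything that is not purely digits.
function parseIdList(string $raw): array
{
    $ids = [];
    foreach (explode(',', $raw) as $part) {
        $part = trim($part);
        if (preg_match('/^[0-9]+$/D', $part) === 1) {
            $ids[] = (int) $part;
        }
    }
    // Drop duplicates and reindex from 0.
    return array_values(array_unique($ids));
}

// In a real form handler $raw would come from $_POST['ids'].
$raw = '101, 102,abc,103,,102';
$ids = parseIdList($raw);

if ($ids !== []) {
    // One "?" per id; the ids themselves are bound separately.
    $placeholders = implode(',', array_fill(0, count($ids), '?'));
    $sql = "SELECT * FROM users WHERE ID IN ($placeholders)";
    echo $sql, "\n"; // SELECT * FROM users WHERE ID IN (?,?,?)
}
```

The empty-list guard matters because IN () with nothing inside the parentheses is a syntax error in MySQL.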
Edit: Sorry, forgot to add: in terms of storing the list of user ids in the database, you could store the comma-delimited list as a string against a message id, but that has drawbacks (it is difficult to do JOINs against other tables if you need to). The better option would be to create a lookup-type table, which basically consists of two columns: messageid and userid. You could then store each individual userid against the messageid, e.g.
messageid | userid
1 | 1
1 | 3
1 | 5
The benefit of this approach is that you can then use this table to join other tables (maybe you have a separate message table that stores details of the message itself).
Under this method, you'd create a new entry in the message table, get the id back, then explode the userids string into its separate parts, and finally create your INSERT statement to insert the data using the individual ids and the message id. You'd need to work out other mechanisms to handle any editing of the list of userids for a message, and deletion as well.
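A minimal sketch of the last two steps (explode the userids string, then build the INSERT), assuming the new message id is already known. The table name message_recipients and the function name buildRecipientInsert are hypothetical; the column names follow the lookup table shown above. The statement uses placeholders so the values can be bound by a prepared statement.

```php
<?php
// Hypothetical helper: build one multi-row INSERT for the lookup table.
// Returns the SQL text and the flat list of values to bind to it.
function buildRecipientInsert(int $messageId, array $userIds): array
{
    $rowPlaceholders = [];
    $params = [];
    foreach ($userIds as $userId) {
        $rowPlaceholders[] = '(?, ?)';
        $params[] = $messageId;
        $params[] = (int) $userId;
    }
    $sql = 'INSERT INTO message_recipients (messageid, userid) VALUES '
         . implode(', ', $rowPlaceholders);
    return [$sql, $params];
}

// Message 1 sent to users 1, 3 and 5, as in the table above.
list($sql, $params) = buildRecipientInsert(1, explode(',', '1,3,5'));

echo $sql, "\n";
// INSERT INTO message_recipients (messageid, userid) VALUES (?, ?), (?, ?), (?, ?)
echo implode(',', $params), "\n"; // 1,1,1,3,1,5
```

Editing a message's recipient list can then be handled as deleting and re-inserting that message's rows, though that is only one of several possible mechanisms.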
Hope that made sense!
Well, considering the three functions you suggested:
explode() will work fine if you have a simple pattern that's always the same.
For instance, always ', ', but never ','
split() uses POSIX regex, which is deprecated, and should not be used anymore (the function itself was removed in PHP 7.0).
preg_split() uses a regex as its pattern, and so will handle more situations than explode().
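To illustrate the difference, here is a small sketch with an input that mixes ", " and "," as separators. The sample string is made up for the example.

```php
<?php
// Input with inconsistent separators: ", " once and "," once.
$raw = '101, 102,103';

// explode() only splits on the exact delimiter it is given.
$viaExplode = explode(', ', $raw);
print_r($viaExplode);   // two elements: "101" and "102,103"

// preg_split() can treat "comma with optional surrounding spaces" as one
// separator, so both variants are handled.
$viaPregSplit = preg_split('/\s*,\s*/', $raw);
print_r($viaPregSplit); // three elements: "101", "102", "103"
```

If the input format is guaranteed to be consistent, explode() is the simpler choice; a common middle ground is explode(',') followed by trim() on each part.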
Then: do not store several values in a single database column; it'll be impossible to do any kind of useful work with that!
Create a separate table to store that data, with a single value per row, so that several rows correspond to one row in the first table.
I think your problem is more with SQL than with PHP.
Technically you could store the ids in a single MySQL field, in a 'set' field, and query against it by using IN or FIND_IN_SET in your conditions. The lookups are actually super fast, but this is not considered best practice and creates a denormalized database.
What is best practice, and normalized, is to create separate relationship tables. So, using your example of messages, you would probably have a 'users' table, a 'messages' table, and a 'users_messages' table for relating messages between users. The 'messages' table would contain the message information and maybe a 'user_id' field for the original sender (since there can only be one), and the 'users_messages' table would simply contain a 'user_id' and a 'message_id' field, with rows linking messages to the various users they belong to. Then you just need to use JOIN queries to retrieve the data, so if you were retrieving a user's inbox, a query would look something like this:
SELECT
messages.*
FROM
messages
LEFT JOIN users_messages ON users_messages.message_id = messages.message_id
WHERE
users_messages.user_id = '(some user id)'
Say I have an array of strings in a php array called $foo with a few hundred entries, and I have a MySQL table 'people' that has a field named 'name' with a few thousand entries. What is an efficient way to find out which strings in $foo aren't a 'name' in an entry in 'people' without submitting a query for every string in $foo?
So I want to find out what strings in $foo have not already been entered in 'people.'
Note that all of the data will clearly have to be on one box at some point. The goal is to do this while minimizing both the number of queries and the amount of PHP processing.
I'd put your $foo data in another table and do a LEFT OUTER JOIN with your names table. Otherwise, there aren't a lot of great ways to do this that don't involve iteration at some point.
The best I can come up with without using a temporary table is:
// Quote each name so the list is valid inside IN (...)
$list = "'" . join("','", $foo) . "'";
// fetch all rows of the result of
// "SELECT name FROM people WHERE name IN($list)"
// into an array $result
$missing_names = array_diff($foo, $result);
Note that if $foo contains user input it would have to be escaped first.
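The array_diff() step can be sketched on its own, with made-up data standing in for $foo and for the names the query returned. One thing to watch: MySQL's default collations compare names case-insensitively, while array_diff() is case-sensitive, so the sketch also shows array_udiff() with strcasecmp as one possible way to line the two up.

```php
<?php
// Hypothetical data: the candidate names, and the names the
// "SELECT name FROM people WHERE name IN (...)" query returned.
$foo    = ['Alice', 'bob', 'carol'];
$result = ['alice', 'Bob'];

// Case-sensitive comparison: "Alice" and "alice" count as different,
// so nothing is recognised as already present.
$missingStrict = array_values(array_diff($foo, $result));
print_r($missingStrict); // Alice, bob, carol

// Case-insensitive comparison, closer to how a *_ci collation matches.
$missing = array_values(array_udiff($foo, $result, 'strcasecmp'));
print_r($missing); // carol
```

array_values() is only there to reindex the result from 0, since both diff functions preserve the original keys.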
What about the following:
Get the list of names that are already in the db, using something like:
SELECT name FROM people WHERE name IN (imploded list of names)
Insert each item returned by array_diff() (the input names minus the names the query returned).
If you want to do it completely in SQL:
Create a temp table with every name in the PHP array.
Perform a query to populate a second temp table that will only include the new names.
Do an INSERT ... SELECT from the second temp table into the people table.
Neither will be terribly fast, although the second option might be slightly faster.
CREATE TEMPORARY TABLE PhpArray (name varchar(50));
-- you can probably do this more efficiently
INSERT INTO PhpArray VALUES ('$foo[0]'), ('$foo[1]'), ...;
-- names from the PHP array that have no matching row in People
SELECT PhpArray.name
FROM PhpArray
LEFT OUTER JOIN People USING (name)
WHERE People.name IS NULL;
For a few hundred entries, just use array_diff() or array_diff_assoc()
$query = "SELECT name FROM table WHERE name != '" . implode("' AND name != '", $foo) . "'";
Yeash, that doesn't look like it would scale well at all.
I'm not sure there is a more efficient way to do this other than to submit all the strings to the database.
Basically there are two options: get a list of all the strings in MySQL and pull them into PHP and do the comparisons, or send the list of all the strings to the MySQL server and let it do the comparisons. MySQL is going to do the comparisons much faster than PHP, unless the list in the database is a great deal smaller than the list in PHP.
You can create a temporary table or send the list inline, but either way you're pushing all the data to the database.