PHP, find exact value from mysql arranged data - php

I have data in a MySQL column like this:
T_interest
1,14,49,145,203,302
Each value represents a personal interest keyword.
I tried to extract the values and check, for each checkbox, whether the user has that value or not.
if(strstr($u_interest['u_interest'], ','.$row['i_idx'])):
$selected = 'checked';
That is the PHP code I use right now,
but it doesn't match the exact value from the database.
Let's say I want to check whether the data contains the number 14 in this user table:
T_interest
1,14,49,145,203,302
If I use the code above, it tells me that the user has two values:
14,145
It looks like strstr() matches both values because both of them contain 14.
So, can you help me understand why this is happening?
If you want more PHP lines I can post them.

Explode it into an array; this way you get the individual values.
$interests = explode(",", $u_interest["u_interest"]);
Then you can test for exact equality much more easily, for example with in_array().
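A minimal sketch of that approach, assuming the column value arrives as a plain comma-separated string. The helper name hasInterest and the sample data are made up for illustration; they are not from the original code.

```php
<?php
// Hypothetical sample row standing in for the database result.
$u_interest = ['u_interest' => '1,14,49,145,203,302'];

// Returns true only if $id appears as a whole value in the list.
function hasInterest(string $csv, int $id): bool
{
    // Split the list and compare complete values, not substrings.
    $values = array_map('trim', explode(',', $csv));
    return in_array((string) $id, $values, true);
}

foreach ([14, 145, 4] as $i_idx) {
    $selected = hasInterest($u_interest['u_interest'], $i_idx) ? 'checked' : '';
    echo $i_idx, ': ', $selected, PHP_EOL;
}
```

Because whole values are compared, 14 no longer matches 145, and 4 does not match 14 or 49.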

You really need to read up on database normalization. A properly normalized design would make this problem moot.
As for your question, since you're forced to use string operations, you'll have to check for multiple different cases:
1) the number you want is at the START of the string
2) the number you want is at the END of the string
3) the number you want is the ONLY number in the string
4) the number you want is in the MIDDLE of the string:
SELECT ...
WHERE
(T_Interest = '14') OR -- only number
(T_Interest LIKE '14,%') OR -- at the beginning
(T_Interest LIKE '%,14') OR -- at the end
(T_Interest LIKE '%,14,%') -- in the middle

Related

Searching Large Mysql Database For Exact Date Within Row ID

I have a large MySQL database, about 10 GB in size. One of the tables in the database is called
clients
In that table there is a column named
case
The date this client is created is mixed into the number within this column.
Here is an example of an entry in case
011706-0001
The 06 part means this client was created in 2006. I need to pull all the clients that were created in 2015 and 2016, so I need to query for any case value that has a 15 or 16 before the dash.
For example, 000015-0000 or 000016-0000
Is there a way to do this using only MySQL? My thought was that I would have to query the whole column and then use preg_match() in PHP.
I am worried that, given the size of the database, this would cause problems.
To locate rows that have a case column value that contains '06-' (the characters 0 and 6 followed by a dash ...
One option is to use a LIKE comparison operator:
SELECT ...
FROM clients t
WHERE t.case LIKE '%06-%'
ORDER BY ...
The percent sign characters are wildcards in the LIKE comparison, which match any number of characters (zero, one or more.)
MySQL will need to evaluate that condition for every row in the table. MySQL can't make use of an index range scan operation with that.
SELECT ...
FROM clients t
WHERE t.case LIKE '%15-%'
OR t.case LIKE '%16-%'
ORDER BY ...
That will evaluate to true for any values that include the sequence of three characters '15-' or '16-'.
If there's a more standard format for the values in the case column, where the value always starts with exactly six characters representing the date ('mmddyy-nnnnn') and you only want to match the 5th through 7th characters, you could use the underscore wildcard, which matches any single character in a LIKE comparison. For example, using four underscores:
t.case LIKE '____16-%'
Or you could use a SUBSTR function to extract the three characters from the case value, and perform an equality comparison...
SUBSTR(t.case,5,3) = '15-'
SUBSTR(t.case,5,3) IN ('15-','16-')
It's also possible to make use of a REGEXP comparison in place of the LIKE comparison.
In terms of performance, all of the above approaches are going to need to crank through every row in the table, to evaluate the comparison condition.
If that date value was stored as a separate column, as a DATE datatype, and there was an index with that as the leading column, then MySQL could make effective use of a range scan operation, for a query like this...
WHERE t.casedate >= '2015-01-01'
AND t.casedate < '2017-01-01'

php : speed up levenshtein comparing, 10k + records

In my MySQL table I have the field name, which is unique. However, the contents of the field are gathered from different places, so it is possible that, due to spelling errors, I have 2 records with very similar names instead of the second one being discarded.
Now I want to find those entries that are very similar to another one. For that I loop through all my records, and compare the name to other entries by looping through all the records again. Problem is that there are over 15k records which takes way too much time. Is there a way to do this faster?
this is my code:
for($x=0;$x<count($serie1);$x++)
{
    for($y=0;$y<count($serie2);$y++)
    {
        $sim=levenshtein($serie1[$x]['naam'],$serie2[$y]['naam']);
        if($sim==1)
            print("{$serie1[$x]['naam']} --> {$serie2[$y]['naam']} = {$sim}<br>");
    }
}
A preamble: such a task will always be time consuming, and there will always be some pairs that slip through.
Nevertheless, a few ideas:
1. actually, the algorithm can be (a bit) improved
Assuming that $serie1 and $serie2 have the same values in the same order, you don't need to loop over the whole second array in the inner loop every time. In this use case you only need to evaluate each pair of values once: levenshtein('a', 'b') is sufficient, you don't need levenshtein('b', 'a') as well (and neither do you need levenshtein('a', 'a')).
Under these assumptions, you can write your loop like this:
for($x=0;$x<count($serie1);$x++)
{
    for($y=$x+1;$y<count($serie2);$y++) // <-- $y doesn't need to start at 0
    {
        $sim=levenshtein($serie1[$x]['naam'],$serie2[$y]['naam']);
        if($sim==1)
            print("{$serie1[$x]['naam']} --> {$serie2[$y]['naam']} = {$sim}<br>");
    }
}
2. maybe MySQL is faster
There are examples on the net of levenshtein() implementations as a MySQL function. An example on SO is here: How to add levenshtein function in mysql?
If you are comfortable with complex(ish) SQL, you could delegate the heavy lifting to MySQL and at least gain a bit of performance because you aren't fetching all 15k+ rows into the PHP runtime.
3. don't do everything at once / save your results
Of course you have to run the function once for every record, but after the initial run you only have to check entries that are new since the last run. Schedule a cron job that checks all new records once every day/week/month. You would need an inserted_at column in your table and would still need to compare the new names with every other name entry.
3.5 do some of the work on insert
a) If the wait is acceptable, do a check whenever a new record is about to be inserted, so that you either write it to a log or give direct feedback to the user. (A tangent: this could be a good use case for an asynchronous task queue like http://gearman.org/ -> start a new process for the check in the background, return with the success message for the insert immediately.)
b) PHP has two other functions to help with searching for similar strings: metaphone() and soundex(). These functions generate keys that represent how a string sounds when spoken. You could generate (one or both of) these keys on each insert, store them in a separate field in your table and use simple SQL comparisons to find records with matching keys.
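As a rough sketch of idea 3.5 b), the names can be bucketed by their soundex() key so that levenshtein() only runs inside each bucket instead of across all records. The function name findNearDuplicates and the sample names are assumptions made for illustration. Misspellings that change the phonetic key end up in different buckets and are missed, so this trades completeness for speed.

```php
<?php
// Hypothetical helper: group names by soundex() key and compare
// only names that share a key, each pair once.
function findNearDuplicates(array $names, int $maxDistance = 1): array
{
    $buckets = [];
    foreach ($names as $name) {
        $buckets[soundex($name)][] = $name;
    }

    $pairs = [];
    foreach ($buckets as $bucket) {
        $count = count($bucket);
        for ($x = 0; $x < $count; $x++) {
            for ($y = $x + 1; $y < $count; $y++) {
                $dist = levenshtein($bucket[$x], $bucket[$y]);
                // Skip identical strings, keep close matches.
                if ($dist > 0 && $dist <= $maxDistance) {
                    $pairs[] = [$bucket[$x], $bucket[$y], $dist];
                }
            }
        }
    }
    return $pairs;
}

$pairs = findNearDuplicates(['Smith', 'Smyth', 'Jones', 'Miller']);
foreach ($pairs as [$a, $b, $dist]) {
    echo "$a --> $b = $dist", PHP_EOL;
}
```

With the key stored in its own indexed column, the same bucketing could be done by the database, and only the rows sharing a key would need to be fetched.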
The trouble with levenshtein is it only compares string a to string b. I built a spelling corrector once that puts all the strings a into a big trie, and that functioned as a dictionary. Then it would look up any string b in that dictionary, finding all nearest-matching words. I did it first in Fortran (!), then in Pascal. It would be easiest in a more modern language, but I suspect php would not make it easy. Look here.

Is it safe to search through serialized array as a string? (MySql query)

I have a serialized array like this
a:6:{i:0;i:6;i:1;i:65;i:2;i:56;i:3;i:87;i:4;i:48;i:5;i:528;}
For example, I want to make a MySQL query like this
$id_serialize = 6;
"SELECT id FROM table WHERE col LIKE '% i:" . $id_serialize . "; %'"
Is it possible to get a conflict (for example, when the numbers are repeated) as a consequence of this query?
Is there another, more effective and correct way to find a number in the array without unserializing it and without looping?
It depends on the data you are going to store. For integers a conflict is quite likely.
a:6:{i:0;i:6;i:1;i:65;i:2;i:56;i:3;i:87;i:4;i:48;i:5;i:528;}
This actually means:
a:6:{...} - array of 6 elements
i:0;i:6; - first element, key 0, value 6
i:1;i:65; - second element, key 1, value 65
and so on
If you get to an array of 7 elements, the 7th element's definition would be i:6;i:34; and this would collide with i:0;i:6;. Your query would return results with key 6 along with results with value 6.
A bit more about array anatomy: http://www.php.net/manual/pl/function.serialize.php#66147
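The key/value collision described above can be reproduced in a few lines. This is only an illustration with made-up values: the array contains no value 6, but its seventh element has the key 6, so a substring search still reports a hit.

```php
<?php
// Seven values, none of which is 6; the last element gets the key 6.
$serialized = serialize([10, 20, 30, 40, 50, 60, 70]);

// Substring search, comparable to what a LIKE '%i:6;%' query does.
$substringHit = strpos($serialized, 'i:6;') !== false;

// Exact check on the unserialized values.
$realHit = in_array(6, unserialize($serialized), true);

var_dump($substringHit, $realHit);
```

The substring search finds the key of the last element, while the exact check correctly reports that 6 is not among the values.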
a:1:{i:0;s:5:"i:42;";}
Oops.
It's extremely hard to search within data formats which allow arbitrary content. It's the same reason why regexen are simply unsuited for (X|HT)ML. You should really be normalising the data and storing each value in its own column/row.
If you are certain of the contents of the array - that is, if you know that all the items in the array are numbers - you should be able to use your method without too much trouble. If anything else makes it into the array you may start getting false results.

PHP: Compare two sets of numbers, no dupes

I'm creating a lottery contest for my site, and I need to know the easiest way to compare numbers, so that no two people can choose the same numbers. It's 7 sets of numbers, each number is a number between 1 and 30.
For example, if user 1 chooses: 1, 7, 9, 17, 22, 25, 29 how can I make sure that user 2 can't choose exactly the same numbers?
I was thinking about throwing all 7 numbers into an array, sort it so the numbers are in order, then join them into one string. Then when another user chooses their 7 numbers, it does the same, then compares the two. Is there a better way of doing it?
What you describe sounds like the best way to me, IF you are dealing with all submissions in the same script - I would trim(implode(',',$array)) the sorted array, store the resulting string in an array and call in_array() to determine whether the value already exists.
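A minimal sketch of the sort, implode and in_array() approach, assuming all picks are handled in one script. The function names pickKey and submitPick are hypothetical.

```php
<?php
// Canonical string for a pick: the same numbers in any order give the same key.
function pickKey(array $numbers): string
{
    sort($numbers, SORT_NUMERIC);
    return implode(',', $numbers);
}

// Accepts the pick and records it unless the same set was already taken.
function submitPick(array &$taken, array $numbers): bool
{
    $key = pickKey($numbers);
    if (in_array($key, $taken, true)) {
        return false; // someone already chose exactly these numbers
    }
    $taken[] = $key;
    return true;
}

$taken  = [];
$first  = submitPick($taken, [1, 7, 9, 17, 22, 25, 29]);
$second = submitPick($taken, [29, 25, 22, 17, 9, 7, 1]); // same set, other order
$third  = submitPick($taken, [2, 7, 9, 17, 22, 25, 29]);
var_dump($first, $second, $third);
```

Sorting before joining is what makes the comparison independent of the order in which a user entered the numbers.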
HOWEVER I suspect that what you are actually doing is storing the selections in a database table and comparing later submissions against this table. In this case (I am taking a liberty and assuming MySQL here, but I would say it is the most common engine used with PHP) you should create a table with 7 columns choice_1, choice_2 ... choice_7 (along with whatever other columns you want) and create a unique index across all seven choice_* columns. Store the numbers in sorted order so that the same set always produces the same row. This means that when you try to insert a duplicate row, the query will fail. This lets MySQL do all the work for you.
Try array_diff. There are some really good examples on php.net.

I want to scramble an array in PHP

I want PHP to randomly create a multi-dimensional array by picking a number of items out of predefined lists n times, but never the same combination twice.
Let me put that into human words with a real-life example: I want to write a list of vegetables and meat, and I want PHP to make a menu for me, with something different every day than the day before.
I tried, and I got the scrambling to work, but there were always duplicates.
Try the shuffle function http://us2.php.net/manual/en/function.shuffle.php
Use either array_rand() or shuffle().
Random != unique
You need to either:
a) create a list containing every possible combination and then randomly select and remove one
or
b) store your results so your random selection can be compared to previous selections.
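Option a) could look roughly like this. The ingredient lists and the number of days are made-up sample data; the idea is to enumerate every combination once, shuffle the list, and then take entries off it, so no menu can repeat.

```php
<?php
// Hypothetical ingredient lists.
$vegetables = ['carrots', 'spinach', 'peas'];
$meats = ['chicken', 'beef'];

// List every possible combination exactly once.
$menus = [];
foreach ($vegetables as $vegetable) {
    foreach ($meats as $meat) {
        $menus[] = [$vegetable, $meat];
    }
}

// Shuffle the list, then take one entry per day.
shuffle($menus);
$days = 4;
$plan = array_slice($menus, 0, $days);

foreach ($plan as $day => [$vegetable, $meat]) {
    echo 'Day ', $day + 1, ': ', $vegetable, ' with ', $meat, PHP_EOL;
}
```

The plan can never be longer than the number of possible combinations (six here); once the list is used up, repeats are unavoidable.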
