I have a phone records table:
phones(id, number);
It may contain values such as:
1, 9801234567
2, 980 1234568
3, 9779801234569
4, 9801234570
5, 977 980 1234 571
If someone searches for 980 1234567 (with spaces), I can remove the spaces before running the query to get the result.
But my problem is when someone searches for 9779801234571. Since the stored numbers follow no regular format, the search must still return the last record, i.e. 977 980 1234 571. Any idea how to do this efficiently?
Here is a way to do this:
where replace(phonenumber, ' ', '') = replace($phonenumber, ' ', '')
Doing this efficiently is another matter. For that, you would have to format your phone numbers in the database in a canonical format -- say by removing all the spaces in an update statement. Then put the number you are searching for in the same format. The query can then use an index on the column.
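The canonical-format idea can be sketched as follows. This is a minimal illustration in Python, using the standard-library sqlite3 module as a stand-in for MySQL; the number_canonical column, the index name, and the helper functions are assumptions made for the example, not part of the original schema.

```python
import re
import sqlite3


def canonical_phone(raw: str) -> str:
    """Keep digits only, so '977 980 1234 571' becomes '9779801234571'."""
    return re.sub(r"\D", "", raw)


conn = sqlite3.connect(":memory:")
# number_canonical is a hypothetical extra column holding the cleaned value.
conn.execute(
    "CREATE TABLE phones (id INTEGER PRIMARY KEY, number TEXT, number_canonical TEXT)"
)
conn.execute("CREATE INDEX idx_phones_canonical ON phones (number_canonical)")

rows = [
    (1, "9801234567"),
    (2, "980 1234568"),
    (3, "9779801234569"),
    (4, "9801234570"),
    (5, "977 980 1234 571"),
]
conn.executemany(
    "INSERT INTO phones VALUES (?, ?, ?)",
    [(i, n, canonical_phone(n)) for i, n in rows],
)


def find_phone(conn, search: str):
    """Canonicalize the search term, then do a plain equality lookup."""
    cur = conn.execute(
        "SELECT id, number FROM phones WHERE number_canonical = ?",
        (canonical_phone(search),),
    )
    return cur.fetchall()
```

Because the comparison is a plain equality on an indexed column, the database does not have to call REPLACE() on every row at query time.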
It is probably best to strip whitespace from the phone numbers before you write them into the database. You can easily do this with the following call:
$string = preg_replace('/\s+/', '', $string);
Maybe you also have to strip out other characters like - or /.
Then you can use a simple WHERE condition without bells and whistles. This will also significantly improve the performance of your SELECT statement since you don't have to do conversions of your data in order to find the right row.
This assumes that you fill the database yourself, of course. If that's not the case, ignore this advice.
I have 3 distinct lists of strings. The first contains names of people (from 10 to 80 characters long). The second contains room numbers (903, 231 and so on). The last contains group numbers (ABCD-1312, CXVZ-123).
I have a query which is given by a user. First, I tried to search using Levenshtein distance. It didn't work, because whenever the user types 3 characters, it returns some room number, even though there is no digit in the query. Then I tried similar_text(). It worked better, but because the names all have different lengths, it mostly returns the shorter names.
Now, the best I have come up with is using similar_text() and str_pad() to make each string the same length. It still doesn't work properly.
I want to somehow give extra weight to strings that have several matching characters in a row, or that start with the same letter as the query, and so on.
$search_min_heap = new SearchMinHeap();
$query = strtolower($query); // similar_text is case sensitive, so make everything lowercase
foreach ($res["result"] as &$item) {
    similar_text($query, str_pad(strtolower($item["name_en"]), 100, " "), $cur_distance_en);
    similar_text($query, str_pad(strtolower($item["name_ru"]), 100, " "), $cur_distance_ru);
    similar_text($query, str_pad(strtolower($item["name_kk"]), 100, " "), $cur_distance_kk);
    $cur_max_distance = max($cur_distance_en, $cur_distance_ru, $cur_distance_kk);
    $item["matching"] = $cur_max_distance;
    $search_min_heap->insert($item);
}
$first_elements = $search_min_heap->getFirstElements($count);
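One way to weight consecutive matches and a shared first letter is sketched below. It is written in Python with the standard-library difflib module rather than PHP's similar_text(), so it is a swapped-in technique, not the asker's code. The function name weighted_score and the 0.5 weights are arbitrary assumptions chosen for illustration. Scores are normalized by the query length, not the candidate length, so longer names are not penalized.

```python
from difflib import SequenceMatcher


def weighted_score(query: str, candidate: str) -> float:
    """Score a candidate against a query; higher means a better match.

    The weights (0.5) are illustrative assumptions, not tuned values.
    """
    q = query.lower().strip()
    c = candidate.lower().strip()
    if not q or not c:
        return 0.0

    matcher = SequenceMatcher(None, q, c, autojunk=False)

    # Longest run of consecutive matching characters, as a share of the query.
    longest = matcher.find_longest_match(0, len(q), 0, len(c)).size
    run_score = longest / len(q)

    # Share of query characters found in the candidate, in order.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    coverage = matched / len(q)

    # Bonus when the candidate starts with the same character as the query.
    prefix_bonus = 0.5 if c.startswith(q[0]) else 0.0

    return run_score + 0.5 * coverage + prefix_bonus
```

With this scoring, a query of "ali" ranks "Alice Brown" above "Natalia Green", because both contain the run "ali" but only the first starts with the same letter. A query with no characters in common with a candidate scores 0, so a letters-only query does not pull in room numbers.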
I am searching welds.welder_id and welds.bal_welder_id, which hold lists of unique welder IDs that the users separate with spaces.
The record set looks like 99,199,99 w259,w259 259 5-a
99, 199, 259, 5-a and w259 are unique welder ID numbers.
I cannot use the MySQL INSTR() function by itself, as a search for "99" will pull up records with "199".
Users on each project format their welder IDs a different way (000, a000, 0aa), usually to match their customer's records.
I really want to avoid using PHP code for a number of reasons.
To select records with "w259" in the welder_id OR in the bal_welder_id columns, my query looks like this:
SELECT * FROM `welds`
WHERE `omit`=0
AND( (`welder_id`='w259' OR `bal_welder_id`='w259')
OR (`welder_id` LIKE 'w259 %' OR `bal_welder_id` LIKE 'w259 %')
OR (`welder_id` LIKE '% w259' OR `bal_welder_id` LIKE '% w259')
OR (INSTR(`welder_id`, ' w259 ') > 0 OR INSTR(`bal_welder_id`,' w259 ') > 0))
ORDER BY `date_welded` DESC
LIMIT 100;
It works but it takes 0.0030 seconds with 1300 test records on my workstation's SSD.
The actual DB will have hundreds of thousands after a year or two.
Is there a better way?
Thanks.
If I understand your question correctly, one option is to use the FIND_IN_SET(str, strlist) string function, which returns the position of the string str in the comma-separated string list strlist. For example:
SELECT FIND_IN_SET('b','a,b,c,d');
will return 2. Since your string is not separated by commas, but by spaces, you could use REPLACE() to replace spaces with commas. Your query can be like this:
SELECT * FROM `welds`
WHERE
`omit`=0
AND
(FIND_IN_SET('w259', REPLACE(welder_id, ' ', ','))>0
OR
FIND_IN_SET('w259', REPLACE(bal_welder_id, ' ', ','))>0)
The optimizer, however, cannot do much, since FIND_IN_SET cannot make use of an index even if one is present. I would suggest normalizing your table, if that is possible.
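A normalized layout could look like the sketch below, written in Python with the standard-library sqlite3 module standing in for MySQL. The weld_welders table, its role column, the index name, and the sample rows are all hypothetical; only welds, omit, and date_welded come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE welds (id INTEGER PRIMARY KEY, omit INTEGER, date_welded TEXT);
-- Hypothetical junction table: one row per (weld, welder ID) pair.
CREATE TABLE weld_welders (
    weld_id   INTEGER REFERENCES welds(id),
    welder_id TEXT,
    role      TEXT  -- 'welder' or 'bal_welder'
);
CREATE INDEX idx_weld_welders_id ON weld_welders (welder_id);
""")


def add_weld(conn, weld_id, date_welded, welder_ids, bal_welder_ids, omit=0):
    """Split the space-separated ID lists into one row per welder ID."""
    conn.execute("INSERT INTO welds VALUES (?, ?, ?)", (weld_id, omit, date_welded))
    pairs = [(weld_id, w, "welder") for w in welder_ids.split()]
    pairs += [(weld_id, w, "bal_welder") for w in bal_welder_ids.split()]
    conn.executemany("INSERT INTO weld_welders VALUES (?, ?, ?)", pairs)


def welds_for(conn, welder_id, limit=100):
    """Return weld IDs where welder_id appears in either role, newest first."""
    cur = conn.execute(
        """
        SELECT DISTINCT w.id, w.date_welded
        FROM welds w
        JOIN weld_welders ww ON ww.weld_id = w.id
        WHERE w.omit = 0 AND ww.welder_id = ?
        ORDER BY w.date_welded DESC
        LIMIT ?
        """,
        (welder_id, limit),
    )
    return [row[0] for row in cur.fetchall()]


# Made-up sample data for illustration.
add_weld(conn, 1, "2015-01-01", "99 w259", "199")
add_weld(conn, 2, "2015-01-02", "199", "5-a")
add_weld(conn, 3, "2015-01-03", "w259 259 5-a", "99")
```

Here a search for "99" is an exact comparison on an indexed column, so it cannot pick up "199", and no string functions run per row.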
I am running the following SQL statement from a PHP script:
SELECT PHONE, COALESCE(PREFERREDNAME, POPULARNAME) FROM distilled_contacts WHERE PHONE LIKE :phone LIMIT 6
As is obvious, the statement returns the first 6 matches from the table in question. The value I'm binding to the :phone parameter looks something like this:
$search = '%'.$search.'%';
Here, $search can be any string of digits. The wildcard characters ensure that a search on, say, 918 returns every record where the PHONE field contains 918:
9180078961
9879189872
0098976918
918
...
My problem is what happens if there does exist an entry with the value that matches the search string exactly, in this case 918 (the 4th item in the list above). Since there's a LIMIT 6, only the first 6 entries would be retrieved which may or may not contain the one with the exact match. Is there a way to ensure the results always contain the record with the exact match, on top of the resulting list, should one be available?
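You could use an ORDER BY to ensure the exact match is always on top. Because :phone is bound to the wildcard pattern, compare against the raw search string bound as a separate parameter (called :exact here purely for illustration):
ORDER BY CASE WHEN PHONE = :exact THEN 1 ELSE 2 END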
Using $search = $search.'%' instead will return only results that start with the search value.
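The exact-match-first ordering can be tried out with the sketch below, which uses Python's standard-library sqlite3 module as a stand-in for MySQL and PDO. The parameter names pattern, exact, and limit, and the extra sample rows, are assumptions for the example. The raw search string is bound separately from the wildcard pattern so that the equality test in the CASE can succeed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE distilled_contacts (PHONE TEXT)")

# Eight rows contain '918'; the exact match is inserted last on purpose.
phones = [
    "9180078961", "9879189872", "0098976918", "1918000000",
    "2918000000", "3918000000", "4918000000", "918",
]
conn.executemany(
    "INSERT INTO distilled_contacts VALUES (?)", [(p,) for p in phones]
)


def search_phones(conn, search: str, limit: int = 6):
    """Substring search that sorts an exact match (if any) to the top."""
    cur = conn.execute(
        """
        SELECT PHONE
        FROM distilled_contacts
        WHERE PHONE LIKE :pattern
        ORDER BY CASE WHEN PHONE = :exact THEN 1 ELSE 2 END
        LIMIT :limit
        """,
        {"pattern": f"%{search}%", "exact": search, "limit": limit},
    )
    return [row[0] for row in cur.fetchall()]
```

Without the ORDER BY, the row "918" could fall outside the first six results; with it, the row is sorted first before the LIMIT is applied.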
Evening All.
I have a mysql database for a property website. There is a search form where people can enter a location or postcode in the same field.
Part of the SQL is
PostCode LIKE '$Loc%'
But my problem is that some people enter a postcode like this: "l236yt", and some with a space, like this: "l23 6yt".
The database contains the postcodes with the space in them, so how can I make it work with or without the space?
Any help will be greatly appreciated
thanks baz
Assuming the values in your database are stored without spaces, just strip the spaces from the user's value:
$val = str_replace(' ', '', $val);
You could check whether the input contains a space and, if it does not, insert one. The space always comes before the last three characters: after the third character in a 6-character postcode, after the fourth in a 7-character postcode, and after the second in a 5-character postcode such as A1 1AA. A simple regex can do this.
I can't help with precise code due to my lack of knowledge of the language, but good luck!
This function expects a valid postcode as input (with or without spaces), breaks it down into its constituent parts, and returns it neatly formatted with a space in the appropriate place:
function parsePostcode($postcode) {
    // Remove all whitespace and normalise to upper case
    $postcode = preg_replace('/\s+/', '', strtoupper($postcode));
    // Everything except the last two characters
    $sector = substr($postcode, 0, -2);
    // Everything except the last three characters
    $outcode = $district = substr($sector, 0, -1);
    // Leading letters of the district
    list($area) = sscanf($district, '%[A-Z]');
    // The last three characters
    $incode = substr($postcode, -3);
    return array(
        'postcode'  => $postcode,
        'formatted' => $outcode.' '.$incode,
        'area'      => $area,
        'district'  => $district,
        'sector'    => $sector,
        'outcode'   => $outcode,
        'incode'    => $incode,
    );
}
Your first priority should be to replace the LIKE condition in your query with an exact match. A LIKE pattern with a leading wildcard forces MySQL to evaluate every row in the table. A prefix pattern such as '$Loc%' can use an index, but it still cannot match input whose spacing differs from the stored value. Once the input is in a known format, you can replace the condition with:
PostCode = "$Loc"
This presents you with two options:
1) sanitise your input. Postcodes follow a well known format, so it is possible to convert the value that someone enters into the form you expect and then search on that. As $Loc would match exactly what you have in the database, it would be very fast to find (provided you have indexed the field, of course!).
2) overload the database with multiple values to represent the same postcode. This would mean that you would put both "l236yt" and "l23 6yt" in the database and handle them as if they are different values. This also helps when you want to search on just the first part of the postcode, such as "l23", but would only work if you have a one-to-many relationship between postcodes and locations.
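Option 1 can be sketched as below, in Python for illustration. The function name format_postcode is hypothetical, and the sketch only repositions the space: it assumes the input is a full UK-style postcode of 5 to 7 characters and does no validation beyond a length check.

```python
import re


def format_postcode(raw: str) -> str:
    """Return the postcode upper-cased with one space before the last 3 characters.

    Inputs shorter than 5 characters (for example a bare outcode such as
    'l23') are returned without a space, since they cannot be a full postcode.
    """
    compact = re.sub(r"\s+", "", raw).upper()
    if len(compact) < 5:
        return compact
    return compact[:-3] + " " + compact[-3:]
```

Both "l236yt" and "l23 6yt" then become "L23 6YT", which can be compared with an exact match against stored values that use the same format.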
I'm using Drupal 7 and trying to get results from the database using the LIKE operator, but it doesn't recognize my wildcards. I'm not sure if this is even a Drupal issue, or if I'm doing something wrong. Anyway, here's an example of the data I'm trying to match, along with my patterns.
Data to Match
a:2:{i:1;s:2:"17";i:2;s:1:"3";}
My LIKE Queries
$pattern1 = 'a:2:{i:1;s:2:"17";i:2;s:1:"%";}'; // works
$pattern2 = 'a:2:{i:1;s:1:"%";i:2;s:1:"3";}'; // fails
$result = db_query(
"
SELECT pa.nid, pa.model, pa.combination
FROM {$Product_Adjustments} pa
WHERE pa.combination LIKE :pattern
",
array(
':pattern' => $pattern1
)
);
Additionally, I've tried the '_' wildcard, but that doesn't bring anything up either
Are you sure the pattern is correct? Notice pattern 1, the first string is 2 long, and in pattern 2 you're looking for one that's only 1 long. Are you sure that's right? Are the lengths of the individual pieces of that serialized data predictable enough to even query this way? It seems unlikely, and you'll probably have to store some normalized data instead.
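The length mismatch can be reproduced with the sketch below, which uses Python's standard-library sqlite3 module as a stand-in for MySQL and Drupal's db_query(). The table name product_adjustments and the relaxed pattern are assumptions for illustration. In both databases % matches any run of characters, so wildcarding the length prefix as well as the value lets the pattern match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding the serialized value from the question.
conn.execute("CREATE TABLE product_adjustments (nid INTEGER, combination TEXT)")
conn.execute(
    "INSERT INTO product_adjustments VALUES (1, ?)",
    ('a:2:{i:1;s:2:"17";i:2;s:1:"3";}',),
)


def count_matches(pattern: str) -> int:
    """Count rows whose serialized value matches the LIKE pattern."""
    cur = conn.execute(
        "SELECT COUNT(*) FROM product_adjustments WHERE combination LIKE ?",
        (pattern,),
    )
    return cur.fetchone()[0]


# Fails: the pattern demands the literal length prefix s:1, but "17" is s:2.
pattern2 = 'a:2:{i:1;s:1:"%";i:2;s:1:"3";}'
# Relaxed: the length prefix is wildcarded along with the value.
relaxed = 'a:2:{i:1;s:%:"%";i:2;s:1:"3";}'
```

A pattern this loose can also match rows you did not intend, which is one more reason to store the values in normalized columns instead.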