Searching for a phone number in a database with regex - PHP

I need to search a database for a phone number. However, I don't know how the phone number is stored in the database; it can be formatted in different ways, like:
0123456789
012 3456789
012 34 56 78 9
012-3456789
The string that I have to look for is always formatted like
0123456789
My query now looks like:
SELECT * FROM account WHERE phonenumber = '0123456789'
But this of course only works when the phone number is formatted like the search string. How do I use a regex or another function to search for phone numbers in all of these formats?

Use MySQL REGEXP. This is a basic example of how you can achieve that; it works with every number format in your example.
Consider a stricter regexp if you need to be more precise.
SELECT * FROM account WHERE phonenumber REGEXP '012( |-|)34( |)56( |)78( |)9';
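Rather than typing such a pattern by hand for every search, it can be generated from the search string. The following is a minimal sketch, assuming only spaces and hyphens appear as separators (as in the question's examples); the function name phoneRegexpPattern and the $pdo connection are hypothetical.

```php
<?php
// Hypothetical helper: builds a MySQL REGEXP pattern that allows an optional
// space or hyphen between every pair of digits of the search string.
function phoneRegexpPattern(string $digits): string
{
    // Keep only the digits so the pattern contains no regex metacharacters.
    $digits = preg_replace('/\D+/', '', $digits);

    return '^' . implode('[ -]?', str_split($digits)) . '$';
}

$pattern = phoneRegexpPattern('0123456789');
// $pattern is '^0[ -]?1[ -]?2[ -]?3[ -]?4[ -]?5[ -]?6[ -]?7[ -]?8[ -]?9$'

// Assumed PDO connection in $pdo; the pattern is bound, not concatenated.
// $stmt = $pdo->prepare('SELECT * FROM account WHERE phonenumber REGEXP ?');
// $stmt->execute([$pattern]);
```

A REGEXP condition like this cannot use an ordinary index on the column, so it is likely to scan the whole table.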

While it is possible to use REGEXP in MySQL, I don't think it's a very solid solution, and the expression will be hard to tune. The following pattern matches any 10-digit number with an optional space or hyphen after each digit:
SELECT * FROM account WHERE phonenumber REGEXP '^([0-9]{1}( |-|)){9}[0-9]{1}$'
A good tool to test expressions is this site: http://www.spaweditor.com/scripts/regex/index.php
The trick is to normalize your data before you enter it in your database, which means stripping all non-numeric characters from the telephone number.
You'll need to fix the numbers already in the table too.
The best way is to store the data normalized, i.e. country code, area code, and number separately.

Related

How to sanitize a phone number before searching in a multi-search field?

I have a field in which a user can search a MySQL database for email, phone, and username. All numbers in the database are in 10-digit (1231231234) format.
IF (big if there) the user enters a phone number in one of the following formats, I want it to be sanitized into just a 10-digit string so that it matches what is in the database:
(123)123-1234
123-123-1234
123.123.1234
+1(123)123-1234
11231231234
Usernames and emails are allowed to have . and - characters. Hence I don't know how to use PHP to determine if it is in one of these formats and then sanitize it accordingly. Ideas?
This will remove anything you need to remove from phone numbers or any other value. Simply update the array if you need to remove other things.
$string = str_replace(array('-',' ','.','(',')',',','"','+'),'',$string);
libphonenumber can properly reformat phone numbers to a common format, adding international codes where appropriate, and then it can be stored in the database as a simple, searchable string.
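The str_replace call above strips punctuation but does not decide whether the input is a phone number at all, and it leaves the leading 1 in 11231231234. A minimal sketch of that decision, assuming 10-digit North American numbers as in the question; the function name normalizeSearchTerm is hypothetical.

```php
<?php
// Hypothetical helper: returns a 10-digit string when $input looks like one of
// the phone formats from the question, otherwise returns $input unchanged.
function normalizeSearchTerm(string $input): string
{
    $trimmed = trim($input);

    // Only digits, spaces, and the punctuation from the examples are allowed;
    // any letter or '@' means this is a username or an email address.
    if (!preg_match('/^\+?[0-9 ().-]+$/', $trimmed)) {
        return $input;
    }

    $digits = preg_replace('/\D+/', '', $trimmed);

    // Drop a leading country code 1, as in +1(123)123-1234 or 11231231234.
    if (strlen($digits) === 11 && $digits[0] === '1') {
        $digits = substr($digits, 1);
    }

    // Anything that is not exactly 10 digits is left for the other searches.
    return strlen($digits) === 10 ? $digits : $input;
}
```

A username made up only of digits and punctuation that happens to contain 10 digits would also be treated as a phone number, so the search may still need to check the username column with the raw input.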

Determine Country from Telephone Numbers

I have seen a few questions on SO similar to what I require, but nothing seems to fit the bill.
I am in the position where I need to deal with a call record and determine the country using the phone number. The number dialed can be any country for example:
44 7899455120 - UK
34 965791845 - Spain
355 788415235 - Albania
Obviously the world would be great if all calling codes were two digits but this is not the case. Currently I have a database full of countries and their relevant codes and in order to match I need to effectively take the first digit of the number ie 4 for the UK example and do a query of the database eg:
SELECT * from countries WHERE code LIKE '4%'
This may give me for example 20 results. So I loop again and do say
SELECT * from countries WHERE code LIKE '44%'
This may give me say one result, now I can determine it is UK. Some codes however like Albania are three digits and require more loops and database calls. This seems quite rudimentary and inefficient but as is I cannot think of another way to achieve this. I realise three calls to a database may not seem like much but if you have 1000 calls to deal with they soon add up.
Looking at the following question:
What regular expression will match valid international phone numbers?
There seems to be some great information on validating a number against country codes, but not so much on determining the country code from a number. Any advice or suggestions on a cleaner method would be much appreciated.
Spaces in the phone numbers are shown for clarity.
A library exists that will parse a string of digits and reformat it to international standards (a number like 4402081231234 to '+44 20 8123 1234'). It will also return the region of the phone number, such as 'GB' or 'US', if the country code is embedded in the number.
https://github.com/googlei18n/libphonenumber The original library is in Java, but there are also versions in Javascript, Python, Ruby and PHP, among others.
There is no overlap ambiguity in the country codes. Meaning: the country code 11 is illegal because 1 is assigned to North America. Similarly, 20 is Egypt and there are no other country codes that start with 20. And the country codes that start with 21 are all 3 digits.
Since there is no overlap ambiguity, you can directly search for the country code in one query for the phone number 12125551212 like this:
select country
, code
from countrycodes
where code in ('121', '12', '1')
Again, there are no country codes 121 or 12, so the only criteria that will match is the 1.
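The list of candidate codes for the IN clause can be built from the dialed number in PHP. This is a minimal sketch, assuming country codes are at most three digits long and reusing the countrycodes table from the query above; the function name countryCodeCandidates is hypothetical.

```php
<?php
// Hypothetical helper: returns the leading prefixes of a number, longest
// first, for use as the values of an IN (...) list.
function countryCodeCandidates(string $number, int $maxLength = 3): array
{
    // Remove spaces and any other non-digit characters first.
    $digits = preg_replace('/\D+/', '', $number);

    $candidates = [];
    for ($len = min($maxLength, strlen($digits)); $len >= 1; $len--) {
        $candidates[] = substr($digits, 0, $len);
    }

    return $candidates;
}

$candidates = countryCodeCandidates('12125551212'); // ['121', '12', '1']

// One placeholder per candidate, so the values can be bound safely.
$placeholders = implode(', ', array_fill(0, count($candidates), '?'));
$sql = "SELECT country, code FROM countrycodes WHERE code IN ($placeholders)";
```

Because no country code is a prefix of another, at most one distinct code can match, although several countries may share it (for example, code 1).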
Assuming the phone will always look like that:
$phone = "355 788415235"; // Albania
$parts = explode(" ", $phone);
$code = $parts[0]; // First part separated by space, 355.
Then query by that directly. No regular expression needed.
If that's not the case, consider separating the country code from the number on the input level.
In your system, every phone number has whitespace after the country code, so you can use it to determine the country.
Create a table that holds all country codes, like:
id | country | code
1 | Turkey | 90
2 | Spain | 34
(There is a table for you: http://erikastokes.com/mysql-help/country.sql.txt )
Then explode your phone number. The delimiter is the space character " ".
$phoneNumber = "355 788415235";
$parts = explode(" ", $phoneNumber); // splits the phone number into two parts
$countryCode = $parts[0]; // 355; index 0 because the country code is the first part
// Now you can look up the country by its country code.
$sqlQuery = "SELECT country FROM yourTableName WHERE code = '$countryCode'";
...
// This works well; I am currently using it.

Generate words (car brands/models) with mistakes

I am developing a fuzzy search mechanism. I have car brands/models and cities in a database (MySQL), with English and Russian names: about 1000 items. Users can enter these words with mistakes or in translit. Currently I retrieve all these words from the database and compare each one in a loop with the word the user entered (using Levenshtein distance and other functions).
Is there any way to generate many forms of each word (car brands/models), including forms with mistakes? I want to retrieve these words from the database using the SQL LIKE operator. For example, I have the car brand Toyota and I want to generate Tokota, Tobota, Toyoba, Tayota, Тойота, Токота, Тобота (Russian): many, many forms of each word. The user can then enter any of these words and I can find that it is Toyota he means.
Well, there is a function called SOUNDEX in MySQL. I don't know whether it is what you need.
For example:
SELECT SOUNDEX('Toyyota') = SOUNDEX('Toyota')
This is from the MySQL documentation:
Returns a soundex string from str. Two strings that sound almost the
same should have identical soundex strings. A standard soundex string
is four characters long, but the SOUNDEX() function returns an
arbitrarily long string. You can use SUBSTRING() on the result to get
a standard soundex string. All nonalphabetic characters in str are
ignored. All international alphabetic characters outside the A-Z range
are treated as vowels.
This function, as currently implemented, is intended to work well with
strings that are in the English language only. Strings in other
languages may not produce reliable results.
Reference: http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_soundex
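PHP has built-in soundex() and levenshtein() functions, so the same comparison can also be done on the PHP side, as the question already does in a loop. A minimal sketch, assuming the roughly 1000 names are loaded into an array; the function name closestName and the distance threshold of 2 are assumptions for illustration.

```php
<?php
// PHP's soundex() returns the standard four-character code.
$sameCode = soundex('Toyyota') === soundex('Toyota'); // true, both are T300

// Hypothetical helper: returns the known name with the smallest Levenshtein
// distance to the input, or null when nothing is within $maxDistance edits.
function closestName(string $input, array $names, int $maxDistance = 2): ?string
{
    $best = null;
    $bestDistance = $maxDistance + 1;

    foreach ($names as $name) {
        // Compare case-insensitively for ASCII names.
        $distance = levenshtein(strtolower($input), strtolower($name));
        if ($distance < $bestDistance) {
            $best = $name;
            $bestDistance = $distance;
        }
    }

    return $best;
}

$match = closestName('Tokota', ['Toyota', 'Honda', 'Lada']); // 'Toyota'
```

Both functions work on bytes and are geared towards Latin text, so the Russian spellings would need separate handling, for example transliterating them before comparison.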

Parsing input from user in any order or format

I am having some trouble trying to figure out how to parse information collected from user. The information I am collecting is:
Age
Sex
Zip Code
Following are some examples of how I may receive this from users:
30 Male 90250
30/M/90250
30 M 90250
M 30 90250
30-M-90250
90250,M,30
I started off with the explode function, but I was left with a huge list of if/else statements to try to see how the user separated the information (was it a space, comma, slash, or hyphen?).
Any feedback is appreciated.
Thanks
It's easy enough. The ZIP code is always 5 digits, so a simple regex matching /\b\d{5}\b/ will work just fine. The age is a number from 1 to 3 digits, so /\b\d{1,3}\b/ takes care of that; the word boundaries keep it from matching part of the ZIP code. As for the gender, you could just look for an f for female and, if there isn't one, assume male.
With all that said, what's wrong with separate input fields?
You might want to use a few regular expressions:
One that looks for 5 numeric digits: (?<!\d)\d{5}(?!\d)
One that looks for 2 numeric digits: (?<!\d)\d{2}(?!\d)
One that looks for a single letter: [a-zA-Z]
[EDIT]
I've edited the RegExes. They now match every one of the presented alternatives, and don't require any alteration of the input string (which makes it a more efficient choice). They can also be run in any order.
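Putting the ideas from both answers together, the input can be split on any separator and each token classified by its shape, which avoids the chain of if/else statements per delimiter. A minimal sketch, assuming a 5-digit ZIP code, an age of 1 to 3 digits, and a sex given as M, F, Male, or Female; the function name parseAgeSexZip is hypothetical.

```php
<?php
// Hypothetical helper: extracts age, sex, and ZIP code from input where the
// three values may appear in any order with any non-alphanumeric separator.
function parseAgeSexZip(string $input): array
{
    $result = ['age' => null, 'sex' => null, 'zip' => null];

    // Split on every run of characters that is not a letter or a digit.
    $tokens = preg_split('/[^A-Za-z0-9]+/', $input, -1, PREG_SPLIT_NO_EMPTY);

    foreach ($tokens as $token) {
        if (preg_match('/^\d{5}$/', $token)) {
            $result['zip'] = $token;            // exactly five digits
        } elseif (preg_match('/^\d{1,3}$/', $token)) {
            $result['age'] = (int) $token;      // one to three digits
        } elseif (preg_match('/^(m|male|f|female)$/i', $token)) {
            $result['sex'] = strtoupper($token[0]);
        }
    }

    return $result;
}

$parsed = parseAgeSexZip('90250,M,30');
// ['age' => 30, 'sex' => 'M', 'zip' => '90250']
```

Any field that cannot be recognized stays null, so the caller can ask the user to re-enter it.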

Accepting values containing multiple point system

How do I accept values from a user that contain multiple points, like 1.2.1? If I use float, 1.2.1 gets converted to 1.2.
Thanks.
The simple answer: if you want multiple points, don't use FLOAT :-)
Use something like VARCHAR or TEXT instead.
Treat it as text.
A float is used to represent Real numbers, and "1.2.1" is not a Real number.
Or, if "1.2.1" is simply a grouping of numbers, you could split the input of "1.2.1" into three separate numbers using the period as a delimiter, and store them as distinct numbers.
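Splitting on the period could look like the following in PHP. This is a minimal sketch, assuming the input consists only of non-negative integers separated by single periods; the function name parseDottedNumber is hypothetical.

```php
<?php
// Hypothetical helper: validates a dotted value such as "1.2.1" and returns
// its components as integers.
function parseDottedNumber(string $value): array
{
    // Require digits separated by single periods, e.g. "1", "1.2", "1.2.1".
    if (!preg_match('/^\d+(\.\d+)*$/', $value)) {
        throw new InvalidArgumentException("Not a dotted number: $value");
    }

    return array_map('intval', explode('.', $value));
}

$parts = parseDottedNumber('1.2.1'); // [1, 2, 1]

// If the values are version numbers, the built-in version_compare() can
// order the original strings directly.
$isNewer = version_compare('1.2.1', '1.2', '>'); // true
```

The original string can still be stored as text, with the parts kept in separate integer columns only if they need to be sorted or filtered individually.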
