Whitelist in php - php

I have an input for users where they are supposed to enter their phone number. The problem is that some people write their phone number with hyphens and spaces in it. I want to put the input through a filter to remove such things and store only digits in my database.
I figured that I could do some str_replace() for the whitespaces and special chars.
However I think that a better approach would be to pick out just the digits instead of removing everything else. I think that I have heard the term "whitelisting" about this.
Could you please point me in the direction of solving this in PHP?
Example: I want the input "0333 452-123-4" to result in "03334521234"
Thanks!

This is a non-trivial problem because there are lots of colloquialisms and regional differences. Please refer to What is the best way for converting phone numbers into international format (E.164) using Java? It's Java but the same rules apply.
I would say that unless you need something more fully-featured, keep it simple. Create a list of valid regular expressions and check the input against each until you find a match.
If you want it really simple, simply remove non-digits:
$phone = preg_replace('![^\d]+!', '', $phone);
By the way, just picking out the digits is, by definition, the same as removing everything else. If you mean something different you may want to rephrase that.
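The "list of valid regular expressions" idea above could be sketched roughly as follows. The patterns and the function name normalizePhone() are made up for illustration; a real list would depend on which formats your users actually type.

```php
<?php
// Hypothetical accepted formats (assumptions, not a complete or official list).
const PHONE_PATTERNS = [
    '/^\d{4} \d{3}-\d{3}-\d$/',    // e.g. "0333 452-123-4"
    '/^\d{11}$/',                   // 11 plain digits
    '/^\+\d{1,3}[ -]?\d{6,12}$/',  // "+" country code, optional separator, digits
];

// Returns the digits of $input if it matches a known format, otherwise null.
function normalizePhone(string $input): ?string
{
    $input = trim($input);
    foreach (PHONE_PATTERNS as $pattern) {
        if (preg_match($pattern, $input) === 1) {
            // Input looks like a phone number: keep only the digits.
            return preg_replace('/\D+/', '', $input);
        }
    }
    return null; // no pattern matched, so reject the input
}

var_dump(normalizePhone('0333 452-123-4')); // string(11) "03334521234"
var_dump(normalizePhone('call me'));        // NULL
```

Unlike stripping non-digits blindly, this rejects input that does not look like a phone number at all.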

$number = filter_var(str_replace(array("+","-"), '', $number), FILTER_SANITIZE_NUMBER_INT);
filter_var() with FILTER_SANITIZE_NUMBER_INT removes everything except digits, plus signs, and minus signs, and str_replace() gets rid of those pluses and minuses.
Or you could use preg_replace():
$number = preg_replace('/[^0-9]/', '', $number);
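To make the two steps of the filter_var() variant visible, here is a small sketch run on the example input from the question. The variable names are only for illustration.

```php
<?php
$number = '0333 452-123-4';

// Step 1: FILTER_SANITIZE_NUMBER_INT keeps digits, "+" and "-" only.
$sanitized = filter_var($number, FILTER_SANITIZE_NUMBER_INT);
var_dump($sanitized); // string(13) "0333452-123-4"

// Step 2: remove the remaining plus and minus signs.
$digits = str_replace(['+', '-'], '', $sanitized);
var_dump($digits); // string(11) "03334521234"
```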

You could do it two ways. Iterate through each index in the string, and run is_numeric() on it, or you could use a regular expression on the string.
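The character-by-character approach might look something like the sketch below. It uses ctype_digit() rather than is_numeric() for the per-character test (a substitution on my part, assuming the ctype extension is enabled), and the function name digitsOnly() is hypothetical.

```php
<?php
// Build a new string from only the digit characters of $input.
function digitsOnly(string $input): string
{
    $digits = '';
    foreach (str_split($input) as $char) {
        if (ctype_digit($char)) {
            $digits .= $char; // keep digits, skip everything else
        }
    }
    return $digits;
}

var_dump(digitsOnly('0333 452-123-4')); // string(11) "03334521234"
```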

On the client side I recommend applying some formatting that you design when creating the form. This works well for zip or telephone fields. Take a look at this jquery plugin for a reference. It will make things much easier later on the server side.

Related

PHP - How to ask if field contains number or + or spaces or empty?

My website form is getting hammered with spam. I have noticed that in the "Phone" field the spam bots always insert text rather than a number, so I would like to add an if statement to the PHP mailer that blocks the email if the phone field doesn't meet the following rules:
1) I want users to be able to leave the field blank, so empty field must be accepted.
2) Must contain "numbers" or "plus sign" or "spaces"
How would I write this in PHP?
Any help is appreciated
EDIT: Just thought, lol, it would be much easier to just check if the field contains alphabetical characters. How would I do this?
EDIT2: Sorted. I used "if (ctype_alpha ($phone) !== false)"
Regular expressions are probably the best way, although not necessarily the easiest to understand at first. But regular expressions are definitely a good thing to learn if you are not familiar with them. My favorite introduction is this site: http://www.zytrax.com/tech/web/regex.htm And this is a good site for interactively building a regex and seeing how it works in realtime: http://www.regexr.com/ I'm sure there are plenty of other similar sites but those are the two I always go back to.
If you search around for a regular expression solution you will find countless possibilities and variations. My personal advice is to keep it simple. I would start with considering how you store the phone number data. I usually just keep the numbers, so I would simplify it by first removing those "allowed" characters and then checking if what's left over is just numbers.
$phone = str_replace(Array('+', ' ', '(', ')'), '', $phone);
That will replace all pluses, spaces, and parentheses with an empty string (i.e. remove them). Then you can check if the string is numeric, and if it is store it, otherwise print/return an error.
if (!is_numeric($phone)) {
    // stop processing and output an error
}
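As an alternative to stripping and then calling is_numeric(), the two rules from the question (empty is fine, otherwise only digits, plus signs, and spaces) can be checked directly with one character-class pattern. This is only a sketch, and the function name isAcceptablePhone() is made up for the example.

```php
<?php
// True if $phone is empty or consists solely of digits, "+" and spaces.
function isAcceptablePhone(string $phone): bool
{
    // \A and \z anchor to the very start and end of the string.
    return preg_match('/\A[0-9+ ]*\z/', $phone) === 1;
}

var_dump(isAcceptablePhone(''));              // bool(true)
var_dump(isAcceptablePhone('+44 1234 5678')); // bool(true)
var_dump(isAcceptablePhone('buy cheap now')); // bool(false)
```

One reason to prefer this over is_numeric() is that is_numeric() also accepts strings such as "1e5". Note too that ctype_alpha() returns true only when every character is a letter, so text containing spaces or digits would slip past that check.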
First of all, you should use some kind of spam protection, for example a token, a honeypot, or a captcha.
In my country a mobile or landline phone number contains only 9 digits, not counting the country code (+XX). So I create an INT(10) field in the database. After the form is submitted, I remove everything except digits.
For example:
$phoneNumber = (int) substr( preg_replace( '#[^\d]+#', '', $_POST['phone_numer'] ), 0, 9 );
This has always worked for me across many projects.

Using preg_match_all to filter out strings containing this but not this

I'm having an issue with preg_match_all. I have this string:
$product_req = "ACTIVE-6,CATEGORY-ACTIVE-8,CATEGORY-ACTIVE-4,ACTIVE-9";
I need to get the numbers preceded by "ACTIVE-" but not by "CATEGORY-ACTIVE-", so in this case the result should be 6,9. I used the statement below:
preg_match_all("/ACTIVE-(\d+)/", $product_req, $this_act);
However, this returns all the numbers, because all of them are in fact preceded by "ACTIVE-". That's not what I meant: I need to leave out those preceded by "CATEGORY-ACTIVE-". How can I configure preg_match_all to do that? Or maybe there is some other function that can do the job?
EDIT:
I tried this:
preg_match_all("/CATEGORY-ACTIVE-(\d+)/", $product_req, $this_cat_act);
preg_match_all("/ACTIVE-(\d+)/", $product_req, $this_act);
$act_cat = str_replace($this_cat_act[1],"",$this_act[1]);
it kinda works, but i guess there is a better and cleaner way to do it. Besides the output is kinda weird too.
Thank you.
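One possible way to do this in a single pass is a negative lookbehind, which PCRE supports for fixed-length text. The sketch below assumes the prefix to exclude is always exactly "CATEGORY-".

```php
<?php
$product_req = "ACTIVE-6,CATEGORY-ACTIVE-8,CATEGORY-ACTIVE-4,ACTIVE-9";

// (?<!CATEGORY-) rejects any "ACTIVE-" that is immediately preceded by "CATEGORY-".
preg_match_all('/(?<!CATEGORY-)ACTIVE-(\d+)/', $product_req, $this_act);

// $this_act[1] holds the captured numbers as strings.
echo implode(',', $this_act[1]); // 6,9
```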

I want to add one hash to all the hashes in a string (PHP)

So in my string, I have certain sections with hashes. For example, consider the string "#Hello, this is a sample string. This is another example of ###hashes".
I want to replace that with:
"##Hello, this is a sample string. This is another example of ####hashes".
(note that the number of hashes in each instance increased by one)
However, I'm not too sure how. I'd imagine it involved regular expressions, and I've searched a bit, but I'm not too sure what to do.
Can anyone help/lead me on the right path?
Cheers
preg_replace('/(#[^#])/', '#\1', $string);
This works too:
preg_replace('/#+/', '#$0', $string);
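For reference, here is the second pattern run against the sample string from the question. In the replacement, $0 stands for the whole matched run of hashes, so each run gets one extra hash in front of it.

```php
<?php
$string = '#Hello, this is a sample string. This is another example of ###hashes';

// Match each run of one or more "#" and prepend a single "#" to it.
$result = preg_replace('/#+/', '#$0', $string);

echo $result;
// ##Hello, this is a sample string. This is another example of ####hashes
```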

Fetch All URLs from a Page using Regex

Original format:
<a href="http://www.example.com/t434234.html" ...>
1. I need to fetch all URLs of this format:
http://www.example.com/t[ANY CHARACTER].html
ANY CHARACTER is the part whose value changes from one URL to another. The rest is fixed.
Here is my attempt:
preg_match("#http:\/\/www\.aqarcity\.com\/t[a-zA-Z0-9_]\.html#", $page, $urls);
I get empty results. I don't know where I went wrong...
The problem appears to be that [a-zA-Z0-9_] will only match exactly one character. If you want to match zero or more characters, use [a-zA-Z0-9_]*. For one or more, use [a-zA-Z0-9_]+. For exactly six characters, use [a-zA-Z0-9_]{6}. For e.g. one to six characters, use [a-zA-Z0-9_]{1,6}.
Also note that, since you're using # as the delimiter, you don't need to escape the / characters. As far as I know this will not make your code misbehave, but it'll be easier to read if you remove the backslashes before the slashes.
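Putting those two points together, a corrected pattern might look like the sketch below. It uses the example.com host from the question's sample markup and a made-up $page string. It also switches to preg_match_all(), since preg_match() stops after the first match.

```php
<?php
// Hypothetical page content for illustration.
$page = '<a href="http://www.example.com/t434234.html">A</a> '
      . '<a href="http://www.example.com/t99.html">B</a>';

// "+" allows one or more characters; "/" needs no escaping with "#" delimiters.
preg_match_all('#http://www\.example\.com/t[a-zA-Z0-9_]+\.html#', $page, $matches);

$urls = $matches[0]; // full-pattern matches
print_r($urls);
```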
Finally, please realize that regular expressions are a rather dangerous way to work with HTML. In this case, you may pick up matching URLs from comments, Javascript code, and other things that aren't links. It is literally impossible to correctly parse HTML with unaugmented regular expressions—they don't have the expressive power necessary to do so. I don't know what sorts of HTML parsers are available for PHP, but you may want to look into them.
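As one example of the parser route, PHP ships with a DOMDocument class (part of the DOM extension, assumed to be enabled here). A rough sketch that reads href attributes from real <a> elements and only then applies the pattern could look like this; the function name extractTopicUrls() is made up.

```php
<?php
// Collect hrefs from <a> elements, keeping only those in the t....html format.
function extractTopicUrls(string $html): array
{
    $doc = new DOMDocument();
    // Real-world HTML is often malformed; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        if (preg_match('#^http://www\.example\.com/t[a-zA-Z0-9_]+\.html$#', $href) === 1) {
            $urls[] = $href;
        }
    }
    return $urls;
}

$html = '<html><body><a href="http://www.example.com/t434234.html">x</a>'
      . '<!-- http://www.example.com/t999.html -->'
      . '<a href="http://www.example.com/about.html">y</a></body></html>';
print_r(extractTopicUrls($html));
```

Because only anchor elements are inspected, the URL inside the HTML comment is not picked up.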

removing phone number from a document

I've got a challenge that I am hoping that the SO community is able to help me with.
I am trying to parse a lot of HTML documents in my PHP application to remove personal details, such as names, addresses and phone numbers. I can remove most of these details without too much trouble; however, the phone number is a real problem for me.
My idea is to take the text from these documents and then use a regex to identify the phone numbers and replace them with another value such as 'xxxx'.
I've got 2 regexes that I am using: one for UK landline numbers and one for UK cell/mobile numbers.
However when I try and run them against the text it just returns an empty string.
I am using the following preg_replace code:
$pattens = array(
'/^(((\+44\s?\d{4}|\(?0\d{4}\)?)\s?\d{3}\s?\d{3})|((\+44\s?\d{3}|\(?0\d{3}\)?)\s?\d{3}\s?\d{4})|((\+44\s?\d{2}|\(?0\d{2}\)?)\s?\d{4}\s?\d{4}))(\s?\#(\d{4}|\d{3}))?$/',
'/^(\+44\s?7\d{3}|\(?07\d{3}\)?)\s?\d{3}\s?\d{3}$/'
);
$replace = array('xxxxx', 'xxxxx');
//do the search for the numbers.
$updatedContents = preg_replace($pattens, $replace, $htmlContents);
At the moment this is causing me a lot of head scratching as I thought that I had this nailed, but at the moment I can't see what's wrong??
I am sure that it is something really simple.
Thanks,
Grant
You probably don't want to anchor your regular expressions. Remove the ^ from the beginning and the $ from the end.
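With ^ and $ in place, a pattern can only match when the entire document is nothing but a phone number. As a sketch, here is the mobile pattern from the question with the anchors removed, run against a made-up snippet of HTML.

```php
<?php
$patterns = [
    // Mobile pattern from the question, minus the leading ^ and trailing $.
    '/(\+44\s?7\d{3}|\(?07\d{3}\)?)\s?\d{3}\s?\d{3}/',
];

// Hypothetical document content for illustration.
$htmlContents = '<p>Call me on 07123 456 789 today.</p>';

$updatedContents = preg_replace($patterns, 'xxxxx', $htmlContents);
echo $updatedContents; // <p>Call me on xxxxx today.</p>
```

The landline pattern can be treated the same way.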
