We are using FormTools to manage client data in our organisation. Due to recent changes in security policy, my employer has asked me to mask the phone numbers and email addresses of our clients. I used the Custom Fields module and created fields that mask them using the following code:
{$VALUE|substr:-4}
This works well for the phone numbers, but now my employer has asked me to mask only the domain part of the email address,
e.g. email#xyz.com should be displayed as email#xxx.com
The Smarty variable $value mentioned above contains the email address.
You can use
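{'/(#)\w|(?!^)\G\w/'|preg_replace:'$1x':$value}
The regex used is (#)\w|(?!^)\G\w. It does the following:
(#)\w - matches # (captured in group 1) together with the first word character after it
| - or...
(?!^)\G\w - a word character (\w: letter, digit or underscore) located exactly where the previous match ended (\G); the (?!^) lookahead stops \G from matching at the very start of the string.
Thus, we find # first, and then we only match word characters directly after it, replacing each of them with x and stopping at the first character that is not a word character (the dot). The # itself is restored in the result with the help of the back-reference $1. The first word character is consumed together with # because a match of # alone would be replaced by #x, which would add one x too many.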
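To check the behaviour outside Smarty, here is a minimal plain-PHP sketch of the same idea. It uses a \K variant of the pattern so that the separator is never part of the replaced text. The function name mask_domain and its $sep parameter are hypothetical and not part of FormTools or Smarty; the separator defaults to the # used in the examples above and can be passed explicitly for other inputs.

```php
<?php
// Hypothetical helper (not part of FormTools or Smarty): replaces every word
// character that directly follows the separator with "x".
function mask_domain(string $value, string $sep = '#'): string
{
    $s = preg_quote($sep, '/');
    // (?:SEP|(?!^)\G) positions each match right after the separator or right
    // after the previous match; \K then discards that prefix from the match,
    // so only the single \w that follows is replaced.
    return preg_replace('/(?:' . $s . '|(?!^)\G)\K\w/', 'x', $value) ?? $value;
}

echo mask_domain('email#xyz.com'), "\n"; // email#xxx.com
```

Masking stops at the first non-word character, so the top-level domain stays readable and a string without the separator is returned unchanged.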
Here is an IDEONE PHP demo
The smarty syntax for using PHP functions is described in this SO post.
Related
Hi, I've been trying to solve what should be a very basic problem: allowing the characters accepted in an email while not allowing the # sign. I am forcing the domain part of the email via a dropdown box which the user selects, so there should be no # sign in the email input.
I had this regular expression that included the # after the first class; however, after removing it the regex doesn't seem to allow any characters at all.
^([a-z0-9_\.-])+([\da-z\.-]+)\.([a-z\.]{2,6})$
If anyone is able to point me in the right direction, it would be highly appreciated.
Your regex seems to work quite well, for example for this string: nameDomain.com
See it in action here: https://regex101.com/r/pAtPIg/1
However, the regex is not bulletproof... For example, it will also match the string nameDomain..., which is not a valid email (with the # sign stripped)...
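If only the part before the sign is typed by the user, one option is to validate that part on its own. The following is a rough sketch under that assumption; the function name is hypothetical, and the allowed character set is deliberately narrow rather than a full implementation of the email standard.

```php
<?php
// Hypothetical helper: checks only the part typed before the sign; the
// domain is assumed to come from the dropdown and is not examined here.
function is_valid_local_part(string $local): bool
{
    // One or more dot-separated runs of letters, digits, "_" or "-".
    // The class contains neither "#" nor "@", and a dot can never be
    // first, last, or doubled. The D modifier keeps $ from matching
    // before a trailing newline.
    return preg_match('/^[a-z0-9_-]+(?:\.[a-z0-9_-]+)*$/iD', $local) === 1;
}

var_dump(is_valid_local_part('john.doe'));  // bool(true)
var_dump(is_valid_local_part('john#doe'));  // bool(false)
var_dump(is_valid_local_part('john..doe')); // bool(false)
```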
I have tried the one on the .gov website, as stated in many questions here, but it doesn't seem to work for short postcodes.
My regex:
preg_match('^(([gG][iI][rR] {0,}0[aA]{2})|((([a-pr-uwyzA-PR-UWYZ][a-hk-yA-HK-Y]?[0-9][0-9]?)|(([a-pr-uwyzA-PR-UWYZ][0-9][a-hjkstuwA-HJKSTUW])|([a-pr-uwyzA-PR-UWYZ][a-hk-yA-HK-Y][0-9][abehmnprv-yABEHMNPRV-Y]))) {0,}[0-9][abd-hjlnp-uw-zABD-HJLNP-UW-Z]{2}))^', $this->post['location'], $matches)
When I use a long postcode of the format AA9 9ZZ it works, but one of the format AA9 doesn't. I need the following formats to work:
AA9
AA99
AA9 9ZZ
AA99 9ZZ
According to the pattern you have given and making the second part optional, I obtain:
~^(?:gir(?: *0aa)?|[a-pr-uwyz](?:[a-hk-y]?[0-9]+|[0-9][a-hjkstuw]|[a-hk-y][0-9][abehmnprv-y])(?: *[0-9][abd-hjlnp-uw-z]{2})?)$~i
demo
or to make it more readable:
~                          # pattern delimiter
^                          # start of the string anchor
(?:                        # branch 1
    gir
    (?:[ ]*0aa)?           # second part optional (branch 1)
  |                        # branch 2
    [a-pr-uwyz]            # factored out to shorten the pattern
    (?:
        [a-hk-y]?[0-9]+
      |
        [0-9][a-hjkstuw]
      |
        [a-hk-y][0-9][abehmnprv-y]
    )
    (?:[ ]*[0-9][abd-hjlnp-uw-z]{2})?  # second part optional (branch 2)
)
$                          # end of the string anchor
~ix
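A quick way to try the compact pattern against the four required formats is a small PHP script like the one below. The constant and function names are hypothetical; the pattern itself is copied unchanged from the answer.

```php
<?php
// Compact pattern from the answer, unchanged.
const POSTCODE_PATTERN = '~^(?:gir(?: *0aa)?|[a-pr-uwyz](?:[a-hk-y]?[0-9]+|[0-9][a-hjkstuw]|[a-hk-y][0-9][abehmnprv-y])(?: *[0-9][abd-hjlnp-uw-z]{2})?)$~i';

// Hypothetical helper name; returns true when the string has one of the
// accepted shapes. It says nothing about whether the postcode exists.
function has_postcode_format(string $postcode): bool
{
    return preg_match(POSTCODE_PATTERN, $postcode) === 1;
}

// The first four samples are the formats from the question; the last two
// are malformed on purpose.
foreach (['AA9', 'AA99', 'AA9 9ZZ', 'AA99 9ZZ', 'AA9 9Z', '9AA 9ZZ'] as $sample) {
    printf("%-9s %s\n", $sample, has_postcode_format($sample) ? 'match' : 'no match');
}
```

The first four samples match and the last two do not. Note that because of [0-9]+ the pattern also accepts more than two digits in the first part, for example AA999.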
^[A-Z]{1,2}\d{1,2}(?:(?: )?\d[A-Z]{2})?$ is the pattern I could come up with and it seems to work.
This could probably be improved, though. I'm not a Regexpert.
Here's a live example
Due to the way UK postcodes work, validating them by using regex is not a foolproof solution. Two of the main problems are:
Royal Mail could change the format of some postcodes, adding sub-districts that aren't covered by the regex you choose to use.
Validating by regex only ensures the postcode is in a valid format according to your regex rules, not that it's a postcode that exists as part of an address.
Royal Mail provide a PAF Database, which contains all UK addresses, including postcodes. Many companies have exposed this data through APIs and website plugins.
For example, I work for a company called PCA Predict, and we have a demo of our solution in action. It's a plugin for online checkout forms that allows a customer to start typing any part of their address, and will auto-fill the fields when they select theirs.
We also offer REST APIs to silently validate and return addresses.
Please feel free to comment if you need any help with address validation, as it can be much harder than it initially looks! I'm also not saying you should use our services, but give it a go, and have a Google about other options as well.
I have had the same problem, and it is hard to validate a postcode: Royal Mail adds and removes postcodes quite frequently. So for the past 2 weeks I have been building an address database and have created a very nasty-looking API. It is free: you can validate a postcode and it will return every address for that postcode.
I hope this is helpful.
https://www.pervazive.co.uk/free-api-for-uk-postcode-lookup/
Endpoint
GET: https://api.pervazive.co.uk/postcode.php?postcode=[POSTCODE]
Response
Request GET: -> https://api.pervazive.co.uk/postcode.php?postcode=AB10+1AB
Return Format: JSON
{"predictions":[
{"ID":"0","Address":"Aberdeen City Council, Marischal College, Broad Street, Aberdeen, Aberdeenshire, AB10 1AB","Postcode":"AB10 1AB"}
],"Execution_Time":"0.50983214378357","status":"200"}
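As a rough sketch of consuming a response of this shape in PHP, the snippet below decodes the sample body shown above and pulls out the address strings. It makes no network request; the function name is hypothetical, and treating a status of "200" as success is an assumption based only on the sample.

```php
<?php
// Sample response copied from the text above; no network request is made.
$json = '{"predictions":[{"ID":"0","Address":"Aberdeen City Council, Marischal College, Broad Street, Aberdeen, Aberdeenshire, AB10 1AB","Postcode":"AB10 1AB"}],"Execution_Time":"0.50983214378357","status":"200"}';

// Hypothetical helper: returns the address strings, or an empty array when
// the body is not valid JSON or the status field is not "200" (assumed to
// mean success).
function extract_addresses(string $json): array
{
    $data = json_decode($json, true);
    if (!is_array($data) || ($data['status'] ?? null) !== '200') {
        return [];
    }
    return array_column($data['predictions'] ?? [], 'Address');
}

print_r(extract_addresses($json));
```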
I am going through our old site files and data, which hold our members' emails and correspondence for 10 years.
I am extracting all of the email addresses (and botched email entries) and adding them to our new site's db.
It was a beginner's attempt at a CMS and had no error checking or validation.
So, I am having trouble matching emails with spaces and double #.
jam # spa ces1.com
jam#spac es2.com
jam##doubleats.org
I have constructed this loose regex that intentionally allows for a whole bunch of incorrect email formats but, the above three are examples of ones I can't figure out.
Here is my current "working" code:
$pattern1= '([\s]*)([_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*([ ]+|)#([ ]+|)([a-zA-Z0-9-]+\.)+([a-zA-Z]{2,}))([\s]*)';
$pattern2='\b[A-Z0-9._%+-]+#[A-Z0-9.-]+\.[A-Z]{2,4}\b';
$pattern="/$pattern1|$pattern2/i";
$isago = preg_match_all($pattern,$text,$matches);
if ($isago) {.......
I need another pattern that would allow the three email examples above to be recognized as email addresses. (actual validation comes later)
Also, are there any other patterns I could use that would allow me to recognize possible emails in the files?
Thanks for any help.
For the third case you can change your # to #{1,2}.
For the first and second you can add a space in your regex pattern1:
$pattern1= '([\s]*)([_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*([ ]+|)#{1,2}([ ]+|)([ a-zA-Z0-9-]+\.)+([a-zA-Z]{2,}))([\s]*)';
$pattern2='\b[A-Z0-9._%+-]+#{1,2}[A-Z0-9.-]+\.[A-Z]{2,4}\b';
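The two loosened patterns can be tried against the three sample entries with a short script such as the following. The patterns are copied unchanged from the lines above; the sample text and variable names are only for illustration.

```php
<?php
// Patterns from the answer above, unchanged.
$pattern1 = '([\s]*)([_a-zA-Z0-9-]+(\.[_a-zA-Z0-9-]+)*([ ]+|)#{1,2}([ ]+|)([ a-zA-Z0-9-]+\.)+([a-zA-Z]{2,}))([\s]*)';
$pattern2 = '\b[A-Z0-9._%+-]+#{1,2}[A-Z0-9.-]+\.[A-Z]{2,4}\b';
$pattern  = "/$pattern1|$pattern2/i";

// Sample entries from the question, one candidate per line.
$text = "jam # spa ces1.com\njam#spac es2.com\njam##doubleats.org";

$count = preg_match_all($pattern, $text, $matches);

// $matches[0] holds each full match, including any surrounding whitespace
// captured by the outer ([\s]*) groups, so trim before using it.
$candidates = array_map('trim', $matches[0]);

print_r($candidates);
```

All three entries are found. Because spaces are now allowed inside the domain part, the pattern can also swallow ordinary words that follow an address in running text, so the candidates still need the later validation step.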
This answer is like a joke, I know... but how about this regex:
/[\S ]+#[\S ]+\.[\S ]+/i
Does that work for you? I tested it on a document and it matches the three emails.
For general purposes you should use something like this:
/[A-Za-z0-9\._]+#[A-Za-z0-9\._]+\.[A-Za-z0-9\._]+/i
With that you would match all the emails, even separated by newline or commas.
There are new national domains and TLDs like "http://президент.рф/" - for Russian Federation domains, or http://example.新加坡 for Singapore...
Is there a regex to validate these domains?
I have found this one: What is the best regular expression to check if a string is a valid URL?
But when I try to use one of the expressions listed there, PHP gets overheated :)
preg_match(): Compilation failed: character value in \x{...} sequence is too large at offset 81
P.S.
1) Last part was solved by #OmnipotentEntity
2) But the main problem, validating international domains, still exists, because the example regexp doesn't validate them well.
Use the "u" modifier to match unicode characters. The example you gave only uses the "i" modifier.
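As a small illustration of the "u" modifier, the sketch below uses \x{...} escapes above 0xFF, which only compile when the modifier is present. The function name is hypothetical, and the check covers the script of a single label only; it is not a test of whether a domain is valid or registered.

```php
<?php
// Hypothetical helper: true when the label consists only of characters from
// the basic Cyrillic block (U+0400 to U+04FF). This checks the script only;
// it is not a validity test for a registered domain.
function is_cyrillic_label(string $label): bool
{
    // The "u" modifier makes PCRE treat pattern and subject as UTF-8, which
    // is what allows \x{...} values above 0xFF to compile.
    return preg_match('/^[\x{0400}-\x{04FF}]+$/u', $label) === 1;
}

var_dump(is_cyrillic_label('президент')); // bool(true)
var_dump(is_cyrillic_label('example'));   // bool(false)
```

The source file and the input string are assumed to be UTF-8 encoded.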
No, there's no regexp to validate those domains. Each TLD has different rules about which Unicode code points are permissible within their IDNs (if any). You would need a very big lookup table which would have to be kept up-to-date to know which specific characters are legal.
Furthermore there are rules about whether left-to-right written characters and right-to-left characters can be combined within a single DNS label.
BTW, the RFCs mentioned in the other comments are obsolete. The recently approved set are RFCs 5890 - 5895.
I was wondering if the code below is the correct way to check a street address, email address, password, city and URL with preg_match and regular expressions?
And if not how should I fix the preg_match code?
preg_match ('/^[A-Z0-9 \'.-]{1,255}$/i', $trimmed['address']) //street address
preg_match ('/^[\w.-]+#[\w.-]+\.[A-Za-z]{2,6}$/', $trimmed['email']) //email address
preg_match ('/^\w{4,20}$/', $trimmed['password']) //password
preg_match ('/^[A-Z \'.-]{1,255}$/i', $trimmed['city']) //city
preg_match("/^[a-zA-Z]+[:\/\/]+[A-Za-z0-9\-_]+\\.+[A-Za-z0-9\.\/%&=\?\-_]+$/i", $trimmed['url']) //url
Your street address: ^[A-Z0-9 \'.-]{1,255}$
At the regex level the single quote needs no escaping; the backslash is only there because the pattern sits inside a single-quoted PHP string.
The dot inside the character class is a literal dot, not a wildcard, so the class allows only letters, digits, spaces, apostrophes, dots and hyphens.
You are allowing a minimum length of 1 and a maximum length of 255. I would suggest increasing the minimum length to something more than 1.
Your email regex: ^[\w.-]+#[\w.-]+\.[A-Za-z]{2,6}$
Again, the . inside the character classes is a literal dot and does not need escaping. Note that the pattern still accepts addresses with leading or consecutive dots.
Your password regex: ^\w{4,20}$
It allows a password of length 4 to 20 that can contain only letters (upper and lower case), digits and underscore. I would suggest allowing special characters too, to make passwords stronger.
Your city regex: ^[A-Z \'.-]{1,255}$
The . in the character class is a literal dot here as well.
It allows a minimum length of 1 (if you want to allow cities of 1 character length this is fine).
EDIT:
Since you are very new to regex, spend some time on Regular-Expressions.info
This seems overly complicated to me. In particular I can see a few things that won't work:
Your regex will fail for cities with non-ASCII letters in their names, such as "Malmö" or 서울, etc.
Your password validator doesn't allow for spaces in the password (which is useful for entering pass-phrases), and it doesn't allow any punctuation other than the underscore, which many people like to put in their passwords for added security.
Your address validator won't allow for people who live in apartments (12/345 Foo St).
(Note that inside a character class "." is a literal dot, so it does not need to be written as "\." there.)
And so on. In general, I think over-reliance on regular expressions for validation is not a good thing. You're probably better off allowing anything for those fields and just validating them some other way.
For example, with email addresses: just because an address is valid according to the RFC standard doesn't mean you'll actually be able to send email to it (or that it's the correct email address for the person). The only reliable way to validate an email address is to actually send an email to it and get the person to click on a link or something.
Same thing with URLs: just because it's valid according to the standard doesn't actually mean there's a web page there. You can validate the URL by trying to do an actual request to fetch the page.
But my personal preference would be to just do the absolute minimum verification possible, and leave it at that. Let people edit their profile (or whatever it is you're verifying) in case they make a mistake.
There's not really a 'correct' way to check for any of those things. It depends on what exactly your requirements are.
For e-mail addresses and URLs, I'd recommend using filter_var instead of regexps - just pass it FILTER_VALIDATE_EMAIL or FILTER_VALIDATE_URL.
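A minimal sketch of that approach follows; the two wrapper names are hypothetical, while filter_var and the two filter constants are PHP built-ins. filter_var returns the filtered value on success and false on failure, hence the strict comparison.

```php
<?php
// Hypothetical wrappers around PHP's built-in filter_var().
function is_email(string $value): bool
{
    return filter_var($value, FILTER_VALIDATE_EMAIL) !== false;
}

function is_url(string $value): bool
{
    // FILTER_VALIDATE_URL requires a scheme such as http:// or https://.
    return filter_var($value, FILTER_VALIDATE_URL) !== false;
}

var_dump(is_email('user@example.com'));        // bool(true)
var_dump(is_email('not an email'));            // bool(false)
var_dump(is_url('https://example.com/a?x=1')); // bool(true)
var_dump(is_url('example.com'));               // bool(false)
```

As with the regexes, a passing result only says the string is well-formed, not that the mailbox or page exists.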
With the other regexps, note that . inside a character class is just a literal dot (it needs no escaping there), and you might want to consider that the City/Street ones would allow rubbish such as ''''', or just whitespace.
Please don't assume that you know how an address is made up. There are thousands of cities, towns and villages with characters like & and those from other alphabets.
Just DON'T try to validate an address unless you do it thru an API specific to a country (USPS for the US, for example).
And why would you want to limit the characters in a user's password? Don't have ANY requirements on the password except for it existing.
Your site will be unusable if you use those regexes.