Calculating if an email fits within a regex? - php

I am trying to determine whether someone is using a temporary email address generated by our system. If a user tries to log in with a social account (Twitter / Facebook) and declines access to their email, I generate an email for our system of the form AccountID#facebook.com or AccountID#twitter.com, for example 123456789#facebook.com. This is a temporary email until the user enters a real one. I am trying to detect this format using regex.
if (preg_match("/^[0-9]#twitter.com/", Auth::user()->email, $matches)) {
}
However I think my regex is incorrect. How would one check if the format of a string is N Number of digits followed by #twitter.com or #facebook.com

How would one check if the format of a string is N Number of digits followed by #twitter.com or #facebook.com
You can use this regex:
'/^\d+#(?:facebook|twitter)\.com$/'
You are using ^[0-9]#, which allows only a single digit at the start. Besides, the dot is a special character in regex and needs to be escaped. Also note the end anchor $ in the regex, which avoids matching unwanted trailing input.
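As a rough sketch of how the pattern above could be wrapped for reuse: the helper name isTemporaryEmail is an assumption made for illustration, and the # separator simply follows the address format as written in the question.

```php
<?php
// Hypothetical helper; the name is not from the original code.
function isTemporaryEmail(string $email): bool
{
    // \d+                  : one or more digits (the account ID)
    // (?:facebook|twitter) : either provider, without capturing
    // \.com$               : a literal ".com" at the end of the string
    return preg_match('/^\d+#(?:facebook|twitter)\.com$/', $email) === 1;
}

var_dump(isTemporaryEmail('123456789#facebook.com')); // bool(true)
var_dump(isTemporaryEmail('1#twitter.com'));           // bool(true)
var_dump(isTemporaryEmail('john#twitter.com'));        // bool(false)
var_dump(isTemporaryEmail('123#twitterXcom'));         // bool(false)
```

In the question's context the call would presumably be isTemporaryEmail(Auth::user()->email).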

You forgot to allow multiple digits in the ID:
if (preg_match("/^[0-9]+#(twitter|facebook)\.com$/", Auth::user()->email, $matches))
{
    // Your code here
}

Related

Using PHP to convert email addresses to mailto with SEPARATE user/domain variables

I have the below function based on an answer in SO that converts email addresses in a string to mailto links. However, I'm trying to take it one step further, and separate the username and domain name into separate variables. The below function instead separates everything before the last period into one variable and after the period into another.
function emailize($str)
{
    $mail_pattern = "/([A-z0-9\._-]+\#[A-z0-9_-]+\.)([A-z0-9\_\-\.]{1,}[A-z])/";
    // Double quotes inside the single-quoted replacement string avoid a quoting error
    $str = preg_replace($mail_pattern, '<a class="obfuscate" data-user="$1" data-domain="$2" href="#">Click Here</a>', $str);
    return $str;
}
Using test#example.com as the email address, I was hoping $1 would be test and $2 would be example.com.
Also, I will be passing a large string that potentially has multiple email addresses in it.
Thank You
With altered capture groups you can achieve what you want. A ?: at the start of a group makes it non-capturing. So
([A-z0-9\._-]+)\#([A-z0-9_-]+\.(?:[A-z0-9\_\-\.]{1,}[A-z]))
should give you the username as the first capture group and the domain as the second.
Demo: https://regex101.com/r/BF7p2g/1/
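A minimal sketch of the asker's function with the regrouped pattern might look like the following. As an assumption on my part, the character classes use A-Za-z instead of A-z (the latter also matches a few punctuation characters between Z and a), and the # separator follows the address format used in the question.

```php
<?php
function emailize(string $str): string
{
    // Group 1: the username; group 2: the whole domain (example.com).
    // The inner (?:...) is non-capturing, so it does not create a third group.
    $mail_pattern = '/([A-Za-z0-9._-]+)#([A-Za-z0-9_-]+\.(?:[A-Za-z0-9_.-]+[A-Za-z]))/';
    $replacement  = '<a class="obfuscate" data-user="$1" data-domain="$2" href="#">Click Here</a>';

    // preg_replace replaces every match, so strings with several addresses are handled.
    return preg_replace($mail_pattern, $replacement, $str);
}

echo emailize('test#example.com');
// <a class="obfuscate" data-user="test" data-domain="example.com" href="#">Click Here</a>
```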

Instagram username Regex -PHP

I have a registration form with an "instagram-username" input.
An Instagram username can include only: a-z A-Z 0-9 dot(.) underline(_)
This is my code:
if(!empty($instaUsername) && preg_match("/([A-Za-z._])\w+/", $instaUsername)) {
throw new Exception ("Your name Instagram is incorrect");
}
When $instaUsername = "name.123" or "name_123", this gives me the error.
How do I write a regular expression that meets the following requirements?
a-z A-Z 0-9 dot(.) underline(_)
I would also appreciate pointers to good regex tutorials in the comments.
jstassen has a good write up on insta user names:
(?:^|[^\w])(?:#)([A-Za-z0-9_](?:(?:[A-Za-z0-9_]|(?:\.(?!\.))){0,28}(?:[A-Za-z0-9_]))?)
So you want a regex that validates Instagram usernames? Then let's first do some research on what the requirements actually are.
...
It seems that Instagram doesn't really document the requirements for a valid username. So I made an account and checked which usernames it accepts and which it rejects to get a sense of the rules. The requirements I found are as follows:
The length must be between 3 and 30 characters.
The accepted characters are like you said: a-z A-Z 0-9 dot(.) underline(_).
It's not allowed to have two or more consecutive dots in a row.
It's not allowed to start or end the username with a dot.
Putting all of that together will result in the following regex:
^[\w](?!.*?\.{2})[\w.]{1,28}[\w]$
Try it out here with examples!
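For illustration, the regex above could be dropped into the asker's check roughly like this. The function name is hypothetical, and the rules are only the ones observed in this answer, not an official Instagram specification.

```php
<?php
// Hypothetical helper based on the four observed rules above.
function isValidInstagramUsername(string $name): bool
{
    // ^\w            : first character is a letter, digit or underscore (no leading dot)
    // (?!.*?\.{2})   : no two consecutive dots anywhere after the first character
    // [\w.]{1,28}    : 1 to 28 middle characters, giving a total length of 3 to 30
    // \w$            : last character is not a dot
    return preg_match('/^\w(?!.*?\.{2})[\w.]{1,28}\w$/', $name) === 1;
}

var_dump(isValidInstagramUsername('name.123')); // bool(true)
var_dump(isValidInstagramUsername('name_123')); // bool(true)
var_dump(isValidInstagramUsername('na..me'));   // bool(false)
var_dump(isValidInstagramUsername('.name'));    // bool(false)
var_dump(isValidInstagramUsername('ab'));       // bool(false)
```

The exception in the question would then be thrown when this function returns false.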
Following up on the comment section, this is what you want (note the negation: the exception should be thrown when the username does not match):
if(!empty($instaUsername) && !preg_match('/^[a-zA-Z0-9._]+$/', $instaUsername)) {
    throw new Exception ("Your name Instagram is incorrect");
}
This regex should work for you:
~^([a-z0-9._])+$~i
You can use anchors to match the start (^) and the end ($) of your input, and use the modifier i for case-insensitivity.
I found a lot of the answers here way too complicated and didn't account for a username being used in a sentence (like hey #man... what's up?), so I did my own
/#(?:[\w][\.]{0,1})*[\w]/
This says:
#: Find an # symbol (optional)
([\w][\.]{0,1})+: Find a word character (A-Z, 0-9, _) and up to 1 dot in a row, * as many times as you like (e.g. allow #u.s.e.r.n.a.m.e, but not #u..sername)
(?: this before the above just means "don't actually create a capturing group for this"
[\w] end with a word character (i.e. don't allow a final dot, like #username. should exclude the dot)
If you want to limit the username to 30chars like IG does, then:
#(?:(?:[\w][\.]{0,1})*[\w]){1,29}
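This wraps everything in another non-capturing (?: group and repeats it 1 to 29 times. Note that the quantifier bounds the number of repetitions of the group, not the number of characters, so on its own it does not strictly cap the username at 30 characters.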
My edit was rejected because I should have posted it as an answer or comment, so here it is:
if(!empty($instaUsername) && !preg_match("/([A-Za-z._])\w+/", $instaUsername)) {
throw new Exception ("Your name Instagram is incorrect");
}
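I use this pattern (inside a character class, | is a literal pipe rather than alternation, so it is left out here, and the ^ anchor makes the whole string get checked):
if(empty($instaUsername) or !preg_match("/^[a-zA-Z0-9._]+$/", $instaUsername)) {
    throw new Exception ("Your name Instagram is incorrect");
}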
Useful regex
(?!.*\.\.)(?!.*\.$)[^\W][\w.]{0,29}$

Using a regular expression to validate email addresses

I have just started learning to code both PHP as well as HTML and had a look at a few tutorials on regular expressions however have a hard time understanding what these mean. I appreciate any help.
For example, I would like to validate the email address peanuts#monkey.com. I start off with the code and I get the message invalid email address.
What am I doing wrong?
I know that the metacharacters such as ^ denote the start of a string and $ denote the end of a string however what does this mean? What is the start of a string and what is the end of a string?
When do I group regular expressions?
$emailaddress = 'peanuts#monkey.com';
if(preg_match('/^[a-zA-z0-9]+#[a-zA-z0-9]+\.[a-zA-z0-9]$/', $emailaddress)) {
echo 'Great, you have a valid email address';
} else {
echo 'boo hoo, you have an invalid email address';
}
What you have written works with some small modifications, if that is what you want to use; however, you are missing a '+' at the end:
^[a-zA-Z0-9]+#[a-zA-Z0-9]+\.[a-zA-Z0-9]+$
The caret and dollar characters match positions rather than characters: ^ matches the beginning of the line and $ matches the end of the line (in PHP, the beginning and end of the whole string unless the m modifier is used). They are used to anchor your regex. If you write your regex without those two, you will match email addresses anywhere in your text, not only an email address that sits alone on a line, as in this case. If you had written only the ^ (caret), you would have found only email addresses at the start of a line, and if you had written only the $ (dollar), you would have found only email addresses at the end of a line.
Blah blah blah someEmail#email.com
blah blah
would not give you a match, because the email address is not at the beginning of the line and the line does not end with it either. To match it in this context you would have to drop ^ and $.
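A short sketch of the difference, using the corrected pattern from this answer with and without anchors. The sample line is the one from the text, written on a single line here as an assumption about how it was meant.

```php
<?php
$anchored   = '/^[a-zA-Z0-9]+#[a-zA-Z0-9]+\.[a-zA-Z0-9]+$/';
$unanchored = '/[a-zA-Z0-9]+#[a-zA-Z0-9]+\.[a-zA-Z0-9]+/';

$alone  = 'someEmail#email.com';
$inText = 'Blah blah blah someEmail#email.com blah blah';

// With ^ and $, the whole string has to be the address.
var_dump(preg_match($anchored, $alone));    // int(1)
var_dump(preg_match($anchored, $inText));   // int(0)

// Without anchors, the address is found anywhere in the text.
var_dump(preg_match($unanchored, $inText)); // int(1)
```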
Grouping is used for two reasons as far as I know: back referencing and... grouping. Grouping is used for the same reasons as in math: 1 + 3 * 4 is not the same as (1 + 3) * 4. You use parentheses to constrain quantifiers such as '+', '*' and '?', as well as alternation '|', etc.
You also use parentheses for back referencing, but since I can't explain it better I will link you to: http://www.regular-expressions.info/brackets.html
I encourage you to take a look at this book. Even if you only read the first 2-3 chapters you will learn a lot, and it is a great book! http://oreilly.com/catalog/9781565922570
And as the commenters say, this regex is not perfect, but it works and shows you what you had forgotten. You were not far off!
UPDATED as requested:
The '+', '*' and '?' are quantifiers. They are also a good example of where you group.
'+' means match the preceding character or group 1 to n times.
'*' means match the preceding character or group 0 to n times.
'?' means match the preceding character or group 0 or 1 time.
(n times meaning any number of times, without an upper limit.)
The reason you use [a-zA-Z0-9]+ is that without the '+' it will match only one character. With + it will match many, but it must match at least one. With * it will match many but also zero, and with ? it will match at most one character but also zero.
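The quantifiers described above can be tried out with a few throwaway patterns. This is only an illustrative sketch; the helper name matches and the test strings are made up for the example.

```php
<?php
// Hypothetical convenience wrapper so the results read as booleans.
function matches(string $pattern, string $subject): bool
{
    return preg_match($pattern, $subject) === 1;
}

// '+' : one or more of the preceding character
var_dump(matches('/^ab+$/', 'a'));      // bool(false)
var_dump(matches('/^ab+$/', 'abbb'));   // bool(true)

// '*' : zero or more
var_dump(matches('/^ab*$/', 'a'));      // bool(true)

// '?' : zero or one
var_dump(matches('/^ab?$/', 'ab'));     // bool(true)
var_dump(matches('/^ab?$/', 'abb'));    // bool(false)

// Grouping makes the quantifier apply to the whole group
var_dump(matches('/^(ab)+$/', 'abab')); // bool(true)
```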
Your regex doesn't match email addresses. Try this one:
/\b[\w\.-]+#[\w\.-]+\.\w{2,4}\b/
I recommend you read through this tutorial to learn about Regular Expressions.
Also, RegExr is great for testing them out.
As for your second question; the ^ character means that the regular expression must start matching from the first character in the string you input. The $ means that the regular expression must end at the final character in the string you input. In essence, this means that your regular expression will match the following string:
peanuts#monkey.com
but NOT the following string:
My email address is peanuts#monkey.com, and I love it!
Grouping regular expressions has lots of use cases. Using matching groups will also make your expression cleaner and more readable. It's all explained quite well in the tutorial I linked earlier.
As CanSpice points out, matching all possible email addresses isn't all that easy. Using the RFC2822 Email Validation expression will do a better job:
/[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*#(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?/
There are many alternatives, but even the simplest ones will do a fair job as most email addresses end in .com (or other 2-4 character length top domains).
The only reason your original expression doesn't work is that you're limiting the number of characters after the period (.) in your expression to 1. Changing your expression to:
/^[a-zA-z0-9]+#[a-zA-z0-9]+\.[a-zA-z0-9]+$/
will allow any number of characters (one or more) after the last period.
/^[a-zA-z0-9]+#[a-zA-z0-9]+\.[a-zA-z0-9]{2,4}$/
will allow 2 to 4 characters after the last period. That would match:
name#email.com
name#email.info
but not:
fake#address.suckers
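As a quick check of the {2,4} variant against the three addresses listed above (a sketch only; the ranges are written A-Z here rather than A-z, which is my adjustment and not part of this answer):

```php
<?php
$pattern = '/^[a-zA-Z0-9]+#[a-zA-Z0-9]+\.[a-zA-Z0-9]{2,4}$/';

var_dump(preg_match($pattern, 'name#email.com'));       // int(1)
var_dump(preg_match($pattern, 'name#email.info'));      // int(1)
var_dump(preg_match($pattern, 'fake#address.suckers')); // int(0), 7 characters after the period
```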
The top level domain (".com," ".net," ".museum") can be from 2 to 6 characters. So you should be saying 2,6 instead of 2,4.
I wrote an extremely good email address regular expression a few years ago:
^\w+([-+._]\w+)#(\w+((-+)|.))\w{1,63}.[a-zA-Z]{2,6}$
A lot of research went into that. But I have some basic tips:
DON'T JUST COPY-PASTE! If someone says "here's a great regex for that," don't just copy paste it! Understand what's going on! Regular expressions are not that hard. And once you learn them well, it'll pay dividends forever. I got good at them by taking a class in Perl back in college. Since then, I've barely gotten any better and am WAY better than the vast majority of programmers I know. It's sad. Anyways, learn it!
Start small. Instead of building a giant regex and testing it when you're done, test just a few characters. For example, when writing an email validator, why not try \w+#\w+\.\w+ and see how good that is? Add in a few more things and re-test, like ^\w+#\w+\.[A-Za-z]{2,6}$
The start and end anchors of a regex string mean that nothing can come before or after the characters you specify. Your regex string needs to account for underscores, needs a capital Z in its uppercase ranges (A-Z, not A-z), and needs other adjustments.
/^[a-zA-Z_0-9]+#[a-zA-Z0-9]+\.[a-zA-Z0-9]{2,4}$/
{2,4} says the top level domain is between 2 and 4 characters.
This will validate ANY email address (at least, I've tried a lot):
preg_match("/^[a-z0-9._-]{2,}+\#[a-z0-9_-]{2,}+\.([a-z0-9-]{2,4}|[a-z0-9-]{2,}+\.[a-z0-9-]{2,4})$/i", $emailaddress);
Hope it works!
Make sure you ALWAYS escape metacharacters (like dot):
if(preg_match('/^[a-zA-z0-9]+#[a-zA-z0-9]+\.[a-zA-z0-9]$/', $emailaddress)) {

Make sure username is not a phone number

I'm writing a mobile website and I would like the user to be able to log in via username or phone number. I think the easiest way to validate their response is to not allow them to sign up using a phone number as their username.
The problem is that I'll need to check if the input of the username field is JUST a 10 or 11 digit number. This is where my inexperience with regex puts me at a disadvantage. I'm hoping to try something like
function do_reg($text, $regex)
{
if (preg_match($regex, $text)) {
return TRUE;
}
else {
return FALSE;
}
}
$username = $_POST['username'];
if(do_reg($username, '[0-9]{10,11}')){
die('cannot use a 10 or 11 digit number as a username');
}
The above regex is matching all numbers that are 10-11 digits long. I think maybe I need a way to say if the ONLY thing in the user input field is a 10-11 digit number get mad otherwise release the butterflies and rainbows.
EDIT: For the record I decided to make sure the username wasn't JUST a number. Thought this would be simpler and I didn't like the idea of having people use numbers as logins.
So I ended up with
if (!empty($username) && preg_match('/^\d+$/', $username )) {
die('username cannot be a number');
}
Thanks for the help all.
You are almost correct, except that PCRE in PHP requires delimiters, and you probably want some anchors to make sure the field consists only of numbers.
if(do_reg($username, '/^\d{10,11}$/')){
// note the added / delimiters and the ^ and $ anchors
And probably use \d instead of [0-9].
(BTW, you should just call preg_match directly:)
if (!preg_match('/^\d{10,11}$/', $username)) {
    release('butterflies', 'rainbows');
}
You need to anchor the regex to match the entire string: ^[0-9]{10,11}$.
^ matches the beginning of a string; $ matches the end.
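A minimal sketch of the anchored check, for illustration. The function name looksLikePhoneNumber is an assumption, not something from the question.

```php
<?php
// Hypothetical helper: true only when the whole input is 10 or 11 digits.
function looksLikePhoneNumber(string $username): bool
{
    // Delimiters (/) are required by preg_match; ^ and $ anchor the whole string.
    return preg_match('/^[0-9]{10,11}$/', $username) === 1;
}

var_dump(looksLikePhoneNumber('1235551212'));   // bool(true),  10 digits
var_dump(looksLikePhoneNumber('11235551212'));  // bool(true),  11 digits
var_dump(looksLikePhoneNumber('123555121'));    // bool(false), 9 digits
var_dump(looksLikePhoneNumber('a1235551212'));  // bool(false), contains a letter
```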
Limit usernames to only 10 characters and require their username to start with a letter. How would a user write a 10 digit phone number as their username if they are required to enter at least 1 alpha character (since phone numbers can't start with a 0/o or a 1/l)? (Heck, I would require at least 3 alpha chars just to be safe.)
When your app gets bigger then you can allow for longer usernames and take into account some of these issues:
Do not use the ^ or $ anchors if you are only testing the username:
if(do_reg($username, '/^\d{10,11}$/')){
The reason I say this is that anyone could defeat that check by placing a letter in their username, e.g. a1235551212.
Instead use this:
if(do_reg($username, '/\d{10,11}/')){
because that will also flag a1235551212d.
Also, importantly, remember, that all of these regular expressions are only checking for numbers, there's nothing to stop a user from doing the following: ltwo3for5six7890. Unless of course you limit the username size.
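To make the trade-off in this answer concrete, here is a small sketch of the unanchored variant. The function name containsPhoneLikeRun is made up for the example.

```php
<?php
// Hypothetical helper: true when the input contains a run of at least 10 digits anywhere.
function containsPhoneLikeRun(string $username): bool
{
    // No ^ or $, so the digits may be surrounded by other characters.
    return preg_match('/\d{10,11}/', $username) === 1;
}

var_dump(containsPhoneLikeRun('a1235551212'));      // bool(true)
var_dump(containsPhoneLikeRun('a1235551212d'));     // bool(true)
var_dump(containsPhoneLikeRun('user123'));          // bool(false)
var_dump(containsPhoneLikeRun('ltwo3for5six7890')); // bool(false), digits spelled out
```

The last case is the one the answer warns about: no digit-based pattern will catch a number that is partly written in letters.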
You just should include start and end of the string in the regex
^[0-9]{10,11}$

how to validate an email address to this form: 234903284#student.uws.edu.au

I'd like to know a regex that would validate:
an email address to this form: 234903284#student.uws.edu.au
A couple of requirements:
"student." is optional and could be any word eg "teacher.".
"324234234" can be any alpha numeric characters (number, word, _ etc.)
the email must end in "uws.edu.au"
This is what I have so far:
/(\d*)#\w*\.uws\.edu\.au/
valid addresses:
me#uws.edu.au
234234324#student.uws.edu.au
theking#teacher.uws.edu.au
etc.
Thanks Guys
Three thoughts:
Change the initial \d to \w to match "word" characters [a-zA-Z0-9_] instead of just digits.
Make the subdomain optional using ?
Use + instead of * when matching the username and subdomain. Otherwise #.uws.edu.au will validate.
Suggested:
/\w+#(\w+\.)?uws\.edu\.au/
You said:
Just tried /(\w*)#(\w*.)?uws.edu.au/ and that seemed to work. Any further suggestions are welcome – Jason 4 secs ago
Your regex will match "#teacher.uws.edu.au" (i.e. "name portion" omitted).
To fix this, you could use:
/(\w+)#(\w+\.)?uws\.edu\.au/
Which will require at least one character in the name portion, and at least one char before the dot (if there is a dot) in the subdomain spot.
Also, \w will not match . (or other characters that you may care about in the name portion), so bob.jones#student.uws.edu.au would fail to match. The following adds the characters ., _, and - to the "name" portion:
/([\w\._-]+)#(\w+\.)?uws\.edu\.au/
you could add any other chars you need in the same way.
NOTE: Matching email addresses in general is a more complex thing than you might think (lots of strange things are technically allowed in email addresses). Here is an article on the subject (there are many other sources of similar information available).
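Putting the suggestions together, a validation function might be sketched as follows. The ^ and $ anchors are my addition (none of the patterns above include them) so that the address really has to end in uws.edu.au; the function name is hypothetical, and the # separator follows the notation used in the question.

```php
<?php
// Hypothetical helper combining the suggestions above.
function isUwsEmail(string $email): bool
{
    // [\w.-]+   : name portion (letters, digits, _, ., -), at least one character
    // (\w+\.)?  : optional single subdomain such as "student."
    // Anchors ensure nothing precedes the name or follows "uws.edu.au".
    return preg_match('/^[\w.-]+#(\w+\.)?uws\.edu\.au$/', $email) === 1;
}

var_dump(isUwsEmail('me#uws.edu.au'));                // bool(true)
var_dump(isUwsEmail('234234324#student.uws.edu.au')); // bool(true)
var_dump(isUwsEmail('bob.jones#teacher.uws.edu.au')); // bool(true)
var_dump(isUwsEmail('#teacher.uws.edu.au'));          // bool(false)
var_dump(isUwsEmail('me#uws.edu.au.example.com'));    // bool(false)
```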
