Regular expression to match single dot but not two dots? - php

I'm trying to create a regex pattern for checking an email address. It should allow a dot (.), but not two or more dots next to each other.
Should match:
test.test#test.com
Should not match:
test..test#test.com
Now I know there are thousands of examples on internet for e-mail matching, so please don't post me links with complete solutions, I'm trying to learn here.
Actually the part that interests me the most is just the local part:
test.test that should match and test..test that should not match.
Thanks for helping out.

You can allow any number of [^\.] (any character except a dot) and ([^\.])\.[^\.] (a dot enclosed by two non-dots) by joining them with an alternation (the pipe symbol |), repeating the whole group with * (any number of those), and placing it between ^ and $ so that the entire string must consist of those. Here's the code:
$s1 = "test.test#test.com";
$s2 = "test..test#test.com";
$pattern = '/^([^\.]|([^\.])\.[^\.])*$/';
echo "$s1: ", preg_match($pattern, $s1),"<p>","$s2: ", preg_match($pattern, $s2);
Yields:
test.test#test.com: 1
test..test#test.com: 0
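One caveat worth checking: in the pattern above each dot consumes the non-dot characters on both sides, so a string with a single character between two dots, such as a.b.c, appears to be rejected. A minimal sketch of an alternative, assuming dots are also disallowed at the very start and end of the string (the function name is made up for illustration):

```php
<?php
// Hypothetical helper: true when every dot sits between two runs of non-dot characters.
function has_only_single_dots(string $s): bool
{
    // One run of non-dots, then any number of ".<run of non-dots>" groups.
    return preg_match('/^[^.]+(?:\.[^.]+)*$/', $s) === 1;
}

var_dump(has_only_single_dots('test.test#test.com'));   // bool(true)
var_dump(has_only_single_dots('test..test#test.com'));  // bool(false)
var_dump(has_only_single_dots('a.b.c'));                // bool(true)

// The pattern from the answer above, for comparison on the same input.
var_dump(preg_match('/^([^\.]|([^\.])\.[^\.])*$/', 'a.b.c')); // int(0)
```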

This seems more logical to me:
/[^.]([\.])[^.]/
And it's simple. The look-ahead & look-behinds are indeed useful because they don't capture values. But in this case the capture group is only around the middle dot.

strpos($input,'..') === false
The strpos function is simpler: if $input does not contain '..', the test passes.
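Wrapped in a function, the check could look like the sketch below. The function name is an assumption; the strict comparison matters because strpos returns the integer 0 when the needle is found at the very start of the string.

```php
<?php
// Hypothetical helper: true when $input contains no two consecutive dots.
function has_no_double_dot(string $input): bool
{
    // strpos() returns false only when '..' is absent; a hit at offset 0
    // returns int 0, which is why === is used instead of ==.
    return strpos($input, '..') === false;
}

var_dump(has_no_double_dot('test.test'));   // bool(true)
var_dump(has_no_double_dot('test..test'));  // bool(false)
var_dump(has_no_double_dot('..test'));      // bool(false)
```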

To answer the question in the title, I'd update the RegExp by Junuxx and allow dots in the beginning and end of the string:
'/^\.?([^\.]|([^\.]\.))*$/'
which is optional . in the beginning followed by any number of non-. or [non-. followed by .].

^([^.]+\.?)+#$
That should do for what comes before the #; I'll leave the rest to you.
Note that you should refine it further to rule out other strange character combinations, but this seems sufficient to answer the part that interests you.
Don't forget the ^ and $ like I first did :(
I also forgot to escape the . with a backslash - silly me

Related

Regex - Match characters but don't include within results

I have got the following Regex, which ALMOST works...
(?:^https?:\/\/)(?:www|[a-z]+)\.([^.]+)
I need the result to be the only result, or within the same position in the Array.
So for example this http://m.facebook.com/ matches perfect, there is only 1 group.
However, if I change it to http://facebook.com/ then I get com/in place of where Facebook should be. So I need to have (?:www|[a-z]+) as an optional check really.
Edit:
What I expect is just to match facebook, if ANY of the strings are as follows:
http://www.facebook.com
http://facebook.com
http://m.facebook.com
And obviously the https counterparts.
This is my Regex now
(?:^https?:\/\/)(?:www)?\.?([^.]+)
This is close; however, it matches the m when I try http://m.facebook.com
https://regex101.com/r/GDapY5/1
So I need to have (?:www|[a-z]+) as an optional check really.
A ? at the end of a pattern is generally used for "optional" bits -- it means "match zero or one" of that thing, so your subpattern would be something like this:
(?:www|[a-z]+)?
If you're simply trying to get the second level domain, I wouldn't bother with regex, because you'll be constantly adjusting it to handle special cases you come across. Just split on dots and take the penultimate value:
$domain = array_reverse(explode('.', parse_url($str)['host']))[1];
Or:
$domain = array_reverse(explode('.', parse_url($str, PHP_URL_HOST)))[1];
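A slightly more defensive sketch of the same split-on-dots idea, for input that may not be a well-formed URL. The function name is hypothetical, and like the one-liners above it takes the label before the last dot, so a host such as example.co.uk would yield co.

```php
<?php
// Hypothetical helper: returns the label just before the last one in the
// host name, or null when the URL has no host or the host has one label.
function second_level_label(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host)) {
        return null; // no host component, or the URL could not be parsed
    }
    $labels = explode('.', $host);
    $count = count($labels);
    return $count >= 2 ? $labels[$count - 2] : null;
}

var_dump(second_level_label('http://m.facebook.com/'));   // string(8) "facebook"
var_dump(second_level_label('https://www.facebook.com')); // string(8) "facebook"
var_dump(second_level_label('http://facebook.com'));      // string(8) "facebook"
```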
Perhaps you could make the first m. part optional with (?:\w+\.)?.
Instead of a capturing group you could use \K to reset the starting point of the reported match.
Then match one or more word characters \w+ and use a positive lookahead to assert that what follows is a dot (?=\.)
For example:
^https?://(?:www)?(?:\w+\.)?\K\w+(?=\.)
Edit: Or you could match for m. or www. using an alternation:
^https?://(?:m\.|www\.)?\K\w+(?=\.)
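As a rough illustration, the alternation pattern can be used from PHP as follows. The ~ delimiter is chosen here so the slashes need no escaping, and the function name is made up. Because of \K, the wanted text is the whole match in $m[0] rather than a capture group.

```php
<?php
// Hypothetical helper: returns the label after an optional "m." or "www."
// prefix, or null when the pattern does not match.
function domain_label(string $url): ?string
{
    $pattern = '~^https?://(?:m\.|www\.)?\K\w+(?=\.)~';
    return preg_match($pattern, $url, $m) === 1 ? $m[0] : null;
}

foreach (['http://www.facebook.com', 'http://facebook.com', 'https://m.facebook.com'] as $url) {
    echo $url, ' => ', domain_label($url), "\n"; // each line ends in "facebook"
}
```

Any other subdomain (for example mail.) is not skipped by this pattern and would itself be returned.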

PHP match for strings between two (starting, ending) delimiters

String;
RandomValue1:|RandomSentence1.|RandomValue2:|RandomSentence2.|
I'm trying to match RandomSentence1. and RandomSentence2.. I figured the "." in the sentence could be used to help the matching, since every sentence ends with a period. If the period isn't included in my match, I'm OK with that. I've never been very good at RegEx but I'm always willing to try and learn. From the results on here I haven't been able to come up with anything that works. I'd be coding this in PHP. I believe either preg_match() or preg_split() would be the right function here.
I initially tried; .*:\|.*\.\|
But that just matches the entire string since it ends with .|.
Then I tried this; .*:\|\s*(.*?)\s*\|
But that only matched the RandomSentence2.
These are adaptions of what I've found online.
This should work for a regex to capture all. Look for NOT . or | followed by . and |:
preg_match_all('/([^.|]+\.)\|/', $string, $matches);
print_r($matches[1]);
An alternate if you want to do something with the other entries would be to split and then find what you want. Split on | then grep for array values ending in .:
$matches = preg_grep('/\.$/', explode('|', $string));
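Both approaches can be run side by side on the sample string from the question. This is only a sketch; array_values is added because preg_grep keeps the original array keys.

```php
<?php
$string = 'RandomValue1:|RandomSentence1.|RandomValue2:|RandomSentence2.|';

// Approach 1: capture runs of non-dot, non-pipe characters followed by ".|".
preg_match_all('/([^.|]+\.)\|/', $string, $matches);
$captured = $matches[1];

// Approach 2: split on "|" and keep only the pieces that end with a dot.
$filtered = array_values(preg_grep('/\.$/', explode('|', $string)));

print_r($captured); // RandomSentence1. and RandomSentence2.
print_r($filtered); // the same two entries
```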
Since you already know there is a dot at the end, you can just match them all with something simple: (?<=\|)[^|.]+(?=\.\|)
https://regex101.com/r/ZsHcWq/1
(?<= \| )
[^|.]+
(?= \.\| )
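In PHP the lookaround version might be used like this (a sketch on the question's sample string). Since the lookbehind and lookahead consume nothing, the sentences come back in $m[0] without the surrounding pipes or the trailing period.

```php
<?php
$string = 'RandomValue1:|RandomSentence1.|RandomValue2:|RandomSentence2.|';

// Preceded by "|", followed by ".|", with no pipe or dot inside the match.
preg_match_all('/(?<=\|)[^|.]+(?=\.\|)/', $string, $m);
$sentences = $m[0];

print_r($sentences); // RandomSentence1 and RandomSentence2
```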

Regex to validate username

I'm trying to understand what's wrong with this regex pattern:
'/^[a-z0-9-_\.]*[a-z0-9]+[a-z0-9-_\.]*{4,20}$/i'
What I'm trying to do is to validate the username. Allowed chars are alphanumeric, dash, underscore, and dot. The restriction I'm trying to implement is to have at least one alphanumeric character so the user will not be allowed to have a nickname like this one: _-_.
The function I'm using right now is:
function validate($pattern, $string){
return (bool) preg_match($pattern, $string);
}
Thanks.
EDIT
As @mario said, yes, there is a problem with *{4,20}.
What I tried to do now is to add ( ), but this isn't working as expected:
'/^([a-z0-9-_\.]*[a-z0-9]+[a-z0-9-_\.]*){4,20}$/i'
Now it matches 'aa--aa' but it doesn't match 'aa--' and '--aa'.
Any other suggestions?
EDIT
Maybe someone wants to deny not nice looking usernames like "_..-a".
This regex will deny to have consecutive non alphanumeric chars:
/^(?=.{4,20}$)[a-z0-9]{0,1}([a-z0-9._-][a-z0-9]+)*[a-z0-9._-]{0,1}$/i
In this case _-this-is-me-_ will not match, but _this-is-me_ will match.
Have a nice day and thanks to all :)
Don't try to cram it all into one regex. Make your life simpler and use a two step-approach:
return (bool)
preg_match('/^[a-z0-9_.-]{4,20}$/i', $s) && preg_match('/[a-z0-9]/i', $s);
The mistake in your regex was probably the mix-up of * and {n,m}. You can apply only one of those quantifiers to an element, not both one after another as in *{4,20}.
Very well, here is the cumbersome solution to what you want:
preg_match('/^(?=.{4})(?!.{21})[\w.-]*[a-z][\w.-]*$/i', $s)
The assertions assert the length, and the second part ensures that at least one letter is present.
Try this one instead:
'/[a-z0-9-_\.]*[a-z0-9]{1,20}[a-z0-9-_\.]*$/i'
It's probably just a matter of fine-tuning; you could try something like this:
if (preg_match('/^[a-zA-Z0-9]+[_.-]{0,1}[a-zA-Z0-9]+$/m', $subject)) {
# Successful match
} else {
# Match attempt failed
}
Matches:
a_b <- you might not want this.
ysername
Username
1254_2367
fg3123as
Non-Matches:
l__asfg
AHA_ar3f!
sAD_ASF_#"#T_
"#%"&#"E
__-.asd
username
1___
Non-matches you might want to be matches:
1_5_2
this_is_my_name
It is clear to me that you should split this into two checks!
Firstly check that they are using all valid characters. If they're not, then you can tell them that they are using invalid characters.
Then check that they have at least one alpha-numeric character. If they're not, then you can tell them that they must.
Two distinct advantages here: more meaningful feedback to the user and cleaner code to read and maintain.
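A minimal sketch of that two-check approach, returning a message for the user instead of a bare boolean. The function name and the message wording are assumptions; the 4 to 20 length limit and the allowed characters come from the question.

```php
<?php
// Hypothetical helper: returns null when the username is valid,
// otherwise a message explaining which rule was broken.
function username_error(string $name): ?string
{
    // Check 1: only allowed characters, 4 to 20 of them.
    if (preg_match('/^[a-z0-9_.-]{4,20}\z/i', $name) !== 1) {
        return 'Use 4 to 20 characters: letters, digits, dash, underscore or dot.';
    }
    // Check 2: at least one letter or digit somewhere.
    if (preg_match('/[a-z0-9]/i', $name) !== 1) {
        return 'Include at least one letter or digit.';
    }
    return null;
}

var_dump(username_error('aa--'));     // NULL
var_dump(username_error('_-_.'));     // the "at least one letter or digit" message
var_dump(username_error('bad name')); // the "allowed characters" message
```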
Here is a simple, single regex solution (verbose):
$re = '/ # Match username having at least one alphanum.
^ # Anchor to start of string.
(?=.*?[A-Za-z0-9]) # At least one alphanum.
[\w\-.]{4,20} # Match from 4 to 20 valid chars.
\z # Anchor to end of string.
/x';
In Action (short form):
function validate($string){
$re = '/^(?=.*?[A-Za-z0-9])[\w\-.]{4,20}\z/';
return (bool) preg_match($re, $string);
}
Try this:
^[a-zA-Z][-\w.]{0,22}([a-zA-Z\d]|(?<![-.])_)$
From related question: Create one RegEx to validate a username
^[A-Za-z][A-Za-z0-9]*(?=.{3,31}$)[a-z0-9]{0,1}([a-z0-9._-][a-z0-9]+)*[a-z0-9.-_]{0,1}$
This will Validate the username
start with an alpha
accept underscore dash and dots
no spaces allowed
Why don't you make it simpler like this?
^[a-zA-Z][a-zA-Z0-9\._-]{3,9}$
The first character must be a letter,
followed by the characters or symbols you allow.
The total length is between 4 and 10 (the first character is matched separately from the 3 to 9 that follow).

Get Everything between two characters

I'm using PHP. I'm trying to get a Regex pattern to match everything between value=" and " i.e. Line 1 Line 2,...,to Line 4.
value="Line 1
Line 2
Line 3
Line 4"
I've tried /.*?/ but it doesn't seem to work.
I'd appreciate some help.
Thanks.
P.S. I'd just like to add, in response to some comments, that all strings between the first " and last " are acceptable. I'm just trying to find a way to get everything between the very first " and very last " even when there is a " in between. I hope this makes sense. Thanks.
Assuming the desired character is "double quote":
$pat = '/\"([^\"]*?)\"/'; // text between quotes excluding quotes
$value='"Line 1 Line 2 Line 3 Line 4"';
preg_match($pat, $value, $matches);
echo $matches[1]; // $matches[0] is string with the outer quotes
If you just want the answer and don't need a specific regex, you can use this:
<?php
$str='value="Line 1
Line 2
Line 3
Line 4"';
$need=explode("\"",$str);
var_dump($need[1]);
?>
/.*?/ does not match newline characters, because by default . matches anything except a newline. If you want to match them too, you need to use a regular expression like /([^"]*)/.
I agree with Josh K that a regular expression is not required in this case (especially if you know there will be no quotes other than the ones delimiting the string). You could use his solution as well.
If you must use regex:
if (preg_match('!"([^"]+)"!', $value, $m))
echo $m[1];
You need the s pattern modifier, which lets . match newlines. Something like: /value="(.*)"/s
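A short sketch of the s modifier in use, assuming the goal from the question's P.S.: everything between the first " after value= and the very last ", even with another " in between. The greedy .* runs to the end of the input and backtracks to the last quote. The function name and the sample text are made up.

```php
<?php
// Hypothetical helper: returns the text between value=" and the last ",
// or null when the input does not contain such a pair.
function between_first_and_last_quote(string $input): ?string
{
    // s: "." also matches newlines; greedy ".*" stops at the LAST quote.
    return preg_match('/value="(.*)"/s', $input, $m) === 1 ? $m[1] : null;
}

$str = "value=\"Line 1\nLine 2 with a \" inside\nLine 3\nLine 4\"";
echo between_first_and_last_quote($str);
```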
I'm not a regex guru, but why not just explode it?
// Say $var contains this value="..." string
$arr = explode('value="', $var);
$mid = explode('"', $arr[1]);
$fin = $mid[0]; // Contains what you're looking for.
The specification isn't clear, but you can try something like this:
/value="[^"]*"/
Explanation:
First, value=" is matched literally
Then, match [^"]*, i.e. anything but ", possibly spanning multiple lines
Lastly, match " literally
This does not allow " to appear between the "real" quotes, not even if it's escaped by e.g. preceding with a backslash.
The […] is a character class. Something like [aeiou] matches one of any of the lowercase vowels. [^…] is a negated character class. [^aeiou] matches one of anything but the lowercase vowels.
References
regular-expressions.info/Examples - Programming Language Constructs - Strings
Has variations on different string patterns (e.g. allowing escaped quotes)
Related questions
Difference between .*? and .* for regex
As much as is practical, negated character class is always a better option than .*?

Need to negate this regex pattern, but no clue how

I found a regex pattern for PHP that does the exact OPPOSITE of what I'm needing, and I'm wondering how I can reverse it?
Let's say I have the following text: Item_154 ($12)
This pattern /\((.*?)\)/ gets what's inside the parentheses, but I need to get "Item_154" and cut out what's in the parentheses and the space before them.
Anybody know how I can do that?
Regex is above my head apparently...
/^([^( ]*)/
Match everything from the start of the string until the first space or (.
If the item you need to match can have spaces in it, and you only want to get rid of whitespace immediately before the parenthetical, then you can use this instead:
/^([^(]*?)\s*\(/
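To see the difference between the two patterns, here is a small sketch. The second sample string, with spaces inside the item name, is made up for illustration.

```php
<?php
// Pattern 1: stop at the first space or "(".
preg_match('/^([^( ]*)/', 'Item_154 ($12)', $a);
// Pattern 1 on a name with spaces: stops too early.
preg_match('/^([^( ]*)/', 'Large Item 154 ($12)', $b);
// Pattern 2: take everything up to the whitespace before "(".
preg_match('/^([^(]*?)\s*\(/', 'Large Item 154 ($12)', $c);

var_dump($a[1]); // string(8) "Item_154"
var_dump($b[1]); // string(5) "Large"
var_dump($c[1]); // string(14) "Large Item 154"
```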
The following will match anything that looks like text (...) but returns just the text part in the match.
\w+(?=\s*\([^)]*\))
Explanation:
The \w includes alphanumeric and underscore, with + saying match one or more.
The (?= ) group is positive lookahead, saying "confirm this exists but don't match it".
Then we have \s for whitespace, and * saying zero or more.
The \( and \) match literal ( and ) characters (since they are normally special characters).
The [^)] matches any character other than ), and again * is zero or more.
Hopefully all makes sense?
/(.*)\(.*\)/
What is not in () will now be your 1st match :)
One site that really helped me was http://gskinner.com/RegExr/
It'll let you build a regex and then paste in some sample targets/text to test it against, highlighting matches. All of the possible regex components are listed on the right with (essentially) a tooltip describing the function.
<?php
$string = 'Item_154 ($12)';
$pattern = '/^(.*?)\s*\(.*?\)/';
preg_match($pattern, $string, $matches);
var_dump($matches[1]);
?>
Should get you Item_154
The following regex works for your string as a replacement if that helps? :-
\s*\(.*?\)
Here's an explanation of what's it doing...
Whitespace, any number of repetitions - \s*
Literal - \(
Any character, any number of repetitions, as few as possible - .*?
Literal - \)
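Used as a replacement in PHP, that pattern could be applied as below (a sketch on the question's sample text). Replacing the match with an empty string removes the parenthetical together with the whitespace in front of it.

```php
<?php
$text = 'Item_154 ($12)';

// Remove optional whitespace plus the shortest "(...)" group.
$clean = preg_replace('/\s*\(.*?\)/', '', $text);

var_dump($clean); // string(8) "Item_154"
```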
I've found Expresso (http://www.ultrapico.com/) is the best way of learning/working out regular expressions.
HTH
Here is a one-shot to do the whole thing
$text = 'Item_154 ($12)';
$text = preg_replace('/([^\s]*)\s(\()[^)]*(\))/', '$1$2$3', $text);
var_dump($text);
//Outputs: Item_154()
Keep in mind that using any PCRE functions involves a fair amount of overhead, so if you are using something like this in a long loop and the text is simple, you could probably do something like this with substr/strpos and then concat the parens on to the end since you know that they should be empty anyway.
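A sketch of what a substr/strpos version might look like. It targets the original question's goal (dropping the parenthetical and the space before it) rather than keeping empty parentheses, and the function name is an assumption.

```php
<?php
// Hypothetical helper: returns the text before the first "(",
// with trailing whitespace removed; unchanged when there is no "(".
function strip_parenthetical(string $text): string
{
    $pos = strpos($text, '(');
    if ($pos === false) {
        return $text;
    }
    return rtrim(substr($text, 0, $pos));
}

var_dump(strip_parenthetical('Item_154 ($12)')); // string(8) "Item_154"
var_dump(strip_parenthetical('Item_155'));       // string(8) "Item_155"
```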
That said, if you are looking to learn REGEXs and be productive with them, I would suggest checking out: http://rexv.org
I've found the PCRE tool there to be very useful, though it can be quirky in certain ways. In particular, any examples that you work with there should only use single quotes if possible, as it doesn't work with double quotes correctly.
Also, to really get a grip on how to use regexs, I would check out Mastering Regular Expressions by Jeffrey Friedl ISBN-13:978-0596528126
Since you are using PHP, I would try to get the 3rd Edition since it has a section specifically on PHP PCRE. Just make sure to read the first 6 chapters first since they give you the foundation needed to work with the material in that particular chapter. If you see the 2nd Edition on the cheap somewhere, that's pretty much the same core material, so it would be a good buy as well.
