REGEX validate MAC and delimiters - php

I want to validate a MAC address and allow only one kind of delimiter.
I use this pattern:
^([0-9A-Fa-f]{2}[:-]){5}([0-9A-Fa-f]{2})$
It mostly works, but it accepts mixed delimiters.
For example, 01-23:45:67:89-AB is valid according to the pattern. How can I allow only one kind of delimiter?
Thanks.

You may use
^[0-9A-Fa-f]{2}(?=([:-]))(?:\1[0-9A-Fa-f]{2}){5}$
See the regex demo
Details
^ - start of a string
[0-9A-Fa-f]{2} - two hex chars
(?=([:-])) - the next char must be a : or -, this value is captured into Group 1
(?:\1[0-9A-Fa-f]{2}){5} - exactly five occurrences of:
  \1 - the same char that is stored in the Group 1 buffer
  [0-9A-Fa-f]{2} - two hex chars
$ - end of string.
Alternatively, to shorten the pattern a bit, you may also use
^([0-9A-Fa-f]{2})(?=([:-]))(?:\2(?1)){5}$
See this regex demo. You may also use a case insensitive modifier to "shrink" it even more: '~^([0-9A-F]{2})(?=([:-]))(?:\2(?1)){5}$~i'. The thing is that the first part of the pattern, ([0-9A-Fa-f]{2}), is captured, and (?1) recurses the pattern later (so that you do not need to write it again).
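As a rough illustration, the first pattern could be wrapped in a small PHP helper like the one below. The function name isValidMac is made up for this sketch and is not part of the answer; only the pattern itself comes from it.

```php
<?php
// Hypothetical helper (the name is an assumption, not from the answer):
// returns true only when all five delimiters are the same character.
function isValidMac(string $mac): bool
{
    // (?=([:-])) peeks at the first delimiter and stores it in group 1;
    // \1 then forces every later delimiter to be that same character.
    $pattern = '/^[0-9A-Fa-f]{2}(?=([:-]))(?:\1[0-9A-Fa-f]{2}){5}$/';

    return preg_match($pattern, $mac) === 1;
}

var_dump(isValidMac('01-23-45-67-89-AB')); // bool(true)
var_dump(isValidMac('01:23:45:67:89:ab')); // bool(true)
var_dump(isValidMac('01-23:45:67:89-AB')); // bool(false), mixed delimiters
```

The lookahead does not consume the delimiter, which is why the repeated group can start with \1 and run exactly five times.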

Related

NOT words in Regex Pattern

I am trying to grab the text after the first hyphen in a pattern
<title>.*?-(.*?)(-|<\/title>)
which then grabs DesiredText from the pattern below:
<title>Stuff - DesiredText - Other Stuff</title>
However in this pattern:
<title>Stuff - Unwanted - DesiredText - Otherstuff</title>
I want it to skip the 'Unwanted' text and match the text after the next hyphen instead (DesiredText). I made a regex101 with both patterns and need to modify my basic regex so that if a word or words I don't want to match are present in that capture group it then matches the second hyphen text instead:
https://regex101.com/r/veSqH3/1
I believe this is what you are looking for. The key is in using the caret (^) character within the square-bracket character list ([]). Used together, the caret and brackets form a negated character class: it matches only characters that are NOT in the list.
https://regex101.com/r/alAZhj/3
Pattern: <title>.*?-\s*([^-\s]*)\s*- End<\/title>
This matches anything in between the middle hyphens that is not a hyphen or space. You can of course modify the pattern to include such characters by using the following pattern.
Pattern: <title>.*?-\s*([^-]*)\s*- End<\/title>
This will match anything in between the middle hyphens that is not a hyphen, so that you can have less restricted text in there.
This uses a negative lookahead to disqualify Note. There may be ways to optimize the pattern, but I cannot do so with confidence because I don't know how variable your input strings are.
Pattern: /<title>.*?- (?P<title>(?!Note).*?)(?= -|<])/
Demo
I am using a positive lookahead to ensure the captured match doesn't have any unwanted trailing characters.
If you just want the second-to-last delimited value, you could do something like this to return the value as the full-string match:
~- \K[^-]*(?= - [^-]*?</title>)~
Or faster with a capture group:
~- ([^-]*) - [^-]*?</title>~
This assumes there are no hyphens in the value.
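To see how the \K version behaves on the two titles from the question, here is a minimal sketch. The helper name secondLastSegment is hypothetical; the pattern is the one given above.

```php
<?php
// Hypothetical helper: returns the second-to-last " - " delimited value
// inside a <title> element, or null when the pattern does not match.
function secondLastSegment(string $html): ?string
{
    // \K drops everything matched so far ("- ") from the reported match,
    // so $matches[0] holds only the value itself.
    $pattern = '~- \K[^-]*(?= - [^-]*?</title>)~';

    return preg_match($pattern, $html, $matches) === 1 ? $matches[0] : null;
}

echo secondLastSegment('<title>Stuff - DesiredText - Other Stuff</title>'), PHP_EOL;
echo secondLastSegment('<title>Stuff - Unwanted - DesiredText - Otherstuff</title>'), PHP_EOL;
// Both lines print: DesiredText
```

As stated above, this only holds while the values themselves contain no hyphens.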
I took a different approach and focused on returning the capture prior to the last word, rather than any sort of negation. In this way it's highly generic.
This pattern will match what you want in the capture group:
\s-\s([a-zA-Z]+)\s-\s[a-zA-Z]+<\/title>
If you need this to match only between title tags, you can add the opening tag:
<title>.*?\s-\s([a-zA-Z]+)\s-\s[a-zA-Z]+<\/title>
Here's a link to the Test
The only limitation I see is that it matches single words made of letters, so if your desired match is a phrase such as "- Some phrase -" then this won't work with it, but that was not indicated in your example. It's a bit unclear because you used "other stuff" and then "otherstuff".

PHP regex pattern for matching username

I'm developing a laravel application where a user can refer to his profile by putting his username in the appropriate form.
Let's see an example:
A user named John can refer to his profile using the following text: #John
I spent several hours trying to understand how regex works, and this pattern is as far as I've got: #([A-Za-z0-9]+)
This pattern perfectly matches the example above, but it also matches other formats that it normally shouldn't.
I need some help creating the perfect pattern.
It should only match a string that starts with the # symbol.
For example: #John, #Sam, #Bill, etc.
It shouldn't match a string that doesn't start with the # symbol.
For example: a#John, something#Sam, 123#Bill, etc.
It should also match formats that contain more than one # symbol.
For example: #John#, #Sam#something, #Bill##sometext, etc.
In this case the pattern should capture: John#, Sam#something, Bill##sometext
Thanks for your help and sorry for my bad English.
This should work:
(?<=\s|^)#([\w#]+)
There is a positive lookbehind assertion to make sure the tag is preceded by whitespace, or the start of the string. After that it's just a case of consuming the # character and putting the username inside a capturing group.
Regex demo
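A minimal sketch of how the lookbehind pattern could be used with preg_match_all follows. The function name extractMentions and the sample text are assumptions made for illustration.

```php
<?php
// Hypothetical helper: collects every username that follows a "#" which is
// either at the start of the string or preceded by whitespace.
function extractMentions(string $text): array
{
    // (?<=\s|^) is a lookbehind: it checks what comes before "#"
    // without making it part of the match.
    preg_match_all('/(?<=\s|^)#([\w#]+)/', $text, $matches);

    return $matches[1]; // contents of capture group 1
}

print_r(extractMentions('#John a#John #Sam#something'));
// Captures: John and Sam#something (a#John is skipped)
```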
Your regex is almost correct.
First, the regex should be anchored to the beginning of the string. You can do that with the caret symbol (^):
^#([A-Za-z0-9]+)
Second, you want to allow the # sign inside the username. That is easy: just add the symbol inside the brackets.
^#([A-Za-z0-9#]+)
Try /(?:\s#{1,3})([a-zA-Z#.]+)/i
Explain
# Character. Matches a "#" character (char code 35).
{1,3} Quantifier. Match between 1 and 3 of the preceding token.
\w Word. Matches any word character (alphanumeric & underscore).
+ Plus. Match 1 or more of the preceding token.
Here is regexr: http://regexr.com/3djhq

Regex separate ports by comma

Problem
I have a working port validity checker, however I need to separate the ports by a comma (no spaces). For instance instead of just '80' being valid, now '80,443,8080' would be valid.
Regex
^(?:6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|6[0-4][0-9]{3}|[1-5][0-9]{4}|[1-9][0-9]{1,3}|[0-9])$
Tried
I realise that I may need to break the query up, so tried many things including appending this (,\n|,?$) to the end of the query, however this did not work.
Since it is PHP, with PCRE regex flavor, you can easily recurse the subpattern with a subroutine:
^(6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|6[0-4][0-9]{3}|[1-5][0-9]{4}|[1-9][0-9]{1,3}|[0-9])(?:,(?1))*$
See the regex demo
Explanation:
^ - start of string
(6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|6[0-4][0-9]{3}|[1-5][0-9]{4}|[1-9][0-9]{1,3}|[0-9]) - Group 1: the single port validation subpattern
(?:,(?1))* - zero or more sequences of a comma followed by the subpattern above (the (?1) subroutine re-uses the pattern inside Group 1)
$ - end of string
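The subroutine approach can be sketched in PHP roughly as follows. The function name isValidPortList is an assumption for this example; the pattern is the one explained above, just assembled from a variable for readability.

```php
<?php
// Hypothetical helper: true when the string is one or more ports (0-65535)
// separated by single commas, with no spaces.
function isValidPortList(string $ports): bool
{
    // Single-port subpattern, placed in capture group 1 below.
    $port = '6553[0-5]|655[0-2][0-9]|65[0-4][0-9]{2}|6[0-4][0-9]{3}'
          . '|[1-5][0-9]{4}|[1-9][0-9]{1,3}|[0-9]';

    // (?1) re-runs the group 1 subpattern after each comma.
    $pattern = '/^(' . $port . ')(?:,(?1))*$/';

    return preg_match($pattern, $ports) === 1;
}

var_dump(isValidPortList('80,443,8080')); // bool(true)
var_dump(isValidPortList('65536'));       // bool(false), out of range
var_dump(isValidPortList('80,,443'));     // bool(false), empty entry
```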
Another way is to search for what you do not want. This means:
5-digit numbers greater than 65535, or numbers with at least 6 digits
contiguous commas, or a comma at the beginning or end of the string
characters that are neither a digit nor a comma
To use this kind of pattern, you need to negate the result of preg_match:
return !preg_match('~6(?:5(?:5(?:3(?:[6-9]|\d{2})|[4-9]\d)|[6-9]\d{2})|[6-9]\d{3})|[7-9]\d{4}|[1-9]\d{5}|\B,|,\B|[^\d,]~S', $str);
The S modifier switches on the optimization for non-anchored patterns.
This approach can be interesting since the search stops as soon as the pattern succeeds.

Matching ugly extra abbreviations and numbers in titles with PHP regex

I have to create regex to match ugly abbreviations and numbers. These can be one of following "formats":
1) [any alphabet char length of 1 char][0-9]
2) [double][whitespace][2-3 length of any alphabet char]
I tried to match double:
preg_match("/^-?(?:\d+|\d*\.\d+)$/", $source, $matches);
But I couldn't get it to match the following example: 1.1 AA My test title. What is wrong with my regex, and how can I add the other formats to it too?
In your regex you say: "start of string, followed by an optional -, followed by either at least one digit, or zero or more digits, a dot, and at least one digit, followed by the end of string".
So your regex could match, for example, 4.5, -.1, etc. This is exactly what you tell it to do.
Your test input string does not match because other characters follow the number 1.1, and even if it somehow matched, your "double" matching regex is wrong.
For a double without scientific notation you usually use this regex :
[-+]?\b[0-9]+(\.[0-9]+)?\b
Now that we have this out of our way we need a whitespace \s and
[2-3 length of alphabet]
Now I have no idea what [2-3 length of alphabet] means but by combining the above you get a regex like this :
[-+]?\b[0-9]+(\.[0-9]+)?\b\s[2-3 length of alphabet]
You can also place anchors ^$ if you want the string to match entirely :
^[-+]?\b[0-9]+(\.[0-9]+)?\b\s[2-3 length of alphabet]$
Feel free to ask if you are stuck! :)
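Here is a minimal sketch of the combined pattern, under the assumption that [2-3 length of alphabet] means two or three ASCII letters, written as [A-Za-z]{2,3}. That reading, the trailing \b, and the function name leadingAbbreviation are guesses made for illustration, not something confirmed in the question.

```php
<?php
// Hypothetical helper: returns the leading "double + 2-3 letters" prefix of
// a title, or null when the title does not start with one.
function leadingAbbreviation(string $title): ?string
{
    // ASSUMPTION: "[2-3 length of alphabet]" is read as [A-Za-z]{2,3}.
    // The final \b keeps longer words (4+ letters) from matching.
    $pattern = '/^[-+]?\b[0-9]+(\.[0-9]+)?\b\s[A-Za-z]{2,3}\b/';

    return preg_match($pattern, $title, $matches) === 1 ? $matches[0] : null;
}

echo leadingAbbreviation('1.1 AA My test title'), PHP_EOL; // 1.1 AA
var_dump(leadingAbbreviation('My test title'));            // NULL
```

Only the start anchor is used here, because the title continues after the abbreviation.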
I see multiple issues with your regex:
You try to match the whole string (as a number) by the anchors: ^ at the beginning and $ at the end. If you don't want that, remove those.
The number group is non-capturing. It is checked for matches, but its contents are not added to $matches. That is what the ?: at the start of (?:...) does. Remove ?: to make the group capturing.
You place the shorter digit pattern before the longer one. Alternation is tried left to right, so if you swap the order, the regex engine tries the longer pattern first and, on success, prefers it over the shorter one.
Maybe this already solves your issue:
preg_match("/-?(\d*\.\d+|\d+)/", $source, $matches);
Demo
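The effect of the alternation order can be checked with a short sketch like this one. The helper name firstNumber is made up for the example; the two patterns are the unanchored original and the reordered version from this answer.

```php
<?php
// Hypothetical helper: returns the first full match of $pattern in $source,
// or null when nothing matches.
function firstNumber(string $pattern, string $source): ?string
{
    return preg_match($pattern, $source, $matches) === 1 ? $matches[0] : null;
}

$source = '1.1 AA My test title';

// Shorter alternative first: \d+ succeeds on "1" and the engine stops there.
echo firstNumber('/-?(?:\d+|\d*\.\d+)/', $source), PHP_EOL; // 1

// Longer alternative first: the decimal form is tried before plain digits.
echo firstNumber('/-?(\d*\.\d+|\d+)/', $source), PHP_EOL;   // 1.1
```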

regex validation

I am trying to validate a string of 3 numbers followed by / then 5 more numbers
I thought this would work
(/^([0-9]+[0-9]+[0-9]+/[0-9]+[0-9]+[0-9]+[0-9]+[0-9])/i)
but it doesn't. Any ideas what I'm doing wrong?
Try this
preg_match('#^\d{3}/\d{5}#', $string)
The reason yours is not working is due to the + symbols which match "one or more" of the nominated character or character class.
Also, when using forward-slash delimiters (the characters at the start and end of your expression), you need to escape any forward-slashes in the pattern by prefixing them with a backslash, eg
/foo\/bar/
PHP allows you to use alternate delimiters (as in my answer) which is handy if your expression contains many forward-slashes.
First of all, you're using / as the regexp delimiter, so you can't use it in the pattern without escaping it with a backslash. Otherwise, PHP will think that your pattern ends at the / in the middle (you can see that even StackOverflow's syntax highlighting thinks so).
Second, the + means "one or more", so each [0-9]+ can match more digits than you intend. The first [0-9]+ initially grabs all three digits, the engine then backtracks so the next two can still match, and the pattern ends up accepting longer strings such as 1234/567890.
Third, there's no need to use i, since you're dealing with numbers which aren't upper- or lowercase, so case-sensitivity is a moot point.
Try this instead
/^\d{3}\/\d{5}$/
The \d is shorthand for writing [0-9], and the {3} and {5} means repeat 3 or 5 times, respectively.
(This pattern is anchored to the start and the end of the string. Your pattern was only anchored to the beginning, and if that was on purpose, then remove the $ from my pattern.)
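For comparison, a small sketch of the anchored pattern next to the original one with its slash escaped. The function name isValidCode is hypothetical and only used for this illustration.

```php
<?php
// Hypothetical helper: true for exactly 3 digits, a slash, then 5 digits.
function isValidCode(string $value): bool
{
    // The slash is escaped because "/" is also the pattern delimiter.
    return preg_match('/^\d{3}\/\d{5}$/', $value) === 1;
}

var_dump(isValidCode('123/45678'));   // bool(true)
var_dump(isValidCode('1234/567890')); // bool(false)

// The original pattern (slash escaped) still accepts the longer string,
// because every + allows additional digits.
$loose = '/^([0-9]+[0-9]+[0-9]+\/[0-9]+[0-9]+[0-9]+[0-9]+[0-9])/';
var_dump(preg_match($loose, '1234/567890')); // int(1)
```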
I recently found this site useful for debugging regexes:
http://www.regextester.com/index2.html
It assumes use of /.../ (meaning you should not include those slashes in the regex you paste in).
So, after I put your regex ^([0-9]+[0-9]+[0-9]+/[0-9]+[0-9]+[0-9]+[0-9]+[0-9]) in the Regex box and 123/45678 in the Test box I see no match. When I put a backslash in front of the forward slash in the middle, then it recognizes the match. You can then try matching 1234/567890 and discover it still matches. Then you go through and remove all the plus signs and then it correctly stops matching.
What I particularly like about this particular site is the way it shows the partial matches in red, allowing you to see where your regex is working up to.
