Regex / preg_match to find 13 character unique ID - php

My database creates new entries using the PHP uniqid function. This means the ID is 13 characters and a mix of numbers and letters.
Examples of IDs:
5a0ae6fa29476
5a26822fbfd19
5a2a952fc9558
When an email comes in, it is meant to check the subject for a # followed by the ID - example subject: "Re: [Item #5a0ae6fa29476] Need Info". It must contain the #.
I'd like to use preg_match / regex to pull the ID from the email.
I'm currently using:
/(?!#)\w{13}/
But the problem is that it does not actually require the #, so the following strings in email subjects will still be matched:
5a0ae6fa29476
13_characters
Communications
(any 13 character string involving letters, numbers or underscores)
Can anyone advise a better regex to use? Thanks in advance

You need to match the # symbol before the 13 characters, but you may also discard it easily with the \K operator:
/#\K\w{13}\b/
Details
# - a # symbol
\K - match reset operator discarding all text matched so far
\w{13} - 13 word chars, followed by
\b - a word boundary
See the regex demo.
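To show how this fits into the email check, here is a minimal sketch. The function name extractItemId is made up for illustration and is not from the question.

```php
<?php
// Hypothetical helper: returns the 13-character ID that follows a '#', or null.
function extractItemId(string $subject): ?string
{
    // The '#' must be present; \K drops it from the reported match.
    // The trailing \b rejects runs of 14 or more word characters.
    if (preg_match('/#\K\w{13}\b/', $subject, $m) === 1) {
        return $m[0];
    }
    return null;
}

var_dump(extractItemId('Re: [Item #5a0ae6fa29476] Need Info')); // string(13) "5a0ae6fa29476"
var_dump(extractItemId('Communications'));                      // NULL
```

Note that \w still accepts something like #13_characters. If the IDs always come from uniqid() without a prefix, a tighter class such as [0-9a-f]{13} may be worth trying, but that is an assumption about your data.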

Related

REGEX validate MAC and delimiters

I want to validate a MAC address and allow only one kind of delimiter.
I use this pattern:
^([0-9A-Fa-f]{2}[:-]){5}([0-9A-Fa-f]{2})$
It works fine, but there is a bug.
For example: 01-23:45:67:89-AB is valid according to the pattern. How can I allow only one kind of delimiter?
Thanks.
You may use
^[0-9A-Fa-f]{2}(?=([:-]))(?:\1[0-9A-Fa-f]{2}){5}$
See the regex demo
Details
^ - start of a string
[0-9A-Fa-f]{2} - two hex chars
(?=([:-])) - the next char must be a : or -, this value is captured into Group 1
(?:\1[0-9A-Fa-f]{2}){5} - exactly five occurrences of
\1 - the same char that is stored in Group 1 buffer
[0-9A-Fa-f]{2} - two hex chars
$ - end of string.
Alternatively, to shorten the pattern a bit, you may also use
^([0-9A-Fa-f]{2})(?=([:-]))(?:\2(?1)){5}$
See this regex demo. You may also use a case insensitive modifier to "shrink" it even more: '~^([0-9A-F]{2})(?=([:-]))(?:\2(?1)){5}$~i'. The thing is that the first part of the pattern, ([0-9A-Fa-f]{2}), is captured, and (?1) recurses the pattern later (so that you do not need to write it again).
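As a quick check of the first pattern in PHP, something like the following should do. The helper name isValidMac is only for illustration.

```php
<?php
// Hypothetical helper wrapping the first pattern from the answer.
function isValidMac(string $mac): bool
{
    // Group 1 captures the first delimiter inside the lookahead;
    // \1 then forces every later delimiter to be the same character.
    $pattern = '/^[0-9A-Fa-f]{2}(?=([:-]))(?:\1[0-9A-Fa-f]{2}){5}$/';
    return preg_match($pattern, $mac) === 1;
}

var_dump(isValidMac('01-23-45-67-89-AB')); // bool(true)
var_dump(isValidMac('01:23:45:67:89:AB')); // bool(true)
var_dump(isValidMac('01-23:45:67:89-AB')); // bool(false)
```

The pattern is in single quotes so that PHP passes \1 to the regex engine unchanged.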

PHP preg_replace match numbers following a special character

I'm creating a comment board feature that allows users to reference post-ID's, which will be auto-configured by regex to hyperlink to the relevant post.
Post references are formatted as follows, using the double-arrow symbol (», which is not an ASCII character): »1234
6 numbers maximum can follow the double-arrow in order for the reference to be hyperlinked, so »1234567 would not hyperlink, but »1, »12, »123, etc would.
How would I go about doing this with regex?
Match the special character followed by 1-6 digits and then followed by a word boundary, so it won't match if it's concatenated with any other string.
»\d{1,6}\b
Here is one solution: » matches the arrow character, \d matches a digit between 0 and 9, and {1,6} specifies that at least 1 and at most 6 digits should follow. If you want to match only whole references, you can add a word boundary (\b) after the digits; a \b in front of » would instead require a word character right before the arrow. If you want to check whether the whole string consists only of this pattern, you can use anchors (^ at the beginning, $ at the end).
»\d{1,6}
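For the hyperlinking itself, a rough sketch with preg_replace could look like this. The function name and the /post/<id> URL scheme are assumptions, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper; the /post/<id> URL scheme is an assumption.
function linkPostReferences(string $text): string
{
    // The u modifier treats pattern and subject as UTF-8, so » is one character.
    // (\d{1,6})\b only matches when at most six digits follow the arrow.
    $linked = preg_replace('/»(\d{1,6})\b/u', '<a href="/post/$1">»$1</a>', $text);
    // preg_replace returns null on error (e.g. invalid UTF-8), so fall back to the input.
    return $linked ?? $text;
}

echo linkPostReferences('see »1234'), "\n"; // see <a href="/post/1234">»1234</a>
echo linkPostReferences('»1234567'), "\n";  // »1234567
```

If the comment text comes from users, it should be HTML-escaped before this step; the captured group only ever contains digits, so it is safe to place in the href.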

PHP regex pattern for matching username

I'm developing a laravel application where a user can refer to his profile by putting his username in the appropriate form.
Let's see an example:
A user named John can refer to his profile using the following text: #John
I spent several hours trying to understand how regex works, but this pattern is as far as I've got: #([A-Za-z0-9]+)
This pattern perfectly matches the example above, but it also matches other formats that it normally shouldn't.
I need some help creating the perfect pattern.
It should only match a string that starts with the # symbol.
For example: #John, #Sam, #Bill, etc.
It shouldn't match a string that doesn't start with the # symbol.
For example: a#John, something#Sam, 123#Bill, etc.
It should also match those formats that contain more than one # symbols.
For example: #John#, #Sam#something, #Bill##sometext, etc.
In this case the pattern should capture: John#, Sam#something, Bill##sometext
Thanks for your help and sorry for my bad english.
This should work:
(?<=\s|^)#([\w#]+)
There is a positive lookbehind assertion to make sure the tag is preceded by whitespace, or the start of the string. After that it's just a case of consuming the # character and putting the username inside a capturing group.
Regex demo
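Here is a small sketch of using that pattern with preg_match_all to collect the usernames. The function name findMentions is made up for the example.

```php
<?php
// Hypothetical helper: returns every username referenced in $text.
function findMentions(string $text): array
{
    // The lookbehind only allows '#' at the start of the string or after whitespace.
    preg_match_all('/(?<=\s|^)#([\w#]+)/', $text, $matches);
    return $matches[1]; // contents of capture group 1, without the leading '#'
}

print_r(findMentions('#John a#Sam #Bill##sometext'));
// Array ( [0] => John [1] => Bill##sometext )
```

Here a#Sam is skipped because the # is preceded by a letter rather than whitespace or the start of the string.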
Your regex is almost correct.
Firstly, you want your regex to be anchored to the beginning of the string. You can achieve that with the caret symbol (^):
^#([A-Za-z0-9]+)
Secondly, you want to be able to put the # sign inside. Now it's easy - just add that symbol inside the brackets.
^#([A-Za-z0-9#]+)
Try /(?:\s#{1,3})([a-zA-Z#.]+)/i
Explain
# Character. Matches a "#" character (char code 35).
{1,3} Quantifier. Match between 1 and 3 of the preceding token.
\w Word. Matches any word character (alphanumeric & underscore).
+ Plus. Match 1 or more of the preceding token.
Here is regexr: http://regexr.com/3djhq

Using preg_match to validate a string format

I have an html form with an input for a sales order number which should have the format of K1234/5678. It should always start with the letter K then 4 numbers, a / and followed by another set of 4 numbers.
I'm trying to validate the formatting using preg_match and I'm getting lost in the syntax of preg_match. From http://php.net/manual/en/function.preg-match.php I've gotten close. With the following code I'm able to verify that it contains at least 1 letter, some numbers and at least 1 non-alphanumeric value.
$so= $_POST['so'];
if (preg_match("/^(?=.*[a-z]{1})(?=.*[0-9]{4})(?=.*[^a-z0-9]{1})/i", $so))
{
print $so;
}
What is the correct syntax to use for this? Is preg_match even the best way to do this?
Try this:
preg_match("#^K[0-9]{4}/[0-9]{4}$#i", $so)
Explanation:
The # characters are regular expression delimiters - they indicate the start/end of the pattern. The ^ and $ indicate the start and end of the string - this means that it will only match if your sales order number is the only thing in the string. The letter K means match that letter, [0-9]{4} means match a digit exactly 4 times. The i at the end means a case-insensitive match - the K will match either "K" or "k".
When developing regular expressions, I often use regular expression testers - these allow you to enter your data and try a bunch of different things to refine your regex. Google PHP regex tester to find a list of tools. Also, there's a very complete reference to regular expressions at http://www.regular-expressions.info/.
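Put together as a small validation function, it might look like the sketch below. The name isValidSalesOrder is only illustrative, and the pattern is the same one as above, just in single quotes.

```php
<?php
// Hypothetical helper around the pattern from the answer.
function isValidSalesOrder(string $so): bool
{
    // K, four digits, a slash, four digits, and nothing else; i makes K case-insensitive.
    return preg_match('#^K[0-9]{4}/[0-9]{4}$#i', $so) === 1;
}

var_dump(isValidSalesOrder('K1234/5678')); // bool(true)
var_dump(isValidSalesOrder('k1234/5678')); // bool(true)
var_dump(isValidSalesOrder('K123/5678'));  // bool(false)
```

Using # as the delimiter means the / inside the pattern does not need escaping.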

Add minimum characters to 'bad word' regex?

I made a regex that captures 'bad words' and substitutes with *** so I can return to user in a form if bad words found, a simplified version can be found here:
https://regex101.com/r/alEb61/3
(?i)\b(Bitch)\b
I'd like to also require min 25 characters in the same regex instead of having to run two separate passes on it (e.g. 1) Bad Words 2) Enough Chars?) is that possible? I basically need to add to above some "less than 25 characters" pipe.
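A regex quantifier has the form {min,max}, so {1,15} means a minimum of 1 and a maximum of 15 repetitions of the preceding token.
I'd build a list of "bad words" and then require that at least 1 of them exists.
As far as the regex limit goes, /^(?:word){1,15}$/ requires "word" to be found 1 to 15 times. Note that [word] would be a character class matching any single one of the letters w, o, r, d.
Check this post out: Profanity Filter using a Regular Expression (list of 100 words)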
If you plan to replace any bad word on your list, or the whole string when it is shorter than 25 chars, use
$s = preg_replace('~^.{0,24}$|\b(?:badWord1|badWordN)\b~i', 'CENSURED', $s);
See the regex demo.
Details
^.{0,24}$ - first alternative
| - or
\b(?:badWord1|badWordN)\b - the second alternative:
\b - leading word boundary
(?: - start of an alternation non-capturing group
badWord1 - bad word #1
| - or
badWordN - bad word N
) - end of the group
\b - a trailing word boundary.
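To see what the replacement does in both cases, here is a short sketch. The wrapper function censor is hypothetical; badWord1 and badWordN are the same placeholders as above.

```php
<?php
// Hypothetical wrapper; badWord1/badWordN are the placeholders from the answer.
function censor(string $s): string
{
    $pattern = '~^.{0,24}$|\b(?:badWord1|badWordN)\b~i';
    // preg_replace returns null on error, so fall back to the input.
    return preg_replace($pattern, 'CENSURED', $s) ?? $s;
}

echo censor('too short'), "\n";
// CENSURED
echo censor('This line is long enough but says badWord1 twice: BADWORD1.'), "\n";
// This line is long enough but says CENSURED twice: CENSURED.
```

A string under 25 characters is replaced as a whole, while a longer one only has the listed words replaced.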
If you plan to match any string longer than 24 chars and not having bad words in it, use
'/^(?!.*\bbadword\b).{25,}$/s'
It will match a string that has at least 25 chars and does not contain badword as a whole word.
See a regex demo.
Details
^ - start of string
(?!.*\bbadword\b) - a negative lookahead that fails the match if after any 0+ chars there is a whole word badword
.{25,} - any 25 or more chars
$ - end of string.
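Used as a plain yes/no check, that could be wrapped as in this sketch. The function name isAcceptable is an assumption, and badword stays a placeholder.

```php
<?php
// Hypothetical helper: true when $s has 25+ chars and no whole-word "badword".
function isAcceptable(string $s): bool
{
    // The s modifier lets . match newlines, so multi-line input is covered too.
    return preg_match('/^(?!.*\bbadword\b).{25,}$/s', $s) === 1;
}

var_dump(isAcceptable('This text is clean and long enough.'));         // bool(true)
var_dump(isAcceptable('short'));                                       // bool(false)
var_dump(isAcceptable('This text has a badword and is long enough.')); // bool(false)
```

As written the check is case-sensitive; adding the i modifier would also reject Badword.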
In the end I created my own version, as what I wanted was to capture matches only IF there was a "bad word" or if there were fewer than X characters.
^(?i)(?P<Words>\bBadWord1|BadWordN\b)|(?P<Characters>^.{0,25}$)$
which can be tested here
This served my purpose as
if there are no bad words and > 25 chars it returns no matches and the substitution is not even needed (but can be used)
If there are bad words it indicates that and also substitutes them with * so I can replace the user input text with an alert to replace 'Bad Words' and I know this is the error since the Capture Group is named Words
If there are no bad words but not enough characters it will return the Capture Group as Characters so I can return that alert instead.
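For anyone wanting to branch on the named groups in PHP, here is a rough sketch. It does not use the exact pattern above: it is a variant in which \b applies to every word, the leading ^ is dropped so words can match anywhere, and the length check covers 0 to 24 characters. The function name and return values are made up.

```php
<?php
// Hypothetical classifier; the word list and the 25-character limit are placeholders.
function checkInput(string $input): string
{
    // Variant pattern (an assumption, not the original): either a listed word
    // anywhere in the input, or a whole input of fewer than 25 characters.
    $pattern = '~\b(?P<Words>BadWord1|BadWordN)\b|(?P<Characters>^.{0,24}$)~i';

    if (preg_match($pattern, $input, $m) !== 1) {
        return 'ok'; // long enough and no listed word
    }
    // An unmatched Words group is an empty string in $m, so test it first.
    return ($m['Words'] ?? '') !== '' ? 'bad_words' : 'too_short';
}

echo checkInput('This is a perfectly fine sentence, long enough.'), "\n"; // ok
echo checkInput('This sentence contains BadWord1 and is long.'), "\n";    // bad_words
echo checkInput('short'), "\n";                                           // too_short
```

One thing to be aware of: a short input with a bad word later in the string is reported as too_short, because the length alternative already matches at position 0.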
