Using a backreference with PHP preg_match_all

I'm quite new to regex and PHP, and I'm facing an issue I can't solve alone.
I've prepared this regex to find patterns starting with an upper-case letter. In words, it reads something like:
capture any pattern that
starts with one or more upper-case letters,
then has one or more letters or characters from the list,
then a space or punctuation mark,
and I use a backreference to say that I want this pattern up to 3 times:
([A-ZÁÀÂÄÉÈÊËÍÌÎÏÓÒÔÖÚÙÛÜ]{1,}[a-zàáâãäåçèéêëìíîïðòóôõöùúûüýÿ;:«0-9]{1,}[\s-….?,;]\1{1,3})
According to https://regex101.com/r/pB3nY7/2 it works as a JavaScript regex but not as a PHP regex.
I've read the other posts and made sure that:
I use single quotes instead of double quotes,
and I escaped the \ in my PHP script:
'#([A-ZÁÀÂÄÉÈÊËÍÌÎÏÓÒÔÖÚÙÛÜ]{1,}[a-zàáâãäåçèéêëìíîïðòóôõöùúûüýÿ;:«0-9]{1,}[\\s-….?,;]\\1{1,3})#'
But it still doesn't match any pattern starting with an upper-case letter.
Thank you in advance for any advice you may provide,
Regards,
Charles

I have tested this one on http://www.phpliveregex.com/:
(^[A-ZÁÀÂÄÉÈÊËÍÌÎÏÓÒÔÖÚÙÛÜ]{1,}[a-zàáâãäåçèéêëìíîïðòóôõöùúûüýÿ;:«0-9]{1,}[\s-….?,;]{1,1}){1,3}

To be more general, you could use Unicode properties:
^(\p{Lu}+[\p{Ll};:«0-9]+[\s\p{P}]){1,3}
Where \p{Lu} stands for an uppercase letter, \p{Ll} for a lowercase letter and \p{P} for a punctuation character. The u modifier makes PCRE treat the pattern and the subject as UTF-8.
preg_match('/^(\p{Lu}+[\p{Ll};:«0-9]+[\s\p{P}]){1,3}/u', $string, $match);
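The difference between the original attempt and the answers is worth spelling out: a backreference such as \1 matches the exact text that group 1 captured, while a quantifier on the group re-applies the sub-pattern. Here is a minimal sketch of that difference, assuming UTF-8 input; the sample sentence and the variable names are made up for illustration.

```php
<?php
// Hypothetical sample text, assumed to be UTF-8 encoded.
$text = 'Élodie Martin, habite à Paris.';

// \1 repeats the captured TEXT, so this only matches a doubled word.
$backref = '/(\p{Lu}\p{Ll}+ )\1/u';

// Quantifying the group repeats the PATTERN: one to three capitalised words.
$repeat = '/(?:\p{Lu}+[\p{Ll};:«0-9]+[\s\p{P}]){1,3}/u';

$doubled = preg_match($backref, $text);   // 0: no word is doubled in $text
preg_match_all($repeat, $text, $matches);

print_r($matches[0]);   // 'Élodie Martin,' and 'Paris.'
```

Since the patterns are in single quotes, \s and \1 reach PCRE unchanged, so doubling the backslashes is optional there.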

Related

RegEx expression to hit only words with a-z and no umlauts

Can you help me out with this one? I have a list of words like this:
sachbearbeiter/-in
referent/-in
anlagenführer/-in
it-projektleiter/-in
I want to select only:
sachbearbeiter/-in
referent/-in
This is my current regex: ([a-z]+)/-(in)
The problem is that it matches all of them, even the ones with - and with ü.
Thank you in advance.
You can use anchors to match the word you want:
^([a-z]+)/-(in)$
^---- Here ----^
Update: following your comment, if you want to accept umlauts you can use the Unicode flag with \w like this:
^(\w+)/-(in)$
You need to specify the beginning and end of the string so that the whole string has to match.
Change your regex to
^([a-z]+)/-(in)$
^ stands for the beginning of the string
$ stands for the end of the string
Your current regex, ([a-z]+)/-(in), does not escape the / character, and it also looks for substrings that match the pattern, so it will report each of them.
The regex should be ^([a-z]+)\/-(in), i.e. it should start with only lowercase letters, with the / escaped.
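To see the anchored pattern at work on the list from the question, here is a small sketch using preg_grep. The variable names are made up, and # is chosen as the delimiter so that the / does not need escaping.

```php
<?php
// Word list taken from the question above.
$words = [
    'sachbearbeiter/-in',
    'referent/-in',
    'anlagenführer/-in',
    'it-projektleiter/-in',
];

// Anchored pattern; '#' is the delimiter, so '/' needs no escaping.
$pattern = '#^([a-z]+)/-(in)$#';

// preg_grep keeps the entries that match; array_values reindexes them.
$selected = array_values(preg_grep($pattern, $words));

print_r($selected);   // sachbearbeiter/-in and referent/-in
```

The entries containing ü or an extra - drop out because ^ and $ force [a-z]+ to cover everything before the slash.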

Match 2 or more uppercase characters in entire string

I'm trying to create a pattern in PHP that matches 2 or more upper case characters in a string.
I've tried the following, but it only matches 2 or more upper case characters in a row, not the entire string:
preg_match('/[A-Z]{2,}/', $string);
For example, the string "aBcDe" or "Red Apple" should return true.
You just have to allow other characters between your uppercase letters:
^(?:.*?\p{Lu}){2}
I used \p{Lu} here to include Unicode characters as well. If you don't want that just use [A-Z] instead like you did in your pattern.
This simply means:
^ from the start of the string
(?: group:
.*? match anything, but as few chars as possible
\p{Lu} match an uppercase letter
){2} ... two times
If all you need to do is identify that a string contains at least 2 uppercase characters then you can use the following:
[A-Z].*?[A-Z]
If you need to identify the specific uppercase characters in the string then things get more complicated.
UPDATE: As Lucas mentioned, you need a different regex if you want unicode support.
\p{Lu}.*?\p{Lu}
^.*[A-Z].*[A-Z].*$
A simple pattern stating the same would do. See demo:
https://regex101.com/r/pT4tM5/23
[A-Z].*[A-Z]
is about as simple as it gets - match an uppercase followed by anything repeated any number of times followed by any other uppercase letter.
If you need to match the whole line/string that has at least 2 upper case letters, you can also use
^(?=(?:.*[A-Z]){2}).+$
Demo here.
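Putting the answers together, a rough sketch of both approaches could look like this: one helper tests for at least two uppercase letters anywhere in the string, the other simply counts them. Both function names are hypothetical, and UTF-8 input is assumed.

```php
<?php
// hasTwoUppercase() and countUppercase() are hypothetical helper names.

// True when $string contains at least two uppercase letters anywhere.
function hasTwoUppercase(string $string): bool
{
    // 's' lets '.' cross line breaks; 'u' treats the input as UTF-8.
    return preg_match('/^(?:.*?\p{Lu}){2}/su', $string) === 1;
}

// Alternative: count the uppercase letters directly.
function countUppercase(string $string): int
{
    return (int) preg_match_all('/\p{Lu}/u', $string);
}

var_dump(hasTwoUppercase('aBcDe'));      // bool(true)
var_dump(hasTwoUppercase('Red apple'));  // bool(false)
```

Counting is handy if the threshold may change later; the single regex is enough when you only need a yes/no answer.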

Test if a word is composed of alpha characters, white spaces and periods?

I need a regex that tests whether a word is composed of letters (alpha characters), white spaces, and periods (.). I need this for validating names that are entered in my database.
This is what I currently use:
preg_match('/^[\pL\s]+$/u',$foo)
It works fine for checking alpha characters and whitespace, but it rejects names with periods. I hope you can help, as I have no idea how to use regex.
Add a dot to the character class so that it also matches a literal dot.
preg_match('/^[\p{L}.\s]+$/u',$foo)
OR
preg_match('/^[\pL.\s]+$/u',$foo)
Explanation:
^ Asserts that we are at the start.
[\pL.\s]+ Matches any character in the list one or more times. \pL matches any kind of letter from any language.
$ Asserts that we are at the end.
The following regex should satisfy your condition:
preg_match('/^[a-zA-Z\s.]+$/',$foo)
You will find all the information you need to figure out regex with PHP in the PHP Regex Cheat Sheet.
Basically, if you want to allow the period, add \. to the character class:
preg_match('/^[\pL\s\.]+$/u',$foo)
Enjoy! :)
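For completeness, here is a minimal sketch that wraps the Unicode-aware pattern in a function. isValidName() is a hypothetical name, and the input is assumed to be UTF-8.

```php
<?php
// isValidName() is a hypothetical helper name used for illustration.
function isValidName(string $name): bool
{
    // \pL = any Unicode letter, \s = whitespace, '.' is literal in a class.
    return preg_match('/^[\pL\s.]+$/u', $name) === 1;
}

var_dump(isValidName('José P. Rizal'));  // bool(true)
var_dump(isValidName('John3'));          // bool(false)
```

Inside a character class the dot needs no escaping, so [\pL\s.] and [\pL\s\.] behave the same.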

Regex: Replace Characters In-between Two Characters

I'm having trouble using regex to replace strings that have a ? between two characters. Two examples of what I'd like the regex to match are:
• Replace thi?s question mark but not this one?
• ? Replace the lonely question mark
What's the best way to:
a) Match a character surrounded by other characters
b) Match a character that is on its own and has no characters before it or after it
I'm using PHP preg_match and MySQL REGEXP to perform these pattern matches. For MySQL I've tried:
SELECT description
FROM locations
WHERE description
REGEXP '/|([^?]+)\/'
For PHP I've tried:
preg_match('/|([^?]+)\/', $string);
I suggest this one for PHP:
(?<!\w(?=\? ))\?(?!\s*$)\s*
(?!\s*$) is a negative lookahead that will prevent a ? from matching if it is at the end of a sentence (I added whitespaces just in case).
(?<!\w(?=\? )) is a little more complex. It will prevent a match if the ? is preceded by a \w character (typically read as [a-zA-Z0-9_]) and followed by a space.
I don't know whether MySQL supports lookbehinds though.
|([^?]+)\
This is your current regex and I don't think your PHP code runs. The \ at the end is not escaping anything (in fact, it's trying to escape the delimiter) so... :s
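As a quick check of the lookaround pattern suggested above, here is a sketch that runs it through preg_replace on the two sample sentences from the question. Replacing with an empty string is only an assumption, since the question does not say what the ? should become.

```php
<?php
// Lookaround pattern from the answer above. Removing the matched ? is an
// assumption; the question does not state the intended replacement.
$pattern = '/(?<!\w(?=\? ))\?(?!\s*$)\s*/';

$samples = [
    'Replace thi?s question mark but not this one?',
    '? Replace the lonely question mark',
];

// preg_replace accepts an array of subjects and returns an array.
$cleaned = preg_replace($pattern, '', $samples);

foreach ($cleaned as $line) {
    echo $line, PHP_EOL;
}
```

The trailing ? in the first sentence is kept because (?!\s*$) refuses a match at the end of the string.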
Pattern
/(\w)\?(\w)/
PHP
<?php
// Remove a ? that sits between two word characters.
echo preg_replace('/(\w)\?(\w)/', '$1$2', 'Replace thi?s question mark but not this one?');
?>
Result
Replace this question mark but not this one?
Hope this helps!

Check a variable using regex

I'm about to create a registration form for my website. I need to check a variable and accept it only if it contains letters, numbers, _ or -.
How can I do it with regex? I used to work with them via preg_replace(), but I think that doesn't apply here. Also, I know that the "ereg" function is dead. Any solutions?
This regex is pretty common these days:
if(preg_match('/^[a-z0-9\-\_]+$/i',$username))
{
// Ok
}
Use preg_match:
preg_match('/^[\w-]+$/D', $str)
Here \w describes letters, digits and the _, so [\w-]+ matches one or more letters, digits, _, and -. ^ and $ are so-called anchors that denote the beginning and end of the string respectively. The D modifier ensures that $ matches only at the very end of the string, and not before a trailing line break.
Note that the letters and digits matched by \w depend on the current locale and might include letters or digits other than [a-zA-Z0-9]. So if you want just these, use them explicitly. And if you want to allow more than these, you could also try character classes that are described by Unicode character properties, like \p{L} for all Unicode letters.
Try preg_match(). http://php.net/manual/en/function.preg-match.php
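Combining the two answers, a minimal sketch of a username check might look like this. isValidUsername() is a hypothetical name, and the explicit ASCII ranges are one choice among those discussed above.

```php
<?php
// isValidUsername() is a hypothetical helper name used for illustration.
function isValidUsername(string $username): bool
{
    // Explicit ASCII ranges avoid locale-dependent \w; the D modifier
    // stops $ from matching before a trailing newline.
    return preg_match('/^[A-Za-z0-9_-]+$/D', $username) === 1;
}

var_dump(isValidUsername('john_doe-42'));  // bool(true)
var_dump(isValidUsername("john\n"));       // bool(false)
```

Without the D modifier the second call would return true, because $ is also allowed to match just before a final line break.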
