Regex in PHP using preg_match

I am trying to write a PHP script that accepts words matching the regex (a+b)(ab+ba)*.
I am using preg_match, and so far I've come up with this:
$subject = "a+b";
$pattern = '(([a]{1}\+[b]{1})?([ab]{1}\+[ba]{1}))';
preg_match($pattern, $subject, $matches);
print_r($matches);
I am not sure I completely understand how it works, and I have been trying to figure it out for a few hours now. How do I make it fulfill my condition?
I want to match (a+b)(ab+ba)*, where the first bracket (a+b) is required and the * on the second bracket (ab+ba) means that there can be zero or more instances of it.
So it should work like this:
$subject= "(a+b)"
Match
$subject= "(a+b)(ab+ba)"
Match
$subject= "(a+b)(ab+ba)(ab+ba)"
Match
$subject= "(ab+ba)"
No Match
$subject= ""
No Match

In (([a]{1}\+[b]{1})?([ab]{1}\+[ba]{1}))
[a]{1} (1 character in a character class) can be written as a
[b]{1} can be written as b
[ab] (2 characters in a character class) means a or b
I think you have to escape the opening parenthesis \(, or else you would start a capturing group.
If I am not mistaken, this might match what you are looking for:
^\(a\+b\)(?:\(ab\+ba\))*$
The second part is in a non-capturing group (?:\(ab\+ba\))* and is repeated zero or more times.
PHP test output
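A minimal sketch that checks the pattern above against the five subjects from the question. The helper name matchesGrammar is hypothetical, and the slash delimiters are an assumption, since the answer shows the pattern without delimiters.

```php
<?php
// Hypothetical helper: true when $subject is "(a+b)" followed by
// zero or more repetitions of "(ab+ba)".
function matchesGrammar(string $subject): bool
{
    // Parentheses and plus signs are escaped so they match literally.
    $pattern = '/^\(a\+b\)(?:\(ab\+ba\))*$/';
    return preg_match($pattern, $subject) === 1;
}

// The five subjects listed in the question.
$subjects = ['(a+b)', '(a+b)(ab+ba)', '(a+b)(ab+ba)(ab+ba)', '(ab+ba)', ''];
foreach ($subjects as $subject) {
    $verdict = matchesGrammar($subject) ? 'Match' : 'No Match';
    echo var_export($subject, true), ' => ', $verdict, PHP_EOL;
}
```

The first three subjects should report Match and the last two No Match, which is the behavior the question asks for.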

Related

PHP preg_match_all no character or not a certain character

Right now the test seems to be working for avoiding the characters that I don't want but it's only returning a count of 2. I know why, I just don't know how to address it. The problem is the last ? is being excluded because the actual match for the 2nd match is (?+ so it's not matching the 3rd since there is no "starting" character for that pattern, it would just be ?).
$pattern = "/([^\w\d'\"`]\?[^\w\d'\"`])/";
$subject = "`test` = ? and `other` = (?+?)";
$count = preg_match_all($pattern, $subject, $matches);
echo "Count: $count\n"; // echoes 2 instead of 3
Basically, I want to count all the parameters used, so I need to match every ? in $subject that is not surrounded by letters, numbers, quotes, or backticks.
This is the actual pattern that matters:
[^\w\d'\"'`]
Update:
For others, miken32's solution is to convert the above pattern to:
(?=[^\w\d'\"'`])
Try using a lookahead assertion:
$pattern = "/((?<=[^\w\d'\"`])\?(?=[^\w\d'\"`]))/";
It will look ahead without moving the search forward, so the character after one ? is still available as the character before the next one.
Edited to add the lookbehind assertion as well.
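As a rough sketch of the difference, the snippet below counts placeholders with the lookaround version. The helper name countPlaceholders is hypothetical, and \d is left out of the class on the assumption that \w already covers digits.

```php
<?php
// Hypothetical helper: counts "?" placeholders that are neither preceded nor
// followed by a word character, a quote, or a backtick.
function countPlaceholders(string $sql): int
{
    // Both assertions inspect a neighboring character without consuming it,
    // so adjacent placeholders such as "?+?" can each be matched.
    $pattern = '/(?<=[^\w\'"`])\?(?=[^\w\'"`])/';
    return (int) preg_match_all($pattern, $sql);
}

echo 'Count: ', countPlaceholders('`test` = ? and `other` = (?+?)'), PHP_EOL;
```

One caveat: because both assertions require an actual character, a ? at the very start or very end of the string is not counted by this pattern.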

Move multiple letters in string using regex

Using a regular expression I want to move two letters in a string.
W28
L36
W29-L32
Should be changed to:
28W
36L
29W-32L
The numbers vary between 25 and 44. The letters that need to be moved are always "W" and/or "L" and the "W" is always first when they both exist in the string.
I need to do this with a single regular expression using PHP. Any ideas would be awesome!
EDIT:
I'm new to regular expressions and tried a lot of things without success. The closest I came was using "/\b(W34)\b/" for each possibility. I also found something about using variables in the replace function but had no luck using these.
Your regex \b(W34)\b matches exactly W34 as a whole word. You need a character class to match W or L, some alternatives to match the numeric range, and capturing groups so that you can reorder the parts in the replacement.
You can use the following regex replacement:
$re = '/\b([WL])(2[5-9]|3[0-9]|4[0-4])\b/';
$str = "W28\nL36\nW29-L32";
$result = preg_replace($re, "$2$1", $str);
echo $result;
See IDEONE demo
Here, ([WL]) matches either W or L and captures it into group 1, and (2[5-9]|3[0-9]|4[0-4]) matches integers from 25 to 44 and captures them into group 2. Backreferences are used to reverse the order of the groups in the replacement string.
And here is a regex demo in case you want to adjust it later.
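To see how the range alternatives and the word boundaries behave on edge cases, here is a small sketch wrapping the same pattern. The function name moveLetterBehindNumber is hypothetical.

```php
<?php
// Hypothetical wrapper around the replacement shown above.
function moveLetterBehindNumber(string $text): string
{
    // Group 1: the letter W or L. Group 2: a number from 25 to 44.
    $pattern = '/\b([WL])(2[5-9]|3[0-9]|4[0-4])\b/';
    // Single quotes keep $2$1 as literal backreferences.
    // preg_replace returns null on error, so fall back to the input.
    return preg_replace($pattern, '$2$1', $text) ?? $text;
}

foreach (['W28', 'L36', 'W29-L32', 'W24', 'W45', 'W280'] as $sample) {
    echo $sample, ' => ', moveLetterBehindNumber($sample), PHP_EOL;
}
```

Values outside 25 to 44, and longer numbers such as 280, are left untouched because neither the alternatives nor the trailing \b can match them.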

Getting all URLs on multiple lines

I'm trying to get all these URLs from a website, but I only seem to be able to get the first URL. How can I match all the URLs?
So far I've tried
auto">(.*?)<\/pre>
and:
auto">(.*?)\s<\/pre>
I've tried adding several modifiers such as m and i, but it didn't seem to help.
This is what I'm searching:
auto">http://url-one.com
http://url-two.com
http://url-three.com
http://url-four.com
http://url-five.com</pre>
Can someone help me understand what I am missing?
Quick Answer
As Jonny5 hinted in his comment, . does not match newline characters by default: so (.*?) will not match beyond the first line without the s regex modifier, and his suggestion is then the quick answer:
/auto">(.*?)<\/pre>/s
You can check out his Regex101 demo or related PHP code...
$re = "/auto\">(.*?)<\\/pre>/s";
$str = "auto\">http://url-one.com\nhttp://url-two.com\nhttp://url-three.com\nhttp://url-four.com\nhttp://url-five.com</pre>";
preg_match($re, $str, $matches);
...for reference.
Digging Deeper
However, there is a little more going on here.
i and m Modifiers
First, regardless of whether you use the i or m modifier(s), no line of the sample text would match with auto"> at the beginning and <\/pre> at the end of the pattern. You would have to group each one and follow it with a quantifier to make it optional (e.g. (?:auto">)? and (?:<\/pre>)?) to match each line of the sample text.
m Requires Matching Globally
Second, the m modifier would necessitate matching globally – and further tweaks to the pattern to avoid the last URL match ending with </pre>:
/(?:auto">)?(.+)(?=(?:\n|<\/pre>))/m
You can also check out a second Regex101 demo of this twist or try it out in PHP:
$re = "/(?:auto\">)?(.+)(?=(?:\\n|<\\/pre>))/m";
$str = "auto\">http://url-one.com\nhttp://url-two.com\nhttp://url-three.com\nhttp://url-four.com\nhttp://url-five.com</pre>";
preg_match_all($re, $str, $matches); // NOTE: preg_match_all to match globally
Which Approach to Choose
The choice between simply adding the s modifier or tweaking the pattern, adding the m modifier, and matching globally mostly comes down to whether you want a single match with all the URLs (separated by newlines) or many matches, each with one of the URLs.
The latter yields the matches below...
MATCH 1
1. [6-24] `http://url-one.com`
MATCH 2
1. [25-43] `http://url-two.com`
MATCH 3
1. [44-64] `http://url-three.com`
MATCH 4
1. [65-84] `http://url-four.com`
MATCH 5
1. [85-104] `http://url-five.com`
...versus the single match that the original pattern and the s modifier yield:
MATCH 1
1. [6-104] `http://url-one.com
http://url-two.com
http://url-three.com
http://url-four.com
http://url-five.com`
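The two result shapes can be compared side by side with a short sketch like the one below. The variable names are hypothetical; the patterns and sample text are the ones used in this answer.

```php
<?php
$html = "auto\">http://url-one.com\nhttp://url-two.com\nhttp://url-three.com\n"
      . "http://url-four.com\nhttp://url-five.com</pre>";

// Approach 1: the s modifier lets the dot cross newlines,
// so group 1 holds all five URLs in one string.
preg_match('/auto">(.*?)<\/pre>/s', $html, $single);
$block = $single[1];

// Approach 2: preg_match_all returns one match per line,
// so group 1 is an array with one URL per element.
preg_match_all('/(?:auto">)?(.+)(?=(?:\n|<\/pre>))/m', $html, $perLine);
$urls = $perLine[1];

echo count($urls), ' URLs found', PHP_EOL;
print_r($urls);
```

If you go with the single-match approach but still need a list, splitting $block on newlines afterwards (for example with explode) should give the same array.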

Regex to match everything between the first and last occurrence of two distinct characters

I have the following code:
$str_val = "L(ine 1(
L(ine 2)
Line 3
Line 4)";
$regex = '/\(([^\)]*?)\)/i';
preg_match($regex, $str_val, $matches_arr);
print_r($matches_arr);
This code matches everything between the first ( and the first ).
I'm looking for what I would put in $regex that would match everything between the first ( and the last ).
I'd appreciate the assistance.
Thanks in advance.
You can use this:
'/\((.*)\)/s'
The s modifier makes the dot metacharacter match every character, including newlines. And since .* is a greedy quantifier, it matches the longest string possible, so the match extends to the last ).
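Note that a negated character class does not help here:
\([^\)]*\)
Its first match ends at the first ), because [^\)]* cannot cross a closing parenthesis, so it never reaches the last ).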
Just do a greedy search
$regex = '/\(.*\)/s';
If you really want to have everything between (...) use this one
$regex = '/\((.*)\)/s';
preg_match($regex, $str_val, $matches_arr);
echo $matches_arr[1];
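A short sketch contrasting the greedy and lazy quantifiers on the string from the question. The variable names $greedy and $lazy are hypothetical, and the string is written with \n escapes on the assumption that the original used plain line feeds.

```php
<?php
$str_val = "L(ine 1(\nL(ine 2)\nLine 3\nLine 4)";

// Greedy: .* runs to the end of the string, then backtracks to the last ")".
preg_match('/\((.*)\)/s', $str_val, $greedy);

// Lazy: .*? stops at the first ")" it can reach.
preg_match('/\((.*?)\)/s', $str_val, $lazy);

echo "Greedy capture:\n", $greedy[1], "\n\n";
echo "Lazy capture:\n", $lazy[1], "\n";
```

The greedy capture spans from the first ( to the last ), while the lazy one ends at the first ), which is the behavior the original pattern already had.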

PHP regex for #[mention]

Can someone help me:
$pattern = "/^(?:[a-zA-Z0-9?. ]?)+#([a-zA-Z0-9]+)(.+)?$/";
$str = "Hey #[14256] hey how are you?";
preg_match($pattern, $str, $matches);
print_r($matches);
The result prints fine if I remove the brackets from the mention (#[14256]), but I can't figure out how to make the regex work with the brackets, so that I get 14256 in my array.
You need to include the brackets in your regex:
"/^(?:[a-zA-Z0-9?. ]?)+#(\\[?[a-zA-Z0-9]+\\]?)(.+)?$/"
Notice the \\[? and \\]? I've added; those will match the [] characters, and will also match if there is no [].
Keep in mind that the above will also match #[14256 and #14256]. If you want to match only one form or the other, you need to do it a little differently.
"/^(?:[a-zA-Z0-9?. ]?)+#([a-zA-Z0-9]+|\\[[a-zA-Z0-9]+\\])(.+)?$/"
This will match EITHER #aA1 or #[aA1], but not the bad examples as I showed above.
One last thing to include: This regex will only match one instance of the #[mention]. If you want to match ALL instances of it (such as in "hey #123, how is #456 these days?"), use the following with preg_match_all():
"/#([a-zA-Z0-9]+|\\[[a-zA-Z0-9]+\\])/"
Then $matches[1] will contain both 123 and 456.
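A possible sketch of the preg_match_all variant. The helper name extractMentionIds is hypothetical, and stripping the brackets with trim is an assumption about what the caller wants, since the pattern itself captures them as part of group 1.

```php
<?php
// Hypothetical helper: returns every mention ID in $text, without brackets.
function extractMentionIds(string $text): array
{
    // Group 1 is either a bare ID or an ID wrapped in [ and ].
    preg_match_all('/#([a-zA-Z0-9]+|\[[a-zA-Z0-9]+\])/', $text, $matches);

    // Bracketed mentions are captured with their brackets, so strip them.
    return array_map(function (string $id): string {
        return trim($id, '[]');
    }, $matches[1]);
}

print_r(extractMentionIds('hey #123, how is #[456] these days?'));
```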
You need to escape the brackets in your regex so they don't get interpreted as a new character class. Try this instead (it will only capture the number, not the brackets. Place the escaped brackets in the parentheses to capture them as part of a backreference):
$pattern = "/^(?:[a-zA-Z0-9?. ]?)+#\[([a-zA-Z0-9]+)\](.+)?$/";
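For reference, a minimal sketch running that pattern against the string from the question. The $id variable is hypothetical, and the null fallback for a failed match is an assumption.

```php
<?php
// Escaped brackets sit outside the capturing group, so group 1 holds only the ID.
$pattern = '/^(?:[a-zA-Z0-9?. ]?)+#\[([a-zA-Z0-9]+)\](.+)?$/';
$str = 'Hey #[14256] hey how are you?';

// preg_match returns 1 on a match, 0 on no match, and false on error.
$id = preg_match($pattern, $str, $matches) === 1 ? $matches[1] : null;

var_dump($id);
```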
