php preg_match_all and recursive regexp - php

I have some trouble with a generated file, and I would like to make some substitutions.
Say I have this pattern:
<ul/><htmlelement>some text</htmlelement>
I want my regexp to find the value of "some text". Since I can find the element htmlelement with a regexp, I want to reuse that match inside the same regex, like
preg_match_all("#<ul/><([^><])>(.)*</(first capturing match)>#", $string, $matches);
Do you have a solution?

You are missing the + quantifier for the "htmlelement" opening tag.
You need the * inside the capture group,
and it is better to make it non-greedy with ?.
Refer to the "first capturing match" with \1.
So the regex should be:
<ul\/><([^><]+)>(.*?)<\/\1>
             ^    ^^    ^
             1    23    4
Demo: https://regex101.com/r/f25N9J/1
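A minimal sketch of how the corrected pattern could be used with preg_match_all, assuming the sample input from the question. The "#" delimiter is kept from the question, so the slashes do not need escaping; the variable names are only illustrative.

```php
<?php
// Sample input taken from the question.
$string = '<ul/><htmlelement>some text</htmlelement>';

// "#" is the delimiter, so "/" needs no escaping here.
// \1 must repeat whatever tag name group 1 captured.
$pattern = '#<ul/><([^><]+)>(.*?)</\1>#';

$count = preg_match_all($pattern, $string, $matches);

echo $matches[1][0], "\n"; // tag name captured by group 1
echo $matches[2][0], "\n"; // text between the opening and closing tag
```

With the default PREG_PATTERN_ORDER, $matches[1] lists the tag names and $matches[2] lists the enclosed texts, one entry per match.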

Related

PHP preg_replace_callback match string but exclude urls

What I'm trying to do is find all the matches within a content block, but ignore anything that is inside tags, for use inside preg_replace_callback().
For example:
test
test title
test
In this case, I want the first line to match, and the third line to match, but NOT the url match, nor the title match in between the a tags.
I've got a regex that I feel like is close:
#(?!<.*?)(\btest\b)(?![^<>]*?>)#si
(and this will not match the url part)
But how do I modify the regex to also exclude the "test" between a and /a?
If it's always the same pattern you can use [A-Z] or a combination like [A-Za-z]
I ended up solving it myself. This regex pattern will do what I wanted:
#(?!<a[^>]*?>)(\btest\b)(?![^<]*?<\/a>)#si
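A rough sketch of how that pattern might behave inside preg_replace_callback. The sample content is an assumption (the markup from the question was not preserved), and the callback simply upper-cases each match so the effect is visible.

```php
<?php
// Hypothetical sample content; the question's original markup is unknown.
$content = 'test <a href="http://test.com" title="test">test title</a> test';

// Pattern from the self-answer above.
$pattern = '#(?!<a[^>]*?>)(\btest\b)(?![^<]*?<\/a>)#si';

$result = preg_replace_callback(
    $pattern,
    function (array $m): string {
        // Upper-case the matched word so replaced occurrences stand out.
        return strtoupper($m[1]);
    },
    $content
);

echo $result, "\n";
```

Note that the second lookahead only rejects a "test" that is followed by a closing </a> with no other "<" in between, so this works for simple, non-nested links; for arbitrary HTML a DOM parser is likely the safer choice.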

How to write such url pattern?

I need URL pattern for my router which would match with:
/page_name.html
/page_name.html/1
/page_name.html/2
....
/page_name.html/999
preg_match() must put page_name into matches[1] and the digits after the slash into matches[2] (or an empty string; index [2] must always be present!).
My pattern must not match these:
/page_name.html/
/page_name.html131
I wrote this:
/^\/([\w\-]+)\.html[\/]?([\d]{1,3})?$/
But it matches URLs like /page_name.html123 and doesn't put anything into matches[2] if there is no digit.
You can use this regex:
preg_match('~^/([\w-]+)\.html(?|/(\d{1,3})|())$~', $input, $matches);
(?|...) - Subpatterns declared within each alternative of this construct will start over from the same index. This is to make sure to always populate $matches[2] with something, even an empty string.
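As a sketch of the branch reset group in action, the pattern can be wrapped in a small helper and run against the URLs from the question. The function name matchPageUrl is hypothetical and only serves the illustration.

```php
<?php
// Hypothetical helper; returns [page name, page number or ''] or null.
function matchPageUrl(string $path): ?array
{
    // Both alternatives inside (?|...) define group 2, so $matches[2]
    // is always set: the digits, or an empty string from the empty ().
    $pattern = '~^/([\w-]+)\.html(?|/(\d{1,3})|())$~';
    if (preg_match($pattern, $path, $matches) !== 1) {
        return null;
    }
    return [$matches[1], $matches[2]];
}

var_dump(matchPageUrl('/page_name.html'));     // name plus empty string
var_dump(matchPageUrl('/page_name.html/999')); // name plus "999"
var_dump(matchPageUrl('/page_name.html/'));    // no match
var_dump(matchPageUrl('/page_name.html131'));  // no match
```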

preg_replace with Regex - find number-sequence in URL

I'm a regex newbie, so sorry for this "simple" question:
I've got a URL like the following:
http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx
What I'm trying to achieve is getting the number sequence (aka Job-ID) right before the ".aspx" with preg_replace.
I've already figured out that the regex for finding it could be
(?!.*-).*(?=\.)
Now preg_replace needs the opposite of that regular expression. How can I achieve that? Also worth mentioning:
The URL can have multiple numbers in it. I only need the sequence right before ".aspx". Also, there could be some parameters after the ".aspx", like "&mobile=true".
Thank you for your answers!
You can use:
$re = '/[^-.]+(?=\.aspx)/i';
preg_match($re, $input, $matches);
//=> 146370543
This matches a run of characters that are neither a hyphen nor a dot and that is immediately followed by .aspx, which is checked with the lookahead (?=\.aspx).
You can just use preg_match (you don't need preg_replace, as you don't want to change the original string) and capture the number before the .aspx. Assuming .aspx is always at the end of the string, the simplest way I can think of is:
<?php
$string = "http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx";
$regex = '/([0-9]+)\.aspx$/';
preg_match($regex, $string, $results);
print $results[1];
?>
A short explanation:
$results contains an array of results. The first element holds the text matched by the complete regex, so it would be 146370543.aspx in this example. The second element contains the group captured by the parentheses around [0-9]+.
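The question mentions that parameters such as "&mobile=true" may follow ".aspx", in which case a pattern anchored with $ would not match. A minimal sketch of one way to cover both cases, assuming the Job-ID is always preceded by a hyphen; the helper name extractJobId is hypothetical.

```php
<?php
// Hypothetical helper: returns the digits right before ".aspx", or null.
function extractJobId(string $url): ?string
{
    // The lookahead accepts either the end of the string or the start of
    // trailing parameters ("?", "&" or "#") directly after ".aspx".
    if (preg_match('/-(\d+)\.aspx(?=$|[?&#])/i', $url, $m) === 1) {
        return $m[1];
    }
    return null;
}

$url = 'http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx';

echo extractJobId($url), "\n";
echo extractJobId($url . '&mobile=true'), "\n";
```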
You can get the opposite by using this regex:
(\D*)\d+(.*)
Working demo
MATCH 1
1. [0-100] `http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-`
2. [109-114] `.aspx`
If you just want the number from that URL, you can also use this regex:
(\d+)
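For the "opposite" case, here is a sketch of how (\D*)\d+(.*) could be combined with preg_replace to drop the number and keep the rest. It relies on the assumption that the Job-ID is the first run of digits in the URL, which holds for the sample URL but not for URLs with earlier numbers.

```php
<?php
$url = 'http://stellenanzeige.monster.de/COST-ENGINEER-AUTOMOTIVE-m-w-Job-Mainz-Rheinland-Pfalz-Deutschland-146370543.aspx';

// Group 1 is everything before the first digits, group 2 everything after.
// Replacing the whole match with "$1$2" removes only the digits.
$withoutId = preg_replace('/(\D*)\d+(.*)/', '$1$2', $url);

echo $withoutId, "\n";
```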

preg_match to select a string like #__james_name

I need a regular expression to select some text like #__james_name in PHP.
I tried with:
(^#__[a-z]*)*
But I did not succeed.
Please help.
UPDATE
I tried with:
\#__([a-z]*)_([a-z]*)
How do I use this in preg_match?
Your grouping is a bit wrong, try
^#_(_[a-z]+)*
see it here on Regexr.
^ is the anchor to the start of the string, you don't want to repeat that. I replaced also the * with a + inside the group, so it requires at least one letter.
Now the string has to start with "#_" and then there can be 0 or more parts starting with an underscore followed by one or more (lowercase) letters.
This regex will match:
#_
#__a
#__a_b
#__a_b_ccccc_d_efadsfaksdjh
preg_match('/(^#__[a-z_]*)/', '#__james_name', $matches);
Do it like this:
$str=preg_replace('/^#__([\w]+)/', '$1', $str);
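To address the "how do I use this in preg_match" part, here is a small sketch using the pattern from the question's update. The "/" delimiters and the variable names are choices made for this illustration.

```php
<?php
$subject = '#__james_name';

// With "/" as delimiter the "#" is an ordinary character.
// Group 1 and group 2 capture the two lowercase parts around the "_".
$found = preg_match('/#__([a-z]*)_([a-z]*)/', $subject, $matches);

if ($found === 1) {
    echo $matches[0], "\n"; // the whole match
    echo $matches[1], "\n"; // first part
    echo $matches[2], "\n"; // second part
}
```

preg_match returns 1 on a match, 0 on no match and false on error, which is why the result is compared with === 1 before $matches is read.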

How can I use a match in the same regex in PHP?

I have this string (that is a serialized variable in php):
s:12:"hello "world";
and I want to find "hello "world" using only a regex. I tried this, but it seems it is stupid :P
(s:(?P<num>[0-9]+):".{\k{num}}";)
I only want to know how I can use the "num" result inside the same regex.
This regex is used inside a bigger regex, so I can't check for the end of the string.
Thanks in advance!
You can use your named capturing groups as backreferences like this:
Back references to the named subpatterns can be achieved by (?P=name)
or, since PHP 5.2.2, also by \k<name> or \k'name'. Additionally PHP
5.2.4 added support for \k{name} and \g{name}.
(According to php.net.)
But I think this can only be used to match the found text again, not as a number in a quantifier. (At least I didn't get it to work.)
You can use preg_match function, which will populate an array of matches:
If matches is provided, then it is filled with the results of search. $matches[0] will contain the text that matched the full pattern, $matches[1] will have the text that matched the first captured parenthesized subpattern, and so on.
More information about preg_match: PHP: preg_match
$text = 's:12:"hello "world";s:12:"good bue world";';
$pattern = "(.*:[0-9]+:\"(.*)\";.*)U";
preg_match_all($pattern,$text,$r);
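Since a captured number cannot be used as a quantifier inside the same regex, one possible workaround is a two-step approach: let the regex find the length prefix, then cut the value with substr. This is only a sketch under the assumption that the length counts bytes, as in PHP's serialize format; the function name readSerializedString is hypothetical.

```php
<?php
// Hypothetical helper: reads one s:<len>:"<value>"; entry starting the
// search at $offset, or returns null if the entry is not well-formed.
function readSerializedString(string $data, int $offset = 0): ?string
{
    // Step 1: the regex only locates the prefix and captures the length.
    if (!preg_match('/s:(\d+):"/', $data, $m, PREG_OFFSET_CAPTURE, $offset)) {
        return null;
    }
    $length = (int) $m[1][0];
    $start = $m[0][1] + strlen($m[0][0]); // first byte after the opening quote

    // Step 2: take exactly $length bytes, whatever they contain.
    $value = substr($data, $start, $length);

    // The value must be complete and followed by the closing '";'.
    if (strlen($value) !== $length || substr($data, $start + $length, 2) !== '";') {
        return null;
    }
    return $value;
}

echo readSerializedString('s:12:"hello "world";'), "\n";
```

Because the length decides where the value ends, embedded double quotes do not confuse the parser the way they would with a purely regex-based match.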
