preg_match_all not ignoring characters after pattern [duplicate] - php

If my string is
<?php $str = 'hello world!'; function func_hello($str) { echo $str; }
I want to find the name of any functions in the string
I'm using the code
preg_match_all('%func_.* ?\(%', $c, $matches);
This is a basic example of what I'm doing. In the real world I'm getting results like this
func_check_error($ajax_action_check, array(
func_post('folder') == '/' || func_post(
func_check_error($fvar_ajax_action_check, array(
Whereas I want the result to be
func_check_error(
func_post(
func_check_error(
I've tried \b to set a boundary but it's not working. i.e.
preg_match_all('%\bfunc_.* ?\(\b%', $c, $matches);

The .* is greedy: it runs past the first opening parenthesis (the one right after the function name) and only stops at the last ( on the line, such as the one after array, because that parenthesis also satisfies the \( in your pattern.
Put a more restrictive condition on the function name, such as word characters only or anything but a parenthesis. For example, replace func_.* with func_[^(]*, which stops at the first opening parenthesis.
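A minimal sketch of that suggestion, run against sample input modelled on the lines quoted in the question (the $c contents are an assumption, since the real file is not shown):

```php
<?php
// Sample input modelled on the lines quoted in the question.
$c = "func_check_error(\$ajax_action_check, array(\n"
   . "func_post('folder') == '/' || func_post(\n";

// [^(]* cannot step over a parenthesis, so each match ends at the first "(".
preg_match_all('%func_[^(]*\(%', $c, $matches);

print_r($matches[0]);
// Prints an array holding: func_check_error(  func_post(  func_post(
```

If a func_ name can appear without a following parenthesis, a stricter class such as func_\w*\( may be the safer choice, because [^(]* will happily cross spaces and line breaks until it finds one.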

A simple regex with a lazy quantifier should work fine:
~func_.*?\(~
If this is not giving the results you expect, it may be due to another issue with your code and how you're using the regex.


PHP preg_replace a pattern multiple times in the same string [duplicate]

I have a text string in which I'd like to replace a pattern that occurs multiple times with a tag, like this:
<?php
$str = 'A *word* is a *word*, and this is another *word*.';
preg_replace('/\*([^$]+)\*/', '<b>$1</b>', $str);
?>
However, the function is replacing the whole range from the first asterisk to the last asterisk, i.e. it doesn't separately enclose each pattern with the tag, which is what I'm trying to accomplish.
Try making the pattern lazy:
$str = 'A *word* is a *word*, and this is another *word*.';
$out = preg_replace('/\*([^$]+?)\*/', '<b>$1</b>', $str); // note the added ? after +
echo $out;
This prints:
A <b>word</b> is a <b>word</b>, and this is another <b>word</b>.
The updated pattern reads as follows:
\* matches a literal *
([^$]+?) then matches one or more characters other than $, as few as possible,
\* until it reaches the first closing *
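As an alternative to the lazy quantifier, a negated character class that excludes the asterisk itself gives the same result here. This is a sketch of a different technique from the one in the answer, shown next to the original greedy pattern for comparison:

```php
<?php
$str = 'A *word* is a *word*, and this is another *word*.';

// Greedy: [^$]+ runs to the last asterisk, so there is one big match.
$greedy = preg_replace('/\*([^$]+)\*/', '<b>$1</b>', $str);

// Negated class: [^*]+ can never step over an asterisk.
$negated = preg_replace('/\*([^*]+)\*/', '<b>$1</b>', $str);

echo $greedy, "\n";  // A <b>word* is a *word*, and this is another *word</b>.
echo $negated, "\n"; // A <b>word</b> is a <b>word</b>, and this is another <b>word</b>.
```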

Regex not matching when used with PHP [duplicate]

Given a fairly simple regex, I'd like to match the text between two delimiters:
___MANUAL_TICKET___
###_CLIENT_START_###
TEST
###_CLIENT_END_###
###_PROBLEM_START_###
TEST2
###_PROBLEM_END_###
###_EMAIL_START_###
xyz#test.com
###_EMAIL_END_###
To get the client I am using this regex:
###_CLIENT_START_###\s(.*?)\s###_CLIENT_END_###
which works when tested outside PHP.
But when I use it in my PHP Code it does not find any matches:
preg_match('####_CLIENT_START_###\s(.*?)\s###_CLIENT_END_####', $source, $matches);
(tried different regex delimiter such as / and ~, same result)
What am I doing wrong?
Note that by default the dot (.) matches any character except a line feed (see the documentation).
Since your input spans multiple lines, you need the PCRE_DOTALL option, which you enable by adding the s modifier after the closing delimiter. Also pick a delimiter that does not occur in the pattern: with # as the delimiter, the ### sequences inside the pattern end it early.
preg_match('~###_CLIENT_START_###\s(.*?)\s###_CLIENT_END_###~s', $source, $matches);
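A small sketch of the difference the s modifier makes. It assumes the ticket text uses Windows (CRLF) line endings, which is one plausible reason the pattern fails in PHP; the question does not say which line endings the real input has. The ~ delimiter is used because the pattern itself contains #.

```php
<?php
// Assumed input: part of the ticket text with CRLF line endings.
$source = "###_CLIENT_START_###\r\nTEST\r\n###_CLIENT_END_###\r\n";

$pattern = '~###_CLIENT_START_###\s(.*?)\s###_CLIENT_END_###~';

// Without s, "." stops at "\n", so the group cannot get past the line feed.
$without = preg_match($pattern, $source, $m1);

// With s (PCRE_DOTALL), "." also matches line feeds.
$with = preg_match($pattern . 's', $source, $m2);

var_dump($without);     // int(0)
var_dump(trim($m2[1])); // string(4) "TEST"
```

The trim() call removes the stray "\n" and "\r" that end up inside the captured group when each \s consumes only one half of a CRLF pair.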

Parsing html string in php using regular expression [duplicate]

I want to parse an HTML string using PHP (simple number matching).
<i>1002</i><i>999</i><i>344</i><i>663</i>
and I want the result as an array, e.g. [1002,999,344,663,...]
I tried like this :
<?php
$html="<i>1002</i><i>999</i><i>344</i><i>663</i>";
if(preg_match_all("/<i>[0-9]*<\/i>/",$html, $matches,PREG_SET_ORDER))
foreach($matches as $match) {
echo strip_tags($match[0])."<br/>";
}
?>
and I got the exact output which I want.
1002
999
344
663
But when I run the same code with a small change to the regular expression, I get a different result.
Like this:
<?php
$html="<i>1002</i><i>999</i><i>344</i><i>663</i>";
if(preg_match_all("/<i>.*<\/i>/",$html, $matches,PREG_SET_ORDER))
foreach($matches as $match) {
echo strip_tags($match[0])."<br/>";
}
?>
Output :
1002999344663
(The regular expression matched the entire string.)
Why does this happen?
What is the difference between using .* (zero or more of any character) and [0-9]*?
The .* in your regex matches any character, whereas [0-9]* only matches digits, and </i><i> is not a digit. The regex /<i>.*<\/i>/ matches:
<i>1002</i><i>999</i><i>344</i><i>663</i>
^ from here ------------------- to here ^
because the whole string starts with <i> and ends with </i>.
This happens because * is greedy: it takes the maximum number of characters it can match.
To fix your problem, use .*? instead, which takes the minimum number of characters it can match.
The regex /<i>.*?<\/i>/ will work as you want.
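To get the numbers as an array directly, one option is to add a capturing group to the lazy pattern, which makes strip_tags() unnecessary. A minimal sketch (the $numbers variable is introduced here for illustration):

```php
<?php
$html = '<i>1002</i><i>999</i><i>344</i><i>663</i>';

// Lazy .*? stops at the first </i>; the parentheses capture the text between the tags.
preg_match_all('~<i>(.*?)</i>~', $html, $matches);

// $matches[1] holds the captured strings; convert them to integers.
$numbers = array_map('intval', $matches[1]);

print_r($numbers); // 1002, 999, 344, 663
```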

PHP preg_match (.*) not matching past line breaks [duplicate]

I have this data in a LONGTEXT column (so the line breaks are retained):
Paragraph one
Paragraph two
Paragraph three
Paragraph four
I'm trying to match paragraph 1 through 3. I'm using this code:
preg_match('/Para(.*)three/', $row['file'], $m);
This returns nothing. If I try to work just within the first line of the paragraph, by matching:
preg_match('/Para(.*)one/', $row['file'], $m);
Then the code works and I get the proper string returned. What am I doing wrong here?
Use the s modifier.
preg_match('/Para(.*)three/s', $row['file'], $m);
See Pattern Modifiers in the PHP manual.
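Note that the multi-line modifier (m) does not help here. It only makes ^ and $ match at line boundaries and does not let . cross line breaks, so this behaves the same as the original pattern:
preg_match('/Para(.*)three/m', $row['file'], $m);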
Try setting the regex to dot-all (PCRE_DOTALL) so that . includes line breaks (the s modifier after the closing delimiter):
preg_match('/Para(.*)three/s', $row['file'], $m);
If you don't like the / at the start and end, use T-Regx:
$m = Pattern::of('Para(.*)three')->match($row['file'])->first();
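The effect of the modifiers can be compared side by side. This sketch assumes the column holds the four lines separated by plain "\n"; on that sample the unmodified pattern still finds a match, but only inside the third line, because . cannot cross a line break.

```php
<?php
// Assumed column contents, with "\n" line breaks.
$text = "Paragraph one\nParagraph two\nParagraph three\nParagraph four";

preg_match('/Para(.*)three/', $text, $plain);   // no modifier
preg_match('/Para(.*)three/m', $text, $multi);  // m only affects ^ and $
preg_match('/Para(.*)three/s', $text, $dotall); // s lets . match "\n"

echo $plain[0], "\n";  // Paragraph three
echo $multi[0], "\n";  // Paragraph three
echo $dotall[0], "\n"; // all three paragraphs, from "Paragraph one" to "three"
```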

How to convert this deprecated ereg() using pointer to preg_match() [duplicate]

Using php 5.3 - ereg() deprecated...
I'm trying to convert this function (to preg_match), but I don't understand the "pointer"...
function gethostbyaddr_new($ip)
{
$output = `host -W 1 $ip`;
if (ereg('.*pointer ([A-Za-z0-9.-]+)\..*', $output, $regs))
{
return $regs[1];
}
return $ip;
}
pointer is just a bit of literal text to be matched.
When I run host -W 1, I get
4.4.8.8.in-addr.arpa domain name pointer google-public-dns-b.google.com.
So you can use:
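function gethostbyaddr_new($ip)
{
    // Escape the argument so a crafted $ip cannot inject shell commands.
    $output = (string) shell_exec('host -W 1 ' . escapeshellarg($ip));
    // Capture the host name that follows "pointer", minus the trailing dot.
    if (preg_match('/.*pointer ([A-Za-z0-9.-]+)\..*/', $output, $regs))
    {
        return $regs[1];
    }
    return $ip; // fall back to the address when no pointer line is found
}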
The first parameter of ereg is a regular expression. So .*pointer matches anything (.*), then the word "pointer", and then the rest of the expression follows.
Not much to it really. All you need to do is add a marker character to the start and end of the regex string.
Typically a marker character would be a slash (/), but it can be others (tilde ~ is used quite commonly and would work well for you here), as long as it's the same character at the start and end of the string, and doesn't appear within the string (you'd need to escape it with a backslash if it does).
So your code could look like this:
preg_match('~.*pointer ([A-Za-z0-9.-]+)\..*~', $output, $regs)
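Note that if you use a slash as your delimiter, you will need to escape any slash inside the pattern with a backslash. Backslash is also the escape character in PHP strings, so in a double-quoted string you would need to double it up (\\.); the single-quoted strings used here pass \. through unchanged.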
In terms of explaining the actual expression:
.* - this is any number of any characters at the start of the string (you could actually leave this off this expression; it won't affect the matching)
pointer - this is looking for the actual word 'pointer' in the string being matched.
([A-Za-z0-9.-]+) - looks for one or more characters which are alphanumeric or dot or hyphen. In addition, because these are enclosed in parentheses, they become a capturing group, which means that the result of this part of the search ends up in $regs[1].
\..* - looks for a dot character, followed by any number of any characters. As with the beginning of the match, the .* can be dropped as it won't affect the matching.
So the whole expression is looking for a string which looks something like this:
blahblahblahpointer blah123-.blah.blahblahblah
and from that, you will get blah123-.blah in $regs[1].
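The same pattern can be checked against the sample host output quoted earlier without running any shell command. This sketch drops the leading and trailing .* since, as noted, they do not affect the match:

```php
<?php
// Sample line from the host output quoted above; no command is executed here.
$output = '4.4.8.8.in-addr.arpa domain name pointer google-public-dns-b.google.com.';

// The greedy class first swallows the trailing dot, then backtracks one
// character so that \. can match it, leaving the bare host name in group 1.
if (preg_match('~pointer ([A-Za-z0-9.-]+)\.~', $output, $regs)) {
    echo $regs[1], "\n"; // google-public-dns-b.google.com
}
```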
