php preg_match_all numbers in parentheses - php

I am trying to load a remote website and get all numbers that are inside of parentheses. But what ends up happening is it only matches the last value.
Is my regex wrong? Am I using the correct flags?
I have added the example of what it should match on in the second $html variable.
//$html = file_get_contents("http://example.com/test.html");
$html = "(1234) (12) (1) \r\n (1346326)";
preg_match_all("^[(\d)]+$^", $html, $matches, PREG_PATTERN_ORDER);
print_r($matches);
echo "<br>";
foreach ($matches as $val) {
echo "matched: " . $val[0] . "\n";
}
Thanks.

How about:
preg_match_all("/\((\d+)\)/", $html, $matches, PREG_PATTERN_ORDER);
print_r($matches[1]);

I see two possible issues.
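First, the ^ characters at both ends of "^[(\d)]+$^" are treated as the pattern delimiters, not as anchors, so the effective pattern is [(\d)]+$. The $ ties the match to the end of the string, which is why only the last value is found.
Second, PHP has no /g modifier: preg_match_all() already returns every match. The /s modifier only changes what . matches, so it is not needed here either. A pattern such as the following picks up every run of digits (note that it also matches numbers that are not inside parentheses):
preg_match_all("/\b(\d+)\b/" ...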
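As a minimal sketch, the snippet below runs both the original pattern and a pattern with escaped parentheses against the sample string from the question. The variable names $anchored and $all are only illustrative.

```php
<?php
$html = "(1234) (12) (1) \r\n (1346326)";

// In "^[(\d)]+$^" the two ^ characters act as delimiters, so the
// effective pattern is [(\d)]+$ and it is anchored to the end.
preg_match_all("^[(\d)]+$^", $html, $anchored);

// Literal parentheses are escaped; the inner group captures only digits.
// preg_match_all() is already global, so no extra modifier is needed.
preg_match_all('/\((\d+)\)/', $html, $all);

print_r($anchored[0]); // only the last parenthesized value
print_r($all[1]);      // every number found inside parentheses
```

The first call returns a single match, while the second returns all four numbers in $all[1].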

Related

Extract multiple Strings between two String in PHP

I am trying to make this work but I can't:
$str = "<html>1</html><html>2</html><html>3</html>";
$matches = array();
preg_match_all("(?<=<html>)(.*?)(?=</html>)", $str, $matches);
foreach ($matches as $d)
{
echo $d;
}
What am I doing wrong? The output must be:
123
This should work for you:
$str = "<html>1</html><html>2</html><html>3</html>";
preg_match_all("~(?<=<html>).*?(?=</html>)~", $str, $matches);
foreach ($matches[0] as $d) {
echo $d;
}
Output:
123
The changes are:
Add the missing regex delimiters (~) around the pattern passed to preg_match_all
Remove the capturing group: with a lookbehind and a lookahead, the entire match is already the text you want
Iterate over $matches[0] in the foreach loop instead of $matches
There is no need to declare or initialize $matches before the preg_match_all call
You need to add delimiters (I've used !) and you can skip the lookaheads/lookbehinds. Just put a capture group around the part you want and nothing else.
Then look in $matches[1], the second element of $matches, for the array with the individual values.
Example:
preg_match_all("!<html>(.*?)</html>!", $str, $matches);
foreach ($matches[1] as $d)
{
echo $d;
}
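To compare the two answers side by side, here is a small sketch that runs both patterns on the string from the question. The variable names $viaLookaround and $viaGroup are illustrative only.

```php
<?php
$str = "<html>1</html><html>2</html><html>3</html>";

// Lookarounds: the tags are asserted but not consumed,
// so the whole match (index 0) is already the inner text.
preg_match_all('~(?<=<html>).*?(?=</html>)~', $str, $viaLookaround);

// Capture group: the tags are part of the whole match,
// and the inner text lands in group 1.
preg_match_all('!<html>(.*?)</html>!', $str, $viaGroup);

echo implode('', $viaLookaround[0]), "\n"; // 123
echo implode('', $viaGroup[1]), "\n";      // 123
```

Both approaches give the same values; the only difference is which index of $matches holds them.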

php get grouped values in Regex

I am using PHP regular expressions, but I can't retrieve the values that I group using ().
This is my input:
<img src="http://www.example.com/image.jpg" title="title" />
I need only the src value. This is my regex: '"<img src=\"(.*?)\".*?\/>"'
If I could retrieve the first group, just like with Java patterns, my problem would be solved.
preg_match_all('"<img src=\"(.*?)\".*?\/>"', $source, $re);
print_r($re);
and it returns the full image tag, like this: <img src="http://www.example.com/image.jpg" title="title" />
To match a single string, the preg_match function is enough; you don't need preg_match_all. If you want to match several strings, use preg_match_all. Also, first try to match the exact string with the pattern itself rather than relying on grouping. If it's impossible to match a particular string that way, then use grouping.
In the examples below, the pattern matches exactly the value of the src attribute.
You can get the value of src in two ways:
1. positive lookbehind
Regex:
(?<=src=\")[^\"]*
PHP code (through preg_match_all):
<?php
$string = "<img src=\"http://www.example.com/image.jpg\" title=\"title\" />";
$regex = '~(?<=src=\")[^\"]*~';
preg_match_all($regex, $string, $matches);
print_r($matches);
?>
PHP code (through preg_match):
<?php
$string = "<img src=\"http://www.example.com/image.jpg\" title=\"title\" />";
$regex = '~(?<=src=\")[^\"]*~';
if (preg_match($regex, $string, $m)) {
$yourmatch = $m[0];
echo $yourmatch;
}
?>
Output:
http://www.example.com/image.jpg
Explanation:
(?<=src=\") A positive lookbehind. The regex engine starts the match immediately after src=".
[^\"]* Matches zero or more characters other than ". Matching stops as soon as a " is found.
2. Using \K
Regex:
src=\"\K[^\"]*
PHP code (through match)
<?php
$string = "<img src=\"http://www.example.com/image.jpg\" title=\"title\" />";
$regex = '~src=\"\K[^\"]*~';
if (preg_match($regex, $string, $m)) {
$yourmatch = $m[0];
echo $yourmatch;
}
?>
Output:
http://www.example.com/image.jpg
Explanation:
\K resets the starting point of the reported match. Any previously consumed characters are no longer included in the final match.
src=\"\K So it discards the previously matched src=".
[^\"]* Matches zero or more characters other than ".
Because you're using preg_match_all, the result is a nested array indexed by group number. Use print_r($re[1]); to get the values captured by the first group.
I found it by accident!
The first group can be printed like this:
print_r($re[1]);
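As a short illustration of why $re[1] is the right index, the sketch below uses two image tags. The a.jpg and b.jpg URLs are made up for the example, and ~ is used as the delimiter to avoid escaping quotes and slashes.

```php
<?php
// Two hypothetical <img> tags in the same shape as the input in the question.
$source = '<img src="http://www.example.com/a.jpg" title="a" />'
        . '<img src="http://www.example.com/b.jpg" title="b" />';

preg_match_all('~<img src="(.*?)".*?/>~', $source, $re);

print_r($re[0]); // index 0: the full <img ... /> tags
print_r($re[1]); // index 1: only what the first group captured (the src values)
```

With the default ordering, index 0 always holds the whole matches and each further index holds one capture group.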

How to get the value of groups in preg_match_all php?

Hi, My pattern is:
'<span\s+id="bodyHolder_newstextDetail_nwstxtPicPane"><a\s+href="(.*)"\s+target="_blank"><img\s+alt="(.*)"\s+title="(.*)"\s+src=\'(.*)\'\s+/>'
And the string:
<div class="nwstxtpic">
<span id="bodyHolder_newstextDetail_nwstxtPicPane"><a href="xxxxx" target="_blank"><img alt="xxxxx" title="xxxxx" src='xxxxx' />
My PHP code for finding and getting the values of the 4 groups that I have defined in the pattern is:
$picinfo=preg_match_all('/<span\s+id="bodyHolder_newstextDetail_nwstxtPicPane"><a\s+href="(.*)"\s+target="_blank"><img\s+alt="(.*)"\s+title="(.*)"\s+src=\'(.*)\'\s+/>/',$newscontent,$matches);
foreach ($matches[0] as $match) {
echo $match;
}
I don't know how to get the values of these 4 groups:
href="(.*)"
alt="(.*)"
title="(.*)"
src=\'(.*)\'
Would you please help me?
Thank you.
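preg_match_all() by default returns the results in pattern order, which is not very convenient here. Pass the PREG_SET_ORDER flag so that each element of $matches holds one complete match together with its groups. Note also that the / in /> has to be escaped as \/ because / is the pattern delimiter: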
$newscontent='<span id="bodyHolder_newstextDetail_nwstxtPicPane"><a href="xxxxx" target="_blank"><img alt="xxxxx" title="xxxxx" src=\'xxxxxbb\' />';
$picinfo=preg_match_all('/<span\s+id="bodyHolder_newstextDetail_nwstxtPicPane"><a\s+href="(.*)"\s+target="_blank"><img\s+alt="(.*)"\s+title="(.*)"\s+src=\'(.*)\'\s+\/>/',$newscontent,$matches,PREG_SET_ORDER);
foreach ($matches as $match) {
$href = $match[1];
$alt = $match[2];
$title = $match[3];
$src = $match[4];
echo $title;
}
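To make the difference between the two orderings concrete, here is a minimal sketch on a simplified subject. The link markup, the pattern, and the variable names $byPattern and $bySet are illustrative and not taken from the question.

```php
<?php
// Hypothetical, simplified input with two links.
$subject = '<a href="one.html">1</a> <a href="two.html">2</a>';
$pattern = '~<a href="([^"]*)">([^<]*)</a>~';

preg_match_all($pattern, $subject, $byPattern); // PREG_PATTERN_ORDER is the default
preg_match_all($pattern, $subject, $bySet, PREG_SET_ORDER);

// Pattern order: index by group first, then by match.
echo $byPattern[1][0], ' ', $byPattern[2][0], "\n"; // one.html 1

// Set order: index by match first, then by group.
echo $bySet[1][1], ' ', $bySet[1][2], "\n";         // two.html 2
```

Set order tends to read more naturally in a foreach loop, because each iteration receives all groups of a single match.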
Your regex is correct. As the manual says, PREG_PATTERN_ORDER is the default; it orders results so that $matches[0] is an array of full pattern matches, $matches[1] is an array of strings matched by the first parenthesized subpattern, and so on.
So in your case, $matches[1] will contain the href values, $matches[2] will contain the alt values, and so on. For example:
for($i = 0; $i < count($matches[0]); $i++ )
echo "href = {$matches[1][$i]}, alt = {$matches[2][$i]}";
$matches[0] will contain the full matched strings.
BTW, it is always advisable to use an XML parser for this kind of task; try DOMDocument.
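A rough sketch of the DOMDocument route, assuming the DOM extension is available. The input is the fragment from the question with the closing </a> and </span> tags added, which is an assumption about the real page; $results is an illustrative name.

```php
<?php
// Fragment from the question; the closing </a></span> are assumed.
$html = '<span id="bodyHolder_newstextDetail_nwstxtPicPane">'
      . '<a href="xxxxx" target="_blank">'
      . '<img alt="xxxxx" title="xxxxx" src=\'xxxxx\' /></a></span>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();

$results = [];
foreach ($doc->getElementsByTagName('img') as $img) {
    $link = $img->parentNode; // the enclosing <a> element in this fragment
    $results[] = [
        'href'  => $link instanceof DOMElement ? $link->getAttribute('href') : '',
        'alt'   => $img->getAttribute('alt'),
        'title' => $img->getAttribute('title'),
        'src'   => $img->getAttribute('src'),
    ];
}
print_r($results);
```

Unlike the regex, this does not depend on attribute order or on which quote character the markup uses.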

Regular expression for between two dynamic patterns

I want to find anything that matches
[^1] and [/^1]
Eg if the subject is like this
sometext[^1]abcdef[/^1]somemoretext[^2]12345[/^2]
I want to get back an array with abcdef and 12345 as the elements.
I read this
And I wrote this code and I am unable to advance past searching between []
<?php
$test = '[12345]';
getnumberfromstring($test);
function getnumberfromstring($text)
{
$pattern= '~(?<=\[)(.*?)(?=\])~';
$matches= array();
preg_match($pattern, $text, $matches);
var_dump($matches);
}
?>
Your test uses the string '[12345]', which does not follow the rule of having an opening [^digit] and a closing [/^digit]. Also, you're using preg_match when you should be using preg_match_all.
Try this:
<?php
$test = 'sometext[^1]abcdef[/^1]somemoretext[^2]12345[/^2]';
getnumberfromstring($test);
function getnumberfromstring($text)
{
$pattern= '/(?<=\[\^\d\])(.*?)(?=\[\/\^\d\])/';
$matches= array();
preg_match_all($pattern, $text, $matches);
var_dump($matches);
}
?>
That other answer doesn't really apply to your case; your delimiters are more complex and you have to use part of the opening delimiter to match the closing one. Also, unless the numbers inside the tags are limited to one digit, you can't use a lookbehind to match the first one. You have to match the tags in the normal way and use a capturing group to extract the content. (Which is how I would have done it anyway. Lookbehind should never be the first tool you reach for.)
'~\[\^(\d+)\](.*?)\[/\^\1\]~'
The number from the opening delimiter is captured in the first group and the backreference \1 matches the same number, thus ensuring that the delimiters are correctly paired. The text between the delimiters is captured in group #2.
I have tested following code in php 5.4.5:
<?php
$foo = 'sometext[^1]abcdef[/^1]somemoretext[^2]12345[/^2]';
function getnumberfromstring($text)
{
$matches= array();
# match [^1]...[/^1], [^2]...[/^2]
preg_match_all('/\[\^(\d+)\]([^\[\]]+)\[\/\^\1\]/', $text, $matches, PREG_SET_ORDER);
for($i = 0; $i < count($matches); ++$i)
printf("%s\n", $matches[$i][2]);
}
getnumberfromstring($foo);
?>
output:
abcdef
12345
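The following sketch shows what the \1 backreference buys you: multi-digit tag numbers work, and an opening tag is never paired with a closing tag that carries a different number. The two test strings are made up for illustration.

```php
<?php
$pattern = '~\[\^(\d+)\](.*?)\[/\^\1\]~';

// Correctly paired tags, including a two-digit number.
$paired = 'sometext[^1]abcdef[/^1]somemoretext[^12]12345[/^12]';
preg_match_all($pattern, $paired, $m);
print_r($m[2]); // abcdef and 12345

// The closing tag has a different number, so \1 cannot match it.
$mismatched = 'x[^1]abc[/^2]y';
$count = preg_match($pattern, $mismatched);
var_dump($count); // int(0)
```

A fixed-width lookbehind such as (?<=\[\^\d\]) cannot express either of these requirements.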

How to use preg match to get the contents of this html tag?

I am trying to get at the desired content in this tag:
<p class="address">
desired content
</p>
this is my attempt:
preg_match_all("/\<p class=\"address\">(.*)\<\/p\>/", $contents, $matches);
But the $matches array is empty. Please help.
Thanks
You could try this:
$contents = '<p class="address">desired content</p>';
$res = preg_match_all("/\<p class=\"address\">(.*)\<\/p\>/s", $contents, $matches);
var_dump($res);
var_dump($matches);
$matches is not empty, right?
You need to set the s modifier (PCRE_DOTALL) to make . also match \n - see modifiers
Use .*? (the lazy, or ungreedy, quantifier) so that the capture group stops at the first </p> instead of running on to the last one
You can use a different pattern delimiter and single quotes to remove all the backslashes, which makes your pattern much more readable
Like so:
<?php
$contents = '<p class="address">
desired content
</p>';
$res = preg_match_all('#<p class="address">(.*?)</p>#s', $contents, $matches);
var_dump($matches);
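To see the effect of each change separately, here is a small sketch with two address paragraphs. The input text and the variable names are illustrative.

```php
<?php
// Hypothetical input: two paragraphs, each spanning several lines.
$contents = "<p class=\"address\">\nfirst\n</p>\n<p class=\"address\">\nsecond\n</p>";

// Without s, . does not match newlines, so nothing is found.
$noDotall = preg_match_all('#<p class="address">(.*?)</p>#', $contents, $a);

// With s but a greedy .*, one match runs from the first <p> to the last </p>.
$greedy = preg_match_all('#<p class="address">(.*)</p>#s', $contents, $b);

// With s and a lazy .*?, each paragraph is matched on its own.
$lazy = preg_match_all('#<p class="address">(.*?)</p>#s', $contents, $c);

var_dump($noDotall, $greedy, $lazy); // int(0) int(1) int(2)
print_r(array_map('trim', $c[1]));   // first and second
```

The trim() call only removes the newlines that surround the text inside each paragraph.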
