Hi, My pattern is:
'<span\s+id="bodyHolder_newstextDetail_nwstxtPicPane"><a\s+href="(.*)"\s+target="_blank"><img\s+alt="(.*)"\s+title="(.*)"\s+src=\'(.*)\'\s+/>'
And the string:
<div class="nwstxtpic">
<span id="bodyHolder_newstextDetail_nwstxtPicPane"><a href="xxxxx" target="_blank"><img alt="xxxxx" title="xxxxx" src='xxxxx' />
Well, my php code for finding and getting the value of 4 groups that i have defined in patern is:
$picinfo=preg_match_all('/<span\s+id="bodyHolder_newstextDetail_nwstxtPicPane"><a\s+href="(.*)"\s+target="_blank"><img\s+alt="(.*)"\s+title="(.*)"\s+src=\'(.*)\'\s+/>/',$newscontent,$matches);
foreach ($matches[0] as $match) {
echo $match;
}
I dont know how to get the value of these 4 groups
href="(.*)"
alt="(.*)"
title="(.*)"
src=\'(.*)\'
Whould you please Help me?
Thank you.
preg_match_all() by default returns the result in pattern order, which is not very convenient. Pass the PREG_SET_ORDER flag so that the data is arranged in a more logical way:
$newscontent='<span id="bodyHolder_newstextDetail_nwstxtPicPane"><a href="xxxxx" target="_blank"><img alt="xxxxx" title="xxxxx" src=\'xxxxxbb\' />';
$picinfo=preg_match_all('/<span\s+id="bodyHolder_newstextDetail_nwstxtPicPane"><a\s+href="(.*)"\s+target="_blank"><img\s+alt="(.*)"\s+title="(.*)"\s+src=\'(.*)\'\s+\/>/',$newscontent,$matches,PREG_SET_ORDER);
foreach ($matches as $match) {
$href = $match[1];
$alt = $match[2];
$title = $match[3];
$src = $match[4];
echo $title;
}
Your RegEx is correct, as the manual says, by default PREG_PATTERN_ORDER is followed which orders results so that $matches[0] is an array of full pattern matches, $matches[1] is an array of strings matched by the first parenthesized subpattern, and so on.
So as in your case, $matches1 will contain the href, $matches2 will contain the alt and so on. Like,
for($i = 0; $i <= count($matches[0]); $i++ )
echo "href = {$matches[1][$i]}, alt = {$matches[2][$i]}";
$matches[0] will contain the full matched strings.
BTW, it is always advisable to use an XML parser, try DOMDocument. The obligatory.
Related
I am trying to make this work but I can't:
$str = "<html>1</html><html>2</html><html>3</html>";
$matches = array();
preg_match_all("(?<=<html>)(.*?)(?=</html>)", $str, $matches);
foreach ($matches as $d)
{
echo $d;
}
What I am doing wrong? The output must be:
123
This should work for you:
$str = "<html>1</html><html>2</html><html>3</html>";
preg_match_all("~(?<=<html>).*?(?=</html>)~", $str, $matches);
foreach ($matches[0] as $d) {
echo $d;
}
Output:
123
Changes are:
Use missing regex delimiters ~ in preg_match_all function pattern
Remove capturing group since you are already using lookahead and lookbehind so entire match can be used in further processing
Using $matches[0] in foreach loop instead of $matches
There is no need to declare/initialize $matches before preg_match_all call
You need to add delimiters (I've used !) and you can skip the lookaheads/lookbehinds. Just make a capture group around the numbers you want. Nothing else.
Then look in the second matches element for the array with the individual values.
Example:
preg_match_all("!<html>(.*?)</html>!", $str, $matches);
foreach ($matches[1] as $d)
{
echo $d;
}
I need to print all matches using preg_match_all.
$search = preg_match_all($pattern, $string, $matches);
foreach ($matches as $match) {
echo $match[0];
echo $match[1];
echo $match[...];
}
The problem is I don't know how many matches there in my string, and even if I knew and if it was 1000 that would be pretty dumb to type all those $match[]'s.
The $match[0], $match[1], etc., items are not the individual matches, they're the "captures".
Regardless of how many matches there are, the number of entries in $matches is constant, because it's based on what you're searching for, not the results. There's always at least one entry, plus one more for each pair of capturing parentheses in the search pattern.
For example, if you do:
$matches = array();
$search = preg_match_all("/\D+(\d+)/", "a1b12c123", $matches);
print_r($matches);
Matches will have only two items, even though three matches were found. $matches[0] will be an array containing "a1", "b12" and "c123" (the entire match for each item) and $matches[1] will contain only the first capture for each item, i.e., "1", "12" and "123".
I think what you want is something more like:
foreach ($matches[1] as $match) {
echo $match;
}
Which will print out the first capture expression from each matched string.
Does print_r($matches) give you what you want?
You could loop recursively. This example requires SPL and PHP 5.1+ via RecursiveArrayIterator:
foreach( new RecursiveArrayIterator( $matches ) as $match )
print $match;
I want to find anything that matches
[^1] and [/^1]
Eg if the subject is like this
sometext[^1]abcdef[/^1]somemoretext[^2]12345[/^2]
I want to get back an array with abcdef and 12345 as the elements.
I read this
And I wrote this code and I am unable to advance past searching between []
<?php
$test = '[12345]';
getnumberfromstring($test);
function getnumberfromstring($text)
{
$pattern= '~(?<=\[)(.*?)(?=\])~';
$matches= array();
preg_match($pattern, $text, $matches);
var_dump($matches);
}
?>
Your test checks the string '[12345]' which does not apply for the rule of having an "opening" of [^digit] and a "closing" of [\^digit]. Also, you're using preg_match when you should be using: preg_match_all
Try this:
<?php
$test = 'sometext[^1]abcdef[/^1]somemoretext[^2]12345[/^2]';
getnumberfromstring($test);
function getnumberfromstring($text)
{
$pattern= '/(?<=\[\^\d\])(.*?)(?=\[\/\^\d\])/';
$matches= array();
preg_match_all($pattern, $text, $matches);
var_dump($matches);
}
?>
That other answer doesn't really apply to your case; your delimiters are more complex and you have to use part of the opening delimiter to match the closing one. Also, unless the numbers inside the tags are limited to one digit, you can't use a lookbehind to match the first one. You have to match the tags in the normal way and use a capturing group to extract the content. (Which is how I would have done it anyway. Lookbehind should never be the first tool you reach for.)
'~\[\^(\d+)\](.*?)\[/\^\1\]~'
The number from the opening delimiter is captured in the first group and the backreference \1 matches the same number, thus insuring that the delimiters are correctly paired. The text between the delimiters is captured in group #2.
I have tested following code in php 5.4.5:
<?php
$foo = 'sometext[^1]abcdef[/^1]somemoretext[^2]12345[/^2]';
function getnumberfromstring($text)
{
$matches= array();
# match [^1]...[/^1], [^2]...[/^2]
preg_match_all('/\[\^(\d+)\]([^\[\]]+)\[\/\^\1\]/', $text, $matches, PREG_SET_ORDER);
for($i = 0; $i < count($matches); ++$i)
printf("%s\n", $matches[$i][2]);
}
getnumberfromstring($foo);
?>
output:
abcdef
123456
I am trying to load a remote website and get all numbers that are inside of parentheses. But what ends up happening is it only matches the last value.
Is my regex wrong? Am I using the correct flags?
I have added the example of what it should match on in the second $html variable.
//$html = file_get_contents("http://example.com/test.html");
$html = "(1234) (12) (1) \r\n (1346326)";
preg_match_all("^[(\d)]+$^", $html, $matches, PREG_PATTERN_ORDER);
print_r($matches);
echo "<br>";
foreach ($matches as $val) {
echo "matched: " . $val[0] . "\n";
}
Thanks.
How about:
preg_match_all("/\((\d+)\)/", $html, $matches, PREG_PATTERN_ORDER);
print_r($matches[1]);
I see two possible issues.
First you are matching from start(^) to end($) it it will only match what fits exactly between start of line and end of line.
Second, you most likely want to use the /gs regex paramter to slurp it all in.
preg_match_all("/\b(\d+)\b/gs" ...
I need to print all matches using preg_match_all.
$search = preg_match_all($pattern, $string, $matches);
foreach ($matches as $match) {
echo $match[0];
echo $match[1];
echo $match[...];
}
The problem is I don't know how many matches there in my string, and even if I knew and if it was 1000 that would be pretty dumb to type all those $match[]'s.
The $match[0], $match[1], etc., items are not the individual matches, they're the "captures".
Regardless of how many matches there are, the number of entries in $matches is constant, because it's based on what you're searching for, not the results. There's always at least one entry, plus one more for each pair of capturing parentheses in the search pattern.
For example, if you do:
$matches = array();
$search = preg_match_all("/\D+(\d+)/", "a1b12c123", $matches);
print_r($matches);
Matches will have only two items, even though three matches were found. $matches[0] will be an array containing "a1", "b12" and "c123" (the entire match for each item) and $matches[1] will contain only the first capture for each item, i.e., "1", "12" and "123".
I think what you want is something more like:
foreach ($matches[1] as $match) {
echo $match;
}
Which will print out the first capture expression from each matched string.
Does print_r($matches) give you what you want?
You could loop recursively. This example requires SPL and PHP 5.1+ via RecursiveArrayIterator:
foreach( new RecursiveArrayIterator( $matches ) as $match )
print $match;