preg_match acting very strange - php

I am using preg_match() to extract pieces of text from a variable, and let's say the variable looks like this:
[htmlcode]This is supposed to be displayed[/htmlcode]
middle text
[htmlcode]This is also supposed to be displayed[/htmlcode]
I want to extract the contents of the [htmlcode] tags and put them into an array. I am doing this with preg_match().
preg_match('/\[htmlcode\]([^\"]*)\[\/htmlcode\]/ms', $text, $matches);
foreach($matches as $value){
return $value . "<br />";
}
The above code outputs
[htmlcode]This is supposed to be displayed[/htmlcode]middle text[htmlcode]This is also supposed to be displayed[/htmlcode]
instead of
[htmlcode]This is supposed to be displayed[/htmlcode]
[htmlcode]This is also supposed to be displayed[/htmlcode]
and I have officially run out of ideas.

As already explained, the * quantifier is greedy. You should also use preg_match_all() instead of preg_match(): it returns a multi-dimensional array containing every match.
preg_match_all('#\[htmlcode\]([^\"]*?)\[/htmlcode\]#ms', $text, $matches);
foreach( $matches[1] as $value ) {
    echo $value . "<br />"; // print each captured block
}
And you'll get this: http://codepad.viper-7.com/z2GuSd

The * quantifier is greedy, i.e. it will consume everything up to the last [/htmlcode]. Try replacing * with the non-greedy *?.
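The difference is easiest to see by running both versions on the question's sample text. This is a minimal sketch; the variable names $greedy and $lazy are only illustrative.

```php
<?php
// Sample input taken from the question.
$text = "[htmlcode]This is supposed to be displayed[/htmlcode]\n"
      . "middle text\n"
      . "[htmlcode]This is also supposed to be displayed[/htmlcode]";

// Greedy: * grabs as much as possible, so a single match spans both blocks.
preg_match_all('#\[htmlcode\]([^"]*)\[/htmlcode\]#', $text, $greedy);

// Lazy: *? stops at the first closing tag, giving one match per block.
preg_match_all('#\[htmlcode\]([^"]*?)\[/htmlcode\]#', $text, $lazy);

echo count($greedy[1]), "\n"; // number of greedy matches
echo count($lazy[1]), "\n";   // number of lazy matches
print_r($lazy[1]);            // the captured contents of each block
```

The greedy pattern produces one match and the lazy pattern produces two, one per [htmlcode] block.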

* is greedy by default; ([^\"]*?) (notice the added ?) makes it lazy.
What do lazy and greedy mean in the context of regular expressions?

Look at this piece of code:
preg_match('/\[htmlcode\]([^\"]*)\[\/htmlcode\]/ms', $text, $matches);
foreach($matches as $value){
return $value . "<br />";
}
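Now, even if your pattern works as intended, you should know two things:
A return statement ends the loop and exits the function immediately, so only the first iteration ever runs.
The first element of $matches is the whole match. With the greedy pattern, that is everything from the first [htmlcode] to the last [/htmlcode], which in your case is all of $text.
So what you did was return that one big string and exit the function.
preg_match() stops at the first match, so $matches[1] only ever holds the first captured group. To get every block, use preg_match_all() and read the captures from $matches[1][0], $matches[1][1], and so on.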
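One way to avoid returning from inside the loop is to let the function return the whole list of captures and loop over it in the caller. This is only a sketch: the helper name extractHtmlcode is hypothetical, it uses .*? with the s modifier rather than the question's character class, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper: returns the text of every [htmlcode] block.
function extractHtmlcode(string $text): array
{
    // Lazy .*? with the s modifier so a block may span several lines.
    preg_match_all('#\[htmlcode\](.*?)\[/htmlcode\]#s', $text, $matches);

    // $matches[1] is already an array with one entry per block.
    return $matches[1];
}

$text = "[htmlcode]first[/htmlcode] middle text [htmlcode]second[/htmlcode]";

// The caller decides how to display the values.
foreach (extractHtmlcode($text) as $value) {
    echo $value, "<br />\n";
}
```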

Related

How do I get ALL the values inside the parentheses?

I need to get all the content inside each pair of parentheses in a string.
Example: $string = "hello (cool) how are you (i am okay) where are you from (i am from earth)"; I am using preg_match_all like this: preg_match_all('#\((.*?)\)#', $nonumber, $match); and once I have whatever is inside the parentheses, I want to put it in an array.
So I want the content of each pair of parentheses stored as an array element, like this:
the first element will have the value cool
the second one will have the value i am okay
I tried to echo $match[0], but I get the error "Array to string conversion".
To put it simply, I want the values to be inside an array, like this:
$match[0] will have the value cool
$match[1] will have the value i am okay
and so on, like an array whose values you can access by index.
I am desperate and don't know how to fix this. Please help.
You are on the right track with preg_match_all, and the pattern \((.*?)\) is fine. The captured text is in $matches[1], so loop over that:
$regexp = "/\((.*?)\)/";
$string = "hello (cool) how are you (i am okay) where are you from (i am from earth)";
preg_match_all($regexp, $string, $matches);
foreach ($matches[1] as $value) {
echo $value . "\n";
}
cool
i am okay
i am from earth
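The "Array to string conversion" error comes from the shape of the result: with the default flags, preg_match_all fills $match[0] with the full matches and $match[1] with the captured groups, and both are arrays. A small sketch of how to get the flat array the question asks for (the name $values is only illustrative):

```php
<?php
$string = "hello (cool) how are you (i am okay) where are you from (i am from earth)";
preg_match_all('/\((.*?)\)/', $string, $match);

// $match[0] holds the full matches, parentheses included.
// $match[1] holds the text captured by the first group.
$values = $match[1];

echo $values[0], "\n";               // first captured value
echo $values[1], "\n";               // second captured value
echo implode(', ', $match[0]), "\n"; // full matches joined into one string
```

Echoing $match[0] directly fails because it is an array; echo one of its elements, or join them with implode().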

PHP:preg_replace function

$text = "
<tag>
<html>
HTML
</html>
</tag>
";
I want to replace all the text present inside the tags with htmlspecialchars(). I tried this:
$regex = '/<tag>(.*?)<\/tag>/s';
$code = preg_replace($regex,htmlspecialchars($regex),$text);
But it doesn't work.
The output is htmlspecialchars() applied to the regex pattern itself. I want htmlspecialchars() applied to the text that the pattern matches.
What should I do?
You're replacing the match with the escaped pattern itself: htmlspecialchars($regex) is evaluated before preg_replace() ever runs. You're not using back-references with the e modifier either, and that modifier was deprecated in PHP 5.5 and removed in PHP 7.0 anyway. In this case, preg_replace_callback is the way to go:
$code = preg_replace_callback($regex, 'replaceCallback', $text);
The callback receives the matched groups as an array (the first element is the full match), so htmlspecialchars cannot be passed as the callback directly. Define a wrapper such as:
function replaceCallback($matches)
{
    if (is_array($matches))
    {
        $matches = implode('', array_slice($matches, 1)); // first element is the full string
    }
    return htmlspecialchars($matches);
}
Or, if your PHP version permits it:
$code = preg_replace_callback($regex, function($matches)
{
    $return = '';
    // Start at index 1 to skip the full match; this allows for any number of groups.
    for ($i = 1, $j = count($matches); $i < $j; $i++)
    {
        $return .= htmlspecialchars($matches[$i]);
    }
    return $return;
}, $text);
Try any of the above until you find something that works. Incidentally, if all you want to remove is <tag> and </tag>, why not go for the much faster:
echo htmlspecialchars(str_replace(array('<tag>','</tag>'), '', $text));
That's just keeping it simple, and it'll almost certainly be faster, too.
If you want to isolate the actual contents as defined by your pattern, you could use preg_match($regex,$text,$hits);. This gives you an array of hits: $hits[0] contains the whole matched string, and the bits that were between the parentheses in the pattern start at $hits[1]. You can then manipulate these matches, possibly using htmlspecialchars, and combine them again into $code.
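For comparison, here is a sketch of a callback that escapes only the captured contents. It assumes the <tag> wrapper should stay in the output, which the question does not say, and that closures are available (PHP 5.3+; the type declarations need PHP 7).

```php
<?php
$text = "<tag>\n<html>\nHTML\n</html>\n</tag>";
$regex = '/<tag>(.*?)<\/tag>/s';

// The callback receives an array: $m[0] is the whole match, $m[1] the first group.
$code = preg_replace_callback($regex, function (array $m): string {
    // Assumption: keep the <tag> wrapper and escape only what is inside it.
    return '<tag>' . htmlspecialchars($m[1]) . '</tag>';
}, $text);

echo $code;
```

Dropping the two '<tag>' literals from the return value gives the same result as the loop-based callback above for a pattern with a single group.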

Locate text in string in PHP

I have a string of varying length taken from a MySQL database, and in that string is a value I need (the 7 in the example below):
s:1:"4", s:2:"53", s:3:"7", s:4:"5"
I need a way to find whatever is in quotes following the s:3:. So in this example, it would be 7. I've looked around and I think I need to use the explode function but I am having trouble implementing it. The string may contain multiple values of this in which case I'd like to get them all into an array.
Use preg_match_all() for that:
$str = 's:1:"4", s:2:"53", s:3:"7", s:4:"5"';
if(preg_match_all('/s:3:"(.*?)"/', $str, $matches)) {
var_dump($matches[1]);
}
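An alternative that uses a negated character class, [^"]+, instead of a lazy quantifier. Because [^"] also matches newlines, values that span multiple lines are captured too; the s modifier makes no difference here, since the pattern contains no dot.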
<?php
$str = 's:1:"4", s:2:"53", s:3:"7", s:4:"5"';
if(preg_match_all('!s:3:"([^"]+)"!s', $str, $matches)) {
print_r($matches);
}
?>
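Since the question mentions that the string may contain several such values, the lookup can be wrapped in a small function that returns them all. This is a sketch only: the helper name quotedValuesFor and the extra s:3:"12" entry in the sample string are made up for illustration.

```php
<?php
// Hypothetical helper: collects every quoted value that follows "s:<index>:".
function quotedValuesFor(string $str, int $index): array
{
    // $index is an int, so it is safe to place in the pattern unescaped.
    $pattern = '/s:' . $index . ':"([^"]*)"/';
    preg_match_all($pattern, $str, $matches);

    // $matches[1] holds the captured values; it is empty when nothing matches.
    return $matches[1];
}

// Sample string from the question, with one extra s:3 entry added for illustration.
$str = 's:1:"4", s:2:"53", s:3:"7", s:4:"5", s:3:"12"';
print_r(quotedValuesFor($str, 3));
```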

How to handle a miss with regex / PHP / preg_match_all

I'm using the code at the bottom to grab parameters from a WordPress shortcode. The shortcode itself looks like this:
[FLOWPLAYER=http://www.tvovermind.com/wp-content/uploads/2013/01/pll-316-21.jpg|http://www.tvovermind.com/wp-content/uploads/2013/01/PLL316_fv2.h264HD-Clip2.flv,440,280]
Or
[FLOWPLAYER=http://www.tvovermind.com/wp-content/uploads/2013/01/pll-316-21.jpg|http://www.tvovermind.com/wp-content/uploads/2013/01/PLL316_fv2.h264HD-Clip2.flv,440,280,false]
What I would like is for that match to default to "false" when the extra parameter (false/true) is missing. With the current code, however, no match is made at all if the parameter is missing. Any ideas?
function legacy_hook($content){
    $regex = '/\[FLOWPLAYER=([a-z0-9\:\.\-\&\_\/\|]+)\,([0-9]+)\,([0-9]+)\,([a-z0-9\:\.\-\&\_\/\|]+)\]/i';
    $matches = array();
    preg_match_all($regex, $content, $matches);
    if($matches[0][0] != '') {
        foreach($matches[0] as $key => $data) {
            $content = str_replace(
                $matches[0][$key],
                flowplayer::build_player($matches[2][$key], $matches[3][$key], $matches[1][$key], $matches[4][$key]),
                $content
            );
        }
    }
    return $content;
}
Your regex requires the last comma to be there, followed by one or more of the characters in the last set of brackets. Something like
/\[FLOWPLAYER=([a-z0-9\:\.\-\&\_\/\|]+)\,([0-9]+)\,([0-9]+)(\,[a-z]+)?\]/i
might be what you're after. The only issue is that you'll get the comma in the match too.
You then have to test whether the last group actually matched. In its default mode, preg_match_all fills a group that did not participate with an empty string, so count($matches) is of no use here; check the value itself with an inline if:
(!empty($matches[4][$key]) ? $matches[4][$key] : false)
You can add an alternation at the end of your expression, with an empty last branch for the case where the parameter is missing:
(,true|,false|)
I didn't check whether it works, but you get the idea.
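Putting the two suggestions together, the comma can be kept out of the capture by wrapping it in a non-capturing group. The sketch below makes several assumptions: the first group is simplified to "anything up to the next comma", the optional value is limited to true/false, and the sample shortcodes are invented.

```php
<?php
// Simplified pattern (assumption): the optional fourth value is only true or false.
// (?:,(true|false))? makes the comma and the value optional but captures only the value.
$regex = '/\[FLOWPLAYER=([^,\]]+),([0-9]+),([0-9]+)(?:,(true|false))?\]/i';

$content = '[FLOWPLAYER=a.jpg|a.flv,440,280] and [FLOWPLAYER=b.jpg|b.flv,640,360,true]';
preg_match_all($regex, $content, $matches, PREG_SET_ORDER);

$flags = [];
foreach ($matches as $m) {
    // A group that did not participate may be absent or empty,
    // so check both before falling back to the string "false".
    $flags[] = (isset($m[4]) && $m[4] !== '') ? $m[4] : 'false';
}
print_r($flags);
```

PREG_SET_ORDER groups the result per shortcode, which keeps the loop free of the $key indexing used in the question.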

regex question redux regarding definition list

Trying to figure out a way to throw out attributes in this data that do not have any values. Thanks for helping.
My current regex, thanks to Tomalak, looks like this:
Regex find
([^=|]+)=([^|]+)(?:\||$)
Regex replace
<dt>$1</dt><dd>$2</dd>
Data looks like this
Bristle Material=|Wire Material=Steel|Dia.=4 in|Grit=|Bristle Diam=|Wire Size=0.0095 in|Arbor Diam=|Arbor Thread - TPI or Pitch=1/2 - 3/8 in|No. of Knots=|Face Width=1/2 in|Face Plate Thickness=7/16 in|Trim Length=7/8 in|Stem Diam=|Speed=6000 rpm [Max]|No. of Rows=|Color=|Hub Material=|Structure=|Tool Shape=|Applications=Cleaning rust, scale and dirt, Light Deburring, Edge Blending, Roughening for adhesion, Finish preparation prior to plating or painting|Applicable Materials=|Type=|Used With=Straight Grinders, Bench/Pedestal Grinders, Right Angle Grinders|Packing Type=|Quantity=1 per pack|Wt.=
The end result should look like this
<dt>Wire Material</dt><dd>Steel</dd><dt>Dia.</dt><dd>4 in</dd><dt>Wire Size</dt><dd>0.0095 in</dd>
Not this
Bristle Material=|<dt>Wire Material</dt><dd>Steel</dd><dt>Dia.</dt><dd>4 in</dd>Grit=|Bristle Diam=|<dt>Wire Size</dt><dd>0.0095 in
Here is how you can do it in PHP without using regular expressions:
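$parts_list = explode('|', "Bristle Material=|Wire M....");
$parts = "";
foreach ($parts_list as $part) {
    // Split on the first "=" only, so values containing "=" stay intact.
    $p = explode('=', $part, 2);
    // Skip attributes that have no value.
    if (isset($p[1]) && $p[1] !== '') {
        $parts .= "<dt>$p[0]</dt>\n<dd>$p[1]</dd>\n";
    }
}
echo $parts;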
And here is how you can do it with regular expressions:
$parts = preg_replace(
array('/([^=|]*)=(?:\||$)/','/([^=|]*)=([^|]+)(?:\||$)/'),
array('', '<dt>$1</dt><dd>$2</dd>'),
$inputString
);
echo $parts;
Update
This uses a feature of PHP's preg_replace: it accepts an array of patterns and an array of replacement strings, and applies each pattern in turn to the result of the previous one. The array() form above basically equates to this:
If I can match this: /([^=|]*)=(?:\||$)/ then replace it with an empty string.
If I can match this: /([^=|]*)=([^|]+)(?:\||$)/ then replace it with <dt>$1</dt><dd>$2</dd>
To test it in a Regex editor, you would run the first expression first (/([^=|]*)=(?:\||$)/) then run the second expression on the result of the first expression.
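The equivalence can be checked on a shortened version of the data. In this sketch the input is cut down to five attributes, and the variable names are only illustrative.

```php
<?php
// Shortened sample of the question's data.
$input = 'Bristle Material=|Wire Material=Steel|Dia.=4 in|Grit=|Wt.=';

$patterns = [
    '/([^=|]*)=(?:\||$)/',        // 1st: attribute with no value
    '/([^=|]*)=([^|]+)(?:\||$)/', // 2nd: attribute with a value
];
$replacements = ['', '<dt>$1</dt><dd>$2</dd>'];

// Step by step: run the first pattern, then the second on its result.
$afterFirst  = preg_replace($patterns[0], $replacements[0], $input);
$afterSecond = preg_replace($patterns[1], $replacements[1], $afterFirst);

// The array form does both passes in one call.
$oneCall = preg_replace($patterns, $replacements, $input);

echo $afterFirst, "\n";
echo $oneCall, "\n";
```

After the first pass only the attributes with values remain, and the second pass wraps them in <dt> and <dd> tags.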
([^=|]*)=([^|]*)(?:\||$)
To skip the ones without a value, try this:
(?:[^=|]*=|([^=|]*)=([^|]+))(?:\||$)
Looks like you want preg_match_all here rather than preg_replace:
preg_match_all('~([^|]+)=([^|\s][^|]*)~', $str, $matches, PREG_SET_ORDER);
foreach($matches as $match)
echo "<dt>{$match[1]}</dt><dd>{$match[2]}</dd>\n";
