Having problems getting a PHP regex to match

Here is my problem. It's probably a simple fix. I have a regex that I am using to replace a url BBCode. What I have right now that is not working looks like this.
<?php
$input_string = '[url=www.test.com]Test[url]';
$regex = '/\[url=(.+?)](.+?)\[\/url]/is';
$replacement_string = '$2';
echo preg_replace($regex, $replacement_string, $input_string);
?>
This currently outputs the original $input_string, while I would like it to output the following.
Test
What am I missing?

<?php
$input_string = '[url=www.test.com]Test[/url]';
$regex = '/\[url=(.+?)\](.+?)\[\/url\]/is';
$replacement_string = '$2';
echo preg_replace($regex, $replacement_string, $input_string);
?>
In your BBCode string, I closed the [url] tag properly (with [/url]).
I also escaped the ] characters in the regex (not sure if that was an actual problem).
Note that [url]http://example.org[/url] is also a valid way to make a link in BBCode.
You should listen to the comments suggesting you use an existing BBCode parser.
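A minimal sketch of the points above, assuming only the two BBCode forms mentioned in this answer need handling. The helper name strip_url_bbcode is made up for illustration. The last lines rerun the question's original regex against the corrected input: PCRE treats an unescaped ] outside a character class as a literal, so the missing [/url] was what prevented the match.

```php
<?php
// Hypothetical helper: removes [url] BBCode and keeps only the visible text.
function strip_url_bbcode(string $input): string
{
    $patterns = array(
        '/\[url=(.+?)\](.+?)\[\/url\]/is', // [url=target]label[/url] -> label
        '/\[url\](.+?)\[\/url\]/is',       // [url]target[/url]       -> target
    );
    $replacements = array('$2', '$1');

    // With arrays, preg_replace applies each pattern/replacement pair in order.
    return preg_replace($patterns, $replacements, $input);
}

echo strip_url_bbcode('[url=www.test.com]Test[/url]'), "\n";  // Test
echo strip_url_bbcode('[url]http://example.org[/url]'), "\n"; // http://example.org
echo strip_url_bbcode('[url=www.test.com]Test[url]'), "\n";   // unchanged: no closing tag

// The question's regex, unescaped ] included, against the corrected input.
$original_regex = '/\[url=(.+?)](.+?)\[\/url]/is';
echo preg_replace($original_regex, '$2', '[url=www.test.com]Test[/url]'), "\n"; // Test
```

This still only covers the simplest cases; nested or malformed tags are where an existing BBCode parser earns its keep.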

Change this line as follows:
$regex = '/\[url=(.+?)\](.+?)\[url\]/is';
If the backslashes do not display properly here, see this: http://pastebin.com/6pF0FEbA

Related

preg_replace, str_replace and substr_replace not working in special condition

I have the following code. It is meant to find all HTML tags in a string and replace them with [[0]], [[1]], [[2]] and so on (at least that is the intent, but it is not working):
$str = "some text <a href='/review/'>review</a> here <a class='abc' href='/about/'>link2</a> hahaha";
preg_match_all("|<[^>]+>(.*)</[^>]+>|U",$str, $out, PREG_OFFSET_CAPTURE);
$count = 0;
foreach($out[0] as $result) {
$temp=preg_quote($result[0],'/');
$temp ="/".$temp."/";
preg_replace($temp, "[[".$count."]]", $str,1);
$count++;
}
var_dump($str);
I find the tags using preg_match_all with PREG_OFFSET_CAPTURE.
The output of preg_match_all is as expected. However, preg_replace, substr_replace, and str_replace do not work when substituting the tags with [[$count]].
I have tried all three string replacement methods and none of them work. Please point me in the right direction.
Can something in php.ini cause this?
Thanks in advance.

preg_replace does not modify $str in place; it returns the new string. Assign the result back to $str:
$str = preg_replace($temp, "[[".$count."]]", $str, 1);
Also, I'm not sure exactly what you want, but I changed a few things in the code to do what you seem to be trying to do. I changed the regex a bit, especially the (.*) part, which became ([^<>]+).
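The changed code itself did not make it into this answer, so here is a minimal sketch of what it might look like, assuming the goal is to swap each complete element for a numbered placeholder. The function name replace_tags_with_placeholders is hypothetical, and PREG_OFFSET_CAPTURE is dropped because the offsets are not needed for this approach.

```php
<?php
// Hypothetical helper: replaces each matched element with [[0]], [[1]], ...
function replace_tags_with_placeholders(string $str): string
{
    // ([^<>]+) keeps each match inside a single element without needing the U flag.
    preg_match_all('|<[^>]+>([^<>]+)</[^>]+>|', $str, $out);

    $count = 0;
    foreach ($out[0] as $match) {
        // Quote the matched text so it is treated literally, "/" included.
        $pattern = '/' . preg_quote($match, '/') . '/';
        // preg_replace returns the new string, so assign it back.
        $str = preg_replace($pattern, '[[' . $count . ']]', $str, 1);
        $count++;
    }
    return $str;
}

$str = "some text <a href='/review/'>review</a> here <a class='abc' href='/about/'>link2</a> hahaha";
echo replace_tags_with_placeholders($str), "\n"; // some text [[0]] here [[1]] hahaha
```

The limit of 1 matters when the same element appears twice: without it, the first pass would replace both copies with the same index.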

The problem may be in this line:
foreach($out[0] as $result) {
Change it to this:
foreach($out as $result) {
because I think you are accessing an index that doesn't exist.

PHP Preg_Replace REGEX BB-Code

So I have created this function in PHP to output text in the required form. It is a simple BB-Code system. I have cut the other BB-Codes out of it to keep it shorter (around 15 were cut).
My issue is that the final one, [title=blue]Test[/title] (test data), does not work. The output is exactly the same as the input. I have tried 4-5 different versions of the regex and nothing has changed it.
Does anyone know where I am going wrong or how to fix it?
function bbcode_format($str){
$str = htmlentities($str);
$format_search = array(
'#\[b\](.*?)\[/b\]#is',
'#\[title=(.*?)\](.*?)\[/title\]#i'
);
$format_replace = array(
'<strong>$1</strong>',
'<div class="box_header" id="$1"><center>$2</center></div>'
);
$str = preg_replace($format_search, $format_replace, $str);
$str = nl2br($str);
return $str;
}

Change the delimiter # to /. And change "\[/b\]" to "\[\/b\]". You need to escape the "/" since you need it as a literal character.
Maybe the "array()" should use brackets: "array[]".
Note: I borrowed the answer from here: Convert BBcode to HTML using JavaScript/jQuery
Edit: I forgot that "/" isn't a metacharacter so I edited the answer accordingly.
Update: I wasn't able to make it work with the function, but this one works. See the comments. (For testing, I used the fiddle from the accepted answer to the question I linked above. You may do so as well.) Please note that this is JavaScript. You had PHP code in your question. (I can't help you with PHP code, at least for a while.)
$str = 'this is a [b]bolded[/b], [title=xyz xyz]Title of something[/title]';
//doesn't work (PHP function)
//$str = htmlentities($str);
//notes: lose the single quotes
//lose the text "array" and use brackets
//don't know what "ig" means but doesn't work without them
$format_search = [
/\[b\](.*?)\[\/b\]/ig,
/\[title=(.*?)\](.*?)\[\/title\]/ig
];
$format_replace = [
'<strong>$1</strong>',
'<div class="box_header" id="$1"><center>$2</center></div>'
];
// Perform the actual conversion
for (var i =0;i<$format_search.length;i++) {
$str = $str.replace($format_search[i], $format_replace[i]);
}
//place the formatted string somewhere
document.getElementById('output_area').innerHTML=$str;

Update2: Now with PHP... (Sorry, you have to format the $replacements to your liking. I just added some tags and text to demonstrate the changes.) If there's still trouble with the "title", look at what kind of text you are trying to format. I made the "=" in the title optional with ?, so it should work properly with texts like "[title=id with one or more words]Title with id[/title]" and "[title]Title without id[/title]". I'm not sure, though, whether the id attribute is allowed to contain spaces. I guess not: http://reference.sitepoint.com/html/core-attributes/id.
$str = '[title=title id]Title text[/title] No style, [b]Bold[/b], [i]emphasis[/i], no style.';
//try without this if there's trouble
$str = htmlentities($str);
//"#" works as a delimiter in PHP (not sure about JS), so no need to escape the "/" with a "\"
$patterns = array();
$patterns = array(
'#\[b\](.*?)\[/b\]#',
'#\[i\](.*?)\[/i\]#', //delete this row if you don't need emphasis style
'#\[title=?(.*?)\](.*?)\[/title\]#'
);
$replacements = array();
$replacements = array(
'<strong>$1</strong>',
'<em>$1</em>', // delete this row if you don't need emphasis style
'<h1 id="$1">$2</h1>'
);
//perform the conversion
$str = preg_replace($patterns, $replacements, $str);
echo $str;

regex for breadcrumb in php

I am currently building a breadcrumb. It works, for example, for
http://localhost/researchportal/proposal/
<?php
$url_comp = explode('/',substr($url,1,-1));
$end = count($url_comp);
print_r($url_comp);
foreach($url_comp as $breadcrumb) {
$landing="http://localhost/";
$surl .= $breadcrumb.'/';
if(--$end)
echo '
<a href='.$landing.''.$surl.'>'.$breadcrumb.'</a>»';
else
echo '
'.$breadcrumb.'';
};?>
But when I typed in http://localhost////researchportal////proposal//////////
all the formatting was gone, because the extra slashes confuse my code.
I need to have the site path in an array like ([1]->researchportal, [2]->proposal)
regardless of how many slashes I put.
So can $url_comp = explode('/',substr($url,1,-1)); be turned into a regular expression to get my desired output?

You don't need regex. Look at htmlentities() and stripslashes() in the PHP manual. A regex will return a boolean value of whatever it says, and won't really help you achieve what you are trying to do. All the regex can let you do is say if the string matches the regex do something. If you put in a regex requiring at least 2 characters between each slash, then any time anyone puts more than one consecutive slash in there, the if statement will stop.
http://ca3.php.net/manual/en/function.stripslashes.php
http://ca3.php.net/manual/en/function.htmlentities.php
I found this in the PHP manual.
It uses simple str_replace statements; modifying it should achieve exactly what your post was asking.
<?php
function stripslashes2($string) {
$string = str_replace("\\\"", "\"", $string);
$string = str_replace("\\'", "'", $string);
$string = str_replace("\\\\", "\\", $string);
return $string;
}
?>
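To answer the question as asked (can the explode call become a regular expression?), here is a minimal sketch using preg_split. It assumes $url holds only the path part, as the substr($url,1,-1) call in the question suggests. The helper name path_segments is made up. Note that stripslashes() and the function above deal with backslashes, so repeated forward slashes need something like this instead.

```php
<?php
// Hypothetical helper: split a path on runs of slashes, dropping empty pieces.
function path_segments(string $url): array
{
    // "/+" treats any number of consecutive slashes as one separator;
    // PREG_SPLIT_NO_EMPTY discards the empty strings at the start and end.
    return preg_split('#/+#', $url, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(path_segments('/researchportal/proposal/'));
print_r(path_segments('////researchportal////proposal//////////'));
// Both calls give the same two elements: researchportal, proposal
```

The resulting array is zero-based ([0] => researchportal, [1] => proposal) rather than starting at 1 as in the question, so the count($url_comp) logic in the breadcrumb loop keeps working unchanged.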

preg_replace need help with expression

This is my code:
$string = '« PreviousNext »';
$string = htmlspecialchars($string, ENT_COMPAT, 'UTF-8');
$string = preg_replace('#(<a).*?(nextlink)#s', '', $string);
echo $string;
I am trying to remove the last link:
Next »';
My current output:
">Next »</a>
It removes everything from the start.
I want it to remove only the one with strpos, is this possible with preg_replace and how?
Thanks.

Quite a tricky question to solve.
First off, the .*? will not match the way you are expecting it to.
It starts from the left, finds the first match for <a, then searches until it finds nextlink, which essentially picks up the entire string.
For that regex to work as you wanted, it would need to match from the right-hand side first and work backwards through the string, finding the smallest (non-greedy) match.
I couldn't see any modifiers that would do this,
so I opted for a callback on each link that checks for and removes any link with nextlink in it.
<?php
$string = '« PreviousNext »';
echo "RAW: $string\r\n\r\n";
$string = htmlspecialchars($string, ENT_COMPAT, 'UTF-8');
echo "SRC: $string\r\n\r\n";
$string = preg_replace_callback(
'#&lt;a.+?&lt;/a&gt;#', // entity forms, because $string has already been through htmlspecialchars()
'remove_nextlink',
$string
);
function remove_nextlink($matches) {
// if you want to see each line as it works, uncomment this
// echo "L: $matches[0]\r\n\r\n";
if (strpos($matches[0], 'nextlink') === FALSE) {
return $matches[0]; // doesn't contain nextlink, put original string back
} else {
return ''; // contains nextlink, replace with blank
}
}
echo "PROCESSED: $string\r\n\r\n";
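For comparison, a single-pattern sketch is also possible if the regex runs on the raw HTML, before htmlspecialchars(). The original links were lost from the question, so the markup below (href values and the prevlink/nextlink class names) is invented for illustration, and strip_next_link is a hypothetical name. The idea is to stop the lazy part of the pattern from stepping over the start of another <a tag.

```php
<?php
// Assumed markup: the real links are not shown in the question.
$html = '<a href="/page/1" class="prevlink">&laquo; Previous</a>'
      . '<a href="/page/3" class="nextlink">Next &raquo;</a>';

// Hypothetical helper: removes only the link whose tag contains "nextlink".
function strip_next_link(string $html): string
{
    // (?:(?!<a\b).)*? consumes characters lazily, but the negative lookahead
    // refuses to cross into another <a tag, so a match cannot begin at the
    // "Previous" link and run on into the "Next" link.
    return preg_replace('#<a\b(?:(?!<a\b).)*?nextlink.*?</a>#s', '', $html);
}

echo strip_next_link($html), "\n";
// <a href="/page/1" class="prevlink">&laquo; Previous</a>
```

The callback version above is easier to read and to extend; this one is just a way to keep everything in one preg_replace call.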

Note: This is not a direct answer, but a suggestion to another approach.
I was told once: if you can do it any other way, stay away from regex. I don't, though; it's my white whale. Have you heard of phpQuery? It's jQuery implemented in PHP and very powerful. It would be able to do what you want very easily. I know it's not regex, but perhaps it's of use to you.
If you really want to go ahead, I can recommend http://gskinner.com/RegExr/ . I think it's a great tool.

Supposedly valid regular expression doesn't return any data in PHP

I am using the following code:
<?php
$stock = $_GET[s]; //returns stock ticker symbol eg GOOG or YHOO
$first = $stock[0];
$url = "http://biz.yahoo.com/research/earncal/".$first."/".$stock.".html";
$data = file_get_contents($url);
$r_header = '/Prev. Week(.+?)Next Week/';
$r_date = '/\<b\>(.+?)\<\/b\>/';
preg_match($r_header,$data,$header);
preg_match($r_date, $header[1], $date);
echo $date[1];
?>
I've checked the regular expressions here and they appear to be valid. If I check just $url or $data they come out correctly and if I print $data and check the source the code that I'm looking for to use in the regex is in there. If you're interested in checking anything, an example of a proper URL would be http://biz.yahoo.com/research/earncal/g/goog.html
I've tried everything I could think of, including both var_dump($header) and var_dump($date), both of which return empty arrays.
I have been able to create other regular expressions that works. For instance, the following correctly returns "Earnings":
$r_header = '/Company (.+?) Calendar/';
preg_match($r_header,$data,$header);
echo $header[1];
I am going nuts trying to figure out why this isn't working. Any help would be awesome. Thanks.

Your regex doesn't allow for the line breaks in the HTML. Try:
$r_header = '/Prev\. Week((?s:.*))Next Week/';
The s flag makes the . (match any) also match newline characters.

Problem is that the HTML has newlines in it, which you need to incorporate with the s regex modifier, as below
<?php
$stock = "goog";//$_GET[s]; //returns stock ticker symbol eg GOOG or YHOO
$first = $stock[0];
$url = "http://biz.yahoo.com/research/earncal/".$first."/".$stock.".html";
$data = file_get_contents($url);
$r_header = '/Prev. Week(.+?)Next Week/s';
$r_date = '/\<b\>(.+?)\<\/b\>/s';
preg_match($r_header,$data,$header);
preg_match($r_date, $header[1], $date);
var_dump($header);
?>

Dot does not match newlines by default. Use /your-regex/s
$r_header should probably be /Prev\. Week(.+?)Next Week/s
FYI: You don't need to escape < and > in a regex.

You want to add the s (PCRE_DOTALL) modifier. By default . doesn't match newline, and I see the page has them between the two parts you look for.
Side note: although they don't hurt (except readability), you don't need a backslash before < and >.
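A small sketch of the difference the s modifier makes. The $data string is a made-up stand-in for the downloaded page (the date in it is invented), so only the newline behaviour is being shown, not the real markup.

```php
<?php
// Stand-in for the fetched HTML; the real page content will differ.
$data = "Prev. Week\n<b>April 14</b>\nNext Week";

// Without /s the dot cannot cross the newline, so nothing matches.
$without_s = preg_match('/Prev\. Week(.+?)Next Week/', $data, $header);
var_dump($without_s); // int(0)

// With /s (PCRE_DOTALL) the dot matches newlines too.
$with_s = preg_match('/Prev\. Week(.+?)Next Week/s', $data, $header);
var_dump($with_s); // int(1)

// < and > need no backslash; only the delimiter "/" is escaped.
preg_match('/<b>(.+?)<\/b>/s', $header[1], $date);
echo $date[1], "\n"; // April 14
```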

I think this is because you're applying the values to the regex as if it's plain text. However, it's HTML. For example, your regex should be modified to parse:
Prev. Week ...
Not to parse regular plain text like: "Prev. Week ...."
