PHP syntax error on preg_replace method - php

I'm trying to write a BBCode parser class that can create personalized tags, but I have a problem with URLs.
I did everything I needed without particular problems thanks to regular expressions, but I have a problem when I try to create a special tag that points to a specified URL.
In my class I've added a method like this:
<?
private function check_url_from_bbcode ($tag = "url", $css = null, $url_path = null) {
$regex_url = " a-zA-Z0-9\:\/\-\?\&\.\=\_\~\#\'";
if (!isset ($css)) $css = $this->css_link;
$regex_url = $this->regex_url;
if (isset ($url_path)) $url_path = "$url_path/";
$this->bbcode = preg_replace ("/\[$tag\]([$regex_url]*)\[\/$tag\]/", "<a title=\"Vai al link\" class=\"$css\" href=\"$url_path$1\">$1</a>", $this->bbcode);
$this->bbcode = preg_replace ("(\[$tag\=([$regex_url]*)\](.+?)\[/$tag\])", "<a title=\"Vai al link\" class=\"$css\" href=\"$url_path$1\">$2</a>", $this->bbcode);
}
?>
The code works fine with classic URLs such as [url]http://ciao.com[/url] and [url=http://ciao.com]ciao[/url],
but I have a problem with special URLs that point to a subject page, last.fm style, such as [artist]Lemon Jelly[/artist].
The tag is converted to < a href="http://ciao.com/artist/Lemon Jelly">Lemon Jelly< /a> (I've added the spaces in < a> only to show the link code).
The href attribute contains whitespace, so the link can never work.
<?
private function check_url_from_bbcode ($tag = "url", $css = null, $url_path = null) {
$regex_url = " a-zA-Z0-9\:\/\-\?\&\.\=\_\~\#\'";
if (!isset ($css)) $css = $this->css_link;
$regex_url = $this->regex_url;
if (isset ($url_path)) $url_path = "$url_path/";
// begin of the problem
$this->bbcode = preg_replace ("/\[$tag\]([$regex_url]*)\[\/$tag\]/", "<a title=\"Vai al link\" class=\"$css\" href=\"$url_path".str_replace (" ", "+", $1)."\">$1</a>", $this->bbcode);
// end of the problem
$this->bbcode = preg_replace ("(\[$tag\=([$regex_url]*)\](.+?)\[/$tag\])", "<a title=\"Vai al link\" class=\"$css\" href=\"$url_path$1\">$2</a>", $this->bbcode);
}
?>
To avoid this, I wrote a small piece of code that replaces the href portion with the same URL but with "+" in place of each " " whitespace character, so [artist]Lemon Jelly[/artist] should become < a href="http://ciao.com/artist/Lemon+Jelly">Lemon Jelly< /a>
I'm not experienced with PHP and I'm not sure what the problem is; I've used this syntax in other situations without encountering it.
Can someone help me find where I'm going wrong?
The error is: PHP Parse error: syntax error, unexpected T_LNUMBER, expecting T_VARIABLE or '$' in /...

Please provide the 2-3 lines before and after the line that has the error (the line number should be in the PHP parse error).
This doesn't sound like a regex problem; it looks more like an escaping issue.
I'm rather new to SO; why couldn't I just comment on the question to ask this (since what I wrote isn't really an answer)?

The parse error occurs because $1 is not a valid PHP variable name: variable names must start with a letter or an underscore. The $1 backreference is only meaningful inside the string arguments passed to preg_replace, not in the surrounding PHP code.
Try something more like this:
$this->bbcode = preg_replace("/\[$tag\]([$regex_url]*)\[\/$tag\]/e", "'<a title=\"Vai al link\" class=\"$css\" href=\"$url_path'. str_replace(' ', '+', '$1'). '\">$1</a>'", $this->bbcode);
The e modifier evaluates the replacement string as PHP code, so that should generate the string you want.
Note, I can't test it right now, so you may need to tweak it a bit. Sorry :]
Edit Never mind that, I got a hold of a FTP client. Fixed the regex. It works :)
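The e modifier used above was deprecated in PHP 5.5 and removed in PHP 7.0, so on current PHP versions the same idea is usually written with preg_replace_callback. The following is only a sketch under some assumptions: link_tag_to_anchor is a hypothetical standalone function (the original is a private class method), the character class is copied from the question, and $url_path is assumed to already end with a slash.

```php
<?php
// Hypothetical standalone version of the question's method.
// Assumes $url_path already ends with "/" (the original appends it).
function link_tag_to_anchor(string $bbcode, string $tag, string $css, string $url_path): string
{
    // Character class copied from the question.
    $regex_url = " a-zA-Z0-9\:\/\-\?\&\.\=\_\~\#\'";
    $quotedTag = preg_quote($tag, '/');
    $pattern = '/\[' . $quotedTag . '\]([' . $regex_url . ']*)\[\/' . $quotedTag . '\]/';

    // The callback receives the match array, so $m[1] is a real PHP value
    // and can be passed to str_replace like any other string.
    return preg_replace_callback($pattern, function (array $m) use ($css, $url_path): string {
        $href = $url_path . str_replace(' ', '+', $m[1]);
        return '<a title="Vai al link" class="' . $css . '" href="' . $href . '">' . $m[1] . '</a>';
    }, $bbcode);
}

echo link_tag_to_anchor('[artist]Lemon Jelly[/artist]', 'artist', 'link', 'http://ciao.com/artist/'), "\n";
```

Because the replacement is built in ordinary PHP code, no string is evaluated as code, which also avoids the injection risk that came with the e modifier.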

To avoid this, I've wrote a little
part of code that change the href url
portion with tha same url but with "+"
in place of " "
Why not use urlencode on the contents of the tag?
Note that urlencode should only be used for query parameters; actual directory components should use rawurlencode, as HTTP itself doesn't use + instead of spaces.
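To make the difference concrete, here is a small sketch using the artist name from the question. The variable names are only illustrative.

```php
<?php
$name = 'Lemon Jelly';

// urlencode follows form encoding: spaces become "+".
$asQueryValue = urlencode($name);

// rawurlencode follows RFC 3986 percent-encoding: spaces become "%20".
$asPathSegment = rawurlencode($name);

echo $asQueryValue, "\n";
echo 'http://ciao.com/artist/' . $asPathSegment, "\n";
```

Which one is appropriate depends on where the value ends up: a query parameter or a path segment.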

I believe the problem is with the $1 reference, which is used in the str_replace function outside of the preg_replace arguments. The $1 backreference only works within the confines of preg_replace arguments, so you can't pass it like a variable to str_replace; it tries to use $1 as a variable, but in PHP that is invalid for a variable name.

Related

How to use preg_replace with url encoded $_GET?

I have a URL which looks like this: https://URL.DOMAIN/blog.php?id=43&q=echo%20%27test%27.
When I use <?php echo $_GET['q'] ?> it displays echo 'test' which is what I want.
I am using this variable inside a preg_replace function which is basically made to apply a yellow background under matched strings:
preg_replace('/\b('.$_GET['q'].')\b/iu', '<span class="research-news-found">$1</span>', $news_content);
It works perfectly for "normal" strings like "apple" or whatever, but when there is a ' inside the search query it doesn't match anything.
Code example
$news_content = $news_display['news_description'];
if(isset($_GET['q'])){
$news_content = preg_replace('/\b('.$_GET['q'].')\b/iu', '<span class="research-news-found">$1</span>', $news_content);
}
$news_display['news_description'] contains the text output from DB.
Make the last character of the query optional with ? and remove the trailing word boundary \b. Since ' is not a word character, a \b placed after it only matches when a word character follows immediately:
$news_content = preg_replace('/\b('.$_GET['q'].'?)/iu',
'<span class="research-news-found">$1</span>',
$news_content);
Demo
But if you are hoping that it will actually echo test, then no. You would need to restructure your question to state what you want to achieve, not how to get this replacement to work.
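A different way to handle queries that start or end with a non-word character is to escape the user input with preg_quote and replace \b with lookarounds. This is only a sketch: highlight_query is a hypothetical helper name, and it assumes the goal is plain highlighting of the literal query text.

```php
<?php
// Hypothetical helper: highlights a literal search query inside $text.
function highlight_query(string $text, string $query): string
{
    // preg_quote escapes regex metacharacters in the user input,
    // including the "/" delimiter passed as the second argument.
    $quoted = preg_quote($query, '/');

    // (?<!\w) and (?!\w) only require that no word character touches the
    // match, so they also work when the query ends in a quote.
    $pattern = '/(?<!\w)(' . $quoted . ')(?!\w)/iu';

    return preg_replace($pattern, '<span class="research-news-found">$1</span>', $text);
}

echo highlight_query("Run echo 'test' now", "echo 'test'"), "\n";
```

Escaping the input also keeps characters such as ( or + in a search query from being interpreted as regex syntax.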

How to add missing http:// to an anchor in a string - PHP

I wrote code that adds a hyperlink to all plain text where it finds http:// or https://. The code works well for https://www.google.com and http://yahoo.com: it converts this text into clickable hyperlinks with the correct address.
<?php
function convert_text_to_link($str)
{
$pattern = "/(?:(https?):\/\/([^\s<]+)|(www\.[^\s<]+?\.[^\s<]+))(?<![\.,:])/i";
return preg_replace($pattern, "<a href='$0' target='_blank'>$0</a>", $str);
}
$str = "https://www.google.com is the biggest search engine. It's competitors are http://yahoo.com and www.bing.com.";
echo convert_text_to_link($str);
?>
But when my code sees www.bing.com, it adds a hyperlink, but the href attribute is also just www.bing.com, with no http:// prepended. The link is therefore unusable: the browser resolves it relative to the current page, as http://localhost/myproject/www.bing.com, which goes nowhere.
How can I add http:// to www.bing.com so that it becomes http://www.bing.com?
Here is your function. Try this.
function convert_text_to_link($str) {
$pattern = '#(http)?(s)?(://)?(([a-zA-Z])([-\w]+\.)+([^\s\.]+[^\s]*)+[^,.\s])#';
return preg_replace($pattern, '<a href="http$2://$4">$0</a>', $str);
}
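Another way to prepend the scheme only where it is missing is to decide per match inside a callback. The sketch below reuses the spirit of the question's pattern in simplified form; convert_text_to_link_with_scheme is a hypothetical name, and defaulting to http:// is an assumption.

```php
<?php
// Hypothetical variant of the question's function: links URLs and adds
// "http://" to the href when the matched text has no scheme.
function convert_text_to_link_with_scheme(string $str): string
{
    // Simplified from the question's pattern: either a full http(s) URL
    // or a bare www. host, not ending in ".", "," or ":".
    $pattern = '/(?:https?:\/\/[^\s<]+|www\.[^\s<]+?\.[^\s<]+)(?<![.,:])/i';

    return preg_replace_callback($pattern, function (array $m): string {
        $url = $m[0];
        // Keep the visible text as typed; only the href gets the scheme.
        $href = preg_match('/^https?:\/\//i', $url) ? $url : 'http://' . $url;
        return "<a href='" . $href . "' target='_blank'>" . $url . "</a>";
    }, $str);
}

echo convert_text_to_link_with_scheme('Competitors are http://yahoo.com and www.bing.com.'), "\n";
```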
You should try and check if this works:
window.location = window.location.href.replace(/^www./, 'https:');
might be you will get your solution.
I just got to know about some other approaches too, you can try them out as per your code and requirements:
1.
$str = str_replace("www.", "http://www.", $str);
Note that any test for an existing http:// prefix is case-sensitive. This means that if the string is initially Http://example.com, it will be changed to http://Http://example.com, which is probably not what you want.
Try a regex:
if (!preg_match('/^[a-zA-Z]+:\/\//', $str))
{
$str = 'http://' . $str;
}
hope this helps.

PHP wont recognise double line feed

I am running a RST to php conversion and am using preg_match.
this is the rst i am trying to identify:
An example of the **Horizon Mapping** dialog box is shown below. A
summary of the main features is given below.

.. figure:: horizon_mapping_dialog_horizons_tab.png

   **Horizon Mapping** dialog box, *Horizons* tab

Some of the input values to the **Horizon Mapping** job can be changed
during a Workflow using the internal programming language, IPL. For
details, refer to the *IPL User Guide*.
and I am using this regex:
$match = preg_match("/.. figure:: (.*?)(\n{2}[ ]{3}.*\n)/s", $text, &$result);
however it is returning as false.
here is a link of the expression working on regex
http://regex101.com/r/oB3fW7.
Are you sure that the line break is \n? If in doubt, use \R:
$match = preg_match("/.. figure:: (.*?)(\R{2}[ ]{3}.*\R)/s", $text, $result);
\R matches \n, \r, and \r\n, among other line-break sequences.
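The effect of \R can be checked on a small sample with Windows-style line endings. This is only a sketch: the sample text and file name are made up, and the pattern is a tightened variant of the one above (it uses \V, which matches any character that is not vertical whitespace).

```php
<?php
// Made-up sample using \r\n line endings, as a file saved on Windows might.
$text = "intro\r\n\r\n.. figure:: horizon.png\r\n\r\n   **Horizon Mapping** dialog box\r\n\r\nmore";

// The \n-only pattern finds nothing, because "\n\n" never occurs in CRLF text.
$withLf = preg_match('/\.\. figure:: (.*?)\n{2}[ ]{3}.*\n/s', $text);

// \R treats "\r\n" as a single line break, so \R{2} matches the blank line.
$withR = preg_match('/\.\. figure:: (\V*)\R{2}[ ]{3}(\V*)\R/', $text, $result);

var_dump($withLf, $withR);
print_r($result);
```

If the first call returns 0 and the second returns 1 on the real input, the line endings are the cause.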
My instinct would be to do some troubleshooting around the s flag as well as the $result variable passed by reference. To achieve the same without any interference from dots and the return variable, can you please try this regex:
..[ ]figure::[ ]([^\r\n]*)(?:\n|\r\n){2}[ ]{3}[^\r\n]*\R
In code, please try exactly like this:
$regex = "~..[ ]figure::[ ]([^\r\n]*)(?:\n|\r\n){2}[ ]{3}[^\r\n]*\R~";
if(preg_match($regex,$text,$m)) echo "Success! </br>";
Finally:
If this does not work, you might have an unusual Unicode line break that PHP is not catching. To debug, iterate through all of the string's characters and print each one with its code point (_uniord here is a user-defined helper, not a PHP built-in):
Iterate: foreach(str_split($text) as $c) {
Print the character: echo $c . " value = "
Print the value from this function: . _uniord($c) . "<br />"; }
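A shorter way to see which line-break bytes a string actually contains, without a custom code-point helper, is to collect every \R match and print it as hex. This is a sketch; dump_line_break_bytes is a hypothetical name, and it inspects raw bytes rather than Unicode code points.

```php
<?php
// Hypothetical debugging helper: returns each line break found in $text
// as a hex string, e.g. "0a" for \n and "0d0a" for \r\n.
function dump_line_break_bytes(string $text): array
{
    preg_match_all('/\R/', $text, $matches);
    return array_map('bin2hex', $matches[0]);
}

print_r(dump_line_break_bytes("a\r\nb\nc"));
```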

PHP DomDocument to replace pattern

I need to find and replace http links to hyperlinks. These http links are inside span tags.
$text has html page. One of the span tags has something like
<span class="styleonetwo" >http://www.cnn.com/live-event</span>
Here is my code:
$doc = new DOMDocument();
$doc->loadHTML($text);
foreach($doc->getElementsByTagName('span') as $anchor) {
$link = $anchor->nodeValue;
if(substr($link, 0, 4) == "http")
{
$link = "$link";
}
if(substr($link, 0, 3) == "www")
{
$link = "$link";
}
$anchor->nodeValue = $link;
}
echo $doc->saveHTML();
It works ok. However...I want this to work even if the data inside span is something like:
<span class="styleonetwo" > sometexthere http://www.cnn.com/live-event somemoretexthere</span>
Obviously above code wont work for this situation. Is there a way we can search and replace a pattern using DOMDocument without using preg_replace?
Update: To answer phil's question regarding preg_replace:
I used regexpal.com to test the following pattern matching:
\b(?:(?:https?|ftp|file)://|(www|ftp)\.)[-A-Z0-9+&##/%?=~_|$!:,.;]*[-A-Z0-9+&##/%=~_|$]
It works great in the regextester provided in regexpal. When I use the same pattern in PHP code, I got tons of weird errors. I got unknown modifier error even for escape character! Following is my code for preg_replace
$httpRegex = '/\b(\?:(\?:https?|ftp|file):\/\/|(www|ftp)\.)[-A-Z0-9+&##/%\?=~_|$!:,.;]*[-A-Z0-9+&##/%=~_|$]/';
$cleanText = preg_replace($httpRegex, "<a href='$0'>$0</a>", $text);
I was so frustrated with "unknown modifiers" and pursued DOMDocument to solve my problem.
Regular expressions suit this problem well, so it is better to use preg_replace.
Your pattern just has several unescaped delimiters in it, so either escape them or choose another character as the delimiter, for instance ^. The corrected pattern would be:
$httpRegex = '^\b(?:(?:https?|ftp|file):\/\/|(www|ftp)\.)[-A-Z0-9+&##\/%\?=~_|$!:,.;]*[-A-Z0-9+&##\/%=~_|$]^i';
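As a variation on the same idea, the sketch below uses bracket-style delimiters { and }, which do not appear anywhere in the pattern, so no / needs escaping. The choice of braces is my substitution rather than part of the answer, the character classes are copied from the question as they appear, and the sample sentence is made up.

```php
<?php
// Same pattern as in the question, wrapped in { } delimiters so that the
// slashes inside it need no escaping. The trailing "i" makes it case-insensitive.
$httpRegex = '{\b(?:(?:https?|ftp|file)://|(www|ftp)\.)[-A-Z0-9+&##/%?=~_|$!:,.;]*[-A-Z0-9+&##/%=~_|$]}i';

$text = 'Watch http://www.cnn.com/live-event now';

// $0 in the replacement refers to the whole match.
$cleanText = preg_replace($httpRegex, '<a href="$0">$0</a>', $text);

echo $cleanText, "\n";
```

An "unknown modifier" error generally means PHP found the closing delimiter earlier than intended and tried to read the rest of the pattern as flags.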

Regex to determine text 'http://...' but not in iframes, embeds...etc

This regex is used to replace text links with a clickable anchor tag.
#(?<!href="|">)((?:https?|ftp|nntp)://[^\s<>()]+)#i
My problem is, I don't want it to change links that are in things like <iframe src="http://... or <embed src="http://...
I tried checking for a whitespace character before it by adding \s, but that didn't work.
Or - it appears they're first checking that an href=" doesn't already exist (?) - maybe I can check for the other things too?
Any thoughts or explanations on how I would do this are greatly appreciated. Mainly, I just need the regex; I can implement it in CakePHP myself.
The actual code comes from CakePHP's Text->autoLink():
function autoLinkUrls($text, $htmlOptions = array()) {
$options = var_export($htmlOptions, true);
$text = preg_replace_callback('#(?<!href="|">)((?:https?|ftp|nntp)://[^\s<>()]+)#i', create_function('$matches',
'$Html = new HtmlHelper(); $Html->tags = $Html->loadConfig(); return $Html->link($matches[0], $matches[0],' . $options . ');'), $text);
return preg_replace_callback('#(?<!href="|">)(?<!http://|https://|ftp://|nntp://)(www\.[^\n\%\ <]+[^<\n\%\,\.\ <])(?<!\))#i',
create_function('$matches', '$Html = new HtmlHelper(); $Html->tags = $Html->loadConfig(); return $Html->link($matches[0], "http://" . $matches[0],' . $options . ');'), $text);
}
You can expand the lookbehind at the beginning of those regexes to check for src=" as well as href=", like this:
(?<!href="|src="|">)
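The expanded lookbehind can be tried outside CakePHP with plain preg_replace. This is a sketch only: auto_link_urls is a hypothetical stand-in that emits a bare anchor tag instead of calling HtmlHelper, and the sample input is made up.

```php
<?php
// Hypothetical stand-in for the first half of autoLinkUrls, without HtmlHelper.
function auto_link_urls(string $text): string
{
    // The lookbehind rejects URLs that directly follow href=", src=" or ">.
    $pattern = '#(?<!href="|src="|">)((?:https?|ftp|nntp)://[^\s<>()]+)#i';
    return preg_replace($pattern, '<a href="$1">$1</a>', $text);
}

echo auto_link_urls('Visit http://example.com or <iframe src="http://example.org/embed"></iframe>'), "\n";
```

PCRE accepts this lookbehind because each alternative has a fixed length, even though the lengths differ from one another.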
