Problems with Regular expressions in PHP

Problems with Regular expressions in PHP - php

(I want to let my users tag other users with their names, problem: when someone edits his post again, he gets the link in his tinymce editor. when he saves his edits, the script will destroy the old link...)
I replace all words in a big string with words included in an array.
$users = {'this', 'car'}
$text = hello, this is <a title="this" href="">a test this</a>
$search = '!\b('.implode('|', $users).')\b!i';
$replace = '<a target="_blank" alt="$1" href="/user/$1">$1</a>';
$text = preg_replace($search, $replace, $text);
as you can see above, I try to replace 'this' and 'car' in $text with
<a target="_blank" alt="$1" href="/user/$1">$1</a>
the problem is, that my script also replaces 'this', when it's in my link:
<a title="this" href="">this</a>
im not completely sure, but I think, you know what I mean.
so my script destroys my links...
I don't need to detect, if the word is in a html element, because it should be able to replace words in other tags like h1 or p ...
I need something like
a pattern, which only matches, when the word looks like:
" this "
" this, "
",this "
" this: "...
(no problem, if i have to set these manually...)
another great solution: a string, where I can set the html tags which are not allowed.
$tags = 'a,e,article';
Greets

This should do it
<.*?this.*?>(*SKIP)(*FAIL)|\b(this)\b
Demo: https://regex101.com/r/fX0pT1/1
More on this regex approach, http://www.rexegg.com/regex-best-trick.html.
PHP Usage:
$users = array('this', 'car');
$text = 'hello, this is <a title="this" href="">a test this</a>';
$terms = '(' . implode('|', $users) . ')';
$search = '!<.*?'.$terms.'.*?>(*SKIP)(*FAIL)|\b(' . $terms . ')\b!i';
echo $search;
$replace = '<a target="_blank" alt="$2" href="/user/$2">$2</a>';
echo preg_replace($search, $replace, $text);
Output:
hello, <a target="_blank" alt="" href="/user/this">this</a> is <a title="this" href="">a test <a target="_blank" alt="" href="/user/this">this</a></a>
PHP Demo: https://eval.in/415964
...or if you only want it for links, https://regex101.com/r/fX0pT1/2, <a.*?this.*?>(*SKIP)(*FAIL)|\b(this)\b.

Related

Replacing Relative Links with External Links in PHP String

I am working with an editor that works purely with internal relative links for files which is great for 99% of what I use it for.
However, I am also using it to insert links to files within an email body and relative links don't cut the mustard.
Instead of modifying the editor, I would like to search the string from the editor and replace the relative links with external links as shown below
Replace
files/something.pdf
With
https://www.someurl.com/files/something.pdf
I have come up with the following but I am wondering if there is a better / more efficient way to do it with PHP
<?php
$string = 'A link, some other text, A different link';
preg_match_all('/<a[^>]+href=([\'"])(?<href>.+?)\1[^>]*>/i', $string, $result);
if (!empty($result)) {
// Found a link.
$baseUrl = 'https://www.someurl.com';
$newUrls = array();
$newString = '';
foreach($result['href'] as $url) {
$newUrls[] = $baseUrl . '/' . $url;
}
$newString = str_replace($result['href'], $newUrls, $string);
echo $newString;
}
?>
Many thanks
Lee

You can simply use preg_replace to replace all the occurrences of files starting URLs inside double quotes:
$string = 'A link, some other text, A different link';
$string = preg_replace('/"(files.*?)"/', '"https://www.someurl.com/$1"', $string);
The result would be:
A link, some other text, A different link

You really should use DOMdocument for such job, but if you want to use a regex, this one does the job:
$string = '<a some_attribute href="files/something.pdf" class="abc">A link</a>, some other text, <a class="def" href="files/somethingelse.pdf" attr="xyz">A different link</a>';
$baseUrl = 'https://www.someurl.com';
$newString = preg_replace('/(<a[^>]+href=([\'"]))(.+?)\2/i', "$1$baseUrl/$3$2", $string);
echo $newString,"\n";
Output:
<a some_attribute href="https://www.someurl.comfiles/something.pdf" class="abc">A link</a>, some other text, <a class="def" href="https://www.someurl.com/files/somethingelse.pdf" attr="xyz">A different link</a>

PHP : add a html tag around specifc words

I have a data base with texts and in each text there are words (tags) that start with # (example of a record : "Hi I'm posting an #issue on #Stackoverflow ")
I'm trying to find a solution to add html code to transform each tag into a link when printing the text.
So the text are stored as strings in MySQL database like this :
Some text #tag1 text #tag2 ...
I want to replace all these #abcd with
#abcd
And have a final result as follow:
Some text #tag1 text #tag2 ...
I guess that i should use some regex but it is not at all my strong side.

Try the following using preg_replace(..)
$input = "Hi I'm posting an #issue on #Stackoverflow";
echo preg_replace("/#([a-zA-Z0-9]+)/", "<a href='targetpage.php?val=$1'>#$1</a>", $input);
http://php.net/manual/en/function.preg-replace.php

A simple solution could look like this:
$re = '/\S*#(\[[^\]]+\]|\S+)/m';
$str = 'Some text #tag1 text #tag2 ...';
$subst = '#$1';
$result = preg_replace($re, $subst, $str);
echo "The result of the substitution is ".$result;
Demo
If you are actually after Twitter hashtags and want to go crazy take a look here how it is done in Java.
There is also a JavaScript Twitter library that makes things very easy.

Try this the function
<?php
$demoString1 = "THIS is #test STRING WITH #abcd";
$demoString2 = "Hi I'm posting an #issue on #Stackoverflow";
function wrapWithAnchor($link,$string){
$pattern = "/#([a-zA-Z0-9]+)/";
$replace_with = '<a href="'.$link.'?val=$1">$1<a>';
return preg_replace( $pattern, $replace_with ,$string );
}
$link= 'http://www.targetpage.php';
echo wrapWithAnchor($link,$demoString1);
echo '<hr />';
echo wrapWithAnchor($link,$demoString2);
?>

PHP Substring of a regex replacement

I have the following regex :
$string = preg_replace("/([\w]+:\/\/[\w-?&;#~=\.\/\#]+[\w\/])/i","<a target=\"_blank\" href=\"$1\">$1</A>",$string);
Using it to parse this string : http://www.ttt.com.ar/hello_world
Produces this new string :
<a target="_blank" href="http://www.ttt.com.ar/hello_world">http://www.ttt.com.ar/hello_world</A>
So far , soo good. What I want to do is to get replacement $1 to be a substring of $1 producing an output like :
<a target="_blank" href="http://www.ttt.com.ar/hello_world">http://www.ttt.com.ar/...</A>
Pseudocode of what I mean:
$string = preg_replace("/([\w]+:\/\/[\w-?&;#~=\.\/\#]+[\w\/])/i","<a target=\"_blank\" href=\"$1\">substring($1,0,24)..</A>",$string);
Is this even possible? Probably Im just doing all wrong :)
Thanks in advance.

Check out preg_replace_callback():
$string = 'http://www.ttt.com.ar/hello_world';
$string = preg_replace_callback(
"/([\w]+:\/\/[\w-?&;#~=\.\/\#]+[\w\/])/i",
function($matches) {
$link = $matches[1];
$substring = substr($link, 0, 24) . '..';
return "<a target=\"_blank\" href=\"$link\">$substring</a>";
},
$string
);
var_dump($string);
// <a target="_blank" href="http://www.ttt.com.ar/hello_world">http://www.ttt.com.ar/...</a>
Note, you can also use the e modifier in PHP to execute functions in your preg_replace(). This has been deprecated in PHP 5.5.0, in favor of preg_replace_callback().

You can use a capturing group inside of a lookahead like this:
preg_replace(
"/((?=(.{24}))[\w]+:\/\/[\w-?&;#~=\.\/\#]+[\w\/])/i",
"<a target=\"_blank\" href=\"$1\">$2..</A>",
$string);
This will capture the entire URL in group 1, but it will also capture the first 24 characters of it in group 2.

You are showing bad practice. Regexes should not being used to parse or modify xml content from application's context.
Suggests:
Use a DOM parsing to read and modify the value
use parse_url() to get the protocol + domain name
Example:
$doc = new DOMDocument();
$doc->loadHTML(
'<a target="_blank" href="http://www.ttt.com.ar/hello_world">http://www.ttt.com.ar/hello_world</A>'#
);
$link = $doc->getElementsByTagName('a')->item(0);
$url = parse_url($link->nodeValue);
$link->nodeValue = $url['scheme'] . '://' . $url['host'] . '/...';
echo $doc->saveHTML();

How to use preg_replace to remove part of url address?

I have some HTML code like this:
<a href="http://mysite.com/documentos/Servicios/SUCRE/sucDoc19.pdf&sa=U&ei=sf0JUrmjIc3Nswb154CgDQ&ved=0CCkQFjAA&usg=AFQjCNGfXg_9x83U3pYr6JfkJcWuXv8X0Q">
I need to clean my code to get something like this
<a href="http://mysite.com/documentos/Servicios/SUCRE/sucDoc19.pdf">
using preg_replace.
My code is the following:
$serp = preg_replace('&sa=(.*)" ', '" ', $serp);
and it doesn't work.
BTW i need to restrict search with preg_replace until the FIRST entrance, i.e. i need to replace all html from &sa= to the FIRST ", but now it search from &sa= to the LAST "...

You're missing the regex delimiters.
$serp = preg_replace('/&sa=(.*)" /', '" ', $serp);
will give you this.

You missed the delimiter.
So your code looks like:
$serp = preg_replace('/&sa=(.*)" /', '" ', $serp);
okay, if you want to delete everything till the first quote then you can try the following instead of regex:
$temp = substr($serp,strpos($serp,'&sa='),strpos($serp,'"',strpos($serp,'&sa=')));
$serp = str_replace($temp,"",$serp);

Just another regex to do it :)
$text = '<a href="http://mysite.com/documentos/Servicios/SUCRE/sucDoc19.pdf&sa=U&ei=sf0JUrmjIc3Nswb154CgDQ&ved=0CCkQFjAA&usg=AFQjCNGfXg_9x83U3pYr6JfkJcWuXv8X0Q" target="_blank">';
$text = preg_replace('/(&sa=[^"]*)/', '', $text);
echo $text;
// Output:
<a href="http://mysite.com/documentos/Servicios/SUCRE/sucDoc19.pdf" target="_blank">
You can try it HERE (thks to hjpotter92 for this tool)

Add id attribute to hyperlinks through PHP Regular Expressions

I am still relatively new to Regular Expressions and feel My code is being too greedy. I am trying to add an id attribute to existing links in a piece of code. My functions is like so:
function addClassHref($str) {
//$str = stripslashes($str);
$preg = "/<[\s]*a[\s]*href=[\s]*[\"\']?([\w.-]*)[\"\']?[^>]*>(.*?)<\/a>/i";
preg_match_all($preg, $str, $match);
foreach ($match[1] as $key => $val) {
$pattern[] = '/' . preg_quote($match[0][$key], '/') . '/';
$replace[] = "<a id='buttonRed' href='$val'>{$match[2][$key]}</a>";
}
return preg_replace($pattern, $replace, $str);
}
This adds the id tag like I want but it breaks the hyperlink. For example:
If the original code is : Link
Instead of <a id="class" href="http://www.google.com">Link</a>
It is giving
<a id="class" href="http">Link</a>
Any suggestions or thoughts?

Do not use regular expressions to parse XML or HTML.
$doc = new DOMDocument();
$doc->loadHTML($html);
$all_a = $doc->getElementsByTagName('a');
$firsta = $all_a->item(0);
$firsta->setAttribute('id', 'idvalue');
echo $doc->saveHTML($firsta);

You've got some overcomplications in your regex :)
Also, there's no need for the loop as preg_replace() will hit all the instances of the search pattern in the relevant string. The first regex below will take everything in the a tag and simply add the id attribute on at the end.
$str = 'Link' . "\n" .
'Link' . "\n" .
'Link';
$p = "{<\s*a\s*(href=[^>]*)>([^<]*)</a>}i";
$r = "<a $1 id=\"class\">$2</a>";
echo preg_replace($p, $r, $str);
If you only want to capture the href attribute you could do the following:
$p = '{<\s*a\s*href=["\']([^"\']*)["\'][^>]*>([^<]*)</a>}i';
$r = "<a href='$1' id='class'>$2</a>";

Your first subpattern ([\w.-]*) doesn't match :, thus it stops at "http".
Couldn't you just use a simple str_replace() for this? Regex seems like overkill if this is all you're doing.
$str = str_replace('<a ', '<a id="someID" ', $str);

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.

Problems with Regular expressions in PHP - php

Related

Replacing Relative Links with External Links in PHP String

PHP : add a html tag around specifc words

PHP Substring of a regex replacement

How to use preg_replace to remove part of url address?

Add id attribute to hyperlinks through PHP Regular Expressions

Categories

Resources