PHP regex replace substring

I am trying to detect a URL with a PHP regex and replace every &amp; it contains with a plain &. I ran htmlspecialchars on all my input data, but I want URLs to remain readable. I tried this, which obviously doesn't work because the replacement part is wrong:
preg_replace('!(http(s)?://((\S)|(&amp))*)!m', '&', $message);
Basically, I want the whole string to stay the same, except that each &amp; inside a URL becomes &. I was thinking of using preg_match_all, but that won't work unless the values of the array are passed by reference.
Any ideas on how I could do it?

You can match the URLs with a relatively simple !https?://\S+! (http:// or https:// followed by one or more non-whitespace characters) and fix the &amp; inside each match with preg_replace_callback:
$message = preg_replace_callback('!https?://\S+!', function ($m) {
    // $m[0] is the matched URL; unescape the ampersands only there
    return str_replace('&amp;', '&', $m[0]);
}, $message);
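As a quick check of the callback approach, the sketch below runs it on a made-up message. The sample text and the variable $fixed are assumptions for illustration, and it assumes the input was escaped with htmlspecialchars, so ampersands appear as &amp;. Only the ampersand inside the URL should be unescaped; the one in the surrounding text stays escaped.

```php
<?php
// Hypothetical input: what htmlspecialchars() produces for
// "Tom & Jerry: http://example.com/?a=1&b=2 end"
$message = 'Tom &amp; Jerry: http://example.com/?a=1&amp;b=2 end';

// Match each URL and unescape &amp; only inside the matched text.
$fixed = preg_replace_callback('!https?://\S+!', function ($m) {
    return str_replace('&amp;', '&', $m[0]);
}, $message);

echo $fixed, PHP_EOL;
// Tom &amp; Jerry: http://example.com/?a=1&b=2 end
```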

This may work for you:
preg_match_all('%https?://\S+%i', $html, $matches, PREG_PATTERN_ORDER);
foreach ($matches[0] as $match)
{
    // Unescape the ampersands inside the matched URL
    $fixed = str_replace('&amp;', '&', $match);
    // A literal replacement avoids having to escape the URL for a second regex
    $html = str_replace($match, $fixed, $html);
}

Related

How to not perform preg_replace if subject starts with quote

I'm trying to convert plain links to HTML links using preg_replace. However, it also replaces links that have already been converted.
To avoid this, I'd like it to skip the replacement if the link starts with a quote.
I think a positive lookahead may be needed, but nothing I've tried has worked.
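$string = '<a href="http://www.example.com">test</a> http://www.example.com';
$string = preg_replace("/((https?:\/\/[\w]+[^ \,\"\n\r\t<]*))/is", "<a href=\"$1\">$1</a>", $string);
var_dump($string);
The above outputs:
<a href="<a href="http://www.example.com">http://www.example.com</a>">test</a> <a href="http://www.example.com">http://www.example.com</a>
When it should output:
<a href="http://www.example.com">test</a> <a href="http://www.example.com">http://www.example.com</a>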

Lookarounds may help here.
Lookarounds are zero-width assertions: they check what comes immediately before or after the current position without consuming any characters.
That being said, a negative lookbehind might be what you need in your situation:
(?<![">])\bhttps?://\S+\b
In PHP this would be:
<?php
$string = 'I want to be transformed to a proper link: http://www.google.com ';
$string .= 'But please leave me alone ';
$string .= '(<a href="https://www.google.com">https://www.google.com</a>).';
$regex = '~ # delimiter
(?<![">]) # a neg. lookbehind
https?://\S+ # http:// or https:// followed by not a whitespace
\b # a word boundary
~x'; # verbose to enable this explanation.
$string = preg_replace($regex, "<a href='$0'>$0</a>", $string);
echo $string;
?>
That said, an HTML parser may be more appropriate for this job.

Since preg_replace accepts arrays of patterns and replacements, this might be convenient, depending on what you want to achieve:
<?php
$string = '<a href="http://www.example.com">test</a> http://www.example.com';
$rx = array("&(<a.+https?:\/\/[\w]+[^ \,\"\n\r\t<]*>)(.*)(<\/a\>)&si", "&(\s){1,}(https?:\/\/[\w]+[^ \,\"\n\r\t<]*)&");
$rp = array("$1$2$3", "<a href=\"$2\">$2</a>");
$string = preg_replace($rx, $rp, $string);
var_dump($string);
// DUMPS:
// '<a href="http://www.example.com">test</a><a href="http://www.example.com">http://www.example.com</a>'

The Idea
You can split your string at the already existing anchors, and only parse the pieces in between.
The Code
$input = '<a href="http://www.example.com">test</a> http://www.example.com';
// Split the string at existing anchors.
// The PREG_SPLIT_DELIM_CAPTURE flag includes the delimiters in the result set.
// It is the fourth argument, so the limit (-1 = no limit) is passed explicitly.
$parts = preg_split('/(<a.*?>.*?<\/a>)/is', $input, -1, PREG_SPLIT_DELIM_CAPTURE);
// Use array_map to parse each piece, and then join all pieces together
$output = join('', array_map(function ($key, $part) {
    // Because the delimiters are part of the result set,
    // every $part with an odd key is an existing anchor and is left untouched.
    return $key % 2
        ? $part
        : preg_replace("/((https?:\/\/[\w]+[^ \,\"\n\r\t<]*))/is", "<a href=\"$1\">$1</a>", $part);
}, array_keys($parts), $parts));
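To see which keys hold the anchors, it can help to inspect what preg_split returns with PREG_SPLIT_DELIM_CAPTURE. A small sketch, assuming a made-up input with one existing anchor followed by a plain URL; note that the flag goes in the fourth parameter, after the limit:

```php
<?php
// Assumed sample input: one existing anchor, then a bare URL.
$input = '<a href="http://www.example.com">test</a> http://www.example.com';

// The flag is the fourth argument; -1 means "no limit".
$parts = preg_split('/(<a.*?>.*?<\/a>)/is', $input, -1, PREG_SPLIT_DELIM_CAPTURE);

// Even keys hold the text between anchors, odd keys hold the anchors themselves.
foreach ($parts as $key => $part) {
    echo $key, ($key % 2 ? ' anchor: ' : ' text:   '), var_export($part, true), PHP_EOL;
}
```

With this input the result has three elements: an empty string before the anchor (key 0), the anchor itself (key 1), and the trailing text with the bare URL (key 2).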

Why is my regex rejecting apostrophes?

I'm writing a regex that should match everything like [[First example]] or [[I'm an example]].
Unfortunately, it doesn't match [[I'm an example]] because of the apostrophe.
Here it is:
preg_replace_callback('/\[\[([^?"`*%#\\\\:<>]+)\]\]/iU', ...)
Plain apostrophes (') are allowed, so I really don't understand why it doesn't work.
Any ideas?
EDIT: Here is what happens before I use this regex:
// This matches something [[[like this]]]
$contents = preg_replace_callback('/\[\[\[(.+)\]\]\]/isU', function ($matches) {
    return '<blockquote>'.$matches[1].'</blockquote>';
}, $contents);
// This matches something [[like that]], but it stops working with apostrophes/quotes
// once the first preg_replace_callback has done its job
$contents = preg_replace_callback('/\[\[([^?"`*%#\\\\:<>]+)\]\]/iU', ..., $contents);

Try this:
$string = '[[First example]]';
$pattern = '/\[\[(.*?)\]\]/';
preg_match($pattern, $string, $matches);
var_dump($matches);

You can use this regex:
\[\[.*?]]
PHP code:
$re = '/\[\[.*?]]/';
$str = "not match this but [[Match this example]] and not this";
preg_match_all($re, $str, $matches);
By the way, if you want to capture the content within the brackets, you have to use a capturing group:
\[\[(.*?)]]
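A small sketch of the capturing-group variant, using the two sample strings from the question joined into one made-up text (the variable names are illustrative). It suggests that the lazy pattern itself has no trouble with a plain apostrophe:

```php
<?php
$str = "[[First example]] and [[I'm an example]]";

// Group 1 captures the text between the double brackets.
preg_match_all('/\[\[(.*?)]]/', $str, $matches);

print_r($matches[1]);
// Array
// (
//     [0] => First example
//     [1] => I'm an example
// )
```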

preg_replace everything but # sign

I've searched for an example of this, but can't seem to find one.
I'm looking to remove everything from a string except the #texthere part:
$Input = "this is #cool isn't it?";
$Output = "#cool";
I can remove the #cool using preg_replace("/#(\w+)/", "", $Input); but I can't figure out how to do the opposite.

You could match #\w+ and use the match in place of the original string. Or, if you need to use preg_replace, you can replace everything with the first capture group:
$output = preg_replace('/.*(#\w+).*/', '\1', $input);

Solution using preg_match (I assume this will perform better):
$matches = array();
preg_match('/#\w+/', $input, $matches);
// Fall back to an empty string when there is no hashtag
$output = $matches[0] ?? '';
Neither pattern above addresses how to handle inputs that match multiple times, such as "this is #cool and #awesome, right?"
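If the input can contain several hashtags, preg_match_all is one way to collect all of them. A minimal sketch, assuming the tags should be joined with a single space (that output format is an assumption, not something the question specifies):

```php
<?php
$input = 'this is #cool and #awesome, right?';

// Collect every hashtag; $matches[0] holds the full matches.
preg_match_all('/#\w+/', $input, $matches);

// Join them with spaces (assumed output format).
$output = implode(' ', $matches[0]);

echo $output, PHP_EOL; // #cool #awesome
```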

Can't get PHP Regex working

I'm trying to use PHP regular expressions. I've tried this code:
$regex = "c:(.+),";
$input = "otherStuff094322f98c:THIS,OtherStuffHeree129j12dls";
$match = Array();
preg_match_all($regex, $input, $match);
It should return the substring THIS ("c" and ":" followed by any character combination followed by ",") from $input, but it returns an empty array. What am I doing wrong?

The pattern needs delimiters (for example, slashes) for the regex to work.
Also, .+ is greedy: it runs up to the last comma in the input, which may capture more than you want. Use .+? or [^,]+ instead:
$regex = "/c:(.+?),/";
or
$regex = "/c:([^,]+),/";
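The difference between the greedy and the non-greedy pattern only shows up when the input contains more than one comma. A short sketch, with a made-up input that has two commas (the variable names are illustrative):

```php
<?php
$input = 'xc:THIS,OTHER,tail';

preg_match('/c:(.+),/', $input, $greedy);     // greedy: runs to the last comma
preg_match('/c:(.+?),/', $input, $lazy);      // lazy: stops at the first comma
preg_match('/c:([^,]+),/', $input, $negated); // negated class: cannot cross a comma

echo $greedy[1], PHP_EOL;  // THIS,OTHER
echo $lazy[1], PHP_EOL;    // THIS
echo $negated[1], PHP_EOL; // THIS
```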

PHP preg_replace() backreferences used as arguments of another function

I am trying to extract information from tags using a regex, and then return a result based on various parts of the tag.
preg_replace('/<(example )?(example2)+ />/', analyze(array($0, $1, $2)), $src);
So I'm grabbing parts and passing it to the analyze() function. Once there, I want to do work based on the parts themselves:
function analyze($matches) {
if ($matches[0] == '<example example2 />')
return 'something_awesome';
else if ($matches[1] == 'example')
return 'ftw';
}
etc. But once I get to the analyze function, $matches[0] just equals the string '$0'. Instead, I need $matches[0] to refer to the backreference from the preg_replace() call. How can I do this?
Thanks.
EDIT: I just saw the preg_replace_callback() function. Perhaps this is what I am looking for...

You can't use preg_replace like that: the replacement argument is evaluated before preg_replace runs, so analyze() never sees the actual matches. You probably want preg_replace_callback:
$regex = '/<(example )?(example2)+ \/>/';
// For every match, analyze() receives the array of matches ($matches[0], $matches[1], ...)
// and its return value is used as the replacement
$src = preg_replace_callback($regex, 'analyze', $src);
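A minimal runnable sketch of the preg_replace_callback approach. The sample $src string is made up, and analyze() is adapted from the question with two changes that are assumptions on my part: the comparison uses 'example ' with a trailing space, because that is what the first group actually captures, and a fallback branch returns the match unchanged.

```php
<?php
function analyze(array $matches): string
{
    if ($matches[0] === '<example example2 />') {
        return 'something_awesome';
    }
    // Group 1 captures "example " including the trailing space.
    if ($matches[1] === 'example ') {
        return 'ftw';
    }
    // Assumed fallback: leave anything else unchanged.
    return $matches[0];
}

// Hypothetical input with one tag of each shape.
$src = 'a <example example2 /> b <example2 /> c';
$result = preg_replace_callback('/<(example )?(example2)+ \/>/', 'analyze', $src);

echo $result, PHP_EOL;
// a something_awesome b <example2 /> c
```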
