PHP regex replace with different action for each result - php

In PHP I'm trying to wrap every iframe tag in a paragraph tag.
In other words, I'm trying to surround each iframe tag with a <p> tag that carries a random number.
Everything works fine except when $record contains more than one iframe tag; in that case every <p> tag gets the same number.
Here is my code:
$x = rand(1, 99);
$replacement = '<p' . $x . '>$1</p' . $x . '><br>';
$record = preg_replace("/(<iframe.*<\/iframe>)/U", $replacement, $record);
I want to give the <p> tag a unique number for each iframe tag.

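// Sample input for illustration only; substitute your own $record.
$s = <<<'HTML'
<iframe src="a.html"></iframe> text <iframe src="b.html"></iframe>
HTML;
// No /U flag here: combined with .*? it would make the quantifier greedy again.
$re = "/(<iframe[^>]*>.*?<\/iframe>)/";
echo preg_replace_callback($re, function ($a) {
    $x = rand(1, 99); // drawn once per match, so each iframe gets its own number
    return '<p' . $x . '>'.$a[1].'</p' . $x . '><br>';
}, $s);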
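Since rand() can return the same value twice, the numbers above are different per match but not guaranteed unique. A minimal sketch of a counter-based variant follows; the function name wrap_iframes is hypothetical and not part of the answer above.

```php
<?php
// Hypothetical helper: wraps each iframe in a <pN> tag with a distinct N.
function wrap_iframes(string $html): string
{
    $n = 0; // incremented once per match, so every number is distinct
    return preg_replace_callback(
        '/<iframe[^>]*>.*?<\/iframe>/s',
        function (array $m) use (&$n): string {
            $n++;
            return "<p{$n}>{$m[0]}</p{$n}><br>";
        },
        $html
    );
}

echo wrap_iframes('<iframe src="a"></iframe> x <iframe src="b"></iframe>');
// prints: <p1><iframe src="a"></iframe></p1><br> x <p2><iframe src="b"></iframe></p2><br>
```

The counter is captured by reference so the closure can update it between matches.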


PHP strip_tags html validation and bracket check?

At the moment I use strip_tags($content, '<a>') to remove all HTML tags except the <a> tag.
Example 1: for "lorem ipsum dolor <sit amet....." it cuts everything after the "<".
Example 2: if the content starts with "<test lorem ipsum", I get only an empty string.
I tried to handle it with a regex, but the outcome is the same:
preg_replace('/<[^>]*>/', '', $content) returns the same result.
I need some way to strip the HTML while keeping legitimate uses of the "<" bracket inside the content.
If you want to clear every tag except plain <a> and </a>, you could just filter them, replace them, then clear the HTML and replace them back, like this:
$text = "<a> ahahahasjusjhcbzdeu <div>JEY ssjisuj</div>jn<p> here somehing else </p></a>";
$EndText = str_replace("<a>", "&ATL", $text);
$EndText = str_replace("</a>", "&ATR", $EndText);
$EndText = strip_tags($EndText);
$EndText = str_replace("&ATL", "<a>", $EndText);
$EndText = str_replace("&ATR", "</a>", $EndText);
echo htmlspecialchars($EndText);
But if the text contains an <a> tag with attributes, such as <a href=''>, that link would get deleted, too.
So you need to cut out the text between <a and > first (this can be done with explode, substr and str_replace), then do the same as in the solution above, and then paste it back in.
Code that does this:
$text = "<a>Here something</a><div>Again<a href=''>That's a better link</a> Here</div>";
$Texts = explode("<a", $text);
$Begin = strip_tags(array_shift($Texts)); // text before the first <a
$Middles = [];                            // attribute string of each <a ...>
foreach ($Texts as &$value) {
    $Middle = explode(">", $value)[0];
    array_push($Middles, $Middle);
    $Position = strpos($value, ">");
    $value = substr($value, $Position+1);
    $value = str_replace("</a>", "&htlENDA&", $value); // protect </a> from strip_tags
    $value = strip_tags($value);
}
unset($value); // drop the reference left over from the foreach
$EndText = $Begin;
for ($i = 0; $i < count($Texts); $i++) {
    $EndText = $EndText."<a".$Middles[$i].">".$Texts[$i];
}
$EndText = str_replace("&htlENDA&", "</a>", $EndText);
echo "<br><br>Ende: ".htmlspecialchars($EndText);
That would solve your problem, as it deletes every HTML tag except <a ... > and </a>.
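For the stray "<" part of the question, one possible approach (only a sketch) is to escape every "<" that does not start something tag-shaped before calling strip_tags. The helper name strip_tags_keep_brackets and the "tag-shaped" heuristic are assumptions made for this example, not part of the answer above.

```php
<?php
// Hypothetical helper. Heuristic: "<" starts a tag only when it is followed by
// a letter, "/" or "!" and a closing ">" appears before the next "<".
function strip_tags_keep_brackets(string $content, string $allowed = '<a>'): string
{
    // Every other "<" becomes &lt; so strip_tags no longer cuts the text after it.
    $escaped = preg_replace('/<(?![a-zA-Z\/!][^<>]*>)/', '&lt;', $content);
    return strip_tags($escaped, $allowed);
}

echo strip_tags_keep_brackets('lorem ipsum dolor <sit amet.....');
```

The result contains &lt; entities, which is usually what you want for HTML output. Text such as "x <b and c> d" still looks like a tag to this heuristic and would be stripped.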

php remove all attributes from a tag

Here is my code:
$content2= preg_replace("/<([a-z][a-z0-9]*)[^>]*?(\/?)>/i",'<$1$2>', $content1);
This code removes all attributes from all tags in my website, but what I want is to only remove attributes from the form tag. This is what I have tried:
$content2 = preg_replace("/<form([a-z][a-z0-9]*)[^>]*?(\/?)>/i",'<$1$2>', $content1);
$content2 = preg_replace("/<(form[a-z][a-z0-9]*)[^>]*?(\/?)>/i",'<$1$2>', $content1);
This should do it for you.
$content1 = '<form method="post">test</form><form>2</form><form action=\'test\' method="post" type="blah"><img><b>bold</b></form>';
$content2 = preg_replace("~<form\s+.*?>~i",'<form>', $content1);
echo $content2;
Explanation:
The \s+ requires whitespace after the opening form tag name. If it is there, we presume an attribute follows, so we use .*? to take everything up to the next >. We don't need capture groups because the only thing you want is an empty form element, right?
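A slightly stricter variant of the same idea, as a sketch: \b keeps the pattern from matching longer tag names, and [^>]* replaces .*? so the match cannot run past the first >. The function name strip_form_attributes is hypothetical.

```php
<?php
// Hypothetical helper: strips attributes from opening <form> tags only.
function strip_form_attributes(string $html): string
{
    // \b stops "<form" from matching longer tag names such as <formfoo>;
    // [^>]* consumes any attributes up to the closing ">".
    return preg_replace('~<form\b[^>]*>~i', '<form>', $html);
}

echo strip_form_attributes('<form action="test" method="post"><img src="a.png"></form>');
// prints: <form><img src="a.png"></form>
```

Like the pattern above, this assumes no attribute value contains a literal ">".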
Answer from a related question:
function stripArgumentFromTags( $htmlString ) {
    $regEx = '/([^<]*<\s*[a-z](?:[0-9]|[a-z]{0,9}))(?:(?:\s*[a-z\-]{2,14}\s*=\s*(?:"[^"]*"|\'[^\']*\'))*)(\s*\/?>[^<]*)/i'; // match any start tag
    $chunks = preg_split($regEx, $htmlString, -1, PREG_SPLIT_DELIM_CAPTURE);
    $chunkCount = count($chunks);
    $strippedString = '';
    for ($n = 1; $n < $chunkCount; $n++) {
        $strippedString .= $chunks[$n];
    }
    return $strippedString;
}
Then call it like this:
$strippedTag = stripArgumentFromTags($initialTag);

Multiple occurrences of delimiters within an HTML template

I am facing a problem that I can't get my head around, so I thought I would turn to the experts once again to shed some light.
I have an HTML template, and within the template I have delimiters like:
[has_image]<p>The image is <img src="" /></p>[/has_image]
These delimiters may occur multiple times within the template. This is what I am trying to achieve:
Find all occurrences of these delimiters and replace the content between them with an image source, or with an empty string if the image doesn't exist, while keeping the rest of the template intact.
Below is my code. It works for a single occurrence, but I am struggling to make it work for multiple occurrences.
function replace_text_template($template_body, $start_tag, $end_tag, $replacement = ''){
    $occurances = substr_count($template_body, $start_tag);
    $x = 1;
    while($x <= $occurances) {
        $start = strpos($template_body, $start_tag);
        $stop = strpos($template_body, $end_tag);
        $template_body = substr($template_body, 0, $start) . $start_tag . $replacement . substr($template_body, $stop);
        $x++;
    }
    return $template_body;
}
$template_body will have HTML code with delimiters
replace_text_template($template_body, "[has_image]", "[/has_image]");
Even if I remove the while loop, it still works only for a single delimiter.
I have managed to solve the problem. If anybody finds this useful please feel free to use the code. However, if anyone finds a better way please do share it.
function replace_text_template($template_body, $start_tag, $end_tag, $replacement = ''){
    $occurances = substr_count($template_body, $start_tag);
    $x = 1;
    while($x <= $occurances) {
        $start = strpos($template_body, $start_tag);
        $stop = strpos($template_body, $end_tag);
        $template_body = substr($template_body, 0, $start) . $start_tag . $replacement . substr($template_body, $stop);
        $template_body = str_replace($start_tag.''.$end_tag, '', $template_body); // replace the tags so on next loop the position will be correct
        $x++;
    }
    return $template_body;
}
function replace_text_template($template_body, $start_tag, $replacement = '') {
    return preg_replace_callback("~\[".preg_quote($start_tag, '~')."\].*?\[\/".preg_quote($start_tag, '~')."\]~i", function ($matches) use ($replacement) {
        if(preg_match('~<img.*?src="([^"]+)"~i', $matches[0], $match)) {
            // keep the image source only if the file is a readable image
            if (is_array(getimagesize($match[1]))) return $match[1];
        }
        return $replacement;
    }, $template_body);
}
$template_body = <<<EOL
[has_image]<p>The image is <img src="" /></p>[/has_image]
abc [has_image]<p>The image is <img src="" /></p>[/has_image]xyz
EOL;
echo replace_text_template($template_body, "has_image", "replacement");
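If the image check is not needed and every delimited block should simply be swapped for a fixed string, a shorter sketch is possible. The function name replace_between is hypothetical; the delimiters are passed in full, including the brackets.

```php
<?php
// Hypothetical helper: replaces every $start ... $end block with $replacement.
function replace_between(string $body, string $start, string $end, string $replacement = ''): string
{
    // preg_quote makes the delimiters literal; the s flag lets a block span lines.
    $pattern = '~' . preg_quote($start, '~') . '.*?' . preg_quote($end, '~') . '~s';
    // A callback returns the replacement as-is, so "$1" or "\1" inside it is
    // not interpreted as a backreference.
    return preg_replace_callback($pattern, function () use ($replacement) {
        return $replacement;
    }, $body);
}

echo replace_between('a [x]1[/x] b [x]2[/x] c', '[x]', '[/x]', 'R');
// prints: a R b R c
```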

How to wrap user mentions in a HTML link on PHP?

I'm working on a commenting web application and I want to parse user mentions (#user) as links. Here is what I have so far:
$text = "#user is not #user1 but #user3 is #user4";
$pattern = "/\#(\w+)/";
preg_match_all($pattern, $text, $matches); // $matches[1] holds the user names
$sql = "SELECT *
FROM users
WHERE username IN ('" .implode("','",$matches[1]). "')";
$users = $this->getQuery($sql);
foreach($users as $i=>$u){
    $text = str_replace("#{$u['username']}",
    "<a href='#' class='ct-userLink' rel='{$u['user_id']}'>#{$u['username']}</a> ", $text);
}
echo $text;
The problem is that the user links overlap:
<a rel="11327" class="ct-userLink" href="#">
<a rel="21327" class="ct-userLink" href="#">#user</a>1
</a>
How can I avoid links overlapping?
Answer Update
Thanks to the accepted answer, this is what my new foreach loop looks like:
foreach($users as $i=>$u){
    $text = preg_replace("/#".$u['username']."\b/",
    "<a href='#' title='{$u['user_id']}'>#{$u['username']}</a> ", $text);
}
The problem seems to be that some usernames can contain other usernames. So you replace user1 properly with <a>user1</a>. Then user matches inside it and the result is <a><a>user</a>1</a>. My suggestion is to change your string replace to a regex with a word boundary, \b, that is required after the username.
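A small sketch of the difference, using bracket markers instead of links to keep the output readable. The variable names are illustrative only.

```php
<?php
$text = '#user is not #user1';

// Plain string replace also hits the "#user" prefix of "#user1".
$plain = str_replace('#user', '[#user]', $text);

// With \b the match must end at a word boundary, so "#user1" is left alone.
// preg_quote guards against regex metacharacters in the user name.
$bounded = preg_replace('/#' . preg_quote('user', '/') . '\b/', '[#user]', $text);

echo $plain, "\n";   // prints: [#user] is not [#user]1
echo $bounded, "\n"; // prints: [#user] is not #user1
```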
The Twitter widget has JavaScript code to do this. I ported it to PHP in my WordPress plugin. Here's the relevant part:
function format_tweet($tweet) {
    // add #reply links
    $tweet_text = preg_replace("/\B[#@]([a-zA-Z0-9_]{1,20})/",
        "#<a class='atreply' href='$1'>$1</a>",
        $tweet);
    // make other links clickable
    $matches = array();
    $link_info = preg_match_all("/\b(((https*\:\/\/)|www\.)[^\"\']+?)(([!?,.\)]+)?(\s|$))/",
        $tweet_text, $matches, PREG_SET_ORDER);
    if ($link_info) {
        foreach ($matches as $match) {
            // prepend a scheme when the link starts with www.
            $http = preg_match("/w/", $match[2]) ? 'http://' : '';
            $tweet_text = str_replace($match[0],
                "<a href='" . $http . $match[1] . "'>" . $match[1] . "</a>" . $match[4],
                $tweet_text);
        }
    }
    return $tweet_text;
}
Instead of parsing for '#user', parse for '#user ' (with a space at the end) or ' #user ' to avoid wrongly parsing email addresses (maybe ' #user: ' should also be allowed). This will only work if usernames contain no whitespace.
You can go for a custom string-replace function which stops at the first replacement. Something like:
function str_replace_once($needle , $replace , $haystack){
    $pos = strpos($haystack, $needle);
    if ($pos === false) {
        // Nothing found
        return $haystack;
    }
    return substr_replace($haystack, $replace, $pos, strlen($needle));
}
And use it like:
foreach($users as $i=>$u){
    $text = str_replace_once("#{$u['username']}",
    "<a href='#' class='ct-userLink' rel='{$u['user_id']}'>#{$u['username']}</a> ", $text);
}
You shouldn't replace one user mention at a time, but all of them at once. You could use preg_split to do that:
// split text at mention while retaining user name
$parts = preg_split("/#(\w+)/", $text, -1, PREG_SPLIT_DELIM_CAPTURE);
$n = count($parts);
// $n is always an odd number; 1 means no match found
if ($n > 1) {
    // collect user names
    $users = array();
    for ($i=1; $i<$n; $i+=2) {
        $users[$parts[$i]] = '';
    }
    // get corresponding user information
    $sql = "SELECT *
    FROM users
    WHERE username IN ('" .implode("','", array_keys($users)). "')";
    $users = array();
    foreach ($this->getQuery($sql) as $user) {
        $users[$user['username']] = $user;
    }
    // replace mentions
    for ($i=1; $i<$n; $i+=2) {
        if (!isset($users[$parts[$i]])) {
            // unknown user: restore the plain mention
            $parts[$i] = '#' . $parts[$i];
            continue;
        }
        $u = $users[$parts[$i]];
        $parts[$i] = "<a href='#' class='ct-userLink' rel='{$u['user_id']}'>#{$u['username']}</a>";
    }
    // put everything back together
    $text = implode('', $parts);
}
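The same "all at once" idea can also be sketched with preg_replace_callback. The $userIds array below is a hypothetical stand-in for the result of the users query, and link_mentions is an illustrative name.

```php
<?php
// Hypothetical lookup table standing in for the rows returned by the query.
$userIds = ['user' => 11327, 'user1' => 21327];

function link_mentions(string $text, array $userIds): string
{
    // One pass over the text: each "#name" is matched in full, so "#user"
    // is never replaced inside "#user1".
    return preg_replace_callback('/#(\w+)/', function (array $m) use ($userIds): string {
        $name = $m[1];
        if (!isset($userIds[$name])) {
            return $m[0]; // unknown user: leave the mention untouched
        }
        return "<a href='#' class='ct-userLink' rel='{$userIds[$name]}'>#{$name}</a>";
    }, $text);
}

echo link_mentions('#user is not #user1 but #user3', $userIds);
```

Because the replacement text is never scanned again, links cannot end up nested.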
I like dnl's solution of parsing ' #user', but maybe it is not suitable for you.
Anyway, did you try using the strip_tags function to remove the anchor tags? That way you have the string without the links, and you can parse it and build the links again.

Convert clickable anchor tags to plain text in html document

I am trying to match <a> tags within my content and replace them with the link text followed by the url in square brackets for a print-version.
The following example works if there is only the "href". If the <a> contains another attribute, it matches too much and doesn't return the desired result.
How can I match the URL and the link text and that's it?
Here is my code:
$content = 'This is a text link';
$result = preg_replace('/<a href="(http:\/\/[A-Za-z0-9\\.:\/]{1,})">([\\s\\S]*?)<\/a>/',
'<strong>\\2</strong> [\\1]', $content);
echo $result;
Desired result:
<strong>This is a text link </strong> []
You should be using DOM to parse HTML, not regular expressions...
Edit: Updated code to do simple regex parsing on the href attribute value.
Edit #2: Made the loop run in reverse so it can handle multiple replacements.
$content = '
<p>This is a text link</p>
I wont change
';
$dom = new DOMDocument();
$dom->loadHTML($content);
$anchors = $dom->getElementsByTagName('a');
$len = $anchors->length;
if ( $len > 0 ) {
    $i = $len-1;
    // walk backwards: replacing a node shrinks the live node list
    while ( $i > -1 ) {
        $anchor = $anchors->item( $i );
        if ( $anchor->hasAttribute('href') ) {
            $href = $anchor->getAttribute('href');
            $regex = '/^http/';
            if ( !preg_match ( $regex, $href ) ) {
                $text = $anchor->nodeValue;
                $textNode = $dom->createTextNode( $text );
                $strong = $dom->createElement('strong');
                $strong->appendChild( $textNode );
                $anchor->parentNode->replaceChild( $strong, $anchor );
            }
        }
        $i--;
    }
}
echo $dom->saveHTML();
You can make the match ungreedy using ?.
You should also take into account there may be attributes before the href attribute.
$result = preg_replace('/<a [^>]*?href="(http:\/\/[A-Za-z0-9\\.:\/]+?)">([\\s\\S]*?)<\/a>/',
'<strong>\\2</strong> [\\1]', $content);
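For comparison, a DOM-based sketch that produces the "link text followed by the URL in square brackets" form regardless of attribute order. The function name links_to_text and the example URL are assumptions made for this illustration. Note that saveHTML() returns a whole document, including the doctype and the html/body wrappers that loadHTML adds.

```php
<?php
// Hypothetical helper: rewrites every <a href="..."> as <strong>text</strong> [url].
function links_to_text(string $html): string
{
    $dom = new DOMDocument();
    $dom->loadHTML($html, LIBXML_NOERROR | LIBXML_NOWARNING);

    // Copy the live node list first, because replacing nodes mutates it.
    $anchors = iterator_to_array($dom->getElementsByTagName('a'));
    foreach ($anchors as $a) {
        if (!$a->hasAttribute('href')) {
            continue; // named anchors without a target are left alone
        }
        $strong = $dom->createElement('strong');
        $strong->appendChild($dom->createTextNode($a->textContent));

        // The fragment carries both the <strong> and the bracketed URL.
        $fragment = $dom->createDocumentFragment();
        $fragment->appendChild($strong);
        $fragment->appendChild($dom->createTextNode(' [' . $a->getAttribute('href') . ']'));

        $a->parentNode->replaceChild($fragment, $a);
    }
    return $dom->saveHTML();
}

echo links_to_text('<p><a class="x" href="http://www.example.com/">This is a text link</a></p>');
```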
