How could I replace some markup in this format:
[a href="/my_page" style="font-size: 13px"]click me[/a]
to
<a href="/my_page" style="font-size: 13px">click me</a>
using preg_replace()?
I will need to allow for more attributes as well.
$s = '[a href="/my_page" style="font-size: 13px"]click me[/a]';
$ret = preg_replace('~\[([^\[\]]+)\]([^\[\]]++)\[/([^\[\]]++)\]~', '<\1>\2</\3>', $s);
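If the closing tag should be required to match the opening one, a backreference can enforce that. This is only a sketch under assumptions: the helper name bracketsToTags is made up for illustration, tag names are assumed to consist of word characters, and nested brackets are not handled. Attributes are copied through untouched, so it should not be run on untrusted input as-is.

```php
<?php
// Hypothetical helper (the name is not from the answer above).
function bracketsToTags(string $s): string
{
    // \[(\w+)\b      opening bracket and tag name (group 1)
    // ([^\[\]]*)\]   any attributes up to the closing bracket (group 2)
    // ([^\[\]]*)     content without nested brackets (group 3)
    // \[/\1\]        closing tag that must repeat the name from group 1
    return preg_replace('~\[(\w+)\b([^\[\]]*)\]([^\[\]]*)\[/\1\]~', '<$1$2>$3</$1>', $s);
}

echo bracketsToTags('[a href="/my_page" style="font-size: 13px"]click me[/a]'), "\n";
// prints: <a href="/my_page" style="font-size: 13px">click me</a>

echo bracketsToTags('[b]bold[/i]'), "\n";
// prints: [b]bold[/i]  (mismatched tags are left alone)
```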
Related
I have a string like this:
[<span style="font-size: 12.1599998474121px; line-height: 15.8079996109009px;">heading </span>heading="h1"]Its a <span style="text-decoration: line-through;">subject</span>.[/<span style="font-size: 12.1599998474121px; line-height: 15.8079996109009px;">heading</span>]
I want to remove HTML tags which are inside of brackets using PHP preg_replace etc. Final string should be like this:
[heading heading="h1"]Its a <span style="text-decoration: line-through;">subject</span>.[/heading]
I searched a lot for a solution but had no success.
This should work for you:
Here I just apply strip_tags() to every bracketed part of your string and return the result.
echo $str = preg_replace_callback("/\[(.*?)\]/", function($m){
return strip_tags($m[0]);
}, $str);
You can use a callback with the following regular expression and utilize strip_tags() ...
$str = preg_replace_callback('~\[[^]]*]~',
function($m) {
return strip_tags($m[0]);
}, $str);
Depends really how much you want to remove.
Example:
Pattern: '<.*?>'
Result: [heading heading="h1"]Its a subject.[/heading]
But judging from your question you want to keep the HTML tags that are inside your heading. I don't understand: based on which rule exactly? Why is this an exception?
You can use a single regex to get what you want:
$re = "#][^\[\]]*(*SKIP)(*F)|<\/?[a-z].*?>#si";
$str = "[<span style=\"font-size: 12.1599998474121px; line-height: 15.8079996109009px;\">heading </span>heading=\"h1\"]Its a <span style=\"text-decoration: line-through;\">subject</span>.[/<span style=\"font-size: 12.1599998474121px; line-height: 15.8079996109009px;\">heading</span>]";
$result = preg_replace($re, '', $str);
echo $result;
Output of the sample code:
[heading heading="h1"]Its a <span style="text-decoration: line-through;">subject</span>.[/heading]
I want to remove strings like the one below from some HTML code
<span style="font-size: 0.8px; letter-spacing: -0.8px; color: #ecf6f6">3</span>
so I came up with this regex.
$pattern = "/<span style=\"font-size: \\d(\\.\\d)?px; letter-spacing: -\\d(\\.\\d)?px; color: #\\w{6}\">\\w\\w?</span>/um";
However, the regex doesn't work. Can someone point out what I did wrong? I'm new to PHP.
When I test with a simple regex it works, so the problem must be in the regex itself.
$str = $_POST["txtarea"];
$pattern = $_POST["regex"];
echo preg_replace($pattern, "", $str);
As much as I would advocate DOMDocument to do the job here, you would still need some regular expression down the line, so ...
The expression for the px numeric value can be simply [\d.-]+, since you're not trying to validate anything.
The contents of the span can be simplified to [^<]* (i.e. anything but an opening angle bracket):
$re = '/<span style="font-size: [\d.-]+px; letter-spacing: [\d.-]+px; color: #[0-9a-f]{3,6}">[^<]*<\/span>/';
echo preg_replace($re, '', $str);
Do not use regex for this problem. Use an HTML parser. Here is a solution in Python with BeautifulSoup, because I like this library for these tasks:
import re
from bs4 import BeautifulSoup  # BeautifulSoup 4

with open('Path/to/file', 'r') as content_file:
    content = content_file.read()

soup = BeautifulSoup(content, 'html.parser')
# Match the inline style of the spans that should be removed
style_re = re.compile(r"font-size: \d(\.\d)?px; letter-spacing: -\d(\.\d)?px; color: #\w{6}")
for span in soup.find_all('span', {'style': style_re}):
    span.extract()  # remove the span from the tree

with open('Path/to/file.modified', 'w') as output_file:
    output_file.write(str(soup))
You have a slash (/) in your closing tag (</span>), and the slash is also your pattern delimiter.
You need to escape it (\/) or use a delimiter other than the slash.
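A minimal sketch of the second option, using ~ as the delimiter so the slash in </span> needs no escaping. The pattern is otherwise the one from the question; the surrounding "before"/"after" text in the sample string is made up for the demo.

```php
<?php
// Same pattern as in the question, but delimited with ~ instead of /.
$pattern = '~<span style="font-size: \d(\.\d)?px; letter-spacing: -\d(\.\d)?px; color: #\w{6}">\w\w?</span>~u';

// Sample input: the span from the question wrapped in some made-up text.
$str = 'before<span style="font-size: 0.8px; letter-spacing: -0.8px; color: #ecf6f6">3</span>after';

$clean = preg_replace($pattern, '', $str);
echo $clean; // prints: beforeafter
```

Single quotes around the pattern also avoid the doubled backslashes that the double-quoted version needed.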
I'm new to PHP so I don't know whether this is possible.
I need to add brackets to different timestamps so that this:
<span class="time">2:26</span>
<span class="time">2:51</span>
<span class="time">3:37</span>
<span class="time">1:19</span>
becomes this:
<span class="time">(2:26)</span>
<span class="time">(2:51)</span>
<span class="time">(3:37)</span>
<span class="time">(1:19)</span>
EDIT
The above HTML is generated using a simple DOM parser to grab info from a webpage.
If this is part of a larger HTML string or if the syntax might differ, it's a better idea to use a DOM parser.
However, if that isn't the case you can do this:
$string = str_replace('<span class="time">', '<span class="time">(', $string);
$string = str_replace('</span>', ')</span>', $string);
Or you can use regex:
$string = preg_replace('/<span class=\"time\">(\d+\:\d+)<\/span>/', '<span class="time">($1)</span>', $string);
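For the DOM parser route mentioned above, something along these lines might work. It is a sketch that assumes PHP's bundled DOM extension and that the spans carry exactly class="time"; the function name bracketTimes is invented for illustration.

```php
<?php
// Hypothetical helper: wraps the text of every <span class="time"> in parentheses.
function bracketTimes(string $html): string
{
    $doc = new DOMDocument();
    // @ suppresses parser warnings for imperfect real-world markup.
    @$doc->loadHTML($html);

    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//span[@class="time"]') as $span) {
        // Replace the span's content with the bracketed text.
        $span->nodeValue = '(' . $span->nodeValue . ')';
    }

    return $doc->saveHTML();
}

echo bracketTimes('<span class="time">2:26</span> <span class="other">x</span>');
```

Note that saveHTML() returns a complete document, so for a fragment like this the output is wrapped in doctype, html and body elements; only the spans with class "time" are changed.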
Assuming your timestamp string is in the $timestamp variable, you could use concatenation, i.e.:
$output = '(' . $timestamp . ')';
Or as you mentioned, using a regular expression to validate the string before adding brackets:
$output = preg_replace("/(\d+):(\d+)/", "($0)", $timestamp);
For instance I have a string:
$string = '<div class="ImageRight" style="width:150px">';
which I want to transform into this:
$string = '<div class="ImageRight">';
I want to remove the portion style="width:150px" with preg_replace(), where the size 150 can vary, so the width can be 500px etc. as well.
Also, the last part of the class name varies as well, so the class can be ImageRight, ImageLeft, ImageTop etc.
So, how can I remove the style attribute completely from a string with the above-mentioned structure, where the only things that vary are the last portion of the class name and the width value?
EDIT: The ACTUAL string I have is an entire html document and I don't want to remove the style attribute from the entire html, only from the tags which match the string I've shown above.
I think this is what you're after...
$modifiedHtml = preg_replace('/<(div class="Image[^"]+") style="[^"]+">/i', '<$1>', $html);
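To see that this only touches the Image* divs inside a larger document, here is a small check. The sample markup and the function name stripImageDivStyle are made up for the demo; the pattern is the one above.

```php
<?php
// Hypothetical wrapper around the pattern from this answer.
function stripImageDivStyle(string $html): string
{
    return preg_replace('/<(div class="Image[^"]+") style="[^"]+">/i', '<$1>', $html);
}

// Made-up sample: one matching div, one styled <p>, one unrelated div.
$html = '<div class="ImageRight" style="width:150px"><p style="color:red">x</p></div>'
      . '<div class="Other" style="width:10px"></div>';

echo stripImageDivStyle($html);
// prints: <div class="ImageRight"><p style="color:red">x</p></div><div class="Other" style="width:10px"></div>
```

It relies on the attributes appearing in exactly this order (class, then style) with nothing else in the tag.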
Remove completely.
$string = preg_replace("/style=\"width:150px\"/", "", $string);
Replace:
$string = preg_replace("/style=\"width:150px\"/", "style=\"width:500px\"", $string);
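You can do it in two steps with
$place = 'Left';
$size = 500;
// replace the class suffix (e.g. "Right") that follows "Image"
$string = preg_replace('/(?<=class="image)\w+(?=")/i', $place, $string);
// replace the numeric width value
$string = preg_replace('/(?<=style="width:)[0-9]+(?=px")/', (string) $size, $string);
Note: (?<=...) is called a lookbehind and (?=...) a lookahead.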
How about:
$string = preg_replace('/(div class="Image.+?") style="width:.+?"/', "$1", $string);
Simple:
$string = preg_replace('/<div class="Image(.*?)".*?>/i', '<div class="Image$1">', $string);
What is the easiest way of applying highlighting of some text excluding text within OCCASIONAL tags "<...>"?
CLARIFICATION: I want the existing tags PRESERVED!
$t =
preg_replace(
"/(markdown)/",
"<strong>$1</strong>",
"This is essentially plain text apart from a few html tags generated with some
simplified markdown rules: <a href=markdown.html>[see here]</a>");
Which should display as:
"This is essentially plain text apart from a few html tags generated with some simplified markdown rules: see here"
... BUT NOT MESS UP the text inside the anchor tag (i.e. <a href=markdown.html> ).
I've heard the arguments of not parsing html with regular expressions, but here we're talking essentially about plain text except for minimal parsing of some markdown code.
Actually, this seems to work ok:
<?php
$item="markdown";
$t="This is essentially plain text apart from a few html tags generated
with some simplified markdown rules: <a href=markdown.html>[see here]</a>";
//_____1. apply emphasis_____
$t = preg_replace("|($item)|","<strong>$1</strong>",$t);
// "This is essentially plain text apart from a few html tags generated
// with some simplified <strong>markdown</strong> rules: <a href=
// <strong>markdown</strong>.html>[see here]</a>"
//_____2. remove emphasis if WITHIN opening and closing tag____
$t = preg_replace("|(<[^>]+?)(<strong>($item)</strong>)([^<]+?>)|","$1$3$4",$t);
// this preserves the text before ($1), after ($4)
// and inside <strong>..</strong> ($2), but without the tags ($3)
// "This is essentially plain text apart from a few html tags generated
// with some simplified <strong>markdown</strong> rules: <a href=markdown.html>
// [see here]</a>"
?>
A string like $item="odd|string" would cause some problems, but I won't be using that kind of string anyway... (probably needs htmlentities(...) or the like...)
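For a term like "odd|string", escaping it with preg_quote() before building the patterns should be enough (rather than htmlentities()). A sketch of the two steps above with that change; the function name highlightTerm and the sample text are made up, and ~ is used as the delimiter instead of | so that the term's own pipe cannot clash with it.

```php
<?php
// Hypothetical helper combining the two steps above with an escaped term.
function highlightTerm(string $t, string $item): string
{
    // Escape regex metacharacters in the term (and the ~ delimiter).
    $q = preg_quote($item, '~');

    // 1. apply emphasis everywhere
    $t = preg_replace('~(' . $q . ')~', '<strong>$1</strong>', $t);

    // 2. remove emphasis again where it ended up inside a tag
    return preg_replace('~(<[^>]+?)<strong>(' . $q . ')</strong>([^<]+?>)~', '$1$2$3', $t);
}

echo highlightTerm('An odd|string here: <a href="odd|string.html">[see here]</a>', 'odd|string');
// prints: An <strong>odd|string</strong> here: <a href="odd|string.html">[see here]</a>
```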
You could split the string into tag/no-tag parts using preg_split:
$parts = preg_split('/(<(?:[^"\'>]|"[^"<]*"|\'[^\'<]*\')*>)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
Then you can iterate over the parts while skipping every second one (the tag parts, which sit at the odd indices) and apply your replacement to the text parts only:
for ($i=0, $n=count($parts); $i<$n; $i+=2) {
$parts[$i] = preg_replace("/(markdown)/", "<strong>$1</strong>", $parts[$i]);
}
At the end put everything back together with implode:
$str = implode('', $parts);
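Put together, the three steps might look like this. It is just a sketch: the function name highlightOutsideTags and the sample string are made up, and the search term is passed through preg_quote() as an extra precaution that is not part of the snippets above.

```php
<?php
// Hypothetical helper assembling the split / replace / implode steps.
function highlightOutsideTags(string $str, string $term): string
{
    // Split into text and tag parts; the captured tags land at odd indices.
    $parts = preg_split('/(<(?:[^"\'>]|"[^"<]*"|\'[^\'<]*\')*>)/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);

    $pattern = '/(' . preg_quote($term, '/') . ')/';
    // Only touch the even indices, i.e. the text between tags.
    for ($i = 0, $n = count($parts); $i < $n; $i += 2) {
        $parts[$i] = preg_replace($pattern, '<strong>$1</strong>', $parts[$i]);
    }

    return implode('', $parts);
}

echo highlightOutsideTags('Simplified markdown rules: <a href=markdown.html>[see here]</a>', 'markdown');
// prints: Simplified <strong>markdown</strong> rules: <a href=markdown.html>[see here]</a>
```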
But note that this is really not the best solution. You would be better off using a proper HTML parser like PHP’s DOM library. See for example these related questions:
Highlight keywords in a paragraph
Regex / DOMDocument - match and replace text not in a link
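First prepend a dummy tag so that every piece of text follows a tag, then replace only in text that comes after a ">":
$t = preg_replace("|(>[^<]*)(markdown)|i", '$1<strong>$2</strong>', "<null>$t");
Then delete the dummy tag:
$t = preg_replace("|<null>|", '', $t);
Note that this highlights only one occurrence per run of text between tags.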
You could split your string into an array at every '<' or '>' using preg_split(), then loop through that array and replace only in the entries that are not inside a tag. Afterwards you combine the array into a string again using implode().
This regex should strip all HTML opening and closing tags: /(<.*?>)+/
You can use it with preg_replace like this:
$test = "Hello <strong>World!</strong>";
$regex = "/(<.*?>)+/";
$result = preg_replace($regex,"",$test);
Actually, this is not very efficient, but it worked for me:
$your_string = '...';
$search = 'markdown';
$left = '<strong>';
$right = '</strong>';
$left_Q = preg_quote($left, '#');
$right_Q = preg_quote($right, '#');
$search_Q = preg_quote($search, '#');
while(preg_match('#(>|^)[^<]*(?<!'.$left_Q.')'.$search_Q.'(?!'.$right_Q.')[^>]*(<|$)#isU', $your_string))
$your_string = preg_replace('#(^[^<]*|>[^<]*)(?<!'.$left_Q.')('.$search_Q.')(?!'.$right_Q.')([^>]*<|[^>]*$)#isU', '${1}'.$left.'${2}'.$right.'${3}', $your_string);
echo $your_string;