I have an HTML document from which I would like to remove a block (starting with the date 20170908 and ending with the next script tag), but preg_replace doesn't match anything that lies past a newline. If I manually erase the newlines, the regular expression works, but I'd like to strip them programmatically. A part of the HTML document:
<script type="text/javascript" src="iam.js"></script><script
type="text/javascript"src="/search.js"></script><script
type="text/javascript" > /* 20170908 */ function uabpd4(){
//some function
}
</script>
In PHP I do the following:
$content = trim(preg_replace('/\s+/', ' ', $content)); // just trying to get rid of newlines, but none of this works
$content = preg_replace( "/\r|\n/is", "", $content);
$content = str_replace(array("\n", "\t", "\r"), '', $content);
$content = preg_replace("/\/\* $date(.*?)(((?!script>).)uabpd4(.*?script>))/is", "WORKS </script>", $content);
Thank you.
If I understand you correctly, you want to remove the JavaScript part with the date in it.
This is one method: match the part you want to remove with preg_match, then use str_replace to remove it.
$re = '/.*script type.*<script.*type.*?>(.*?uabpd4.*})/s';
$str = '<script type="text/javascript" src="iam.js"></script><script
type="text/javascript"src="/search.js"></script><script
type="text/javascript" > /* 20170908 */ function uabpd4(){
//some function
}
</script>';
preg_match($re, $str, $m);
echo str_replace($m[1], "", $str);
https://3v4l.org/ktcXo
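For reference, the same removal can probably be done in a single preg_replace call. This is only a sketch: the function name remove_dated_script is made up for illustration, and it assumes the date comment is the first thing inside the script tag. The s modifier is what lets . match across newlines, so the newlines do not need to be stripped beforehand.

```php
<?php
// Hypothetical helper (not from the question): removes the first <script>
// block whose body starts with a /* DATE */ comment.
function remove_dated_script(string $html, string $date): string
{
    $quoted = preg_quote($date, '~');
    // [^>]* keeps the match inside one opening tag; the "s" modifier lets
    // ".*?" run across newlines up to the nearest </script>.
    $pattern = '~<script\b[^>]*>\s*/\*\s*' . $quoted . '\s*\*/.*?</script>~s';
    // Fall back to the input if preg_replace reports an error (returns null).
    return preg_replace($pattern, '', $html, 1) ?? $html;
}

$html = '<script type="text/javascript" src="iam.js"></script><script
type="text/javascript"src="/search.js"></script><script
type="text/javascript" > /* 20170908 */ function uabpd4(){
//some function
}
</script>';

$cleaned = remove_dated_script($html, '20170908');
```

The other two script tags are left untouched because [^>]* cannot cross the end of an opening tag, so the match can only begin at the tag that is directly followed by the date comment.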
Related
I'm trying to create a simple PHP find-and-replace system that looks at all of the images in the HTML and adds a bit of code at the start and end of each image source. The image source looks something like this:
<img src="img/image-file.jpg">
and it should become this:
<img src="{{media url="wysiwyg/image-file.jpg"}}">
The Find
="img/image-file1.jpg"
="img/file-2.png"
="img/image3.jpg"
Replace With
="{{media url="wysiwyg/image-file1.jpg"}}"
="{{media url="wysiwyg/file-2.png"}}"
="{{media url="wysiwyg/image3.jpg"}}"
The solution is most likely simple, yet everything I have found in my research only works with one fixed string, not with a variety of unpredictable strings.
Current Progress
$oldMessage = "img/";
$deletedFormat = '{{media url="wysiwyg/';
$str = file_get_contents('Content Slots/Compilied Code.html');
$str = str_replace("$oldMessage", "$deletedFormat",$str);
The bit I'm stuck on is finding the " at the end of the source so that I can add the end of the required code: "}}"
I don't like to build regular expressions to parse HTML, but it seems that in this case, a regular expression will help you:
$reg = '/=["\']img\/([^"\']*)["\']/';
$src = ['="img/image-file1.jpg"', '="img/file-2.png"', '="img/image3.jpg"'];
foreach ($src as $s) {
$str = preg_replace($reg, '="{{media url="wysiwyg/$1"}}"', $s); // keep the outer quotes, as in the requested output
echo "$str\n";
}
Here you have an example on Ideone.
To make it work with your content:
$content = file_get_contents('Content Slots/Compilied Code.html');
$reg = '/=["\']img\/([^"\']*)["\']/';
$final = preg_replace($reg, '="{{media url="wysiwyg/$1"}}"', $content); // keep the outer quotes, as in the requested output
Here you have an example on Ideone.
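As a quick check of this approach, here is a minimal sketch run against an inline HTML string instead of the file. The function name rewrite_img_sources and the sample markup are assumptions for illustration; the replacement keeps the outer quotes so the result has the format asked for in the question.

```php
<?php
// Hypothetical wrapper around the regex from the answer above.
function rewrite_img_sources(string $html): string
{
    // Capture everything after img/ up to the closing quote (single or double).
    $reg = '/=["\']img\/([^"\']*)["\']/';
    return preg_replace($reg, '="{{media url="wysiwyg/$1"}}"', $html);
}

$sample = '<img src="img/image-file1.jpg"><img src=\'img/file-2.png\'>';
$result = rewrite_img_sources($sample);
```

Because preg_replace replaces every match by default, no loop is needed when the whole document is passed in at once.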
In my opinion, this is not the best way to do it. I would use an abstract template instead.
<?php
$content = file_get_contents('Content Slots/Compilied Code.html');
preg_match_all('/=\"img\/(.*?)\"/', $content, $matches);
$finds = $matches[1];
$abstract = '="{{media url="wysiwyg/{filename}"}}"';
$concretes = [];
foreach ($finds as $find) {
$concretes[] = str_replace("{filename}", $find, $abstract);
}
// $concretes now holds every match, formatted properly...
Edit:
To return full html use this:
<?php
$content = file_get_contents('Content Slots/Compilied Code.html');
preg_match_all('/=\"img\/(.*?)\"/', $content, $matches);
$finds = $matches[1];
$abstract = '="{{media url="wysiwyg/{filename}"}}"';
foreach ($finds as $find) {
// Replace one match per iteration; the lazy (.*?) stops at the first closing quote.
$content = preg_replace('/=\"img\/(.*?)\"/', str_replace("{filename}", $find, $abstract), $content, 1);
}
echo $content;
I'm trying to get the content that is inside the JavaScript script tag.
<script type="text/javascript"> [CONTENT HERE] </script>
Currently I have something like this:
$start = preg_quote('<script type="text/javascript">', '/');
$end = preg_quote('</script>', '/');
preg_match('/ '.$start. '(.*?)' .$end.' /', $test, $matches);
But when I var_dump it, it's empty.
Try the following instead:
$test = '<script type="text/javascript"> [CONTENT HERE] </script>';
$start = preg_quote('<script type="text/javascript">', '/');
$end = preg_quote('</script>', '/');
preg_match("/$start(.*?)$end/", $test, $matches);
var_dump($matches);
Output:
array (size=2)
0 => string '<script type="text/javascript"> [CONTENT HERE] </script>' (length=56)
1 => string ' [CONTENT HERE] ' (length=16)
The problem is the spaces after the opening / and before the closing / in the preg_match pattern.
Your regular expression requires there to be a space before the script start tag:
'/ '
The code works fine when I add that space to the data:
<?php
$test = ' <script type="text/javascript"> [CONTENT HERE] </script> ';
$matches = [];
$start = preg_quote('<script type="text/javascript">', '/');
$end = preg_quote('</script>', '/');
preg_match('/ '.$start. '(.*?)' .$end.' /', $test, $matches);
print_r($matches);
?>
Changing '/ ' to '/' also works.
… but that said, you should avoid regular expressions when processing HTML and use a real parser instead.
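A minimal sketch of the parser route, assuming PHP's bundled DOM extension is available. The function name script_contents is hypothetical; DOMDocument, loadHTML, and getElementsByTagName are the standard DOM API.

```php
<?php
// Sketch: read the contents of every <script> element with the DOM extension.
function script_contents(string $html): array
{
    // Collect libxml warnings internally instead of emitting them on sloppy HTML.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $contents = [];
    foreach ($doc->getElementsByTagName('script') as $script) {
        $contents[] = $script->textContent;
    }
    return $contents;
}

$test = '<script type="text/javascript"> [CONTENT HERE] </script>';
$found = script_contents($test);
```

Unlike the regular expression, this does not depend on the exact spelling of the opening tag, so extra attributes or different quoting should not matter.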
An alternative approach would be to use strip_tags instead...
$txt='<script type="text/javascript"> [CONTENT HERE] </script>';
echo strip_tags($txt);
Outputs
[CONTENT HERE]
I have a problem with a function in PHP. I want to convert every "\n" into a plain space. I've tried this, but it doesn't work:
function clean($text) {
if ($text === null) return null;
if (strstr($text, "\xa7") || strstr($text, "&")) {
$text = preg_replace("/(?i)(\x{00a7}|&)[0-9A-FK-OR]/u", "", $text);
}
$text = htmlspecialchars($text, ENT_QUOTES, "UTF-8");
if (strstr($text, "\n")) {
$text = preg_replace("\n", "", $text);
}
return $text;
}
This is what I want to remove:
The site: click here
If you literally have "\n" in your text, which appears to be the case from your screenshots, then do the following:
$text = str_replace("\\n", '', $text);
Inside a double-quoted PHP string, \n is an escape sequence that produces a newline character, so we need to escape the backslash itself (\\n) in order to match the literal two-character text "\n".
preg_replace() seems to work better this way:
$text = preg_replace('/\n/',"",$text);
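The pattern needs delimiters (the slashes here), which the original call was missing. The single quotes also stop PHP from interpreting \n itself, so the escape reaches the regex engine unchanged.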
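The difference between a real newline and the literal text \n is easy to mix up, so here is a small sketch of both cases. The variable names and sample strings are made up for illustration, and each replacement uses a space, since the question asks for a space rather than an empty string.

```php
<?php
// "\n" (double quotes) is one newline character; '\n' and "\\n" are the two
// characters backslash + n.
$real    = "line one\nline two";   // contains an actual newline
$literal = 'line one\nline two';   // contains the text \n

// Replace either form with a single space.
$fromReal    = str_replace("\n", ' ', $real);
$fromLiteral = str_replace('\n', ' ', $literal);

// The regex equivalent needs delimiters around the pattern.
$viaRegex = preg_replace('/\n/', ' ', $real);
```

If the input can contain both forms, the two str_replace calls can simply be applied one after the other.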
I have a function that compresses the PHP output, but it gives me problems with the inline JavaScript.
I found a page on a related topic, but the sample from there doesn't work: http://jeromejaglale.com/doc/php/codeigniter_compress_html
The problem occurs when I use the rule marked // shorten multiple whitespace sequences
If I remove this part from the array, the JavaScript works well.
Question:
Is there a way to achieve the desired result without using other libraries like Minify or Tidy?
The function:
/**
* [sanitize_output Compress the php output]
* @param [type] $buffer [ Buffer = ob_get_contents(); ]
* @return [type] [ string ]
*/
function sanitize_output($buffer)
{
$search = array(
'/\>[^\S ]+/s', //strip whitespaces after tags, except space
'/[^\S ]+\</s', //strip whitespaces before tags, except space
'/(\s)+/s', // shorten multiple whitespace sequences
'/<!--(.|\s)*?-->/', //strip HTML comments
'#(?://)?<!\[CDATA\[(.*?)(?://)?\]\]>#s', //leave CDATA alone
);
$replace = array(
'>',
'<',
'\\1',
'',
"//<![CDATA[\n".'\1'."\n//]]>",
);
$buffer = preg_replace($search, $replace, $buffer);
return $buffer;
} // End sanitize_output()
The way I use it:
<?php
if (substr_count($_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip')) ob_start("ob_gzhandler"); else ob_start();
// ... Code ... Several CSS ... JS ..
$output = ob_get_contents();
ob_end_clean();
$output = sanitize_output($output);
echo $output;
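One likely cause, stated here as an assumption since the failing JavaScript is not shown: the whitespace rule also rewrites newlines inside inline scripts, which can break code that relies on them (for example a // line comment, or statements without semicolons). A minimal sketch of a workaround is to collapse whitespace only outside script blocks. The function name collapse_outside_scripts is hypothetical, and the other rules from sanitize_output are left out for brevity.

```php
<?php
// Sketch: collapse whitespace everywhere except inside <script>...</script>.
function collapse_outside_scripts(string $html): string
{
    // With PREG_SPLIT_DELIM_CAPTURE and one capture group, the result
    // alternates: even indexes are ordinary markup, odd indexes are the
    // captured script blocks.
    $parts = preg_split(
        '#(<script\b[^>]*>.*?</script>)#is',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Ordinary markup: shorten whitespace runs to a single space.
            $parts[$i] = preg_replace('/\s+/', ' ', $part);
        }
        // Script blocks are kept byte for byte.
    }
    return implode('', $parts);
}

$page = "<p>a\n\n   b</p>\n<script>\n// c\nfoo();\n</script>";
$min  = collapse_outside_scripts($page);
```

The same split could be extended to pre and textarea elements, where whitespace is also significant.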
For some reason str_replace() does not work with /. I am creating a function to accept a unique linking style in input and text area forms for a blog CMS that I am making. For instance, [{http://brannondorsey.com}My Website] will be translated to <a href='http://brannondorsey.com'>My Website</a> when passed through make_link($string);. Here is my code:
function make_link($input){
$double = str_replace( '"', '"', $input);
$single = str_replace("'", "'", $double);
$bracket_erase = str_replace('[', "", $single);
$link_open = str_replace('{', '<a href="', $bracket_erase);
$link_close = str_replace("}", ">", $link_open);
$link_value = str_replace(']', "</a>", $link_close);
echo $link_value;
}
Everything works correctly except that ] is not replaced with </a>. If I remove the slash, it successfully replaces ] with <a>; however, as we all know, that does not properly close an anchor tag and therefore turns all HTML content between the { and the next closing anchor tag in my webpage into a link.
You might want to go down the regular expression route for this.
function make_link($link){
return preg_replace('/\[\{(.*?)\}(.*?)\]/', '<a href="$1">$2</a>', $link); // $1 = URL, $2 = link text
}
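Since this runs on form input for a CMS, it may be worth escaping the URL and the label before they go into the markup. The following is a sketch under that assumption, not part of the original answer; the function name make_link_escaped is made up.

```php
<?php
// Sketch: same [{url}label] syntax, but with both parts HTML-escaped.
function make_link_escaped(string $input): string
{
    return preg_replace_callback(
        '/\[\{(.*?)\}(.*?)\]/',
        function (array $m): string {
            // $m[1] is the URL between { and }, $m[2] the text up to ].
            $url   = htmlspecialchars($m[1], ENT_QUOTES, 'UTF-8');
            $label = htmlspecialchars($m[2], ENT_QUOTES, 'UTF-8');
            return '<a href="' . $url . '">' . $label . '</a>';
        },
        $input
    );
}

$html = make_link_escaped('[{http://brannondorsey.com}My Website]');
```

Text outside the [{...}...] markers is passed through unchanged, so any escaping of the surrounding content would still have to happen elsewhere.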
I personally suggest the preg_replace answer of Marcus Recck below rather than mine here.
It's there, just not visible, because the browser interprets the HTML instead of showing it. You can use the code below to see it, and/or use the browser's view-source option.
$link_close ="]";
$link_value = str_replace(']', "</a>", $link_close);
echo htmlspecialchars($link_value); // displays </a> in the browser
var_dump($link_value); // string(4) "</a>" (the tag is invisible in the browser, but the 4 tells you it's there)
The final version of the OP's function:
function make_link($input){
$double = str_replace( '"', '"', $input);
$single = str_replace("'", "'", $double);
$bracket_erase = str_replace('[', "", $single);
$link_open = str_replace('{', '<a href="', $bracket_erase);
$link_close = str_replace("}", '">', $link_open);
$link_value = str_replace(']', "</a>", $link_close);
return $link_value;
}
echo htmlspecialchars(make_link('[{http://brannondorsey.com}My Website]'));