Short and clear. I'm not good at regex because I never understood it. But when you look at this simple template, would you say it's possible to replace the %% + the content of it with php brackets + add function brackets to it like this:
Template:
<html>
<head>
<title>%getTitle%</title>
</head>
<body>
<div class='mainStream'>
%getLatestPostsByDate%
</div>
</body>
</html>
It should replace it to this:
<html>
<head>
<title><?php getTitle();?></title>
</head>
<body>
<div class='mainStream'>
<?php getLatestPostsByDate();?>
</div>
</body>
</html>
Is this possible? If yes, how? If anyone knows a good regex tutorial that explains it for beginners, I'd be very thankful.
Thanks in advance.
This could be a good start. Match everything between your custom tags (%...%) and replace it with PHP code.
https://regex101.com/r/aC2hJ3/1
regex: /%(\w*?)%/g. Check explanation of regex at the right hand side (top), and generated code... it should help.
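// Note: this overwrites template.php in place, so keep a backup of the original.
$template = file_get_contents('template.php');
// Single quotes keep PHP from treating \w or $1 as string escapes or variables.
$re = '/%(\w*?)%/';
$subst = '<?php $1(); ?>';
$result = preg_replace($re, $subst, $template);
file_put_contents('template.php', $result);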
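A slightly more defensive variant, as a rough sketch only: it works on a string instead of a file and only rewrites placeholders that appear in a whitelist. The `$allowed` list and the `%unknownTag%` placeholder are made up for illustration; they are not part of the original template.

```php
<?php
// Hypothetical whitelist: only these placeholder names become function calls.
$allowed = ['getTitle', 'getLatestPostsByDate'];

$template = "<title>%getTitle%</title>\n"
    . "<div>%getLatestPostsByDate%</div>\n"
    . "<p>%unknownTag%</p>";

$result = preg_replace_callback(
    '/%(\w+)%/',
    function (array $m) use ($allowed) {
        // Unknown placeholders are left untouched instead of becoming arbitrary calls.
        return in_array($m[1], $allowed, true)
            ? '<?php ' . $m[1] . '(); ?>'
            : $m[0];
    },
    $template
);

echo $result;
```

Using `\w+` instead of `\w*?` also means an empty `%%` is not turned into an empty function call.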
Try this
$html = str_replace ('%getLatestPostsByDate%', '<?php getLatestPostsByDate();?>', $html);
If, however you are looking for a generic solution then you have to use regex
I have a variable which consists of different HTML tags:
$html = '<h1>Title</h1><u>Header</u><h2>Sub Title</h2><p>content</p><u>Footer</u>'
I want to find all the u tags in the $html variable and give them the id of their contents.
It should return:
$html = '<h1>Title</h1><u id="header" >Header</u><h2>Sub Title</h2><p>content</p><u id="footer" >Footer</u>'
You can use preg_replace() if you want the quick way, or learn about DOMDocument if you want to do it the proper way.
$pattern = '~<u>([^<]*)</u>~Ui';
$replace = '<u id="$1">$1</u>';
$html = preg_replace($pattern, $replace, $html);
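For the DOMDocument route, something along these lines might work. It is only a sketch: the function name `addIdsToUnderlines` is invented here, and wrapping the fragment in `<html><body>` is an assumption so that the parser has a full document to work with.

```php
<?php
// Sketch: give every <u> element an id derived from its (lowercased) text.
function addIdsToUnderlines(string $fragment): string
{
    libxml_use_internal_errors(true); // keep parser warnings quiet
    $doc = new DOMDocument();
    $doc->loadHTML('<html><body>' . $fragment . '</body></html>');
    libxml_clear_errors();

    foreach ($doc->getElementsByTagName('u') as $u) {
        $u->setAttribute('id', strtolower(trim($u->textContent)));
    }

    // Serialize only the children of <body>, i.e. the original fragment.
    $out = '';
    foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$html = '<h1>Title</h1><u>Header</u><h2>Sub Title</h2><p>content</p><u>Footer</u>';
echo addIdsToUnderlines($html);
```

The serializer may add line breaks between block elements, so the output is not guaranteed to be byte-for-byte the input plus the ids.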
You can use preg_replace.
$html = preg_replace('~<u>([^<]+)</u>~e','"<u id=\"".strtolower("$1")."\" >$1</u>"', $html);
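The e modifier means "evaluate": it lets you put the strtolower() call into the replacement. Note that the e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so this only works on older versions; preg_replace_callback() is the usual replacement.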
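On current PHP versions the same idea can be written with preg_replace_callback(). A minimal sketch, using the sample string from the question:

```php
<?php
$html = '<h1>Title</h1><u>Header</u><h2>Sub Title</h2><p>content</p><u>Footer</u>';

$html = preg_replace_callback(
    '~<u>([^<]+)</u>~',
    function (array $m) {
        // $m[1] is the text between <u> and </u>
        return '<u id="' . strtolower($m[1]) . '">' . $m[1] . '</u>';
    },
    $html
);

echo $html;
```

If the text inside the tags can contain quotes or other special characters, it would need escaping (for example with htmlspecialchars()) before being used as an id.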
If it suits your needs, you can do this with jQuery; otherwise Forien's answer is good to go.
Here is how to do it in jQuery.
Your HTML:
<div id='specialString'>
<h1>Title</h1><u>Header</u><h2>Sub Title</h2><p>content</p><u>Footer</u>
</div>
Your JS:
<script type="text/javascript">
$('#specialString > u').each(function() {
$(this).attr('id', $(this).text());
});
</script>
I have code that searches for tags in an .html file, but when I run the script I get an undefined index error.
In my previous question I asked about searching for id tags, and I used the answer there as a reference. I enhanced the code and it runs, but it shows me an error: the code searches every id tag in the .html file.
CODE:
<?php
function getElementById($matches)
{
global $data;
return $matches[1].$matches[3].$matches[4].$data[$matches[3]].$matches[6];
}
$data['test'] = 'A';
$filename = 'test.html';
$html = file_exists($filename) ? file_get_contents($filename) : die('can\'t open the file');
$_HTML = preg_replace_callback('#(<([a-zA-Z]+)[^>]*id=")(.*?)("[^>]*>)([^<]*?)(</\\2>)#ism', 'getElementById', $html);
echo $_HTML;
?>
HTML:
<html>
<head>
<title>TEST</title>
</head>
<body>
<div id="test"></div>
<div id="test2"></div>
</body>
</html>
OUTPUT: PRINTSCREEN
Here's how you can implement a default:
$data3 = isset($data[$matches[3]]) ? $data[$matches[3]] : 'default';
return $matches[1].$matches[3].$matches[4].$data3.$matches[6];
Disclaimer: You shouldn't be doing all this regex stuff with HTML, blah, blah...
But if you insist
function getElementById($matches)
{
global $data;
return $matches[1]
.$matches[3]
.$matches[4]
.(isset($data[$matches[3]]) ? $data[$matches[3]] : 'DEFAULT_VALUE')
.$matches[6];
}
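On PHP 7.0 or later the same default can be written with the ?? operator, and a closure avoids the global. This is just a sketch: the inline `$html` string stands in for the contents of test.html, and `DEFAULT_VALUE` is the placeholder from the answer above.

```php
<?php
// Lookup table from the question: element id => text to inject.
$data = ['test' => 'A'];

// Stand-in for file_get_contents('test.html').
$html = '<div id="test"></div><div id="test2"></div>';

$result = preg_replace_callback(
    '#(<([a-zA-Z]+)[^>]*id=")(.*?)("[^>]*>)([^<]*?)(</\2>)#is',
    function (array $m) use ($data) {
        // ?? falls back to the right-hand side when the key is missing.
        $value = $data[$m[3]] ?? 'DEFAULT_VALUE';
        return $m[1] . $m[3] . $m[4] . $value . $m[6];
    },
    $html
);

echo $result;
```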
Why not use regex?
https://stackoverflow.com/a/1732454/156811
Using regular expressions to parse HTML: why not?
http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html
I'm sure you can find more if you do a quick search
Some alternatives:
http://us1.php.net/dom
http://simplehtmldom.sourceforge.net/
etc.
Regex is not really the correct way to do this. I suggest using XPath or something similar. You can also use something like this:
https://code.google.com/p/phpquery/
I don't understand what your problem is.
It's 100% normal that you got an undefined index error.
In your HTML, 2 IDs are defined: 'test' and 'test2'.
Your PHP code finds those two IDs and looks for an entry inside $data,
however, $data contains only an entry for 'test',
so PHP tells you that there is no entry for 'test2': Notice: Undefined index: test2
That's it :-)
I have the following html content:
<p>My name is way2project</p>
Now I want this text as <p>My name is way2project</p>
Is there any way to do this? Please help me thanks
I used preg_replace but in vain.
Thanks again
You can use the strip tags function
$string = '<p>My name is way2project</p>';
echo strip_tags($string,'<p>');
Note: the second parameter is the list of allowed tags that you want to keep.
This seems strange, but not knowing the complete scope of your issue and seeing that you want to do this in PHP, you can try:
$origstring = '<p>My name is way2project</p>';
$newstring = str_replace('way2project', 'way2project', $origstring);
echo $newstring;
Checkout Simple Html Dom Parser
$html = str_get_html('<html><body>Hello!SO</body></html>');
echo $html->find('a',0)->innertext; //prints "SO"
You can use strip_tags() to remove HTML tags.
I've tried to use the DOM with no bloody luck; getElementByID just doesn't work for me.
I loathe to resort to a regex but not sure what else to do.
The idea is to replace a <div id="content_div"> all sorts </div> with a new <div id="content_div"> NEW ALL SORTS HERE </div> and keep anything that was before or after it in the string.
The string is a partial HTML string and more specifically out of the wordpress Posts DB.
Any ideas?
UPDATE: I tagged this question PHP but probably should have mentioned I'm looking for a PHP solution only.
Update: Code Example
$content = ($wpdb->get_var( "SELECT `post_content` FROM $wpdb->posts WHERE ID = {$article[post_id]}" ));
$doc = new DOMDocument();
$doc->validateOnParse = true;
$doc->loadHTMLFile($content);
$element = $doc->getElementById('div_to_edit');
So I've tried a whole lot of code and this is what I've got so far. It's probably not right, but I've been hacking at it for a little while now.
You're right: getElementByID doesn't work.
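Try getElementById() instead. JavaScript is case sensitive. (PHP method names, by contrast, are case-insensitive, so in PHP the spelling alone would not cause this.)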
ok i assume $content is a snippet? then this may be all you need:
$doc = new DOMDocument();
$doc->validateOnParse = true;
$doc->loadHTML('<html><body>' . $content . '</body></html>'); // loadHTML() parses a string; loadHTMLFile() expects a file path
$element = $doc->getElementById('div_to_edit');
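To actually swap the div's contents and get the string back, something like the following might do. It is a sketch under a few assumptions: the function name `replaceDivText` is invented, the replacement is plain text (HTML would need to be parsed and imported as nodes), and the id is trusted since it is concatenated into the XPath expression.

```php
<?php
// Sketch: replace everything inside <div id="..."> and keep the rest of the snippet.
function replaceDivText(string $content, string $id, string $newText): string
{
    libxml_use_internal_errors(true); // partial HTML tends to trigger warnings
    $doc = new DOMDocument();
    $doc->loadHTML('<html><body>' . $content . '</body></html>');
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $target = $xpath->query('//div[@id="' . $id . '"]')->item(0);
    if ($target === null) {
        return $content; // nothing to replace
    }

    // Drop the old children, then add the new text.
    while ($target->firstChild) {
        $target->removeChild($target->firstChild);
    }
    $target->appendChild($doc->createTextNode($newText));

    // Serialize only what was inside <body>, i.e. the original snippet.
    $out = '';
    foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$post = '<p>before</p><div id="content_div">all <b>sorts</b></div><p>after</p>';
echo replaceDivText($post, 'content_div', 'NEW ALL SORTS HERE');
```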
remember that you must have a valid webpage and that means full HTML tree:
<html>
<head>
<script type="text/javascript">
function changeText(div){
document.getElementById(div).innerHTML = 'my new text';
}
</script>
</head>
<body>
--- your body ---
<div id="DIV_ID">old text</div>
<input type="button" onclick="changeText('DIV_ID');" value="Click me" />
</body>
</html>
remember that you have to call getElementById with:
<script type="text/javascript"> // as from HTML5 you can use only <script>
document.getElementById("DIV_ID").innerHTML = 'my new text';
</script>
Try this..
var myDiv = document.getElementById('content_div');
myDiv.innerHTML = "NEW ALL SORTS HERE";
Looking for a regexp sequence of matches and replaces (preferably PHP but doesn't matter) to change this (the start and end is just random text that needs to be preserved).
IN:
fkdshfks khh fdsfsk
<!--g1-->
<div class='codetop'>CODE: AutoIt</div>
<div class='geshimain'>
<!--eg1-->
<div class="autoit" style="font-family:monospace;">
<span class="kw3">msgbox</span>
</div>
<!--gc2-->
<!--bXNnYm94-->
<!--egc2-->
<!--g2-->
</div>
<!--eg2-->
fdsfdskh
to this OUT:
fkdshfks khh fdsfsk
<div class='codetop'>CODE: AutoIt</div>
<div class='geshimain'>
<div class="autoit" style="font-family:monospace;">
<span class="kw3">msgbox</span>
</div>
</div>
fdsfdskh
Thanks.
Are you just trying to remove the comments? How about
s/<!--[^>]*-->//g
or the slightly better (suggested by the questioner himself):
<!--(.*?)-->
But remember, HTML is not regular, so using regular expressions to parse it will lead you into a world of hurt when somebody throws bizarre edge cases at it.
preg_replace('/<!--(.*)-->/Uis', '', $html)
This PHP code will remove all html comment tags from the $html string.
A better version would be:
(?=<!--)([\s\S]*?)-->
It matches html comments like these:
<!--
multi line html comment
-->
or
<!-- single line html comment -->
and what is most important it matches comments like this (the other regex shown by others do not cover this situation):
<!-- this is my blog: <mynixworld.inf> -->
Note
Although the string below is syntactically an HTML comment, your browser might parse it differently, so it may have a special meaning. Stripping such strings might break your page.
<!--[if !(IE 8) ]><!-->
Do not forget to consider conditional comments, as
<!--(.*?)-->
will remove them. Try this instead:
<!--[^\[](.*?)-->
This will also remove downlevel-revealed conditional comments, though.
EDIT:
This won't remove downlevel-revealed or downlevel-hidden comments.
<!--(?!<!)[^\[>].*?-->
Ah I've done it,
<!--(.*?)-->
With the following pattern:
/( )*<!--((.*)|[^<]*|[^!]*|[^-]*|[^>]*)-->\n*/g
you can remove multiline comments too. Test string:
fkdshfks khh fdsfsk
<!--g1-->
<div class='codetop'>CODE: AutoIt</div>
<div class='geshimain'>
<!--eg1-->
<div class="autoit" style="font-family:monospace;">
<span class="kw3">msgbox</span>
</div>
<!--gc2-->
<!--bXNnYm94-->
<!--egc2-->
<!--g2-->
</div>
<!--eg2-->
fdsfdskh
<!-- --
> test
- -->
<!-- --
<- test <
>
- -->
<!--
test !<
- <!--
-->
<script type="text/javascript">//<![CDATA[
var xxx = 'a';
//]]></script>
ok
Try the following if your comments contain line breaks:
/<!--(.|\n)*?-->/g
<!--([\s\S]*?)-->
This also works in JavaScript and VBScript, since "." doesn't match line breaks in every language.
Here is my attempt:
<!--(?!<!)[^\[>][\s\S]*?-->
This will also remove multi line comments and won't remove downlevel-revealed or downlevel-hidden comments.
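A quick way to try that pattern from PHP, as a sketch with a made-up input string containing one multi-line comment and one conditional comment:

```php
<?php
// The leading character class skips comments that start with "[" or ">"
// (conditional comments); [\s\S]*? lets the body span several lines.
$pattern = '/<!--(?!<!)[^\[>][\s\S]*?-->/';

$html = "a<!-- one\nline two -->b<!--[if IE]><p>old IE</p><![endif]-->c";

$clean = preg_replace($pattern, '', $html);
echo $clean;
```

Here the ordinary comment disappears and the `[if IE]` block is left alone.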
I know that this is quite an old post, but I felt that it would be useful to add to this post in case anyone wants an easy to implement PHP function that directly answers the original question.
/**
* Strip all the html comments from $text
*
* @param $text - text to modify
* @param string $new replacement string
* @return array|string|string[]|null
*/
function strip_html_comments($text, $new=''){
$search = array ("|<!--[\s\S]*?-->|si");
$replace = array ($new);
return preg_replace($search, $replace, $text);
}
This code also removes JavaScript code, which is too bad :|
Here is an example of JavaScript code that will be removed by it:
<script type="text/javascript"><!--
var xxx = 'a';
//-->
</script>
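One possible workaround, sketched here and not taken from any of the answers: split the markup on `<script>` blocks and strip comments only from the pieces in between. The function name `stripCommentsOutsideScripts` is hypothetical, and a `<script>` tag that appears inside a comment would confuse it.

```php
<?php
// Sketch: remove HTML comments everywhere except inside <script> elements.
function stripCommentsOutsideScripts(string $html): string
{
    // With PREG_SPLIT_DELIM_CAPTURE the captured <script> blocks land at odd indexes.
    $parts = preg_split(
        '#(<script\b[^>]*>.*?</script>)#is',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Even indexes are the markup between script blocks.
            $parts[$i] = preg_replace('/<!--[\s\S]*?-->/', '', $part);
        }
    }

    return implode('', $parts);
}

$page = "<!-- a --><script><!--\nvar xxx = 'a';\n//-->\n</script><p>k</p><!-- b -->";
echo stripCommentsOutsideScripts($page);
```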
function remove_html_comments($html) {
$expr = '/<!--[\s\S]*?-->/';
$func = 'rhc';
$html = preg_replace_callback($expr, $func, $html);
return $html;
}
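function rhc($search) {
list($l) = $search;
// Keep conditional comments such as <!--[if IE]> ... <![endif]-->
if (stripos($l, '[if') !== false || stripos($l, '[endif') !== false) {
return $l;
}
// Anything else is an ordinary comment: replace it with nothing
return '';
}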
// Remove multiline comment
$mlcomment = '/\/\*(?!-)[\x00-\xff]*?\*\//';
$code = preg_replace ($mlcomment, "", $code);
// Remove single line comment
$slcomment = '/[^:]\/\/.*/';
$code = preg_replace ($slcomment, "", $code);
// Remove extra spaces
$extra_space = '/\s+/';
$code = preg_replace ($extra_space, " ", $code);
// Remove spaces that can be removed
$removable_space = '/\s?([\{\};\=\(\)\/\+\*-])\s?/';
$code = preg_replace ($removable_space, "\\1", $code);
If you just want the text, or the text with specific tags, you can handle this with PHP's strip_tags(). It also deletes HTML comments, and you can keep the HTML tags you need like this (passing the allowed tags as an array requires PHP 7.4 or later):
$text = '<p>Test paragraph.</p><!-- Comment --> Other text';
echo strip_tags($text, ['p', 'a']);
the output will be:
<p>Test paragraph.</p> Other text
I hope it helps somebody!
You can achieve this with modern JavaScript.
function RemoveHtmlComments() {
// Copy the live NodeList first so that removing nodes does not skip siblings
const children = Array.from(document.body.childNodes);
for (const child of children) {
if (child.nodeType === Node.COMMENT_NODE) child.remove();
}
}
It should be safer than RegEx.
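For a PHP-only equivalent of the same idea, a sketch with DOMDocument and XPath could look like this. The function name `removeCommentNodes` is made up. Note that this removes every comment node in the document, nested ones and conditional comments included, and that saveHTML() returns a full document, possibly with a doctype added.

```php
<?php
// Sketch: remove all comment nodes from an HTML document using the DOM.
function removeCommentNodes(string $html): string
{
    libxml_use_internal_errors(true); // real-world HTML tends to trigger warnings
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Copy the node list to an array before removing anything.
    foreach (iterator_to_array($xpath->query('//comment()')) as $comment) {
        $comment->parentNode->removeChild($comment);
    }

    return $doc->saveHTML();
}

echo removeCommentNodes('<html><body><p>a</p><!-- gone --><p>b</p></body></html>');
```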