Is it possible to change original html text in php? - php

I am trying to make a "manner friendly" website. We use different declensions depending on gender and other factors. For example:
You did = robili
It did = robilo
She did = robila
Linguistically this is a very simplified (and unlucky) example! I would like to change the HTML text in a PHP file where appropriate. For example:
<?php
something
?>
html text of the page and somewhere is the word "robil"
<div>we tried to robil^i|o|a^</div>
<?php something ?>
Now I would like to find all occurrences of tokens of the form ^characters|characters|characters^ and replace each one with one of its internal values according to "gender".
This is easy in JavaScript on the client side, but then the visitor sees all this weird "tokenizing" before JavaScript replaces it.
I do not know an elegant server-side solution.
Or do you have a better idea?
Thanks for any advice.

You can add these scripts before and after the HTML:
<?php
// start output buffering
ob_start();
?>
<html>
<body>
html text of the page and somewhere is the word "robil"
<div>we tried to robil^i|o|a^, but also vital^si|sa|ste^, borko^mal|mala|malo^ </div>
</body>
</html>
<?php
$use = 1; // indicate which declension to use (0, 1 or 2)
// get buffered html
$html = ob_get_contents();
ob_end_clean();
// match anything between '^' that's not a control chr or '^', min 3 and max 20 chrs.
if (preg_match_all('/\^[^[:cntrl:]\^]{3,20}\^/',$html,$matches))
{
// replace all
foreach (array_unique($matches[0]) as $match)
{
$choices = explode('|',trim($match,'^'));
$html = str_replace($match,$choices[$use],$html);
}
}
echo $html;
This returns:
html text of the page and somewhere is the word "robil" we tried to
robilo, but also vitalsa, borkomala
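The same replacement can also be done in a single pass with preg_replace_callback(). The following is only a sketch: the function name apply_declension() is made up for illustration, and it assumes that a token missing the requested index should be left untouched instead of raising an "undefined offset" notice.

```php
<?php
/**
 * Replace every ^a|b|c^ token with the alternative at index $use.
 * apply_declension() is a hypothetical helper, not part of the answer above.
 */
function apply_declension(string $html, int $use): string
{
    $result = preg_replace_callback(
        // same character rules as the answer's pattern, with a capture group
        '/\^([^[:cntrl:]\^]{1,20})\^/',
        static function (array $m) use ($use): string {
            $choices = explode('|', $m[1]);
            // fall back to the raw token if the requested form is missing
            return $choices[$use] ?? $m[0];
        },
        $html
    );
    // preg_replace_callback() returns null on a regex error; keep the input then
    return $result ?? $html;
}

echo apply_declension('<div>we tried to robil^i|o|a^</div>', 1), "\n";
```

To apply it to the whole page, the result of ob_get_clean() can be passed through this function before it is echoed.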

Related

How to extract HTML element from a source file

I need to use PHP to replace an HTML section, identified by a tag id, in source code that is a combination of HTML and PHP. If it were pure HTML, a DOM parser could be used; if there were no DIV inside the DIV, I can imagine how to use preg_match. This is what I am trying to do. I have code (loaded into a string) like:
<div>
<img >
</div>
<? include(); ?>
<div id="mydiv">
<div>
<div>
<img >
</div>
</div>
</div>
and my task is to replace content of "mydiv" DIV with a new one e.g.
<div id="newdiv">
some text
</div>
so the string will look like this after the change:
<div>
<img >
</div>
<? include(); ?>
<div id="mydiv">
<div id="newdiv">
some text
</div>
</div>
I have already tried:
1) parsing the code using DOMDocument's loadHTML => it produces a lot of errors when PHP code is included.
2) I played around a bit with regexes like preg_match_all('/<div id="myid"([^<]*)<\/div>/', $src, $matches), which fails when child divs are included.
The best approach I have found so far is:
1) find the id="mydiv" string
2) search for '<' and '>' chars and count them like '<'=1 and '>'=-1 (not exactly, but it gives the idea)
3) once I get sum == 0 I should be at the position of the closing tag, so I know which portion of the string I should exchange
This is quite a "heavy" solution, which can stop working in cases where the code is different (e.g. the on-page PHP code contains those chars as well, instead of just a simple "include"). So I am looking for a better solution.
You could try something like this:
$file = 'filename.php';
$content = file_get_contents($file);
$array_one = explode( '<div id="mydiv">' , $content );
$my_div_content = explode("</div>" , $array_one[1] )[0];
Or use preg_match like you said:
preg_match('/<div id="mydiv"(.*?)<\/div>/s', $content, $matches)
Yes, there is. First you need to use a function that will get the content of the file. Let's call the file homepage.php:
$homepageString = file_get_contents('homepage.php');
Now you have a string with all the content. The next thing you would do is use the preg_replace() function to take out the part of code that you want to take out:
$newHomepageString = preg_replace('/id="mydiv"/',"", $homepageString);
Now you overwrite the existing homepage.php file with the new source code:
file_put_contents("homepage.php", $newHomepageString);
Let me know if it worked for you! :)
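Neither explode() on "</div>" nor a non-greedy regex handles the nested divs from the question, because both stop at the first closing tag. The counting idea described in the question can be sketched as follows. This is an illustration, not a full parser: replace_div_content() is a hypothetical name, and it assumes the id is written with double quotes and that no div tag contains a '>' inside its attributes (for example from inline PHP).

```php
<?php
/**
 * Replace the inner content of <div id="$id"> by counting nested div tags.
 * Returns the source unchanged when the div or its closing tag is not found.
 */
function replace_div_content(string $src, string $id, string $replacement): string
{
    $open = '/<div\b[^>]*\sid="' . preg_quote($id, '/') . '"[^>]*>/i';
    if (!preg_match($open, $src, $m, PREG_OFFSET_CAPTURE)) {
        return $src;
    }
    $start = $m[0][1] + strlen($m[0][0]); // first byte after the opening tag

    // walk over every later <div ...> and </div>, tracking the nesting depth
    $depth = 1;
    preg_match_all('/<div\b[^>]*>|<\/div\s*>/i', $src, $tags, PREG_OFFSET_CAPTURE, $start);
    foreach ($tags[0] as [$tag, $pos]) {
        $depth += ($tag[1] === '/') ? -1 : 1;
        if ($depth === 0) {
            // $pos is where the matching </div> begins
            return substr($src, 0, $start) . $replacement . substr($src, $pos);
        }
    }
    return $src;
}
```

Because only div tags are counted, a `<? include(); ?>` block elsewhere in the string does not disturb the depth counter.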

How to chain in phpquery (almost everything can be a chain)

Good day everyone,
I'm very new to phpQuery and this is my first post here at Stack Overflow, because I can't find the correct syntax for phpQuery chaining. I hope someone knows what I have been looking for.
I only want to remove a certain div inside a div.
<div id = "content">
<p>The text that i want to display</p>
<div class="node-links">Stuff i want to remove</div>
</div>
These few lines of code work perfectly:
pq('div.node-links')->remove();
$text = pq('div#content');
print $text; //output: The text that i want to display
But when I tried:
$text = pq('div#content')->removeClass('div.node-links'); //or
$text = pq('div#content')->remove('div.node-links');
//output: The text that i want to display (+) Stuff i want to remove
Can someone tell me why the second block of code is not working?
Thanks!
The first line of code will only work if you're trying to remove the class from div.node-links; it won't remove the node.
If you are trying to remove the class you need to change it from:
$text = pq('div#content')->removeClass('div.node-links');
// to
$text = pq('div#content')->find('.node-links')->removeClass('node-links')->end();
which will output:
<div id="content">
<p>The text that i want to display</p>
<div>Stuff i want to remove</div>
</div>
As for the second line of code, I'm not exactly sure why it is not working. It seems like you're not selecting .node-links, but I was able to get the desired result using any of these:
// $markup = file_get_contents('test.html');
// $doc = phpQuery::newDocumentHTML($markup);
$text = $doc->find('div#content')->children()->remove('.node-links')->end();
// or
$text = pq('div#content')->find('.node-links')->remove()->end();
// or
$text = pq('div#content > *')->remove('.node-links')->parent();
Hope that helps
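As in jQuery, a selector passed to remove() filters the elements that are already matched instead of searching inside them, so select the inner div directly: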
$text = pq('div#content div.node-links')->remove();

Get content in faster way from url using php

I am using PHP and want to get the content from a URL in a faster way.
Here is the code I use.
Code:(1)
<?php
$content = file_get_contents('http://www.filehippo.com');
echo $content;
?>
There are many other ways to read files, like fopen(), readfile() etc., but I think file_get_contents() is faster than these.
When you execute the code above, you see that it gives everything from this website, even images and ads. I want to get only the plain HTML text, with no CSS styles, images or ads. How can I get this?
See this to understand what I mean.
CODE:(2)
<?php
$content = file_get_contents('http://www.filehippo.com');
// do something to remove css-style, images and ads.
// return the plain html text in $mod_content.
echo $mod_content;
?>
If I do it like above then I am going the wrong way, because I have already fetched the full content into the variable $content and only then modify it.
Is there any function, method or anything else which gets the plain HTML text directly from a URL?
The code below is written only for illustration; it is not real PHP code.
IDEAL CODE:(3);
<?php
$plain_content = get_plain_html('http://www.filehippo.com');
echo $plain_content; // no css-style, images and ads.
?>
If I can get such a function it will be much faster than the others. Is this possible?
Thanks.
Try this.
$content = file_get_contents('http://www.filehippo.com');
$this->html = $content;
$this->process();
function process(){
// header
$this->_replace('/.*<head>/ism', "<?xml version='1.0' encoding='UTF-8'?><!DOCTYPE html PUBLIC '-//WAPFORUM//DTD XHTML Mobile 1.0//EN' 'http://www.wapforum.org/DTD/xhtml-mobile10.dtd'><html xmlns='http://www.w3.org/1999/xhtml'><head>");
// title
$this->_replace('/<head>.*?(<title>.*<\/title>).*?<\/head>/ism', '<head>$1</head>');
// strip out divs with little content
$this->_stripContentlessDivs();
// divs/p
$this->_replace('/<div[^>]*>/ism', '') ;
$this->_replace('/<\/div>/ism','<br/><br/>');
$this->_replace('/<p[^>]*>/ism','');
$this->_replace('/<\/p>/ism', '<br/>') ;
// h tags
$this->_replace('/<h[1-5][^>]*>(.*?)<\/h[1-5]>/ism', '<br/><b>$1</b><br/><br/>') ;
// remove align/height/width/style/rel/id/class attributes
$this->_replace('/\salign=(\'?\"?).*?\\1/ism','');
$this->_replace('/\sheight=(\'?\"?).*?\\1/ism','');
$this->_replace('/\swidth=(\'?\"?).*?\\1/ism','');
$this->_replace('/\sstyle=(\'?\"?).*?\\1/ism','');
$this->_replace('/\srel=(\'?\"?).*?\\1/ism','');
$this->_replace('/\sid=(\'?\"?).*?\\1/ism','');
$this->_replace('/\sclass=(\'?\"?).*?\\1/ism','');
// remove comments
$this->_replace('/<\!--.*?-->/ism','');
// remove script/style
$this->_replace('/<script[^>]*>.*?\/script>/ism','');
$this->_replace('/<style[^>]*>.*?\/style>/ism','');
// multiple \n
$this->_replace('/\n{2,}/ism','');
// remove multiple <br/>
$this->_replace('/(<br\s?\/?>){2}/ism','<br/>');
$this->_replace('/(<br\s?\/?>\s*){3,}/ism','<br/><br/>');
//tables
$this->_replace('/<table[^>]*>/ism', '');
$this->_replace('/<\/table>/ism', '<br/>');
$this->_replace('/<(tr|td|th)[^>]*>/ism', '');
$this->_replace('/<\/(tr|td|th)[^>]*>/ism', '<br/>');
// wrap and close
}
private function _replace($pattern, $replacement, $limit=-1){
$this->html = preg_replace($pattern, $replacement, $this->html, $limit);
}
For more, see https://code.google.com/p/phpmobilizer/
You can use regular expressions to delete the CSS, script and image tags: just replace the matched markup with an empty string.
preg_replace($pattern, $replacement, $string);
For more detail on the function, see http://php.net/manual/en/function.preg-replace.php
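Note that file_get_contents() on a URL downloads only the HTML document itself. Images, style sheets and ads are fetched by the browser when the echoed markup is rendered, so removing the tags that reference them is what stops them from loading. As an alternative to regular expressions, here is a minimal sketch using PHP's DOM extension. The helper name strip_heavy_elements() and the list of tag names are assumptions for illustration, and the input is assumed to be a non-empty HTML string.

```php
<?php
/**
 * Return $html with <script>, <style>, <img> and <link> elements removed.
 * strip_heavy_elements() is a hypothetical helper name.
 */
function strip_heavy_elements(string $html): string
{
    $doc = new DOMDocument();
    // real-world markup is rarely valid; collect parser warnings instead of printing them
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach (['script', 'style', 'img', 'link'] as $tagName) {
        // copy the live node list first, because removing nodes while iterating skips elements
        foreach (iterator_to_array($doc->getElementsByTagName($tagName)) as $node) {
            $node->parentNode->removeChild($node);
        }
    }
    return $doc->saveHTML();
}
```

A call such as strip_heavy_elements(file_get_contents($url)) would then return the page without those elements; strip_tags() can be applied afterwards if only the text is wanted.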

How to replace code between the specific tags and apply php eval on it

I have a variable which holds content from the database; a sample is below:
$content = <<<'HTML'
<div><h1>content here</h1>
<img src = 'image.jpg' /><br />
[code]echo 'welcome';[/code]
<h2>some content here</h2>
<p> some large content here</p>
[code]echo 'Click Here';[/code]
Thank you.
HTML;
echo 'headers here' .$content . 'footers here';
How can I execute PHP for the content in between the [code] and [/code] tags?
The remaining text should be written as HTML, except for the code inside the [code] some php code [/code] tags.
If this is for a templating type of system for your site, I would suggest going a different route.
First, your content above would change so that the portion between the [code] tags looks like:
[code]{{msg}}[/code]
You can still store PHP code in the database if you want, in which case it makes more sense to eval it before you put it into the database, but I strongly suggest avoiding that and using a find/replace system instead.
Now, you would want to have a function that would do the following:
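function output_template( $name, $data ) {
    // get_template_from_db() stands for your own database lookup
    $template_string = get_template_from_db( $name );
    foreach ( $data as $key => $value )
    {
        // replace each {{key}} placeholder with its value
        $template_string = str_replace( '{{' . $key . '}}', $value, $template_string );
    }
    return $template_string;
}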
Then you can echo out the return value of this function. If you think you need to use eval, rethink what you are doing, especially for what appears to be a templating system.
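A self-contained variant of the find/replace idea, without the database lookup, might look like the sketch below. The name render_template() is hypothetical, and escaping the values with htmlspecialchars() is an added assumption that the data is plain text rather than trusted HTML.

```php
<?php
/**
 * Fill {{name}} placeholders in $template from $data, escaping values for HTML.
 * Placeholders with no matching key are left as they are.
 */
function render_template(string $template, array $data): string
{
    $pairs = [];
    foreach ($data as $key => $value) {
        $pairs['{{' . $key . '}}'] = htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
    }
    // strtr() replaces all pairs in one pass, so inserted values are never re-scanned
    return strtr($template, $pairs);
}

echo render_template('<h1>{{msg}}</h1><p>{{link}}</p>', ['msg' => 'welcome', 'link' => 'Click Here']), "\n";
```

No eval() is involved: the stored content can only ever produce the values passed in $data.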

Tag stripping allowing some html tags - Facebook-ish

I am building something like a posting function in a local app. It works fine, but it lacks validation, and the validation I did write is a mess. I'm using jQuery oEmbed.
What I want is to print illegal HTML tags as literal text and to render the HTML tags I have allowed.
Any suggestions?
This is the best solution I came up with.
First replace all the < and > with their HTML entities, then convert the allowed tags back.
<?php
$original_str = "<html><b>test</b><strong>teste</strong></html>";
$allowed_tags = array("b", "strong");
$sans_tags = str_replace(array("<", ">"), array("&lt;","&gt;"), $original_str);
$regex = sprintf("~&lt;(/)?(%s)&gt;~", implode("|",$allowed_tags));
$with_allowed = preg_replace($regex, "<\\1\\2>", $sans_tags);
echo $with_allowed;
echo "\n";
Result:
guax#trantor:~$ php teste.php
&lt;html&gt;<b>test</b><strong>teste</strong>&lt;/html&gt;
I wonder if there's any solution for replacing everything at once. But it works.
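One way to do the replacement in a single pass is preg_replace_callback(), deciding per match whether to keep a tag or escape it. This is a sketch under assumptions: escape_except() is a made-up name, and only bare tags without attributes (such as <b> or </b>) are ever kept, so an allowed tag that carries attributes is escaped as well.

```php
<?php
/**
 * Escape all markup except bare opening/closing tags named in $allowed.
 * escape_except() is a hypothetical helper name.
 */
function escape_except(string $input, array $allowed): string
{
    $allowed = array_map('strtolower', $allowed);
    $result = preg_replace_callback(
        // a bare tag, or a run of text without '<', or a lone '<'
        '~<(/?)([a-z][a-z0-9]*)>|[^<]+|<~i',
        static function (array $m) use ($allowed): string {
            $isAllowedTag = isset($m[2]) && in_array(strtolower($m[2]), $allowed, true);
            // keep allowed tags verbatim, escape everything else
            return $isAllowedTag ? $m[0] : htmlspecialchars($m[0], ENT_QUOTES, 'UTF-8');
        },
        $input
    );
    // on a regex error return nothing rather than unescaped input
    return $result ?? '';
}

echo escape_except('<html><b>test</b><strong>teste</strong></html>', ['b', 'strong']), "\n";
```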
