I'd like to use this function:
ob_start('no_returns');
function no_returns($a) {
    // strip line breaks and tabs from the buffered output
    return str_replace(array("\r\n", "\r", "\n", "\t"), '', $a);
}
But when I do, it completely kills Disqus comments, so I'd like to exclude the div "disqus_thread". How would I go about doing that without some heavy searching?
If you are looking to speed up the download of the web page, you might try another method:
<?php
ob_start('ob_gzhandler');
// html code here
This will compress the output in a much more efficient manner and your browser will automatically decompress the output in real-time before the visitor sees it.
A related thread on-line is here: http://bytes.com/topic/php/answers/621308-compress-html-output-php
(This is the PHP way to compress web pages without touching the web server configuration, e.g. gzip/mod_deflate on Apache as mentioned above.)
Try a regular expression with preg_replace.
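For example, a minimal sketch (it assumes the disqus_thread div contains no nested div tags, so a non-greedy match can find its closing tag):
<?php
// Sketch: strip whitespace everywhere except inside the Disqus container.
ob_start(function ($html) {
    return preg_replace_callback(
        '%(<div id="disqus_thread">.*?</div>)|([\r\n\t]+)%s',
        function ($m) {
            // $m[2] is set only for a whitespace match, which we drop;
            // otherwise $m[1] holds the Disqus div, which we keep as-is.
            return isset($m[2]) ? '' : $m[1];
        },
        $html
    );
});
?>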
Is there a way to perform search and replace on PHP code before it is interpreted by the PHP engine?
Desired timeline:
1) PHP code is <?php echo("hello"); ?>.
2) A search-and-replace operation runs: hello → good bye.
3) The PHP code is now <?php echo("good bye"); ?>.
4) The PHP engine interprets the code (output is good bye).
It is possible to manipulate the output of the PHP engine, using ob_start or even mod_substitute as an output filter of Apache. However, is it possible to manipulate the input of the PHP engine (PHP code, request, etc)?
Edit:
I'm working with thousands of PHP files and I don't want them to be modified on disk. I would like host1 to be replaced with host2 when these files are executed, since the files were copied from host1 but have to run on host2. Indeed, several tests are made on the host name.
You could use a php script that opens a .php file, makes the necessary replacement, then includes that file.
The request is unusual. It may look silly, but if you really need it, try this.
For example, say you need to change test.php before executing it. Try the following steps (in index.php, for example), sketched in code below:
1) Open test.php for reading and get its contents.
2) Replace everything you want.
3) Save the changed text as test2.php.
4) Include test2.php.
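A sketch of those steps, using the host1 → host2 replacement from the question's edit as the example:
<?php
// index.php
$code = file_get_contents('test.php');          // 1) read the original source
$code = str_replace('host1', 'host2', $code);   // 2) replace what you need
file_put_contents('test2.php', $code);          // 3) save the changed text
include 'test2.php';                            // 4) execute the modified copy
?>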
EDIT
Better yet, think about why you need this. Try using variables instead; that will help.
You should study template engines like Smarty and the various forms of prefilters they support.
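For instance, a minimal sketch of a Smarty prefilter, which rewrites template source before Smarty compiles it (assumes Smarty 3 installed via Composer; page.tpl is a placeholder):
<?php
require 'vendor/autoload.php';
$smarty = new Smarty();
// A prefilter receives the raw template source and returns the modified source
$smarty->registerFilter('pre', function ($source, Smarty_Internal_Template $template) {
    return str_replace('host1', 'host2', $source);
});
$smarty->display('page.tpl');
?>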
Modifying code like this isn't a good idea; however, to answer your question: you may load the original code, preprocess it, and store it in a new file:
$contents = file_get_contents('original.php');
// ... perform your replacements on $contents here ...
file_put_contents('new.php', $contents);
include('new.php');
But I don't see any valid use of this...
You may also use eval(), but every time you do that a kitty dies (no really, it's a dangerous function and there are only a few valid uses of it).
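For completeness, a sketch of the eval() route, which avoids the temporary file (the host1 → host2 replacement is again from the question's edit):
<?php
$code = file_get_contents('original.php');
$code = str_replace('host1', 'host2', $code);
// eval() starts parsing in PHP mode; the leading '?>' drops back to HTML mode
// so the file's own opening <?php tag is interpreted correctly.
eval('?>' . $code);
?>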
Possible Duplicate:
Save full webpage
I need to save the page source of an external link using PHP, like saving a page on a PC.
P.S.: the saved folder should contain the images and the HTML contents.
I tried the code below; it just puts the source in tes.html. I need to save all the images too, so it can be accessed offline.
<?php
include 'curl.php';
$game = load("https://otherdomain.com/");
echo $game;
file_put_contents('tes.html', $game);
?>
What you are trying to do is mirror a web site.
I would use the program wget to do so instead of reinventing the wheel.
// -m mirrors the site, -k converts links for local viewing, -w 20 waits 20 seconds between requests
exec('wget -mk -w 20 http://www.example.com/');
See:
http://en.wikipedia.org/wiki/Wget
http://fosswire.com/post/2008/04/create-a-mirror-of-a-website-with-wget/
Either write your own solution to parse all the CSS, image, and JS links (and save them), or check this answer to a similar question: https://stackoverflow.com/a/1722513/143732
You need to write a scraper and, by the looks of it, you're not yet skilled enough for such an endeavor. Consider studying:
Web Scraping (cURL, StreamContext in PHP, HTTP theory)
URL paths (relative, absolute, resolving)
DOMDocument and DOMXPath (for parsing HTML and easy tag querying)
Overall HTML structure (IMG, LINK, SCRIPT and other tags that load external content)
Overall CSS structure (like url('...') in CSS that loads resources the page depends on)
And only then will you be able to mirror a site properly. But if they load content dynamically, as with Ajax, you're out of luck.
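As a starting point, here is a minimal sketch of the DOMDocument/DOMXPath part (it assumes $html already holds the fetched page; URL resolution and saving are left out):
<?php
libxml_use_internal_errors(true);   // real-world HTML is rarely valid; silence parser warnings
$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
// Collect the URLs of external resources the page depends on
$urls = array();
$query = '//img/@src | //script/@src | //link[@rel="stylesheet"]/@href';
foreach ($xpath->query($query) as $attr) {
    $urls[] = $attr->nodeValue;     // often relative; resolve against the page URL before fetching
}
print_r($urls);
?>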
file_get_contents() also supports http(s). Example:
$game = file_get_contents('https://otherdomain.com');
Wondering how to achieve a CSS file like this one from css-tricks.com:
http://cdn.css-tricks.com/wp-content/themes/CSS-Tricks-9/style.css?v=9.5
Not sure if he is using PHP to accomplish this or not. I've been reading countless articles with no luck.
Also, is it something automated that spits out the version number after the .css? I've been seeing it around and wondered how to achieve a clean CSS file like that.
Any help is appreciated! Thanks.
It's simple enough to use an editor with Search/Replace and strip out all the unnecessary spaces. For instance, when I write CSS I only use spaces to separate keywords - I use newlines and tabs to format it legibly. So I could just replace all tabs and newlines with the empty string and the result is "minified" CSS like the one above.
The version number is a fairly common cache trick. It doesn't affect anything server-side, but the browser sees it as a new file, and caches it as such. This makes it easy to purge the cache of all users when an update is made. Personally, though, I use a PHP function to append "?t=".filemtime($file) (in other words, the timestamp that the file was modified) automatically, which saves me the trouble of manually updating version numbers.
Here is the exact code I use to automatically append modification time to JS and CSS files:
<?php
ob_start(function($data) {
    chdir($_SERVER['DOCUMENT_ROOT']);
    return preg_replace_callback(
        // parentheses here act as the regex delimiters; all the relevant
        // files live in the /js and /css folders of the document root
        "(/(?:js|css)/.*?\.(?:js|css))",
        function($m) {
            // strip the leading slash so the path is relative to the document root
            if (file_exists(substr($m[0], 1)))
                return $m[0] . "?t=" . filemtime(substr($m[0], 1));
            else
                return $m[0];
        },
        $data
    );
});
?>
I would avoid doing it manually because you may corrupt your CSS.
There are good tools available that will solve such problems for you without being tricky.
An excellent solution is Assetic, an asset manager that lets you filter (minify, compress) using various tools (YUI Compressor, Google Closure, etc.).
It is currently bundled by default with Symfony2 but may be used standalone in any PHP project.
I've successfully implemented it in a Zend Framework project.
I had a big PHP script written out to scrape images from this site: "http://www.mcso.us/paid/", but when it didn't work I butchered my code to simply echo the whole page.
I found that the table with the image links I want doesn't show up. I believe it's because the remote site uses ASP to generate the table. Is there a way around this? Am I wrong? Please help.
<?php
include("simple_html_dom.php");
set_time_limit(0);
$baseURL = "http://www.mcso.us/paid/";
$html = file_get_html($baseURL);
echo $html;
?>
There's no obvious reason why their using ASP would cause this. Have you tried navigating the page with JavaScript turned off? The more likely scenario is that the tables are generated through JS.
Do note that the search results are retrieved through Ajax (page http://www.mcso.us/paid/default.aspx) by making a POST request. You can use cURL for that: http://php.net/manual/en/book.curl.php. In Chrome, right-click → Inspect Element → Network, then run a search; you will see all the info there (POST variables etc.).
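A minimal cURL sketch of that POST request (the form field below is a placeholder; copy the real POST variables from the Network tab):
<?php
$ch = curl_init('http://www.mcso.us/paid/default.aspx');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(array(
    'searchField' => 'value',   // placeholder; use the fields the site actually sends
)));
$response = curl_exec($ch);
curl_close($ch);
echo $response;
?>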
I need to compare a webpage's DOM structure at various points in time. What are the ways to retrieve and snapshot it?
I need the DOM on the server side for processing.
I basically need to track structural changes to a webpage. Such as removing of a div tag, or inserting a p tag. Changing data (innerHTML) on those tags should not be seen as a difference.
$html_page = file_get_contents("http://awesomesite.com");
$html_dom = new DOMDocument();
$html_dom->loadHTML($html_page);
That uses the PHP DOM extension. Very simple and actually a bit fun to use (see the DOMDocument reference in the PHP manual).
EDIT: After clarification, a better answer lies here.
Perform the following steps on server-side:
Retrieve a snapshot of the webpage via HTTP GET
Save consecutive snapshots of a page with different names for later comparison
Compare the files with an HTML-aware diff tool (see HtmlDiff tool listing page on ESW wiki).
As a proof-of-concept example with Linux shell, you can perform this comparison as follows:
wget --output-document=snapshot1.html http://example.com/
wget --output-document=snapshot2.html http://example.com/
diff snapshot1.html snapshot2.html
You can of course wrap up these commands into a server-side program or a script.
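A sketch of such a wrapper in PHP (it assumes wget and diff are installed, and is meant to run from the command line with the two snapshots taken at different times):
<?php
exec('wget --output-document=snapshot1.html http://example.com/');
// ... later, after the page may have changed ...
exec('wget --output-document=snapshot2.html http://example.com/');
exec('diff snapshot1.html snapshot2.html', $diffLines);
echo implode("\n", $diffLines), "\n";
?>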
For PHP, I would suggest you take a look at daisydiff-php. It readily provides a PHP class that lets you easily create an HTML-aware diff tool. Example:
<?php
require_once('HTMLDiff.php');
$file1 = file_get_contents('snapshot1.html');
$file2 = file_get_contents('snapshot2.html');  // the second snapshot, not the first again
$differ = new HTMLDiffer();
$differ->htmlDiffer($file1, $file2);
?>
Note that with file_get_contents, you can retrieve data from a given URL as well.
DaisyDiff itself is also a very fine tool for visualising structural changes.
If you use Firefox, Firebug lets you view the DOM structure of any web page.