Comparing XML documents for changes in PHP - php

Currently I'm using PHP to load multiple XML files from around the web (non-local) using simplexml_load_file(). This, as you can imagine, is quite a clunky process and is slowing load time significantly (7 seconds to load 7 files), and there could possibly be more files to load. These files don't change often, but changes should be displayed on the page as soon as they are made.
One idea I had was to cache a version of each feed and the html output I generate from that feed in my DB. Then, each time the user loads the page, the feeds would be compared; if they are different I would run my existing code, generate the HTML, output it, and save it to the DB. However, if it is the same, I could simply output the cached HTML.
My two concerns with this are:
Security: If I am storing a copy of an XML file, could this pose a security threat, seeing as I don't control the content of that file?
Speed: The main goal here is to increase the speed of the overall page load. Would the process described above increase the speed, or would it just bog down the server with more to do? Thanks for your help!

How about having a cron job crawl through every external XML source, say, hourly or quarter-hourly and update it if necessary?
It wouldn't be in 100% real time, but would take the load off your web page - that would always be using cached files. I don't think there is a reliable way of polling external sources for updates other than actually downloading the file (in theory, it should be possible to get the correct cache headers, but I wouldn't rely on them being configured correctly.)
Security: If I am storing a copy of an XML file, could this pose a security threat, seeing as I don't control the content of that file?
Hardly. To make totally sure, store the cached XML files outside the web root. Any threat that remains is then the same as if you were passing the stream through live.
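A rough sketch of the refresh step such a cron job could run for each feed is below. The function name, the cache path and the idea of comparing SHA-256 hashes are assumptions for illustration, not something the question or answer prescribes.

```php
<?php
// Sketch of a cron-driven refresh step. The function name and the cache
// layout are assumptions for illustration.

/**
 * Store $newContent in $cacheFile only if it differs from what is cached.
 * Returns true when the cache was (re)written, false when the content was
 * unchanged or the write failed.
 */
function refreshCachedFeed(string $cacheFile, string $newContent): bool
{
    if (is_file($cacheFile) && hash_file('sha256', $cacheFile) === hash('sha256', $newContent)) {
        return false; // unchanged: keep the cached XML and any HTML built from it
    }
    // Write to a temporary file first, then rename it into place, so that
    // page requests never read a half-written cache file.
    $tmp = $cacheFile . '.' . uniqid('', true) . '.tmp';
    if (file_put_contents($tmp, $newContent) === false) {
        return false;
    }
    return rename($tmp, $cacheFile);
}

// Hypothetical cron usage (the URL and cache path are placeholders):
// $xml = @file_get_contents('http://example.com/xml/test.xml');
// if ($xml !== false && refreshCachedFeed('/var/cache/feeds/test.xml', $xml)) {
//     // regenerate and store the HTML for this feed here
// }
```

The page itself would then only ever read the cached files, so its load time no longer depends on the remote servers.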

One idea I had was to cache a version of each feed and the html output I generate from that feed in my DB. Then, each time the user loads the page, the feeds would be compared; if they are different I would run my existing code, generate the HTML, output it, and save it to the DB. However, if it is the same, I could simply output the cached HTML.
Rather than caching the XML file yourself, you should set the If-None-Match or If-Modified-Since fields in the request header. This way you can check to see if the files have changed without necessarily downloading them.
This can be done by setting a stream context for libxml before running simplexml_load_file(). If the file hasn't changed, you'll get a 304 Not Modified response, and simplexml_load_file will fail.
You could also use stream_context_get_default to set the general stream context, then retrieve the XML file into a string with file_get_contents and pass it to simplexml_load_string().
Here's an example of the first way:
Class CachedXml {
public $element,$url;
private $mod_date, $etag;
public function __construct($url){
$this->url = $url;
$this->element = NULL;
$this->mod_date = FALSE;
$this->etag = FALSE;
}
public function updateXml(){
if($this->mod_date || $this->etag){
$opts = array(
'http'=>array(
'header'=>"If-Modified-Since: $this->mod_date\r\n" .
"If-None-Match: $this->etag\r\n"
)
);
$context = stream_context_create($opts);
libxml_set_streams_context($context);
}
if($attempt = @simplexml_load_file($this->url)){
$this->element = $attempt;
$headers = get_headers($this->url,1);
$this->mod_date = $headers['Last-Modified'];
$this->etag = $headers['ETag'];
return TRUE;
}
return FALSE;
}
}
$bob = new CachedXml('http://example.com/xml/test.xml');
if($bob->updateXml()){
echo "Bob was just updated.<br />";
echo " Bob's name is " . $bob->element->getName() . ".<br />";
}
else{
echo "Bob was not updated.<br />";
}
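The second way mentioned above (a stream context plus file_get_contents and simplexml_load_string) might look roughly like this. It is only a sketch: the three function names are made up, and it assumes the remote server actually honours If-None-Match / If-Modified-Since.

```php
<?php
// Sketch only: buildConditionalHeaders(), lastStatusCode() and
// fetchXmlIfChanged() are hypothetical helper names, not an existing API.

/** Build the conditional request headers from previously saved validators. */
function buildConditionalHeaders(?string $etag, ?string $lastModified): string
{
    $headers = '';
    if ($lastModified) {
        $headers .= "If-Modified-Since: $lastModified\r\n";
    }
    if ($etag) {
        $headers .= "If-None-Match: $etag\r\n";
    }
    return $headers;
}

/** Return the status code of the last response in a list of raw header lines. */
function lastStatusCode(array $headerLines): int
{
    $code = 0;
    foreach ($headerLines as $line) {
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $code = (int) $m[1]; // later status lines (after redirects) win
        }
    }
    return $code;
}

/** Return the parsed XML, or null if the feed is unchanged (304) or unavailable. */
function fetchXmlIfChanged(string $url, ?string $etag, ?string $lastModified): ?SimpleXMLElement
{
    $context = stream_context_create(array('http' => array(
        'header'        => buildConditionalHeaders($etag, $lastModified),
        'ignore_errors' => true, // return the body even on non-2xx statuses
    )));
    $body = @file_get_contents($url, false, $context);
    // The HTTP wrapper fills $http_response_header in the local scope.
    if ($body === false || lastStatusCode($http_response_header ?? array()) !== 200) {
        return null;
    }
    $xml = simplexml_load_string($body);
    return $xml === false ? null : $xml;
}
```

A null result means "keep using whatever you already have", so the caller would fall back to the previously generated HTML.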

Related

PHP: Displaying an image from a web service

I'm using an external web service that returns an image URL, which I display on my website. For example:
$url = get_from_web_service();
echo '<img src="'.$url.'" />';
Everything works fine, except that if I have 100 images to show, calling the web service becomes time- and resource-consuming.
//the problem
foreach($items as $item) {
$url = get_from_web_service($item);
echo '<img src="'.$url.'" />';
}
So now I'm considering two options:
//Option1: Using PHP file_get_contents():
foreach($items as $item)
{
echo '<img src="url_to_my_website/get_image.php?id='.$item->id.'" />';
}
get_image.php :
$url = get_from_web_service($id);
header("Content-Type: image/png");
echo file_get_contents($url);
//Option2: Using ajax:
echo '<img src="dummy_image_or_website_logo" data-id="123" />';
//ajax call to the web service to get the id=123 and get the url then add the src attribute to that image.
THOUGHTS
The first option seems more straightforward, but my server would be involved in every single image request and might be overloaded.
The second option is handled entirely by the browser and the web service, so my server is not involved at all. But for each image I'm making two calls: one AJAX call to get the image URL and another to get the image itself. So loading time might vary, and AJAX calls might fail when there are many of them.
Information
Around 50 Images will be displayed in that page.
This service will be used by around 100 users at any given time.
I have no control over the web service, so I can't change its functionality, and it doesn't accept more than one image ID per call.
My Questions
Is there any better option I should consider?
If not, which option should I follow, and most importantly, why?
Thanks
Method 1: Rendering in PHP
Pros:
Allows for custom headers that're independent of any server software. If you're using something that's not generally cached (like a PHP file with a query string) or are adding this to a package that needs header functionality regardless of server software, this is a very good idea.
If you know how to use GD or Imagick, you can easily resize, crop, compress, index, etc. your images to reduce the image file size (sometimes drastically) and make the page load significantly faster.
If width and height are passed as variables to the PHP file, the dimensions can be set dynamically:
<div id="gallery-images">
<noscript>
<!-- So that the thumbnail is small for old mobile devices //-->
<img src="get-image.php?id=123&h=200&w=200" />
</noscript>
</div>
<script type="text/javascript">
/* Something to create an image element inside of the div.
* In theory, the browser height and width can be pulled dynamically
* on page load, which is useful for ensuring that images are no larger
* than they need to be. Having a function to load the full image
* if the browser becomes bigger isn't a bad idea though.
*/
</script>
This would be incredibly considerate of mobile users on a page that has an image gallery. This is also very considerate of users with limited bandwidth (like almost everyone in Alaska. I say this from personal experience).
Allows you to easily clear the EXIF data of images if they're uploaded by users on the website. This is important for user privacy as well as making sure there aren't any malicious scripts living in your JPGs.
Gives potential to dynamically create a large image sprite and drastically reduce your HTTP requests if they're causing latency. It'd be a lot of work so this isn't a very strong pro, but it's still something you can do using this method that you can't do using the second method.
Cons:
Depending on the number and size of images, this could put a lot of strain on your server. When used with browser-caching, the dynamic images are being pulled from cache instead of being re-generated, however it's still very easy for a bot to be served the dynamic image a number of times.
It requires knowledge of HTTP headers, basic image manipulation skills, and an understanding of how to use image manipulation libraries in PHP to be effective.
Method 2: AJAX
Pros:
The page would finish loading before any of the images. This is important if your content absolutely needs to load as fast as possible, and the images aren't very important.
Is far more simple, easy and significantly faster to implement than any kind of dynamic PHP solution.
It spaces out the HTTP requests, so the initial content loads faster (since the HTTP requests can be sent based on browser action instead of just page load).
Cons:
It doesn't decrease the number of HTTP requests, it simply spaces them out. Also note that there will be at least one additional external JS file in addition to all of these images.
Displays nothing if the end device (such as older mobile devices) does not support JavaScript. The only way you could fix this is to have all of the images load normally between some <noscript> tags, which would require PHP to generate twice as much HTML.
Would require you to add loading.gif (and another HTTP request) or Please wait while these images load text to your page. I personally find this annoying as a website user because I want to see everything when the page is "done loading".
Conclusion:
If you have the background knowledge or time to learn how to effectively use Method 1, it gives far more potential because it allows for manipulation of the images and HTTP requests sent by your page after it loads.
Conversely, if you're looking for a simple method to space out your HTTP Requests or want to make your content load faster by making your extra images load later, Method 2 is your answer.
Looking back at methods 1 and 2, it looks like using both methods together could be the best answer. Having two of your cached and compressed images load with the page (one is visible, the other is a buffer so that the user doesn't have to wait every time they click "next"), and having the rest load one-by-one as the user sees fit.
In your specific situation, I think that Method 2 would be the most effective if your images can be displayed in a "slideshow" fashion. If all of the images need to be loaded at once, try compressing them and applying browser-caching with method 1. If too many image requests on page load is destroying your speed, try image spriting.
As of now, you are contacting the webservice 100 times. You should change it so it contacts the webservice only once and retrieves an array of all the 100 images, instead of each image separately.
You can then loop over this array, which will be very fast as no further webtransactions are needed.
If the images you are fetching from the webservice are not dynamic in nature i.e. do not get changed/modified frequently, I would suggest to setup a scheduled process/cron job on your server which gets the images from the webservice and stores locally (in your server itself), so you can display images on the webpage from your server only and avoid third party server round trip every time webpage is served to the end users.
Neither of the two options resolves your problem, and they may make it worse.
For option 1:
The step that costs the most time is get_from_web_service($item), and this option only moves its execution into another script (assuming get_image.php runs on the same server).
For option 2:
It only makes the browser trigger the request for the image resource; your server still has to process get_from_web_service($item).
To be clear, the problem is the performance of get_from_web_service, so the most direct fix is to make it perform better. Failing that, you can reduce the number of concurrent connections. I haven't thought this through, but I have three suggestions:
Asynchronous: Users don't view your whole page at once; they first see only the top. If your images are not all displayed at the top, you can use the jquery.lazyload extension, which delays requesting images in the invisible region until they become visible.
CSS Sprites: An image sprite is a collection of images put into a single image. If the images on your page do not change frequently, you can write some code to merge them daily.
Cache Image: You can cache the images on your server, or (better) on another server, and keep a key->value mapping where the key is derived from $item and the value is the resource location (URL).
I am not a native English speaker; I hope I made this clear and helpful to you.
I'm not an expert, but I think every echo takes time. Getting 100 images shouldn't be a problem by itself.
Also, maybe get_from_web_service($item); should be able to take an array?
$counter = 1;
$urls = array();
foreach($items as $item)
{
$urls[$counter] = get_from_web_service($item);
$counter++;
}
// and then you can echo the information?
foreach($urls as $url)
{
//echo each or use a function to better do it
//echo '<img url="url_to_my_website/get_image?id='.$url->id.'" />'
}
get_image.php :
$url = get_from_web_service($item);
header("Content-Type: image/png");
echo file_get_contents($url);
at the end, it would be mighty nice if you can just call
get_from_web_service($itemArray); //intake the array and return images
Option 3:
cache the requests to the web service
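A minimal sketch of what caching the web service requests could look like, assuming plain files are acceptable as the cache. cachedLookup() is a made-up helper, and the $fetch callable stands in for the question's get_from_web_service().

```php
<?php
// Sketch only: cachedLookup() is a hypothetical helper, and $fetch stands in
// for the question's get_from_web_service().

/**
 * Return the cached value for $key if it is younger than $ttl seconds;
 * otherwise call $fetch($key), cache its result and return it.
 */
function cachedLookup(string $key, int $ttl, callable $fetch, string $cacheDir): string
{
    // Hash the key so that arbitrary input can never escape $cacheDir.
    $file = $cacheDir . '/' . sha1($key) . '.url';
    if (is_file($file) && filemtime($file) + $ttl > time()) {
        return file_get_contents($file);
    }
    $value = (string) $fetch($key);
    file_put_contents($file, $value, LOCK_EX);
    return $value;
}
```

With something like this in place, the web service would be contacted at most once per image ID per $ttl seconds, no matter how many users load the page.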
Option one is the best option. I would also want to make sure that the images are cached on the server, so that multiple round trips are not required from the original web server for the same image.
If you're interested, this is the core of the code that I use for caching images etc. (note that a few things, like re-serving the same content back to the client, are missing):
<?php
function error404() {
    header("HTTP/1.0 404 Not Found");
    echo "Page not found.";
    exit;
}
// Maps a hash to a nested cache path such as cache/a/ab/abc/<hash>
function hexString($hash, $hashLevels=3) {
    $hexString = substr($hash, 0, $hashLevels);
    $folder = "";
    while (strlen($hexString) > 0) {
        $folder = "$hexString/$folder";
        $hexString = substr($hexString, 0, -1);
    }
    if (!file_exists('cache/' . $folder))
        mkdir('cache/' . $folder, 0777, true);
    return 'cache/' . $folder . $hash;
}
function getFile($url) {
    // true to enable caching, false to delete cache if already cached
    $cache = true;
    $key = hexString(sha1($url));
    if ($cache && file_exists($key)) {
        return file_get_contents($key);
    } elseif (!$cache && file_exists($key)) {
        unlink($key);
    }
    $defaults = array(
        CURLOPT_HEADER => FALSE,
        CURLOPT_RETURNTRANSFER => 1,
        CURLOPT_FOLLOWLOCATION => 1,
        CURLOPT_MAXCONNECTS => 15,
        CURLOPT_CONNECTTIMEOUT => 30,
        CURLOPT_TIMEOUT => 360,
        CURLOPT_USERAGENT => 'Image Download',
        CURLOPT_URL => $url
    );
    $ch = curl_init();
    curl_setopt_array($ch, $defaults);
    $data = curl_exec($ch);
    $info = curl_getinfo($ch);
    if ($info['http_code'] != 200)
        error404();
    if ($cache && strlen($data) > 20)
        file_put_contents($key, $data);
    return $data;
}
if (!isset($_GET['img']))
    error404();
// Note: check $_GET['img'] against a list of allowed hosts in real use;
// fetching arbitrary user-supplied URLs is unsafe.
$content = getFile($_GET['img']);
if ($content !== null && $content !== false) {
    // Success! Detect the actual image type rather than sending a bare "image".
    $finfo = new finfo(FILEINFO_MIME_TYPE);
    header("Content-Type: " . $finfo->buffer($content));
    echo $content;
}
Neither of the two options will resolve the server resource usage issue. Of the two, though, I would recommend option 1. The second one will delay page loading, slowing the website down and reducing your SEO ratings.
Best option for you would be something like:
foreach($items as $item) {
echo '<img src="url_to_my_website/get_image.php?id='.$item->id.'" />';
}
Then where the magic happens is get_image.php:
// $id must be validated before it is used in a file path
$local = '/path_to_local_storage/image_'.$id.'.png';
if(file_exists($local)) {
$img = file_get_contents($local);
} else {
$url = get_from_web_service($id);
$img = file_get_contents($url);
// store under the same name that the check above looks for
file_put_contents($local, $img);
}
header("Content-Type: image/png");
echo $img;
This way you will only run the request to the web service once per image, and then store it in your local space. The next time the image is requested, you will serve it from your local space, skipping the request to the web service.
Of course, considering image IDs to be unique and persistent.
Probably not the best solution, but should work pretty well for you.
As we see above, you're putting the URL of the image provided by the web service right in the <img> tag's src attribute, so one can safely assume that these URLs are not secret or confidential.
Given that, the following snippet in get_image.php will work with the least overhead possible:
$url = get_from_web_service($id);
header("Location: $url");
If you're getting a lot of subsequent requests for the same id from a given client, you can somewhat lessen the number of requests by exploiting the browser's internal cache.
header("Cache-Control: private, max-age=$seconds");
header("Expires: ".gmdate('D, d M Y H:i:s', time()+$seconds).' GMT');
Otherwise, resort to server-side caching by means of Memcached, a database, or plain files, like so:
is_dir('cache') or mkdir('cache');
$cachedDataFile = "cache/$id";
$cacheExpiryDelay = 3600; // an hour
if (is_file($cachedDataFile) && filesize($cachedDataFile)
&& filemtime($cachedDataFile) + $cacheExpiryDelay > time()) {
$url = file_get_contents($cachedDataFile);
} else {
$url = get_from_web_service($id);
file_put_contents($cachedDataFile, $url, LOCK_EX);
}
header("Cache-Control: private, max-age=$cacheExpiryDelay");
header("Expires: ".gmdate('D, d M Y H:i:s', time() + $cacheExpiryDelay).' GMT');
header("Location: $url");

How best to cache entire HTML output on the server

I have a fairly basic website, written in pure php, no framework was used, running in a basic LAMP environment.
The site dynamically generates markup based on the HTTP User Agent header, and some query string parameters. For example "itemdetail.php" would inspect the querystring param "itemid" and the User Agent header and produce some markup.
I want to cache this markup, so that the next time a device with the same User Agent and itemid in the query string tries to request the page, it simply dumps out whatever markup is in its cache.
I realise I could do this manually in php using memcache, and just write some code at the top of the page to inspect the relevant params, and either try serve from memcached or render the page and store the markup in memcached, but I was thinking it might be possible to avoid the PHP layer altogether, using something like what is described here http://httpd.apache.org/docs/2.2/caching.html
So, my question, which I realise might be vague and this post will get killed is:
What is the recommended caching implementation here? Is it indeed to use memcache at the php level, or are the apache modules sufficient to meet my needs?
Generating different pages depending on User Agents is just bad practice. You shouldn't do that.
If you want to cache entire pages because your website is slow, the problem probably has to be searched in your code.
On-topic: Write a simple function that hashes the uri being served with a small footprint hash function (md5, sha1,...)
e.g.
<?php
$hash = md5('itemdetail.php-'.$itemid);
if ( file_exists('cache/'.$hash.'.html') ) {
echo file_get_contents('cache/'.$hash.'.html');
die();
}
and then at the end of your script, save the result to 'cache/'.$hash.'.html'.
You can of course use a different extension or folder.
If you want to cache without using PHP, take a look at Varnish. Or the other example posted here.
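For what it's worth, the "check at the top, save at the end" idea can be sketched with output buffering as below. The function names and cache layout are assumptions, and the User Agent is folded into the hash only because the question says the markup depends on it.

```php
<?php
// Sketch only: the function names and cache layout are assumptions.

/** Build a cache file path from everything the markup depends on. */
function pageCacheFile(string $cacheDir, string $script, string $itemId, string $userAgent): string
{
    return $cacheDir . '/' . md5($script . '-' . $itemId . '-' . $userAgent) . '.html';
}

/** Return cached markup if present; otherwise render it, store it and return it. */
function cachedPage(string $cacheFile, callable $render): string
{
    if (is_file($cacheFile)) {
        return file_get_contents($cacheFile);
    }
    ob_start();              // capture everything the page echoes
    $render();
    $html = ob_get_clean();
    file_put_contents($cacheFile, $html, LOCK_EX);
    return $html;
}
```

The existing page code would go inside the $render callable and the result would simply be echoed. Note that this sketch never expires anything, so some form of invalidation (deleting the file when an item changes, or checking filemtime) would still be needed.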
If you are familiar with OpenCart at all, here is something I wrote to do just this. Hopefully you will get the idea despite the possibly unfamiliar context.
ob_start();
$enableCaching = false; // Boolean flag
$route = !isset($_GET['route']) ? 'home' : str_replace("/",'-',$_GET['route']);
$cacheFile = DIR_CACHE . $route . '.' . md5($_SERVER['QUERY_STRING']) . ".cache.tpl";
if ($enableCaching !== false && in_array($_GET['route'], $cachePages) && file_exists($cacheFile) ||
$enableCaching !== false && file_exists($cacheFile) && !isset($_GET['route'])) {
/**
* This block of code will output the contents of the cache file.
*/
require ($cacheFile);
}
else {
/**
* Cache file doesn't exist, process the request
*/
$response->output();
if($enableCaching !== false && in_array($_GET['route'], $cachePages) ||
$enableCaching !== false && !isset($_GET['route'])){
file_put_contents($cacheFile, str_replace(array("\n","\r","\t"),'', str_replace(" "," ",ob_get_contents())));
}
}
Basically, create a variable holding a unique file name based on the route and query string.
Create that file, writing all HTML output to it.
Then, when it comes to processing a request, you can check whether the unique cache file exists and just send that instead of processing the request.
Use the memcached library.
You'll have to install it first; memcached then provides an in-memory caching system for PHP.

Status report during form process

I created a little script that imports wordpress posts from an xml file:
if(isset($_POST['wiki_import_posted'])) {
// Get uploaded file
$file = file_get_contents($_FILES['xml']['tmp_name']);
$file = str_replace('&', '&', $file);
// Get and parse XML
$data = new SimpleXMLElement( $file , LIBXML_NOCDATA);
foreach($data->RECORD as $key => $item) {
// Build post array
$post = array(
'post_title' => $item->title,
........
);
// Insert new post
$id = wp_insert_post( $post );
}
}
The problem is that my XML file is really big, and when I submit the form, the browser just hangs for a couple of minutes.
Is it possible to display some messages during the import, like displaying a dot after every item is imported?
Unfortunately, no, not easily. Especially if you're building this on top of the WP framework you'll find it not worth your while at all. When you're interacting with a PHP script you are sending a request and awaiting a response. However long it takes that PHP script to finish processing and start sending output is how long it usually takes the client to start seeing a response.
There are a few things to consider if what you want is for output to start showing as soon as possible (i.e. as soon as the first echo or output statement is reached).
Turn off output buffering so that output begins sending immediately.
Output whatever you want inside the loop to indicate the progress you wish to know about.
Note that if you're doing this with an AJAX request content may not be ready immediately to transport to the DOM via your XMLHttpRequest object. Also note that some browsers do their own buffering before content can be ready for the user to display (like IE for example).
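As a rough illustration of those two points, the loop could report after every item and flush straight away. This is only a sketch: importWithProgress() and echoDot() are made-up names, $insert stands in for the wp_insert_post() call, and whether the dots actually reach the browser early still depends on server and browser buffering.

```php
<?php
// Sketch only: importWithProgress() and echoDot() are hypothetical helpers;
// $insert stands in for the wp_insert_post() call from the question.

/** Run $insert on every item, calling $report after each one. Returns the count. */
function importWithProgress(iterable $items, callable $insert, callable $report): int
{
    $count = 0;
    foreach ($items as $item) {
        $insert($item);
        $count++;
        $report($count);
    }
    return $count;
}

/** Reporter that prints a dot and pushes it towards the client straight away. */
function echoDot(int $count): void
{
    echo '.';
    if (ob_get_level() > 0) {
        ob_flush(); // empty PHP's own output buffer, if one is active
    }
    flush();        // ask the web server to send what it has so far
}
```

In the import script, 'echoDot' would be passed as the $report argument, with the post-building code from the question inside $insert.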
Some suggestions you may want to look into to speed up your script, however:
Why are you doing str_replace('&','&',$file) on a large file? You realize that has a cost with no benefit, right? You've accomplished nothing, and if you meant to replace the HTML entity &amp; then you probably have some of your logic very wrong. Encoding is something you want to let the XML parser handle.
You can use curl_multi instead of file_get_contents to do multiple HTTP requests concurrently to save time if you are transferring a lot of files. It will be much faster since it's non-blocking I/O.
You should use DOMDocument instead of SimpleXML and a DOMXPath query can get you your array much faster than what you're currently doing. It's a much nicer interface than SimpleXML and I always recommend it above SimpleXML since in most cases SimpleXML makes things incredibly difficult to do and for no good reason. Don't let the name fool you.
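A small sketch of the DOMDocument / DOMXPath approach, assuming the RECORD and title element names implied by the question's loop (the root element name in the usage note is made up):

```php
<?php
// Sketch only: assumes the <RECORD>/<title> layout implied by the question's
// loop; extractTitles() is a hypothetical helper name.

/** Return the text of every RECORD/title element in the given XML string. */
function extractTitles(string $xmlString): array
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xmlString)) {
        return array(); // not well-formed XML
    }
    $xpath = new DOMXPath($doc);
    $titles = array();
    foreach ($xpath->query('//RECORD/title') as $node) {
        $titles[] = $node->textContent; // entities are already decoded by the parser
    }
    return $titles;
}
```

For example, a document like <RECORDS><RECORD><title>First</title></RECORD></RECORDS> would yield a one-element array containing "First". The parser decodes entities such as &amp;amp; on its own, which is why no str_replace is needed beforehand.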

Loading Javascript through PHP

From a tutorial I read on Sitepoint, I learned that I could load JS files through PHP (it was a comment, anyway). The code for this was in this form:
<script src="js.php?script1=jquery.js&script2=main.js" />
The purpose of using PHP was to reduce the number of HTTP requests for JS files. But from the markup above, it seems to me that there are still going to be the same number of requests as if I had written two tags for the JS files (I could be wrong, that's why I'm asking).
The question is how is the PHP code supposed to be written and what is/are the advantage(s) of this approach over the 'normal' method?
The original poster presumably meant that
<script src="js.php?script1=jquery.js&script2=main.js" />
will cause fewer HTTP requests than
<script src="jquery.js" />
<script src="main.js" />
That is because js.php will read all script names from GET parameters and then print it out to a single file. This means that there's only one roundtrip to the server to get all scripts.
js.php would probably be implemented like this:
<?php
$script1 = $_GET['script1'];
$script2 = $_GET['script2'];
echo file_get_contents($script1); // Load the content of jquery.js and print it to browser
echo file_get_contents($script2); // Load the content of main.js and print it to browser
Note that this may not be an optimal solution if only a small number of scripts is required. The main issue is that a web browser does not load an infinite number of scripts in parallel from the same domain.
You will need to implement caching to avoid loading and concatenating all your scripts on every request. Loading and combining all scripts on every request will eat a lot of CPU.
IMO, the best way to do this is to combine and minify all script files into a big one before deploying your website, and then reference that file. This way, the client just makes one roundtrip to the server, and the server does not have any extra load upon each request.
Please note that the PHP solution provided is by no means a good approach, it's just a simple demonstration of the procedure.
The main advantage of this approach is that there is only a single request between the browser and server.
Once the server receives the request, the PHP script combines the javascript files and spits the results out.
Building a PHP script that simply combines JS files is not at all difficult. You simply include the JS files and send the appropriate content-type header.
When it gets more difficult is based on whether or not you want to worry about caching.
I recommend you check out minify.
<script src="js.php?script1=jquery.js&script2=main.js" />
That's:
invalid (ampersands have to be encoded)
hard to expand (using script[]= would make PHP treat it as an array you can loop over)
not HTML compatible (always use <script></script>, never <script />)
The purpose of using PHP was to reduce the number of HTTP requests for JS files. But from the markup above, it seems to me that there are still going to be the same number of requests as if I had written two tags for the JS files (I could be wrong, that's why I'm asking).
You're wrong. The browser makes a single request. The server makes a single response. It just digs around in multiple files to construct it.
The question is how is the PHP code supposed to be written
The steps are listed in this answer
and what is/are the advantage(s) of this approach over the 'normal' method?
You get a single request and response, so you avoid the overhead of making multiple HTTP requests.
You lose the benefits of the generally sane cache control headers that servers send for static files, so you have to set up suitable headers in your script.
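A minimal sketch of what "suitable headers" could mean for the combined script is below: an ETag derived from the content, plus a 304 response when the browser already has that version. The helper names and the one-hour max-age are assumptions.

```php
<?php
// Sketch only: makeEtag(), isNotModified() and sendScript() are hypothetical
// helper names; the one-hour default max-age is an arbitrary choice.

/** Build an ETag from the combined script contents. */
function makeEtag(string $contents): string
{
    return '"' . sha1($contents) . '"';
}

/** True when the client's If-None-Match header already lists this ETag. */
function isNotModified(string $etag, ?string $ifNoneMatch): bool
{
    if ($ifNoneMatch === null) {
        return false;
    }
    foreach (explode(',', $ifNoneMatch) as $candidate) {
        $candidate = trim($candidate);
        if (strpos($candidate, 'W/') === 0) {
            $candidate = substr($candidate, 2); // weak comparison ignores the W/ prefix
        }
        if ($candidate === '*' || $candidate === $etag) {
            return true;
        }
    }
    return false;
}

/** Send the combined script with cache headers, or a bare 304 if unchanged. */
function sendScript(string $contents, ?string $ifNoneMatch, int $maxAge = 3600): void
{
    $etag = makeEtag($contents);
    header('Content-Type: application/javascript');
    header("Cache-Control: public, max-age=$maxAge");
    header("ETag: $etag");
    if (isNotModified($etag, $ifNoneMatch)) {
        http_response_code(304);
        return;
    }
    echo $contents;
}
```

In js.php the second argument would typically come from $_SERVER['HTTP_IF_NONE_MATCH'] when it is set, and null otherwise.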
You can do this like this:
The concept is quite easy, but you may make it a bit more advanced
Step 1: merging the file
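<?php
$pathto = __DIR__; // assumption: the directory that holds the script files
$scripts = isset($_GET['script']) ? (array) $_GET['script'] : array();
$contents = "";
foreach ($scripts as $script)
{
// validate the $script here to prevent inclusion of arbitrary files
$contents .= file_get_contents($pathto . "/" . $script);
}
// post processing here
// eg. jsmin, google closure, etc.
header("Content-Type: application/javascript");
echo $contents;
?>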
usage:
<script src="js.php?script[]=jquery.js&script[]=otherfile.js" type="text/javascript"></script>
Step 2: caching
<?php
function cacheScripts($scripts, $outputdir)
{
$pathto = $outputdir; // assumption: the source scripts live in the output directory
// the cache file name is derived from the list of requested scripts
$filename = sha1(join("-", $scripts)) . ".js";
$path = $outputdir . "/" . $filename;
if (file_exists($path))
{
return $filename;
}
$contents = "";
foreach ($scripts as $script)
{
// validate the $script here to prevent inclusion of arbitrary files
$contents .= file_get_contents($pathto . "/" . $script);
}
// post processing here
// eg. jsmin, google closure, etc.
file_put_contents($path, $contents);
return $filename;
}
?>
<script src="/js/<?php echo cacheScripts(array('jquery.js', 'myscript.js'),"/path/to/js/dir"); ?>" type="text/javascript"></script>
This makes it a bit more advanced. Please note, this is semi-pseudo code to explain the concepts. In practice you will need to do more error checking and you need to do some cache invalidation.
To do this in a more managed and automated way, there's Assetic (if you can use PHP 5.3):
https://github.com/kriswallsmith/assetic
(Which more or less does this, but much better)
Assetic
Documentation
https://github.com/kriswallsmith/assetic/blob/master/README.md
The workflow will be something along the lines of this:
use Assetic\Asset\AssetCollection;
use Assetic\Asset\FileAsset;
use Assetic\Asset\GlobAsset;
$js = new AssetCollection(array(
new GlobAsset('/path/to/js/*'),
new FileAsset('/path/to/another.js'),
));
// the code is merged when the asset is dumped
echo $js->dump();
There is a lot of support for many formats:
js
css
lots of minifiers and optimizers (css, js, png, etc.)
Support for sass, http://sass-lang.com/
Explaining everything is a bit outside the scope of this question. But feel free to open a new question!
PHP will simply concatenate the two script files and sends only 1 script with the contents of both files, so you will only have 1 request to the server.
Using this method, there will still be the same number of disk IO requests as if you had not used the PHP method. However, in the case of a web application, disk IO on the server is rarely the bottleneck; the network is. What this allows you to do is reduce the overhead of requesting the files from the server over the network via HTTP (that is, reduce the number of messages sent over the network). The PHP script outputs the concatenation of all of the requested files, so you get all of your scripts in one HTTP request rather than several.
Looking at the parameters it's passing to js.php it can load two javascript files (or any number for that matter) in one request. It would just look at its parameters (script1, script2, scriptN) and load them all in one go as opposed to loading them one by one with your normal script directive.
The PHP file could also do other things like minimizing before outputting. Although it's probably not a good idea to minimize every request on the fly.
The way the PHP code would be written is that it would look at the script parameters and just load the files from a given directory. However, it's important to note that you should check the file type and/or location before loading. You don't want to give people a backdoor through which they can read all the files on your server.
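One possible way to do that check is sketched below: accept only plain .js file names and make sure the resolved path still sits inside the script directory. safeScriptPath() is a made-up helper, and the allowed character set is an assumption you may want to tighten or loosen.

```php
<?php
// Sketch only: safeScriptPath() is a hypothetical helper.

/**
 * Return the real path of $name inside $baseDir, or null if the name is not
 * a plain ".js" file name or resolves to somewhere outside $baseDir.
 */
function safeScriptPath(string $baseDir, string $name): ?string
{
    // Allow only simple file names: no slashes, no leading dot, must end in .js
    if (!preg_match('/^[A-Za-z0-9_-][A-Za-z0-9._-]*\.js\z/', $name)) {
        return null;
    }
    $base = realpath($baseDir);
    $path = realpath($baseDir . '/' . $name);
    if ($base === false || $path === false) {
        return null; // directory or file does not exist
    }
    // Guard against symlinks that point outside the base directory.
    if (strpos($path, $base . DIRECTORY_SEPARATOR) !== 0) {
        return null;
    }
    return $path;
}
```

js.php would then skip (or reject the whole request for) any script name for which this returns null, and only pass the returned path to file_get_contents.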

RSS generator with caching function

Do you happen to know any good RSS generator script with a caching function? All the scripts I have found on the net so far don't support caching. I need the content of the RSS feed to be generated automatically from the database at a specified interval.
Thanks in advance
First, to add caching to the script, it seems like it wouldn't be too hard to put Zend_Feed and Zend_Cache together - or just wrap your current generation script with Zend_Cache.
Just set up the cache with your lifetime:
$frontendOptions = array(
'lifetime' => 7200, // cache lifetime of 2 hours
'automatic_serialization' => true
);
Then check if the cache is still valid:
if(!$feed = $cache->load('myfeed')) {
//generate feed
$cache->save($feed, 'myfeed');
}
//output $feed
I don't know how you form your RSS, but you can import an array structure to Zend_Feed:
$rssFeedFromArray = Zend_Feed::importArray($array, 'rss');
Of course the best way may be to just use your current feed generator and save the output to a file. Use that file as the RSS feed, then use cron/web hooks/queue/whatever to generate the static file. That would be simpler, and use less resources, than having the generation script do the caching.
//feedGen.php
//may require some output buffering if the feed generator outputs directly
$output = $myFeedGenerator->output();
file_put_contents('feed.rss', $output);
Now the feed link is /feed.rss, and you just run feedGen.php whenever it needs to be refreshed. Serving the static file (not even parsed by php) means less for your server to do.
