RSS generator with caching function - php

Do you happen to know any good RSS generator script with a caching function? All the scripts I have found on the net so far don't support caching! I need the content of the RSS feed to be generated automatically from a database at a specified interval.
Thanks in advance

First, to add caching to the script, it seems like it wouldn't be too hard to put Zend_Feed and Zend_Cache together - or just wrap your current generation script with Zend_Cache.
Just set up the cache with your lifetime:
$frontendOptions = array(
'lifetime' => 7200, // cache lifetime of 2 hours
'automatic_serialization' => true
);
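To get the $cache object used below, pass those options to Zend_Cache::factory. A minimal sketch (the File backend and its cache_dir are assumptions here, use whatever backend suits your setup):
$backendOptions = array(
    'cache_dir' => './tmp/' // assumed cache directory, must be writable
);
// Core frontend + File backend; swap in Memcached etc. if you prefer
$cache = Zend_Cache::factory('Core', 'File', $frontendOptions, $backendOptions);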
Then check if the cache is still valid:
if(!$feed = $cache->load('myfeed')) {
//generate feed
$cache->save($feed, 'myfeed');
}
//output $feed
I don't know how you form your RSS, but you can import an array structure to Zend_Feed:
$rssFeedFromArray = Zend_Feed::importArray($array, 'rss');
Of course, the best way may be to just use your current feed generator and save the output to a file. Use that file as the RSS feed, then use cron/web hooks/queue/whatever to regenerate the static file. That would be simpler, and use fewer resources, than having the generation script do the caching.
//feedGen.php
//may require some output buffering if the feed generator outputs directly
$output = $myFeedGenerator->output();
file_put_contents('feed.rss', $output);
Now the feed link is /feed.rss, and you just run feedGen.php whenever it needs to be refreshed. Serving the static file (not even parsed by php) means less for your server to do.

Related

HTML/PHP/SQL - Save a HTML page

Currently I'm working on an application and I have a problem. I want to display an HTML page, but the problem is: there are a lot of data/queries behind the page. Is it possible to save the HTML page with the data every morning and then display the saved HTML page? I don't want to load the data every time the page loads because the loading takes really long.
I'm working with Zend Framework and Oracle.
You can use either local storage or session storage for this.
HTML5 web storage provides two objects for storing data on the client:
window.localStorage - stores data with no expiration date
window.sessionStorage - stores data for one session (data is lost when the browser tab is closed)
Use this link to learn more (https://www.w3schools.com/html/html5_webstorage.asp)
You can use GitHub Pages: write a script in any language that pushes the generated data to the GitHub page at your chosen time, and you're done. Your HTML page is dynamic in content but acts as a static page, so it loads in no time.
I think you want to use frontend cache.
There are at least 3 versions of Zend Framework, but the caching is very similar in all of them.
For Zend 1 there is some theory https://framework.zend.com/manual/1.12/en/zend.cache.theory.html#zend.cache.clean
The best way is to set a frontend cache on your routes.
For that, use this in your router definition file
addRoute($router, [
    'url' => "[your-path]",
    'defaults' => [
        'controller' => '[controller-name]',
        'action' => '[action-name]',
        'cache' => [TIME-OF-CACHE] // 2 hours = 7200
    ]
]);
Then, if you really want to delete this cache every morning, you should do it manually, with some cron script.
For that, try to use this
Zend Framework Clearing Cache
Here is the solution:
You need a cron job that runs the script (the file that renders the HTML) every morning (a sample crontab entry follows the code below)
Add ob_start() to the beginning of that file
Save the buffer into a file :)
<?php
ob_start();
// Display that HTML file here. You don't need to change anything.
// Add this to the end of your file to output everything into a file.
$out = ob_get_contents();
ob_end_clean();
file_put_contents('cached.html', $out);
?>
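For the cron part, a crontab entry along these lines (the PHP binary and script paths are placeholders) would regenerate cached.html every morning:
# run the page script at 06:00 every day
0 6 * * * /usr/bin/php /path/to/your-page.php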

Status report during form process

I created a little script that imports WordPress posts from an XML file:
if(isset($_POST['wiki_import_posted'])) {
    // Get uploaded file
    $file = file_get_contents($_FILES['xml']['tmp_name']);
    $file = str_replace('&', '&', $file);
    // Get and parse XML
    $data = new SimpleXMLElement( $file , LIBXML_NOCDATA);
    foreach($data->RECORD as $key => $item) {
        // Build post array
        $post = array(
            'post_title' => $item->title,
            ........
        );
        // Insert new post
        $id = wp_insert_post( $post );
    }
}
The problem is that my XML file is really big, and when I submit the form, the browser just hangs for a couple of minutes.
Is it possible to display some messages during the import, like displaying a dot after every item is imported?
Unfortunately, no, not easily. Especially if you're building this on top of the WP framework you'll find it not worth your while at all. When you're interacting with a PHP script you are sending a request and awaiting a response. However long it takes that PHP script to finish processing and start sending output is how long it usually takes the client to start seeing a response.
There are a few things to consider if what you want is for output to start showing as soon as possible (i.e. as soon as the first echo or output statement is reached).
Turn off output buffering so that output begins sending immediately.
Inside the loop, output whatever you want to indicate the progress you wish to know about (see the sketch below).
Note that if you're doing this with an AJAX request, the content may not be ready immediately to transport to the DOM via your XMLHttpRequest object. Also note that some browsers do their own buffering before content is displayed to the user (IE, for example).
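As a rough sketch of the second point (these are standard PHP functions, but whether the dots actually reach the browser right away also depends on your web server's own buffering):
// inside the import loop, right after wp_insert_post()
echo '.';                 // one dot per imported item
if (ob_get_level() > 0) {
    ob_flush();           // flush PHP's output buffer, if one is active
}
flush();                  // push the output towards the web server / browser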
Some suggestions you may want to look into to speed up your script, however:
Why are you doing str_replace('&', '&', $file) on a large file? You realize that has cost with no benefit, right? You've accomplished nothing, and if you meant to replace the HTML entity & then you probably have some of your logic very wrong. Encoding is something you want to let the XML parser handle.
You can use curl_multi instead of file_get_contents to do multiple HTTP requests concurrently and save time if you are transferring a lot of files. It will be much faster since it's non-blocking I/O.
You should use DOMDocument instead of SimpleXML and a DOMXPath query can get you your array much faster than what you're currently doing. It's a much nicer interface than SimpleXML and I always recommend it above SimpleXML since in most cases SimpleXML makes things incredibly difficult to do and for no good reason. Don't let the name fool you.
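A rough sketch of that approach, keeping the $file string and <RECORD> structure from the question (the field mapping is abbreviated):
$doc = new DOMDocument();
$doc->loadXML($file, LIBXML_NOCDATA);
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//RECORD') as $record) {
    $post = array(
        'post_title' => $xpath->evaluate('string(title)', $record),
        // ... map the remaining fields the same way
    );
    $id = wp_insert_post($post);
}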

Cache only part of a page in PHP

Is it possible to cache only a specific part of a page in PHP, or the output of a specific section of code in the PHP script? It seems when I try to cache a particular page, it caches the whole page, which is not what I want; some of the content on my page should be updated with every page load, while other parts (such as a dropdown list with data from a database) only need to be updated every hour or so.
If you are talking about caching by the browser (and any proxies it might interact with), then no. Caching only takes place on complete HTTP resources (i.e. on a per URI basis).
Within your own application, you can cache data so you don't need to (for example) hit the database on every request. Memcached is a popular way to do this.
Zend_Cache
I would probably use Zend Framework's Zend_Cache library for this.
You can just use this component without needing to use the entire framework.
Step over to Zend Framework Download Page and grab the latest.
After you have downloaded the core files, you will need to include Zend_Cache in your project.
Zend_Cache docs.
Have you decided how you want to cache your data? Are you using the file system, or Memcached? Once you know which you are going to use, you need a specific Zend_Cache backend.
Zend_Cache Backends / Zend_Cache Frontends
You need to use a backend (where and how the cached data is stored) and
You need to use a frontend (what you actually want to cache, e.g. an output buffer or function results)
Backend documentation: Zend_Cache Backends
Frontend documentation: Zend_Cache Frontends
So you would do something like this...
<?php
// configure caching backend strategy
$backend = new Zend_Cache_Backend_Memcached(
    array(
        'servers' => array( array(
            'host' => '127.0.0.1',
            'port' => '11211'
        ) ),
        'compression' => true
    )
);
// configure caching frontend strategy
$frontend = new Zend_Cache_Frontend_Output(
    array(
        'caching' => true,
        'cache_id_prefix' => 'myApp',
        'write_control' => true,
        'automatic_serialization' => true,
        'ignore_user_abort' => true
    )
);
// build a caching object
$cache = Zend_Cache::factory( $frontend, $backend );
This would create a cache which makes use of the Zend_Cache_Frontend_Output caching mechanisms.
Using Zend_Cache_Frontend_Output, which is what you want here, is simple. Instead of the Core frontend you would use the Output frontend; the options you pass are identical. Then to use it you would:
Zend_Cache_Frontend_Output - Usage
// if it is a cache miss, output buffering is triggered
if (!($cache->start('mypage'))) {
    // output everything as usual
    echo 'Hello world! ';
    echo 'This is cached ('.time().') ';
    $cache->end(); // output buffering ends
}
echo 'This is never cached ('.time().').';
Useful Blog: http://perevodik.net/en/posts/14/
Sorry this question took longer to write than expected and lots of answers have been written I see!
You could roll your own caching with ob_start(), ob_end_flush() and similar functions. Gather the desired output, dump it into some file or database, and read it back later if the conditions are the same. I usually build an MD5 sum of the state and restore the cached output later.
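A minimal sketch of that idea (the cache file path, lifetime and render_dropdown() helper are placeholders for illustration):
$cacheFile = '/tmp/dropdown_' . md5('dropdown-state') . '.html'; // key built from whatever state matters
$lifetime  = 3600; // one hour

if (file_exists($cacheFile) && time() - filemtime($cacheFile) < $lifetime) {
    echo file_get_contents($cacheFile);   // serve the cached fragment
} else {
    ob_start();
    render_dropdown();                    // hypothetical function that queries the DB and echoes the markup
    $html = ob_get_clean();
    file_put_contents($cacheFile, $html);
    echo $html;
}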
It depends on what caching and view technologies you are using. Generally speaking, yes, you can do something like this:
// if it is a cache miss, output buffering is triggered
if (!($cache->start('mypage'))) {
    // output everything as usual
    echo 'Hello world! ';
    echo 'This is cached ('.time().') ';
    $cache->end(); // output buffering ends
}
echo 'This is never cached ('.time().').';
taken from Zend_Cache documentation.
Otherwise in your example you can always make a function which returns the dropdown list and implement the cache mechanism inside that function. In this way your page is not even aware of caching.

PHP and sitemap.xml

I am planning to build a script that will create a sitemap.xml for my site, say, every day (cron will execute the script). Should I just build the XML string and save it as a file? Or would there be some benefit to using one of PHP's classes/functions/etc. for XML?
If I should be using some sort of PHP class/function/etc., what should it be?
For simple XML it is often easier to just output the string. But the more complex your document gets, the more benefit you will get from using an XML library (either those included with PHP or a third party script) as it will help you to output correct XML.
For a sitemap, you would probably be best just writing the string.
It's a simple format with almost no structure. Just output it as a string.
Unless you need to read/consume your own XML sitemap files, just output to string like others said. The XML sitemaps format is fairly simple. If you intend to support the subtypes as well then... Well I would still do it string based.
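As a rough illustration of the string-based approach (the URLs and output path are placeholders):
$urls = array('https://example.com/', 'https://example.com/about'); // e.g. pulled from your database
$xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
$xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
foreach ($urls as $url) {
    $xml .= '  <url><loc>' . htmlspecialchars($url) . '</loc></url>' . "\n";
}
$xml .= '</urlset>';
file_put_contents('sitemap.xml', $xml);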
I would suggest setting up a cron job to put all of the URLs in an array and store it as a cache. Then you could use this Kohana module to generate the sitemap.xml on the fly.
// this assumes you have already installed the module
$sitemap = new Sitemap;
// this assumes you put an array of all URLs in a cache named 'sitemap'
foreach($cache->get('sitemap') as $loc)
{
    // New basic sitemap.
    $url = new Sitemap_URL;
    // Set arguments.
    $url->set_loc($loc)
        ->set_last_mod(1276800492)
        ->set_change_frequency('daily')
        ->set_priority(1);
    // Add it to sitemap.
    $sitemap->add($url);
}
// Render the output.
$response = $sitemap->render();
// Cache the output for 24 hours.
$cache->set('sitemap', $response, 86400);
// Output the sitemap.
echo $response;

Comparing XML documents for changes in PHP

Currently I'm using PHP to load multiple XML files from around the web (non-local) using simplexml_load_file(). This, as you can imagine, is quite a clunky process and is slowing load time significantly (7 seconds to load 7 files), and there could possibly be more files to load. These files don't change often, but changes should be displayed on the page as soon as they are made.
One idea I had was to cache a version of each feed and the html output I generate from that feed in my DB. Then, each time the user loads the page, the feeds would be compared; if they are different I would run my existing code, generate the HTML, output it, and save it to the DB. However, if it is the same, I could simply output the cached HTML.
My two concerns with this are:
Security: If I am storing a copy of an XML file, could this pose a security threat, seeing as I don't control the content of that file?
Speed: The main goal here is to increase the speed of the overall page load. Would the process described above increase the speed, or would it just bog down the server with more to do? Thanks for your help!
How about having a cron job crawl through every external XML source, say, hourly or quarter-hourly and update it if necessary?
It wouldn't be in 100% real time, but would take the load off your web page - that would always be using cached files. I don't think there is a reliable way of polling external sources for updates other than actually downloading the file (in theory, it should be possible to get the correct cache headers, but I wouldn't rely on them being configured correctly.)
Security: If I am storing a copy of an XML file, could this pose a security threat, seeing as I don't control the content of that file?
Hardly. To be totally sure, store the cached XML files outside the web root. Any threat that remains is then the same as if you were passing the stream through live.
One idea I had was to cache a version of each feed and the html output I generate from that feed in my DB. Then, each time the user loads the page, the feeds would be compared; if they are different I would run my existing code, generate the HTML, output it, and save it to the DB. However, if it is the same, I could simply output the cached HTML.
Rather than caching the XML file yourself, you should set the If-None-Match or If-Modified-Since fields in the request header. This way you can check to see if the files have changed without necessarily downloading them.
This can be done by setting a stream context for libxml before running simplexml_load_file(). If the file hasn't changed, you'll get a 304 Not Modified response, and simplexml_load_file will fail.
You could also use stream_context_get_default to set the general stream context, then retrieve the XML file into a string with file_get_contents and pass it to simplexml_load_string() (a sketch of that second way follows the class example below).
Here's an example of the first way:
class CachedXml {
    public $element, $url;
    private $mod_date, $etag;

    public function __construct($url){
        $this->url = $url;
        $this->element = NULL;
        $this->mod_date = FALSE;
        $this->etag = FALSE;
    }

    public function updateXml(){
        if($this->mod_date || $this->etag){
            $opts = array(
                'http' => array(
                    'header' => "If-Modified-Since: $this->mod_date\r\n" .
                                "If-None-Match: $this->etag\r\n"
                )
            );
            $context = stream_context_create($opts);
            libxml_set_streams_context($context);
        }
        // @ suppresses the warning simplexml_load_file emits on a 304 response
        if($attempt = @simplexml_load_file($this->url)){
            $this->element = $attempt;
            $headers = get_headers($this->url, 1);
            $this->mod_date = $headers['Last-Modified'];
            $this->etag = $headers['ETag'];
            return TRUE;
        }
        return FALSE;
    }
}
$bob = new CachedXml('http://example.com/xml/test.xml');
if($bob->updateXml()){
    echo "Bob was just updated.<br />";
    echo " Bob's name is " . $bob->element->getName() . ".<br />";
}
else{
    echo "Bob was not updated.<br />";
}
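And a minimal sketch of the second way mentioned above (the URL is a placeholder; $mod_date and $etag would be remembered from the previous request, e.g. via get_headers() as in the class):
$opts = array(
    'http' => array(
        'header' => "If-Modified-Since: $mod_date\r\n" .
                    "If-None-Match: $etag\r\n"
    )
);
stream_context_get_default($opts); // apply these options to the default context

$xmlString = @file_get_contents('http://example.com/xml/test.xml');
if ($xmlString !== false && $xmlString !== '') {
    // the feed changed: parse the fresh copy
    $element = simplexml_load_string($xmlString);
} else {
    // 304 Not Modified (or an error): keep using the cached version
}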
