Super-simple static-file-based (HTML) PHP site cache

I have a website that basically only displays things, without any forms or POST/GET handling.
This website is PHP based and hosted on shared hosting. It rarely changes.
I would like to enable caching for this website.
It's shared hosting, so I need a solution that:
does not use Memcached
does not require moving my website to a VPS
does not use APC or similar extensions
So basically what I would like to accomplish is to cache every subpage as HTML, have PHP serve the cached HTML version of the current subpage for 5 minutes, and refresh the cache after those 5 minutes.
I've been looking around the internet for a while, and there are some tutorials and frameworks that support this kind of cache.
But what I need is just one good library that is extremely easy to use.
I imagine it working like this:
<?php
if (current_site_cache_is_valid())
{
    display_cached_version();
    die;
}
// ... my website rendering code
?>
It sounds simple, but I hope some fellow developer has already written a library of this kind. Do you know of a ready-to-use solution that doesn't take much time to implement?

This is how I normally do this, however I don't know your URL design nor your directory / file layout.
I do this with .htaccess and mod_rewrite.
The web server checks whether a cached HTML file exists and, if so, delivers it. You can also check its age.
If it's too old or does not exist, your PHP script(s) is started. At the beginning of your script you start the output buffer. At the end of your script, you obtain the output buffer, place the content into the cache file, and then output it.
The benefit of this solution is that Apache will deliver static files when they exist, with no need to invoke a PHP process. If you do it all within PHP itself, you won't have that benefit.
I would even go a step further and run a cron job that removes older cache files instead of doing a time check inside the .htaccess. That done, you can make the rewrite less complex: prefer a .php.cached file over the .php file.
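The cron-job idea could be sketched roughly as follows. This is only an illustration: the function name purge_stale_cache, the "*.cached" suffix and the flat cache directory are assumptions, not part of the setup described above.

```php
<?php
// Hypothetical helper: delete cache files older than $maxAge seconds.
// Assumes all cache files sit in one directory and end in ".cached".
function purge_stale_cache(string $cacheDir, int $maxAge, ?int $now = null): int
{
    $now = $now ?? time();
    $removed = 0;
    // glob() returns false on error, so fall back to an empty list.
    foreach (glob(rtrim($cacheDir, '/') . '/*.cached') ?: [] as $file) {
        if (is_file($file) && $now - filemtime($file) > $maxAge) {
            if (unlink($file)) {
                $removed++;
            }
        }
    }
    return $removed; // number of files actually deleted
}
```

A cron entry would then run a small script that calls purge_stale_cache() with the cache directory and 300 seconds, so the rewrite rule only has to test whether the cached file exists.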

I have a simple algorithm for HTML caching, predicated on the following conditions:
The user is a guest (logged-on users have a blog_user cookie set)
The request is a GET that contains no request parameters
An HTML cache version of the file exists
When all three hold, an .htaccess rewrite rule kicks in, mapping the request to a cached file. Anything else is assumed to be context-specific and therefore not cacheable. Note that I use Wikipedia-style URI mapping for my blog, so /article-23 gets mapped to /index.php?page=article-23 when not cached.
I use a single .htaccess file in my DOCUMENT_ROOT directory, and here is the relevant extract. It's the third rewrite rule that does what you want. Any script which generates cacheable output wraps it in an ob_start() / ob_get_clean() pair and writes out the HTML cache file (though this is all handled by my templating engine). Updates also flush the HTML cache directory as necessary.
RewriteEngine on
RewriteBase /
# ...
# Handle blog index
RewriteRule ^blog/$ blog/index [skip=1]
# If the URI maps to a file that exists then stop. This will kill endless loops
RewriteCond %{REQUEST_FILENAME} -f
RewriteRule ^blog/.* - [last]
# If the request is HTML cacheable (a GET to a specific list, with no query params)
# the user is not logged on and the HTML cache file exists then use it instead of executing PHP
RewriteCond %{HTTP_COOKIE} !blog_user
RewriteCond %{REQUEST_METHOD}%{QUERY_STRING} =GET [nocase]
RewriteCond %{DOCUMENT_ROOT}/blog/html_cache/$1.html -f
RewriteRule ^blog/(article-\d+|index|sitemap.xml|search-\w+|rss-[0-9a-z]*)$ \
blog/html_cache/$1.html [last]
# Anything else relating to the blog pass to index.php
RewriteRule blog/(.*) blog/index.php?page=$1 [qsappend,last]
Hope this helps. My blog describes this in more detail. :-)
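The ob_start() / ob_get_clean() wrapping mentioned above might look something like this. It is a sketch under assumptions: render_and_cache is a made-up name, and the write-to-temp-then-rename step is my own addition so that the web server is unlikely to pick up a half-written file.

```php
<?php
// Hypothetical helper: capture whatever $render prints and store it as a static file.
function render_and_cache(callable $render, string $cacheFile): string
{
    ob_start();
    try {
        $render();
    } finally {
        // Always close the buffer, even if rendering throws.
        $html = ob_get_clean();
    }
    // Write to a temporary name first, then rename it over the final name,
    // so readers see either the old file or the complete new one.
    $tmp = $cacheFile . '.' . uniqid('tmp', true);
    if (file_put_contents($tmp, $html) !== false) {
        rename($tmp, $cacheFile);
    }
    return $html;
}
```

The caller echoes the returned string, so the first visitor still gets the page while later visitors are served the file written under html_cache.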

It's a while since you asked this, but as this is still gathering search hits I thought I'd give you a better answer.
You can do static caching in PHP without .htaccess or other trickery. I found this at http://simonwillison.net/2003/may/5/cachingwithphp/ :
<?php
$cachefile = 'cache/index-cached.html';
$cachetime = 5 * 60;
// Serve from the cache if it is younger than $cachetime
if (file_exists($cachefile) && time() - $cachetime < filemtime($cachefile)) {
    include($cachefile);
    echo "<!-- Cached copy, generated ".date('H:i', filemtime($cachefile))." -->\n";
    exit;
}
ob_start(); // Start the output buffer
/* The code to dynamically generate the page goes here */
// Cache the output to a file
$fp = fopen($cachefile, 'w');
fwrite($fp, ob_get_contents());
fclose($fp);
ob_end_flush(); // Send the output to the browser
?>

Just to add a little to nico's response and make it more useful for generic copy-and-paste use: this saves you from typing an individual cache file name into every file.
Original:
$cachefile = 'cache/index-cached.html';
Modified:
$cachefile = $_SERVER['DOCUMENT_ROOT'].'/cache/'.pathinfo($_SERVER['SCRIPT_NAME'], PATHINFO_FILENAME).'-cached.html';
What this does is take the filename of whatever file it is located in, minus the extension (.php in my case), and append the "-cached.html" label and extension to form the cache file name. There are probably more efficient ways of doing this, but it works for me and hopefully saves others some time and effort.
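One caveat with using only the file name: /a/index.php and /b/index.php would share a cache file. A possible variation, purely as a sketch (cache_file_for and the 8-character hash are assumptions of mine), mixes a short hash of the full script path into the name:

```php
<?php
// Hypothetical helper: build a cache file name that stays unique per script path.
function cache_file_for(string $scriptName, string $cacheDir): string
{
    $base = pathinfo($scriptName, PATHINFO_FILENAME);
    // A short hash of the whole path keeps /a/index.php and /b/index.php apart.
    $hash = substr(md5($scriptName), 0, 8);
    return rtrim($cacheDir, '/') . '/' . $base . '-' . $hash . '-cached.html';
}
```

It would be called as cache_file_for($_SERVER['SCRIPT_NAME'], $_SERVER['DOCUMENT_ROOT'].'/cache') in place of the one-liner above.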

You should give skycache a try. edit : this project seems cool too: cacheme
Another solution is to use auto_prepend_file/auto_append_file. Something like what's described in this tutorial: Output caching for beginners

Related

How to clean a url with htaccess?

I have a problem with URLs when loading PHP scripts.
The first request to a PHP script loads normally, but when I then request another script, the paths pile up in the URL and it looks like this:
www.example.com/file.php/route1/file2.php
What I need is this:
www.example.com/file2.php
In other words, I need to drop everything that comes after file1.php or file2.php so that the other scripts load without problems.
Without seeing your HTML content and .htaccess file it is hard to determine the exact cause of your issue(s).
Please verify the instructions in your .htaccess file. If you are using a rewrite rule, you need to validate that it is correct, for example:
RewriteRule ^/?$ "http\:\/\/example\.in" [R=301,L]
If this rule is forcing a /route, you obviously need to remove the /route from the instruction.
Please make sure your <a> tags reference the appropriate path. If the path is not formatted properly, you will end up with concatenation. Are you using a framework? If so, this may have an impact on your URL formatting.
Some example HTML a tags for you:
File 1 and File 1 will perform the same provided you do not have other factors impeding this simple approach.
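To illustrate the concatenation point: a link like file2.php is resolved relative to the current URL, so from /file.php/route1/ it becomes /file.php/route1/file2.php, whereas a link starting with a slash is resolved from the site root. A minimal sketch, assuming the links are generated from PHP (link_to is a made-up helper name):

```php
<?php
// Hypothetical helper: always emit a root-relative href so the browser
// never appends the target to the current path.
function link_to(string $path, string $label): string
{
    $href = '/' . ltrim($path, '/');
    return '<a href="' . htmlspecialchars($href) . '">' . htmlspecialchars($label) . '</a>';
}
```

For example, link_to('file2.php', 'File 2') produces a tag whose href is /file2.php regardless of which page it is printed on.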

User friendly URLs without htaccess

I want to create friendly URLs for my website script using only PHP. Right now I'm using the query style (e.g. index.php?location=register) and I would like to convert them to something like this:
https://www.sitename.com/index.php/Register
Right now I'm using a $_GET-based function to parse and include the PHP script based on the $_GET value.
$includeDir = ".".DIRECTORY_SEPARATOR."assets/controllers".DIRECTORY_SEPARATOR;
$includeDefault = $includeDir."Home.php";
if(isset($_GET['ajaxpage']) && !empty($_GET['ajaxpage'])){
$_GET['ajaxpage'] = str_replace("\0", '', $_GET['ajaxpage']);
$includeFile = basename(realpath($includeDir.$_GET['ajaxpage'].".php"));
$includePath = $includeDir.$includeFile;
if(!empty($includeFile) && file_exists($includePath)) {
include($includePath);
}
else{
include($includeDefault);
}
exit();
}
if(isset($_GET['location']) && !empty($_GET['location']))
{
$_GET['location'] = str_replace("\0", '', $_GET['location']);
$includeFile=basename(realpath($includeDir.$_GET['location'].".php"));
$includePath = $includeDir.$includeFile;
if(!empty($includeFile) && file_exists($includePath))
{
include($includePath);
}
else
{
include($includeDefault);
}
}
else
{
include($includeDefault);
}
Kind regards!
Okay, my comment keeps growing...so I guess I'll just provide an answer...
1) This still requires server configuration. In the case of Apache, I believe it's called MultiViews (or AcceptPathInfo, if the .php extension stays in the URL). This is what allows Apache to walk up the path when /file.php/somepage is not found as a file; if you don't have the right configuration, it will just give a 404 error even though file.php exists. So, if your intention is to avoid the need for server configuration, it won't work.
2) What you are doing is dangerous:
$includeFile = basename(realpath($includeDir.$_GET['ajaxpage'].".php"));
All I have to do is know where some of your files are and I can potentially cause one of your PHP files to run, e.g. run your nightly cron every 5 minutes and overwhelm your server, or hit some other page that might do some damage. You need some way of ensuring that only files with a certain name can be included, e.g.
$includeFile = basename(realpath($includeDir.$_GET['ajaxpage']."Controller.php"));
By forcing a suffix of Controller to the filename, you just have to make sure not to use the name Controller at the end of the file name for any file you don't want to be include-able.
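A stricter variant of the same idea is an explicit allow-list, so that nothing outside a known set of names can ever be included. A minimal sketch; resolve_controller and the example names are hypothetical:

```php
<?php
// Hypothetical helper: map a requested name to a file only if it is on the allow-list.
function resolve_controller(string $requested, array $allowed, string $default): string
{
    // Strict comparison, so only exact, known names pass.
    return in_array($requested, $allowed, true) ? $requested . '.php' : $default;
}

// Possible use with the code from the question:
// include $includeDir . resolve_controller($_GET['location'] ?? '', ['Home', 'Register'], 'Home.php');
```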
3) There are so many MV* style frameworks out there...and there are so many security considerations, etc., that it is not always wise to create your own until you understand many or most of them. Even if you don't like them, using those frameworks will also help you learn some best practices for creating your own.
4) Finally, what in the world is the reason to avoid URL rewriting? URL rewriting is the STANDARD on both Apache and Windows for creating clean URLs. There is a reason that "everybody's doing it." If it's performance, your way will probably be slower, because Apache first has to check whether the path exists, then go up a directory and check whether that file exists, then go up another directory, and so on until it hits a match, and only then open that file.
Why do you need to show index.php in the URL?
I would make my URL look like https://www.sitename.com/register if you truly want clean URLs, but for that you would need URL rewriting, using .htaccess or Apache config rules such as these:
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^/]+)/?$ index.php?location=$1 [L]
Then in your PHP code you can read $_GET["location"] and load the page matching the value sent.
For this URL, $_GET["location"] would be register, and you would then display that page.
I don't suggest using MultiViews, as it can cause issues if you have files and folders with the same name, e.g. /admin and admin.php.
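For the index.php/Register form asked about in the question, the part after the script name usually arrives in $_SERVER['PATH_INFO'], provided the server passes it through (which, as noted in the first answer, depends on its configuration). A rough sketch of reading it; location_from_path_info is a hypothetical name and the allowed character set is an assumption:

```php
<?php
// Hypothetical helper: extract a single safe segment from PATH_INFO.
function location_from_path_info(?string $pathInfo, string $default = 'Home'): string
{
    $segment = trim((string) $pathInfo, '/');
    // Accept one segment of letters, digits, "_" or "-"; anything else falls back.
    return preg_match('/^[A-Za-z0-9_-]+$/', $segment) === 1 ? $segment : $default;
}

// Possible use: $location = location_from_path_info($_SERVER['PATH_INFO'] ?? null);
```

The result can then be fed into the existing include logic in place of $_GET['location'].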

URL handling – PHP vs Apache Rewrite

Currently I let a single PHP script handle all incoming URLs. This PHP script then parses the URL and loads the specific handler for that URL. Like this:
if(URI === "/")
{
require_once("root.php");
}
else if(URI === "/shop")
{
require_once("shop.php");
}
else if(URI === "/contact")
{
require_once("contact.php");
}
...
else
{
require_once("404.php");
}
Now, I keep thinking that this is actually highly inefficient and is going to need a lot of unnecessary processing power once my site is being visited more often. So I thought, why not do it within Apache with mod_rewrite and let Apache directly load the PHP script:
RewriteRule ^$ root.php [L]
RewriteRule ^shop$ shop.php [L]
...
However, because I have a lot of those URLs, I only want to make the change if it really is worth it.
So, here's my question: What option is better (efficiency-wise and otherwise) and why?
Btw, I absolutely want to keep the URL scheme and not simply let the scripts accessible via their actual file name (something.php).
So, here's my question: What option is better (efficiency-wise and otherwise) and why?
If every resource has to run through a PHP based check, as you say in your comment:
some resources are only available to logged in users, so I get to check cookies and login state first, then I serve them with readfile().
then you can indeed use PHP-side logic to handle things: A PHP instance is going to be started anyway, which renders the performance improvement of parsing URLs in Apache largely moot.
If you have static resources that do not need any session or other PHP-side check, you should absolutely handle the routing in the .htaccess file if possible, because you avoid starting a separate PHP process for every resource. But in your case, that won't apply.
Some ideas to increase performance:
consider whether really every resource needs to be protected through PHP-based authentication. Can style sheets or some images not be public, saving the performance-intensive PHP process?
try minifying resources into as few files as possible, e.g. by minifying all style sheets into one, and using CSS sprites to reduce the number of images.
I've heard that nginx is better prepared to handle this specific kind of scenario - at least I'm told it can very efficiently handle the delivery of a file after the authentication check has been done, instead of having to rely on PHP's readfile().
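The nginx feature alluded to here is, as far as I know, the X-Accel-Redirect response header: PHP does the login check and then hands the actual file delivery back to nginx. A sketch under assumptions: protected_file_headers is a made-up name, and "/protected/" is assumed to be an internal location configured in nginx that points at the real files.

```php
<?php
// Hypothetical sketch: choose the headers to send for a protected resource.
// Assumes nginx has an internal location "/protected/" mapped to the files.
function protected_file_headers(bool $loggedIn, string $name, string $mime): array
{
    if (!$loggedIn) {
        return ['HTTP/1.1 403 Forbidden'];
    }
    return [
        'Content-Type: ' . $mime,
        // basename() drops any directory part the client may have supplied.
        'X-Accel-Redirect: /protected/' . basename($name),
    ];
}

// Possible use:
// foreach (protected_file_headers($loggedIn, $file, 'image/png') as $h) { header($h); }
```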
The PHP approach is correct but it could use a bit of improvement.
// $uri must be validated first (e.g. letters only), or "../" could reach other files
$file = $uri.".php";
if (!is_file($file)) { http_response_code(404); require_once("404.php"); die(); }
require_once($file);
OK, as for efficiency: an htaccess version with a single regexp, or a PHP version with a single regexp that loads the matching file, would be faster than many htaccess rules or many PHP if/else branches.
Apart from that, the htaccess and PHP approaches should be similar in efficiency in that case, probably with a small gain for htaccess (it eliminates one require in PHP).
RewriteRule ^([a-z]+)$ $1.php [L]
and rename root.php to index.php.
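The PHP side of the "single regexp" approach could be sketched like this. script_for_uri is a hypothetical name; the pattern mirrors the RewriteRule above, and mapping "/" to index.php follows the rename suggested in this answer.

```php
<?php
// Hypothetical helper: map a URI such as "/shop" to "<baseDir>/shop.php".
function script_for_uri(string $uri, string $baseDir): ?string
{
    if ($uri === '/') {
        $uri = '/index';
    }
    // Same shape as the RewriteRule: lowercase letters only, so "../" cannot match.
    if (preg_match('#^/([a-z]+)$#', $uri, $m) !== 1) {
        return null;
    }
    $file = rtrim($baseDir, '/') . '/' . $m[1] . '.php';
    return is_file($file) ? $file : null;
}

// Possible use:
// $file = script_for_uri(URI, __DIR__);
// if ($file === null) { http_response_code(404); $file = __DIR__ . '/404.php'; }
// require_once $file;
```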

how to protect server directory using .htaccess

I have designed a website, and within it I have a range of PHP scripts which interact with my system. For example, if a user uploads an image, this is processed by the script
image.php
and if a user logs in this is processed by the script
login.php
All these scripts are stored in the folder called: scripts
How do I ensure someone cannot access these pages, however still ensure they can be used by the system? I want to ensure the PHP pages will accept post values, get values and can redirect to other pages, but not be directly accessed via the address bar or downloaded?
I attempted to block access using .htaccess using deny from all and Limit GET, POST but this prevented the system from working as I could not access those files at all.
Blocking files with htaccess makes the files inaccessible to the requestor, e.g. the visitor of the page. So you need a proxy file to pass the visitor's request to the files. For that, have a look at the MVC pattern and the Front Controller pattern.
Basically, what you will want to do is route all requests to a single point of entry, e.g. index.php and decide from there, which action(your scripts) is called to process the request. Then you could place your scripts and templates outside the publicly accessible folder or, if that is impossible (on some shared hosts), protect the folders with htaccess like you already did (DENY FROM ALL) then.
To use the upload script you'd have a URL like http://example.com/index.php?action=upload.
A supersimple FrontController is as easy as
$scriptPath = 'path/to/your/scripts/directory/';
$defaultAction = 'action404.php';
$requestedAction = $_GET['action'] ?? ''; // you might want to sanitize this
switch ($requestedAction) {
    case 'upload':
        $actionScript = 'image.php';
        break;
    case 'login':
        $actionScript = 'login.php';
        break;
    default:
        $actionScript = $defaultAction;
}
include $scriptPath . $actionScript;
exit;
Your actionScript would then do everything you need to do with the request, including redirection, db access, authentication, uploading stuff, rendering templates, etc - whatever you deem necessary. The default action in the example above could look like this:
<?php // action404.php
header('HTTP/1.1 404 Not Found');
readfile('path/to/template/directory/error404.html');
There are numerous implementations of the FrontController pattern in PHP. Some simple, some complex. The CodeIgniter framework uses a lightweight MVC/FrontController implementation that might not be too overwhelming if this is new to you.
Like Atli above suggested, you could use mod_rewrite to force all requests to index.php and you could also use it to pretty up your URLs. This is common practice with MVC frameworks and has been covered extensively here and elsewhere.
You can't really prevent direct requests to the files, and still have them remain accessible to other requests. The best you can do is mask their location, and control how they are accessed.
One way you could go is to create a PHP "switch" script, which would include the scripts for you, rather than have Apache request them directly.
For example, if you had your scripts/image.php rule target switch.php?file=image.php instead, somewhat like:
RewriteRule ([^\.]+\.(jpe?g|png|gif))$ switch.php?file=image.php&rw=1&meta=$1 [L,QSA]
You could add deny from all to the scripts/.htaccess file and do this in your switch.php file.
<?php
/** File: switch.php **/
$allowed_files = array(
    'login.php',
    'image.php'
);
$script_dir = 'scripts/';
$file = isset($_REQUEST['file']) ? $_REQUEST['file'] : '';
// "rw" arrives in the query string added by the RewriteRule, so read it from $_GET
if (isset($_GET['rw']) && in_array($file, $allowed_files, true)) {
    include $script_dir . $file;
}
else {
    header('HTTP/1.1 404 Not Found');
    include 'error404.html'; // Or something to that effect.
}
?>
The $_GET['rw'] there is a weak check to see whether the request came from a RewriteRule, meant to prevent direct requests to the file. It is pretty easy to bypass if you know it is there, but effective against random requests by bots and such.
This way, direct requests to either scripts/image.php or switch.php?file=image.php would fail, but requests to any image file would trigger the scripts/image.php script.
You can set deny from all in .htaccess and include these files from some accessible directory.
I want to ensure the PHP pages will accept post values, get values and can redirect to other pages, but not be directly accessed via the address bar or downloaded?
As long as Apache is configured to associate all .php files with the PHP application, no one can download the PHP content itself. So, if someone browsed to "mysite.com/image.php", PHP will run. The user will NOT see your PHP content.
This should already be done in your httpd.conf file as:
AddType application/x-httpd-php .php .phtml
Now, image.php will be expecting certain post parameters. Short of implementing an MVC architecture as Atli suggested above, you could gracefully and securely deal with any missing parameters if they aren't provided. Then, users can get to the page directly but not do anything with it.
A lot of applications just put files like your scripts not in the public (like /public_html/ or /www/) folder but in the same root folder as your public folder.
so not
root/public_html/ and
root/public_html/scripts/
but
root/public_html/ and
root/scripts/
Anything in a folder outside the public folder can't be accessed by visitors, but PHP can still reach it: for example, /public_html/index.php can include '../scripts/yourscript.php'. (The ../ means "go up one step in the folder hierarchy".)

Zend Framework "under maintenance" page

I'm trying to figure out how to set up a holding/"under maintenance" page in Zend Framework for when I am upgrading the database or something and don't want anyone using the site. I'd like to have a static HTML page and have all traffic redirected to that.
I'd rather not use .htaccess and would like to do it via the bootstrap file.
Any ideas?
Thanks.
I've set Apache to show index.html in preference to index.php (which bootstraps the ZF). As long as you don't link directly to /index.php anywhere, then you can just drop in an index.html file, and it will show that in preference to the ZF site.
An alternative is to have an entry in your configuration .ini file, and as soon as you have read the configuration:
if ($config->maintenance) {
readfile(APPLICATION . '/../public/maintenance.html');
exit;
}
You may want to add another check in there for a particular IP address (your own) as well, so that you can get through even when everyone else is blocked.
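The IP exception could be sketched like this. maintenance_applies is a hypothetical name, and 203.0.113.7 is only a placeholder address; the 503 status and Retry-After header in the usage comment are my own suggestion rather than part of the answer above.

```php
<?php
// Hypothetical helper: true when the visitor should see the maintenance page.
function maintenance_applies(bool $maintenance, string $remoteIp, array $allowedIps): bool
{
    return $maintenance && !in_array($remoteIp, $allowedIps, true);
}

// Possible use right after reading the configuration:
// if (maintenance_applies((bool) $config->maintenance, $_SERVER['REMOTE_ADDR'], ['203.0.113.7'])) {
//     http_response_code(503);
//     header('Retry-After: 600');
//     readfile(APPLICATION . '/../public/maintenance.html');
//     exit;
// }
```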
I've done this by creating a plugin that checks the validity of the request each time a page is requested.
In the plugin's preDispatch() you can read a variable from the config that holds your current status (active or under maintenance), and either let the request flow to its original destination or redirect it to a landing page for this purpose.
Code sample
public function preDispatch(Zend_Controller_Request_Abstract $request)
{
// get your user and your config
if( $config->suspended && $user->role()->name != "admin"){
$request
->setModuleName( 'default' )
->setControllerName( 'index' )
->setActionName( 'suspended' )
->setDispatched(true)
;
}
}
You could check your configuration file for a maintenance_mode switch and redirect every request from within the bootstrap to your static html maintenance page.
I have a blog post that demonstrates how to do this. Setting up a maintenance page with Zend Framework
I would use a plugin with dispatchLoopShutdown() and, based on the config settings, redirect the request to any controller you want.
I followed all of these suggestions to a T on Zend 1.12. I googled around. I tried using application.ini, setting the plugin path, using zend_loader_autoloader_resource(), and using Zend_Loader_PluginLoader. NONE of these worked for me. I ended up writing a .htaccess:
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_URI} !^/maintenance\.php$
RewriteRule ^(.*)$ /maintenance.php [R=503,L]
This is why Zend is the worst framework. Tons of different options on how to do something simple, Official Documentation is extremely ambiguous and unclear, and nobody fully understands or can explain the correct way to do anything so I end up wasting an hour of my time trying to do things correctly.
