SSI parser written in PHP? - php

OK, this might sound a little crazy, but bear with me here for a minute.
I'm working on a site where the standard is to use SSI to include page headers, footers, and menus. The included files use SSI conditionals to handle different browsers, some #include nesting, and some #set / #if trickery to highlight the current page in the menu. In other words, it's more than just #include directives in the SSI.
I'm sure some might argue with the aesthetics, but it actually works quite nicely, for static HTML.
Now, the problem: I'd like to just "#include" the same SSI-parsed header and footer html files from my PHP scripts, thus avoiding code duplication and still maintaining the site's uniform look. If PHP were running in the usual mod_php environment, I'd be able to do just that by using PHP's virtual() function. Unfortunately, the site is using FastCGI/suexec to run PHP (so that each VirtualHost can run as a different user), and this breaks virtual().
I've been using a fairly simple SSI parser I wrote in PHP (it handles #includes, and some really simple #if statements), but I'd like a more general solution. So, before I go nuts and write some probably-buggy, more complete SSI parser, does anyone know of a complete SSI parser written in PHP? Naturally, I'm also open to other solutions that work under the constraints I've outlined.
Thanks so much for your time.

Take a look at ESI (Edge Side Includes): http://en.wikipedia.org/wiki/Edge_Side_Includes
You can handle ESI tags with a proxy written in PHP; the HttpCache in Symfony2 does this: https://github.com/fabpot/symfony/blob/master/src/Symfony/Component/HttpKernel/HttpCache/Esi.php
Or use an HTTP proxy such as Varnish, which performs better than the Symfony2 one.

I realize this is an old question, but I ran into the same problem a few years ago, though with a Perl implementation. I forked a previous attempt and got pretty far into implementing a full Apache (2.2.22) mod_include emulator/parser as a Perl module: http://search.cpan.org/dist/CGI-apacheSSI/lib/CGI/apacheSSI.pm
Soon after that I discovered Apache output filters and realized how well they fit my needs. Basically, you can tell Apache to parse the output of your script as if it were an .shtml file. So you can output SSI markup from a Perl or PHP (or whatever) script and have Apache parse it. This is how you do it (in your .htaccess file):
AddOutputFilter INCLUDES .cgi
That's for normal CGI files, but beware: this adds quite a bit of overhead to every .cgi file that is executed. What I actually do is create a special extension that runs as CGI and then has its output parsed, so normal .cgi files don't pay the overhead:
<Files ~ "\.pcgi$">
Options +SymLinksIfOwnerMatch +Includes
AddOutputFilter INCLUDES .pcgi
</Files>
For PHP you could do something like:
<Files ~ "\.pphp$">
Options +SymLinksIfOwnerMatch +Includes
AddOutputFilter INCLUDES .pphp
</Files>
and that should do the trick! Hope that helps someone out there.

Related

Parse both SSI and PHP in specific .php file

I'm looking for a way to parse both SSI (which is usually in .shtml files) and PHP in one of my PHP files.
In .htaccess, I'm using the following to add SSI parsing to my PHP file, but when I do that, the PHP stops working: the PHP code is treated like HTML comments, and only the SSI is parsed as expected.
<Files phpfile.php>
AddHandler server-parsed .php
</Files>
How can I add both parsing methods into this file?
EDIT: There are other questions here regarding the opposite (PHP in .shtml files), but firstly, that solution didn't work for me, and secondly I'd preferably like it the other way around.
I am running Apache 2.4 with CloudLinux, Litespeed and cPanel.
It depends on the order you wish to execute this in.
If you intend to have a file parsed for SSI directives and then interpreted as PHP, this is not possible: PHP reads the script from the file system, not from the output of Apache's SSI processing.
If you intend to have the output of a PHP process parsed for SSI directives, there are some ways to do this. The linked question is one of them; there is also this answer, which may work for you.
Parsing PHP output for SSI works because Apache can take the output of PHP, parse it for SSI directives, satisfy those directives, and then send the result to the client. The reverse is not possible. Also note that files pulled in by SSI includes will not be parsed for PHP first; only the initial request is.
Perhaps more important is to determine exactly what problem you are trying to solve here. What has put you in a position where you need to do this? And would it be possible to solve the problem with strictly PHP or strictly SSI?
PHP has built-in constructs for similar tasks: include and require. It also has the virtual() function, which performs an Apache sub-request (it is only available when PHP runs as an Apache module). The sub-request is handled as Apache is configured, so if the file passed to virtual() is an SHTML file, it will be parsed for SSI directives.
It is important to note, however, that virtual() flushes all pending output to the client before including the file, and the result of the include is sent directly to the client (PHP will not have access to the output, and the SHTML file will not have access to any PHP data, or vice versa).
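A minimal sketch of the "parse PHP output for SSI" approach, limited to the single file from the question. It assumes Apache httpd with mod_include loaded and an AllowOverride setting that permits Options and FileInfo in .htaccess; whether LiteSpeed honors these directives the same way is not confirmed here.

```apache
# Leave the normal PHP handler in place for this file;
# only add SSI parsing of the output PHP produces.
<Files "phpfile.php">
    # mod_include only acts where the Includes option is enabled
    Options +Includes
    # Pass the response body through the INCLUDES filter after PHP has run
    SetOutputFilter INCLUDES
</Files>
```

Unlike AddHandler server-parsed, this does not replace the handler for the file, so PHP still executes it and the SSI directives are resolved afterwards.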

Difference between .php extension and AddType

Since I want to have PHP code run properly on my website, should I add
AddType application/x-httpd-php .html
to my htaccess file, or just change all of my *.html files into *.php files?
I've heard that changing the file extension to *.php causes the website to load slower, but I'm wondering if changing the htaccess file does the same.
Either way, the files will be passed through the PHP interpreter, making them ever-so-slightly slower than plain HTML files served directly. It's the same process however you set it up. The difference in speed from plain HTML is going to be quite small unless you have a lot of dynamic PHP in there. Given that you are considering renaming existing files from .html to .php, I suspect you don't have much PHP code in there already (or any).
So it doesn't really matter which way you handle it.
However...
Leaving them as .html has the possible disadvantage that if you ever forget to set up this configuration, you could wind up serving raw PHP code to the browser, which might include your database connection details or other secrets.
It does exactly the same thing. .php is not slower than .html, and .html is not slower than .php; it is just a different setting in your web server config.
AddType application/x-httpd-php .html in .htaccess would be a fraction slower, because Apache reads .htaccess files on every request. If you set it in httpd.conf, it would be exactly the same.
I agree with Michael that you need to be careful about keeping them as .html and risking that the configuration is not set up, or that your hosting provider does something screwy with your account.
If you do this, make sure any database/password files remain .php files that you simply include from your HTML file.

Parse HTML as PHP

Are there any security or performance concerns if we configure Apache to handle all HTML as PHP? I am specifically referring to:
AddType application/x-httpd-php .php .php3 .php4 .html
I am in a situation where I need to add some PHP logic to some HTML files; ideally, I would not have to change the file names, e.g. page.html to page.php (to keep the page rank, etc. for page.html).
This is related to the following question: httpd AddType directive
Edits:
From the existing answers / comments below, it looks like the community suggests either using redirects or targeting only specific HTML files. The constraint is that I am redesigning an existing site (400+ HTML pages; each of them uses some sort of Dreamweaver template that pulls in the header and footer from different files). I was hoping to get away from Dreamweaver completely and move to something non-proprietary. So, I am down to two options:
Use Server Side Includes (SSI) to pull in the header and footer. This will result in all my HTML files being decorated with SSI.
Sprinkle in some PHP snippets to include the header and footer. For this choice, I have to make sure the file names stay unchanged.
The more files the server determines it needs to pass through the PHP interpreter, the more overhead involved, but I think this goes without saying. If your site does not have ANY pages with plain HTML, then you're already paying all the performance penalties that you could possibly pay - adding HTML to the list is no different in this case than simply renaming all the files to have a .php extension.
The real performance penalty would come if you do have plain HTML pages - the server will needlessly pass these pages to PHP for interpretation when none is necessary. But even then, it isn't dramatic - the PHP interpreter won't be needed for those HTML pages, so it won't do anything aside from determining that it doesn't need to do anything. This has a cost, but it isn't significant.
Now, if we're talking high-volume here, every little bit of performance matters and this would not be a practicable solution. For low- to mid-volume sites, however, the performance penalty would be nil.
If this is a one-time change and there are a limited number of files that are affected, then it may be more conservative to use a FilesMatch directive.
<FilesMatch "^(file_one|file_two|file_three)\.html$">
AddType application/x-httpd-php .html
</FilesMatch>
I disagree with Tuga. I don't think you should make this change for all your files. Anytime you deal with security, you should try to control the environment. Doing it only for one file is probably the safest. You could do something like
<FilesMatch "^file_name\.html$">
AddType application/x-httpd-php .html
</FilesMatch>
This matches only file_name.html and processes it as PHP, which is much safer than treating ALL .html files as PHP.
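A variant of the same single-file approach, sketched under the assumption that PHP runs as an Apache module (mod_php) and registers the application/x-httpd-php handler. It assigns the handler directly instead of mapping a MIME type by extension; file_name.html is the placeholder name used above.

```apache
# Hand exactly one HTML file to PHP; every other .html file stays static.
<FilesMatch "^file_name\.html$">
    # Assumes mod_php; FastCGI/FPM setups use a different handler
    SetHandler application/x-httpd-php
</FilesMatch>
```

Because SetHandler sits inside the FilesMatch block, the scope is set by the pattern alone rather than by a pattern plus an extension list.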

Using Apache Server-side Flow Control in PHP?

As discussed a bit in this question, I am using Apache mod_include with conditional flow control statements to alter the behavior of included shtml files depending on the URL of the parent page. The problem I'm having is that some of the pages on the site are PHP pages, which seems to mean that the mod_include directives are ignored (and instead treated as standard html comments).
Is there any way to have PHP pages correctly process these mod_include directives?
Specifically, here is what I am trying to have processed:
<!--#if expr='"$DOCUMENT_NAME" = /(podcasts\.php)|(series\.php)/' -->
<li id="features" class="current">
<!--#else -->
<li id="features">
<!--#endif -->
Similar blocks work in the .shtml files on the site, but for the PHP pages, all of the above ends up being output to the client as-is.
Edit: The closest thing to a solution I have come up with is to mimic the functionality of the included shtml file in a php file. I don't like this solution because it means that adding links in the future will require adding them to multiple places.
Assuming you're running PHP via mod_php (it may not even matter), just add:
AddOutputFilter INCLUDES .shtml .php
and it works fine for both .shtml and .php with both being properly parsed.
I have just started to read about SSI but found this quote at
http://httpd.apache.org/docs/2.2/howto/ssi.html#configuring
A brief comment about what not to do. You'll occasionally see people
recommending that you just tell Apache to parse all .html files for
SSI, so that you don't have to mess with .shtml file names. These
folks have perhaps not heard about XBitHack. The thing to keep in mind
is that, by doing this, you're requiring that Apache read through
every single file that it sends out to clients, even if they don't
contain any SSI directives. This can slow things down quite a bit, and
is not a good idea.
So if I understand it correctly, you should not add .php to AddOutputFilter, since it forces Apache to scan every .php page for SSI directives, which slows down the server.
Maybe there is another solution to your problem?
http://httpd.apache.org/docs/2.2/mod/mod_include.html#xbithack
/Philip
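For reference, a minimal sketch of what enabling XBitHack looks like in .htaccess, assuming mod_include is loaded and Options may be overridden. Note that, per the mod_include documentation, XBitHack only affects files served as text/html, so it covers static pages rather than PHP output.

```apache
# Parse for SSI only those text/html files that have the execute bit set
Options +Includes
XBitHack on
```

A page is then opted in by setting its user-execute bit (for example, chmod +x on that file), and files without the bit are sent without being scanned.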

Running other file types as PHP

Is there any problem with running HTML as PHP via .htaccess, such as security or best-practice concerns? I was doing this to make URLs cleaner.
## run the following file types as php
AddHandler application/x-httpd-php .html .htm .rss .xml
Well, ideally I'd like to have my URLs like
localhost/blog/posts/view.php?id=64
to be
localhost/projects/bittyPHP/bittyphp/posts/view/id-64
But I'm having trouble accomplishing that without routing everything to one file and having PHP determine the paths. I guess this is my real question.
I would use mod rewrite.
You probably do not need to run all HTML files as PHP, and if you have short_open_tag enabled, the "<?" in XML declarations will give you trouble.
Keep in mind that every one of those files will then run through the PHP handler. Even if a file contains no PHP, the parser still inspects it to check. This adds some overhead, but it is likely negligible in most setups.
The main issue, I would say, is performance. If you have a significant number of plain HTML files, you're creating unnecessary overhead by always running them through the PHP interpreter.
Best practice is not to do this, but to use "friendly" URLs like mysite.com/item/123 and use mod_rewrite to convert them to mysite.com/displayitem.php?id=123 internally.
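The rewrite described above could look roughly like this in a .htaccess file at the document root. It is a sketch that assumes mod_rewrite is enabled and that displayitem.php (the name from the example URL) reads the id query parameter.

```apache
RewriteEngine On
# /item/123 is served internally by displayitem.php?id=123;
# QSA keeps any query string from the original request, L stops further rules
RewriteRule ^item/([0-9]+)/?$ displayitem.php?id=$1 [L,QSA]
```

The browser keeps showing /item/123, so no file needs to be renamed or run through PHP just to get a clean URL.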
Like many people have already stated, mod_rewrite is the best solution for accomplishing friendly URLs.
Sitepoint has a decent guide to getting started with mod_rewrite.
