Parse HTML as PHP - php

Are there any security / performance concerns if we configure the Apache web server to handle all HTML files as PHP? I was specifically referring to:
AddType application/x-httpd-php .php .php3 .php4 .html
I was in a situation where I needed to add some PHP logic to some HTML files; ideally, I would not have to change the filename, e.g. page.html to page.php (to keep the page rank, etc. for page.html).
This is related to the following question: httpd AddType directive
Edits:
From the existing answers / comments below, it looks like the community suggests either using redirects or targeting only specific HTML files. The constraint is that I am redesigning an existing site (400+ HTML pages; each of them uses some sort of Dreamweaver template that pulls in the header and footer from different files). I was hoping to move away from Dreamweaver completely and into something non-proprietary. So, I am down to two options:
Use Server Side Includes (SSI) to pull in the header and footer. This will result in all my HTML files being decorated with SSI directives.
Sprinkle in a PHP snippet to include the header and footer. For this choice, I have to make sure the file names stay unchanged.

The more files the server determines it needs to pass through the PHP interpreter, the more overhead involved, but I think this goes without saying. If your site does not have ANY pages with plain HTML, then you're already paying all the performance penalties that you could possibly pay - adding HTML to the list is no different in this case than simply renaming all the files to have a .php extension.
The real performance penalty would come if you do have plain HTML pages - the server will needlessly pass these pages to PHP for interpretation when none is necessary. But even then, it isn't dramatic - the PHP interpreter won't be needed for those HTML pages, so it won't do anything aside from determining that it doesn't need to do anything. This has a cost, but it isn't significant.
Now, if we're talking high-volume here, every little bit of performance matters and this would not be a practicable solution. For low- to mid-volume sites, however, the performance penalty would be negligible.
If this is a one-time change and there are a limited number of files that are affected, then it may be more conservative to use a FilesMatch directive.
<FilesMatch "^(file_one|file_two|file_three)\.html$">
AddType application/x-httpd-php .html
</FilesMatch>
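An alternative sketch of the same idea uses SetHandler instead of AddType, so the match depends only on the FilesMatch pattern rather than on an extension mapping. This assumes PHP runs as mod_php, where the application/x-httpd-php handler is registered; the three file names are the placeholders from the example above.

```apache
# Run only the listed HTML files through PHP; every other .html file
# is still served as static content.
<FilesMatch "^(file_one|file_two|file_three)\.html$">
    SetHandler application/x-httpd-php
</FilesMatch>
```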

I disagree with Tuga. I don't think you should make this change for all your files. Anytime you deal with security, you should try to control the environment. Doing it only for one file is probably the safest. You could do something like
<FilesMatch "^file_name\.html$">
AddType application/x-httpd-php .html
</FilesMatch>
This will match only file_name.html and process it as PHP, which is much safer than treating ALL .html files as PHP.

Related

Replacing .html files with .php files while maintaining search engine rankings

I maintain a website that contains a dozen or so .html documents which I have just rewritten to include php code. As search engines currently index the .html documents, I would rather not break those links and I certainly don't want to do anything that will affect my search rankings. I understand I have a couple of choices.
Option 1 is to replace all the .html extensions on my documents with .php, and then update .htaccess so that requests for .html documents are rewritten/redirected to the corresponding .php documents (as suggested here).
If I want to make this a permanent (301) redirection, so that the search engine links are replaced the next time my site is crawled, is this the correct way to do this?
RewriteEngine On
RewriteRule ^(.*)\.html$ $1.php [L,R=301]
Option 2 would be to instruct the webserver to send all html documents through the php parser (as suggested here), which means the .html extension on the files doesn't need to be changed at all:
AddType application/x-httpd-php .htm .html
So I see two viable choices. Is one better than the other (or can you think of a better one)?
The first method means you need to rename all of your html files to php files, and for a little while you will see a marginal amount of extra traffic from the redirects.
The second method means you don't need to change anything at all and html files get processed by the PHP handler like php files do.
The first method is more work, but it also makes your site more portable: if you copy your site to a new host that, say, doesn't give you the ability to change the handler types, you will still be fine because your files end with the php extension.
The second method is less work and won't require search engines to re-index your site but will make your site a little less portable.
Note that you can also use mod_alias to redirect:
RedirectMatch 301 ^/(.*)\.html$ /$1.php
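If only some pages have been renamed so far, the redirect can be limited to requests whose .php counterpart actually exists. This is a minimal sketch, assuming mod_rewrite is enabled and the rules live in the .htaccess file at the document root:

```apache
RewriteEngine On
# Redirect /page.html to /page.php only when page.php exists on disk,
# so HTML files that have not been renamed keep working.
RewriteCond %{DOCUMENT_ROOT}/$1.php -f
RewriteRule ^(.*)\.html$ /$1.php [R=301,L]
```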

Difference between .php extension and AddType

Since I want to have PHP code run properly on my website, should I add
AddType application/x-httpd-php .html
to my htaccess file, or just change all of my *.html files into *.php files?
I've heard that changing the file extension to *.php causes the website to load slower, but I'm wondering if changing the htaccess file does the same.
Either way, the files will be passed through the PHP interpreter, making them ever-so-slightly slower than if they were plain HTML files directly served down. It's the same process however you set it up. The difference in speed from plain HTML is going to be quite small unless you have a lot of dynamic PHP in there. Given that you are considering renaming existing files from .html to .php, I suspect you don't have much PHP code in there already (or any).
So it doesn't really matter which way you handle it.
However...
Leaving them as .html has the possible disadvantage that if you ever forget to set up this configuration, you could wind up serving raw PHP code to the browser, which might include your database connection details or other secrets.
It does exactly the same. .php is not slower than .html, and .html is not slower than .php; it is just a different setting in your webserver config.
Setting AddType application/x-httpd-php .html in .htaccess would be a fraction slower, as Apache reads .htaccess files on every request. If you set it in httpd.conf it would be exactly the same.
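A sketch of the httpd.conf variant, assuming mod_php and a hypothetical site directory /var/www/example. With AllowOverride None, Apache does not look for .htaccess files under that directory at all, which removes the small per-request cost mentioned above:

```apache
<Directory "/var/www/example">
    # Do not read .htaccess files below this directory.
    AllowOverride None
    # Treat .html files as PHP, configured once when the server starts.
    AddType application/x-httpd-php .html
</Directory>
```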
I agree with Michael that you need to be careful about keeping the .html extension: there is a chance that the configuration is not set up, or that your hosting provider does something screwy with your account.
If you do this, make sure any database/password files remain as PHP that you simply include in your HTML file.

SSI parser written in PHP?

OK, this might sound a little crazy, but bear with me here for a minute.
I'm working on a site where the standard is to use SSI to include page headers, footers, and menus. The included files use SSI conditionals to handle different browsers, some #include nesting, and some #set / #if trickery to highlight the current page in the menu. In other words, it's more than just #include directives in the SSI.
I'm sure some might argue with the aesthetics, but it actually works quite nicely, for static HTML.
Now, the problem: I'd like to just "#include" the same SSI-parsed header and footer html files from my PHP scripts, thus avoiding code duplication and still maintaining the site's uniform look. If PHP were running in the usual mod_php environment, I'd be able to do just that by using PHP's virtual() function. Unfortunately, the site is using FastCGI/suexec to run PHP (so that each VirtualHost can run as a different user), and this breaks virtual().
I've been using a fairly simple SSI parser I wrote in PHP (it handles #includes, and some really simple #if statements), but I'd like a more general solution. So, before I go nuts and write some probably-buggy, more complete SSI parser, does anyone know of a complete SSI parser written in PHP? Naturally, I'm also open to other solutions that work under the constraints I've outlined.
Thanks so much for your time.
Take a look at ESI: http://en.wikipedia.org/wiki/Edge_Side_Includes
You can create a PHP proxy to handle them; this is what the HttpCache in Symfony2 does: https://github.com/fabpot/symfony/blob/master/src/Symfony/Component/HttpKernel/HttpCache/Esi.php
Or use an HTTP proxy like Varnish, which performs better than Symfony2.
I realize this is an old question, but I ran into that same problem a few years ago, though with a Perl implementation. I went ahead and forked a previous attempt and got pretty far into implementing a full Apache (2.2.22) mod_include emulator/parser as a Perl module: http://search.cpan.org/dist/CGI-apacheSSI/lib/CGI/apacheSSI.pm Soon after that I found Apache output filters, and realized how perfect a solution they are for my needs. Basically, you can tell Apache to parse the output of your script as if it were a .shtml or .php (or whatever) file. So you can output SSI markup from a Perl or PHP (or whatever) script, and have Apache parse that. This is how you can do it (in your .htaccess file):
AddOutputFilter INCLUDES .cgi
That's for normal cgi files, but beware: this adds quite a bit of overhead to all .cgi files being executed. What I actually do is create a special extension that runs as a cgi and then has its output parsed, without adding the overhead to normal cgi files:
<Files ~ "\.pcgi$">
Options +SymLinksIfOwnerMatch +Includes
AddOutputFilter INCLUDES .pcgi
</Files>
For PHP you could do something like:
<Files ~ "\.pphp$">
Options +SymLinksIfOwnerMatch +Includes
AddOutputFilter INCLUDES .pphp
</Files>
and that should do the trick! Hope that helps someone out there.

Parse PHP in .html files with IIS and FastCGI

I have a big site with lots of .html files, and I want to start using PHP in my pages, but I don't want to change the links to .php . I read on Apache servers you can add a rule to the .htaccess file that will allow PHP parsing in plain .html files. Is this possible in IIS?
Absolutely. Assuming you're using IIS7, you simply change the request path in "Handler Mappings" to *.html (to handle all html files).
Note that you'll get a big performance hit though. It's much quicker to serve static content, so if you have lots of html pages every single one of them will start being parsed by PHP. It would be preferable to switch pages to .php as needed, but I understand that it would be tricky to fix all the backlinks.
More information about setting it up is available here.
Be aware that when changing handler mappings you'll also want to make sure the server still sends the correct MIME types. I just implemented the solution Hamish linked to, but all my CSS @import directives were failing, as they pointed to pure .css files which were now being served with PHP's standard text/html Content-Type header.

Using Apache Server-side Flow Control in PHP?

As discussed a bit in this question, I am using Apache mod_include with conditional flow control statements to alter the behavior of included shtml files depending on the URL of the parent page. The problem I'm having is that some of the pages on the site are PHP pages, which seems to mean that the mod_include directives are ignored (and instead treated as standard html comments).
Is there any way to have PHP pages correctly process these mod_include directives?
Specifically, here is what I am trying to have processed:
<!--#if expr='"$DOCUMENT_NAME" = /(podcasts\.php)|(series\.php)/' -->
<li id="features" class="current">
<!--#else -->
<li id="features">
<!--#endif -->
Similar blocks work in the .shtml files on the site, but for the PHP pages, all of the above ends up being output to the client.
Edit: The closest thing to a solution I have come up with is to mimic the functionality of the included shtml file in a php file. I don't like this solution because it means that adding links in the future will require adding them to multiple places.
Assuming you're running PHP via mod_php (it may not even matter), just add:
AddOutputFilter INCLUDES .shtml .php
With this, both .shtml and .php files are parsed properly.
I have just started to read about SSI but found this quote at
http://httpd.apache.org/docs/2.2/howto/ssi.html#configuring
A brief comment about what not to do. You'll occasionally see people
recommending that you just tell Apache to parse all .html files for
SSI, so that you don't have to mess with .shtml file names. These
folks have perhaps not heard about XBitHack. The thing to keep in mind
is that, by doing this, you're requiring that Apache read through
every single file that it sends out to clients, even if they don't
contain any SSI directives. This can slow things down quite a bit, and
is not a good idea.
So if I understand it right, you should not add .php to AddOutputFilter, since that forces Apache to scan every .php page for SSI directives, which will slow down the server.
Maybe there is another solution to your problem?
http://httpd.apache.org/docs/2.2/mod/mod_include.html#xbithack
/Philip
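One way to limit the cost described in the quoted documentation is to apply the INCLUDES filter only to the PHP pages that actually emit SSI directives. This is a sketch, assuming mod_include is loaded and PHP runs as mod_php; podcasts.php and series.php are taken from the question, and any other page that includes the menu would need to be added to the pattern.

```apache
# Allow SSI processing in this directory.
Options +Includes
<FilesMatch "^(podcasts|series)\.php$">
    # Pass the PHP output of these pages through mod_include.
    SetOutputFilter INCLUDES
</FilesMatch>
```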
