Apache and parsed PHP caching

Assume we have Linux + Apache + PHP installed with all default settings. I have a PHP website that uses a large third-party PHP library, say 1 MB of PHP sources. This library is used very rarely, say for POST requests only. There is a reason why I can't move this library usage into a separate PHP file. So I have to include this library for each HTTP request, but use it very rarely. Should I be concerned about the time spent on PHP parsing in that case? Let me explain. I could do this:
<?php
require_once('heavy_library.php');
// do regular stuff
if ($weNeedHeavyLibrary)
{
    heavy_library_function();
}
?>
I assume that this solution is bad because in this case heavy_library.php is parsed for each HTTP request. I can move it into the if statement:
<?php
// do regular stuff
if ($weNeedHeavyLibrary)
{
    require_once('heavy_library.php');
    heavy_library_function();
}
?>
Now, as I understand it, the library is only parsed when we actually need it.
Now, back to the question. With default settings for Apache and PHP, should I be concerned about this issue? Should I move the require_once to the place where it is actually used, or can I leave it at the top and rely on Apache/PHP to do some kind of caching that prevents parsing on every HTTP request?

No, Apache will not do the caching. You should keep the require_once inside the if so the library is only parsed when you need it.
If you do want caching of parsed PHP, then look at something like eAccelerator.

When you tell PHP to require() something, it will do it no matter what; the only thing that prevents the file from being parsed from scratch on every request is an opcode cache such as APC.
Conditionally loading the file would be preferred in this case. If you're worried about making life more complicated by having these conditions, run a small benchmark.
You could also use autoloading to load files "on demand" automatically; see spl_autoload
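For example, a minimal autoloading sketch using spl_autoload_register (note this only helps if the library exposes a class; the Heavy_Library class name and the lib/ layout below are assumptions, not part of the question):
<?php
// hypothetical layout: class Heavy_Library lives in lib/Heavy/Library.php
spl_autoload_register(function ($class) {
    $file = __DIR__.'/lib/'.str_replace('_', '/', $class).'.php';
    if (is_file($file)) {
        require $file;
    }
});

// the library file is only parsed the first time the class is used:
if ($weNeedHeavyLibrary) {
    $lib = new Heavy_Library(); // triggers the autoloader
}
?>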

Related

When is a PHP include file parsed?

When is a PHP include file parsed? At startup, or during execution?
My web forms call a single PHP script. Depending on the arguments passed in the URL, a switch/case condition determines what the script will do. Each "case" within the switch has its own include files.
If include files are parsed during the initial load, then my PHP script will take up more memory and time to process, which leads me to believe that having individual PHP files called from my web form is better than having one that includes what it needs.
If include files are parsed only when needed (that is, when a branch of the code reaches a specific case statement, it then performs the include), then my code will be reasonably conservative with memory.
So.... my question... When is a PHP include file parsed? At initial load, or during execution?
(note... I failed to find the answer here, and I have read http://php.net/manual/en/function.include.php)
Files are included if and when the include statement is reached at runtime. To very succinctly summarise what that means, the following file is never going to be included:
if (false) {
    include 'foo.php';
}
Since you're concerned about memory usage from too many includes, I feel that a bit more detail will be useful over and above a direct answer to your question.
Firstly, to directly answer you: PHP files are parsed as soon as they are loaded -- if a file contains a syntax error, you will be told of that immediately; it won't wait until it gets to that line of code. However, subsequent files are only included if the specific line of code containing the include statement is executed.
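To make that first point concrete, a file like the following fails with a parse error the moment it is loaded, even though the broken line could never execute:
<?php
if (false) {
    $x = ;   // syntax error: reported at load time, not at run time
}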
You're concerned about memory usage, but having a lot of included files is generally not a major memory issue, nor a major performance issue. Indeed, most modern PHP applications of any size will use a framework library that load hundreds of PHP files for every page load. Memory and performance issues are far more likely to be caused by bugs within your code rather than simply loading too much code.
If you are concerned about memory and performance from this, you should consider using PHP's OpCache feature. With this feature enabled, PHP stores a cache in memory of the compiled state of all the files it has included within a system. When it runs the page again, therefore, it does not need to actually load or parse anything when it encounters an include statement; it simply fetches it from the cache.
Using OpCache you can write your code with a very large number of files to include, without any meaningful performance penalty.
The good news is that OpCache is enabled by default in recent PHP versions, and is completely transparent to the developer -- you don't even need to know that it's there; the only difference you'll see between it being turned on and off is your site running faster.
So, firstly, make sure your PHP version is up-to-date (5.5 or higher). Then make sure OpCache is enabled in your php.ini file. Then just sit back and stop worrying about these kinds of things.
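For reference, a typical php.ini sketch for OpCache (the directive names are real; the values are illustrative defaults, not tuned recommendations):
; enable the opcode cache (on by default since PHP 5.5)
opcache.enable=1
; memory for cached opcodes, in megabytes
opcache.memory_consumption=128
; maximum number of files to cache
opcache.max_accelerated_files=4000
; how often (in seconds) to check files for changes
opcache.revalidate_freq=2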
Files included with an include statement are parsed during execution. When your PHP code hits an include statement, it starts parsing the included file to see what is in there.
From w3schools:
The include (or require) statement takes all the text/code/markup that
exists in the specified file and copies it into the file that uses the
include statement.
There are other questions on a similar topic:
In PHP, how does include() exactly work?

Using Eval to include remote php file

I just wanted to keep all my code libraries (PHP classes; e.g. http://libraries.com/form.php) on a single server for easy maintenance and availability. Wherever I need to use a library, I'd just include it in my code. But I know enabling remote URL includes isn't safe at all. So I found a workaround.
I'd just use eval(file_get_contents('http://libraries.com/form.txt')). I use .txt instead of .php so I get the PHP code as it is, not the blank file the server would return after the PHP is processed.
This works; I get my PHP library/class and can use it from a remote location. But I don't know whether it is safe. What are the pros and cons of this approach, and what other way can you suggest to achieve this safely?
This:
Has all the security downsides of include-ing remote files
Is massively inefficient due to all the extra HTTP requests
Means that a new release of a library gets deployed without being tested against the rest of the code in an application
Adds an extra point of failure for the application
Don't do this. It is a terrible idea.
Installation of dependencies should be a feature of your install script, not the application itself.
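For example, with a dependency manager such as Composer, the library is declared once in composer.json (the package name below is hypothetical) and fetched at install time:
{
    "require": {
        "acme/form-library": "^1.0"
    }
}
Running composer install downloads a pinned, testable copy into vendor/ and generates vendor/autoload.php for you, so the application never executes code fetched from a remote server at runtime.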

Need to compress CSS and JavaScript via PHP but need to know performance implications and how to implement?

I'm currently using PHP to combine multiple CSS (or JS) files into a single file (as well as compress the content using GZIP).
E.g. the HTML page calls resources like this...
<link rel="stylesheet" href="Concat.php?filetype=css&files=stylesheet1,stylesheet2,stylesheet3">
<script src="Concat.php?filetype=js&files=script1,script2,script3"></script>
Example of my Concat.php file can be found here: http://dl.dropbox.com/u/3687270/Concat.php (feel free to comment on any problems with the code)
But instead of having to open up my command prompt and run YUI Compressor manually on my CSS/JS files, I want the Concat.php file to handle this, at least for the CSS side of things (I say CSS only because I appreciate that YUI Compressor does variable minification and other optimisations that aren't feasible to replicate in PHP - but that is part 2 of my question).
I know this can be done with some regex magic, and I have no problem doing that.
So, my question has 2 parts, which are:
1.) What are the performance implications of having the server minify a CSS file (or a set of CSS files that could have a few hundred lines of code each) with preg_replace? Normally it would be a lot less, but I'm thinking that if the server compresses the files then I wouldn't have to worry too much about extra whitespace in my CSS.
2.) And how can I run the JavaScript files that are concatenated via my Concat.php file through YUI Compressor? Maybe on the server itself (I have direct access, so I could install YUI Compressor there if necessary) - but would this be a good idea? Surely optimising on the server every time a page is requested will be slow, bad for the server, and increase bandwidth, etc.
The reason this has come up is that I'm constantly having to go back and make changes to existing 'compressed/minified' JS/CSS files, which is a real pain because I need to grab the original source files, make the changes, then re-minify and upload. I'd really rather just edit my files and let the server handle the minification.
Hope someone can help with this.
If your web server is Apache, you could use mod_concat and let Apache take care of compression using gzip (a sample mod_deflate config is sketched below):
http://code.google.com/p/modconcat/
You should minify the JS just once and save the minified version on the server.
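For the gzip part, a minimal Apache sketch using mod_deflate (assuming the module is enabled; mod_concat has its own directives, documented at the link above):
# compress text assets on the fly
<IfModule mod_deflate.c>
    AddOutputFilterByType DEFLATE text/html text/css application/javascript
</IfModule>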
As suggested in the comments, you could use one of the pre-built scripts for that. They make use of YUI Compressor as well as other solutions, even if you can't run Java on the server.
The first one was probably PHP Speedy, which still works but has been abandoned.
A new one is Minify, which offers a lot of features including general caching solution depending on the server's capabilities (APC, Memcached, File cache).
Another advantage of these projects is that your URLs won't have query strings in them (unlike your current method), which causes trouble in a lot of browsers when it comes to caching. They also take care of gzipping and setting Expires headers for your content.
So I definitely recommend that you try out one of these projects as they offer immediate positive effects, with some simple steps of configuration.
Here's how I recommend you do it:
Turn on GZIP for that specific folder (Web server level)
Use one of the tools to strip out whitespace and concatenate the files. This will serve as a fallback for search engine/proxy users who don't have gzip enabled. You'd then cache the output of this, so the expensive regex calls aren't hit again.
The above won't be very expensive CPU-wise if you configure your server correctly; the PHP overhead won't really be much, as you'll have a cached version of the CSS, i.e.:
-- css.php --
<?php
if (!isset($_GET['f'])) {
    exit();
}

// NOTE: real code must whitelist the requested file names; passing
// $_GET['f'] straight to readfile() would allow path traversal
$cacheFile = '/path/to/cache/css/'.md5($_GET['f']);

if (file_exists($cacheFile)) {
    // just serve the previously cached file...
    readfile($cacheFile);
    exit();
}

$files = explode(',', $_GET['f']);
ob_start();
foreach ($files as $file)
{
    readfile($file);
}
$css = ob_get_clean();

// remove whitespace etc..
$css = preg_replace('/\s+/', ' ', $css);

// set headers (content type, ETags, future expiration dates..)
header('Content-Type: text/css');

echo $css;

// write a new cached file
file_put_contents($cacheFile, $css);
exit();
You can then use href="css.php?f=style.css,something.css,other.css", and the script will create a cache file named after the md5 of the requested file list.
The above example still isn't complete - it's more of a sketch than production code.

Will including unnecessary PHP files slow down the website?

The question might prompt some people to say a definitive YES or NO almost immediately, but please read on...
I have a simple website where there are 30 php pages (each has some php server side code + HTML/CSS etc...). No complicated hierarchy, nothing. Just 30 pages.
I also have a set of purely back-end php files - the ones that have code for saving stuff to database, doing authentication, sending emails, processing orders and the like. These will be reused by those 30 content-pages.
I have a master php file to which I send a parameter. This specifies which one of those 30 files is needed and it includes the appropriate content-page. But each one of those may require a variable number of back-end files to be included. For example one content page may require nothing from back-end, while another might need the database code, while something else might need the emailer, database and the authentication code etc...
I guess whatever back-end page is required could be included in the appropriate content page, but then one small change in a path and I have to edit tens of files. It would be too cumbersome to check which content page is requested (switch-case type of thing) and include the appropriate back-end files in the master PHP file - again, I'd have to make many changes if a single path changed.
Being lazy, I included ALL back-end files in the master file so that no content page can request something that is not included.
First question - is this a good practice? Is it done by anyone at all?
Second, will there be a performance problem, or any other kind of problem, due to me including all the back-end files regardless of whether they are needed?
EDIT
The website gets anywhere between 3000 - 4000 visits a day.
You should benchmark: time the execution of the same page with different includes (see the sketch below). But I guess it won't make much difference with 30 files.
But you can save yourself the time and just enable APC in the php.ini (it is a PECL extension, so you need to install it). It will cache the parsed content of your files, which will speed things up significantly.
BTW: There is nothing wrong with laziness, it's even a virtue ;)
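A minimal benchmarking sketch along those lines (the file names are placeholders for your own back-end includes):
<?php
$start = microtime(true);

// hypothetical back-end files - swap in the ones you want to measure
require 'backend/database.php';
require 'backend/emailer.php';
require 'backend/auth.php';

printf("Includes took %.2f ms\n", (microtime(true) - $start) * 1000);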
If your site is object-oriented I'd recommend using auto-loading (http://php.net/manual/en/language.oop5.autoload.php).
This uses a magic method (__autoload) to look for a class when needed (it's lazy, just like you!), so if a particular page doesn't need all the classes, it doesn't have to get them!
Again, though, this depends on if it is object-oriented or not...
It will slow down your site, though probably not by a noticeable amount. It doesn't seem like a healthy way to organize your application, though; I'd rethink it. Try to separate the application logic (e.g. most of the server-side code) from the presentation layer (e.g. the HTML/CSS).
It's not a bad practice if the files are small and contain just definitions and settings.
If they actually run code, or are extremely large, it will cause a performance issue.
Now, if your site has 3 visitors an hour, who cares; if you have 30,000, that's another issue, and you need to work harder to minimize the overhead.
You can mitigate some of the disadvantages of PHP's compile step by using XCache. This PHP module will cache the PHP opcodes, which reduces compile time and improves performance.
Considering the size of your website: if you haven't noticed a slowdown, why try to fix it?
When it comes to larger sites, the first thing you should do is install APC. Even though your current method of including files might not benefit as much from APC as it could, APC will still do an amazing job speeding stuff up.
If response speed is still problematic, you should consider including all your files unconditionally. APC will keep a cached version of your source files in memory, but it can only do this well if there are no conditional includes.
Only when your PHP application is at a size where memory exhaustion is a real risk (note that for most large-scale websites, memory is not the bottleneck) might you want to conditionally include parts of your application.
Rasmus Lerdorf (the man behind PHP) agrees: http://pooteeweet.org/blog/538
As others have said, it shouldn't slow things down much, but it's not 'ideal'.
If the main issue is that you're too lazy to go changing the paths for all the included files whenever the path needs to be updated, you can define the path as a constant in your main file and use that constant any time you need to include/require a file.
define('PATH_TO_FILES', '/var/www/html/mysite/includes/go/in/here/');
require_once PATH_TO_FILES.'database.php';
require_once PATH_TO_FILES.'sessions.php';
require_once PATH_TO_FILES.'otherstuff.php';
That way if the path changes, you only need to modify one line of code.
It will indeed slow down your website, mostly because of the relatively slow loading and processing of PHP. The more code you include, the slower the application will get.
I live by "include as little as possible, as much as necessary", so I usually just include my config and session handling for everything, and then each page includes just what it needs, using an include path defined in the config include - so for path changes you still only need to change one file.
If you include everything, the slowdown won't be noticeable until you get a lot of page hits (several hits per second), so in your case just including everything might be OK.

How to create a fast PHP library?

For our online game, we have written tons of PHP classes and functions, grouped by theme into files and then folders. In the end, we now have all our backend code (logic & DB access layers) in a set of files that we call libs, and we include our libs in our GUI (web pages, presentation layer) using include_once('pathtolib/file.inc').
The problem is that we have been lazy with inclusions, and most include statements are made inside the libs files themselves, with the result that from each web page, each time we include any libs file, we actually load the entire libs, file by file.
This has a significant impact on performance. So what would be the best solution?
Remove all include statements from the libs files and only call the necessary ones from the web pages?
Do something else?
The server uses a classic LAMP stack (PHP 5).
EDIT: We have a mix of simple functions (for legacy reasons, and the majority of the code) and classes, so autoload alone will not be enough.
Manage all includes manually, only where needed
Set your include_path to only what it has to be; the default is something like .:/usr/lib/pear/:/usr/lib/php - point it only at what it needs to be: php.net/set_include_path
Don't use autoload; it's slow and makes the job of APC and equivalent caches a lot harder
You can turn off the "stat" operation in APC (see the sketch below), but then you have to clear the cache manually every time you update the files
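A minimal php.ini sketch for that last point (apc.stat is a real APC directive; the values are illustrative):
; enable the APC opcode cache
apc.enabled=1
; skip the per-request stat() check on cached files; faster, but you must
; clear the cache (apc_clear_cache()) or restart after every code update
apc.stat=0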
If you've done your programming in an object-oriented way, you can make use of the autoload function, which will load classes from their source files on-demand as you call them.
Edit: I noticed that someone downvoted both answers that referred to autoloading. Are we wrong? Is the overhead of the __autoload function too high to use it for performance purposes? If there is something I'm not realizing about this technique, I'd be really interested to know what it is.
If you want to get really hard-core, do some static analysis, and figure out exactly what libraries are needed when, and only include those.
If you use include rather than include_once, there is a bit of a speed saving there as well.
All that said, Matt's answer about the Zend Optimizer is right on the money. If you want, try the Alternative PHP Cache (APC), which is an opcode cache, and free. It should be in the PECL repository.
You could use spl_autoload_register() or __autoload() to create whatever rules you need for including the class files you need; however, autoloading introduces its own performance overhead. You'll need to make sure whatever you use is prepended to all GUI pages using a php.ini setting or an Apache config.
For your files with generic functions, I would suggest wrapping them in a utility class and doing a simple find-and-replace to turn all your function() calls into Util::function() calls, which would then let you autoload these functions (again, there is a small overhead in calling a method rather than a global function).
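For instance, a sketch of that wrapping idea (Util and format_price are made-up names, not from the question):
<?php
class Util
{
    // formerly a loose global function in a generic include
    public static function format_price($amount)
    {
        return number_format($amount, 2);
    }
}

// call sites change from format_price(9.5) to:
echo Util::format_price(9.5); // prints 9.50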
Essentially, the best thing to do is go back through your code and pay off your design debt by fixing the include issues. This will give you the biggest performance benefit, and it will allow you to make the most of optimisers like eAccelerator, Zend Platform and APC.
Here is a sample method for loading classes dynamically:
public static function loadClass($class)
{
    // already loaded? nothing to do
    if (class_exists($class, false) ||
        interface_exists($class, false))
    {
        return;
    }

    // map e.g. Foo_Bar to YOUR_LIB_ROOT/Foo/Bar.php
    $file = YOUR_LIB_ROOT.str_replace('_', DIRECTORY_SEPARATOR, $class).'.php';
    if (file_exists($file))
    {
        include_once $file;
        if (!class_exists($class, false) &&
            !interface_exists($class, false))
        {
            throw new Exception('File '.$file.' was loaded but class '.$class.' was not found');
        }
    }
}
What you're looking for is the Automap PECL extension.
It basically allows for autoloading with only the small overhead of loading a pre-computed map file. You can also subdivide the map file if you know a specific directory will only pull from certain PHP files.
You can read more about it here.
It's been a while since I used PHP, but shouldn't the Zend Optimizer or cache help in this case? Does PHP still load and compile every included file again for every request?
I'm not sure if autoloading is the answer. If these files are included, they are probably needed in the class including it, so they will still be autoloaded anyway.
Use a byte code cache (ideally APC) so that PHP doesn't need to parse the libraries on each page load. Be aware that using autoload will negate the benefits of using a byte code cache (you can read more about this here).
Use a profiler. If you try to optimise without measurements, you're working blind.
