For our online game, we have written tons of PHP classes and functions, grouped by theme into files and then folders. As a result, all our backend code (logic & DB access layers) now lives in a set of files that we call libs, and we include our libs in our GUI (web pages, presentation layer) using include_once('pathtolib/file.inc').
The problem is that we have been lazy with inclusions, and most include statements are made inside the libs files themselves, with the result that from each webpage, whenever we include any libs file, we actually load the entire libs, file by file.
This has a significant impact on performance. What would be the best solution?
Remove all include statements from the libs files and only include the necessary ones from the web pages?
Do something else?
Server uses a classic LAMP stack (PHP5).
EDIT: We have a mix of simple functions (for legacy reasons, and they are the majority of the code) and classes, so autoload alone will not be enough.
Manage all includes manually, only where needed
Set your include_path to only where it has to be; the default is something like .:/usr/lib/pear/:/usr/lib/php. Point it only at what you actually need (php.net/set_include_path; see the sketch after this list)
Don't use autoload; it's slow and makes the job of APC and equivalent caches a lot harder
You can turn off the "stat"-operation in APC, but then you have to clear the cache manually every time you update the files
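A minimal sketch of that include_path tweak (the libs path is a hypothetical example):
<?php
// Resolve bare includes against the current dir and the libs dir only,
// instead of scanning the default PEAR/PHP paths on every include
set_include_path('.' . PATH_SEPARATOR . '/var/www/game/libs');
include_once 'file.inc'; // now resolved via the shortened path
?>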
If you've done your programming in an object-oriented way, you can make use of the autoload function, which will load classes from their source files on-demand as you call them.
Edit: I noticed that someone downvoted both answers that referred to autoloading. Are we wrong? Is the overhead of the __autoload function too high to use it for performance purposes? If there is something I'm not realizing about this technique, I'd be really interested to know what it is.
If you want to get really hard-core, do some static analysis, and figure out exactly what libraries are needed when, and only include those.
If you use include and not include_once, then there is a bit of a speed saving there as well.
All that said, Matt's answer about the Zend Optimizer is right on the money. If you want, try the Advanced PHP Cache (APC), which is an opcode cache, and free. It should be in the PECL repository.
You could use spl_autoload_register() or __autoload() to create whatever rules you need for including the files that you need for classes; however, autoload introduces its own performance overhead. You'll need to make sure whatever you use is prepended to all GUI pages using a php.ini setting (e.g. auto_prepend_file) or an Apache config.
For your files with generic functions, I would suggest that you wrap them in a utility class and do a simple find and replace to replace all your function() calls with util::function(), which would then enable you to autoload these functions (again, there is an overhead introduced by calling a method rather than a global function); a sketch follows below.
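A hedged sketch of that wrapping (util and sanitize are illustrative names, not anything from your codebase):
class util
{
    // Formerly a global function in one of the legacy function files
    public static function sanitize($value)
    {
        return htmlspecialchars($value, ENT_QUOTES);
    }
}

// Old call site:  $clean = sanitize($input);
// New call site, which an autoloader can now resolve:
$clean = util::sanitize($input);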
Essentially the best thing to do is go back through your code and pay off your design debt by fixing the include issues. This will give you the most performance benefit, and it will allow you to make the most of optimisers like eAccelerator, Zend Platform and APC.
Here is a sample method for loading stuff dynamically:
public static function loadClass($class)
{
    // Nothing to do if the class or interface is already defined
    if (class_exists($class, false) ||
        interface_exists($class, false))
    {
        return;
    }

    // Map e.g. My_Lib_Foo to YOUR_LIB_ROOT/My/Lib/Foo.php
    $file = YOUR_LIB_ROOT.str_replace('_', DIRECTORY_SEPARATOR, $class).'.php';

    if (file_exists($file))
    {
        include_once $file;

        if (!class_exists($class, false) &&
            !interface_exists($class, false))
        {
            throw new Exception('File '.$file.' was loaded but class '.$class.' was not found');
        }
    }
}
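Assuming the method lives on a hypothetical Loader class, you would register it once per request; spl_autoload_register() is the standard PHP5 way to do that:
// e.g. in a bootstrap file prepended to every page
spl_autoload_register(array('Loader', 'loadClass'));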
What you're looking for is the Automap PECL extension.
It basically allows for autoloading with only the small overhead of loading a pre-computed map file. You can also subdivide the map file if you know a specific directory will only pull from certain PHP files.
You can read more about it here.
It's been a while since I used php, but shouldn't the Zend Optimizer or Cache help in this case? Does php still load & compile every included file again for every request?
I'm not sure if autoloading is the answer. If these files are included, they are probably needed by the classes including them, so they will still be loaded anyway.
Use a byte code cache (ideally APC) so that PHP doesn't need to parse the libraries on each page load (a php.ini sketch follows below). Be aware that using autoload will negate the benefits of using a byte code cache (you can read more about this here).
Use a profiler. If you try to optimise without having measurements, you're working blind.
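For APC specifically, the cache is enabled in php.ini; a rough sketch (values are illustrative, and apc.stat = 0 is the "turn off stat" tweak mentioned earlier, which means flushing the cache manually on every deploy):
extension = apc.so
apc.enabled = 1
apc.stat = 0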
Related
I've recently inherited a large PHP application with NO objects/modules/namespaces...only a lot of files containing functions.
Of course, there are a LOT of dependencies (and all files are almost always included).
I'm looking for a tool that could analyse the files and generate a dependencies graph. It would then be easier to detect independent files/set of files and re-factor the whole thing.
So far the best solution I've found would be to write a CodeSniffer sniff to detect all functions calls and then use that to generate the graph.
This seems like something that would be useful to others, so I'm sure tools already exist for it.
What would you recommend ?
I think the best solution is to use a doc generator + Graphviz; phpDocumentor looks to have a GraphViz extension at https://github.com/phpDocumentor/GraphViz
This is an example made with phpDocumentor:
http://demo.phpdoc.org/Clean/graphs/classes.svg
You can also use a hierarchical profiler like xhprof (https://github.com/facebook/xhprof), which can draw a tree of all function calls from an execution.
(An example xhprof call graph rendered by Graphviz.)
I could recommend a lightweight project I wrote a few days ago. Basically, I had a 300+ file PHP project and I wanted to detect which files these files require/include and vice versa. Moreover, I wanted to check for each individual file what files it requires/includes (directly or indirectly, i.e. via file inheritance) and vice versa: what are the files that include this particular file. For any combination of these I wanted an interactive dependency graph (based on file inclusion and not on class/function calls/usage).
Check out the project's sandbox and its source code.
Note that the whole thing was written in only 2 days, so don't judge it too harshly. What's important is that it does its job!
Assume that we have Linux + Apache + PHP installed with all default settings. I have a PHP website that uses a large third-party PHP library, let's say 1 MB of PHP sources. This library is used very rarely, let's say for POST requests only. There is a reason why I can't move this library usage into a separate PHP file. So, I have to include this library for each HTTP request, but use it very rarely. Should I be concerned about the time spent on PHP parsing in that case? Let me explain. I could do this:
<?php
require_once('heavy_library.php');

// do regular stuff

if ($we_need_heavy_library) // pseudocode condition
{
    heavy_library_function();
}
?>
I assume that this solution is bad because in this case heavy_library.php is parsed for each HTTP request. I can move it into the if statement:
<?php
// do regular stuff

if ($we_need_heavy_library) // pseudocode condition
{
    require_once('heavy_library.php');
    heavy_library_function();
}
?>
Now, as I understand it, the library is parsed only when we actually need it.
Now, back to the question. Default settings for Apache and PHP. Should I be concerned about this issue? Should I move require_once to the place where it is really used, or can I leave it as usual, and Apache / PHP will do some kind of caching that prevents parsing on each HTTP request?
No, Apache will not do the caching. You should keep the require_once inside the if so it is only used when you need it.
If you do want caching of PHP, then look at something like eAccelerator.
When you tell PHP to require() something, it will do it no matter what; the only thing that prevents parsing that file from scratch every time will be to use an opcode cache such as APC.
Conditionally loading the file would be preferred in this case. If you're worried about making life more complicated by having these conditions, perform a small benchmark.
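A quick way to get such a measurement (heavy_library.php is the file from the question; microtime(true) requires PHP5):
<?php
$start = microtime(true);
require_once('heavy_library.php');
echo 'Including the library took ' . (microtime(true) - $start) . " seconds\n";
?>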
You could also use autoloading to load files "on demand" automatically; see spl_autoload
I have multiple PHP classes with approximately 20 functions each all in one PHP file. Will the server store every class and every function in memory once I include this file?
Or will it only load the class and its functions once I instantiate it like so?:
$ajax = new ajax();
Or will the server only cache the functions that I specifically call?:
$ajax->make_request();
I'm wondering if it is OK to have so many classes and functions housed in one single PHP file or if I should put in some type of logic that includes only the classes and functions that are required for the job.
I think you are a bit confused about how PHP works.
On every request, the PHP parser parses the requested file, e.g. index.php.
If index.php includes another file, PHP will then parse that file as well.
Once a PHP file is parsed, it is stored in memory as "byte codes" (an almost machine-level language) for the duration of that request.
Regardless of how many functions or classes are in a file, it will all be stored in memory for that request.
There are extensions like APC that cache these parsed byte codes in memory between requests, but they need to be added on to PHP.
It is, however, better (in terms of memory usage) to use autoloading for your classes.
http://php.net/manual/en/language.oop5.autoload.php
PSR-0 is a good set of guidelines for autoloading classes:
https://github.com/php-fig/fig-standards/blob/master/accepted/PSR-0.md
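A minimal PSR-0-style autoloader sketch (the classes/ root is an assumption, and real PSR-0 also maps namespace separators):
function psr0_autoload($className)
{
    // Map e.g. Foo_Bar to classes/Foo/Bar.php
    $fileName = str_replace('_', DIRECTORY_SEPARATOR, $className) . '.php';
    require 'classes/' . $fileName;
}
spl_autoload_register('psr0_autoload');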
The logic you need is called "autoloading":
this is the process that magically includes the class Ajax when the PHP executable hits a new Ajax(); instruction.
This way, no unused classes are ever loaded into memory.
Look at PSR-0 and this good ClassLoader component.
(If you only have 20 classes in your project, you don't need to add an autoloader - the gain will be very small.)
No, the server by default will not cache anything unless it is set up to do so with APC or a similar extension. So if your code uses all these classes, then it would perhaps be better to keep them in one file to reduce I/O. But if you do not use them all, separate the code into logical classes, put them into separate files and use autoloading.
Where should I put require_once statements, and why?
1. Always at the beginning of the file, before the class,
2. In the actual method when the file is really needed,
3. It depends
?
Most frameworks put includes at the beginning and do not care if the file is really needed.
Using an autoloader is the other case here.
Edit:
Surely we all agree that the autoloader is the way to go, but that is the "other case" I was not asking about here. (BTW, Zend Framework Application uses an autoloader, and the files are still hard-required and placed at the beginning.)
I just wanted to know why programmers include required files at the beginning of the file, even when they likely will not be used at all (e.g. Exception files).
Autoloading is a much better practice, as it will only load what is needed. Obviously, you also need to include the file which defines the __autoload function, so you're going to have some includes somewhere.
I usually have a single file called "includes.php" which then defines the __autoload and includes all the non-class files (such as function libraries, configuration files, etc). This file is loaded at the start of each page.
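A rough sketch of that pattern (file and directory names are illustrative):
// includes.php -- pulled in at the start of each page
function __autoload($class)
{
    // Classes resolve on demand from a hypothetical classes/ directory
    require_once 'classes/' . $class . '.php';
}

// Non-class files can't be autoloaded, so include them up front
require_once 'lib/functions.php';
require_once 'config/config.php';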
I'd say 3. It depends. If you're dealing with a lot of code, it may be worth loading the include file on request only, as loading code takes time and eats up memory. On the other hand, this makes maintenance much harder, especially if you have dependencies. If you load includes "on demand", you may want to use a wrapper function so you can keep track of what module is loaded where (a sketch follows below).
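A hedged sketch of such a wrapper (the function name and the logging are illustrative):
function load_module($name)
{
    static $loaded = array();

    if (!isset($loaded[$name]))
    {
        // Record what was loaded from where, to keep dependencies traceable
        error_log('Loading module ' . $name . ' for ' . $_SERVER['PHP_SELF']);
        require_once 'modules/' . $name . '.php';
        $loaded[$name] = true;
    }
}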
I think the autoloader mechanism really is the way to go - of course, the application's design needs to be heavily object oriented for that to work.
The question might prompt some people to say a definitive YES or NO almost immediately, but please read on...
I have a simple website where there are 30 php pages (each has some php server side code + HTML/CSS etc...). No complicated hierarchy, nothing. Just 30 pages.
I also have a set of purely back-end php files - the ones that have code for saving stuff to database, doing authentication, sending emails, processing orders and the like. These will be reused by those 30 content-pages.
I have a master php file to which I send a parameter. This specifies which one of those 30 files is needed and it includes the appropriate content-page. But each one of those may require a variable number of back-end files to be included. For example one content page may require nothing from back-end, while another might need the database code, while something else might need the emailer, database and the authentication code etc...
I guess whatever back-end page is required can be included in the appropriate content page, but one small change in the path and I have to edit tens of files. It would be too cumbersome to check which content page is requested (switch-case type of thing) and include the appropriate back-end files in the master php file. Again, I would have to make many changes if a single path changes.
Being lazy, I included ALL back-end files in the master file so that no content page can request something that is not included.
First question: is this good practice, and is it done by anyone at all?
Second, will there be a performance problem or any kind of problem due to me including all the back-end files regardless of whether they are needed?
EDIT
The website gets anywhere between 3000 - 4000 visits a day.
You should benchmark. Time the execution of the same page with different includes. But I guess it won't make much difference with 30 files.
But you can save yourself the time and just enable APC in the php.ini (it is a PECL extension, so you need to install it). It will cache the parsed content of your files, which will speed things up significantly.
BTW: There is nothing wrong with laziness, it's even a virtue ;)
If your site is object-oriented I'd recommend using auto-loading (http://php.net/manual/en/language.oop5.autoload.php).
This uses a magic method (__autoload) to look for a class when needed (it's lazy, just like you!), so if a particular page doesn't need all the classes, it doesn't have to get them!
Again, though, this depends on if it is object-oriented or not...
It will slow down your site, though probably not by a noticeable amount. It doesn't seem like a healthy way to organize your application, though; I'd rethink it. Try to separate the application logic (e.g. most of the server-side code) from the presentation layer (e.g. the HTML/CSS).
It's not a bad practice if the files are small and contain just definitions and settings.
If they actually run code, or are extremely large, it will cause a performance issue.
Now, if your site has 3 visitors an hour, who cares; if you have 30,000, that's another issue, and you need to work harder to minimise the cost.
You can mitigate some of the disadvantages of PHP code compiling by using XCache. This PHP module will cache the PHP opcode, which reduces compile time and improves performance.
Considering the size of your website; if you haven't noticed a slowdown, why try to fix it?
When it comes to larger sites, the first thing you should do is install APC. Even though your current method of including files might not benefit as much from APC as it could, APC will still do an amazing job speeding stuff up.
If response speed is still problematic, you should consider including all your files. APC will keep a cached version of your source files in memory, but can only do this well if there are no conditional includes.
Only when your PHP application is at a size where memory exhaustion is a big risk (note that for most large-scale websites memory is not the bottleneck) might you want to conditionally include parts of your application.
Rasmus Lerdorf (the man behind PHP) agrees: http://pooteeweet.org/blog/538
As others have said, it shouldn't slow things down much, but it's not 'ideal'.
If the main issue is that you're too lazy to go changing the paths for all the included files (should the path ever need to be updated in the future), then you can use a constant to define the path in your main file, and use the constant any time you need to include/require a file.
define('PATH_TO_FILES', '/var/www/html/mysite/includes/go/in/here/');
require_once PATH_TO_FILES.'database.php';
require_once PATH_TO_FILES.'sessions.php';
require_once PATH_TO_FILES.'otherstuff.php';
That way if the path changes, you only need to modify one line of code.
It will indeed slow down your website, mostly because of the relatively slow loading and processing of PHP. The more code you include, the slower the application will get.
I live by "include as little as possible, as much as necessary", so I usually just include my config and session handling for everything, and then each page includes just what it needs, using an include path defined in the config include; that way, for path changes you still just need to change one file.
If you include everything, the slowdown won't be noticeable until you get a lot of page hits (several hits per second), so in your case just including everything might be OK.