So I'm working on a project written in old-style (no OOP) PHP with no full rewrite in the near future. One of the problems with it currently is that it's slow: much of the time is spent requiring over 100 files, depending on where it is in the boot process.
I was wondering if we could condense this (on deployment, not development of course) into a single file or two with all the require'd text just built in. However, since there are so many lines of code that aren't used for each page, I'm wondering if doing this would backfire.
At its core, I think, it's a question of whether:
<?php
echo 'hello world!';
?>
is any faster than
<?php
if(FALSE) {
// thousands of lines of code here
}
echo 'hello world!';
?>
And if so, how much slower?
(Also, if what I've outlined above is a bad idea for some other reasons, please let me know.)
The difference between the two will be negligible. If most of the execution time is currently spent requiring files, you're likely to see a significant boost by using an opcode cache like APC, if you are not already using one.
Other than that - benchmark, find out exactly where the bottlenecks are. In my experience requires are often the slowest part of an old-style procedural PHP app, but even with many included files I'd be surprised if these all added up to a 'slow' app.
Edit: ok, a quick 'n dirty benchmark. I created three 'hello world' PHP scripts like the example. The first (basic.php) was just echoing the string. The second (complex.php) included an if false statement that contained ~5000 lines of PHP code pasted in from another app. The third (require.php) included the same if statement but required in the ~5000 lines of code from another file.
The difference in page generation time (as measured by microtime()) between basic.php and complex.php was around 0.000004 seconds, so really not significant. Some more comprehensive results from Apache Bench:
                 without APC           with APC
              req/sec   avg (ms)   req/sec   avg (ms)
basic.php:    7819.87   1.277      6960.49   1.437
complex.php:   346.82   2.883       352.12   2.840
require.php:  6819.24   1.446      5995.49   1.668
APC's not doing a lot here but using up memory, but it's likely to be a different picture in a real world app.
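For anyone who wants to repeat this kind of measurement on their own box, here is a minimal sketch of timing a batch of requires with microtime(). The generated files, the temp directory and the variable names are all made up for illustration; a real test should point at the app's actual include files.

```php
<?php
// Hypothetical stand-ins for the app's include files, written to a temp dir.
$dir = sys_get_temp_dir() . '/include_bench_' . uniqid();
mkdir($dir);

// Generate 100 small files, each declaring 80 variables and doing nothing else.
$files = [];
for ($i = 0; $i < 100; $i++) {
    $path = "$dir/part_$i.php";
    $body = "<?php\n";
    for ($j = 0; $j < 80; $j++) {
        $body .= "\$v_{$i}_{$j} = $j;\n";
    }
    file_put_contents($path, $body);
    $files[] = $path;
}

// Time how long it takes to require all of them.
$start = microtime(true);
foreach ($files as $file) {
    require $file;
}
$elapsed = microtime(true) - $start;

printf("Required %d files in %.6f seconds\n", count($files), $elapsed);

// Clean up the generated files.
foreach ($files as $file) {
    unlink($file);
}
rmdir($dir);
```

Run it a few times, with and without an opcode cache, before drawing conclusions: a single run mostly measures disk cache warm-up.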
require does have some overhead. 100 requires is probably a lot. Parsing a single file that contains all 100 includes is probably slow too. The overhead from require might cost you more, but it is hard to say. It might not cost enough to matter.
All benchmarks are evil, but here is what I did:
I ran a single include of a file that was about 8000 lines (no line did anything useful, each just declared a variable) and compared it to the time it takes to include an 80-line file (same declarations) 100 times. Results were inconclusive.
Is the including of the files really causing the problem? Is there not something in the script execution that can be optimized? Caching may be an option.
Keep in mind that PHP will parse all the code it sees, even if it's not run.
It will still take relatively long to process the file too, and from experience, lots of code will eat up considerable amounts of memory even though it's not executed.
Opcode caching as suggested by @Tim should be your first port of call.
If that is out of the question (e.g. due to server limitations): If the functions are somehow separable into categories, one possibility to make things a bit faster and lighter could be (ab)using PHP's Autoloading by putting the functions into separate files as methods of static classes.
function xyz() { ... }
would become
class generic_tools
{
public static function xyz() { ... }
}
and any call to xyz() is replaced by generic_tools::xyz();
The call would then trigger the inclusion of (e.g.) generic_tools.class.php on demand, instead of including everything at once.
This would require rewriting the function calls to static method calls, which may be dead easy or a bit more difficult (if function calls are cooked up dynamically or something). But beyond that, no refactoring would be needed, because you're not really using any OOP mechanisms.
How much this will actually help strongly depends on the app's architecture and how intertwined the functions are with each other.
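A minimal sketch of the on-demand loading described above, using spl_autoload_register(). The temp directory, the placeholder body of xyz() and the "<class>.class.php" naming rule are assumptions for the demo; in a real app the class files would already exist on disk.

```php
<?php
// Hypothetical layout: one "<class>.class.php" file per group of functions.
$libDir = sys_get_temp_dir() . '/autoload_demo_' . uniqid();
mkdir($libDir);

// Stand-in for generic_tools.class.php; the body of xyz() is a placeholder.
file_put_contents(
    $libDir . '/generic_tools.class.php',
    "<?php\nclass generic_tools\n{\n    public static function xyz() { return 'xyz called'; }\n}\n"
);

// The autoloader runs only when an unknown class is first used.
spl_autoload_register(function ($class) use ($libDir) {
    $file = $libDir . '/' . $class . '.class.php';
    if (is_file($file)) {
        require $file;
    }
});

// Second argument false: check without triggering the autoloader.
$loadedBefore = class_exists('generic_tools', false);
// First use of the class pulls in generic_tools.class.php.
$result = generic_tools::xyz();
$loadedAfter = class_exists('generic_tools', false);

var_dump($loadedBefore, $result, $loadedAfter);

// Clean up the demo file.
unlink($libDir . '/generic_tools.class.php');
rmdir($libDir);
```

Pages that never call generic_tools::xyz() never pay for parsing that file, which is the whole point of the trick.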
Related
I can have only one php file.
I need to define a class only on certain occasions to generate output.
Something like this:
if (date("j") == "1") {
echo "Not today";
exit;
}
new OutputGenerator();
class OutputGenerator {
// very large definition (~3500 lines)
}
The problem with this code is that PHP processes the whole file, and that makes the script very slow. Without the class declaration I can process ~500 requests per second. With the definition (even if I exit before it is defined) I get only ~50, which is a 90% performance drop.
Conditionally requiring external file with the declaration would solve my problem, but I need to stay within this one file.
Is there a way (other than using eval() or require an external file with the definition) to do this?
Parse time on the PHP file shouldn't take that long. I think the problem is that the constructor for OutputGenerator has some expensive operation going on in it.
You can conditionally create an instance of an OutputGenerator to dodge this cost.
if($iNeedAnOutputGenerator){
$og = new OutputGenerator();
}
Since your script is so extremely minimal, I buy into the idea that your performance can drop significantly. None of the things you are doing are very resource intensive, so I can see that a large class definition could make a noticeable difference in that case.
The most obvious answer, if parsing is the issue, is to use an opcode cache. From PHP 5.5 this is built in, but for any PHP version before that you should use the APC extension.
Even without APC though, I think 50 requests per second for such a simple script is low, and you may well have a completely different bottleneck somewhere.
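Before blaming parse time, it is worth checking whether an opcode cache is actually active for the SAPI serving the requests. A rough sketch, assuming the built-in OPcache or the old APC extension; the helper name opcode_cache_status() is made up for this example.

```php
<?php
// Hypothetical helper: reports which opcode cache, if any, is active.
function opcode_cache_status()
{
    if (function_exists('opcache_get_status')) {
        // Returns false when OPcache is loaded but disabled for this SAPI
        // (the CLI is off by default unless opcache.enable_cli=1).
        $status = @opcache_get_status(false);
        if (is_array($status) && !empty($status['opcache_enabled'])) {
            return 'opcache';
        }
        return 'opcache (loaded, not enabled)';
    }
    if (extension_loaded('apc')) {
        return 'apc';
    }
    return 'none';
}

echo opcode_cache_status(), "\n";
```

Call it from a page served by the web server rather than from the command line, since the two usually have different settings.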
I've just finished writing a pretty big class, and it's turned out to be noticeably slow (when executing a function from it).
The class contains 3 public functions and at least 10 private functions, resulting in 1,569 lines of code (a fair amount of that is comments).
Now I must admit, there is a lot of singing and dancing going on behind the scenes when calling one of the public functions, but the response time is so bad that the class is not even worth using.
The class file size is 57KB. I'm wondering if the file size is the problem here, or if it's simply the code I'm executing. So can I just break the file down, or am I going to have to cut back on what the code does?
I've also tried simply including the class file from another file, and the same goes...
In case this does any good:
1.) There are quite a few functions in the class that involve file reading/writing (such as file_exists(), fopen(), fwrite(), etc.)
2.) There are no database queries/connections
3.) There are no huge (over count of 20) loops
Any help is much appreciated!
IO is very likely your slowest operation.
Size of program itself isn't much of an issue. You can profile your code with xdebug to determine what specifically is dragging you down. Using it, you can prioritize what code you will optimize first.
I've had big files with no problems in terms of speed.
The most likely reason for the slowness is that you're not using file handles efficiently. For example, if you close a file after every use, it will take significantly longer than closing it once at the end of the script (or just letting it be closed implicitly by not closing it manually).
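To see the file handle point in practice, here is a small sketch that writes the same lines twice: once reopening the file for every write, once with a single handle. The temp file and the line count of 2000 are arbitrary choices for the demo, and the timings will vary by machine and filesystem.

```php
<?php
$path  = tempnam(sys_get_temp_dir(), 'fh_');
$lines = 2000;

// Variant A: open and close the file for every single write.
$start = microtime(true);
for ($i = 0; $i < $lines; $i++) {
    $fh = fopen($path, 'a');
    fwrite($fh, "line $i\n");
    fclose($fh);
}
$reopenTime = microtime(true) - $start;
$countA = count(file($path));

// Empty the file before the second run.
file_put_contents($path, '');

// Variant B: open once, write everything, close once.
$start = microtime(true);
$fh = fopen($path, 'a');
for ($i = 0; $i < $lines; $i++) {
    fwrite($fh, "line $i\n");
}
fclose($fh);
$singleTime = microtime(true) - $start;
$countB = count(file($path));

unlink($path);

printf("reopen each time: %.6f s, single handle: %.6f s\n", $reopenTime, $singleTime);
```

Both variants produce the same file; only the number of open/close calls differs, so any gap between the two timings comes from that overhead.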
I have about 10 dynamic php pages which use about 30 functions. Each function is needed in more than 1 page, and every page needs a different subset of functions.
I've been pondering these options:
1- all functions in a single include file: every page loads unneeded code
2- each function in its own include file: too many server requests when loading each page
3 - single include file with conditionals only declaring functions needed based on REQUEST_URI: additional processing when loading each page
4 - one include file per php page, with copies of functions needed by that page: difficult to maintain
How do people handle this scenario? Thanks!
Option 1 is the simplest, the easiest to maintain, and probably quicker to run than options 2 and 3. Option 4 would be very very slightly faster to run, at the cost of being a maintenance nightmare.
Stick with option 1.
Throw related functions into a library include. Include libraries as needed.
Further, if you spend another 5 seconds thinking about this, that will be 5 additional seconds you've wasted.
(In case you don't get what I'm saying, worrying about include optimization is about the 5billionth thing on your list of things you should ever worry about, until such time as a reported performance problem from end users and subsequent profiling tells you otherwise.)
Your problem shouts OOP.
Organize code by Classes. Load Classes as needed.
More info on PHP OOP.
The impact on your server of loading up 30 functions is negligible compared to the impact on yourself of having to maintain a map and possible links to all 30 functions.
Start by putting them all in one include_once'ed file.
If performance becomes an issue, start by installing a PHP accelerator (to cache the parsed PHP into opcodes, so constant re-parsing is avoided) on the server.
When maintenance becomes an issue, break them up by function. You'll still end up with a catch-all "util.inc" or "misc.inc" file, though...
What is better practice for script performance tuning?
This one?
require_once("db.php");
if (!is_cached()) {
require_once("somefile.php");
require_once("somefile2.php");
//do something
} else {
//display something
}
Or this one?
require_once("db.php");
require_once("somefile.php");
require_once("somefile2.php");
if (!is_cached()) {
//do something
} else {
//display something
}
Is it worth placing includes/requires into flow control structures or no?
Thank you
Yes, it will increase performance and contrary to what others said, the impact may not be negligible.
When you include/require files, PHP will check all the defined include_paths until it finds the file or there is nothing more to check. Depending on the number of include paths and the number of files to include, this can have a serious impact on your application, especially when including each and every file up front. Lazy including (like in your first example) or, even better, using the Autoloader is advisable.
Related reading that addresses this in more details:
Zend Framework Performance Guide*
*The advice given there is applicable to any application or framework
Includes and requires are designed in part to allow you to organize your code and make maintaining your software easier. While there technically is a performance hit for using includes, it is negligible and more than offset by the increased maintainability of your software, especially if you use an opcode cache. I wouldn't worry about this and would chalk it up to premature optimization.
It can sometimes be worth it if the required files are huge, and no byte code cache is available.
The issue then, however, is not really the number of include() statements, but the amount of data that gets included. The less unused code you include in a request, the better.
A good remedy to monolithic includes is splitting the code base into smaller ones, or, if your application is largely object oriented, using PHP's autoloading feature.
I've seen shared hosting packages where un-tangling monolithic includes could save up to half a second - which is a lot.
Also, included PHP code gets parsed and takes up memory, whether it's executed or not. A clean structure with lean objects is usually the optimal way.
Yes, it will increase performance, but it is very very little so it isn't worth it.
Using absolute paths for include files will increase performance significantly. Test using Apache Benchmark to see if it doesn't.
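A sketch of the two lookup styles side by side. The temp directory and somefile.php are created here only so the example runs on its own; in a real script the absolute form would typically be written as require __DIR__ . '/somefile.php'.

```php
<?php
// Hypothetical include file created in a temp dir for the demo.
$dir = sys_get_temp_dir() . '/abs_path_demo_' . uniqid();
mkdir($dir);
file_put_contents($dir . '/somefile.php', "<?php\nreturn 'somefile loaded';\n");

// Relative name: PHP walks the include_path entries until it finds the file.
$oldPath = set_include_path($dir . PATH_SEPARATOR . get_include_path());
$viaIncludePath = require 'somefile.php';
set_include_path($oldPath);

// Absolute path: PHP opens the file directly, no include_path search.
$viaAbsolute = require $dir . '/somefile.php';

var_dump($viaIncludePath, $viaAbsolute);

// Clean up the demo file.
unlink($dir . '/somefile.php');
rmdir($dir);
```

How much the absolute form saves depends on how many include_path entries have to be tried first, so measure it on the real setup.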
I generally include one functions file in the header of my site. The site is pretty high traffic and I like to make every little thing as good as I can, so my question is:
Is it better to include multiple smaller function files with just the code that's needed for each page, or does it really make no difference to load it all as one big file? My current functions file has all the functions for my whole site. It's about 4,000 lines long and is loaded on every single page load sitewide. Is that bad?
It's difficult to say. 4,000 lines isn't that large in the realms of file parsing. In terms of code management, that's starting to get on the unwieldy side, but you're not likely to see much of a measurable performance difference by breaking it up into 2, 5 or 10 files, and having pages include only the few they need (it's better coding practice, but that's a separate issue). Your differential in number-of-lines read vs. number-of-files that the parser needs to open doesn't seem large enough to warrant anything significant. My initial reaction is that this is probably not an issue you need to worry about.
On the opposite side of the coin, I worked on an enterprise-level project where some operations had an include() tree that often extended into the hundreds of files. Profiling these operations indicated that the time taken by the include() calls alone made up 2-3 seconds of a 10 second load operation (this was PHP4).
If you can install extensions on your server, you should take a look at APC.
It is free, by the way ;-) but you must be admin of your server to install it, so it's generally not provided on shared hosting.
It is what is called an "opcode cache".
Basically, when a PHP script is called, two things happen :
the script is "compiled" into opcodes
the opcodes are executed
APC keeps the opcodes in RAM, so the file doesn't have to be re-compiled each time it is called, and that's a great thing for both CPU load and performance.
To answer the question a bit more :
4,000 lines is not that much, performance-wise; open a couple of files of any big application or framework and you'll quickly get to a couple thousand lines
a really important thing to take into account is maintainability: what will be easier to work with for you and your team?
loading many small files might imply many system calls, which are slow; but those would probably be cached by the OS, so that's probably not that relevant
If you are doing even 1 database query, this one (including network round-trip between PHP server and DB server) will probably take more time than the parsing of a couple thousand lines ;-)
I think it would be better if you could split the functions file up into components that are appropriate for each page, and call for those components in the appropriate pages. Just my 2 cents!
p/s: I'm a PHP amateur and I'm trying my hands on making a PHP site; I'm not using any functions. So can you enlighten me on what functions would you need for a site?
In my experience having a large include file which gets included everywhere can actually kill performance. I worked on a browser game where we had all game rules as dynamically generated PHP (among others) and the file weighed in at around 500 KiB. It definitely affected performance and we considered generating a PHP extension instead.
However, as usual, I'd say you should do what you're doing now until it is a performance problem and then optimize as needed.
If you load a 4000 line file and use maybe 1 function that is 10 lines, then yes I would say it is inefficient. Even if you used lots of functions of a combined 1000 lines, it is still inefficient.
My suggestion would be to group related functions together and store them in separate files. That way if a page only deals with, for example, database functions you can load just your database functions file/library.
Another reason for splitting the functions up is maintainability. If you need to change a function, you need to find it in your monolithic include file. You may also have functions that are very, very similar but don't even realise it. Sorting functions by what they do allows you to compare them and get rid of things you don't need or merge two functions into one more general purpose function.
Most of the time disk IO is what will kill your server, so I think the fewer files you fetch from disk the better. Furthermore, if it is possible to install APC, then the file will be stored compiled in memory, which is a big win.
Generally it is better, file management wise, to break stuff down into smaller files because you only need to load the files that you actually use. But, at 4,000 lines, it probably won't make too much of a difference.
I'd suggest a solution similar to this
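// Load a function library, e.g. inc_lib('db') for /path/to/lib/db.lib.php
function inc_lib($name)
{
include("/path/to/lib/".$name.".lib.php");
}
// Load a class file, e.g. inc_class('user') for /path/to/lib/user.class.php
function inc_class($name)
{
include("/path/to/lib/".$name.".class.php");
}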