I am currently writing a PHP framework. I am wondering whether it will slow down if PHP has to require/include (or require_once/include_once) too many files during a request?
Well, of course it will. Doing anything too many times will cause a slowdown.
On a more serious note, though, I/O operations that touch disk are very slow compared to anything that happens in memory. So, quite often, including files will be a major performance factor when using a large framework (just look at Zend Framework...).
However, there are typically ways to alleviate this, such as APC and similar opcode caches.
Sometimes programming approaches are also taken. For example, if I remember correctly, Doctrine 1 has the ability to bundle everything into one giant file so as to make fewer I/O calls.
If in doubt, do some in-depth profiling of an application written with your framework and see whether include/require calls are among the major slow points.
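For instance, a rough profiling sketch along those lines (the framework file names below are placeholders, not real files):
<?php
// Rough profiling sketch: measure wall time and peak memory around a batch
// of includes. The file names are placeholders for your framework's files.
$start  = microtime(true);
$before = memory_get_peak_usage();

require 'lib/Router.php';
require 'lib/Request.php';
require 'lib/Response.php';

printf(
    "includes took %.4f s and added %d bytes to peak memory\n",
    microtime(true) - $start,
    memory_get_peak_usage() - $before
);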
Yes, this will slow your application down. The *_once calls are generally more expensive, since PHP must check whether that file has already been included. With a lot of includes there is also a lot of hard-disk access and a lot of memory usage bundled in. I've developed applications with the Zend Framework that include a total of 150 to 200 files on each request - you really can see the impact that has on overall performance.
Every additional file you include adds some load. However, if you have to choose between require and require_once, note that require_once/include_once cost more, because the server needs to check whether the same file has already been included elsewhere. So if you can avoid the *_once variants, you can at least gain a little performance.
Unless you use a cache, those files will be included again and again on every request, which will certainly slow things down. Create a framework that only includes what needs to be included.
To give some context:
I had a discussion with a colleague recently about the use of Autoloaders in PHP. I was arguing in favour of them, him against.
My point of view is that autoloaders can help you minimise manual source dependencies, which in turn can help you reduce the amount of memory consumed when including lots of large files that you may not need.
His response was that including files that you do not need is not a big problem, because after a file has been included once it is kept in memory by the Apache child process, and this portion of memory will be available for subsequent requests. He argues that you should not be concerned about the number of included files, because soon enough they will all be loaded into memory and used on demand from memory. Therefore memory is less of an issue, and the overhead of trying to find the file you need on the filesystem is much more of a concern.
He's a smart guy and tends to know what he's talking about. However, I always thought that the memory used by Apache and PHP was specific to that particular request being handled.
Each request can use at most the amount of memory set by the memory_limit PHP option, and any source compilation and processing is only valid for the life of that request.
Even with opcode caches such as APC, I thought that the individual request still needs to load up each file in its own portion of memory, and that APC is just a shortcut to having it pre-compiled for the responding process.
I've been searching for some documentation on this but haven't managed to find anything so far. I would really appreciate it if someone can point me to any useful documentation on this topic.
UPDATE:
Just to clarify, the autoloader discussion part was more of a context :).
It may not have been clear, but my main question is about whether Apache will pool its resources to respond to multiple requests (especially memory used by included files), or whether each request will need to retrieve the code required to satisfy the execution path in isolation from other requests handled by the same process.
e.g.:
Files 1, 2, 3 and 4 are an equal size of 100KB each.
Request A includes file 1, 2 and 3.
Request B includes file 1, 2, 3 and 4.
In his mind he's thinking that Request A will consume 300KB for the entirety of its execution, and Request B will only consume a further 100KB because files 1, 2 and 3 are already in memory.
In my mind it's 300KB and 400KB because they are both being processed independently (if by the same process).
This brings him back to his argument that "just include the lot 'cos you'll use it anyway" as opposed to my "only include what you need to keep the request size down".
This is fairly fundamental to how I approach building a PHP website, so I would be keen to know if I'm off the mark here.
I've also always been of the belief that for a large-scale website, memory is the most precious resource and more of a concern than the file-system checks for an autoloader, which are probably cached by the kernel anyway.
You're right though, it's time to benchmark!
Here's how you win arguments: run a realistic benchmark, and be on the right side of the numbers.
I've had this same discussion, so I tried an experiment. Using APC, I tried a Kohana app with a single monolithic include (containing all of Kohana) as well as with the standard autoloader. The final result was that the single include was faster by a statistically insignificant margin (less than 1%) but used slightly more memory (according to PHP's memory functions). Running the test without APC (or XCache, etc.) is pointless, so I didn't bother.
So my conclusion was to continue using autoloading because it's much simpler to use. Try the same thing with your app and show your friend the results.
Now you don't need to guess.
Disclaimer: I wasn't using Apache. I cannot emphasize enough to run your own benchmarks on your own hardware on your own app. Don't trust that my experience will be yours.
You are the wiser ninja, grasshopper.
Autoloaders don't load a class file until the class is requested. This means that they will use at most the same amount of memory as manual includes, but usually much less.
Class files get read fresh from disk on each request, even though an Apache child can handle multiple requests, so your friend's "eventually they're all read into memory" argument doesn't hold water.
You can prove this by putting an echo 'foo'; above the class definition in the class file. You'll see that on each new request the line is executed, regardless of whether you autoload or manually include the whole world of class files at the start.
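For example, a minimal version of that test (file and class names are just placeholders):
<?php
// classes/Foo.php -- placeholder class file
echo "Foo.php was read and compiled for this request\n";

class Foo
{
    public function bar()
    {
        return 'bar';
    }
}
Hit the page twice: the echo fires on every request, whether Foo.php is pulled in by an autoloader or by a manual require at the top of the script.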
I couldn't find any good, concise documentation on this - I may write some with memory-usage examples - as I have also had to explain this to others and show evidence to get it to sink in. I think the folks at Zend didn't expect that anyone would fail to see the benefits of autoloading.
Yes, APC and the like (like all caching solutions) can overcome the resource negatives and even eke out small performance gains, but you eat up a lot of unneeded memory if you do this with a non-trivial number of libraries while serving a large number of clients. Try something like loading a healthy chunk of the PEAR libraries in a massive include file while handling 500 connections hitting your page at the same time.
Even when using things like APC, you benefit from autoloaders with any non-namespaced classes (most existing PHP code at the moment), as they can help avoid global namespace pollution when dealing with large numbers of class libraries.
This is my opinion.
I think autoloaders are a very bad idea, for the following reasons:
I like to know what and where my scripts are grabbing the data/code from. Makes debugging easier.
There are also configuration problems, insofar as if one of your developers changes a file (an upgrade, etc.) or the configuration and things stop working, it is harder to find out where it is broken.
I also think that it is lazy programming.
As for memory/performance issues, it is just as cheap to buy some more memory for the machine if it is struggling with that.
Just wondering... does it? And by how much?
For example, including 20 .php files with classes in them, but without actually using the classes (though they might be used).
I will give a slightly different answer to this:
If you are running on a tuned VPS or dedicated server: a trivial amount.
If you are running on a shared hosting service: it can considerably degrade performance of your script execution time.
Why? Because in the first case you should have configured a PHP opcode cache such as APC or XCache, which can, in practical terms, eliminate script load and compilation overheads. Even where files need to be read or stat-checked, the metadata and file data will be "hot" and therefore largely held in the file-system cache if the (virtual) server is dedicated to the application.
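For reference, a minimal APC configuration along those lines (the values are illustrative assumptions, not a recommendation for any particular box):
; php.ini -- illustrative values only
apc.enabled  = 1
apc.shm_size = 64M   ; shared memory reserved for cached opcodes
apc.stat     = 0     ; skip per-request stat() checks on a dedicated box;
                     ; requires a cache flush or restart when files change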
On a shared service everything runs in the opposite direction: PHP is run as a per-request image under the user's UID; no opcode-caching solution supports this mode, so everything needs to be compiled. The killer here is that files need to be read, and many (perhaps most) shared LAMP hosting providers use a scalable server farm for the LAMP tier, with the user data on shared NFS-mounted NAS infrastructure. Since these NFS mounts will have an attribute-cache timeout (acregmin) of less than a minute, the I/O requests will require RPCs off-server. My blog article gives some benchmarks here. The details for a shared IIS hosting template are different, but the net effects are similar.
I run the phpBB forum package on my shared service and I roughly halved response times by aggregating the common set of source includes as I describe here.
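The general idea is a small build step along these lines (a rough sketch, not the exact script from that article; file names are placeholders, and the source files are assumed to omit closing ?> tags):
<?php
// build_common.php -- concatenate the common includes into one file, so a
// request on shared hosting does one read instead of many NFS round trips.
$sources = array('common/functions.php', 'common/classes.php', 'common/constants.php');
$bundle  = "<?php\n";

foreach ($sources as $file) {
    $code = file_get_contents($file);
    // strip each file's opening tag so the pieces concatenate cleanly
    $bundle .= preg_replace('/^\s*<\?php/', '', $code) . "\n";
}

file_put_contents('common/_bundle.php', $bundle);
// pages then do a single: require 'common/_bundle.php';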
Yes, though how much depends on a number of things. The performance cost isn't too high if you are using a PHP accelerator, but things will slow down drastically if you aren't. Your best bet is generally to use autoloading, so you only load things at the point of actual use rather than loading everything just in case. That may reduce your memory consumption too.
Of course it affects the performance. Everything you do in PHP does.
How big the impact is depends on how much code is in them and how long it takes to execute them or, in the case of classes, to read them.
If you're not using them, why include them? I assume you're using some main engine file or header file, and you should rethink your method of including files.
EDIT: Or, as @Pekka pointed out, you can autoload classes.
Short answer - yes it will.
For longer answers, a quick Google search revealed these: Will including unnecessary php files slow down website?; PHP Performance on including multiple files
Searching helps!
--Matīss
By "common script startup sequence", what I mean is that in the majority of pages on my site, the first order of business is to consult 3 specific files (via include()), which centrally define constants, certain functions used in many scripts, and a class or two, as well as providing the database credentials. I don't know if there's a more standard term for such a setup.
What I want to know is whether it's possible to have too many of these and make things slower as a result. I know that using include() has a certain amount of overhead because it's another file to look for in the filesystem, parse, and execute. If there is such a thing as too many includes, I want to know whether I am anywhere near that point. N.B. Some of my pages include() still more scripts that they specifically, individually need (for example, a script that defines a function used by only a few pages), and I do not count these occasional extra includes, which are used reasonably sparingly anyway. I'm only worrying about the 3 includes that occur on the majority of pages and set everything up.
What are the 3 includes?
Two of them are outside of webroot. common.php defines a bunch of functions, classes and other things that do not vary between the development and production sites. config.php defines various constants and paths that are different in the development and production sites (which database to connect to, among other things). Of course, it's desirable for this file in particular to be outside of webroot. config.php include()s common.php at the bottom.
The other one is inside webroot and contains a single line:
include [path to appropriate directory]/config.php
The directory differs between the development and production sites.
(Feel free to question the rationale behind setting up the includes this way, but I feel that this does provide a good, reliable system for preparing to execute each page, and my question is about whether it is bad to have that many includes as a baseline on each page.)
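In code, the chain described above looks roughly like this (paths and values are placeholders, shown here as three separate files):
<?php
// /var/www/site/page.php -- inside webroot; the only include it needs
include '/home/site/private/config.php';

// /home/site/private/config.php -- outside webroot; differs per environment
define('DB_HOST', 'localhost');
define('DB_NAME', 'example_db');
include __DIR__ . '/common.php';   // pulls in the shared code at the bottom

// /home/site/private/common.php -- outside webroot; identical on dev and production
function h($s) { return htmlspecialchars($s, ENT_QUOTES, 'UTF-8'); }
// ... shared constants, functions and a class or two ...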
Use APC and your worries go away. The opcodes of your files will be cached in RAM and everything will go super fast. :) Facebook does this, so it'll definitely help you to scale.
You may not notice any difference between 1 include and 50 in terms of speed, but for an application with high concurrency, I/O can be a huge bottleneck. So the key is not speed, but scaling.
The best thing to do is to use an accelerator of some kind - APC, eAccelerator or something similar - to keep the scripts cached in RAM. There are quite a few reasons for this, and on a busy site it means a lot.
For example, a friend ran an experiment on his website, which has about 15k users a day and an average page load time of 0.03s. He removed most of the includes, which he used as templates, and the average load time dropped to 0.01s. Then he added an accelerator: 0.002s per page. I hope those numbers convince you that includes must be kept to a minimum on busy sites if you don't use an accelerator of some kind.
This is because of the high I/O which is needed to scan directories, find the files, open them, read them and so on.
So keep the includes to a minimum. Study the most important parts of your site and optimise there, by moving required parts into general includes and so on.
I don't believe performance has much to do with the number of includes as such - think of a case where one included file contains 500 lines of code versus another where you have 50 included files with just one line of code each.
Or, if you happen to be using Windows as your OS, you can use WinCache.
http://php.net/manual/en/book.wincache.php
Q1)
I'm designing a CMS (who isn't!), but priority is being given to caching. Literally everything is cached: DB rows, DB ID queries, configuration data, processed data, compiled templates. Currently it has two layers of caching.
The first is an opcode cache or memory cache such as APC, eAccelerator, XCache or memcached. If an entry is not found there, it is then searched for in the secondary, slower cache, i.e. PHP includes.
Are the opcode caches actually faster than doing a require_once of a PHP file with a var_export'd array of data in it? My tests are inconclusive, as my development box (XAMPP with PHP 5.3) keeps throwing errors when installing any of the aforementioned programs.
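For what it's worth, the second layer being compared is roughly this pattern (names, paths and data here are placeholders):
<?php
// Writing the "slow" file-based cache: dump the data as executable PHP.
$data = array('site_name' => 'Example', 'per_page' => 20);   // placeholder data
file_put_contents(
    'cache/config.cache.php',
    '<?php return ' . var_export($data, true) . ';'
);

// Reading it back: include the file and take its return value. With APC or
// XCache enabled, the compiled opcodes of this file are cached too, so the
// two layers overlap rather than compete.
$cached = require 'cache/config.cache.php';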
Q2)
The CMS has numerous helper classes that are autoloaded on demand instead of all files being loaded. Mostly each usage has a require before it, so no autoloading actually needs to take place; however, that is not the question. Because a page script can have up to 50-60 helper files included, I have a feeling that if the site were under pressure it would buckle because of all the I/O this incurs. Ignore for the moment that there is an output cache in place that would remove the need for what I am about to suggest, and also that opcode caches would render this moot. What I have tried to do is join all the helper files required for a script's execution into one single file. This is achievable and works well; however, it has the side effect of increasing memory usage dramatically, even though technically the same code is being used.
What are your thoughts and opinions on this?
Using a compiler cache like APC should help, as it will take your helper files and cache them after they are converted to opcodes. That means the files will not only be cached but already compiled to opcodes, so they do not need to be parsed and compiled each time they are required.
Looks like you just have no idea what you want to cache (and why).
You just cannot compare an "opcode cache" with "require_once". An opcode cache will cache required code just like any other code.
First, keep in mind that your operating system will cache files in memory if they are being accessed frequently enough.
Also, don't use require_once. It is significantly slower than require. If you aren't using an autoloader, you should be. There is no reason to be manually including files in a modern PHP application (with very few exceptions).
50-60 helper files is crazy. Isn't there some way to combine these? Can't you put them all in a related helper class, like OutputHelper or CacheHelper? That way you only have to include the class, which, again, should be taken care of by your autoloader. It sounds to me like you're doing something like putting one function per file.
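Something along these lines, for example (a sketch; the method names are made up):
<?php
// helpers/OutputHelper.php -- groups related helpers in one file, so one
// include (or one autoload hit) replaces a pile of single-function files.
class OutputHelper
{
    public static function escape($value)
    {
        return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
    }

    public static function truncate($text, $length = 100)
    {
        return strlen($text) > $length ? substr($text, 0, $length) . '...' : $text;
    }
}
A call site then just uses OutputHelper::escape($title), and the autoloader pulls in that single file on first use.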
Opcode caching greatly reduces memory usage and execution time, but I'm not sure what effect it has on require statements.
I agree with ryeguy. require_once is slower than require or include because it has to log every include and check against that log. If you're only doing one require/include (which you should be for classes), then you don't need require_once or include_once.
Autoloading is great for optimization, as you only load classes when they are needed. So if your app has 500 classes but only needs 15 to run a certain page/script, then only those 15 get loaded. Which is nice.
If you take a peek at any big framework, you will notice that they have migrated to using autoloaders. They used to use require_once at the last moment, like this example from Zend Framework version 1:
require_once 'Zend/Db/Exception.php';
throw new Zend_Db_Exception('Adapter name must be specified in a string');
Zend Framework version 2 is going to use autoloaders instead. I believe this is the fastest approach, and it's also the easiest to code for.
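A minimal autoloader in that spirit (the PEAR-style class-to-path mapping is an assumption about the naming scheme):
<?php
// Register a lazy class loader: the class file is only read the first time
// the class is used, instead of being require_once'd up front (ZF1 style).
spl_autoload_register(function ($class) {
    // map Zend_Db_Exception -> Zend/Db/Exception.php
    $file = str_replace('_', DIRECTORY_SEPARATOR, $class) . '.php';
    $path = stream_resolve_include_path($file);
    if ($path !== false) {
        require $path;
    }
});

// Now `new Zend_Db_Exception(...)` triggers the loader on first use, with no
// require_once calls scattered through the code.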
I read through the related questions and didn't find my answer. This isn't about require/require_once or the use of the __autoload function or even the name of the files.
My company builds large sites and as we've grown, the practice we've grown into is splitting up functions by their relation such as:
inc.functions-user.php
inc.functions-media.php
inc.functions-calendar.php
Each of these files tends to be 1,000 to 3,000 lines of code. Combining them would produce a monster to maintain and make things more difficult for multiple developers.
However, on some of our larger sites, we end up with somewhere between 8 and 15 of these individual function files.
Is including the 15 function files in the header the best way, or should we find a way to combine them? Are 12 includes vs. 5 includes significantly detrimental to the running of our site?
If you care about performance, install an opcode cache like APC, which will save the compiled form of the scripts in memory.
If you don't want to install APC, the difference is minimal. Yes, accessing fewer files takes less time, but that's not where most of the time is spent (especially as the filesystem should be able to cache the scripts, uncompiled, in memory if they are requested often enough).
Calling include/require 5 times instead of 12 doesn't make much difference; what matters is the content of the included file(s).
Also, opcode caches such as APC or XCache are well suited to your purpose.
I would even suggest splitting them into many more files.
Look at the MVC pattern or other frameworks: they are split up to a great degree, so you can easily maintain "only" parts without worrying about breaking something, as long as you follow your structure.
Some other points I also think about:
Rasmus Lerdorf has said frequently that "you shouldn't have more than about five includes". I can only assume that he knows what he's talking about, because he made PHP. I am skeptical about the feasibility of this, however. Especially on large projects.
I've found that it's better for development and milestones to make life easier on your developers. If that means separate files, then that's a good idea.
If you're worried about CPU usage or bandwidth, there are probably more obvious bottlenecks than liberal use of include. Optimising slow functions is a better way to make the app faster, and paying attention to images and CSS or JS files is a good way to reduce bandwidth.
With vanilla PHP it is generally better to use as few include files as possible, but of course that makes maintenance a pain. Use an opcode cache such as APC and the performance problem will pretty much disappear. Also, 12 files isn't a very large number of includes, compared to the large MVC-frameworks and other libraries. Keeping the functions separated in a logical structure is the best way by far.