I read through the related questions and didn't find my answer. This isn't about require/require_once or the use of the __autoload function or even the name of the files.
My company builds large sites and as we've grown, the practice we've grown into is splitting up functions by their relation such as:
inc.functions-user.php
inc.functions-media.php
inc.functions-calendar.php
Each of these files tends to be 1,000 to 3,000 lines of code. Combining them would create a monster to maintain and make things harder for the other developers.
However, on some of our larger sites, we end up with somewhere between 8 and 15 of these individual functions files.
Is including all 15 functions files in the header the best approach, or should we find a way to combine them? Are 12 includes versus 5 includes significantly detrimental to the performance of our site?
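For concreteness, a sketch of what the header include list looks like (the header file name is hypothetical; the function files are the ones named above):

<?php
// header.php (sketch) - every page pulls in the grouped function files.
require_once 'inc.functions-user.php';
require_once 'inc.functions-media.php';
require_once 'inc.functions-calendar.php';
// ...up to 15 inc.functions-*.php files on the larger sites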
If you care about performance, install an opcode cache like APC, which will save the compiled form of the script in memory.
If you don't want to install APC, the difference is minimal. Yes, accessing fewer files takes less time, but that's not where most of the time is spent, especially as the filesystem should be able to cache the scripts (uncompiled) in memory if they are requested often enough.
Calling include/require 5 times instead of 12 doesn't make much difference; what matters is the content of the included file(s).
Also, opcode caches such as APC or XCache are well suited to your purpose.
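As a quick sanity check, something like this sketch will tell you whether one of those extensions is actually loaded (the extension names can vary between builds, so treat the list as an assumption):

<?php
// A quick sketch: report which opcode cache extensions, if any, are loaded.
foreach (['apc', 'XCache', 'Zend OPcache'] as $ext) {
    printf("%-12s %s\n", $ext . ':', extension_loaded($ext) ? 'loaded' : 'not loaded');
}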
I would even suggest splitting them into many more files.
Look at the MVC pattern, or at other frameworks: they are split up extensively, so you can easily maintain individual parts without worrying about breaking something else, as long as you follow your structure.
Some points that I also consider:
Rasmus Lerdorf has said frequently that "you shouldn't have more than about five includes". I can only assume that he knows what he's talking about, because he made PHP. I am skeptical about the feasibility of this, however. Especially on large projects.
I've found that it's better for development and milestones to make life easier on your developers. If that means separate files, then that's a good idea.
If you're worried about CPU usage or bandwidth, there are probably more obvious bottlenecks than liberal use of include. Un-optimized functions are a better place to look if you want to make the app faster, and images and CSS or JS files are a better place to look if you want to reduce bandwidth.
With vanilla PHP it is generally better to use as few include files as possible, but of course that makes maintenance a pain. Use an opcode cache such as APC and the performance problem will pretty much disappear. Also, 12 files isn't a very large number of includes compared to the large MVC frameworks and other libraries. Keeping the functions separated in a logical structure is by far the best way.
I am creating a PHP website that contains several sections, and I was wondering: is it safe to keep all of my functions in one file and then include it in every other file?
It would certainly make things easier for me, but do you think it's a good idea, in terms of both security and speed? If I keep all my functions in a single file it will definitely become quite big, and most pages won't need most of those functions, so wouldn't that affect my script's speed?
And do you think it's wise to keep all of them together? Aren't I just making it easier for hackers to find the core of my script? What do you suggest I should do?
Big, long functions files were (and still are, to an extent) pretty common in PHP projects.
Security of all files should certainly be a consideration, but a single functions file is no different from any other, really. Best practice is to keep it outside of your web root so that if PHP fails (it happens), your file still won't be accessible. If the server gets hacked, the location and format of your files is unlikely to make any difference.
Modern frameworks typically include hundreds of different files on every page load. The extra time this takes is barely measurable and probably not worth thinking about.
For maintainability, it's rarely a good idea to have one massive file. You should look at separating these into utility classes and autoloading them as needed.
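A minimal sketch of that approach, assuming one class per file under a lib/ directory (the paths and class names here are made up):

<?php
// A minimal autoloader sketch: one class per file under lib/,
// e.g. lib/UserFunctions.php defines class UserFunctions.
spl_autoload_register(function ($class) {
    $file = __DIR__ . '/lib/' . $class . '.php';
    if (is_file($file)) {
        require $file;
    }
});

// The file is only read when the class is first used:
// $users = UserFunctions::all();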
Security-wise, it is safe as long as the file is kept under the right permissions.
It is not best practice; I suggest you take a look at PHP's autoloading.
Security has nothing to do with the size of the file.
You just need to make it inaccessible (which you can do with .htaccess) and hidden from the public (keep it outside of the web root).
And what about speed?
I think it's nicer to organize the files as specifically as possible.
If there are many tiny functions, you can organize them by shared characteristics (string functions, array functions, etc.).
The time overhead is very small and effectively negligible.
And I think maintainability is much more important than that negligible performance difference.
Sorry if there is something wrong with this question. I'm developing a website, and I'm confused about one file-system choice: should I load files from a shallow directory structure or from deeply nested directories? For example:
A. file_get_contents('layout/guest/pages/home/data/slogan.txt');
include_once 'layout/guest/required/front.php';
OR
B. file_get_contents('layout/slogan.txt');
include_once 'layout/front.php';
Which performs faster?
I worry about this because the website performs lots of file-system operations. Looking at FileZilla, for example, it seems that navigating many nested directories takes more time. But I'm not sure, so I hope you can help.
Thank you for all your help :)
Assuming you're using a UNIX-based OS, there will be very little difference, so you should use whatever you find easier to maintain. FTP is an entirely different case, as it actually traverses directories the way a human would (it doesn't have access to your inodes).
Because of how inodes work, your operating system is not going to traverse your directories one by one looking for a reference to another file. Directories exist to make your life easier, but most filesystems do not represent them internally as anything more than an organizational file.
You will gain a filesystem performance boost by enabling dir_index on your extX filesystem (or alternatively, check out XFS, as it's really good at dealing with large numbers of files), regularly cleaning out files, defragmenting the disk, and using faster drives.
Also, try to use require_once() rather than require() when loading files, as this way the file will only be loaded a single time.
How deeply nested your directories are makes virtually no difference whatsoever. Only the number, size and complexity of the files you include matters, not what particular path they're included from.
I think you're worrying about the wrong problem.
Depending on the operating system you're using, there's probably a slight overhead for using many directories rather than one - the OS needs to check permissions etc. However, on modern hardware, you'd be hard pushed to measure the impact, and caching at the OS level almost certainly wipes out any noticeable impact.
The structure you show in your question shows a considered approach to putting files in a logical place - almost certainly, that's going to be better than bunching them all in the same directory from a maintainability point of view.
On the other hand, there's definitely some performance impact with include() and its friends.
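If you want to see it for yourself, a rough benchmark along these lines (assuming the two files from the question exist and hold the same content) should show the path depth disappearing into the noise:

<?php
// A rough benchmark sketch: time repeated reads of a shallow vs. a deep path.
$paths = [
    'layout/slogan.txt',
    'layout/guest/pages/home/data/slogan.txt',
];

foreach ($paths as $path) {
    $start = microtime(true);
    for ($i = 0; $i < 10000; $i++) {
        file_get_contents($path);
    }
    printf("%-45s %.4f s\n", $path, microtime(true) - $start);
}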
To give some context:
I had a discussion with a colleague recently about the use of Autoloaders in PHP. I was arguing in favour of them, him against.
My point of view is that Autoloaders can help you minimise manual source dependency which in turn can help you reduce the amount of memory consumed when including lots of large files that you may not need.
His response was that including files that you do not need is not a big problem because after a file has been included once it is kept in memory by the Apache child process and this portion of memory will be available for subsequent requests. He argues that you should not be concerned about the amount of included files because soon enough they will all be loaded into memory and used on-demand from memory. Therefore memory is less of an issue and the overhead of trying to find the file you need on the filesystem is much more of a concern.
He's a smart guy and tends to know what he's talking about. However, I always thought that the memory used by Apache and PHP was specific to that particular request being handled.
Each request is allowed up to the amount of memory set by the memory_limit PHP option, and any source compilation and processing is only valid for the life of that request.
Even with op-code caches such as APC, I thought that the individual request still needs to load each file into its own portion of memory and that APC is just a shortcut to having it pre-compiled for the responding process.
I've been searching for some documentation on this but haven't managed to find anything so far. I would really appreciate it if someone can point me to any useful documentation on this topic.
UPDATE:
Just to clarify, the autoloader discussion part was more of a context :).
It may not have been clear but my main question is about whether Apache will pool together its resources to respond to multiple requests (especially memory used by included files), or whether each request will need to retrieve the code required to satisfy the execution path in isolation from other requests handled from the same process.
e.g.:
Files 1, 2, 3 and 4 are an equal size of 100KB each.
Request A includes file 1, 2 and 3.
Request B includes file 1, 2, 3 and 4.
In his mind he's thinking that Request A will consume 300KB for the entirety of its execution and Request B will only consume a further 100KB, because files 1, 2 and 3 are already in memory.
In my mind it's 300KB and 400KB because they are both being processed independently (if by the same process).
This brings him back to his argument that "just include the lot 'cos you'll use it anyway" as opposed to my "only include what you need to keep the request size down".
This is fairly fundamental to how I approach building a PHP website, so I would be keen to know if I'm off the mark here.
I've also always been of the belief that for a large-scale website, memory is the most precious resource, and more of a concern than an autoloader's file-system checks, which are probably cached by the kernel anyway.
You're right though, it's time to benchmark!
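A benchmark could be as simple as logging per-process memory after the includes and comparing requests that land on the same Apache child; a rough sketch, using the hypothetical files from the example above:

<?php
// A sketch for the benchmark: log per-request memory and the process id,
// then compare requests that hit the same Apache child.
// file1.php .. file4.php are the hypothetical ~100KB includes from the example.
require_once 'file1.php';
require_once 'file2.php';
require_once 'file3.php';
// Request B would also do: require_once 'file4.php';

error_log(sprintf(
    'pid=%d peak_memory=%d bytes',
    getmypid(),
    memory_get_peak_usage(true)
));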
Here's how you win arguments: run a realistic benchmark, and be on the right side of the numbers.
I've had this same discussion, so I tried an experiment. Using APC, I tried a Kohana app with a single monolithic include (containing all of Kohana) as well as with the standard autoloader. The final result was that the single include was faster at a statistically irrelevant rate (less than 1%) but used slightly more memory (according to PHP's memory functions). Running the test without APC (or XCache, etc) is pointless, so I didn't bother.
So my conclusion was to continue using autoloading because it's much simpler. Try the same thing with your app and show your friend the results.
Now you don't need to guess.
Disclaimer: I wasn't using Apache. I cannot emphasize enough to run your own benchmarks on your own hardware on your own app. Don't trust that my experience will be yours.
You are the wiser ninja, grasshopper.
Autoloaders don't load a class file until the class is requested. This means they will use at most the same amount of memory as manual includes, but usually much less.
Classes get read fresh from the file on each request, even though an Apache child can handle multiple requests, so your friend's 'eventually they're all read into memory' argument doesn't hold water.
You can prove this by putting an echo 'foo'; above the class definition in the class file. You'll see that on each new request the line is executed, regardless of whether you autoload or manually include the whole world of class files at start-up.
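Concretely, something like this (the file and class names are made up):

<?php
// user.class.php - a made-up class file used to demonstrate the point.
echo 'foo'; // if this prints on every request, the file was re-read and re-executed

class User
{
    // ...
}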
I couldn't find any good, concise documentation on this (I may write some, with memory-usage examples), as I have also had to explain this to others and show evidence to get it to sink in. I think the folks at Zend didn't expect anyone to miss the benefits of autoloading.
Yes, APC and the like (as with all caching solutions) can overcome the resource negatives and even eke out small performance gains, but you eat up lots of unneeded memory if you do this with a non-trivial number of libraries while serving a large number of clients. Try loading a healthy chunk of the PEAR libraries in a massive include file while handling 500 connections hitting your page at the same time.
Even when using things like APC, you benefit from autoloaders with non-namespaced classes (most existing PHP code today), as they can help avoid global namespace pollution when dealing with large numbers of class libraries.
This is my opinion.
I think autoloaders are a very bad idea, for the following reasons:
I like to know what and where my scripts are grabbing the data/code from. Makes debugging easier.
There are also configuration problems: if one of your developers changes a file (an upgrade, etc.) or the configuration and things stop working, it is harder to find out where it broke.
I also think that it is lazy programming.
As for memory/performance issues, it is just as cheap to buy more memory for the machine if it is struggling.
I am now writing a PHP framework, and I am wondering whether it will slow down when PHP has to require/include or require_once/include_once too many files during a request.
Well of course it will. Doing anything too many times will cause a slow down.
On a more serious note, though, IO operations that touch disk are very slow compared to anything that happens in memory. So, oftentimes, including files will be a major performance factor when using a large framework (just look at Zend Framework...).
However, there are typically ways to alleviate this, such as APC and similar opcode caches.
Sometimes programming approaches are taken as well. For example, if I remember correctly, Doctrine 1 can bundle everything into one giant file so as to make fewer IO calls.
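A crude sketch of that bundling idea (the lib/ directory and bundle.php name are assumptions, and it only works if the library files omit closing ?> tags and have no side effects on inclusion):

<?php
// A crude "compile" step: concatenate library files into one bundle to cut IO.
$bundle = "<?php\n";
foreach (glob(__DIR__ . '/lib/*.php') as $file) {
    // Drop each file's opening tag so the concatenation stays valid PHP.
    $bundle .= preg_replace('/^<\?php\s*/', '', file_get_contents($file)) . "\n";
}
file_put_contents(__DIR__ . '/bundle.php', $bundle);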
If in doubt, do some in-depth profiling of an application written with your framework and see whether include/require/etc. are among the major slow points.
Yes, this will slow your application down. *_once calls are generally more expensive, since it must be checked whether that file has already been included. With a lot of includes come a lot of hard-disk accesses and a lot of memory usage. I've developed applications with the Zend Framework that include a total of 150 to 200 files on each request - you really can see the impact that has on overall performance.
The more files you include, the more load you add. However, if you have to choose between require and require_once, note that require_once/include_once cost a little more, because a check has to be done to see whether the same file has already been included. So if you can avoid that, you can at least gain a little performance.
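If you want a feel for how small that check is, a sketch like this measures it (helpers.php is a hypothetical file with no side effects on inclusion):

<?php
// A rough sketch: measure the cost of redundant require_once calls.
require_once 'helpers.php';

$start = microtime(true);
for ($i = 0; $i < 100000; $i++) {
    require_once 'helpers.php'; // only the "already included?" check runs
}
printf("100,000 redundant require_once calls: %.4f s\n", microtime(true) - $start);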
Unless you use caching, every time a request comes in those files will be included again and again, and surely that slows things down. Create a framework that only includes what needs to be included.
I know you can minify PHP, but I'm wondering if there is any point. PHP is an interpreted language so will run a little slower than a compiled language. My question is: would clients see a visible speed improvement in page loads and such if I were to minify my PHP?
Also, is there a way to compile PHP or something similar?
PHP is compiled into bytecode, which is then interpreted on top of something resembling a VM. Many other scripting languages follow the same general process, including Perl and Ruby. It's not really a traditional interpreted language like, say, BASIC.
There would be no effective speed increase if you attempted to "minify" the source. You would get a major increase by using a bytecode cache like APC.
Facebook introduced a compiler named HipHop that transforms PHP source into C++ code. Rasmus Lerdorf, one of the big PHP guys did a presentation for Digg earlier this year that covers the performance improvements given by HipHop. In short, it's not too much faster than optimizing code and using a bytecode cache. HipHop is overkill for the majority of users.
Facebook also recently unveiled HHVM, a new virtual machine based on their work making HipHop. It's still rather new and it's not clear if it will provide a major performance boost to the general public.
Just to make sure it's stated expressly, please read that presentation in full. It points out numerous ways to benchmark and profile code and identify bottlenecks using tools like xdebug and xhprof, also from Facebook.
2021 Update
HHVM diverged from vanilla PHP a couple of versions ago. PHP 7 and 8 bring a whole bunch of amazing performance improvements that have pretty much closed the gap. You now no longer need to do weird things to get better performance out of PHP!
Minifying PHP source code continues to be useless for performance reasons.
Forgo the idea of minifying PHP in favor of using an opcode cache, like PHP Accelerator, or APC.
Or something else like memcached
Yes, there is one (non-technical) point.
Your hosting provider can read your code on their server. If you minify and uglify it, it is more difficult for snoopers to steal your ideas.
So one reason for minifying and uglifying PHP may be protection against snooping. I think uglifying code should be one step in an automated deployment.
With some rewriting (shorter variable names) you could save a few bytes of memory, but that's also seldom significant.
However, I do design some of my applications in a way that allows the include scripts to be concatenated together. With php -w the result can be compacted significantly, adding a small speed gain at script start-up. On a server with an opcode cache, however, this only saves a few file mtime checks.
This is less an answer than an advertisement. I've been working on a PHP extension that translates Zend opcodes to run on a VM with static typing. It doesn't accelerate arbitrary PHP code, but it does allow you to write code that runs way faster than regular PHP allows. The key here is static typing. On a modern CPU, a dynamic language eats branch misprediction penalties left and right. The fact that PHP arrays are hash tables also imposes a high cost: lots of branch mispredictions, inefficient use of cache, poor memory prefetching, and no SIMD optimization whatsoever. Branch mispredictions and cache misses in particular are the Achilles' heel of today's processors. My little VM sidesteps those problems by using static types and C arrays instead of hash tables. The result ends up running roughly ten times faster. This is using bytecode interpretation. The extension can optionally compile a function through gcc; in that case, you get two to five times more speed.
Here's the link for anyone interested:
https://github.com/chung-leong/qb/wiki
Again, the extension is not a general PHP accelerator. You have to write code specific for it.
There are PHP compilers... see this previous question for a list; but (unless you're the size of Facebook or are targeting your application to run client-side) they're generally a lot more trouble than they're worth.
Simple opcode caching will give you more benefit for the effort involved. Or profile your code to identify the bottlenecks, and then optimise it.
You don't need to minify PHP.
To get better performance, install an opcode cache; but the ideal solution would be to upgrade your PHP to version 5.5 or above, because the newer versions ship with an opcode cache (Zend OPcache) by default, and it performs better than the others: http://massivescale.blogspot.com/2013/06/php-55-zend-optimiser-opcache-vs-xcache.html
The "point" is to make the file smaller, because smaller files load faster than bigger files. Also, removing whitespace will make parsing a tiny bit faster since those characters don't need to be parsed out.
Will it be noticeable? Almost never, unless the file is huge and there's a big difference in size.