One 'compressed' file of classes vs multiple class files in PHP

Is there any gain to combining all of the classes for a project into one massive 'compressed' file (spaces and comments removed) vs loading each class as it is required? The only one I can see is that all of the classes are already there so you don't have to require/include them, but isn't that what we have __autoload for?
It seems to me that loading each class as needed would be a lot less strenuous on php.

Generally, in the vast majority of cases, for any non-trivial number of classes, you want some kind of on-demand loading (autoloading). The overhead of the autoloader (via whatever method) will be dwarfed by the resources needed to parse a bunch of classes that won't get used in a particular request.
A lot has been written on the subject of how to efficiently manage loading classes as needed.
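To make "loading classes as needed" concrete, here is a minimal sketch using spl_autoload_register (used here in place of __autoload, which was deprecated in PHP 7.2 and removed in 8.0). The directory layout, the Greeter class and the one-class-per-file naming are assumptions made up for the demo; it writes a throwaway class file to the temp directory so it can run on its own (PHP 7.1+).

```php
<?php
// Hypothetical layout: one class per file under $baseDir, file name = class name.
// For a runnable demo we create a throwaway directory holding a single class file.
$baseDir = sys_get_temp_dir() . '/autoload_demo_' . uniqid();
mkdir($baseDir);
file_put_contents(
    $baseDir . '/Greeter.php',
    '<?php class Greeter { public function hello(): string { return "hello"; } }'
);

spl_autoload_register(function (string $class) use ($baseDir): void {
    // Map namespace separators to directory separators.
    $file = $baseDir . '/' . str_replace('\\', '/', $class) . '.php';
    if (is_file($file)) {
        require $file;
    }
});

// Passing false tells class_exists() not to trigger autoloading.
$loadedBefore = class_exists('Greeter', false);

$greeter = new Greeter();                      // first use triggers the autoloader
$loadedAfter = class_exists('Greeter', false);

var_dump($loadedBefore, $loadedAfter, $greeter->hello());
```

The class file is only read and parsed at the point where Greeter is first used; a request that never touches it never pays for it.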

If you use a bytecode cache like APC, there's likely no performance gain to be had from "minifying" your PHP (removing whitespace and comments). The only benefit would be obfuscation -- which isn't going to buy you much anyway.
And yes -- loading 30 classes when you only need 1 is going to be a waste of resources.

Related

Keep functions in the same helper file or separate files?

Should I try to keep all helper functions in the same file (say, all in functions.php) at the cost of reading in unnecessary functions, or store functions in separate files where I'll only include files with functions I need, at the cost of the overhead for including more files? How big of an overhead is there to include separate helper files? I know for images like icons it's faster to join icons together in 1 image, but I'm not sure if the same applies here.
This completely depends on the project you're working on - sometimes the one approach may be faster, sometimes the other. But in general, just avoid making lots of global helper functions and put them in appropriate classes instead, as static helpers if need be. Then read up on autoloading and watch as PHP's magic solves the problem all in one go for you - loading and parsing the files automatically as you need them.
If you also use PHP 5.5+ (or an older version with an opcache-like extension) the code will even be precompiled, lowering overhead even further.
Generally speaking some more - once you're starting to worry about the parsing overhead of your code you're usually guilty of premature optimization. In a world where nearly all webservers have quad core hyperthreaded multigigahertz processors and are backed by RAID SATA storage, loading an extra file isn't going to be a realistic problem. Find the real bottlenecks when all the code is done, and spend your optimization time there.

"performance impact" when using a 20K lines single class

This question was asked here before, but none of the answers really tried to answer the actual question asked, so I'm asking it in a different way. Is loading a single class of 20,000 lines with 100s of functions more resource intensive in any way than breaking up the code to smaller classes with fewer functions each and loading these smaller classes as needed?
The larger a script or class is, the more memory this uses per instance. Out of the box, PHP doesn't have a way to share the memory space of libraries and classes, so creating massive scripts for a website is not a great idea.
The typical approach should be to break classes down into blocks such that you need only include per script what you actually need to run that script.
Also, it's unlikely to cause you performance problems unless you've got a huge amount of traffic - and then you could probably fix your problems easier than refactoring classes.
When a script is loaded, it requires a fixed amount of memory to parse it. The larger it is, the more memory it requires. Next, the script itself is executed, running any top-level code (not in a class or global function). If that includes any require/include statements, those scripts are loaded (if necessary). If it creates objects, that takes more memory.
However, the size of each instance of the class is affected only by the data it stores. This correction aside, the advice here is spot on: split your classes based on responsibilities. The reason for this has more to do with ease of development than with performance. Say you have one monster class filled with static methods. If your application uses most of those methods for each request, splitting it will have no performance benefit because both scripts will end up being loaded anyway. But if you can group the methods into logical subsystems, they will be easier to understand and work with.
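A rough way to check the "instance size depends on data, not methods" point yourself. This is only a sketch: the Small and Large classes are made up, and memory_get_usage() is a coarse measure, so expect the two numbers to be close rather than treating them as exact.

```php
<?php
// Two hypothetical classes holding the same data; Large adds extra methods.
class Small
{
    public $id = 0;
    public $name = '';
}

class Large
{
    public $id = 0;
    public $name = '';
    public function a() { return 1; }
    public function b() { return 2; }
    public function c() { return 3; }
    public function d() { return 4; }
}

// Bytes of PHP-managed memory taken by $count live instances of $class
// (plus the array that holds them, which is the same for both classes).
function bytesForInstances(string $class, int $count): int
{
    $objects = [];
    $before = memory_get_usage();
    for ($i = 0; $i < $count; $i++) {
        $objects[] = new $class();
    }
    return memory_get_usage() - $before;
}

$small = bytesForInstances('Small', 10000);
$large = bytesForInstances('Large', 10000);
printf("Small: %d bytes, Large: %d bytes\n", $small, $large);
```

The methods are stored once with the class definition, so adding more of them raises the one-off cost of loading the class, not the cost of each object.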
One big class requires a single pass to compile into op-codes.
Many smaller classes use less memory each but require more compilation passes, and the memory used for compilation accumulates.
It really depends on how many classes/files are included at run time.
So the solution is to break the code into multiple classes and use APC or an equivalent.
PS: with an op-code cache, the memory consumption for the big file is much smaller, because PHP does not need to re-compile the source into op-codes (in case you are reluctant to break the big class into smaller ones).

Is it possible to have too many functions in a PHP application?

Can a PHP application have too many functions? Does the execution of a large number of PHP functions hog memory and resources? My WordPress theme in-development has a lot of functions (probably more than 100 by the time I'm done) and I'm concerned that I might have too many.
Even if many functions led to more memory consumption, I'd advise you to use them anyway. A clear structure is much more important than performance.
Performance can be improved by throwing hardware at it; a badly structured application quickly gets very hard to maintain.
By the way: 100 functions are nothing!
100 functions is nothing to worry about. If your code is slow you should profile it to explicitly find the slow parts.
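If you don't have a real profiler at hand, a crude timing helper is enough for a first look. This is a sketch, not a substitute for profiling: averageMilliseconds is a made-up helper, the md5 workload is just a placeholder for whatever you suspect is slow, and hrtime() needs PHP 7.3+.

```php
<?php
// Average wall-clock time of one call to $fn, in milliseconds.
function averageMilliseconds(callable $fn, int $runs = 1000): float
{
    $start = hrtime(true);               // monotonic clock, in nanoseconds
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return (hrtime(true) - $start) / $runs / 1e6;
}

// Placeholder workload; swap in the code you actually want to measure.
$ms = averageMilliseconds(function () {
    return md5(str_repeat('x', 1000));
});

printf("%.4f ms per call\n", $ms);
```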
Remember, premature optimizations are evil
No, it can't. Actually, it is good to have many functions, because that means that you've divided your code into more manageable pieces. A good rule is to keep your functions small, have them do just one thing. Oh, and remember to name your functions so that the function name describes well what the function does.
Having lots of methods will add a small overhead to the execution time, but it is insignificant compared with the benefits you get from having structured code, so I wouldn't worry about it.
I wouldn't be concerned about 100 functions. That's very few functions.
Global functions and methods are compiled, stored in memory, and indexed in a hash table (save for cached lookups implemented recently). You won't have performance deterioration when calling the functions as the number of functions grows, since accessing the hash table is done, on average, in constant time.
However, there will be more overhead parsing the scripts and, if you actually call all those functions, compiling them. This is not really a problem if you use an opcode cache. There will also be more memory overhead, but typically memory is not a problem on enterprise-grade web servers, where it's more appropriate to try to serve requests as fast as possible without caring that much about memory.
There are also stylistic issues. If you have too many global functions, consider whether:
You are duplicating code between those functions. Consider refactoring, moving the common code to another function and generalizing the behavior of the functions by adding parameters, where appropriate.
You would be better off grouping functions that operate on the same data in a class.
In the end, worry about this only if you profile your application and find function calls to be a CPU bottleneck and function definitions to be a memory bottleneck.
Just make your code do its job. Don't care about your function count.
PHP has over 3000 functions in its core, so don't worry about adding too much.
I think yes, a project can have too many functions. I don't know how you have set up your project, but it sounds like you have function libraries. You may want to consider object-oriented development in the future. This allows you to encapsulate functionality into objects. The object then contains the functions and they are not visible to other objects (unless you want them to be). This helps to keep the API pollution down a lot.
As for your memory concerns - I used to worry about this too. DON'T. Good programming practices are more important than speed. You will not even notice the difference on a web server anyway. Always favor maintainability over speed!

Is object-oriented PHP slow?

I used to use procedural-style PHP. Later, I used to create some classes. Later, I learned Zend Framework and started to program in OOP style. Now my programs are based on my own framework (with elements of cms, but without any design in framework), which is built on the top of the Zend Framework.
Now it consists of lots of classes. But the more I program, the more I'm afraid. I'm afraid that my program will be slow because of them, and I'm afraid to add each new class that could help me develop but might slow the application down.
All I know is that including lots of files slows application (using eAccelerator + gathering all the code in one file can speed up application 20 times!), but I have no idea if creating new classes and objects slows PHP by itself.
Does anyone have any information about it?
This bugs me. See...procedural code is not always spaghetti code, yet the OOP fanboys always presume that it is. I've written several procedural based web apps as well as an IRC services daemon in PHP. Amazingly, it seems to outperform most of the other ones that are out there and editing it is super easy. One of my friends who generally does OOP took a look at it and said "no code has the right to be this clean"
Conversely, I wrote my own PHP framework (out of boredom) and it was done in a purely OOP manner.
A good programmer can write great procedural code without the overhead classes bring. A bad programmer who uses OOP will always write crappy OOP code that slows things down.
There is no one right answer to which is better for PHP, but rather which is better for the exact scenario.
Here's a good article discussing the issue. I have also seen some anecdotal benchmarks that put OOP PHP overhead at 10-15%.
Personally I think OOP is the better choice, since in the end it may perform better just because it was probably better designed and thought through. Procedural code tends to be messy and hard to maintain. So in the end it comes down to how critical the performance difference is for your app vs. the ability to maintain, extend and simply comprehend the code.
The most important thing to remember is, design first, optimize later. A better design, which is more maintainable, is better than spaghetti code. Otherwise, you might as well write your web app in assembler. After you're done, you can profile (instead of guess), and optimize what seems slowest.
Yes, every include makes your program slower, but there is more to it than that.
If you decompose your program over many files, there is a break-even point between including/parsing/executing the least amount of code and the overhead of including all those files.
Furthermore, having lots of files with little code ain't so bad because, as you said, using things like eAccelerator or APC is a trivial way to get a crap ton of performance back. At the same time you get, if you believe in them, all the wonderful benefits of having an object-oriented code base.
Also, slow on a per request basis != not scalable.
Updated
As requested: PHP is still faster at straight-up array manipulation than it is with classes. I vaguely remember someone on the Doctrine ORM project comparing hydration of arrays versus objects, and the arrays came out faster. It's not an order of magnitude, but it is noticeable (the write-up is in French, but the code and results are completely understandable). Just note that Doctrine uses the magic methods __get and __set a lot, and these are also slower than explicit variable access, so part of Doctrine's object hydration slowness could be attributed to that; I would treat it as a worst-case scenario. Lastly, even if you're using arrays, if you have to do a lot of moving around in memory, or tons of tests such as isset, or calls to functions like in_array (which is O(N)), you'll lose the performance benefits. Also remember that objects are just arrays underneath; the interpreter just treats them as a special case. Personally, I would favour better code over a small performance increase; you'll get more benefit from smarter algorithms.
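To illustrate the "smarter algorithms" point with the in_array case mentioned above: a membership test done many times can usually be turned from a linear scan into a hash lookup by flipping the array once. The $ids data here is invented for the example.

```php
<?php
// Hypothetical data: a plain list of 50,000 integer IDs.
$ids = range(1, 50000);
$needle = 49999;

// Linear scan: in_array() walks the list until it finds a match.
$foundByScan = in_array($needle, $ids, true);

// One-off cost: turn the values into keys, then each test is a hash lookup.
$index = array_flip($ids);
$foundByKey = isset($index[$needle]);

var_dump($foundByScan, $foundByKey);
```

The flip itself is O(N), so this only pays off when the same list is searched repeatedly.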
If your project contains many files, then because of the way PHP checks and restricts file access I'd recommend turning on the realpath cache, bumping its configuration settings up to reasonable numbers, and turning off open_basedir and safe_mode. Make sure to use PHP-FPM or SuExec to run the PHP process under a user ID that is restricted to the document root, to get back the security one usually gains from open_basedir and/or safe_mode.
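Before changing anything, it may help to look at what the realpath cache is currently doing. A small sketch using the built-in realpath_cache_size() and realpath_cache_get() functions; the settings themselves (realpath_cache_size, realpath_cache_ttl) have to be changed in php.ini, not from a script.

```php
<?php
// Configured limits, as strings straight from the ini settings (e.g. "4096K").
$configuredSize = ini_get('realpath_cache_size');
$configuredTtl  = ini_get('realpath_cache_ttl');

// What the current process is actually using.
$usedBytes = realpath_cache_size();          // bytes currently occupied
$entries   = count(realpath_cache_get());    // number of cached paths

printf(
    "limit=%s ttl=%ss used=%d bytes entries=%d\n",
    $configuredSize,
    $configuredTtl,
    $usedBytes,
    $entries
);
```

If the used size sits at the configured limit on a busy page, the cache is probably too small for the number of files being touched.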
Here are a few pointers why this is a performance gain:
https://bugs.php.net/bug.php?id=46965
http://nirlevy.blogspot.de/2009/01/slow-lstat-slow-php-slow-drupal.html
Also consider my comment on the answer from @Ólafur:
I found auto-loading in particular to be the biggest slowdown. PHP is extremely slow at directory lookups and file open access, and the more PHP functions you use in a custom auto-loader, the bigger the slowdown. You can help it a bit by turning off safe_mode (deprecated anyway) or even open_basedir (but I would not do that), but the biggest improvement comes from not using auto-loading and simply using "require_once" with complete filesystem paths to require all dependencies of each PHP file you use.
Using large frameworks for web apps that do not actually need such a large number of classes for everything is probably the worst problem, and one that many are not aware of. Strip the framework down so that at least it does not include every bit of code; keep just what you need and throw out the rest.
If you're using include_once() then you are causing an unnecessary slowdown, regardless of OOP design or not.
OOP will add an overhead to your code but I will bet that you will never notice it.
You may want to rethink your class structure and how you implement it. If you find that OOP is slower, you may have to redesign your classes and the way you implement them. A class is just a template for an object, so any badly designed method affects all the objects of that class.
Use inheritance and polymorphism as much as you can; this will effectively reduce the number of behaviors and independent methods your classes need. But first of all you need to create a good inheritance map, abstracting your base (parent) classes as much as you can.
The problem is not how many classes you have; the problem is how many methods, properties or fields they have and how well those methods are structured. Inheritance dramatically reduces the number of methods to design, and the amount of code to be compiled too.
As several other people have pointed out, there is a mild overhead to OO PHP, but you can offset it by focusing your optimization effort on the core classes that your various other classes derive from. This is why C++ is becoming increasingly popular in the world of high-performance computing, traditionally the realm of C and Fortran.
Personally, I've never seen a PHP server that was CPU-constrained. Check your RAM use (you can optimize the core classes for this as well) and make sure you're not making unnecessary database calls, which are orders of magnitude more expensive than any extra CPU work you're doing.
If you design a huge OOP object hog that does everything, rather than decomposing the functionality into various classes, you will obviously fill up memory with useless ballast code. Also, with a slow framework you will not make even a simple Hello World fast. I've noticed a trend (a bad habit) where, for one single Facebook icon, people include the whole Font Awesome library, and then there is a search icon with Fontello included on top. Each time they need to accomplish something unusual, they pull in an entire framework. If you want to create a fast-loading OOP app, use one framework only, like zephir-phalcon or whatever you fancy, and stick to it.
There are ways to limit the penalty from the include_once entries, and that's by having functions declared in the 'include_once' file that themselves have their code content in an 'include' statement. This will load your library of code, but only those functions actually being used will load code as it is needed. You take a second file system hit for the included code, but memory usages drop to practically nothing for the library itself, and only the code used by your program gets loaded. The hit from the second file system access can be mitigated by caching. When dealing with a large project of procedural based PHP, this provides low memory usage and fast processing. DO NOT do this with classes. This would be for a production instance, a development server will show all the penalty of hits since you don't want caching turned on.
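A minimal sketch of the stub-plus-body idea described above, as I understand it. The sum_prices function, the BODY_DIR constant and the file layout are all invented for the demo, and the body file is written to the temp directory only so the example runs on its own.

```php
<?php
// Hypothetical setup: function bodies live in their own small files.
define('BODY_DIR', sys_get_temp_dir() . '/lazy_bodies_' . uniqid());
mkdir(BODY_DIR);
file_put_contents(
    BODY_DIR . '/sum_prices.php',
    '<?php return array_sum($prices);'
);

// The library file only declares thin stubs like this one. The body is
// included on first call, runs in the function's scope (so it sees $prices),
// and its return value becomes the return value of include.
function sum_prices(array $prices)
{
    return include BODY_DIR . '/sum_prices.php';
}

$total = sum_prices([1, 2, 3]);
var_dump($total);
```

Note that the body file is included again on every call, which is where the caching mentioned above matters.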

How important is to not load unused scripts in PHP?

On a site where 90% of the pages use the same libraries, should you just load the libraries all the time or only load them when needed? The other pages would be ajax or simple pages that don't have any real functionality.
Also, should you only load the code when needed? If part way down a page you need a library, should you load it then or just load it at the top? It's possible the script may never get that far because of an error or wrong data. (Loading at the top makes it somewhat easier to understand, but may result in loading extra code that isn't needed.)
I'm also wondering if I should make the libraries more specific so I'm not say loading the code to edit at the same time as viewing?
Basically, how much should I worry about loading code or not loading code?
I would always try to give a file, class, and method a single responsibility. Because of that, separating the displaying from the editing code could be a good idea in either case.
As for loading libraries, I believe that the performance loss of including non required libraries could be quite irrelevant in a lot of cases. However, include, require, include_once, and require_once are relatively slow as they (obviously) access the file system. If the libraries you do not use on each occasion are quite big and usually include a lot of different files themselves, removing unnecessary includes could help reducing the time spent there. Nonetheless, this cost could also be reduced drastically by using an efficient caching system.
Given you are on PHP5 and your libraries are nicely split up into classes, you could leverage PHP's autoloading functionality, which includes required classes as the PHP script needs them. That would pretty effectively keep a lot of unused code from being included.
Finally, if you make any of those changes which could affect your website's performance, run some benchmarks and profile the gain or loss in performance. That way, you do not run into the risk of doing some possibly cool optimization which just costs too much time to fully implement or even degrades performance.
Bear in mind that each script that is loaded gets parsed, as PHP is compiled at run time, so there is a penalty for loading unneeded scripts. This might be minor depending on your application structure and requirements, but there are cases where it is not.
There are two things you can do to negate such concerns:
Use __autoload to load your scripts as they are needed. This removes the need to maintain a long 'require' list of scripts and only loads what's needed for the current run.
Use APC as a byte-code cache to lower the cost of loading scripts. APC caches scripts in their compiled state and will do wonders for your application performance.
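On current PHP versions the bundled OPcache extension (included since PHP 5.5) plays the role APC had as a byte-code cache. A small sketch for checking whether it is actually active for the current process; note that on the command line it is normally off unless opcache.enable_cli is set, so run this through the web server to get a meaningful answer.

```php
<?php
// opcache_get_status() only exists when the OPcache extension is loaded,
// and returns false when the cache is disabled for this process.
$status = function_exists('opcache_get_status')
    ? opcache_get_status(false)   // false: skip the per-script details
    : false;

$enabled = is_array($status) && !empty($status['opcache_enabled']);

echo $enabled
    ? "Opcode cache is active\n"
    : "Opcode cache is not active for this process\n";
```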
+1 Vote for the autoload technique.
The additional benefit of using autoload is it eliminates some of the potential for abusive code. If something fails, pop a back-trace and an "included_files" list and you get a list of places where the problem could come from.
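Something along these lines is what I mean by popping a back-trace and an included-files list. dumpLoadContext is a made-up helper name; get_included_files() and debug_backtrace() are the built-ins doing the work.

```php
<?php
// Collect which files have been loaded so far and how we got here.
function dumpLoadContext(): array
{
    return [
        'files' => get_included_files(),
        'trace' => debug_backtrace(DEBUG_BACKTRACE_IGNORE_ARGS),
    ];
}

$context = dumpLoadContext();

print_r($context['files']);
foreach ($context['trace'] as $frame) {
    printf("%s() called at line %d\n", $frame['function'], $frame['line'] ?? 0);
}
```

With autoloading, the file list only contains what the request really touched, so it is a much shorter list to go through.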
This means you have fewer files to hunt through if somebody hides malicious code at the end of one of them, or designs something fruity.
I once worked on a codebase (not mine) where the presence of certain tokens in the URL caused unexpected behaviour, and because the code was horrible, it was a nightmare tracking down the origin of the problem: buried in one of the 200 included files was code that rewrote the entire request and then called "die".
The question was "how important".
Answer: it is NOT important at all. If you don't have a dozen servers running this app already, then this is probably early optimization, and as we all know, early optimization is the root of all evil.
In other words: don't even worry about it. There are a lot of other things to optimize speed before you should even consider this.
