Can a PHP application have too many functions? Does the execution of a large number of PHP functions hog memory and resources? My WordPress theme in-development has a lot of functions (probably more than 100 by the time I'm done) and I'm concerned that I might have too many.
Even if many functions led to more memory consumption, I'd advise you to use them anyway. A clear structure is much more important than performance.
Performance can be improved by throwing hardware at it, a badly structure application gets quickly very hard to maintain.
By the way: 100 functions are nothing!
100 functions is nothing to worry about. If your code is slow you should profile it to explicitly find the slow parts.
Remember, premature optimizations are evil
No, it can't. Actually, it is good to have many functions, because that means that you've divided your code into more manageable pieces. A good rule is to keep your functions small, have them do just one thing. Oh, and remember to name your functions so that the function name describes well what the function does.
Having a lots of methods will produce a small overhead in the execution time, but it is so insignificant compared with the benefits you get from having structured code, so I wouldn't worry about it.
I wouldn't be concerned about 100 functions. That's very few functions.
Global functions and methods are compiled, stored in memory, and indexed in a hash table (save for cached lookups implemented recently). You won't have performance deterioration when calling the functions as the number of functions grows, since accessing the hash table is done, on average, in constant time.
However, there will be more overhead parsing the scripts and, if you actually call all those functions, compiling them. This is not really a problem if the you use an opcode cache. There will also be more memory overhead, but typically, memory is not a problem in enterprise grade web servers, where it's more appropriate to try to serve the requests as fast as possible, without caring that much about the memory.
There are also stylistic issues. If you have too many global functions, consider whether:
You are duplicating code between those functions. Consider refactoring, moving the common code to other function and generalizing the behavior of the functions by adding parameters, where appropriate.
You would be better grouping functions that operate on the same data in an class.
In the end, worry about this only if you profile your application and find function calls to be a CPU bottleneck and function definitions to be a memory bottleneck.
Just make your code to do it's job. Don't care about your function count.
PHP has over 3000 functions in its core, so don't worry about adding too much.
I think yes a project can have too many functions. I don't know how you have setup your project but it sounds like you have function libraries. You may want to consider Object Oriented Development in the future. This allows you to encapsulate functionality into Objects. Therefore the object contains the functions and they are not visible to other objects (unless you want them to be). This helps to keep the API pollution down a lot.
For you memory concerns - I used to worry about this too, DON'T. Good programming practices are more important that speed. You will not even notice the difference on a Web Server anyway. Always favor maintainability over speed!
Related
I declare 100 functions, but I don't actually call any of them. Will having so many functions defined affect loading time?
Does PHP process these functions before they are called?
Yes, php parses all functions on the run, and checks possible syntax errors , (though it does not execute them all this time) and registers their name as a symbol. When you call any of the functions, php searches for the function in the registered symbol table for function name and then executes that function.
So, better to use functions of your purpose only as it will increase the size of symbol table.
Just to be clear, even having hundreds of unused classes and functions is not going to make much difference to the performance of your program. Some difference, yes maybe, but not much. Improving the code that is being run will make a bigger difference. Don't worry about optimising for language mechanics until you've got your own code perfect. The key to performance optimisation is to tackle the biggest problems first, and the biggest problems are very rarely caused by subtle language quirks.
If you do want to minimise the effect of loading too much code that isn't going to be used, the best way to do this is to use PHP's autoloading mechanism.
This probably means you should also write your code as classes rather than stand-alone functions, but that's a good thing to do anyway.
Using an autoloader means that you can let PHP do the work of loading the code it needs when it needs it. If you don't use a particular class, then it won't be loaded, but on the other hand it will be there when you need it without you having to do an include() or anything like that.
This setup is really powerful and eliminates any worries about having too much code loaded, even if you're using a massive framework library.
Autoloading is too big a topic for me to explain in enough detail in an answer here, but there are plenty of resources on the web to teach it. Alternatively, use an existing one -- pretty much all frameworks have an autoloader system built-in, so if you're using any kind of modern PHP framework, you should be able to use theirs.
I have an existing php website, that is written in an old fashion way.
Every page has the same pattern
<?php
require_once("config.php");
//Some code
require_once("header.php");
?>
Some HTML and PHP code mixture
<?php
require_once("footer.php");
?>
Where all the db connection, session data, language files are initiated at the "config.php" file.
And every DB access is done with a mysql_query call,
No OOP what-so-ever, purely procedural programming.
How would you to optimize this code structure in order to improve performance and make this website robust enough to handle heavy traffic ?
How would you to optimize this code structure in order to improve performance and make this website robust enough to handle heavy traffic ?
The structure you've shown us has very little scope for for optimization. Making it object-oriented will make it slower than it currently is. Note that the code within the included files may benefit greatly from various changes.
There's only 3 lines of code here. So not a lot of scope for tuning. The _once constructs add a (tiny) overhead, as does use of require rather than include - but this is a very small amount compared to what's likely to be happening in the code you've not shown us.
Where all the db connection, session data, language files are initiated at the "config.php" file
There are again marginal savings by delaying access to external resources until they are needed (and surrendering access immediately when they are no longer required) - but this is not an OO vs procedural issue.
Why will OO be slower?
Here the routing of requests is implemented via the the webserver - webservers are really good at this and usually very, very efficient. The alternative approach of using a front controller gives some scope for applying templating in a more manageable way and for applying late patching of the content (although it's arguable if this is a good idea). But using a front-controller pattern is not a requirement for OO.
Potentially, when written as OO code, redundant memory allocations hang around for longer - hence the runtime tends to have a larger memory footprint.
Overriding and decorating adds additional abstraction and processing overhead to the invocation of data transformations.
every DB access is done with a mysql_query call
There's several parts to this point. Yes, the mysql_ extension is deprecated and you should be looking to migrate this as a priority (sorry, but I can't recommend any good forward/backward shims) however it is mostly faster than the mysqlnd engine - the latter has a definite performance advantage with large datasets / high volume due to reduced memory load.
You seem to be convinced that there's something inherently wrong with procedural programming with regard to performance and scale. Nothing could be further from the truth. G-Wan, Apache httpd, Varnish, ATS, the Linux kernel are all written in C - not C++.
If you want to improve performance and scalability, then you're currently looking in the wrong place. And the only way to make significant in-roads is to to profile your code under relevant load levels.
If you really don't want to change your structure (OOP and mysql_*..., and even with these in fact), you should implement a cache system, to avoid the generation of content everytime. If you have some data that doesn't change often (like blog post, news or member profile), you can create a cache of 5 minutes on it, to lighten the SQL use. Google might help you for this.
There's also a lot of web techniques to optimize pages loading: use a CDN to load your static ressources, Varnish cache...
With PHP itself, there's also some methods, but a lot of blog post exists about that, just look here for example :) For example:
Avoid regex if possible
Initialize variable if you need it: it's 10 times slower to increment/decrement an non-initialized variable (which is ugly btw)
Don't call functions in for declaration, make temp variable instead
etc.
Don't hesitate to benchmark, make some tests with JMeter to simulate a pool of connections and see which page is slow and what you should optimize first.
This question was asked here before, but none of the answers really tried to answer the actual question asked, so I'm asking it in a different way. Is loading a single class of 20,000 lines with 100s of functions more resource intensive in any way than breaking up the code to smaller classes with fewer functions each and loading these smaller classes as needed?
The larger a script or class is, the more memory this uses per instance. Out of the box, PHP doesn't have a way to share the memory space of libraries and classes, so creating massive scripts for a website is not a great idea.
The typical approach should be to break classes down into blocks such that you need only include per script what you actually need to run that script.
Also, it's unlikely to cause you performance problems unless you've got a huge amount of traffic - and then you could probably fix your problems easier than refactoring classes.
When a script is loaded, it requires a fixed amount of memory to parse it. The larger it is, the more memory it requires. Next, the script itself is executed, running any top-level code (not in a
class or global function). If that includes any require/include statements, those scripts are loaded (if necessary). If it creates objects, that takes more memory.
However, the size of each instance of the class is affected only by the data it stores. This correction aside, the advice here is spot on: split your classes based on responsibilities. The reason for this has also to do with ease of development than performance. Say you have one monster class filled with static methods. If your application uses most of those methods for each request, splitting it will have no performance benefit because both scripts will end up being loaded anyway. But if you can group the methods into logical subsystems, they will be easier to understand and work with.
One big class require single cycle to compile into binary code (op-code).
Many smaller classes using lesser memory but require more compilation, and the memory use for compilation will be accumulated.
Is really depend how many classes/files has been included in run-time.
So, solution for this, break into multiple classes and use APC or equivalent.
PS: the memory consumption for the big file is much smaller, because PHP do not need to re-compile the source into op-code (if you reluctant to break the big class into smaller)
I have a custom-built MVC PHP framework that I am in the process of rewriting and had a question about performance and magic methods. With the model portion of the framework, I was thinking if __get/__set magic methods would cause too much performance hit to be worth using. I mean accessing (reads and writes) model data is going to be one of the most common things performed. Is the use of __get/__set magic methods too big of a performance hit for heavy use functionality like the model portion of a MVC framework?
Measure it.
It certainly has a big performance hit, especially considering function calls are expensive in PHP. This difference will be even bigger in the next version of PHP, which implements optimizations that render regular access to declared instance properties significantly faster.
That said, PHP rarely is the bottleneck. I/O and database queries frequently take much more time. This, however, depends on your usage; the only way to know for sure it to benchmark.
That are also other readability problems with the magic methods. They frequently degenerate into one big switch statement and compromise code completion, both of which may be hits in programming productivity.
Here's a an article (three years old) with some benchmarks. Run your own tests to see how it impacts your code.
Generally speaking, they are much slower. But are they the bottleneck? It depends what you are doing. If you're concerned about speed, where does it stop? The foreach loop is slower than the for loop, yet most people don't rewrite all of their code to use the for.
Just using PHP means that the speed of your code isn't all that critical. Personally, I would favor whatever style makes it easier to program.
Purely from my experience, it does add a quite a lot over overhead. On a page with about 4000 __get's (yes, there was quite a lot of data on that page) performance was quite measurably slower, to the point it became unacceptable. I threw away all __set's & __get's for variables which wouldn't require other data to be altered or external dependancies (like foreign keys) to be checked, after which the time to generate that page was about 15% of the time it took before.
I have just asked myself this same question and came to the same conclusion: It's better to set the properties in the traditional way. __get() + massive switch slows everything down
I used to use procedural-style PHP. Later, I used to create some classes. Later, I learned Zend Framework and started to program in OOP style. Now my programs are based on my own framework (with elements of cms, but without any design in framework), which is built on the top of the Zend Framework.
Now it consists of lots classes. But the more I program, more I'm afraid. I'm afraid that my program will be slow because of them I'm afraid to add every another one class which can help me to develop but can slow the application.
All I know is that including lots of files slows application (using eAccelerator + gathering all the code in one file can speed up application 20 times!), but I have no idea if creating new classes and objects slows PHP by itself.
Does anyone have any information about it?
This bugs me. See...procedural code is not always spaghetti code, yet the OOP fanboys always presume that it is. I've written several procedural based web apps as well as an IRC services daemon in PHP. Amazingly, it seems to outperform most of the other ones that are out there and editing it is super easy. One of my friends who generally does OOP took a look at it and said "no code has the right to be this clean"
Conversely, I wrote my own PHP framework (out of boredom) and it was done in a purely OOP manner.
A good programmer can write great procedural code without the overhead classes bring. A bad programmer who uses OOP will always write crappy OOP code that slows things down.
There is no one right answer to which is better for PHP, but rather which is better for the exact scenario.
Here's good article discussing the issue. I also have seen some anecdotal bench-marks that will put OOP PHP overhead at 10-15%
Personally I think OOP is better choice since at the end it may perform better just because it probably was better designed and thought through. Procedural code tends to be messy and hard to maintain. So at the end - it has to be how critical is performance difference for your app vs. ability to maintain, extend and simply comprehend
The most important thing to remember is, design first, optimize later. A better design, which is more maintainable, is better than spaghetti code. Otherwise, you might as well write your web app in assembler. After you're done, you can profile (instead of guess), and optimize what seems slowest.
Yes, every include makes your program slower, but there is more to it than that.
If you decompose your program, over many files, there is a point where you're including/parsing/executing the least amount of code, vs the overhead of including all those files.
Furthermore, having lots of files with little code ain't so bad, because, as you said, using things like eAccelerator, or APC, is a trivial way to get a crap ton of performance back. At the same time you get, if you believe in them, all the wonderful benefits of having and Object Oriented code base.
Also, slow on a per request basis != not scalable.
Updated
As requested, PHP is still faster at straight up array manipulation than it is classes. I vaguely remember the doctrine ORM project, and someone comparing hydration of arrays versus objects, and the arrays came out faster. It's not an order of magnitude, it is noticable, however -- this is in french, but the code and results are completely understandable.. Just a note, that doctrine uses magic methods __get, and __set a lot, and these are also slower than an explicit variable access, part of doctrine's object hydration slowness could be attributed to that, so I would treat it as a worst case scenario. Lastly, even if you're using arrays, if you have to do a lot of moving around in memory, or tonnes of tests, such as isset, or functions like 'in_array' (it's order N), you'll screw the performance benefits. Also remember that objects are just arrays underneath, the interpreter just treats them as a special. I would, personally, favour better code than a small performance increase, you'll get more benefit from having smarter algorithms.
If your project contains many files and due to the nature of PHP's file access checking and restrictions, I'd recommend to turn on realpath_cache, bump up the configuration settings to reasonable numbers, and turn off open_basedir and safe_mode. Ensure to use PHP-FPM or SuExec to run the php process under a user id which is restricted to the document root to get back the security one usually gains from open_basedir and/or safe_mode.
Here are a few pointers why this is a performance gain:
https://bugs.php.net/bug.php?id=46965
http://nirlevy.blogspot.de/2009/01/slow-lstat-slow-php-slow-drupal.html
Also consider my comment on the answer from #Ólafur:
I found especially auto-loading to be the biggest slow down. PHP is extremely slow for directory lookup and file open access, the more PHP function you use during a custom auto-loader, the bigger the slow-down. You can help it a bit with turning off safe-mode (deprecated anyways) or even open-basedir (but I would not do that), but the biggest improvement comes from not using auto-loading and simply use "require_once" with complete fs pathes to require all dependencies per php file you use.
Using large frameworks for web apps that actually do not require so large number of classes for everything is probably the worst problem that many are not aware of. Strip it down at least not to include every bit of code, keep just what you need and throw the rest.
If you're using include_once() then you are causing an unnecessary slowdown, regardless of OOP design or not.
OOP will add an overhead to your code but I will bet that you will never notice it.
You may reconsider to rethink your classes structure and how do you implement them. If you said that OOP is slower you may have to redesign your classes and how do you implement them. A class is just a template of an object, any bad designed method affects all the objects of that class.
Use inheritance and polimorfism the most you can, this will effectively reduce the amount of behaviors and independent methods your classes need, but first off all you need to create a good inheritance map, abstracting your first or mother classes as much as you can.
It is not a problem about how many classes do you have, the problem is how many methods, properties or fields they have and how well are those methods structured. Inheritance reduces the amount of methods to design drammatically and the amount of code to be compiled too.
As several other people have pointed out, there is a mild overhead to OO PHP, but you can offset it by focusing your optimization effort on the core classes that your various other classes derive from. This is why C++ is becoming increasingly popular in the world of high-performance computing, traditionally the realm of C and Fortran.
Personally, I've never seen a PHP server that was CPU-constrained. Check your RAM use (you can optimize the core classes for this as well) and make sure you're not making unnecessary database calls, which are orders of magnitude more expensive than any extra CPU work you're doing.
If you design a huge OOP object hog, that does everything rather than doing functional decomposition to various classes, you will obviously fill up the memory with useless ballast code. Also, with a slow framework you will not make a simply hello World any fast. I noticed it is a kind trend (bad habit) that for one single facebook icon, people include a hole awesome font library and then next there is a search icon with fontello included. Each time they accomplish something unusual, they connect an entire framework. If you want to create a fast loading oop app use one framework only like zephir-phalcon or whatever you fancy and stick to it.
There are ways to limit the penalty from the include_once entries, and that's by having functions declared in the 'include_once' file that themselves have their code content in an 'include' statement. This will load your library of code, but only those functions actually being used will load code as it is needed. You take a second file system hit for the included code, but memory usages drop to practically nothing for the library itself, and only the code used by your program gets loaded. The hit from the second file system access can be mitigated by caching. When dealing with a large project of procedural based PHP, this provides low memory usage and fast processing. DO NOT do this with classes. This would be for a production instance, a development server will show all the penalty of hits since you don't want caching turned on.