I am building a PHP CMS from the ground up. There is one super-core file within my system which I currently have automatically importing all other packages and classes that make up the core of the system. On a typical page, only a few of these classes and methods are used.
Considering the load require_once() puts on a server to include all of these files, and the time a user must wait for the page to load, I am wondering which path I should take:
Keep the super-core as-is and automatically include all of the system core for each page that includes this core file.
Use the super-core to include only essential packages, such as database management, and import additional packages/classes on an as-needed basis.
Could someone please let me know which of the two options is better, along with a brief overview of the pros and cons of each?
Thank you for your time!!!
You're asking which load strategy is best. This is often discussed in relation to autoloaders.
As with any strategy, there are pros and cons. Including all files saves you from forgetting one; then again, an autoloader doesn't forget files either.
However, you don't have to commit to one strategy or the other; if you implement more than one, you can choose as needed. For example, while you develop your CMS, things change often, but once the CMS is installed on a server, that version rarely changes.
So in production, a strategy of combining all core libraries into one file and requiring it on startup can be a benefit, depending on how much load the server has.
For an easy way to build your own system, I suggest an autoloader. If you lay your classes out one per file, each class is loaded automatically the moment you first use it.
Once development has reached a certain point, you actually know which files are core and which are not. You can then load those by default, so the autoloader is no longer triggered for them.
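As a minimal sketch of such an autoloader (the classes/ directory and the Database class are placeholders, not part of the question):

// Minimal autoloader sketch: assumes one class per file under a
// hypothetical classes/ directory, e.g. Database -> classes/Database.php
spl_autoload_register(function ($class) {
    $file = __DIR__ . '/classes/' . $class . '.php';
    if (is_file($file)) {
        require $file;
    }
});

// The class file is only read the first time the class is actually used:
$db = new Database();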
Earlier this year, I came upon this exact problem while developing a framework in PHP.
I weighed the pros and cons, and here's my evaluation:
Option 1 - Front controller pattern: the core script includes all other scripts
Advantages
Inclusion of packages is done within one script; you can see at a glance which files are included and which are not.
Each package is included exactly once, so there is no duplicate-inclusion overhead.
Disadvantages
Consider a case like this:
We have two classes, Rectangle and Shape. Rectangle is a child class, i.e. an extension, of Shape. However, the core script includes the classes alphabetically, so when Rectangle is included, Shape is not yet defined and PHP throws an error.
Rectangle class:
class Rectangle extends Shape{
}
Shape class:
class Shape{
}
More overhead, since everything that is not needed is also loaded into memory.
Option 2 - Load main packages, then load other packages as-needed
Advantages
Files are only included when needed, which reduces overhead in another way.
Solves the problem mentioned in Option 1.
You can concentrate on what each package requires from other packages and simply load those.
Disadvantages
Overhead, as a particular package may be requested multiple times.
Package inclusion is done in every single file.
Code is written for humans. So, to keep things logical and to break the problem down, I chose Option 2 for the framework.
Don't load what you are not going to use. Implement an autoloader or deepen your require_once's.
Even if the performance difference is negligible, fewer file includes will improve your ability to quickly hunt down bugs and follow the flow of your application.
Related
I'm new to PHP and inherited a website project with hundreds of pages, all procedural (when I do a text search of the files, there isn't even a function definition anywhere). I'm coming from the C# and Java worlds. I'm looking for a way to incrementally add OOP. (They want me to update the front end, and I am trying to convince them to fix the backend at the same time, and they don't want to use a framework (dammit).)
While looking into autoloaders... well, here's my understanding: it's a mechanism for registering the folders where classes are stored, so that when you instantiate a class, trait, etc., PHP searches those folders based on the class name/filename/namespace and loads the appropriate definition.
I have a few questions:
Does autoloader search the folder and load the appropriate definitions on every page lifecycle (or does it cache them)?
Pre-loading:
Is there a way to use autoloader, or some alternative, to pre-load ALL class definitions into memory and make them available across all sessions?
If so, when updating class files, how would I tell this mechanism to reload everything to memory when I make changes to class files?
UPDATE TO QUESTIONS:
Thank you both for your answers and it helps a little, but... I do have a bad habit of posing the wrong question(s) on StackOverflow.
The thing I want to avoid is slowing down pages by adding classes. So let's say I add a library and register the paths with the autoloader. A page instantiates a class with multiple dependencies. Let's say that the dependency graph includes 15 files. For each request lifecycle, the server loads the page and 15 other files, just for that one page.
Since I am coming from compiled languages, I feel a little strange not loading these classes into memory. All the classes together should not be over say 5MB.
(Or maybe I should just create a RAM Disk and copy all the files in there on boot and just have a symlink?)
Autoloaders in PHP are lazy. When PHP encounters the use of a class it doesn't know about, it asks the registered autoloader (or chain of autoloaders) to go find it. It's the autoloader's job to figure out where to get the file the class is defined in and include it. Having some sort of convention for naming your classes and organizing your class files is key to having a useful autoloader, and several conventions have arisen in the PHP community, such as PSR-4.
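For illustration, a rough PSR-4-style autoloader might look like the sketch below (the App\ prefix and the src/ directory are assumptions for the example, not something from the question):

// PSR-4-style sketch: maps the hypothetical prefix "App\" to a src/ directory,
// so App\Model\User resolves to src/Model/User.php
spl_autoload_register(function ($class) {
    $prefix  = 'App\\';
    $baseDir = __DIR__ . '/src/';

    if (strncmp($class, $prefix, strlen($prefix)) !== 0) {
        return; // not our namespace; let other autoloaders in the chain try
    }

    $file = $baseDir . str_replace('\\', '/', substr($class, strlen($prefix))) . '.php';
    if (is_file($file)) {
        require $file;
    }
});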
Does autoloader search the folder and load the appropriate definitions on every page lifecycle (or does it cache them)?
The autoloader(s) is(are) called on every request, but only when the need to autoload a class arises.
Pre-loading: Is there a way to use autoloader, or some alternative, to pre-load ALL class definitions into memory and make them available across all sessions?
I don't believe so; and as the number of classes grows, preloading them all becomes more and more wasteful anyway.
Welcome to the wonderful[citation needed] world of legacy PHP. I highly recommend you check out Modernizing Legacy Applications In PHP. It's like a strategy guide for getting from Mordor back to the Shire.
I think you may misunderstand the purpose of autoloading. It is simply a set of instructions on what to do when your code calls for a class that PHP doesn't recognize. That's it. The autoloader just calls require '/path/to/classfile' so that PHP will see the class.
Does autoloader search the folder and load the appropriate definitions
on every page lifecycle (or does it cache them)?
There is no caching across requests, so if you make a change to a file, the next HTTP request will incorporate those changes. It's just as if you changed any other instruction in your script, for example changing echo 1 to echo 2.
Pre-loading: Is there a way to use autoloader, or some alternative, to
pre-load ALL class definitions into memory and make them available
across all sessions?
There is no need for this. A well written autoloader has instructions for where to find any class, so loading all possible classes ahead of time is wasteful. If you're still running into undefined classes errors, you need to either improve the autoloader or place the class files in accordance with the current autoloader instructions.
If you really want to preload all your classes, use the auto_prepend_file setting in php.ini. The docs say
Specifies the name of a file that is automatically parsed before the
main file
Set it to an initialization script. In that script have something like:
// Put all your class files in this folder (the path is an example)
$dir = '/path/to/classes/folder';
$handle = opendir($dir);
// Require all PHP files from the classes folder
while (false !== ($item = readdir($handle))) {
    $path = $dir . '/' . $item;
    if (is_file($path) && pathinfo($path, PATHINFO_EXTENSION) === 'php') {
        require_once $path;
    }
}
closedir($handle);
This is simplified. There is significant risk in blindly including every file from a directory into your script, so I would not do this. You would also need to adjust it if you want to include files in subdirectories.
Basically, don't do this. Just have a good autoloader.
No one posted what I was looking for, but it seems the best route is the OPcache that's built into PHP 5.5 and above (my client is using 5.3, so I didn't know about it).
https://github.com/zendtech/ZendOptimizerPlus
The Zend OPcache
The Zend OPcache provides faster PHP execution through opcode caching
and optimization. It improves PHP performance by storing precompiled
script bytecode in the shared memory. This eliminates the stages of
reading code from the disk and compiling it on future access. In
addition, it applies a few bytecode optimization patterns that make
code execution faster.
Why is there a function spl_autoload_unregister? If I register an autoloader, in what cases would I want to unregister it? It seems redundant to me.
And another question on that topic: since PHP 5.5 now comes with a built-in opcache, and in the past many installed APC - is there any reason to use an autoloader at all? Since all the code will end up in memory anyhow, isn't it better to just make one file which loads all my PHP classes?
You can register more than one autoload function. So if you have an application that registers a lot of autoload functions, it's possible you would want to unregister some of them as well. In practice, this function likely exists more for completeness (not many projects register so many autoloaders that they ever need to unregister one).
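A small sketch of where that might come up (the loader functions and paths here are made up for illustration):

// Two autoloaders registered as a chain; PHP calls them in order until one
// of them defines the class.
function legacyLoader($class) {
    $file = __DIR__ . '/legacy/' . strtolower($class) . '.inc.php';
    if (is_file($file)) {
        require_once $file;
    }
}
function psrLoader($class) {
    $file = __DIR__ . '/src/' . str_replace('\\', '/', $class) . '.php';
    if (is_file($file)) {
        require_once $file;
    }
}

spl_autoload_register('legacyLoader');
spl_autoload_register('psrLoader');

// Once the legacy code path is gone (say, after a migration step), the
// obsolete loader can be dropped from the chain:
spl_autoload_unregister('legacyLoader');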
Opcode caching is a different topic. This has nothing to do with autoloading. When PHP is told to execute a file it first parses the file and builds machine level instruction code (operations code, or opcode). The second pass executes the machine level code. Opcode caching (APC, Zend Opcache, etc) simply stores the machine level code so you only need to execute it the next time you call that page, and thus save yourself the extra processing.
Expanded for the comment
You can include all the files if you want, but autoloading does two important things:
You can structure your classes within a namespace and use files and directories to mirror the namespace structure. This makes working with your classes very easy because you can tell quickly where each class file lives.
You're including files only as you need them.
As for your opcache, your thinking is incorrect. Let's say you include all the classes and methods you use. Now, let's say you have a page that only uses 25% of your code base. That means you loaded the other 75% and forced the opcache to cache it. But to what end? Opcache works on a file-by-file basis, not a project-level basis. So you would be bloating your code on every page for no gain, because autoloading would have included the code you needed anyway, just dynamically.
So I have the following dilemma. I want to have a common library that all my ZF2 applications will use. This library will contain all the business logic for my website. Each application will consume different parts of the library to properly display/perform whatever actions are necessary. Now, so far I've managed to create a library. Let's call it Foo. Foo has a Module.php which does the basic autoloading required to load the entire library.
Now here is where I start to have problems. I want to take advantage of dependency injection, the service manager, etc from ZF2 inside Foo. The problem is I only have the one Module.php that loads Foo. This means as my library grows so will Module.php since as far as I can tell I can't have sub modules. Is there any way around this issue?
Essentially I want every app to just include Foo and Foo to have several Module.php so that at least the dependency stuff can be handled on a module by module basis.
You're probably swimming against the current to try and do sub-modules -- and you probably don't need to.
If you've written your module nicely, loading it won't be a very expensive operation. Remember, the whole point of the service manager is that all those services are lazily created. So if the calling code never asks for a particular service in a particular request, that service's classfile is never autoloaded, the object is never instantiated, etc. So you may be fine staying with a big, monolithic, module.
The one place that things might get a little tricky is if you're leaning heavily on the EventManager and your module is attaching a bunch of listeners. But you can probably get around that by setting up some module configuration directives and then conditionally attaching listeners.
Having said that, it probably makes sense to try to split your module up. So you could have FooBar and FooBaz modules.
If you really, really, want sub-modules, you can dig into the ModuleManager and try to figure it out. I went a little ways down that road once -- and then got distracted. In my case, I was dealing with shipping physical items. I wanted a "Fulfillment" module that could be configured to load a bunch of similar shipping modules (Fulfillment\Courier\USPSModule, Fulfillment\Courier\FedExModule, etc), so that my main Fulfillment module could iterate over all loaded submodules, without specific knowledge about any of them. If I recall correctly, the best way to do it was to essentially mirror what ZF2 does, but inside my Fulfillment\Module class. However, I can't think of many situations where you'd want to do that, unless you want a set of similar submodules that all implement the same interface, and want them to be consumed by a super-module that has no specific knowledge of them. I also looked at this because I was thinking about runtime enabling/disabling of those submodules by end-users (sort of like a plugin system).
If you're not doing that, I'd say stick to FooBarModule, FooBazModule, etc, so far as it makes sense. And remember even if your module contains a ton of code, the ServiceManager will only autoload, parse, and instantiate classes that are needed to satisfy the dependencies of any given request.
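As a rough sketch of that laziness in a module's getServiceConfig() (the Foo\Service\Report class and its mapper dependency are hypothetical, invented for the example):

namespace Foo;

class Module
{
    public function getConfig()
    {
        return include __DIR__ . '/config/module.config.php';
    }

    public function getServiceConfig()
    {
        return array(
            'factories' => array(
                // This closure only runs the first time something asks the
                // ServiceManager for 'Foo\Service\Report'; until then the
                // class file is never autoloaded and no object is created.
                'Foo\Service\Report' => function ($services) {
                    return new Service\Report(
                        $services->get('Foo\Mapper\ReportMapper')
                    );
                },
            ),
        );
    }
}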
From php.net:
In PHP 5, this is no longer necessary. You may define an __autoload() function which is automatically called in case you are trying to use a class/interface which hasn't been defined yet. By calling this function the scripting engine is given a last chance to load the class before PHP fails with an error.
Now I am wanting to know, is it bad practice to solely use __autoload to load the appropriate classes on a dynamic site?
The way my site is set up is to include files into the index.php file, for example http://www.site.com/index.php?p=PAGE-I-WANT-TO-LOAD
So if I am on the forums section or the blogs section of my site, I want only the appropriate classes and functions to be loaded, so I use autoload and never include a file manually. Should I be using __autoload only as a last resort, or is what I am doing fine even on a high-traffic system?
Bad? No. __autoload() is one of my favorite additions to PHP 5. It removes the responsibility (and annoyance) of manually having to include/require the class files your application needs. That being said, it's up to you as the developer to ensure that only the 'appropriate classes' are loaded. This is easily done with a structured naming scheme and directory structure. There are plenty of examples online of how to properly use __autoload(); do a Google search and you'll find plenty of information.
Autoloading is a good way to load only the classes that are needed.
In PHP 5 >= 5.1.2, most of the problems with the old __autoload() disappeared, thanks to spl_autoload_register().
Now I am wanting to know, is it bad practice to solely use __autoload to load the appropriate classes on a dynamic site?
Not at all. You can rely on autoloading; all you need to do is devise a good naming convention and implement an efficient autoloader.
There is one major issue to consider. Autoloading and Zend Guard do not play well together, because Zend Guard tends to rename things, which will mean that the naming convention you decided to use will most likely not be the same. If you will be using Zend Guard (or any other obfuscator for that matter) you will most likely be forced to include all the files by hand.
Here is a quote from the Zend Guard user guide:
Autoloading classes will not work since the filename on the disk would not
match the obfuscated class name.
The only danger with __autoload() is if you define a poor autoloading function. Generally, all you're going to get in terms of a performance hit is a few disk seeks as PHP looks for the right files that contain your classes. The upside is getting rid of all those annoying include() calls.
If you're worried about performance at this level, then you should already be using an opcode cache such as APC.
What solution would you recommend for including files in a PHP project?
No manual calls to require/include functions - everything loads through autoload functions.
Package importing, when needed.
Here is the package importing API:
import('util.html.HTMLParser');
import('template.arras.*');
Inside this function you can explode the string on dots (the package hierarchy delimiter) and loop through the files in a particular package (folder), including just one of them, or all of them if an asterisk is found at the end of the string, e.g. ('template.arras.*').
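A hedged sketch of what such an import() helper could look like (PACKAGE_ROOT is a made-up constant pointing at the top of the package tree):

function import($package)
{
    // util.html.HTMLParser -> util/html/HTMLParser, template.arras.* -> template/arras/*
    $path = PACKAGE_ROOT . '/' . str_replace('.', '/', $package);

    if (substr($package, -2) === '.*') {
        // Wildcard: include every PHP file in that package (folder).
        foreach (glob(dirname($path) . '/*.php') as $file) {
            require_once $file;
        }
    } else {
        // Single class: include exactly one file.
        require_once $path . '.php';
    }
}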
One of the benefits I can see in the package-importing method is that it can force you to use better object decomposition and class grouping.
One of the drawbacks I can see in the autoload method is that the autoload function can become very big and not very obvious/readable.
What do you think about it?
What benefits/drawbacks can you name in each of these methods?
How can I find the best solution for the project?
How can I know if there will be any performance problems if package management is used?
I use __autoload() extensively. The autoload function that we use in our application has a few tweaks for backwards compatibility with older classes, but we generally follow a convention when creating new classes that allows __autoload() to work fairly seamlessly:
Consistent class naming: each class lives in its own file, and each class name is camel-cased with underscores as separators. The name maps to the class path; for example, Some_CoolClass maps to our class directory plus 'Some/CoolClass.class.php'. I think some frameworks use this convention (a sketch of such an autoloader follows below).
Explicitly Require External Classes: since we don't have control over the naming of any external libraries that we use, we load them using PHP's require_once() function.
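A minimal sketch of that convention-based loader (the classes/ base path is an assumption for the example):

function __autoload($className)
{
    // Some_CoolClass -> classes/Some/CoolClass.class.php
    $file = __DIR__ . '/classes/' . str_replace('_', '/', $className) . '.class.php';
    if (is_file($file)) {
        require_once $file;
    }
}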
The import method is an improvement but still loads up more than needed.
Either by using the asterisk or by loading everything at the beginning of the script (because importing before every "new Classname" becomes cumbersome).
I'm a fan of __autoload() or the even better spl_autoload_register()
Because it will include only the classes you're actually using, with the extra benefit of not caring where a class is located. If your colleagues move a file to another directory, you are not affected.
The downside is that it needs additional logic to make it work properly with directories.
I use require_once("../path-to-auto-load-script.php.inc") with autoload.
I have a standard naming convention for all classes and inc files, which makes it easier to programmatically determine what class name is currently being requested.
For example, all classes have a certain extension like inc.php (so I know that they'll be in the /cls directory)
and
all inc files start with .ht (so they'll be in the /inc directory)
The autoload function accepts one parameter, className, which I then use to determine where the file is actually located. Once I know what my target directory is, I loop, each time prepending "../" to account for sub-sub-pages (which otherwise seemed to break autoload for me), and finally require_once the actual code file once it's found.
I strongly suggest doing the following instead:
Throw all your classes into a static array, className => filepath/classFile. The autoload function can use that to load classes.
This ensures that you always load the minimum number of files. It also means you avoid completely silly class names, and the parsing of said names.
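A minimal sketch of that class-map approach (the class names and paths in the array are placeholders; in practice the map would be generated or maintained alongside the code):

$classMap = array(
    'Database'   => '/path/to/lib/Database.php',
    'HTMLParser' => '/path/to/lib/util/html/HTMLParser.php',
);

spl_autoload_register(function ($class) use ($classMap) {
    // Look the class up in the map; no directory scanning or name parsing needed.
    if (isset($classMap[$class])) {
        require_once $classMap[$class];
    }
});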
If it's slow, you can throw on an accelerator, and that will gain you a whole lot more. If it's still slow, you can run things through a 'compile' process, where often-used files are dumped into common files and the autoload references are updated to point to the correct place.
If you start running into issues where your autoloading is too slow (which I find hard to believe), you can split the map up by package and have multiple autoload functions; that way only subsets of the array are needed. This works best if your packages are defined around modules of your software (login, admin, email, ...).
I'm not a fan of __autoload(). In a lot of libraries (some PEAR libraries, for instance), developers use class_exists() without passing in the relatively new second parameter. Any legacy code you have could also have this issue. This can cause warnings and errors if you have an __autoload() defined.
If your libraries are clear though, and you don't have legacy code to deal with, it's a fantastic tool. I sometimes wish PHP had been a little smarter about how they managed the behavior of class_exists(), because I think the problem is with that functionality rather than __autoload().
Rolling your own packaging system is probably a bad idea. I would suggest that you go with explicit manual includes, or with autoload (or a combination for that matter).