What solution would you recommend for including files in a PHP project?
There aren't manual calls of require/include functions - everything loads through autoload functions
Package importing, when needed.
Here is the package importing API:
import('util.html.HTMLParser');
import('template.arras.*');
In this function declaration you can explode the string with dots (package hierarchy delimeter), looping through files in particular package (folder) to include just one of them or all of them if the asterisk symbol is found at the end of the string, e.g. ('template.arras.*').
One of the benefits I can see in package importing method, is that it can force you to use better object decomposition and class grouping.
One of the drawbacks I can see in autoload method - is that autoload function can become very big and not very obvious/readable.
What do you think about it?
What benefits/drawbacks can you name in each of this methods?
How can I find the best solution for the project?
How can I know if there will be any performance problems if package management is used?
I use __autoload() extensively. The autload function that we use in our application has a few tweaks for backwards compatibility of older classes, but we generally follow a convention when creating new classes that allow the autoload() to work fairly seemlessly:
Consistent Class Naming: each class in its own file, each class is named with camel-case separated by an underscore. This maps to the class path. For example, Some_CoolClass maps to our class directory then 'Some/CoolClass.class.php'. I think some frameworks use this convention.
Explicitly Require External Classes: since we don't have control over the naming of any external libraries that we use, we load them using PHP's require_once() function.
The import method is an improvement but still loads up more than needed.
Either by using the asterisk or loading them up in the beginning of the script (because importing before every "new Classname" will become cumbersome)
I'm a fan of __autoload() or the even better spl_autoload_register()
Because it will include only the classes you're using and the extra benefit of not caring where the class is located. If your colleges moves a file to another directory you are not effected.
The downside is that it need additional logic
to make it work properly with directories.
I use require_once("../path-to-auto-load-script.php.inc") with auto load
I have a standard naming convention for all classes and inc files which makes it easier to programaticaly determine what class name is currently being requested.
for example, all classes have a certain extension like inc.php (so I know that they'll be in the /cls directory)
and
all inc files start with .ht (so they'll be in the /inc directory)
auto load accepts one parameter: className, which I then use to determine where the file is actually located. looping once I know what my target directory is, each time adding "../" to account for sub sub pages, (which seemed to break auto load for me) and finally require_once'ing the actual code file once found.
I strongly suggest doing the following instead:
Throw all your classes into a static array, className => filepath/classFile. The auto load function can use that to load classes.
This ensures that you always load the minimum amount of files. This also means you avoid completely silly class names, and parsing of said names.
If it's slow, you can throw on some accelerator, and that will gain you a whole lot more, if it still is slow, you can run things through a 'compile' process, where often used files are just dumped into common files, and the autoload references can be updated to point to the correct place.
If you start running into issues where your autoloading is too slow, which I find hard to believe, you can split that up according to packages, and have multiple autoloading functions, this way only subsets of the array are required, this works best if your packages are defined around modules of your software (login, admin, email, ...)
I'm not a fan of __autoload(). In a lot of libraries (some PEAR libraries, for instance), developersuse class_exists() without passing in the relatively new second parameter. Any legacy code you have could also have this issue. This can cause warnings and errors if you have an __autoload() defined.
If your libraries are clear though, and you don't have legacy code to deal with, it's a fantastic tool. I sometimes wish PHP had been a little smarter about how they managed the behavior of class_exists(), because I think the problem is with that functionality rather than __autoload().
Rolling your own packaging system is probably a bad idea. I would suggest that you go with explicit manual includes, or with autoload (or a combination for that matter).
Related
I am currently working as a student on php project which has grown since the beginning of time and has about 1800 php-files.
The problem is: it is completely without namespaces, or any of the PSR-4, etc. recommendations. The technical debt is strong with this one :).
We want to use composer (and twig and some libraries more) and having problems including this (especially composer). I think it's because of the overwrite of __autoload() via spl_autoload_register() in the composer-autoloader?
Is there a good and fast way to start integrating namespaces without rewriting the whole project?
You can still use Composer with PSR-0 or classmap.
I'd probably go with the classmap first. Advantages: Can deal with multiple classes per file. Can deal with arbitrary file structures.
Once you achieved using Composer's autoloading, you can start removing either the existing autoloader or those require_once/include_once that are likely spread all over the place.
Once you've got rid of all that file loading legacy and have established Composer autoloading, you can try to organize the code according to PSR-0. This will probably require you to rename files and reposition them. It might also be the case that the classes don't have any identifiable prefixes - which is bad for your PSR-0 autoloading because all these files would belong into one single folder.
Note that until this point you haven't changed the name of any class, so the code should run without any changes.
Using namespaces does not have a very noticeable advantage. You'd be forced to rename all classes, rename all usages of that class name, and all in all it won't provide any noticeable benefit if used alone.
On the other hand, you sound like you do want to refactor everything else as well, so switching to namespaces can be used as a signal for "newer" code.
You can use PSR-4 and PSR-0 at the same time, so it won't affect your renaming of classes (besides the necessary class name changes in all the places).
I'm new to php and inherited a website project with hundreds of pages, all procedural (when I do a text search of the files, there isn't even a function definition anywhere). I'm coming from the c# and Java worlds. I'm looking for a way to incrementally add OOP. (They want me to update the front end and I am trying to convince them of fixing the backend at the same time and they don't want to use a framework (dammit)).
While looking into autoloader... Well, here's my understanding. It's a method of registering folders where classes are stored and when you instantiate a class, trait, etc. it searches the folder based on the class/filename/namespace and loads the appropriate definitions.
I have a few questions:
Does autoloader search the folder and load the appropriate definitions on every page lifecycle (or does it cache them)?
Pre-loading:
Is there a way to use autoloader, or some alternative, to pre-load ALL class definitions into memory and make them available across all sessions?
If so, when updating class files, how would I tell this mechanism to reload everything to memory when I make changes to class files?
UPDATE TO QUESTIONS:
Thank you both for your answers and it helps a little, but... I do have a bad habit of posing the wrong question(s) on StackOverflow.
The thing I want to avoid is slowing down pages by adding classes. So let's say I add a library and register the paths with autoloader. A page instanciates a class with multiple dependencies. Let's say that the dependency graph includes 15 files. For each request lifecycle, the server loads the page and 15 other files just on that one page.
Since I am coming from compiled languages, I feel a little strange not loading these classes into memory. All the classes together should not be over say 5MB.
(Or maybe I should just create a RAM Disk and copy all the files in there on boot and just have a symlink?)
Auto loaders in PHP are lazy. When PHP encounters a the use of a class it doesn't know about, it will ask the registered autoloader (or chain of autoloaders) to go find it. It's the autoloader's job to figure out where to get the file the class is defined in and include it. Having some sort of convention for naming your classes and organizing your class files is key to having a useful autoloader, and several conventions have arisen in the PHP community, such as PSR-4.
Does autoloader search the folder and load the appropriate definitions on every page lifecycle (or does it cache them)?
The autoloader(s) is(are) called on every request, but only when the need to autoload a class arises.
Pre-loading: Is there a way to use autoloader, or some alternative, to pre-load ALL class definitions into memory and make them available across all sessions?
I don't believe so, but as the number of classes grow, this becomes more and more wasteful.
Welcome to the wonderful[citation needed] world of legacy PHP, I highly recommend you check out Modernizing Legacy Applications In PHP. It's like a strategy guide for getting from Mordor back to the Shire.
I think you may misunderstand the purpose of autoloading. It is simply instructions on what to do when your code calls for a class that PHP doesn't recognize. That's it. The autoloader just calls requires /path/to/classfile so that PHP will see the class.
Does autoloader search the folder and load the appropriate definitions
on every page lifecycle (or does it cache them)?
There is no caching across requests, so if you make a change to file, the next http request will incorporate those changes. It's just as if you changed any other instruction in your script, for example change echo 1 to echo 2
Pre-loading: Is there a way to use autoloader, or some alternative, to
pre-load ALL class definitions into memory and make them available
across all sessions?
There is no need for this. A well written autoloader has instructions for where to find any class, so loading all possible classes ahead of time is wasteful. If you're still running into undefined classes errors, you need to either improve the autoloader or place the class files in accordance with the current autoloader instructions.
If you really want to preload all your classes, use the auto_prepend_file setting in php.ini. The docs say
Specifies the name of a file that is automatically parsed before the
main file
Set it to an initialization script. In that script have something like:
//put all your class files in this folder
$dir = '/path/to/classes/folder';
$handle = opendir($dir);
//require all PHP files from classes folder
while (false !== ($item = readdir($handle))){
$path = $dir.'/'.$item;
if(is_file($path) && pathinfo($path,PATHINFO_EXTENSION)==='php')
require_once $path;
}
This is simplified. There is significant risk in just including all files in any directory into your script so I would not do this. You would also need to adjust this if you want to include files in subdirectories.
Basically, don't do this. Just have a good autoloader.
No one posted what I was looking for but it seems the best route is the OptCache that's prebuilt into php 5.5 and above (my client is using 5.3 so I didn't know about it).
https://github.com/zendtech/ZendOptimizerPlus
The Zend OPcache
The Zend OPcache provides faster PHP execution through opcode caching
and optimization. It improves PHP performance by storing precompiled
script bytecode in the shared memory. This eliminates the stages of
reading code from the disk and compiling it on future access. In
addition, it applies a few bytecode optimization patterns that make
code execution faster.
In composer documentation section about autoloading I found at least two means to load classes that not corresponds to psr-0/4. First one is to specify classmap property of composer.json file and second one is fill include-path property in my composer.json.
As I can see include-path is more plain feature while classmap caused scanning for classes in specified locations. Who can explain which one should I use in different cases?
Avoid using include-path. It is meant for old software that expects the include_path setting to contain some directories, so that require_once "Relative/Path/To/Class.php" works from ANY location (think of the way PEAR works). Using too many paths impacts performance, because PHP needs to scan starting from the first directory, until it finds the relative path requested.
Classmaps are an always working solution if the classes do not conform to PSR-0 or PSR-4. Autoloading works by knowing the name of the class and finding out the file this class is contained in. PSR-0/4 define a way to know the file name by using and splitting the class name. Classmaps however know every class name and their file name directly. The bad thing about classmaps is that if they grow too large, they also affect performance because loading a huge classmap and then only using about 1% of the contained classes has a big overhead.
include-path and classmap are not mutually exclusive. In fact, they might be both needed: To load the first class, you'd need the classmap (otherwise you'd be forced to explicitly use require_once), and if that file will load dependencies using relative paths inside require_once (and does not know about autoloading), a proper include path has to be set.
If you ever have the chance to change it, I highly recommend to avoid setting the include path, and only use the classmap feature to autoload classes (which means you should remove any include/require functions in that code base). It would be even better if your code can be transformed to be PSR-0 compliant, but this usually is a huge rewriting task for old code with barely any benefit. You'd have to worry about performance only if you really have a HUGE framework with many files and classes - which usually is not the case with older code bases.
Why there is the function spl_autoload_unregister ? If I register autoloader in what cases I would want to unregister it ? seems redundant to me.
And another question in that topic: since now php 5.5 comes with built-in opcache, and in the past many install apc - is there any reason to use autoloader all together ? since all the code now will go to the memory anyhow - isn't it better to just make one file which loads all my php classes ?
You can specify more than one autoload method. So if you have an application that has a lot of autoload methods, it's possible you would want to unregister those methods as well. In practice, this method likely exists more for completeness (not too many projects use so many methods that you would want to unregister an autoload method).
Opcode caching is a different topic. This has nothing to do with autoloading. When PHP is told to execute a file it first parses the file and builds machine level instruction code (operations code, or opcode). The second pass executes the machine level code. Opcode caching (APC, Zend Opcache, etc) simply stores the machine level code so you only need to execute it the next time you call that page, and thus save yourself the extra processing.
Expanded for the comment
You can include all the files if you want, but autoloading does two important things
You can structure your classes within a namespace and use files and
directories to mirror the namespace structure. This makes working with your classes very easy because you can tell quickly where each class file lives
You're including files only as you need them
As for your opcache, your thinking is incorrect. Let's say you include all classes and methods you use. Now, let's say you have a page that only uses 25% of your code base. That means you loaded the other 75% and forced the opcache to cache it. But to what end? Opcache works on a file-by-file basis, not a project level basis. So you would be bloating your code on every page for no gain because autoloading would have included the code you needed anyways, but dynamically.
From php.net:
In PHP 5, this is no longer necessary. You may define an __autoload() function which is automatically called in case you are trying to use a class/interface which hasn't been defined yet. By calling this function the scripting engine is given a last chance to load the class before PHP fails with an error.
Now I am wanting to know, is it bad practice to solely use __autoload to load the appropriate classes on a dynamic site?
The way my site is setup is to include files into the index.php file, for example http://www.site.com/index.php?p=PAGE-I-WANT-TO-LOAD
So if I am on the forums section or the blogs section of my site, I want only appropriate classes and functions to be loaded, so I use autoload but I never include a file manually, should I be using __autoload as a last resort or is what I am doing fine even on a high traffic system?
Bad? No. __autoload() is one of my favorite additions to PHP 5. It removes the responsibility (and annoyance) of manually having to include/require the class files necessary to your application. That being said, it's up to you as the developer to ensure that only the 'appropriate classes' are loaded. This is easily done with a structured naming scheme and directory structure. There are plenty examples online of how to properly use __autoload(), do a Google search and you'll find plenty of information.
Autoload is a good way to load only what classes is needed.
In PHP 5 >= 5.1.2, most of the problems with the old __autoload() dissapeared, thanks to spl_autoload_register().
Now I am wanting to know, is it bad practice to solely use __autoload to load the appropriate classes on a dynamic site?
Not at all. You can rely on autoload, all you need to do is to devise a good naming convention and implement an efficient autoloader.
There is one major issue to consider. Autoloading and Zend Guard do not play well together, because Zend Guard tends to rename things, which will mean that the naming convention you decided to use will most likely not be the same. If you will be using Zend Guard (or any other obfuscator for that matter) you will most likely be forced to include all the files by hand.
Here is a quote from the Zend Guard user guide:
Autoloading classes will not work since the filename on the disk would not
match the obfuscated class name.
The only danger to __autoload() is if you define a poor autoloading function. Generally, all you're going to get in terms of a performance hit is a few disk seeks as PHP looks for the right files that contain your classes. The upside is getting rid of all those annoying include() calls.
If you're worried about performance at this level, then you should already be using an opcode cache such as APC.