Some background first
Our company, a small startup with only four developers, is starting to refactor our products into reusable modules to simplify the development process, increase productivity and, along the way, introduce unit tests where they fit.
As usual at a small startup, we can't afford to waste too much development time but, as we see it, this is extremely important for the success of our business in the medium and long term.
Currently, we have two end-user products. Both are Laravel (PHP) applications built on top of our own internal business layer, mainly composed of web services, RESTful APIs and a huge database.
This business layer provides most of the data for these products, but each of them makes completely different use of it. We plan to build other products in the near future, besides maintaining and improving the two that are almost finished.
For that to happen, we intend to abstract the common logic of those (and future) products into reusable, decoupled modules. The obvious choice seems to be Composer, despite our limited knowledge of it.
Now to the real question
I would like to ask for other opinions on how to develop internal packages in a test-driven fashion. Should each module be a Composer package with its own unit tests, requiring its own dependencies, or should we build a single package with each module namespaced?
To clarify a bit: we would like to have, for instance, a CurlWrapper module, which would be required by our InternalWebserviceAPI module (and a few others).
I personally like the idea of having completely separate packages for each module and declaring dependencies in composer.json, which would mentally enforce decoupling and would allow us to publish some of those packages as open source someday. It may also make breaking changes to those modules easier to handle, because we could freeze a module's version in the dependents that still need updating.
On the other hand, I also think this separation may add a lot of complexity and be harder to maintain and test, since each module would need to be a project of its own, and we don't have the manpower to keep track of so many small projects.
Is Composer really the ideal solution for our problem? If so, which would you recommend: a single package or multiple packages?
Edit 1:
I would like to point out that most of these modules are going to be:
Libraries (e.g. obtaining an ID from a YouTube URL, or converting dates to "x seconds ago"); see the sketch after this list
Wrappers (like a chainable cURL wrapper)
Facades (of our multiple webservices; these require the other two kinds)
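To illustrate the first kind, here is a minimal sketch of what such a library function could look like (the function name and the exact URL handling are illustrative, not our actual code):

    <?php
    // Hypothetical example of a small "library" module: extracting the
    // video ID from a YouTube URL. Names and behavior are illustrative only.
    function extractYoutubeId($url)
    {
        $parts = parse_url($url);
        if ($parts === false || !isset($parts['host'])) {
            return null;
        }

        // youtu.be/VIDEOID short links: the ID is the path itself.
        if ($parts['host'] === 'youtu.be') {
            return ltrim($parts['path'], '/');
        }

        // youtube.com/watch?v=VIDEOID links: the ID is the "v" query parameter.
        if (isset($parts['query'])) {
            parse_str($parts['query'], $query);
            if (isset($query['v'])) {
                return $query['v'];
            }
        }

        return null;
    }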
Yes, Composer is the way to go, and I recommend using separate packages.
You don't know when you will need these modules. It is better to create many small packages and be able to include them all (or just one) than to create big packages and later have to spend time breaking a package into multiple ones when you need only some classes from it.
For instance, see the Symfony2 project. It is a lot of components which are all required for the full-stack Symfony2 framework, but you can also use individual components in your own project (like Drupal 8 is doing). Moreover, Symfony2 keeps gaining packages; small packages have proven so useful that people put time into breaking big packages into pieces.
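As a rough sketch of what that looks like in practice (the vendor name and version constraints here are made up for illustration), the InternalWebserviceAPI package would simply declare its dependency on CurlWrapper in its own composer.json:

    {
        "name": "acme/internal-webservice-api",
        "require": {
            "php": ">=5.3.3",
            "acme/curl-wrapper": "~1.0"
        },
        "require-dev": {
            "phpunit/phpunit": "~4.0"
        },
        "autoload": {
            "psr-4": { "Acme\\InternalWebserviceAPI\\": "src/" }
        }
    }

Each package then carries its own test suite and can be tagged independently, so a dependent that is not ready for a breaking change can keep requiring ~1.0 while the others move on.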
An alternative to fully separate packages: use a separate composer.json file for each subproject.
This has the benefit of letting you keep all of your libraries in the same repository. As you refactor the code, you can also partition autoloading and dependencies by sub-library.
If you get to the point where you want to spin a library off into its own versioned package, you can go the final step and check it into its own repository.
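Assuming a Composer version that supports path repositories, one way to wire the sub-libraries into the application during development is (all names here are placeholders):

    {
        "name": "acme/app",
        "repositories": [
            { "type": "path", "url": "lib/curl-wrapper" },
            { "type": "path", "url": "lib/webservice-api" }
        ],
        "require": {
            "acme/curl-wrapper": "*",
            "acme/webservice-api": "*"
        }
    }

Each directory under lib/ keeps its own composer.json, src/ and tests/, so spinning one off into its own repository later is mostly a matter of moving the directory.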
Related
I'm developing an application that consists of several modules/packages which I also want to offer as standalone packages. I know how to create Composer packages, but I'm not exactly sure about the best way to do the actual development, and I need your help on this.
One way would be installing the packages with Composer, but that would mean that, for each change, I would have to commit and then run a composer update on my app just to be able to test it. Not very practical.
Another way would be to include them in my app while keeping the packages' internal structure. That would work fine for development but would pose a problem for publishing individual packages, since all the code would belong to the same repository.
I think a good example of this is the way modern frameworks like Laravel are distributed. They have the whole code available in one repository but, at the same time, make each individual component available standalone.
What's the best way (in your opinion) to accomplish this?
Thanks in advance.
Symfony2 uses Git subtree split. That is, a single development repository which is split into multiple repositories later.
Make no mistake about it though: the code is the same, but they "are different" repositories, and the procedure for maintaining them is rather long-winded.
http://www.craftitonline.com/2012/03/git-subtree-split-this-is-what-symfony2-does-every-night-to-set-standalone-components/
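The core of that procedure is roughly the following (the directory, branch and remote names here are invented for illustration):

    # Extract the history of one component's directory into its own branch...
    git subtree split --prefix=src/Acme/CurlWrapper -b split/curl-wrapper

    # ...and push that branch to the component's standalone repository.
    git push git@github.com:acme/curl-wrapper.git split/curl-wrapper:master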
We have a huge codebase (I mean huge, about 2M+ lines) in PHP. I would like to know how you managed to integrate Composer in this kind of situation.
Especially when the code cannot be decoupled into little projects (right now) because of its complexity (even mixed with legacy code), and it is all held in the same SVN repository.
Why should I be confident in the quality of the Composer/Packagist libraries?
What happens if Packagist goes down?
What should I do if my vendor's repository goes down (GitHub/Bitbucket/whatever)?
What happens if some of my vendors decide to delete their library?
What if they've been hacked and the next version tag is pushed empty?
I know that these potential problems could be worked around one way or another. But the fact that the livelihood of a lot of people could depend on this makes me feel a bit crazy about this kind of decision.
What do you think? What are my best options?
For the first point: if you have a legacy, 2M+ line, tightly-coupled codebase, the quality of common open source projects shouldn't worry you ;).
For the rest: you can use staging to build your project together with its dependencies and then build a full package there (by that I mean all the dependencies downloaded and bundled). Of course you will still depend on external packages during your development cycle, but not in deployment/production. Whenever a package goes down, you have the time and opportunity to replace it.
Composer is a really great tool for bundling your project together with its dependencies, so it is both the answer to "how do we use external dependencies" and to "how do we stay independent from them"; you only need to pick the point at which you bring this independence into your project.
I think you should develop with external dependencies in mind, keeping your own codebase as small as possible, and not put these problems on your devs' shoulders; they want to use code and libraries, play with them... Then, somewhere in your deployment process, bundle it all together (staging is a good place). Even if a dependency disappears and you have to spend development time replacing it:
it will probably still cost you less than handling everything on your own.
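In practice the staging step can be as small as this (a sketch; the archive name is arbitrary):

    # On staging: resolve and download all production dependencies once.
    composer install --no-dev --optimize-autoloader

    # Ship the application together with its vendor/ directory, so
    # production never has to talk to Packagist or GitHub at all.
    tar -czf myapp-release.tar.gz --exclude='.git' .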
As far as I understand, every enabled module in a ZF2 application is loaded for every request (unless one uses optimization methods such as those offered by the zf2-lazy-loading-module module). I've been keeping an eye on modules that get published on modules.zendframework.org and I've come across modules which offer extremely limited functionality, such as the AkrabatFormatUkTelephone module, whose purpose is to format phone numbers to UK format.
Whilst I understand development should focus on creating single-purpose modules that are good at doing one thing (instead of modules which do many things, none of them in a very good way), I'm thinking that if we start using modules which offer such limited functionality as the one mentioned, we will need to combine hundreds of modules to build a rich application, which could be disastrous for performance. Instead I would expect this sort of functionality to be put in a class (e.g. Zend\I18n?) and loaded on demand, which would be more optimized. But knowing Akrabat's reputation I'm thinking I must be missing something, hence my question:
Is the loading of modules such as the one I mentioned significantly worse for performance than loading the same functionality via PHP classes (or is it similar due to the way ZF2 has been designed)? Does anybody have any figures (e.g. is it 5%, 10%, 15% slower) on module vs. class loading performance?
Don't take this comment as a final answer, as hopefully one of the ZF2 devs will shed some more insight on it, but generally only Module.php and usually module.config.php will be actively loaded. Everything else will simply be registered and called on demand. So as long as your Module.php and module.config.php are not TOO big in file size, performance shouldn't be THAT big of an issue.
In the case of Akrabat's example, all that happens is the registration of a new ViewHelper, nothing else. The same goes for all the other view helpers inside Zend. Performance won't really matter a lot in these cases.
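For reference, the configuration of such a module is typically nothing more than a one-entry array; a sketch along the usual ZF2 conventions (the helper alias and class path are illustrative):

    <?php
    // module.config.php - registering a single view helper is essentially
    // all the "loading" a module like this causes on each request.
    return array(
        'view_helpers' => array(
            'invokables' => array(
                'formatUkTelephone' => 'AkrabatFormatUkTelephone\View\Helper\FormatUkTelephone',
            ),
        ),
    );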
Personally, the Skeleton loaded in 80 ms on my webspace, and with BjyAuthorize, ZfcBase, ZfcUser and my own module, the loading time ramped up to 100 ms. And this is without any sort of memory caching enabled!
Loading a module is not much more than loading any class, as Sam pointed out.
As long as you don't use anything from your module and do things right, it is just being registered.
Now what does "do things right" mean?
Just try putting a big nonsense loop inside your module class's bootstrap() method. You will see that this slows down every request to your application, because the bootstrap method of your module is called on every request; it should be used very carefully, and only for lightweight tasks. The purposes you usually use the bootstrap() method for won't slow your app down even a millisecond, but writing a file to disk in this method could slow your app down by many seconds on each request.
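A sketch of the difference (the module name and listener are invented):

    <?php
    // Module.php - onBootstrap() runs on EVERY request, so keep it light.
    namespace MyModule;

    class Module
    {
        public function onBootstrap($e)
        {
            // Fine: cheap wiring, such as attaching a listener.
            $events = $e->getApplication()->getEventManager();
            $events->attach('dispatch', function ($event) {
                // lightweight per-request logic goes here
            });

            // Bad: heavy I/O on every request. Something like
            //   file_put_contents('data/hits.log', date('c'), FILE_APPEND);
            // here would add its full cost to every single page load.
        }
    }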
If your app becomes really heavy, you should use the classmap autoloader and some caching wherever you can. If you do things right, you won't have any performance problems just because you have many modules or many classes in your app. One could say it's all about algorithms.
Keep on using best practices, like the one you mentioned. Usually these aren't the bottlenecks of your application; your own algorithms and mistakes are.
edit:
When you're using modules from the community, you should always check them for performance issues. Even a module that seems very light could be a bottleneck for your application if it has bad algorithms. But the mere fact that you're loading an additional module is not the point.
Good question. I would like to add a little to Sam's answer.
Module performance is not solely about loading the module (which is, as pointed out, quite fast), but also about the communication between modules. So this question might boil down to: how slow/fast are the ServiceLocator and the event-driven system compared to traditional non-modular systems?
I recall that ZF2 was built with performance in mind. For instance, the ServiceLocator registers factories, so that objects can be instantiated on the fly. This requires only a few extra in-memory objects and instantiations, so I guess it does not impact the total performance of your application much. The EventManager works in much the same way, and I have not seen it overloaded with registered events, even in large applications.
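That on-the-fly instantiation simply means the ServiceLocator stores a recipe rather than the object itself; a minimal sketch (the service name and class are invented):

    <?php
    // module.config.php - the ServiceLocator only keeps this closure;
    // the actual Mailer object is built the first time it is requested.
    return array(
        'service_manager' => array(
            'factories' => array(
                'MyModule\Mailer' => function ($serviceLocator) {
                    $config = $serviceLocator->get('Config');
                    return new \MyModule\Mailer($config['mail']);
                },
            ),
        ),
    );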
What might slow things down, on the other hand, is the loading of the modules' configuration. I figure that using a cache would solve this problem. I'm not sure, but maybe Zend Optimizer does this already.
So, in short, applications should scale pretty well, provided that modules behave well and do not over-register events or misuse the ServiceLocator.
From the MVC component's perspective there are no modules at all! There's one big configuration file: the result of merging every module's configuration. As long as your modules don't have an onBootstrap method, or it doesn't do much, module loading is as fast as invoking new Module on each of them, which is painless and memory-inexpensive.
The configuration merge procedure which I mentioned above happens only in DEV mode, which is enabled by default.
There are a number of tricks to speed up your ZF2 application, like:
Enable the merged config cache (see the sketch after this list)
Use the EdpSuperluminal module
Return ViewModel objects from actions, not arrays
Explicitly set the template name on the ViewModel
Use template maps instead of the template path stack alone
Route order in the config matters! It's a LIFO queue (last in, first out).
Make sure you don't load console modules in an HTTP context.
Let Composer do the autoloading, not ZF2.
... and more. There's quite a good talk by Gary Hockin on ZF2 app performance.
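For the first trick, the merged config cache is enabled in config/application.config.php (a sketch; the cache key and directory are up to you):

    <?php
    // config/application.config.php (excerpt)
    return array(
        // ...
        'module_listener_options' => array(
            // Cache the merged module configuration so the merge
            // happens once, instead of on every request.
            'config_cache_enabled' => true,
            'config_cache_key'     => 'app_config',
            'cache_dir'            => 'data/cache/',
        ),
    );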
Authorization modules will surely slow down your app. There are a number of things going on under the hood: the identity of the user needs to be fetched (from the database?), and the user needs to be authenticated against your rules. Surely you can speed things up by using memcached or the like, but this requires some knowledge about the life cycle of the ZF2 application, about the modules you use, etc.
Also, Zend Framework 3 is going to be released soon; some things will get faster, but don't expect too much. A lot of the overhead is a result of a lack of knowledge about ZF2 - no offense!
I'm building a new application; eventually I'm planning to release it onto the Internet for free consumption.
In an effort to reduce the final download size of the package, I would like to bundle only the absolute bare minimum of ZF components used by my app:
Zend
Zend_Application
Zend_Config
Zend_Config_Ini
...
I could do this manually, though I'd rather not. Is there a tool around that I can point at my application, which can scan the PHP codebase and create a package with all the classes referenced?
I know ZF2 uses composer.json to take care of this; however, I'm building on ZF 1.11.
No, there is no tool to do this job for you. However, I recommend not breaking ZF1 into pieces, because it is not trivial to track down the many dependencies between the components. There is also no benefit: the autoloader takes care that only the classes which are required get loaded, so you would only save a small amount of disk space. That's not worth all the effort it takes. In other words, you will feel no difference whether you use ZF1 as a whole or only partially, unless you end up finding broken dependencies you created yourself.
I am wondering what the best way (for a lone developer) is to
develop a project that depends on code from other projects
deploy the resulting project to the server
I am planning to put my code in SVN and have the shared code as a separate project. There are problems with svn:externals which I cannot fully estimate.
I've read
subversion:externals considered to be an anti-pattern, and
How do you organize your version control repository,
but there is one special thing about PHP projects (and other interpreted source code): there is no final executable built from your libraries. External dependencies therefore always point at raw source code.
Ideally, I really want to be able to develop simultaneously on one project and on the projects it depends on.
Possible way:
Check out a project's dependency in a subfolder as a working copy of the trunk. Problems I foresee:
When you want to deploy a project, you might want to freeze its dependencies, right?
The dependency code should not end up duplicated in the project's repository, I think.
(Update 1: I additionally assume svn:ignore will pose problems if I cannot fall back on symlinks; see my comment. I am still looking for suggestions that do not require the use of junction points; they are a sort of unsupported hack in WinXP which may break some programs.)
This leads me to the last part of the question (as one part influences the other): how do you deploy apps with such dependencies?
I've looked into Buildout for Python, but it seems to be tightly tied to the Python ecosystem (resolving and fetching Python modules from the web, etc.).
I am very eager to learn about your best practices.
One approach might be:
one repository per dependency
a requirements configuration file for your project which documents the dependencies and their versions (probably even your own versions of the dependencies)
automation scripts that handle setup of the development, testing and deployment environments (can be as simple as documenting the setup procedure once and making it configurable and executable)
This has several benefits:
you can easily (or even automatically) check whether your dependencies have become outdated (another, better library is available) or have known security vulnerabilities
more awareness of dependencies
easier to debug/fix/patch problems caused by dependencies
avoiding svn:externals might also ease the pain when you switch to distributed version control like git, bzr or hg in the future
if you want to set up your environment on another machine (or eventually another developer takes over or joins), it will save you tons of time
Some KISS automation tools that are popular in web development and server administration:
fabric (python)
buildout (python)
capistrano (ruby)
Summary:
Document your requirements (preferably machine-readable: YAML, INI, JSON or XML) and handle dependencies outside of your project. This gives you a bit of indirection, which makes automated setup and deployment easier and less dependent on your version control system (separation of concerns, best tool for the job, etc.).
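Such a requirements file can be as small as this (the format and fields here are just one possibility, not a standard):

    {
        "dependencies": {
            "http-client-lib": {
                "version": "1.2.0",
                "source": "https://svn.example.com/libs/http-client/tags/1.2.0"
            },
            "shared-models": {
                "version": "0.9.1",
                "source": "https://svn.example.com/libs/shared-models/tags/0.9.1"
            }
        }
    }

A small fabric or capistrano task can then read this file and check out exactly these tags into the development, testing or deployment environment.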
This may sound cheap, but I think I have an answer for you in this question's thread: svn folder structure organization
As long as you stay within your own repository, I wouldn't consider svn:externals as harmful as stated. Just don't overdo it.
Deployment with this strategy is also a piece of cake, since it is ALL in one tag (check out, run it, profit). Your directory structure remains the same on all levels: branches, tags, trunk.
By directing externals to tags (and making tags read-only on the SVN server), you can be 100% sure you get the library you expect.
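Concretely, pinning an external to a tag looks something like this (the local path and repository URL are placeholders):

    # Point lib/shared at an exact, read-only tag of the shared library.
    svn propset svn:externals 'lib/shared https://svn.example.com/repos/shared/tags/1.2.0' .
    svn commit -m "Pin shared lib to tag 1.2.0"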