We are working on an image-processing application. It involves applying filters, Gaussian etc. We want to make it a highly concurrent application.
This will be on multiple single core ec2 instances.
Since Imageprocessing is an cpu intensive operation, we are thinking node.js gets blocked in event loop, so thinking to use php. We are not able to find any benchmarks in this area. Any inputs on this will be a great help.
Its a CPU-bound task. Really well optimized PHP or Node will probably perform similarly. I/O concurrency will not affect CPU bound tasks on single core. On many core the I/O may come into play, but realistically most platforms including PHP have efficient strategies for concurrent I/O now. Also you are likely to end up calling out to C or C++ code regardless.
If you really want (cost-effective) performance, drop the single core thing, put some large basically gaming or bitcoin mining PCs in the office, find a nice way to distribute the tasks among the machine(s) and a way to process multiple images concurrently on the GPUs. None of that is in actuality tied to a particular programming language.
PHP is not highly concurrent and each request will block until it's done. Node would be fine as long as it's mainly doing I/O, or waiting for another process to return, e.g. calling convert (ImageMagick), rather than doing any processing itself. The more CPU cores you have to run the actual conversion on, the better.
For image processing, I recommend to use PHP instead of Node.js because there are many great PHP packages that can help you work with images easily. Don't worry about the performance of PHP7 or HHVM :)
Related
I want to use Google translate's v3 PHP library to translate some text into a bunch of different languages. There may be workarounds (though none ideal that I know of), but I'm also just trying to learn.
I wanted to use using multiple calls to translateText, one call per target language. However, to make things faster, I would need to do these requests concurrently, so I was looking into some concurrency options. I was wanting to use calls to translateText instead of constructing a bunch of curl requests manually using curl multi.
I tried the first code example I found from one of the big concurrency libraries I've seen recommended, amphp. I used the function parallelMap, but I'm getting timeout errors when creating processes. I'd guess that I'm probably forking out too many processes at a time.
I'd love to learn if there is an easy way to do concurrency in PHP without having to make a bunch of decisions about how many processes to have running at a time, whether I should use threads vs processes, profiling memory usage, and what PHP thread extension is even any good / if the one I've heard of called "parallel" may be discontinued (as suggested in a comment here).
Any stack overflow post I've found so far is just a link to one giant concurrency library or another that I don't want to have to read a bunch of documentation for. I'd be interested to hear how concurrency like this is normally done / what options there are. I've found many people claim that processes aren't much slower than threads these days, and I can't find quick Google answers as to whether they take a lot more memory than threads. I'm not even positive that a lack of memory is my problem, but it probably is. There has been more complexity involved than I would have expected.
I'm wondering how this is normally handled.
Do people normally just use a pool of workers (processes by default, or threads using, say the "parallel" PHP extension) and set a max number of processes to run at a time to make concurrent requests?
If in a hurry, do people just kind of pick a number that isn't very optimized for how many worker processes to use?
It would be nice if the number of workers to use was set dynamically for you based on how much RAM was available or something, but I guess that's not realistic, since the amount of RAM available can quickly change.
So, would you just need to set up a profiler to see how much ram one worker process/thread uses, or otherwise just make some sort of educated guess as to how many worker processes/threads to use?
I have a multi-process PHP (CLI) application that runs continuously. I am trying to optimize the memory usage because the amount of memory used by each process limits the number of forks that I can run at any given time (since I have a finite amount of memory available). I have tried several approaches. For example, following the advice given by preinheimer, I re-compiled PHP, disabling all extensions and then re-enabling only those needed for my application (mysql, curl, pcntl, posix, and json). This, however, did not reduce the memory usage. It actually increased slightly.
I am nearly ready to abandon the multi-process approach, but I am making a last ditch effort to see if anyone else has any better ideas on how to reduce memory usage. I will post my alternative approach, which involves significant refactoring of my application, below.
Many thanks in advance to anyone who can help me tackle this challenge!
Mutli-process PHP applications (e.g. an application that forks itself using pcntl_fork()) are inherently inefficient in terms of memory because each child process loads an entire copy of the php executable into memory. This can easily equate to 10 MB of memory per process or more (depending on the application). Compiling extensions as shared libraries, in theory, should reduce the memory footprint, but I have had limited success with this (actually, my attempts at this made the memory usage worse for some unknown reason).
A better approach is to use multi-threading. In this approach, the application resides in a single process, but multiple actions can be performed *concurrently** in separate threads (i.e. multi-tasking). Traditionally PHP has not been ideal for multi-threaded applications, but recently some new extensions have made multi-threading in PHP more feasible. See for example, this answer to a question about multithreading in PHP (whose accepted answer is rather outdated).
For the above problem, I plan to refactor my application into a multi-theaded one using pthreads. This requires a significant amount of modifications, but it will (hopefully) result in a much more efficient overall architecture for the application. I will update this answer as I proceed and offer some re-factoring examples for anyone else who would like to do something similar. Others feel free to provide feedback and also update this answer with code examples!
*Footnote about concurrence: Unless one has a multi-core machine, the actions will not actually be performed concurrently. But they will be scheduled to run on the CPU in different small time slices. From the user perspective, they will appear to run concurrently.
I need to run Linux-Apache-PHP-MySQL application (Moodle e-learning platform) for a large number of concurrent users - I am aiming 5000 users. By concurrent I mean that 5000 people should be able to work with the application at the same time. "Work" means not only do database reads but writes as well.
The application is not very typical, since it is doing a lot of inserts/updates on the database, so caching techniques are not helping to much. We are using InnoDB storage engine. In addition application is not written with performance in mind. For instance one Apache thread usually occupies about 30-50 MB of RAM.
I would be greatful for information what hardware is needed to build scalable configuration that is able to handle this kind of load.
We are using right now two HP DLG 380 with two 4 core processors which are able to handle much lower load (typically 300-500 concurrent users). Is it reasonable to invest in this kind of boxes and build cluster using them or is it better to go with some more high-end hardware?
I am particularly curious
how many and how powerful servers are
needed (number of processors/cores, size of RAM)
what network equipment should
be used (what kind of switches,
network cards)
any other hardware,
like particular disc storage
solutions, etc, that are needed
Another thing is how to put together everything, that is what is the most optimal architecture. Clustering with MySQL is rather hard (people are complaining about MySQL Cluster, even here on Stackoverflow).
Once you get past the point where a couple of physical machines aren't giving you the peak load you need, you probably want to start virtualising.
EC2 is probably the most flexible solution at the moment for the LAMP stack. You can set up their VMs as if they were physical machines, cluster them, spin them up as you need more compute-time, switch them off during off-peak times, create machine images so it's easy to system test...
There are various solutions available for load-balancing and automated spin-up.
If you can make your app fit, you can get use out of their non-relational database engine as well. At very high loads, relational databases (and MySQL in particular) don't scale effectively. The peak load of SimpleDB, BigTable and similar non-relational databases can scale almost linearly as you add hardware.
Moving away from a relational database is a huge step though, I can't say I've ever needed to do it myself.
I'm not so sure about hardware, but from a software point-of-view:
With an efficient data layer that will cache objects and collections returned from the database then I'd say a standard master-slave configuration would work fine. Route all writes to a beefy master and all reads to slaves, adding more slaves as required.
Cache data as objects returned from your data-mapper/ORM and not HTML, and use Memcached as your caching layer. If you update an object then write to the db and update in memcached, best use IdentityMap pattern for this. You'll probably need quite a few Memcached instances although you could get away with running these on your web servers.
We could never get MySQL clustering to work properly.
Be careful with the SQL queries you write and you should be fine.
Piotr, have you tried asking this question on moodle.org yet? There are a couple of similar scoped installations whose staff members answer that currently.
Also, depending on what your timeframe for deployment is, you might want to check out the moodle 2.0 line rather than the moodle 1.9 line, it looks like there are a bunch of good fixes for some of the issues with moodle's architecture in that version.
also: memcached rocks for this. php acceleration rocks for this. serverfault is probably the better *exchange site for this question though
Right now I'm running 50 PHP (in CLI mode) individual workers (processes) per machine that are waiting to receive their workload (job). For example, the job of resizing an image. In workload they receive the image (binary data) and the desired size. The worker does it's work and returns the resized image back. Then it waits for more jobs (it loops in a smart way). I'm presuming that I have the same executable, libraries and classes loaded and instantiated 50 times. Am I correct? Because this does not sound very effective.
What I'd like to have now is one process that handles all this work and being able to use all available CPU cores while having everything loaded only once (to be more efficient). I presume a new thread would be started for each job and after it finishes, the thread would stop. More jobs would be accepted if there are less than 50 threads doing the work. If all 50 threads are busy, no additional jobs are accepted.
I am using a lot of libraries (for Memcached, Redis, MogileFS, ...) to have access to all the various components that the system uses and Python is pretty much the only language apart from PHP that has support for all of them.
Can Python do what I want and will it be faster and more efficient that the current PHP solution?
Most probably - yes. But don't assume you have to do multithreading. Have a look at the multiprocessing module. It already has an implementation of a Pool included, which is what you could use. And it basically solves the GIL problem (multithreading can run only 1 "standard python code" at any time - that's a very simplified explanation).
It will still fork a process per job, but in a different way than starting it all over again. All the initialisations done- and libraries loaded before entering the worker process will be inherited in a copy-on-write way. You won't do more initialisations than necessary and you will not waste memory for the same libarary/class if you didn't actually make it different from the pre-pool state.
So yes - looking only at this part, python will be wasting less resources and will use a "nicer" worker-pool model. Whether it will really be faster / less CPU-abusing, is hard to tell without testing, or at least looking at the code. Try it yourself.
Added: If you're worried about memory usage, python may also help you a bit, since it has a "proper" garbage collector, while in php GC is a not a priority and not that good (and for a good reason too).
Linux has shared libraries, so those 50 php processes use mostly the same libraries.
You don't sound like you even have a problem at all.
"this does not sound very effective." is not a problem description, if anything those words are a problem on their own. Writing code needs a real reason, else you're just wasting time and/or money.
Python is a fine language and won't perform worse than php. Python's multiprocessing module will probably help a lot too. But there isn't much to gain if the php implementation is not completly insane. So why even bother spending time on it when everything works? That is usually the goal, not a reason to rewrite ...
If you are on a sane operating system then shared libraries should only be loaded once and shared among all processes using them. Memory for data structures and connection handles will obviously be duplicated, but the overhead of stopping and starting the systems may be greater than keeping things up while idle. If you are using something like gearman it might make sense to let several workers stay up even if idle and then have a persistent monitoring process that will start new workers if all the current workers are busy up until a threshold such as the number of available CPUs. That process could then kill workers in a LIFO manner after they have been idle for some period of time.
I'd like to have your opinion about writing web apps in PHP vs. a long-running process using tools such as Django or Turbogears for Python.
As far as I know:
- In PHP, pages are fetched from the hard-disk every time (although I assume the OS keeps files in RAM for a while after they've been accessed)
- Pages are recompiled into opcode every time (although tools from eg. Zend can keep a compiled version in RAM)
- Fetching pages every time means reading global and session data every time, and re-opening connections to the DB
So, I guess PHP makes sense on a shared server (multiple sites sharing the same host) to run apps with moderate use, while a long-running process offers higher performance with apps that run on a dedicated server and are under heavy use?
Thanks for any feedback.
After you apply memcache, opcode caching, and connection pooling, the only real difference between PHP and other options is that PHP is short-lived, processed based, while other options are, typically, long-lived multithreaded based.
The advantage PHP has is that its dirt simple to write scripts. You don't have to worry about memory management (its always released at the end of the request), and you don't have to worry about concurrency very much.
The major disadvantage, I can see anyways, is that some more advanced (sometimes crazier?) things are harder: pre-computing results, warming caches, reusing existing data, request prioritizing, and asynchronous programming. I'm sure people can think of many more.
Most of the time, though, those disadvantages aren't a big deal. You can scale by adding more machines and using more caching. The average web developer doesn't need to worry about concurrency control or memory management, so taking the minuscule hit from removing them isn't a big deal.
With APC, which is soon to be included by default in PHP compiled bytecode is kept in RAM.
With mod_php, which is the most popular way to use PHP, the PHP interpreter stays in web server's memory.
With APC data store or memcache, you can have persistent objects in RAM instead of for example always creating them all anew by fetching data from DB.
In real life deployment you'd use all of above.
PHP is fine for either use in my opinion, the performance overheads are rarely noticed. It's usually other processes which will delay the program. It's easy to cache PHP programs with something like eAccelerator.
As many others have noted, PHP nor Django are going to be your bottlenecks. Hitting the hard disk for the bytecode on PHP is irrelevant for a heavily trafficked site because caching will take over at that point. The same is true for Django.
Model/View and user experience design will have order of magnitude benefits to performance over the language itself.
PHP is a language like Java etc.
Only your executable is the php binary and not the JVM! You can set another MAX-Runtime for PHP-Scripts without any problems (if your shared hosting provider let you do so).
Where your apps are running shouldn't depend on the kind of the server. It should depend on the ressources used by the application (CPU-Time,RAM) and what is given by your Server/Vserver/Shared Host!
For performance tuning reasons you should have a look at eAccelerator etc.
Apache supports also modules for connection pooling! See mod_dbd.
If you need to scale (like in a cluster) you can use distributed memory caching systems like memcached!