I was trying to setup a multi-threaded socket application, but whenever I ran it I got an error because pcntl_fork() was disabled by default. Is this because it is dangerous or unstable? Should I look for some other way to multithread, or is it just disabled because it is not often used?
pcntl_fork() is not for multithreading, it only... well, forks the current proccess. Be sure to check the documentation for more information on the function.
The best reason I can think of it's disable by default, it's because PHP was never meant for parallel computing, it's merely a scripting language (a very powerful one at that). As Martin suggestted on his answer in a similar question, you're better off running a CRON or another program.
It simply is a function that should not be used in the average shared-hosting apache environment. Forking there would possibly clone a rather big process and besides that the function can be abused to bring down a poorly configured server (fork bomb).
Using it e.g. in a commandline based PHP script is perfectly fine.
Related
Some background:
I am building a server application in php that will need to execute a number of independent tasks on a user request. Theres is a severe requirement on speed for my application so I would like to execute all of those tasks in parallel.
I've looked at several solutions (e.g gearman, rabbitMQ, zeroMQ) and I've decided to go with zeroMQ (fast, good docs, flexible, and doesn't require a broker). This solves the communication/sync problem between the threads for me.
Question:
I would like to initiate the tasks only when the server receives a request (not to have a long running process). So I receive a request -> start parallel computation -> return the result of the computation to the client. One solution for that seems to be pcntl_fork however the docs mention that there ares some issues with using it in a production server env but doesn't really specify what they are?
My other option is to use proc_open, but I like it less because it would require me to serialize the inputs in some way which seems less flexible and fast then forking. Does it have any advantages over pcntl_fork?
Is there another solution (still using php :p)?
Tread carefully, I see several red-flags in your question that lead me to believe you are concerned about things that maybe you don't need to be, and you probably aren't concerned with things you should be.
You say you have a severe requirement for speed - have you validated that normal single threaded PHP is not fast enough? Run any benchmarks, figured out your bottlenecks? If your speed requirement is that great, you might even consider using a different language, for all of PHP's charms it's never going to be the most efficient hammer in the toolbox. Java is a good option for all-out speed, and node.js is a good option if your bottlenecks are IO dependent. My main concern is that, absent more information, this question smells of premature optimization. This may be unfair and you may have omitted those details because it wasn't the heart of your question, but as an outsider I at least wanted to make sure that you think about these things if you haven't already.
You want to avoid long-running processes - why? There's nothing inherently wrong with long-running processes - but it does feel wrong when what you're used to is the pseudo-efficient "on-demand" nature of Apache+mod_php. Be sure you're not trying to avoid something just because you're not used to it.
What you seem to be describing is performing parallel processing from within your PHP web-app - just like any other web page you write, Apache initiates your PHP script, that script forks another process and, rather than performing its actions serially, performs them in parallel, completes and returns to the user at the completion of the page-render. If that's correct, then here is the answer to your original question:
You cannot use pcntl_fork from within a web process, only from the command line. The details of this are on the page you linked to, down in the comments:
It's not a matter of "should not", it's "can not". Even though I have compiled in PCNTL with --enable-pcntl, it turns out that it only compiles in to the CLI version of PHP, not the Apache module. [...] function_exists('pcntl_fork') was returning false even though it compiled correctly. It turns out it returns true just fine from the CLI, and only returns false for HTTP requests. The same is true of ALL of the pcntl_*() functions.
... which means that either you'll have to initiate your forking process as a separate long-running process, or you'll have to start it on demand with proc_open, there is no way to get it to work the way I assume you want it to.
I recently read about http://php.net/pcntl and was woundering how good that functions works and if it would be smart to use multithreading in PHP since it isn't a core function of PHP.
I would want to trigger events that don't require feedback through it like fireing a cronjob execution manually.
All of it is supposed to run in a web app written with Zend Framework
The pcntl package works quite fine - it just uses the according unix functions. The only shortage is that you can't use them if php is invoked from a web server context. i.e. you can use it in shell scripts, but not on web pages - at least not without using a hack like calling a forking script with exec or similar.
[edit]
I just found a page explaining why mod_php cannot fork. Basically it's a security issue.
[/edit]
This is not thread control, this is process control. The library for the threads is pthreads (POSIX threads) and it's not included in PHP, so there are no multi-threading functions in PHP.
As of multiprocessing, you cannot use that in mod_php, as that would be a giant security hole (spawned process would have all the web-server's privileges).
The only possible way to have php code executing in multiple threads is to run php as a module of a threaded web server, which is useless because threads are fully isolated and your code has no control over them. As far as i know, pcntl only manages processes, not threads.
If I needed to do manual crontab executions or the like from PHP, I'd probably use a queue. Have a database table that you append jobs to. Another process, either from a cron or running as a daemon, executes the jobs as they show up.
Another way to do it is to set up a separate script and do an HTTP GET to it. It's not quite threading, but it's one way of shelling to another command in PHP.
For example, if I wanted to run /usr/bin/somescript.sh on demand, I'd have a somescript.php that did a system call. This would be on a virtual host only accessible from localhost.
I'd do a socket call to the webserver and GET the script. The key is to not read on the socket so it doesn't block. If I wanted to check the return value of somescript.php, I'd do it later in my main script to prevent blocking.
If somescript.php takes a long time to execute (longer than the calling script), you'll have to do some magic to stop apache from killing the script when the socket is closed.
Multiplatform PHP Multithreading engine
http://anton.vedeshin.com/articles/lightweight-and-multiplatform-php-multithreading-engine
Examples of multithreading working in PHP (with excerpts from their project pages):
Cron Multi-Threaded.
As of October 25th, 2011, this module has reached "end of life" and is deprecated in favor of projects such as Elysia Cron. This module wasn't completely useless in that a core patch inspired by Cron MT was committed to D7.
Boost.
... provides static page caching for Drupal enabling a very significant performance and scalability boost for sites that receive mostly anonymous traffic. For shared hosting this is your best option in terms of improving performance. On dedicated servers, you may want to consider Varnish instead.
I'm working on an embedded system that is programmed with PHP 4.4.9 - unfortunately without the PCNTL extension.
I need to create a script that runs in the background as a daemon. You'd usually do this using fork(), or in the PHP case, pcntl_fork() - but this function is not available. A shell is also missing, so I can't use the standard tools.
So, what other ways are there to cleanly start a process in the background?
As kingCrunch says, you really should upgrade.
Firstly, there's more to making a daemon than just calling pcntl_fork(). You might want to read the Unix programming FAQ and the Unix socket FAQ.
Next, you've not mentioned how you intend to solve the problem of concurrency - while forking is one solution to this it is not the only reason for using fork() in a daemon.
So you've really got 2 problems to solve, first how you daemonize the program then how you handle concurrency.
Note that one approach to the latter which obviates the former is to run the server from [x]inetd.
Another approach to solving the concurrency problem is to run a single threaded server and use socket_select (or stream_select) to multiplex the connections - but I'm not sure how well that is supported in PHP 4 - there is a good example here.
A simple solution would be to write a simple wrapper program in C using daemon() to bootstrap the program. Or you could start it up directly from inittab. Or for a solution with complex management facilities have a look at DJB's daemontools
I have a C# application that needs to call a PHP script, and get the output, in the fastest possible way. The options I explored:
Executing the script with PHP CLI (Pro: Easy / Cons: No Opcode Cache / Precompilation ]
Compiling the PHP (Phalanger, Hiphop, etc.) [Pro: No Webserver / Con: Compatibility ]
Using an embedded webserver (AppWeb, Cherokee, Lighttpd) [Pro: Simple / Cons: Deployment ]
Are there any other options left?
EDIT: The best possible option would be to make use of the build-in FastCGI server of PHP, by running php-cgi.exe -b 127.0.0.1. But there seems no (C#) code to talk to a server available. While there are so many server-side libraries (like FCGIDotNet and SharpCGI), they all implement the server-side of the protocol.
One other option could be to run the PHP CLI script as a daemon (good blog post on this here).
If the script has a particularly long startup/cleanup, then running it as a daemon would mean that you only do this once.
The downside is that you'd need to write a way of communicating with that daemon, to get the data from C# to it. You'd also need to keep an eye out on its memory usage over time.
The best method is always going to be specific to your script though.
As you may know, PHP originally stood for "Personal Home Page", it is now said to stand for "PHP: Hypertext Preprocessor", which literally states it's usage, and basically is not something that a developer would embed into his application just to have a scripting option.
Unless you have some really specific piece of software and strong arguments, I'd suggest you to stick to Lua, or similar scripting libraries which are easy to embed and maintain. Otherwise, use it the way as everyone is using it, CLI. Or else be prepared to face the consequences.
I recently read about http://php.net/pcntl and was woundering how good that functions works and if it would be smart to use multithreading in PHP since it isn't a core function of PHP.
I would want to trigger events that don't require feedback through it like fireing a cronjob execution manually.
All of it is supposed to run in a web app written with Zend Framework
The pcntl package works quite fine - it just uses the according unix functions. The only shortage is that you can't use them if php is invoked from a web server context. i.e. you can use it in shell scripts, but not on web pages - at least not without using a hack like calling a forking script with exec or similar.
[edit]
I just found a page explaining why mod_php cannot fork. Basically it's a security issue.
[/edit]
This is not thread control, this is process control. The library for the threads is pthreads (POSIX threads) and it's not included in PHP, so there are no multi-threading functions in PHP.
As of multiprocessing, you cannot use that in mod_php, as that would be a giant security hole (spawned process would have all the web-server's privileges).
The only possible way to have php code executing in multiple threads is to run php as a module of a threaded web server, which is useless because threads are fully isolated and your code has no control over them. As far as i know, pcntl only manages processes, not threads.
If I needed to do manual crontab executions or the like from PHP, I'd probably use a queue. Have a database table that you append jobs to. Another process, either from a cron or running as a daemon, executes the jobs as they show up.
Another way to do it is to set up a separate script and do an HTTP GET to it. It's not quite threading, but it's one way of shelling to another command in PHP.
For example, if I wanted to run /usr/bin/somescript.sh on demand, I'd have a somescript.php that did a system call. This would be on a virtual host only accessible from localhost.
I'd do a socket call to the webserver and GET the script. The key is to not read on the socket so it doesn't block. If I wanted to check the return value of somescript.php, I'd do it later in my main script to prevent blocking.
If somescript.php takes a long time to execute (longer than the calling script), you'll have to do some magic to stop apache from killing the script when the socket is closed.
Multiplatform PHP Multithreading engine
http://anton.vedeshin.com/articles/lightweight-and-multiplatform-php-multithreading-engine
Examples of multithreading working in PHP (with excerpts from their project pages):
Cron Multi-Threaded.
As of October 25th, 2011, this module has reached "end of life" and is deprecated in favor of projects such as Elysia Cron. This module wasn't completely useless in that a core patch inspired by Cron MT was committed to D7.
Boost.
... provides static page caching for Drupal enabling a very significant performance and scalability boost for sites that receive mostly anonymous traffic. For shared hosting this is your best option in terms of improving performance. On dedicated servers, you may want to consider Varnish instead.