Is there any way to work with threads in PHP via Apache (using a browser) on Linux/Windows?
The mere fact that it is possible to do something, says nothing whatever about whether it is appropriate.
The fact is that the threading model used by pthreads+PHP is 1:1, that is to say, one user thread to one kernel thread.
Deploying this model at the front end of a web application inside Apache doesn't really make sense: if a front-end controller instructs the hardware to create even a small number of threads, for example 8, and 100 clients request the controller at the same time, you will be asking your hardware to execute 800 threads.
pthreads can be deployed inside Apache, but it shouldn't be. What you should do is isolate those parts of your application that require what threading provides, and communicate with the isolated multi-threading subsystem via some sane form of RPC.
I wrote pthreads, please listen.
Highly discouraged.
The pcntl_fork function, if allowed at all in your setup, will fork the Apache worker itself, rather than the script, and most likely you won't be able to claim the child process after it's finished.
This leads to many zombie Apache processes.
I recommend using a background worker pool, properly running as a daemon/service or at least properly detached from a launching console (using screen for example), and your synchronous PHP/Apache script would push job requests to this pool, using a socket.
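As a rough sketch of the "push job requests to this pool, using a socket" part: the helper below is hypothetical, and the address, the JSON encoding and the one-job-per-line framing are all my assumptions, not something a particular worker pool requires.

```php
<?php
// Hypothetical helper: hand one job to a worker daemon over a TCP socket.
// The JSON payload and the newline framing are assumptions for illustration.
function pushJob(string $address, array $job, float $timeout = 1.0): bool
{
    $errno = 0;
    $errstr = '';
    $conn = @stream_socket_client($address, $errno, $errstr, $timeout);
    if ($conn === false) {
        return false; // pool unreachable: the caller decides how to degrade
    }
    $payload = json_encode($job) . "\n"; // one JSON document per line
    $written = fwrite($conn, $payload);
    fclose($conn);
    return $written === strlen($payload);
}
```

The web script only writes the job and returns; the daemon on the other end does the slow work, so no Apache worker is forked or left waiting.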
Does it help?
[Edit] I would have offered the above as a comment, but I did not have enough reputation to do so (I find it weird btw, to not be able to comment because you're too junior).
[Edit2] pthread seems a valid solution! (which I have not tried so I can't advise)
The idea of "thread safe" can be very broad. However, PHP is at the very furthest end of the spectrum. Yes, PHP is thread safe, but the driving motivation and design goals focus on keeping the PHP VM safe to operate in threaded server environments, not on providing thread-safe features to PHP userspace. A huge number of sites use PHP, one request at a time. Many of these sites change very slowly - for example, statistics say more sites still serve pages without JavaScript than use Node.js on the server. A lot of us geeks like the idea of threads in PHP, but the consumers don't really care. Even with multiple packages that have proved it's entirely possible, it will likely be a long time before anything but very experimental threads exist in PHP.
Each of these examples (pthreads, pht, and now parallel) worked awesome and actually did what they were designed to do - as long as you use very vanilla PHP. Once you start using more dynamic features, and practically any other extensions, you find that PHP has a long way to go.
Some background:
I am building a server application in PHP that will need to execute a number of independent tasks on a user request. There is a severe requirement on speed for my application, so I would like to execute all of those tasks in parallel.
I've looked at several solutions (e.g. Gearman, RabbitMQ, ZeroMQ) and I've decided to go with ZeroMQ (fast, good docs, flexible, and doesn't require a broker). This solves the communication/sync problem between the threads for me.
Question:
I would like to initiate the tasks only when the server receives a request (not to have a long running process). So I receive a request -> start parallel computation -> return the result of the computation to the client. One solution for that seems to be pcntl_fork; however, the docs mention that there are some issues with using it in a production server env but don't really specify what they are?
My other option is to use proc_open, but I like it less because it would require me to serialize the inputs in some way, which seems less flexible and slower than forking. Does it have any advantages over pcntl_fork?
Is there another solution (still using php :p)?
Tread carefully, I see several red-flags in your question that lead me to believe you are concerned about things that maybe you don't need to be, and you probably aren't concerned with things you should be.
You say you have a severe requirement for speed - have you validated that normal single threaded PHP is not fast enough? Run any benchmarks, figured out your bottlenecks? If your speed requirement is that great, you might even consider using a different language, for all of PHP's charms it's never going to be the most efficient hammer in the toolbox. Java is a good option for all-out speed, and node.js is a good option if your bottlenecks are IO dependent. My main concern is that, absent more information, this question smells of premature optimization. This may be unfair and you may have omitted those details because it wasn't the heart of your question, but as an outsider I at least wanted to make sure that you think about these things if you haven't already.
You want to avoid long-running processes - why? There's nothing inherently wrong with long-running processes - but it does feel wrong when what you're used to is the pseudo-efficient "on-demand" nature of Apache+mod_php. Be sure you're not trying to avoid something just because you're not used to it.
What you seem to be describing is performing parallel processing from within your PHP web-app - just like any other web page you write, Apache initiates your PHP script, that script forks another process and, rather than performing its actions serially, performs them in parallel, completes and returns to the user at the completion of the page-render. If that's correct, then here is the answer to your original question:
You cannot use pcntl_fork from within a web process, only from the command line. The details of this are on the page you linked to, down in the comments:
It's not a matter of "should not", it's "can not". Even though I have compiled in PCNTL with --enable-pcntl, it turns out that it only compiles in to the CLI version of PHP, not the Apache module. [...] function_exists('pcntl_fork') was returning false even though it compiled correctly. It turns out it returns true just fine from the CLI, and only returns false for HTTP requests. The same is true of ALL of the pcntl_*() functions.
... which means that either you'll have to initiate your forking process as a separate long-running process, or you'll have to start it on demand with proc_open, there is no way to get it to work the way I assume you want it to.
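For the "start it on demand with proc_open" route, here is a minimal sketch. runParallel() is a name I made up, and passing each task as a code string to `php -r` is only for illustration; in practice you would more likely run a script file and hand it serialized arguments.

```php
<?php
// Run several PHP snippets in parallel child processes and collect their output.
// runParallel() is an illustrative helper, not part of PHP.
function runParallel(array $snippets): array
{
    $procs = [];
    foreach ($snippets as $key => $code) {
        $spec = [1 => ['pipe', 'w']]; // capture the child's stdout
        // The array command form (PHP 7.4+) bypasses the shell, so no quoting is needed.
        $proc = proc_open([PHP_BINARY, '-r', $code], $spec, $pipes);
        if (is_resource($proc)) {
            $procs[$key] = [$proc, $pipes[1]];
        }
    }
    $results = [];
    foreach ($procs as $key => [$proc, $stdout]) {
        $results[$key] = stream_get_contents($stdout); // blocks until this child is done
        fclose($stdout);
        proc_close($proc);
    }
    return $results;
}
```

All children are started before any output is read, so they run concurrently and the total wait is roughly that of the slowest task rather than the sum.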
I'm starting to consider websockets as a solution to replace long polling in a new build PHP app I am commissioning.
I have a few questions which I wonder if people could help me out with.
Can a Nodejs server call PHP and if it did wouldn't it suffer the same shortcomings as just going through Apache in terms of the connections? We all know Nodejs is non-blocking and Apache etc isn't, but if Nodejs is just making a call to a PHP server in its own procedure would that not bottleneck in a similar way?
Are PHP and websockets a good match?
Are there any good js libraries besides socketio which apparently only works with Nodejs?
Has anyone found a good tutorial which uses websockets and a PHP backend maybe using something like that Ratchet PHP library which might help me get on my way?
Thoughts would be muchly appreciated.
Please excuse my paraphrasing of your questions.
1: Can Node.js call PHP, and wouldn't that have the same shortcomings as Apache?
Calling a run-once PHP script will have the same general shortcomings as calling a web page, except that you are removing an extra layer of processing. Apache or any web server itself is such a thin layer that, while you'll save some time, the savings will be insignificant.
If PHP is more effective at gathering data for your clients than Node.js, for whatever reason, then it might be wise to include PHP in your application.
2: Are PHP and WebSockets a good match?
Traditional PHP scripts are normally intended to be run once per request. The vast majority of PHP developers are unfamiliar with event driven development, and PHP itself does not (yet) have support for asynchronous processing.
PHP is a fast, mature scripting language that is only getting faster, even with all of its many warts and shortcomings. (Some say that its weak typing is a shortcoming. Others say that it's a shortcoming that its typing isn't weak enough.)
That said, the minimum that any language needs in order to implement WebSockets is the ability to open up a basic TCP port and listen for requests. For PHP, it is implemented as a thin wrapper around the C sockets library, and there are additional extensions and frameworks available that can also change the feel of working in TCP sockets with PHP.
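To give a feel for how little is needed on top of a raw TCP socket, this is the one computation the opening handshake requires, as defined in RFC 6455. The function names are mine; the GUID is the fixed value from the RFC.

```php
<?php
// Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key (RFC 6455).
function websocketAcceptKey(string $clientKey): string
{
    $guid = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11'; // fixed GUID from RFC 6455
    return base64_encode(sha1($clientKey . $guid, true)); // raw SHA-1, then base64
}

// Build the HTTP 101 response that completes the handshake.
function websocketHandshakeResponse(string $clientKey): string
{
    return "HTTP/1.1 101 Switching Protocols\r\n"
        . "Upgrade: websocket\r\n"
        . "Connection: Upgrade\r\n"
        . 'Sec-WebSocket-Accept: ' . websocketAcceptKey($clientKey) . "\r\n\r\n";
}
```

After this response is written to the socket, everything else is frame encoding and decoding, which is the part the existing servers and frameworks take off your hands.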
PHP's garbage collector has also matured. Memory leaks come either from gross disregard for the memory space (I'm looking at you, Zend Framework) or from intentional sabotage of the garbage collection system by developers who think they're clever or want to prove how easy it is to defeat the GC. (Spoiler: It's easy in every language, if you know the details!)
It is quite possible and very easy to set up a daemon (long running background process) in PHP. It's even possible to make it well behaved enough to gracefully restart and hand its connections off to a new version of the same script, or even the same script on the same server running different versions of PHP, though this is treading out of scope just a tiny little bit.
As for whether it's a good match, that is completely up to the developer. Are you willing, able, and happy to work with PHP to write a WebSockets server, or to use one of the existing servers? Yes? Then you're a good match for PHP and WebSockets.
3: JS Libraries for WebSockets
I honestly haven't researched them.
4: Tutorials for using PHP and Websockets
I'm personally fond of this tutorial: http://www.phpbuilder.com/articles/application-architecture/optimization/creating-real-time-applications-with-php-and-websockets.html
Although I have it on good authority that the specifics of that tutorial will soon be obsolete for that specific WebSockets server. (There will still be an actively maintained legacy branch for that server, though.)
In case of link rot:
Using the PHP-Websockets server (available on Github, will be homed soon), extend the base WebSocketServer abstract class and implement the abstract methods process(), connected(), and closed().
There's much better information at the link above, though, so follow it as long as the link exists.
It would hit the same bottleneck if you go through Apache. This can be remedied by using a different web server, such as lighttpd or nginx. You won't even need Node at all.
PHP does not have decent shared memory, making the biggest advantages of WebSockets irrelevant. It should be decent enough if you don't want interaction between users, but even then I would have to frown upon the usage of PHP. PHP is great for a lot of things, but real-time communication is not one of them.
You might want to look at https://github.com/einaros/ws.
PHP is not a good back-end. Anything with an execution model that isn't run-and-forget in its own sandbox, such as Node, .NET, C/C++ and Java, is a good match. PHP is suited for short running executions, such as actual web sites and even web services -- but not real time connections.
I have a PHP script that must run 30 parallel times each with a different argument. What is the best way to do this so that each script can have as much even exposure to the processor as possible?
Problem description
As other users have said (and I agree), you should give a little more explanation, and maybe code samples. For example, should these tasks run forever, or just once when the PHP script is called?
Message Queue
First off, I think you should avoid running so many tasks at once if possible, and instead schedule them with a message queue such as beanstalkd, which is gentler on the machine.
PHP solution
I don't think PHP is the right tool for your problem because it has no thread model. Threads are lightweight, while creating a new process is heavy. You could do it the way stroncium explains. In my opinion, running this code on a shared host will not be appreciated, because if all users ran long-running processes they would overload the server.
Quote from Nettuts
There's no better resource than PHP's creator for knowing what PHP is capable of. Rasmus Lerdorf created PHP in 1995, and since then the language has spread like wildfire through the developer community, changing the face of the Internet. However, Rasmus didn't create PHP with that intent. PHP was created out of a need to solve web development problems.
However, you can't use PHP for everything. Lerdorf is the first to admit that PHP is really just a tool in your toolbox, and that even PHP has limitations.
Better language
Like I said previously I don't think PHP is the right tool.
Some languages which I think could solve the problem better:
Java
Python
C
Of course, many more languages that support a thread model are the right tool for the job, but PHP wasn't originally designed for tasks like this. Even Rasmus, the creator of PHP, confirms this. You can read about it in this list from Nettuts, which I think has some pretty good points.
Google app engine
Lastly, I would advise you to have a look at the Task Queue API from Google App Engine, because this is also a really good option ;). I might even consider it the best option. You get a free quota, and the costs are fair if you exceed the quota. The task queue uses webhooks, so the hooks could be coded in PHP.
PHP itself has no thread support, but you can run a few copies of your script simultaneously by using popen() or proc_open().
Sometimes multi-cURL is used for this purpose (when popen and the like are restricted).
I don't think it's CPU affinity that you have to worry about (so much); it's how I/O bound each process is bound (pardon the pun) to become.
If using a UNIX like operating system, you can try using the nice command to adjust for processes that you predict will be doing more disk / network / database access, but I don't think you'll see any significant speed up.
If all processes are going to handle the same amount of I/O, you are probably better off just letting the kernel's scheduler do its job.
A little more information regarding what your jobs are actually accomplishing would be extremely helpful.
If you run it from the CLI you can fork 29-30 child processes and run the code there. You can have one main process with open sockets to each child, or link them serially if you want to. You'd mostly have to hope the kernel will balance the processes if they have the same priority.
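A minimal sketch of that fork-per-argument idea, assuming the pcntl extension is loaded (CLI only). forkEach() is a hypothetical helper; results come back only as exit codes here, so anything richer would need the sockets mentioned above, and the arguments must be usable as array keys.

```php
<?php
// Fork one child per argument and wait for all of them (CLI only, needs ext-pcntl).
// forkEach() is an illustrative helper; the worker returns an exit code (0-255).
function forkEach(array $args, callable $worker): array
{
    if (!function_exists('pcntl_fork')) {
        throw new RuntimeException('pcntl is not available in this SAPI');
    }
    $pids = [];
    foreach ($args as $arg) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('fork failed');
        }
        if ($pid === 0) {
            exit((int) $worker($arg)); // child: do the work, then leave
        }
        $pids[$pid] = $arg; // parent: remember which child got which argument
    }
    $exitCodes = [];
    foreach ($pids as $pid => $arg) {
        pcntl_waitpid($pid, $status); // reap the child so it does not linger as a zombie
        $exitCodes[$arg] = pcntl_wexitstatus($status);
    }
    return $exitCodes;
}
```

The parent does nothing about scheduling; every child has the same priority and the kernel decides who runs when.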
Given the simplicity of the question, I suggest you look for the simplest answer. Off the top, I'd say you might consider using one instance looping through 30 arguments.
I'm developing a web app for an Apache shared hosting server. I have already written some code in Perl but I recently found out, to my surprise, the shared hosting provider does not provide mod_perl or a way to install it.
I have been a bit worried that running a Perl web app through CGI without mod_perl would make it very slow? Should I switch all of my code to PHP instead, would that be faster?
The reason I chose Perl in the first place is, I'm very familiar with Perl more than PHP. Also I wanted to be able to use my Perl libraries outside the realm of web development.
So if any of you are experienced with Apache web development, can you shed some light as to which direction should I take.
For the sake of this question, let's say the web application will get 500+ hits a day.
Which would be faster PHP or Perl without mod_perl?
Thanks in advance for the help.
At only 500 hits a day, you could write your code in just about anything and not have to worry about slowdowns. 500 hits a day evens out to about 1 page every 3 minutes (1440 minutes / 500 hits is roughly 2.9 minutes per hit). Even assuming an uneven distribution of hits, you shouldn't really worry about this with such small traffic numbers.
PHP would be faster.
However, with only 500 hits per day, using cgi would not be a problem. Not even with 500 hits an hour.
Much depends on your architecture. Modern Perl frameworks aren't well suited for use as CGI (long start-up times). If you use CGI, Catalyst probably is a bad idea. That said, using classical architecture it should be quite manageable.
Unless your shared host is running PHP as a CGI application (not mod_php or FastCGI), PHP is almost[1] always going to be faster. While Perl, running as a CGI, could probably handle your 500 hits a day, an application/page developed with CGI is going to be sluggish.
CGI works by spawning a new process to run your program for each request. Both mod_php and FastCGI applications mitigate this by spawning a set number of processes and then using these to run your application. In other words, a new process isn't being spawned for each request. (This is an oversimplified explanation, please don't use it in a CS term paper. See the mod_php and FastCGI docs for more info.)
[1] You could come up with pathological examples where it wouldn't be, but then you'd be the kind of person to come up with pathological examples of things, and no one wants that.
Speed shouldn't be your concern. Both languages are suitable for web applications.
For the volume of traffic you're looking at, Perl with vanilla CGI shouldn't be an issue, although I would second the earlier recommendations to check out FastCGI as another option which your hosting service may provide.
Or another option would be to look for a different hosting company...
Expanding on what Alan Storm said, you might be able to use Perl with FCGI instead.
FCGI works by having a sort of stand-alone server, a daemon if you like, that connects with your web server via FCGI protocol and delegates/dispatches requests.
This is faster than normal CGI, as this emulates a sort of "servlet" model, the application is persistent, and there is no need for a new initialization on every call like there is with normal CGI.
I have not yet learned how to do this myself, but I believe Catalyst has this option, so it's just a matter of learning how to replicate this.
FastCGI/FCGI should be available on drastically more hosts than plain old mod_perl, as FCGI applications are not web-server specific, and some web servers implement PHP via a fcgi utility.
And I've experimented with FCGI webserving a little, and preliminary tests say it can handle at least 500 req/s , far faster than the above concerns of 500/day or 500/hour.
It's possible to hack fastcgi support into a hosting account that doesn't support it. I compiled the fastcgi library with the install prefix set to the same thing as the home directory on the hosting account. Then I synced it up and set up catalyst to use the small cgi-fcgi bridge. It worked well. Nice and fast, because the cgi bridge is just a tiny little executable. The catalyst process persisted in the background just fine.
The answer in everyone's mind is: who cares.
500 requests per day is nothing.
Just use what's fastest to implement / maintain and move on.
For lighter web frameworks that will work using CGI then have a look at....
Squatting
CGI::Application
CGI::Lazy
It depends mostly on how complex your code is and how it's put together; if you run it as CGI, perl will compile your script and modules on each invocation, and will have to reconnect to your database for each request. If your code is complex enough, this may take a few seconds per pageview, which may hamper user experience.
If your codebase and the modules it uses aren't huge, though, there should be no problem at all.
You can do a perl -c on your code to get a feel for how long perl startup and your compilation time is.
I found this PECL package called threads, but there is not a release yet. And nothing is coming up on the PHP website.
From the PHP manual for the pthreads extension:
pthreads is an Object Orientated API that allows user-land multi-threading in PHP. It includes all the tools you need to create multi-threaded applications targeted at the Web or the Console. PHP applications can create, read, write, execute and synchronize with Threads, Workers and Stackables.
As unbelievable as this sounds, it's entirely true. Today, PHP can multi-thread for those wishing to try it.
Since the first release of PHP4, on 22 May 2000, PHP has shipped with a thread safe architecture - a way for it to execute multiple instances of its interpreter in separate threads in multi-threaded SAPI ( Server API ) environments. Over the last 13 years, the design of this architecture has been maintained and advanced: It has been in production use on the world's largest websites ever since.
Threading in user land was never a concern for the PHP team, and it remains as such today. You should understand that in the world where PHP does its business, there's already a defined method of scaling - add hardware. Over the many years PHP has existed, hardware has got cheaper and cheaper and so this became less and less of a concern for the PHP team. While it was getting cheaper, it also got much more powerful; today, our mobile phones and tablets have dual and quad core architectures and plenty of RAM to go with it, our desktops and servers commonly have 8 or 16 cores, 16 and 32 gigabytes of RAM, though we may not always be able to have two within budget and having two desktops is rarely useful for most of us.
Additionally, PHP was written for the non-programmer; it is many hobbyists' native tongue. The reason PHP is so easily adopted is because it is an easy language to learn and write. The reason PHP is so reliable today is because of the vast amount of work that goes into its design, and every single decision made by the PHP group. Its reliability and sheer greatness keep it in the spotlight, after all these years, where its rivals have fallen to time or pressure.
Multi-threaded programming is not easy for most, even with the most coherent and reliable API, there are different things to think about, and many misconceptions. The PHP group do not wish for user land multi-threading to be a core feature, it has never been given serious attention - and rightly so. PHP should not be complex, for everyone.
All things considered, there are still benefits to be had from allowing PHP to utilize its production ready and tested features to allow a means of making the most out of what we have, when adding more isn't always an option, and for a lot of tasks is never really needed.
pthreads achieves, for those wishing to explore it, an API that does allow a user to multi-thread PHP applications. Its API is very much a work in progress, and designated a beta level of stability and completeness.
It is common knowledge that some of the libraries PHP uses are not thread safe, it should be clear to the programmer that pthreads cannot change this, and does not attempt to try. However, any library that is thread safe is useable, as in any other thread safe setup of the interpreter.
pthreads utilizes Posix Threads ( even in Windows ), what the programmer creates are real threads of execution, but for those threads to be useful, they must be aware of PHP - able to execute user code, share variables and allow a useful means of communication ( synchronization ). So every thread is created with an instance of the interpreter, but by design, its interpreter is isolated from all other instances of the interpreter - just like multi-threaded Server API environments. pthreads attempts to bridge the gap in a sane and safe way. Many of the concerns of the programmer of threads in C just aren't there for the programmer of pthreads, by design, pthreads is copy on read and copy on write ( RAM is cheap ), so no two instances ever manipulate the same physical data, but they can both affect data in another thread. The fact that PHP may use thread unsafe features in its core programming is entirely irrelevant, user threads, and their operations, are completely safe.
Why copy on read and copy on write:
public function run() {
...
(1) $this->data = $data;
...
(2) $this->other = someOperation($this->data);
...
}
(3) echo preg_match($pattern, $replace, $thread->data);
(1) While a read, and write lock are held on the pthreads object data store, data is copied from its original location in memory to the object store. pthreads does not adjust the refcount of the variable, Zend is able to free the original data if there are no further references to it.
(2) The argument to someOperation references the object store, the original data stored, which is itself a copy of the result of (1), is copied again for the engine into a zval container, while this occurs a read lock is held on the object store, the lock is released and the engine can execute the function. When the zval is created, it has a refcount of 0, enabling the engine to free the copy on completion of the operation, because no other references to it exist.
(3) The last argument to preg_match references the data store, a read lock is obtained, the data set in (1) is copied to a zval, again with a refcount of 0. The lock is released. The call to preg_match operates on a copy of data, that is itself a copy of the original data.
Things to know:
The object store's hash table where data is stored, thread safe, is based on the TsHashTable shipped with PHP, by Zend.
The object store has a read and write lock, an additional access lock is provided for the TsHashTable such that if required ( and it is, var_dump/print_r, direct access to properties as the PHP engine wants to reference them ) pthreads can manipulate the TsHashTable outside of the defined API.
The locks are only held while the copying operations occur, when the copies have been made the locks are released, in a sensible order.
This means:
When a write occurs, not only are a read and write lock held, but an additional access lock. The table itself is locked down, there is no possible way another context can lock, read, write or affect it.
When a read occurs, not only is the read lock held, but the additional access lock too, again the table is locked down.
No two contexts can physically nor concurrently access the same data from the object store, but writes made in any context with a reference will affect the data read in any context with a reference.
This is shared nothing architecture and the only way to exist is co-exist. Those a bit savvy will see that there's a lot of copying going on here, and they will wonder if that is a good thing. Quite a lot of copying goes on within a dynamic runtime, that's the dynamics of a dynamic language. pthreads is implemented at the level of the object, because good control can be gained over one object, but methods - the code the programmer executes - have another context, free of locking and copies - the local method scope. The object scope in the case of a pthreads object should be treated as a way to share data among contexts, that is its purpose. With this in mind you can adopt techniques to avoid locking the object store unless it's necessary, such as passing local scope variables to other methods in a threaded object rather than having them copy from the object store upon execution.
Most of the libraries and extensions available for PHP are thin wrappers around 3rd parties, PHP core functionality to a degree is the same thing. pthreads is not a thin wrapper around Posix Threads; it is a threading API based on Posix Threads. There is no point in implementing Threads in PHP that its users do not understand or cannot use. There's no reason that a person with no knowledge of what a mutex is or does should not be able to take advantage of all that they have, both in terms of skill, and resources. An object functions like an object, but wherever two contexts would otherwise collide, pthreads provides stability and safety.
Anyone who has worked in Java will see the similarities between a pthreads object and threading in Java, those same people will have no doubt seen an error called ConcurrentModificationException - as it sounds, an error raised by the Java runtime if two threads write the same physical data concurrently. I understand why it exists, but it baffles me that with resources as cheap as they are, coupled with the fact the runtime is able to detect the concurrency at the exact and only time that safety could be achieved for the user, that it chooses to throw a possibly fatal error at runtime rather than manage the execution and access to the data.
No such stupid errors will be emitted by pthreads, the API is written to make threading as stable, and compatible as is possible, I believe.
Multi-threading isn't like using a new database, close attention should be paid to every word in the manual and examples shipped with pthreads.
Lastly, from the PHP manual:
pthreads was, and is, an experiment with pretty good results. Any of its limitations or features may change at any time; that is the nature of experimentation. It's limitations - often imposed by the implementation - exist for good reason; the aim of pthreads is to provide a useable solution to multi-tasking in PHP at any level. In the environment which pthreads executes, some restrictions and limitations are necessary in order to provide a stable environment.
Here is an example of what Wilco suggested:
$cmd = 'nohup nice -n 10 /usr/bin/php -c /path/to/php.ini -f /path/to/php/file.php action=generate var1_id=23 var2_id=35 gen_id=535 > /path/to/log/file.log & echo $!';
$pid = shell_exec($cmd);
Basically this executes the PHP script at the command line, but immediately returns the PID and then runs in the background. (The echo $! ensures nothing else is returned other than the PID.) This allows your PHP script to continue or quit if you want. When I have used this, I have redirected the user to another page, where every 5 to 60 seconds an AJAX call is made to check if the report is still running. (I have a table to store the gen_id and the user it's related to.) The check script runs the following:
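// The PID from shell_exec() carries a trailing newline; the cast strips it and
// also keeps anything unexpected out of the shell command.
exec('ps -p ' . (int) $pid, $processState);
if (count($processState) < 2) {
    // Only the header row (or nothing) came back, therefore the report is complete.
}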
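If the posix extension is available, the same check can be done without shelling out to ps. This is only a sketch: isProcessRunning() is a name I made up, and like the ps version it cannot tell you whether the PID has since been reused by an unrelated process.

```php
<?php
// Report whether a process with the given PID still exists.
// Uses ext-posix when available and falls back to `ps -p` otherwise.
function isProcessRunning(int $pid): bool
{
    if ($pid <= 0) {
        return false;
    }
    if (function_exists('posix_kill')) {
        // Signal 0 performs the existence check without delivering a signal.
        // Note: this also returns false for processes owned by another user.
        return posix_kill($pid, 0);
    }
    exec('ps -p ' . $pid, $rows);
    return count($rows) >= 2; // header row plus one process row
}
```

The AJAX endpoint would look up the stored PID for the gen_id and return the result of this call.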
There is a short post on this technique here: http://nsaunders.wordpress.com/2007/01/12/running-a-background-process-in-php/
There is nothing available that I'm aware of. The next best thing would be to simply have one script execute another via CLI, but that's a bit rudimentary. Depending on what you are trying to do and how complex it is, this may or may not be an option.
In short: yes, there is multithreading in php but you should use multiprocessing instead.
Background info: threads vs. processes
There is always a bit of confusion about the distinction between threads and processes, so I'll briefly describe both:
A thread is a sequence of commands that the CPU will process. The only data it owns is its execution state: essentially a program counter, plus registers and a stack. Each CPU core will only process one thread at a time but can switch between the execution of different ones via scheduling.
A process is a set of shared resources. That means it consists of a part of memory, variables, object instances, file handles, mutexes, database connections and so on. Each process also contains one or more threads. All threads of the same process share its resources, so you may use a variable in one thread that you created in another. If those threads are parts of two different processes, then they cannot access each other's resources directly. In this case you need inter-process communication through e.g. pipes, files, sockets...
Multiprocessing
You can achieve parallel computing by creating new processes (that also contain a new thread) with PHP. If your threads do not need much communication or synchronization, this is your choice, since the processes are isolated and cannot interfere with each other's work. Even if one crashes, that doesn't concern the others. If you do need much communication, you should read on at "multithreading" or - sadly - consider using another programming language, because inter-process communication and synchronization introduce a lot of complexity.
In php you have two ways to create a new process:
let the OS do it for you: you can tell your operating system to create a new process and run a new (or the same) PHP script in it.
for Linux you can use the following or consider Darryl Hein's answer:
$cmd = 'nice php script.php 2>&1 & echo $!';
pclose(popen($cmd, 'r'));
for windows you may use this:
$cmd = 'start "processname" /MIN /belownormal cmd /c "script.php 2>&1"';
pclose(popen($cmd, 'r'));
do it yourself with a fork: PHP also lets you fork through the function pcntl_fork(). A good tutorial on how to do this can be found here, but I strongly recommend not using it, since fork is a crime against humanity and especially against OOP.
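The "let the OS do it" approach above can be wrapped in a small helper. This is only a sketch: the function names buildBackgroundCommand() and launchInBackground() and the window title "php-worker" are made up for illustration, and it assumes a php binary is on the PATH. Unlike the one-liner above, it sends the child's output to /dev/null so the background script never writes into a pipe that has already been closed.

```php
<?php
// Hypothetical helper: builds the shell command that starts a PHP script
// in the background. Name and parameters are illustrative assumptions.
function buildBackgroundCommand(string $script, string $osFamily = PHP_OS_FAMILY): string
{
    $arg = escapeshellarg($script);
    if ($osFamily === 'Windows') {
        // "start" opens a new minimized window and returns immediately.
        return 'start "php-worker" /MIN /BELOWNORMAL php ' . $arg;
    }
    // Discard all output, run at lower priority, and print the child's PID.
    return 'nice php ' . $arg . ' > /dev/null 2>&1 & echo $!';
}

// Hypothetical helper: launches the script and returns its PID on
// Unix-like systems, or null when no PID could be read (e.g. on Windows).
function launchInBackground(string $script): ?int
{
    $handle = popen(buildBackgroundCommand($script), 'r');
    if ($handle === false) {
        return null;
    }
    $pid = trim((string) fgets($handle));
    pclose($handle);
    return ctype_digit($pid) ? (int) $pid : null;
}
```

Calling launchInBackground('script.php') returns right away; the PID can be kept to check on the process later, for example with ps.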
Multithreading
With multithreading all your threads share their resources, so you can easily communicate between them and synchronize them without a lot of overhead. On the other hand, you have to know what you are doing, since race conditions and deadlocks are easy to produce but very difficult to debug.
Standard php does not provide any multithreading but there is an (experimental) extension that actually does - pthreads. Its api documentation even made it into php.net.
With it you can do some stuff as you can in real programming languages :-) like this:
class MyThread extends Thread {
    public function run(){
        //do something time consuming
    }
}
$t = new MyThread();
if($t->start()){
    while($t->isRunning()){
        echo ".";
        usleep(100);
    }
    $t->join();
}
For Linux there is an installation guide right here on Stack Overflow.
For windows there is one now:
First you need the thread-safe version of php.
You need the pre-compiled versions of both pthreads and its php extension. They can be downloaded here. Make sure that you download the version that is compatible with your php version.
Copy php_pthreads.dll (from the zip you just downloaded) into your php extension folder ([phpDirectory]/ext).
Copy pthreadVC2.dll into [phpDirectory] (the root folder - not the extension folder).
Edit [phpDirectory]/php.ini and insert the following line
extension=php_pthreads.dll
Test it with the script above with some sleep or something right there where the comment is.
And now the big BUT: Although this really works, php wasn't originally made for multithreading. There exists a thread-safe version of php and as of v5.4 it seems to be nearly bug-free but using php in a multi-threaded environment is still discouraged in the php manual (but maybe they just did not update their manual on this, yet). A much bigger problem might be that a lot of common extensions are not thread-safe. So you might get threads with this php extension but the functions you're depending on are still not thread-safe so you will probably encounter race conditions, deadlocks and so on in code you did not write yourself...
You can use pcntl_fork() to achieve something similar to threads. Technically they are separate processes, so communication between the two is not as simple as with threads, and I believe it will not work if PHP is called by Apache.
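To give an idea of what that communication can look like, here is a minimal sketch that forks one child and passes its result back over a socket pair. It assumes the pcntl extension, a Unix-like system and the command line (not Apache); the function name runInChild() is made up for this example.

```php
<?php
// Hypothetical helper: runs $job in a forked child process and returns
// its (serializable) result to the parent. CLI + pcntl only.
function runInChild(callable $job)
{
    // A connected pair of sockets: index 0 for the parent, 1 for the child.
    $pair = stream_socket_pair(STREAM_PF_UNIX, STREAM_SOCK_STREAM, STREAM_IPPROTO_IP);
    if ($pair === false) {
        throw new RuntimeException('Could not create socket pair');
    }

    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('Fork failed');
    }

    if ($pid === 0) {
        // Child: do the work, send the serialized result, then exit.
        fclose($pair[0]);
        fwrite($pair[1], serialize($job()));
        fclose($pair[1]);
        exit(0);
    }

    // Parent: read until the child closes its end, then reap the child
    // so it does not linger as a zombie.
    fclose($pair[1]);
    $payload = stream_get_contents($pair[0]);
    fclose($pair[0]);
    pcntl_waitpid($pid, $status);

    return unserialize($payload);
}
```

For example, runInChild(function () { return 6 * 7; }) computes the value in the child and hands it to the parent. The result has to survive serialize(), so resources such as database connections cannot be passed this way.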
If anyone cares, I have revived php_threading (not the same as threads, but similar) and I actually have it to the point where it works (somewhat) well!
Project page
Download (for Windows PHP 5.3 VC9 TS)
Examples
README
pcntl_fork() is what you are looking for, but it's process forking, not threading.
So you will have the problem of data exchange. To solve it you can use PHP's semaphore functions ( http://www.php.net/manual/de/ref.sem.php ). Message queues may be a bit easier to start with than shared memory segments.
Anyway, here is a strategy I am using in a web framework I am developing, which loads resource-intensive blocks of a web page (probably with external requests) in parallel:
I build a job queue so I know what data I am waiting for, and then fork off one process per job. Once done, each child stores its data in the APC cache under a unique key that the parent process can access. Once all the data is there, the parent continues.
I am using a simple usleep() to wait because inter-process communication is not possible in Apache (children will lose the connection to their parents and become zombies...).
So this brings me to the last thing:
It's important that every child kills itself!
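A rough sketch of that strategy follows. It is not the framework's code: temporary files stand in for the APC cache, the function name runJobsInParallel() is invented, job keys are assumed to be safe as file-name parts, and it needs the pcntl extension on the command line. It also has no timeout, so a crashed child would make the parent wait forever.

```php
<?php
// Sketch only: fork one child per job, let each child store its result
// under a unique key, and poll with usleep() until everything is there.
function runJobsInParallel(array $jobs): array
{
    $prefix = sys_get_temp_dir() . '/' . uniqid('job_', true) . '.';
    $pids = [];

    foreach ($jobs as $key => $job) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('Fork failed');
        }
        if ($pid === 0) {
            // Child: write to a temp name, then rename, so the parent
            // never reads a half-written result.
            file_put_contents($prefix . $key . '.tmp', serialize($job()));
            rename($prefix . $key . '.tmp', $prefix . $key);
            exit(0); // every child ends itself here
        }
        $pids[$key] = $pid;
    }

    // Parent: poll for results instead of talking to the children.
    $results = [];
    while (count($results) < count($jobs)) {
        clearstatcache();
        foreach (array_keys($pids) as $key) {
            $file = $prefix . $key;
            if (!array_key_exists($key, $results) && is_file($file)) {
                $results[$key] = unserialize(file_get_contents($file));
                unlink($file);
            }
        }
        usleep(1000);
    }

    // Reap the children so none of them stays behind as a zombie.
    foreach ($pids as $pid) {
        pcntl_waitpid($pid, $status);
    }
    return $results;
}
```

The results come back keyed by job, in the order the children finished rather than the order they were queued.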
There are also classes that fork processes but keep data. I haven't examined them, but Zend Framework has one, and they usually write slow but reliable code.
you can find it here:
http://zendframework.com/manual/1.9/en/zendx.console.process.unix.overview.html
I think they use shm segments!
Last but not least, there is an error on this Zend website, a minor mistake in the example.
while ($process1->isRunning() && $process2->isRunning()) {
    sleep(1);
}
should of course be:
while ($process1->isRunning() || $process2->isRunning()) {
    sleep(1);
}
There is a threading extension being actively developed, based on PThreads, that looks very promising at https://github.com/krakjoe/pthreads
Just an update: it seems that the PHP guys are working on supporting threads, and it is available now.
Here is the link to it:
http://php.net/manual/en/book.pthreads.php
I have a PHP threading class that's been running flawlessly in a production environment for over two years now.
EDIT: This is now available as a composer library and as part of my MVC framework, Hazaar MVC.
See: https://git.hazaarlabs.com/hazaar/hazaar-thread
I know this is a way old question, but you could look at http://phpthreadlib.sourceforge.net/
Bi-directional communication, support for Win32, and no extensions required.
Ever heard about appserver from techdivision?
It is written in PHP and works as an appserver managing multiple threads for high-traffic PHP applications. It is still in beta but very promising.
There is the rather obscure, and soon to be deprecated, feature called ticks. The only thing I have ever used it for is to allow a script to capture SIGINT (Ctrl+C) and close down gracefully. (SIGKILL itself cannot be caught.)
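A minimal sketch of that use of ticks, assuming the pcntl extension on the command line. The variable $shutdownRequested and the function workUntilInterrupted() are made-up names for this example.

```php
<?php
// With ticks enabled, PHP gets a chance to run pending signal handlers
// after each statement compiled in this file.
declare(ticks=1);

$shutdownRequested = false;

// Ctrl+C sends SIGINT; the handler only sets a flag.
pcntl_signal(SIGINT, function (int $signo) use (&$shutdownRequested) {
    $shutdownRequested = true;
});

// Hypothetical work loop: stops early once the flag is set and returns
// the number of steps it completed.
function workUntilInterrupted(bool &$flag, int $maxSteps): int
{
    $steps = 0;
    while (!$flag && $steps < $maxSteps) {
        $steps++;
        usleep(1000); // stand-in for one unit of real work
    }
    return $steps;
}
```

A script would call workUntilInterrupted($shutdownRequested, $n) and do its cleanup after the loop returns. On PHP 7.1 and later, pcntl_async_signals(true) is an alternative that does not need declare(ticks=1).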