I'm writing PHP code for a web server that has to run some heavy-duty processing on request before returning the results to the users.
My question is: does the Apache server create a separate thread/process for each client, or should I use multi-threading to separate them?
The processing includes executing other applications via the command line and downloading files to the server.
Well, every request to the web server is handled by a separate process, which will try to use a free CPU core; if none is currently free, it will go into a queue and wait.
You can't have multithreading in PHP with Apache within a single web request. You simply can't. Usually Apache forks a new OS process for each request.
This is configurable, but it is the typical choice when working with PHP, since many functions in the PHP standard library are not thread safe.
When I have to handle heavy computation, I always make the user request asynchronous and let a separate daemon process do the actual computation in the background. In that case, after the initial request, I let the client poll the daemon (through further web requests) to know when the computation is done.
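As a rough sketch of that pattern (the jobs table, its columns, and the daemon that does the real work are assumptions for illustration, not a fixed recipe):

// enqueue.php - the user-facing request only records the job and returns at once.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$pdo->prepare("INSERT INTO jobs (payload, status) VALUES (?, 'pending')")
    ->execute([json_encode($_POST)]);
echo json_encode(['job_id' => $pdo->lastInsertId()]);

// status.php - the client polls this endpoint until the daemon marks the job done.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$stmt = $pdo->prepare("SELECT status, result FROM jobs WHERE id = ?");
$stmt->execute([(int) $_GET['job_id']]);
echo json_encode($stmt->fetch(PDO::FETCH_ASSOC));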
Related
I have a webpage that when users go to it, multiple (10-20) Ajax requests are instantly made to a single PHP script, which depending on the parameters in the request, returns a different report with highly aggregated data.
The problem is that a lot of the reports require heavy SQL calls to get the necessary data, and in some cases, a report can take several seconds to load.
As a result, because one client is sending multiple requests to the same PHP script, you end up seeing the reports slowly load on the page one at a time. In other words, the generating of the reports is not done in parallel, and thus causes the page to take a while to fully load.
Is there any way to get around this in PHP and make it possible for all the requests from a single client to a single PHP script to be processed in parallel so that the page and all its reports can be loaded faster?
Thank you.
As far as I know, it is possible to do multi-threading in PHP.
Have a look at the pthreads extension.
What you could do is run the report-generation part/function of the script in parallel. This will make sure each report is generated in a thread of its own and your results come back much sooner. Also, cap the number of concurrent threads at 10 or fewer so it doesn't become a resource hog.
Here is a basic tutorial to get you started with pthreads.
And here are a few more examples that could be of help (notably the SQLWorker example, in your case).
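For instance, a sketch along those lines with pthreads might look like the following. Note that pthreads needs a ZTS build of PHP and (in version 3) only works from the CLI; the ReportTask class, the $reportQueries list and the connection details here are made up for illustration:

class ReportTask extends Threaded {
    private $sql;
    public $rows;
    public function __construct($sql) { $this->sql = $sql; }
    public function run() {
        // Each thread opens its own connection; connections can't be shared.
        $db = new mysqli('localhost', 'user', 'pass', 'reports');
        $this->rows = $db->query($this->sql)->fetch_all(MYSQLI_ASSOC);
        $db->close();
    }
}

$pool = new Pool(10, Worker::class);      // cap concurrency at 10 threads
$tasks = [];
foreach ($reportQueries as $sql) {        // $reportQueries: your report SQL strings
    $task = new ReportTask($sql);
    $pool->submit($task);
    $tasks[] = $task;
}
$pool->shutdown();                        // blocks until every submitted task has run

foreach ($tasks as $task) {
    // $task->rows now holds the aggregated data for one report
}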
Server setup
This is more of a server configuration issue and depends on how PHP is installed on your system: If you use php-fpm you have to increase the pm.max_children option. If you use PHP via (F)CGI you have to configure the webserver itself to use more children.
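As a hypothetical example, in a php-fpm pool configuration file (e.g. www.conf; the exact location and sensible values depend on your installation) you would raise the worker limit like this:

pm = dynamic
pm.max_children = 20
pm.start_servers = 5
pm.min_spare_servers = 3
pm.max_spare_servers = 10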
Database
You also have to make sure that your database server allows that many concurrent processes to run. It won’t do any good if you have enough PHP processes running but half of them have to wait for the database to notice them.
In MySQL, for example, the setting for that is max_connections.
Browser limitations
Another problem you're facing is that browsers won't make 10-20 parallel requests to the same host. It depends on the browser, but to my knowledge modern browsers will only open 2-6 connections to the same host (domain) simultaneously, so any further requests will just get queued, regardless of server configuration.
Alternatives
If you use MySQL, you could try to merge all your calls into one request and use parallel SQL queries using mysqli::poll().
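A rough sketch of that approach, assuming mysqlnd is available (the queries and connection details are placeholders):

$queries = [
    'sales'  => 'SELECT SUM(amount) FROM sales',
    'visits' => 'SELECT COUNT(*) FROM visits',
];

$links = [];
foreach ($queries as $name => $sql) {
    $link = new mysqli('localhost', 'user', 'pass', 'reports');
    $link->query($sql, MYSQLI_ASYNC);          // returns immediately
    $links[$name] = $link;
}

$results = [];
$pending = array_values($links);
while ($pending) {
    $read = $error = $reject = $pending;
    if (mysqli_poll($read, $error, $reject, 1) < 1) {
        continue;                              // nothing ready yet, poll again
    }
    foreach ($read as $link) {
        $name = array_search($link, $links, true);
        $results[$name] = $link->reap_async_query()->fetch_row();
        $pending = array_filter($pending, function ($l) use ($link) {
            return $l !== $link;
        });
    }
}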
If that’s not possible you could try calling child processes or forking within your PHP script.
Of course PHP can execute multiple requests in parallel if it runs behind a web server like Apache or Nginx. PHP's built-in dev server is single-threaded, but it should only be used for development anyway. If you are using PHP's file-based sessions, however, access to the session is serialized, i.e. only one script can have the session file open at any time. Solution: fetch the information you need from the session at script start, then close the session.
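For example, something like this at the top of the report script releases the session lock so parallel Ajax requests from the same user are no longer serialized (the session key is just an illustration):

session_start();
$userId = $_SESSION['user_id'] ?? null;   // read whatever you need up front
session_write_close();                    // release the session lock immediately

// ...the long-running report generation continues here without blocking
// other requests that share this session...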
We use nginx and php-fpm as our game server.
We want to make sure requests from one player are processed one by one.
That way, multi-threading bugs are greatly reduced in our game.
We do not know how to configure nginx that way.
Thank you.
The web server itself cannot be run in single-threaded mode, as far as I know.
I think there is a solution to this problem. You need a queue to process a player's requests.
There are two options to create a thread-safe queue.
One is to write a PHP interface to a thread-safe queue application that resides in the server's memory. PHP can simply add requests to this thread-safe app, and the app can then run them in order.
Or you can simply store requests in a database (databases support simultaneous inserts) and then run a program that reads the requests from the DB and executes them in order; a rough sketch of this option follows below.
However, this will add overhead to the execution process.
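Here is a sketch of the database-backed option: one worker per player pulls that player's requests in insertion order. The player_jobs table, its columns and the handleRequest() function are hypothetical:

$playerId = (int) ($argv[1] ?? 0);                 // run one worker per player
$pdo = new PDO('mysql:host=localhost;dbname=game', 'user', 'pass');

while (true) {
    $stmt = $pdo->prepare("SELECT id, payload FROM player_jobs
                           WHERE player_id = ? AND status = 'pending'
                           ORDER BY id LIMIT 1");
    $stmt->execute([$playerId]);
    $job = $stmt->fetch(PDO::FETCH_ASSOC);

    if (!$job) {
        usleep(100000);                            // nothing queued, wait 100 ms
        continue;
    }

    handleRequest($job['payload']);                // your game logic (hypothetical)

    $pdo->prepare("UPDATE player_jobs SET status = 'done' WHERE id = ?")
        ->execute([$job['id']]);
}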
I read somewhere that PHP has a threading ability called pcntl_fork(), but I still don't get what it means. What is its purpose?
Programming languages have multithreading abilities, which, from what I understand (correct me if I'm wrong), is the ability for a parent object to have children of some kind.
Thanks
From wikipedia: "In computer science, a thread of execution is the smallest unit of processing that can be scheduled by an operating system. It generally results from a fork of a computer program into two or more concurrently running tasks."
Basically, having threads is having the ability to do multiple things within the same running application (or, process space as said by RC).
For example, if you're writing a chat server app in PHP, it would be really nice to "fork" certain tasks so if the server gets hung up processing something like a file transfer or very slow client it can spawn a thread to take care of the file transfer while the main application continues to transfer messages between clients without delay. Last time I'd used PHP, the threading was clunky/not very well supported.
Or, on the client end, while sending said file, it would be a good idea to thread the file transfer, otherwise you wouldn't be able to send messages to the server while sending the file.
It is not a very good metaphor. Think of a thread like a worker or helper that will do work or execute code for you in the background while your program could possibly be doing other tasks such as taking user input.
Threading means you can have more than one line of execution within the same process space. Note that this is different than a multi-process paradigm as processes will not share the same memory space.
This wiki link will do a good job of getting you up to speed with threads. Having said that, the function pcntl_fork() in PHP appears to create a child process, which falls in line with the multi-process paradigm.
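A minimal sketch of pcntl_fork(), assuming the CLI SAPI (the pcntl functions are not available under most web server SAPIs):

$pid = pcntl_fork();

if ($pid === -1) {
    die("fork failed\n");
} elseif ($pid === 0) {
    // Child process: do the background work here.
    sleep(2);
    exit(0);
} else {
    // Parent process: keep going, then wait for the child to finish.
    echo "spawned child $pid\n";
    pcntl_waitpid($pid, $status);
    echo "child exited with status " . pcntl_wexitstatus($status) . "\n";
}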
In layman's terms, since a thread gives you more than one line of execution within a program, it allows you to do more than one thing at the same time. Technically, you're not always doing these things simultaneously: on a single-core processor you're really just time-slicing, so it only appears you're doing more than one thing at a time.
A pretty straight-forward use of threads is how connections to a web server are handled. If you didn't have multiple threads, your application would listen for a connection on a socket, accept the connection when a client requested a connection, and then would process whatever page the client asked for. This seems well and good until you have a page that takes 5 seconds to load and you have 2 clients connecting at the same time. One of the clients will sit and wait for the server to accept their connection for ~5 seconds, because the 1st client is using the only line of execution to serve the page and it can't do that and accept the 2nd connection.
Now if you have multiple threads, you'll have one thread (i.e. the listener thread) that only accepts connections. As soon as a connection is accepted by the listener thread, it will pass the connection on to another thread. We'll call it the processor thread. Once the connection is passed on to the processor thread, the listener thread will immediately go back to waiting for a new connection. Meanwhile, the processor thread will use its own line of execution to serve the page that takes 5 seconds. In the scenario above, the 2nd client would have its connection accepted immediately after the 1st client was handed to the 1st processor thread, and an additional, 2nd processor thread would be created to handle the request from the 2nd client. This would typically allow you to serve both clients the data in a little over 5 seconds, while the single-threaded app would take ~10 seconds.
Hope this helps with your understanding of application threading.
Threading means your code does not have to behave strictly sequentially, whatever your programming language.
I have a web service running, written in PHP/MySQL. The script involves fetching data from other websites like Wikipedia, Google, etc. The average execution time for a script is 5 seconds (currently running on one server). Now I have been asked to scale the system to handle 60 requests/second. Which of these approaches should I follow?
- Split functionality between servers (I create one server to fetch data from Wikipedia, another to fetch from Google, etc., and a main server.)
- Split load between servers (I create one main server which round-robins each request entirely to one of its child servers, with each child processing one complete request. What about MySQL database sharing between child servers here?)
I'm not sure what you would really gain by splitting the functionality between servers (option #1). You can use Apache's mod_proxy_balancer to accomplish your second option. It has a few different algorithms to determine which server would be most likely to be able to handle the request.
http://httpd.apache.org/docs/2.1/mod/mod_proxy_balancer.html
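A minimal configuration sketch for mod_proxy_balancer (the backend hostnames are placeholders) could look like:

<Proxy balancer://mycluster>
    BalancerMember http://app1.example.com
    BalancerMember http://app2.example.com
</Proxy>
ProxyPass /service balancer://mycluster
ProxyPassReverse /service balancer://mycluster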
Apache/PHP should be able to handle multiple requests concurrently by itself. You just need to make sure you have enough memory and configure Apache correctly.
Your script is not a server; it acts as a client when it makes requests to other sites. The rest of the time it's merely a component of your server.
Yes, running multiple clients (instances of your script; you don't need more hardware) concurrently will be much faster than running them sequentially. However, if you need to fetch the data synchronously with the incoming request to your script, coordinating the results of the separate instances will be difficult. Instead, you might take a look at the curl_multi_* functions, which allow you to batch up several requests and run them concurrently from a single PHP thread, as sketched below.
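A rough sketch of the curl_multi_* approach (the URLs are placeholders):

$urls = [
    'https://en.wikipedia.org/wiki/PHP',
    'https://www.google.com/search?q=php',
];

$mh = curl_multi_init();
$handles = [];
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_multi_add_handle($mh, $ch);
    $handles[] = $ch;
}

// Drive all transfers concurrently until every handle has finished.
do {
    $status = curl_multi_exec($mh, $running);
    if ($running) {
        curl_multi_select($mh);                // wait for activity, don't busy-loop
    }
} while ($running && $status === CURLM_OK);

foreach ($handles as $ch) {
    $body = curl_multi_getcontent($ch);        // the fetched response body
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);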
Alternatively, if you know in advance what the incoming requests to your web service will be, then you should think about implementing scheduling and caching of the fetches so the data is already available when a request arrives.
I know PHP is not multithreaded, but I talked with a friend about this: if I have a large algorithmic problem I want to solve with PHP, isn't the solution simply to use the curl_multi_* interface and start n HTTP requests against the same server? This is what I would call PHP-style multithreading.
Are there any problems with this in a typical webserver environment? The master request, which is waiting on curl_multi_exec, shouldn't count any of that time against its own maximum runtime or memory limit.
I have never seen this promoted anywhere as a solution to keep a script from being killed by overly restrictive admin settings for PHP.
If I add this as a feature to a popular PHP system, will server admins hire a Russian mafia hitman to get revenge for this hack?
If I add this as a feature to a popular PHP system, will server admins hire a Russian mafia hitman to get revenge for this hack?
No, but it's still a terrible idea, for no other reason than that PHP is supposed to render web pages, not run big algorithms. I see people trying to do this in ASP.NET all the time. There are two proper solutions:
- Have your PHP script spawn a process that runs independently of the web server and updates a common data store (probably a database) with information about the progress of the task that your PHP scripts can access.
- Have a constantly running daemon that checks for jobs in a common data store that the PHP scripts can issue jobs to and view the progress on currently running jobs.
By using curl, you are adding a network timeout dependency into the mix. Ideally you would run everything from the command line to avoid timeout issues.
PHP does support forking (pcntl_fork). You can fork some processes and then monitor them with something like pcntl_waitpid. You end up with one "parent" process to monitor the children it spawned.
Keep in mind that while one process can start up, load everything, and then fork, you can't share things like database connections, so each forked process should establish its own. I've used forking for up to 50 processes.
If forking isn't available in your PHP install, you can spawn a process as Spencer mentioned. Just make sure you spawn the process in such a way that it doesn't block processing of your main script. You also want to get the process ID so you can monitor the spawned processes.
exec("nohup /path/to/php.script > /dev/null 2>&1 & echo $!", $output);
$pid = $output[0];
You can also use the above exec() setup to spawn a process started from a web page and get control back immediately.
Out of curiosity - what is your "large algorithmic problem" attempting to accomplish?
You might be better to write it as an Amazon EC2 service, then sell access to the service rather than the package itself.
Edit: you now mention "mass emails". There are already services that do this, they're generally known as "spammers". Please don't.
Lothar,
As far as I know, PHP doesn't work with services the way its competitors do, so there is no way for PHP to know how much time has passed unless you constantly interrupt the process to check the elapsed time. So, IMO, no, you can't do that in PHP :)