How does a server manage different users' requests at the same time? - php

Can you tell me how a server handles different HTTP requests at the same time? If 10 users are logged in to a site and request a page at the same moment, what happens?

Usually, each of the users sends an HTTP request for the page. The server receives the requests and delegates them to different workers (processes or threads).
Depending on the URL given, the server reads a file and sends it back to the user. If the file is a dynamic file such as a PHP file, the file is executed before it is sent back to the user.
Once the requested file has been sent back, the server usually closes the connection after a few seconds.
For more, see: HowStuffWorks Web Servers

HTTP uses TCP, which is a connection-based protocol. That is, clients establish a TCP connection for the duration of their communication with the server.
Multiple clients are allowed to connect to the same destination port on the same destination machine at the same time. The server just opens up multiple simultaneous connections.
Apache (and most other HTTP servers) has a multi-processing module (MPM). This is responsible for allocating Apache threads/processes to handle connections. These processes or threads can then run in parallel on their own connection, without blocking each other. Apache's MPM also tends to keep "spare" threads or processes around even when no connections are open, which helps speed up subsequent requests.
The program ab (short for ApacheBench), which comes with Apache, lets you test what happens when you open multiple connections to your HTTP server at once.
Apache's configuration files will normally set a limit for the number of simultaneous connections it will accept. This will be set to a reasonable number, such that during normal operation this limit should never be reached.
Note too that the HTTP protocol (from version 1.1) allows for a connection to be kept open, so that the client can make multiple HTTP requests before closing the connection, potentially reducing the number of simultaneous connections they need to make.
More on Apache's MPMs:
Apache itself can use a number of different multi-processing modules (MPMs). Apache 1.x normally used a module called "prefork", which creates a number of Apache processes in advance, so that incoming connections can often be sent to an existing process. This is as I described above.
Apache 2.x normally uses an MPM called "worker", which uses multithreading (running multiple execution threads within a single process) to achieve the same thing. Threads are much more lightweight than separate processes and may use a bit less memory, so this approach is very fast.
The disadvantage of multithreading is that you can't run things like mod_php. When you're multithreading, all your add-in libraries need to be "thread-safe", that is, they need to be aware of running in a multithreaded environment. It's harder to write a multi-threaded application. Because threads within a process share memory and resources, race condition bugs can easily arise where one thread reads or writes memory while another thread is in the middle of writing to it. Getting around this requires techniques such as locking. Many of PHP's built-in libraries are not thread-safe, so those wishing to use mod_php cannot use Apache's "worker" MPM.

Apache 2 has two different modes of operation: running as a threaded server, or using a mode called "prefork" (multiple processes).

The requests will be processed simultaneously, to the best ability of the HTTP daemon.
Typically, the HTTP daemon will spawn either several processes or several threads, and each one will handle one client request. The server may keep spare threads/processes so that when a client makes a request, it doesn't have to wait for a thread/process to be created. Each thread/process may be mapped to a different processor or core so that requests can be processed more quickly. In most circumstances, however, what holds requests up is network I/O rather than a lack of raw computing power, so there is frequently no slowdown even when the number of processors/cores is significantly lower than the number of requests handled at one time.

The server (Apache) is multi-threaded, meaning it can run multiple programs at once. A few years ago, a single CPU could switch back and forth quickly between multiple threads, giving the appearance that two things were happening at once. These days, computers have multiple processors, so the computer can actually run two threads of code simultaneously. That being said, threads aren't really mapped to processors in any simple way.
With that ability, a PHP program can be thought of as a single thread of execution. If two requests reach the server at the same time, two threads can be used to process the requests simultaneously. They will probably both get about the same amount of CPU, so if they are doing the same thing, they will complete at approximately the same time.
One of the most common issues with multi-threading is "race conditions", where two requests are doing the same thing ("racing" to do the same thing); if there is a single resource, one of them is going to win. If they both insert a record into the database, they can't both get the same id: one of them will win. So you need to be careful when writing code to realize other requests are going on at the same time and may modify your database, write files, or change globals.
That being said, the programming model allows you to mostly ignore this complexity.
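To make that last point concrete, here is a minimal sketch of the id example above: the database serializes the two INSERTs, and PDO::lastInsertId() is per-connection, so two racing requests each get their own id. The credentials and table are hypothetical placeholders.

```php
<?php
// Two simultaneous requests can't get the same auto-increment id: the
// database serializes the inserts, and each connection asks for *its own*
// last insert id. (Placeholder DSN, credentials, and table.)
$db = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$db->prepare('INSERT INTO orders (user_id) VALUES (?)')->execute([42]);

// lastInsertId() is scoped to this connection, so a request racing with
// this one always gets its own id, never ours.
$orderId = $db->lastInsertId();
echo "created order $orderId\n";
```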

Related

how are concurrent requests handled in PHP (using threads, a thread pool, or child processes)

I understand that PHP supports handling multiple concurrent connections, and depending on the server it can be configured as mentioned in this answer.
How does the server manage multiple connections? Does it fork a child process for each request, handle requests using threads, or use a thread pool?
The linked answer says a process is forked, but the author then says "threads or processes" in a comment, which makes it confusing: are requests served using child processes, threads, or a thread pool?
As far as I know, every web server has its own way of handling multiple simultaneous requests.
Usually, Apache 2 forks a child process for each new request, but you can configure this behaviour as mentioned in your linked Stack Overflow answer.
Nginx, for example, handles every request in one thread (it processes new connections asynchronously, like Node.js does) and sometimes uses caching (as configured; Nginx can also be used as a load balancer or HTTP proxy). It's a matter of choosing the right web server for your application.
Apache 2 can be a very good web server, but you may need more load balancing when you want to use it in production. It also performs well with many short-lived connections, or with documents that don't change at all (or when using caching).
Nginx is very good if you expect many long-lived connections with somewhat long processing times. You don't need as much load balancing then.
I hope I was able to help you out with this ;)
Sources:
https://httpd.apache.org/docs/2.4/mod/worker.html
https://anturis.com/blog/nginx-vs-apache/
I recommend you to also look at: What is thread safe or non-thread safe in PHP?
I think the answer depends on how the web server and the CGI layer are deployed.
In my company, we use Nginx as the web server and php-fpm as the CGI layer, so concurrent requests are handled as processes by php-fpm, not threads.
We configure the maximum number of processes, and each request is handled by a single PHP process; if more requests come in than the maximum number of processes, they wait.
So I believe PHP itself can support all of these models, but which one you get depends on the deployment.
After doing some research, I ended up with the conclusions below.
It is important to consider how PHP servers are set up in order to get insight into this. For setting up the server and PHP on your own, there are three possibilities:
1) Using PHP as a module (many servers have a direct module interface for PHP, also called a SAPI)
2) CGI
3) FastCGI
Considering case #1, PHP as a module: the module is integrated into the web server itself, which puts the ball entirely in the web server's court as to how it handles requests, in terms of forking processes, using threads, thread pools, etc.
As a module, Apache's mod_php appears to be very commonly used, and Apache itself handles requests using processes and threads in two models, as mentioned in this answer:
Prefork MPM uses multiple child processes with one thread each, and each process handles one connection at a time.
Worker MPM uses multiple child processes with many threads each. Each thread handles one connection at a time.
Obviously, other servers may take other approaches, but I am not aware of them.
For #2 and #3, the web server and the PHP part run in different processes, and how the web server handles the request and how it is further processed by the application (the PHP part) varies. For example, Nginx may handle the request using asynchronous non-blocking I/O while Apache may handle requests using threads, but how the request is processed by the FastCGI or CGI application is a separate aspect, described below. Both aspects, i.e. how the web server handles requests and how the PHP part is processed, are important for a PHP server's performance.
Considering #2, the CGI protocol makes the web server and the application (PHP) independent of each other. CGI requires the application and the web server to run in different processes, and the protocol does not promote reuse of the same process, which in turn means a new process is required to handle each request.
Considering #3, the FastCGI protocol overcomes this limitation of CGI by allowing process reuse. If you check the IIS FastCGI link: FastCGI addresses the performance issues that are inherent in CGI by providing a mechanism to reuse a single process over and over again for many requests. FastCGI maintains compatibility with non-thread-safe libraries by providing a pool of reusable processes and ensuring that each process handles only one request at a time.
That said, in the case of FastCGI it appears that the server maintains a process pool which it uses to handle incoming client requests; since pooled processes do not require thread-safety checks, this provides good performance.
PHP does not handle requests. The web server does.
For the Apache HTTP Server, the most popular option is "mod_php". This module is actually PHP itself, but compiled as a module for the web server, so it gets loaded right inside it.
Since with mod_php PHP gets loaded right into Apache, if Apache is going to handle concurrency using its worker MPM (that is, using threads), then PHP and everything loaded into it need to be thread safe as well.
With Nginx, PHP runs entirely outside of the web server, as multiple PHP processes.
This sometimes gives you the choice of using either the non-thread-safe or the thread-safe build of PHP.
But the setlocale() function (where supported) actually modifies operating-system process state, and it is not thread safe.
You should remember this when you are not sure how legacy code behaves.
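As an illustration, here is a minimal sketch of why setlocale() is dangerous under a threaded SAPI; the locale name is an assumption and must be installed on the system.

```php
<?php
// setlocale() changes locale state for the whole OS process, not just the
// current request. Under a threaded SAPI, other threads serving unrelated
// requests would observe the changed locale too.
// 'de_DE.UTF-8' is an assumed locale; it must be installed on the system.
setlocale(LC_NUMERIC, 'de_DE.UTF-8');

// printf's %f conversion is locale-aware, so this typically prints
// "1234,56" (German decimal comma) instead of "1234.56".
printf("%.2f\n", 1234.56);
```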

How to process multiple parallel requests from one client to one PHP script

I have a webpage that when users go to it, multiple (10-20) Ajax requests are instantly made to a single PHP script, which depending on the parameters in the request, returns a different report with highly aggregated data.
The problem is that a lot of the reports require heavy SQL calls to get the necessary data, and in some cases, a report can take several seconds to load.
As a result, because one client is sending multiple requests to the same PHP script, you end up seeing the reports slowly load on the page one at a time. In other words, the generation of the reports is not done in parallel, which causes the page to take a while to fully load.
Is there any way to get around this in PHP and make it possible for all the requests from a single client to a single PHP script to be processed in parallel so that the page and all its reports can be loaded faster?
Thank you.
As far as I know, it is possible to do multi-threading in PHP.
Have a look at the pthreads extension.
What you could do is make the report-generation part/function of the script execute in parallel. This will make sure that each function is executed in a thread of its own and will retrieve your results much sooner. Also, set the maximum number of concurrent threads to 10 or fewer so that it doesn't become a resource hog.
Here is a basic tutorial to get you started with pthreads.
And here are a few more examples which could be of help (notably the SQLWorker example, in your case).
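For illustration, here is a minimal sketch of that idea with a pthreads Pool. Note that pthreads requires a ZTS (thread-safe) build of PHP and, as of pthreads v3, only runs under the CLI; the class name, query, and credentials below are hypothetical placeholders.

```php
<?php
// Requires the pthreads extension on a ZTS build of PHP (CLI only in v3).
class ReportTask extends Threaded
{
    private $reportId;
    public $result;

    public function __construct(int $reportId)
    {
        $this->reportId = $reportId;
    }

    public function run()
    {
        // Each task runs in its own thread; do the heavy SQL work here.
        // Open a fresh connection per thread (placeholder credentials;
        // assumes a thread-safe build of the PDO driver).
        $db = new PDO('mysql:host=localhost;dbname=reports', 'user', 'pass');
        $stmt = $db->prepare('SELECT SUM(amount) FROM sales WHERE report_id = ?');
        $stmt->execute([$this->reportId]);
        $this->result = $stmt->fetchColumn();
    }
}

// Cap the pool at 10 workers so it doesn't become a resource hog.
$pool = new Pool(10);
$tasks = [];
foreach (range(1, 20) as $id) {
    $tasks[$id] = new ReportTask($id);
    $pool->submit($tasks[$id]);
}
$pool->shutdown(); // wait for all submitted tasks to finish

foreach ($tasks as $id => $task) {
    echo "report $id: {$task->result}\n";
}
```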
Server setup
This is more of a server configuration issue and depends on how PHP is installed on your system: If you use php-fpm you have to increase the pm.max_children option. If you use PHP via (F)CGI you have to configure the webserver itself to use more children.
Database
You also have to make sure that your database server allows that many concurrent processes to run. It won’t do any good if you have enough PHP processes running but half of them have to wait for the database to notice them.
In MySQL, for example, the setting for that is max_connections.
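As a quick check, you can read the current limit from PHP itself (the credentials are placeholders):

```php
<?php
// Ask MySQL for its concurrent-connection limit (placeholder credentials).
$db = new mysqli('localhost', 'user', 'pass');
$row = $db->query("SHOW VARIABLES LIKE 'max_connections'")->fetch_assoc();
echo "max_connections = {$row['Value']}\n"; // e.g. "max_connections = 151"
```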
Browser limitations
Another problem you're facing is that browsers won't make 10-20 parallel requests to the same host. It depends on the browser, but to my knowledge modern browsers will only open 2-6 connections to the same host (domain) simultaneously. So any further requests will just get queued, regardless of server configuration.
Alternatives
If you use MySQL, you could try to merge all your calls into one request and use parallel SQL queries using mysqli::poll().
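Here is a minimal sketch of that approach (credentials and queries are placeholders; each asynchronous query needs its own connection, and the mysqlnd driver is required):

```php
<?php
// Run several queries concurrently with mysqli's asynchronous API.
$queries = [
    'SELECT SLEEP(2), "report A" AS name',
    'SELECT SLEEP(2), "report B" AS name',
];

$links = [];
foreach ($queries as $sql) {
    $link = new mysqli('localhost', 'user', 'pass', 'reports');
    $link->query($sql, MYSQLI_ASYNC); // returns immediately
    $links[] = $link;
}

$pending = $links;
while ($pending) {
    $read = $error = $reject = $pending;
    // Wait up to 1 second for any of the connections to become ready.
    if (mysqli::poll($read, $error, $reject, 1) < 1) {
        continue;
    }
    foreach ($read as $link) {
        if ($result = $link->reap_async_query()) {
            print_r($result->fetch_assoc());
            $result->free();
        }
        // Remove the finished link from the pending set.
        unset($pending[array_search($link, $pending, true)]);
    }
}
```

Because the queries run concurrently, the two SLEEP(2) calls above finish in roughly 2 seconds total rather than 4.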
If that’s not possible you could try calling child processes or forking within your PHP script.
Of course PHP can execute multiple requests in parallel, if it is run under a web server like Apache or Nginx. PHP's built-in development server is single-threaded, but it should only be used for development anyway. If you are using PHP's file-based sessions, however, access to the session is serialized, i.e. only one script can have the session file open at any time. Solution: fetch what you need from the session at script start, then close the session, as in the sketch below.
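A minimal sketch of that solution, assuming the script only needs to read the session (the session key is a placeholder):

```php
<?php
// Parallel Ajax requests to this script are serialized by PHP's file-based
// session handler, because session_start() takes an exclusive lock on the
// session file. Copy out what you need, then release the lock early.
session_start();
$userId = $_SESSION['user_id'] ?? null; // hypothetical session key

// Releases the session file lock so the next concurrent request can
// proceed while this one does the slow report work.
session_write_close();

// ... heavy SQL / report generation continues here without blocking others.
```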

CGI is inefficient, but what is being used that is so different nowaydays?

My understanding is that creating CGI scripts is a thing of the past, and CGI has been deemed inefficient because of the way it forks every time it's called. However, I don't see what the difference is: when you call a web page with PHP scripts embedded, it still in some way forks another process, so why is CGI deemed inefficient?
There are two "mainstream" ways around forking for every request:
You can load the interpreter directly into your server's process space, and prefork a set number of instances during startup. mod_php and mod_python take roughly this approach.
You can create a persistent process for the interpreter, and then either prefork or spawn threads for each request, communicating with the server over sockets. FastCGI is used this way.
Event-driven servers, while not exactly mainstream, are becoming more common for good reason. They rely on the knowledge that most websites spend most of their time blocking on I/O, just spinning their metaphorical gears. Whenever a request needs to do any I/O, the server is free to start handling another request without starting another thread/process by using select() and friends. This is really the only way to solve the C10k problem.
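To make the event-driven idea concrete, here is a minimal sketch of an event loop in PHP using stream_select() (PHP's wrapper around select()); the address and the toy HTTP response are assumptions, not a production server:

```php
<?php
// One process watches many sockets and only acts when one is readable,
// instead of dedicating a thread/process to each connection.
$server = stream_socket_server('tcp://127.0.0.1:8080', $errno, $errstr);
$clients = [];

while (true) {
    $read = array_merge([$server], $clients);
    $write = $except = null;
    // Block until at least one socket has activity.
    if (stream_select($read, $write, $except, null) < 1) {
        continue;
    }
    foreach ($read as $sock) {
        if ($sock === $server) {
            // New connection: accept it and keep watching it.
            $clients[] = stream_socket_accept($server);
        } elseif (($data = fread($sock, 8192)) === '' || $data === false) {
            // Client went away.
            unset($clients[array_search($sock, $clients, true)]);
            fclose($sock);
        } else {
            // Toy response; a real server would parse the request first.
            fwrite($sock, "HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok");
            unset($clients[array_search($sock, $clients, true)]);
            fclose($sock);
        }
    }
}
```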
CGI forks every request. This means the "weight" of forking/initializing is done every time.
FCGI or mod_php only fork on server start or on load increase. This means setup is only done once per process (but they don't share memory).
Facebook's HipHop doesn't fork (it transforms the PHP into thread-safe C++ code). This allows all the PHP "process" to live in a single multithreaded C++ binary, resulting in a substantial speed up and decrease in memory usage.
There is FastCGI, where you start some instances of the PHP binary and they handle queries one after another; this way those processes do not have to start over and over again for each request.
Also, PHP is typically run as an Apache module, whose lifecycle is tied to the web server's, so there is no additional overhead between requests: when Apache identifies a file as a PHP script, it calls mod_php, which was started together with the web server itself.
CGI's overhead came mainly from having to fork() the web server process handling the request, fire up a shell, and then run an external program.
Since the request was handled by an external program, almost everything to do with the request had to be copied into the new shell's environment variables: query string, remote address, authentication data, etc. POST data was passed to the script via its stdin. All of this took time to copy and configure, adding to the overhead.
This is one reason why query strings had to be length-limited. Some operating systems had (and still have) length limits on the names of environment variables, and definitely had limits on how large an individual variable's contents could be. The HTTP spec itself imposes no limits, but because of the CGI mechanism and underlying OS limits, limited-length query strings became the norm.
Microsoft IIS allows the script to be interpreted in-process by loading a library into its address space. (Newer versions of IIS work more like FastCGI in that the actual script execution is performed in a separate web worker process, although this is subject to the configuration settings for the virtual directory.)

load balancing in php

I have a web service written in PHP/MySQL. The script involves fetching data from other websites like Wikipedia, Google, etc. The average execution time for a script is 5 seconds (currently running on one server). Now I have been asked to scale the system to handle 60 requests/second. Which approach should I follow?
-Split functionality between servers (I create 1 server to fetch data from wikipedia, another to fetch from google etc and a main server.)
-Split load between servers (I create one main server which round robin the request entirely to its child servers with each child processing one complete request. What about MYSQL database sharing between child servers here?)
I'm not sure what you would really gain by splitting the functionality between servers (option #1). You can use Apache's mod_proxy_balancer to accomplish your second option. It has a few different algorithms to determine which server would be most likely to be able to handle the request.
http://httpd.apache.org/docs/2.1/mod/mod_proxy_balancer.html
Apache/PHP should be able to handle multiple requests concurrently by itself. You just need to make sure you have enough memory and configure Apache correctly.
Your script is not a server; it's acting as a client when it makes requests to other sites. The rest of the time it's merely a component of your server.
Yes, running multiple clients (instances of your script; you don't need more hardware) concurrently will be much faster than running them sequentially. However, if you need to fetch the data synchronously with the incoming request to your script, then coordinating the results of the separate instances will be difficult. Instead, you might take a look at the curl_multi* functions, which allow you to batch up several requests and run them concurrently from a single PHP thread, as in the sketch below.
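A minimal sketch of the curl_multi_* pattern (the URLs are placeholders):

```php
<?php
// Fetch several URLs concurrently from one PHP process.
$urls = [
    'https://en.wikipedia.org/wiki/PHP',
    'https://www.google.com/search?q=php',
];

$mh = curl_multi_init();
$handles = [];
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_multi_add_handle($mh, $ch);
    $handles[$url] = $ch;
}

// Drive all transfers until every one has finished.
do {
    $status = curl_multi_exec($mh, $running);
    if ($running) {
        curl_multi_select($mh); // wait for activity instead of busy-looping
    }
} while ($running && $status === CURLM_OK);

foreach ($handles as $url => $ch) {
    $body = curl_multi_getcontent($ch);
    echo $url, ': ', strlen($body), " bytes\n";
    curl_multi_remove_handle($mh, $ch);
    curl_close($ch);
}
curl_multi_close($mh);
```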
Alternately, if you know in advance what the incoming requests to your web service will be, you should think about implementing scheduling and caching of the fetches so the data is already available when a request arrives.
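And a minimal sketch of the caching idea, using a file cache that the first request (or a cron job) refreshes; the path, URL, and lifetime are assumptions:

```php
<?php
// Serve a cached copy of a remote fetch; refresh it when it gets stale.
$cacheFile = '/tmp/wikipedia_php.html'; // placeholder cache path
$maxAge    = 300;                       // assumed lifetime in seconds

if (!is_file($cacheFile) || time() - filemtime($cacheFile) > $maxAge) {
    $body = file_get_contents('https://en.wikipedia.org/wiki/PHP');
    if ($body !== false) {
        // LOCK_EX prevents two concurrent requests writing at once.
        file_put_contents($cacheFile, $body, LOCK_EX);
    }
}
echo file_get_contents($cacheFile);
```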

How do PHP's p* connect methods work?

My understanding of PHP's p* connections is that they keep a connection persistent between page loads to the service (be it memcache, a socket, etc.). But are these connections thread safe? What happens when two pages try to access the same connection at the same time?
In the typical unix deployment, PHP is installed as a module that runs inside the apache web server, which in turn is configured to dispatch HTTP requests to one of a number of spawned children.
For the sake of efficiency, apache will often spawn these processes ahead of time (pre-forking them) and maintain them, so that they can dispatch more than one request, and save the overhead of starting up a process for every request that comes in.
PHP works on the principle of starting every request with a clean environment; no script variables persist between page loads. (Contrast this with mod_perl or python, where applications often manifest subtle bugs due to unexpected state hangovers).
This means that the typical resource allocated by a PHP script, be it an image handle for GD or a database connection, will be released at the end of a request.
Some resources, particularly Oracle database connections, have quite a high cost to establish, so it is desirable to somehow cache that connection between dispatched web requests.
Enter persistent resources.
The way these work is that any given apache child process may maintain a resource beyond the scope of a request by registering it in a "persistent list" of resources. The persistent list is not cleaned up at the end of the request (known as RSHUTDOWN internally). When you use a pconnect function, it will look up the persistent list entry for a given set of unique credentials and return that, if it exists, or establish a new connection with those credentials.
If you have configured apache to maintain 200 child processes, you should expect to see that many connections established from your web server to your database machine.
If you have many web servers and a single database machine, you may end up loading your database machine much more than you anticipated.
With a threaded SAPI, the persistent list is maintained per thread, so it should be thread safe and have similar benefits. But the usual caveat about PHP not being recommended to run in a threaded SAPI applies: while PHP itself is thread safe, many of the libraries it uses may have thread-safety problems of their own and cause you a good number of headaches.
The manual's page on Persistent Database Connections might give you some information about persistent connections.
Still, it doesn't say anything specific about thread safety; I've never seen anything about that anywhere, as far as I remember, so I suppose it "just works OK". My guess would be that a connection is re-used only if it's not already in use by another thread at the same time, but it's just some kind of (logical) wild guess...
Generally speaking, PHP will make one persistent connection per process or thread running on the webserver. Because of this, a process or thread will not access the connection of another process or thread.
Instead, when you make a database connection PHP will check to see if one is already open (in the process or thread that is handling the page request) and if it is then it will use it, otherwise it will just initialize a new one.
So to answer your question, they aren't necessarily thread safe but because of how they operate there isn't a situation where two threads or processes will access the same connection.
Generally speaking, when a PHP script requests a persistent connection, PHP will look for one in the connection pool with the same connection parameters.
If one is found that is NOT being used, it is given to the script, and returned to the pool at the end of the script.
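For example, mysqli opens (or reuses) a persistent connection when the host name is prefixed with "p:"; a minimal sketch, with placeholder credentials:

```php
<?php
// mysqli reuses a persistent connection when the host is prefixed with
// "p:". The connection isn't torn down at request shutdown; it's returned
// to this process's (or thread's) persistent list instead.
$db = new mysqli('p:localhost', 'user', 'pass', 'mydb');

// Used exactly like a normal connection.
$result = $db->query('SELECT CONNECTION_ID()');
var_dump($result->fetch_row()); // same ID across requests served by this process
```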
