PHP, multithreading and other doubts

Morning,
I have some doubts about the way PHP works. I can't find the answers in any book, so I thought I'd hit the Stack ;)
So here it goes:
Let's assume we have a single server with PHP + Apache installed. Here are my beliefs:
1 - PHP can handle only one request at a time. It doesn't matter whether Apache can run more than one thread at a time, because ultimately the invoked PHP interpreter is single-threaded.
2 - From belief 1 it follows that if the server receives 4 calls at the very same time, these calls are queued up and executed one at a time. Whoever makes the request last gets the response last.
3 - From 1 and 2 it follows that if I cron-call a URL corresponding to a script that does some heavy-lifting/time-consuming work, I slow the server down until the moment the script returns.
What's true? What's false?
cheers

My crystal ball suggests that you are using PHP sessions and you are having simultaneous requests (either iframes or AJAX) getting queued. The problem is that the default session handler uses files, and session_start() locks the data file. You should read your session data quickly and then call session_write_close() to release the file.
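A minimal sketch of that pattern (the session key is just a placeholder):
<?php
session_start();                          // locks the session data file
$userId = $_SESSION['user_id'] ?? null;   // read what you need quickly
session_write_close();                    // release the lock right away

// ...slow work here no longer blocks other requests from the same browser...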

I see no reason why PHP would not be able to handle multiple requests at the same time. That said, it may be semi-true for requests from a single client, depending on the type of script.
Many scripts use sessions. When session_start() is called, the session is opened and locked. When execution of the script ends, the session is closed and unlocked (this can also be done manually). When there are multiple requests for the same session, the first request opens and locks the session, and the second request has to wait until the session is unlocked. This can give the impression that multiple PHP scripts cannot execute at the same time, but that is (partly) true only for requests that use the same session (in other words, requests from the same browser). Requests from two clients (browsers) may be processed in parallel as long as they don't use resources (files, DB tables, etc.) that are locked/unlocked by other requests.

Related

PHP requests one by one or simultaneously

I have got some questions about PHP and how requests work under the hood.
1) Let's say I wrote my PHP application and uploaded it to a server. There's a function that I wrote, and when a user visits the route that executes that function, something happens.
The question is: if one user makes the request and another user also makes the request, does the second user have to wait until the first user's request is done? (By "request is done" I mean until the function I wrote has executed through to the end.) Is this guess correct, i.e. no matter which function gets executed, until the first request is done the second request never starts?
2) Imagine two people make a request at the same time that writes data to the database (not inserting, but updating). Let's say I use load balancers: one user's request goes to balancer1 and another user's request goes to balancer2. What I want is that if the first user's call updates the database, the second user's request stops immediately (it shouldn't update).
The scenario is that I have a JWT token in my database which is used to make requests to a third-party tool. It expires after 1 hour. Let's say 1 hour has passed. If one user makes the call to update the token and, along the way, a second user also makes the call to update the token, what will happen is that the second user will update the token and the first user's token will become invalid, which is bad.
PHP can handle multiple requests at the same time, but requests from the same user will be processed one by one if the user's PHP session is locked by the first request. The second request will proceed once the session is closed.
For example, if you run a PHP script with sleep(30) in one browser tab:
<?php
session_start();
sleep(30);
And another script in another tab:
<?php
session_start();
echo 'hello';
The second script won't be executed until the first one is finished.
It's important to note this behavior because sessions are used in almost every app.
If you have a route which is served by a controller function, there is a separate instantiation of the controller for each request. For example: user A and user B request the same route laravel.com/stackoverflow; the controller is ready to respond to each request, independently of how many users are requesting at the same time. You can think of this as the general principle of processes for any service. Laravel, for example, runs on PHP, and a PHP worker process handles each script execution; similarly, Laravel instantiates the controller for each request.
If the same user sends multiple requests, they will still be processed as described in point 1.
If you want particular requests processed one by one, you can queue the jobs. For example, let's say you want to process a payment and 5 requests come in at once. The controller will accept all the requests simultaneously, but the controller function can dispatch a queued job, and those jobs are processed one by one.
Considering two people requesting the same route which runs a DB update, you can read a nice article here about optimistic and pessimistic locking (a sketch of the optimistic variant follows below).
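A rough sketch of optimistic locking applied to the token scenario from the question (the api_tokens table, its columns, and the refresh step are all hypothetical):
<?php
// Hypothetical sketch: optimistic locking with a version column.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

// Read the current token together with its version number.
$row = $pdo->query("SELECT token, version FROM api_tokens WHERE id = 1")->fetch();

// ...refresh the token against the third-party API (placeholder)...
$newToken = 'freshly-fetched-token';

// Update only if nobody else bumped the version in the meantime.
$stmt = $pdo->prepare("UPDATE api_tokens
                       SET token = ?, version = version + 1
                       WHERE id = 1 AND version = ?");
$stmt->execute([$newToken, $row['version']]);

if ($stmt->rowCount() === 0) {
    // Someone else refreshed first: re-read their token instead of overwriting it.
    $row = $pdo->query("SELECT token FROM api_tokens WHERE id = 1")->fetch();
}
This way the second updater's write is rejected instead of clobbering the first, which is the behaviour asked for in question 2.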
I should be voting to close this - it's way too broad... but I'll give it a go.
If the requests depend on a resource which can only perform one task at a time, then they cannot truly run concurrently. It is quite possible you have a single CPU core or a single disk - however, at the level of the HTTP request (in the absence of code applying mutex locking) they will appear to run at the same time - that is what multi-tasking is all about. The execution thread will often be delayed waiting for something else to happen, and at that point the OS task scheduler will check whether there are any other tasks waiting to run. You can easily test this yourself:
<?php
$started = time();
sleep(20);
print "Ran for " . (time() - $started) . " seconds";
(try accessing this in different browser windows around the same time - or in 2 iframes on the same window)
Compare this with:
<?php
$started = time();
$fh = fopen("/tmp/concurency_test", "w");
flock($fh, LOCK_EX);
sleep(20);
flock($fh, LOCK_UN);
print "Ran for " . (time() - $started) . " seconds";
This also demonstrates just one of the reasons why you should not use flat files for storing data on your server. Note that PHP's default session handler uses file-based locking for as long as the session data is held open by the script.
Databases employ a variety of strategies to avoid falling back to single-operation queuing - most commonly versioning. That does not address the problem you describe, though: 2 clients should never be using the same session token - that is why the session token is separate from the credentials in a well-designed system.

Can't have several threads per session

I am building a webapp and have implemented long polling (and a command queue in my DB) so my server can send commands to my client asynchronously, etc. The commands are encoded as JSON and sent over AJAX calls on the client-to-server path, and via long polling on the server-to-client path.
Everything was working just fine until I included my "Authentication module" in the ajax.php file. This module wraps the session stuff and calls session_start().
The problem is that my long-polling routine can wait up to 21 seconds before coming back to the client. During this time, the server won't run anything else from the same session; anything else is instead executed right after the long-polling AJAX call returns.
I understand there's probably a restriction of only 1 thread per session at a time, and that the requests are queued up.
Now here's the question: what is the best way to address this? Is there a setting to allow several threads per session (3 would be fine, in my case)? Or should I just tell the client what its session ID is (I have a sessions table in my DB to track which user is connected to which session(s))? The client could then send it along with any AJAX calls so the authentication module could be bypassed.
With the latter option, I am afraid it opens up a bunch of security problems because of possible session spoofing. I would need to send a "random string" to each session to make sure you can't spoof too easily, but even then, it's not perfect...
Thanks for your answers :)
Nicolas Gauthier
It's a well known issue/fact that PHP locks session files for the duration of their usage in order to prevent race conditions.
If you take a look at the PHP source code, (ext/session/mod_files.c) you can see that the ps_files_open function locks the session file, and ps_files_close unlocks it.
If you call session_start() right at the beginning of your long-running script, and do not explicitly close the session file, it will be locked until the script terminates, where PHP will release all file locks during script shutdown.
While you are not using the session, you should call session_write_close() to flush the session data to disk and release the lock so that your other "threads" can read the data.
I'm sure you can imagine what would happen if the file was not locked.
T1: Open Session
T2: Open Session
...
T2: Write Data
T1: Write Data
The data written by thread 2 will be completely overwritten by thread 1, and at the same time, any data that thread 1 wanted to write out, was not available to thread 2.
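In the long-polling handler, the fix sketched above might look like this (fetch_pending_commands() is a hypothetical stand-in for the command-queue lookup):
<?php
session_start();
$userId = $_SESSION['user_id'];   // authenticate while the session is locked
session_write_close();            // release the lock before the long wait

$deadline = time() + 21;          // poll for up to 21 seconds
while (time() < $deadline) {
    $commands = fetch_pending_commands($userId);  // hypothetical DB lookup
    if ($commands) {
        echo json_encode($commands);
        exit;
    }
    sleep(1);
}
echo json_encode([]);             // timed out; the client simply re-polls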

Use asynchronous long polling?

I have a Zend-based application that uses long polling. Basically, it makes an HTTP POST request which blocks the application until it either returns or times out after 20 seconds.
I need to make a second request (which currently cannot run in parallel); unfortunately, if the first request hangs, it can take up to 20 seconds (= the timeout) before the second request executes.
What is the best way to make my application asynchronous, or at the very least do non-blocking HTTP request I/O?
If both of your requests use the session (a session_start() call) and you don't close the session in the long-polling script, then the session stays locked for all other scripts using the same session for as long as the long poll runs. These scripts must therefore wait (I think they hang on session_start(), but I'm not sure) for the session to be closed; by default this happens automatically at the end of a script.
So if you don't need the session in the long poll, don't start it, or close it (call session_write_close()) before the code that runs for 20 s in your case (i.e. before the main loop of the long poll).
Hope this helps.
Hmm, maybe you should add some more information to your question.
If the 2 requests aren't related (i.e. the second one doesn't need the first one to be finished), you can perform several queries without waiting for the first one to finish. But of course you cannot do it without some JavaScript.
For example, you could use jQuery's ajax function in asynchronous mode (it is asynchronous by default). You can chain several AJAX calls in jQuery; the second one will not wait for the first one to finish (but be careful with the AJAX timeout settings).
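On the PHP side, if the two HTTP requests really must run side by side within one script, curl_multi is one way to get non-blocking HTTP I/O (the URLs below are placeholders):
<?php
// Sketch: drive two HTTP requests concurrently with curl_multi.
$ch1 = curl_init('http://example.com/long-poll');    // placeholder URL
$ch2 = curl_init('http://example.com/second-call');  // placeholder URL
curl_setopt($ch1, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch2, CURLOPT_RETURNTRANSFER, true);

$mh = curl_multi_init();
curl_multi_add_handle($mh, $ch1);
curl_multi_add_handle($mh, $ch2);

do {
    curl_multi_exec($mh, $running);  // progress both transfers
    curl_multi_select($mh);          // wait for activity instead of spinning
} while ($running > 0);

$first  = curl_multi_getcontent($ch1);
$second = curl_multi_getcontent($ch2);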

Apache not processing the same PHP script concurrently

I created a PHP script to send a file, but before sending the file I need to check a few conditions; one condition is a maximum-access limit: only 5 file downloads may run at a time.
I wrote the PHP script, but Apache seems to process one request at a time, not all of them at the same time.
E.g. if I make 3 requests and put sleep(3) in the PHP file, the first request takes 3 seconds, the second 6 seconds, and the third 9 seconds.
I don't understand much about PHP and Apache.
Can anyone help me?
If you use sessions, the session is locked per request, so the second, third, etc. requests must wait until the first one finishes.
If you expect another request with the same session while a long process is running, you should call
session_write_close()
http://php.net/manual/en/function.session-write-close.php
explicitly. But only if you don't want to write to the session later in the process.
Edit:
If you want to reopen the session later, you can call
session_start() http://hu.php.net/manual/en/function.session-start.php
(before any output).
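Putting the two together, a rough sketch (do_long_running_work() and the session keys are placeholders):
<?php
session_start();                     // read phase: session is locked
$input = $_SESSION['pending_job'];   // placeholder key
session_write_close();               // unlock so other requests can proceed

do_long_running_work($input);        // hypothetical slow function

session_start();                     // reopen to write (before any output!)
$_SESSION['job_done'] = true;        // placeholder key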

php asynchronous call and getting response from the background job

I have done some Google searching on this topic and couldn't find the answer to my question.
What I want to achieve is the following:
the client makes an asynchronous call to a function on the server
the server runs that function in the background (because that function is time-consuming), and the client does not hang in the meantime
the client repeatedly calls the server requesting the status of the background job
Can you please give me some advice on resolving my issue?
Thank you very much! ^-^
You are not specifying what language the asynchronous call is in, but I'm assuming PHP on both ends.
I think the most elegant way would be this:
The HTML page loads and defines a random key for the operation (e.g. using rand() or an already available session ID [be careful, though: the same user could be starting two operations])
The HTML page makes an Ajax call to start_process.php
start_process.php uses exec() to run /path/to/scriptname.php and start the process; see the User Contributed Notes on exec() for suggestions on how to start a process in the background. Which one is right for you depends mainly on your OS.
long_process.php frequently writes its status into a status file named after the random key that your Ajax page generated
The HTML page makes frequent calls to show_status.php, which reads the status file and returns the progress.
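A bare-bones sketch of start_process.php and show_status.php (the paths and status-file location are assumptions):
<?php
// start_process.php - launch the worker in the background and return at once.
$key = preg_replace('/[^a-zA-Z0-9]/', '', $_GET['key']);  // sanitise before shell use
exec("php /path/to/long_process.php $key > /dev/null 2>&1 &");
echo "started";

<?php
// show_status.php - report whatever long_process.php last wrote for this key.
$key = preg_replace('/[^a-zA-Z0-9]/', '', $_GET['key']);
$file = "/tmp/status_$key";
echo file_exists($file) ? file_get_contents($file) : "unknown";

long_process.php would simply write its progress to /tmp/status_$key as it works.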
Have a Google for long-running PHP processes (be warned that there's a lot of bad advice out there on the topic - including the note referred to by Pekka - it will work on Microsoft systems but will fail in unpredictable ways on anything else).
You could develop a service which responds to requests over a socket (your client would use fsockopen to connect) - a simple way of achieving this would be to use Aleksey Zapparov's socket server (http://www.phpclasses.org/browse/package/5758.html), which handles requests coming in via a socket; however, since this runs as a single thread, it may not be appropriate for something which requires a lot of processing. Alternatively, if you are using a non-Microsoft system, you could hang your script off [x]inetd; however, you'll need to do some clever stuff to prevent it terminating when the client disconnects.
To keep the thing running after your client disconnects, the PHP code must run from the standalone PHP executable (not via the webserver). Spawn a process in a new process group (see posix_setsid() and pcntl_fork()), as sketched below. To enable the client to come back and check on progress, the easiest approach is to have the process write its status out to somewhere the client can read.
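A rough sketch of that detach step (this assumes the pcntl and posix extensions, i.e. the CLI SAPI on a non-Windows system):
<?php
// Detach a worker so it survives the client disconnecting.
$pid = pcntl_fork();
if ($pid === -1) {
    die("fork failed\n");
} elseif ($pid > 0) {
    exit(0);  // parent: return immediately (e.g. hand a job id back)
}
// Child: become session leader so we leave the dying process group.
if (posix_setsid() === -1) {
    die("setsid failed\n");
}
// ...long-running work here; write progress somewhere the client can read...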
C.
1. An Ajax call runs the method longRunningMethod() and gets back an identifier (e.g. an id)
2. The server runs the method and sets a key in e.g. shared memory
3. The client calls checkTask(id)
4. The server looks up the key in shared memory and checks for a ready status
[repeat 3 & 4 until 5 is finished]
5. longRunningMethod() finishes and sets its state to finished in shared memory.
All Ajax calls are by definition asynchronous.
You could (though it's not a strictly necessary step) use AJAX to initiate the call, and the script could then create a reference to the status of the background job in shared memory (or even a temporary entry in an SQL table, or a temp file), in the form of a unique job ID.
The script could then kick off your background process and immediately return the job ID to the client.
The client could then call the server repeatedly (via another AJAX interface, for example) to query the status of the job, e.g. "in progress", "complete".
If the background process to be executed is itself written in PHP (e.g. a command line PHP script) then you could pass the job id to it and it could provide meaningful progress updates back to the client (by writing to the same shared memory area, or database table).
If the process to be executed is not itself written in PHP, I suggest wrapping it in a command-line PHP script so that it can monitor when the process being executed has finished running (and check the output to see whether it was successful), and update the status entry for that task appropriately.
Note: using shared memory for this is best practice, but it may not be available if you are on shared hosting, for example. Don't forget you also want a means to clean up old status entries, so I would store "started_on"/"completed_on" timestamp values for each one and have it delete entries for stale data (e.g. those with a completed_on timestamp more than X minutes old - and, ideally, also check for jobs that started some time ago but were never marked as completed, and raise an alert about them).
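For example, if the status entries live in an SQL table (the job_status table and its columns here are hypothetical), the cleanup could be as simple as:
<?php
// Hypothetical cleanup: assumes a job_status table with
// job_id, started_on and completed_on columns.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

// Delete entries that completed more than 10 minutes ago.
$pdo->exec("DELETE FROM job_status
            WHERE completed_on IS NOT NULL
              AND completed_on < NOW() - INTERVAL 10 MINUTE");

// Alert on jobs that started long ago but were never marked complete.
$stale = $pdo->query("SELECT job_id FROM job_status
                      WHERE completed_on IS NULL
                        AND started_on < NOW() - INTERVAL 1 HOUR");
foreach ($stale as $row) {
    error_log("Job {$row['job_id']} started but never completed");
}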
