I have a PHP script which opens a socket connection (using fsockopen) and takes about 15 seconds to complete and return the result to the browser. Meanwhile, if the browser sends a second request, it is serialized. This gives a bad user experience: if the user clicks 3 times, the third request, which gets sent after 30 seconds, is the one that gets the response -- from the browser's perspective, the first 2 requests are lost.
I do not have any session in my script, but I tried putting session_write_close() at the beginning of it, which didn't help.
Also, session.auto_start in php.ini is 0.
Any ideas as to how to make requests from the same browser run in parallel?
Thanks
Gary
1) Download and install Firefox
2) Download and install Firebug
3) Add a sleep(10) to your PHP script so that it pauses for 10 seconds before returning its response
4) Open up your webpage, and watch the outbound connections with Firebug. You should see several that are open and do not yet have a response. They should all return at about the same time, when each one finishes the 10 second delay.
If you do not see multiple connections open at the same time, and return at approximately the same time, then you need to look at your front end code. AJAX requests are asynchronous and can run in parallel. If you are seeing them run serially instead, then it means you need to fix your JavaScript code, not anything on the server.
Parallel asynchronous Ajax requests using jQuery
If at all possible, you should install Redis (*nix).
To install, it's a simple
make
With LPUSH/BRPOP you can handle this work asynchronously and keep ordering intact. If you spawn a couple of worker processes you can even handle multiple requests simultaneously. The Predis client library is pretty solid.
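A minimal sketch of that LPUSH/BRPOP pattern with Predis (queue name, payload shape, and the Composer autoload path are assumptions; Redis is assumed to be on localhost:6379):

```php
<?php
// Producer (web request): queue the slow job instead of doing it inline.
require 'vendor/autoload.php';   // assumed Composer autoload for Predis

$redis = new Predis\Client();    // defaults to 127.0.0.1:6379
$job   = json_encode(['user' => 42, 'action' => 'fetch_report']);
$redis->lpush('jobs', $job);     // LPUSH + BRPOP together give FIFO order

// Worker (separate CLI process): block until a job arrives.
while (true) {
    // BRPOP blocks up to 5 seconds, then returns null if the list is empty
    $item = $redis->brpop('jobs', 5);
    if ($item !== null) {
        $job = json_decode($item[1], true);   // $item is [key, value]
        // ... process $job here ...
    }
}
```

Running several copies of the worker loop is what gives you simultaneous handling of multiple requests.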
Related
Suppose a page takes a long time to generate (some large report, for example) and the user closes the browser, or maybe presses refresh. Does the PHP engine stop generating the page for the original request?
And if not, what can one do to cope with users repeatedly refreshing a page that triggers an expensive report?
I have tried this, and it seems it does not stop a running query on the database. But that could be a database engine issue, not PHP.
Extra info:
IIS7
MS SQL Server via ODBC
When you send a request to the server, it is executed on the server without any communication with the browser until information is sent back. When PHP tries to send data to a browser that has gone away, the send fails and the script exits.
However, if you have a lot of code executing before any output is sent, that code will continue to run until output is attempted and the failed write is detected.
PHP only notices a closed connection when it tries to output some data (and fails): echo, print, flush, etc. Aside from that, no, it doesn't know; everything else is happening on the server end.
There is little in the way of passing browser-state information back once a request has been made (or, in your case, is in progress).
To know if a user is still connected to your site, you would need to implement long polling / Comet, or perhaps a WebSocket.
Alternatively, you may want to initiate the long query via an Ajax call while keeping the main browser window responsive (not white-screened). This lets you detect the browser closing during the long query with a JavaScript onbeforeunload handler that notifies your backend that the user has left. (I'm not sure how you would interrupt a query in progress from another HTTP request, though.)
PHP has two functions to control this. set_time_limit(num) raises the limit before a page's execution "dies"; if you don't raise that limit, a page running "too long" is killed, which is bad for a long process. You also need ignore_user_abort(TRUE) so the server doesn't stop the PHP process when it detects that the page has been closed on the client side.
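Together, at the top of the report script, that looks like this (the report query itself is a placeholder):

```php
<?php
// Keep the report generator alive even if the user closes the tab.
ignore_user_abort(true);   // don't kill the script on client disconnect
set_time_limit(0);         // 0 = no execution time limit

// ... run the expensive report query here ...
```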
You may also need to check for memory leaks if you are writing something that uses a lot of memory and runs for several hours.
When you send a request, the server goes away and performs the appropriate actions. IIS/SQL Server does not know whether the browser has been closed (nor is it IIS/SQL Server's responsibility to know), so it will execute the commands, as instructed by the PHP engine, until it finishes or the engine kills the transaction. Since your report could be dynamic, IIS will not cache the page requests; SQL Server, however, can cache recently run queries, so you may see some performance gain on the database backend.
On executing two very simple Ajax POST requests (successively), the Apache server seems to always respond in the order they were requested, although the second request takes significantly less time to process than the first.
The time it takes the server to process Request1 is 30 seconds.
The time it takes the server to process Request2 is 10 seconds.
var deferred1 = dojo.xhrPost(xhrArgs1);
var deferred2 = dojo.xhrPost(xhrArgs2);
I expect Apache to achieve some "parallelization" on my dual core machine, which is obviously not happening.
When I execute the requests at the same time from two separate browsers, it works fine: Request2 is returned first.
Facts:
httpd.conf has: ThreadsPerChild 50, MaxRequestsPerChild 50
PHP version : 5.2.5
Apache's access log states that both client requests are received at about the same time, which is as expected.
The PHP code on the server side is something as simple as sleep(30)/sleep(10).
Any idea about why I don't get the "parallelization" when run from the same browser?
Thanks
When your two requests are sent from the same browser, they both share the same session.
When sessions are stored in files (the default), a locking mechanism ensures that two scripts will not use the same session at the same time; allowing that could result in the session data written by the first script being overwritten by the second.
That's why your second script doesn't start before the first one is finished: it's waiting for the lock on the session data (held by the first script) to be released.
For more information, take a look at the manual page of session_write_close(), which might be the solution to your problem: close the session before the sleep(). Quoting:
Session data is usually stored after your script terminated without the need to call session_write_close(), but as session data is locked to prevent concurrent writes only one script may operate on a session at any time. When using framesets together with sessions you will experience the frames loading one by one due to this locking. You can reduce the time needed to load all the frames by ending the session as soon as all changes to session variables are done.
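Applied to the sleep(30)/sleep(10) test scripts, that would look something like this (the session key read is just an illustration; any reads can happen before the close):

```php
<?php
session_start();
// Read whatever session data this request needs first...
$userId = isset($_SESSION['user_id']) ? $_SESSION['user_id'] : null;

// Release the session file lock so parallel requests from the same
// browser are no longer serialized. After this call, changes to
// $_SESSION are no longer persisted.
session_write_close();

sleep(30);   // the long-running work no longer blocks other requests
```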
Browsers typically have a limit of two connections to the same site (although you may be able to raise that limit in some browsers). Some browsers keep one connection for downloading things like images and another connection for XHR, which means your two XHR calls may actually go out over the same connection, one after the other.
Your browser returns immediately from each XHR call because they are asynchronous, but internally it may just queue up the requests.
When you run on two different browsers, obviously they each have the two connections, so the two XHR requests go out in different connections. No problem here.
Now it depends on the browser. If the browser allows you to occupy both connections with XHR calls, then you can get up to two requests running simultaneously. Then it will be up to the server which one to do first.
At any rate, if you try three (or any number > 2) XHR requests simultaneously, you will not get more than 2 executing on the server at the same time in modern browsers.
I created a script that gets data from some web services and our database, formats a report, then zips it and makes it available for download. When I first started I made it a command line script to see the output as it came out and to get around the script timeout limit you get when viewing in a browser. But because I don't want my user to have to use it from the command line or have to run php on their computer, I want to make this run from our webserver instead.
Because this script could take minutes to run, I need a way to let it process in the background and then start the download once the file has been created successfully. What's the best way to let this script run without triggering the timeout? I've attempted this before (using backticks to run the script separately, and so on) but gave up, so I'm asking here. Ideally, the user would click the submit button on the form to start the request, then be returned to the page instead of staring at a blank browser window. When the zip file exists (meaning the process has finished), it should notify them (via AJAX? a reloaded page? I don't know yet).
This is on windows server 2007.
You should run it in a different process. Make a daemon that runs continuously, hits the database, and looks for a flag such as "ShouldProcessData". When the website is hit, switch the flag to true. Your daemon process will see the flag on its next iteration and begin the processing, then stick the results into the database. Use the database as the communication mechanism between the website and the long-running process.
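A rough sketch of that daemon loop; the DSN, table name, and column names are all placeholders for illustration (the sqlsrv PDO driver is assumed since this is on Windows with SQL Server):

```php
<?php
// Daemon: run continuously from the command line (e.g. as a scheduled task).
$db = new PDO('sqlsrv:Server=localhost;Database=reports', 'user', 'pass');

while (true) {
    // Look for a pending job flagged by the website.
    $row = $db->query("SELECT id FROM jobs WHERE status = 'pending'")->fetch();
    if ($row) {
        // ... build the report, zip it, store the file path ...
        $upd = $db->prepare("UPDATE jobs SET status = 'done' WHERE id = ?");
        $upd->execute(array($row['id']));
    }
    sleep(5);   // poll interval
}
```

The website then only ever INSERTs a pending row and SELECTs its status, so no web request is ever long-running.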
In PHP you have to say what time-out you want for your process.
See the PHP manual for set_time_limit().
You may have another problem: the browser's own time-out (often around 1-2 minutes). While that time-out should be changeable within the browser (per browser), you can usually prevent it from being triggered on the user's side by sending some data to the browser every 20 seconds or so (like the download header; you can then send other headers, like encoding, etc.).
Gearman is very handy for this (create a background task, let JavaScript poll for progress). It does of course require having Gearman installed and workers created. See: http://www.php.net/gearman
Why don't you make an Ajax call from the page where you want to offer the download, then just wait for the Ajax call to return, with set_time_limit(0) on the other page?
I have a PHP script that is kicked off via ajax. This PHP script uses exec() to run a separate PHP script via the shell.
The script that is called via exec() may take 30 seconds or so to complete. I need to update the UI once it is finished.
Which of these options is preferred?
a) Leave the HTTP connection open for the 30 seconds and wait for it to finish.
b) Have exec() run the PHP script in the background and then use ajax polling to check for completion (every 5 seconds or so).
c) Something else that I haven't thought of.
Thank you, Brian
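For concreteness, the background launch in option b) might look like this (the worker path is a placeholder, and the redirection trick assumes a Unix-style shell; Windows needs a different incantation such as start /B):

```php
<?php
// Launch the worker script in the background so the HTTP request
// returns immediately. Redirecting output and appending '&' stops
// exec() from waiting for the child process to finish.
exec('php /path/to/worker.php > /dev/null 2>&1 &');
echo json_encode(array('status' => 'started'));
```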
Poll the server for updates every few seconds. When you leave connections open for that long a period, there's always the possibility that they will be dropped by the server or the browser (browsers time out if an HTTP request takes too long).
Option b) feels a little too stateful to me. Does the server need to receive a request once the 30 seconds are done, else it gets into a bad state? (like it doesn't relinquish resources or something of the like) If so, definitely go with a) methinks.
As for c), maybe you'll find something on the AJAX Patterns web site under Browser-Server Dialog.
The AJAX option seems good to me. One alternative is the Comet (Ajax push) style, to minimize required traffic: the server sends a signal to the client (browser) when it has something to say (update the UI).
a)
This could have problems with timeouts, and it ties up server connections (you usually set a limit on how many connections the server accepts). You could exhaust the server if many users pile up requests. In an HTTP environment I would keep connections open only as long as necessary.
b)
If it takes 30 seconds, I would not poll as often as every second; I would increase the polling interval. Is the execution time always 30 seconds? Example polling style (payload is JSON):
# trigger job/execution
POST /job
=> response gives 301 redirect to /jobs/{job-id}
# polling
GET /jobs/{job-id}
=> {status:busy}
or
=> {status:completed,result:...}
But in the end it depends on the problem. I like b) more, but it takes more effort to implement. Maybe you have more details? Is it a high-traffic scenario?
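The GET /jobs/{job-id} endpoint from the flow above could be sketched like this (the status-file convention is an assumption; a database row would work just as well):

```php
<?php
// status.php?id=...  -- polled by the client every few seconds.
// Assumes the background worker writes its result to /tmp/job_<id>.status
// when it finishes; until then the file does not exist.
$id   = preg_replace('/[^a-z0-9]/i', '', $_GET['id']);  // sanitize
$file = "/tmp/job_$id.status";

header('Content-Type: application/json');
if (!file_exists($file)) {
    echo json_encode(array('status' => 'busy'));
} else {
    echo json_encode(array(
        'status' => 'completed',
        'result' => file_get_contents($file),
    ));
}
```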
I've written a PHP script that takes a long time to execute [image processing for thousands of pictures]. It's a matter of hours, maybe 5.
After 15 minutes of processing, I get the error:
ERROR
The requested URL could not be retrieved
The following error was encountered while trying to retrieve the URL: The URL which I clicked
Read Timeout
The system returned: [No Error]
A Timeout occurred while waiting to read data from the network. The network or server may be down or congested. Please retry your request.
Your cache administrator is webmaster.
What I need is to enable that script to run for much longer.
Now, here are all the technical info:
I'm writing in PHP and using the Zend Framework. I'm using Firefox. The long script runs after clicking a link. Obviously, since the script is not finished, I still see the web page the link was on, and the browser shows "waiting for ...".
After 15 minutes the error occurs.
I tried making changes to Firefox through about:config, but without any success. I don't know; the changes might be needed somewhere else.
So, any ideas?
Thanks ahead.
set_time_limit(0) will only affect the server-side running of the script. The error you're receiving is purely browser-side. You have to send SOMETHING to keep the browser from deciding the connection is dead; even a single character of output (followed by a flush() to make sure it actually gets sent out over the wire) will do. Maybe once per processed image, or on a fixed time interval (if the last character was sent more than 5 minutes ago, output another one).
If you don't want any intermediate output, you could do ignore_user_abort(TRUE), which will allow the script to keep running even if the connection gets shut down from the client side.
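A sketch of that keep-alive pattern ($images and process_image() are hypothetical stand-ins for your own image list and per-image routine):

```php
<?php
set_time_limit(0);        // no server-side execution limit
ignore_user_abort(true);  // survive a client disconnect anyway

$lastPing = time();
foreach ($images as $image) {   // $images: your list of files to process
    process_image($image);      // hypothetical per-image routine

    // Send one byte every 60s so the browser (or an intermediate
    // proxy) doesn't time out the read while we keep working.
    if (time() - $lastPing > 60) {
        echo ' ';
        flush();
        $lastPing = time();
    }
}
```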
If the process runs for hours, you should probably look into batch processing. Store a request for image processing (in a file, database, or whatever works for you) instead of starting the processing directly. A scheduled (cron) process running on the server then picks up the request and does the actual work (this can be a PHP script that calls set_time_limit(0)). When processing is finished, you can notify the user (by mail, or any other way that works for you).
Use set_time_limit().
documentation here
http://nl.php.net/manual/en/function.set-time-limit.php
If you can split your work into batches, then after processing X images display a page with some JavaScript (or a META redirect) on it that opens the link http://server/controller/action/nextbatch/next_batch_id.
Rinse and repeat.
Batching the entire process also has the added benefit that if something goes wrong, you don't have to start the whole thing over.
If you're running on your own server and can get out of safe_mode, you could also fork background processes to do the actual heavy lifting, independent of your browser's view of things. In a multicore or multiprocessor environment, you can even schedule more than one process at a time.
We've done something like that for large computation scripts; synchronization of the processes happened over a shared database. Luckily, the processes were so independent that the only thing we needed to track was their completion or termination.