I am working on a tool in PHP that processes a lot of data and takes a while to finish. I would like to keep the user updated on what is going on and which task is currently being processed.
What is, in your opinion, the best way to do it? I've got some ideas but can't decide on the most effective one:
The old way: execute a small part of the script and display a page to the user with a Meta Redirect or a JavaScript timer to send a request to continue the script (like /script.php?step=2).
Sending AJAX requests constantly to read a server file that PHP keeps updating through fwrite().
Same as above but PHP updates a field in the database instead of saving a file.
Do any of those sound good? Any ideas?
Thanks!
Rather than writing to a static file you fetch with AJAX or to an extra database field, why not have another PHP script that simply returns a completion percentage for the specified task. Your page can then update the progress via a very lightweight AJAX request to said PHP script.
As for implementing this "progress" script, I could offer more advice if I had more insight as to what you mean by "processes a lot of data". If you are writing to a file, your "progress" script could simply check the file size and return the percentage complete. For more complex tasks, you might assign benchmarks to particular processes and return an estimated percentage complete based on which process has completed last or is currently running.
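As a rough illustration of the file-size idea, here is a minimal sketch of such a "progress" script. The function name, the file path and the expected size are all made up for the example; it assumes you know (or can estimate) the final size of the output file in advance.

```php
<?php
// Hypothetical helper: estimate the progress of a job that writes an output
// file whose final size is known (or estimated) in advance.
function progressFromFileSize(string $path, int $expectedBytes): int
{
    if ($expectedBytes <= 0 || !is_file($path)) {
        return 0;
    }
    clearstatcache(true, $path); // filesize() results are cached within a request
    $percent = (int) floor(filesize($path) / $expectedBytes * 100);
    return max(0, min(100, $percent)); // clamp in case the estimate was too low
}

// Possible body of the lightweight endpoint (path and size are assumptions):
// header('Content-Type: application/json');
// echo json_encode(['percent' => progressFromFileSize('/tmp/export.csv', 1000000)]);
```

The page would then poll this endpoint with AJAX and update a progress bar from the returned percentage.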
UPDATE
This is one suggested method to "check the progress" of an active script which is simply waiting for a response from a request. I have a data mining application that I use a similar method for.
In the script that makes the request you're waiting for (the script whose progress you want to check), store a progress variable for the process, either in a file or in a database. I use a database because I have hundreds of processes running at any time that all need to track their progress, and another script that lets me monitor them. When the process begins, set this variable to 1. You can then pick an arbitrary number of 'checkpoints' the script will pass and calculate the percentage from the current checkpoint.
For a large request, however, you might be more interested in the approximate percentage of the request that has completed. One possible solution is to find out the size of the returned content and set your status variable according to the percentage received at any moment. For example, if you receive the request data in a loop, you could update the status on each iteration, or if you are downloading to a flat file, you could poll the size of the file. A less accurate alternative is to use time rather than file size: if you know approximately how long the request should take, compare that against the script's current execution time. Obviously neither of these is a perfect solution, but I hope they give you some insight into your options.
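The checkpoint and elapsed-time estimates could look roughly like this. Both function names are hypothetical, and capping the time-based estimate at 99% is my own choice, since elapsed time alone cannot prove that the job has finished.

```php
<?php
// Hypothetical helper: convert "checkpoint N of M" into a percentage.
function checkpointPercent(int $current, int $total): int
{
    if ($total <= 0) {
        return 0;
    }
    $current = max(0, min($current, $total)); // keep within 0..$total
    return intdiv($current * 100, $total);
}

// Hypothetical helper: rough estimate from elapsed time versus expected duration.
function timePercent(float $startedAt, float $expectedSeconds, ?float $now = null): int
{
    if ($expectedSeconds <= 0) {
        return 0;
    }
    $now = $now ?? microtime(true);
    $percent = (int) floor(($now - $startedAt) / $expectedSeconds * 100);
    return max(0, min(99, $percent)); // never report 100% from time alone
}
```

The long-running script would store the result of either function in its progress field each time it passes a checkpoint or finishes a loop iteration.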
I suggest using the AJAX method, but not using a file or a database. You could probably use session values or something like that; that way you don't have to create a connection or open a file to do anything.
In the past, I've just written messages out to the page and used flush() to flush the output buffer. Very simple, but it may not work correctly on every web server or with every web browser (as they may do their own internal buffering).
Personally, I like your second option the best. Should be reliable and fairly simple to implement.
I like option 2 - using AJAX to read a status file that PHP writes to periodically. This opens up a lot of different presentation options. If you write a JSON object to the file, you can easily parse it and display things like a progress bar, status messages, etc...
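A minimal sketch of the JSON status file, assuming a file path of your choosing and made-up function names. It writes to a temporary file and renames it, so that a poller should not read a half-written file.

```php
<?php
// Hypothetical writer, called periodically by the long-running script.
function writeStatus(string $statusFile, int $percent, string $message): void
{
    $json = json_encode([
        'percent' => $percent,
        'message' => $message,
        'updated' => time(),
    ]);
    // Write a temporary file first, then rename it over the real one.
    $tmp = $statusFile . '.tmp';
    file_put_contents($tmp, $json, LOCK_EX);
    rename($tmp, $statusFile);
}

// Hypothetical reader, used by the script that answers the AJAX poll.
function readStatus(string $statusFile): array
{
    if (!is_file($statusFile)) {
        return ['percent' => 0, 'message' => 'not started'];
    }
    $data = json_decode((string) file_get_contents($statusFile), true);
    return is_array($data) ? $data : ['percent' => 0, 'message' => 'unreadable'];
}
```

The browser can parse the returned JSON directly and drive a progress bar from 'percent' and a status line from 'message'.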
A 'dirty' but quick-and-easy approach is to just echo out the status as the script runs along. So long as you don't have output buffering on, the browser will render the HTML as it receives it from the server (I know WordPress uses this technique for its auto-upgrade).
But yes, a 'better' approach would be AJAX, though I wouldn't say there's anything wrong with 'breaking it up' using redirects.
Why not incorporate 1 & 2, where AJAX sends a request to script.php?step=1, checks response, writes to the browser, then goes back for more at script.php?step=2 and so on?
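Something along these lines might work for the step-by-step variant. It is only a sketch: the function name, the chunk size and the shape of the response are assumptions, not anything from the question.

```php
<?php
// Hypothetical step handler: process one slice of the work per request and
// tell the caller which step to request next (null when everything is done).
function runStep(array $items, int $step, int $chunkSize, callable $process): array
{
    $step = max(1, $step);
    $offset = ($step - 1) * $chunkSize;
    foreach (array_slice($items, $offset, $chunkSize) as $item) {
        $process($item);
    }
    $total = count($items);
    $processed = min($total, $offset + $chunkSize);
    return [
        'processed' => $processed,
        'total' => $total,
        'next' => $processed >= $total ? null : $step + 1,
    ];
}

// Possible use in script.php (the item list and the work are placeholders):
// echo json_encode(runStep($allFiles, (int) ($_GET['step'] ?? 1), 10, 'processFile'));
```

The JavaScript side keeps requesting script.php?step=N with the returned 'next' value until it comes back as null, updating the page after each response.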
If you can do away with IE, then use server-sent events. It's the ideal solution.
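For reference, a server-sent events stream is just text in a simple "event:" / "data:" format, ended by a blank line. A minimal sketch of the PHP side follows; the function name and the 'progress' event name are made up, and it assumes the web server does not buffer the output.

```php
<?php
// Format one server-sent events message. json_encode() output contains no
// raw newlines by default, so the payload fits on a single "data:" line.
function sseMessage(string $event, array $data): string
{
    return "event: {$event}\n" . 'data: ' . json_encode($data) . "\n\n";
}

// Possible use inside the streaming script:
// header('Content-Type: text/event-stream');
// header('Cache-Control: no-cache');
// for ($i = 1; $i <= 10; $i++) {
//     echo sseMessage('progress', ['percent' => $i * 10]);
//     flush();
//     sleep(1); // stands in for a slice of the real work
// }
```

On the page, an EventSource object pointed at this script receives each message as it is flushed, with no polling needed.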
Related
Simple question, but I can't seem to find the answer.
My php code takes a really long time to process because I'm generating a report from a large database. I coded an html table to display the results in a web page, but the page loads (and gets sent to clients) before my php code finishes because all the table values are empty. I run the query on phpMyAdmin and it works, but it just takes a long time. Ideas? Are there any other ways I can display the report in a table format besides seeing it in a webpage? Can I make the webpage wait until the code finishes?
There are several approaches
one is using
ob_start();
// processing
ob_flush();
flush();
the next is adding pagination, aka limiting the result size.
SELECT * FROM table LIMIT 0,10
SELECT * FROM table LIMIT 10,10
SELECT * FROM table LIMIT 20,10
of course it all depends on your code, without seeing your code there's only guessing what the reason might be
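The LIMIT offsets in the queries above follow from the page number. A small sketch, with a hypothetical function name and a default of 10 rows per page taken from the example queries:

```php
<?php
// Hypothetical helper: build the LIMIT clause for a 1-based page number.
function limitClause(int $page, int $perPage = 10): string
{
    $page = max(1, $page);       // treat anything below 1 as the first page
    $perPage = max(1, $perPage);
    $offset = ($page - 1) * $perPage;
    // Both values are integers, so formatting them with %d is safe here.
    return sprintf('LIMIT %d, %d', $offset, $perPage);
}

// Example: 'SELECT * FROM table ' . limitClause(3) gives the third query above.
```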
Can I make the webpage wait until the code finishes?
It's really, really difficult to write PHP code which implements asynchronous database calls, which is what it would need to do for the PHP script to complete before the MySQL query. Just strip out all the asynchronous handlers in the PHP code and make the MySQL calls blocking, and it will not exit before the queries complete. But I very much doubt that is what your code is really doing.
but the page loads (and gets sent to clients)
This is confused too - if you're generating HTML then the page is loaded after the HTML is sent to the client - not before.
Simple question
No it's not - it's very confused!
The prudent way
One of the correct approaches, at least one I'd recommend, would be to, upon request from the user, add the job to a queue that is handled by a background process, for example, a PHP command line script running from a cron job. While that is going on, you can periodically request job status from the server via an AJAX call from your webpage, display progress, if you can, and present the user with the result once the job is finished. Since command line PHP scripts don't have time limits, you don't have to worry about timeouts.
Another way is what is implemented, for example, in 37signals' Highrise: they take the request and add a job, but display a page saying "It will be ready when it's ready," and when it is ready, they send an email to the user saying "Here's your file, come here and download."
The quick fix
To answer the question "Can I make the webpage wait until the code finishes?": the set_time_limit() function raises PHP's maximum execution time for the script (passing 0 removes the limit), so a slow report is not cut off partway through.
I need to create an event listener. I'm a novice so be kind :)
Basically I am on page1.php (php file); I want inside a loop to go check page2.xml (xml file) for some information which should be received at some point. Either check it all the time, or wait and every 5 minutes or so to see if some information has been received there. Either of them work for me.
If no info has been received after a few minutes, then I want to run again the loop (until it is received), otherwise, move forward and do something with my newly received information. This part I have no problem with, just the event listener itself. I couldn't find the function I should be using anywhere. :( I only need to check and retrieve the content of the xml file every so often.
I am not so sure how I should go about this if there isn't just a function which does this, but I couldn't find much when I searched for "event listener php".
Any help would be appreciated: reference to tutorials/sample code/even just telling me what keywords I should be looking for or what I need to learn first in order to do this.
Thanks!
Well, first you should understand the terminology you're using. PHP is not an event-driven language, it is a request-driven language. A request comes into the web-server, PHP parses it and a response is sent back to the requester. At no point are there events triggered that you can process or handle. You can implement your own "event system" but ultimately this is much more work than what your use-case entails.
Your best bet is likely utilizing AJAX and continuously making requests to your PHP script until you return the data that you are looking for. Ultimately you will need to learn about the XMLHttpRequest JavaScript object. After you understand how to make asynchronous requests utilizing JavaScript you can look at the setInterval() method for how to repeatedly make a request.
Once you can repeatedly make asynchronous requests it should be a relatively simple process of creating a webpage where you can trigger the AJAX requests to be sent.
There is no need for a loop in your PHP code. The loop is effectively done on the other end. Here's a textual workflow that you might follow:
Go to a site designed to trigger your AJAX calls and trigger them.
Make your async request to your PHP script.
Inside your PHP script open up the XML file and check for the necessary content.
Return a response in the form of a JSON object. One response can mean the data wasn't updated, the other response means the data was updated.
Parse the response, if the data was not updated repeat from step (2). If the data was updated continue to step (6).
Display a celebratory greeting that your data was updated or a notice that we are still waiting for the data to be updated. Perhaps you can have the number of tries as well, off to the side.
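Steps 3 and 4 of the workflow above might look like this on the PHP side. The function name and the element name 'result' are placeholders, since the question doesn't say what the XML contains.

```php
<?php
// Hypothetical check: report whether the XML file contains a <result>
// element yet, and return its value if it does.
function checkXml(string $xmlPath): array
{
    if (!is_file($xmlPath)) {
        return ['updated' => false];
    }
    libxml_use_internal_errors(true); // collect parse errors instead of emitting warnings
    $xml = simplexml_load_file($xmlPath);
    if ($xml === false || !isset($xml->result)) {
        return ['updated' => false];
    }
    return ['updated' => true, 'result' => (string) $xml->result];
}

// Possible body of the script the AJAX request hits:
// header('Content-Type: application/json');
// echo json_encode(checkXml(__DIR__ . '/page2.xml'));
```

The JavaScript then only has to look at the 'updated' flag to decide whether to repeat the request or move on.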
I did the following:
Automatically saved to database every time something new came in.
Then I ran a PHP loop that checked every few minutes whether there was something new in the database matching the parameters of this new event (including that it happened within the past few minutes). I used flush(); and then sleep(120); in the loop to keep it running every few minutes until the new info came in, at which point it would break; or die();.
I did something like this when writing an inbox parser in PHP. Your best option is to:
Code page1.php, in which you just need to do 2 things: read the XML from page2.xml and, if there is something "new", execute the data-parsing code.
Set up a cron job (if you're under Linux) to execute every 5 minutes or so (the cron command is something like: php /path/to/page1.php). In the same way, if you're running Windows you can set up a scheduled task and execute the same command. Be aware that the full path to your PHP installation should be in the PATH environment variable.
I have a PHP script something like:
$i = 0;
for (; $i < 500; ++$i) {
    // Do some operation with files numbered 0 to 500
}
The thing is, the script works and displays the end results, but the operation takes a while and watching a blank screen can be frustrating. I was thinking if there is some way I can continuously update the page at the client's end, detailing which file is currently being worked upon. That is, can I display and continuously update what is the current value of $i?
The Solution
Thanks everyone! The output buffering is working as suggested. However, David has offered valuable insight and I am considering that approach as well.
You can buffer and control the output from the PHP script.
However, you may want to consider the scalability of this design. In general, heavy processes shouldn't be done online. Your particular case may be an edge case in that the wait is acceptable, but consider something like this as an alternative for an improved user experience:
The user kicks off a process. This can be as simple as setting a flag on a record in the database or inserting some "to be processed" records into the data.
The user is immediately directed to a page indicating that the process has been queued.
An offline process (either kicked off by the PHP script on the server or scheduled to run regularly) checks the data and does the heavy processing.
In the meantime, the user can refresh the page (manually, by navigating elsewhere and coming back to check, or even use an AJAX polling mechanism to update the page) to check the status of the processing. In this case, it sounds like you'd have several hundred records in a database table queued for processing. As each one finishes, it can be flagged as done. The page can just check how many are left, which one is current, etc. from the data.
When the processing is completed, the page shows the result.
In general this is a better user experience because it doesn't force the user to wait. The user can navigate around the site and check back on progress as desired. Additionally, this approach scales better. If your heavy processing is done directly on the page, what happens when you have many users or the data processing load increases? Will the page start to time out? Will users have to wait longer? By making the process happen outside of the scope of the website you can offload it to better hardware if needed, ensure that records are processed in serial/parallel as business rules demand (avoid race conditions), save processing for off-peak hours, etc.
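To make the queued workflow a bit more concrete, here is a rough sketch using PDO. The table layout and all function names are invented for the example, and it assumes a database reachable through PDO (SQLite here, only because it needs no setup).

```php
<?php
// Assumed schema: one row per item to process, with a 'done' flag.
function createQueue(PDO $db): void
{
    $db->exec('CREATE TABLE IF NOT EXISTS jobs (
        id INTEGER PRIMARY KEY, payload TEXT, done INTEGER DEFAULT 0)');
}

// Called by the web page when the user kicks off the process.
function enqueue(PDO $db, array $payloads): void
{
    $stmt = $db->prepare('INSERT INTO jobs (payload) VALUES (?)');
    foreach ($payloads as $payload) {
        $stmt->execute([$payload]);
    }
}

// Called by the offline worker (CLI or cron): process pending rows and flag them.
function processPending(PDO $db, callable $work, int $batch = 50): int
{
    $select = $db->prepare('SELECT id, payload FROM jobs WHERE done = 0 ORDER BY id LIMIT ?');
    $select->bindValue(1, $batch, PDO::PARAM_INT);
    $select->execute();
    $rows = $select->fetchAll(PDO::FETCH_ASSOC);
    $mark = $db->prepare('UPDATE jobs SET done = 1 WHERE id = ?');
    foreach ($rows as $row) {
        $work($row['payload']);
        $mark->execute([$row['id']]);
    }
    return count($rows);
}

// Called by the status page (or an AJAX poll) to see how much is left.
function queueStatus(PDO $db): array
{
    $total = (int) $db->query('SELECT COUNT(*) FROM jobs')->fetchColumn();
    $done = (int) $db->query('SELECT COUNT(*) FROM jobs WHERE done = 1')->fetchColumn();
    return ['total' => $total, 'done' => $done, 'remaining' => $total - $done];
}
```

With several workers running at once you would also need to claim rows before processing them, which this sketch leaves out.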
Check out PHP's Output Buffering.
Try to use:
flush();
http://php.net/manual/ru/function.flush.php
Try the flush() function. Calling this function forces PHP to send whatever output it has so far to the client, instead of waiting for the script to end.
However, some web servers will only send the output once the entire page is done being built, so calling flush() would have no effect in this case.
Also, browsers themselves buffer input, so you may run into problems there. For example, certain versions of IE won't start displaying the page until 256 bytes have been received.
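A small sketch that combines flush() with padding to get past the browser-side threshold mentioned above. The function names are made up, and whether output really arrives early still depends on the web server and any compression or proxy in between.

```php
<?php
// Hypothetical helper: build a progress line padded with spaces to a minimum
// size, so that a browser waiting for a certain number of bytes starts rendering.
function progressLine(string $message, int $minBytes = 256): string
{
    return str_pad($message . "<br>\n", $minBytes);
}

// Hypothetical helper: send a chunk of output right away.
function sendNow(string $chunk): void
{
    echo $chunk;
    if (ob_get_level() > 0) {
        ob_flush(); // push PHP's own output buffer down one level
    }
    flush();        // ask the SAPI / web server to send it to the client
}

// Possible use in the loop from the question:
// for ($i = 0; $i < 500; ++$i) {
//     sendNow(progressLine("Processing file $i"));
//     // ... do the actual work for file $i ...
// }
```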
OK, I didn't really know how to formulate this question, and especially not the title. But I'll give it a try and hope I'm being specific enough while trying to keep it relevant to others.
If you want to run a PHP script in the background (via AJAX) every X seconds that returns data from a database, what is the best way to do this without using too much of the server's resources?
My solution looks like this:
A user visits a webpage, and every X seconds that page runs some JavaScript. The JavaScript calls a PHP script that queries the database and returns the data to the JavaScript, which then prints it to the page. My fear is that this approach will put a lot of pressure on the server if there are many (10,000) simultaneous visitors on the page. Is there another way to do this?
That sounds like the best way, given the spec/requirement you set out.
Another way is to have an intermediary step. If you are going to have a huge amount of traffic (otherwise this does not introduce any benefit and, on the contrary, may overcomplicate or slow the process), add another table that records the last time a dataset was pulled, and a hard file (say, XML) which, if the 'last time' is deemed too long ago, is recreated from a new query. This XML then feeds the result returned to the user.
So:
1. JavaScript calls PHP script (AJAX)
2. PHP pings DB table which contains last time data was fully output
3. If time is too great, 'main' query is rerun and XML file is regenerated from output
   ELSE skip to 4
4. Fetch the XML file and output as appropriate for returned AJAX
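A simplified sketch of those steps. To keep it short it uses the cache file's own modification time in place of the separate 'last time' table, which is a substitution on my part; the function name and the callback are likewise made up.

```php
<?php
// Hypothetical helper: serve the cached file, regenerating it first if it is
// missing or older than $maxAgeSeconds. $runQuery stands in for the 'main'
// query and must return the new file contents as a string.
function cachedResult(string $cacheFile, int $maxAgeSeconds, callable $runQuery): string
{
    clearstatcache(true, $cacheFile);
    $fresh = is_file($cacheFile)
        && (time() - filemtime($cacheFile)) < $maxAgeSeconds;
    if (!$fresh) {
        // Write to a temporary file and rename it, so that other requests
        // should not read a partially written cache file.
        $tmp = $cacheFile . '.tmp';
        file_put_contents($tmp, $runQuery(), LOCK_EX);
        rename($tmp, $cacheFile);
    }
    return (string) file_get_contents($cacheFile);
}
```

With 10,000 visitors polling, the expensive query then runs roughly once per cache lifetime instead of once per request.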
You can do it the other way, contacting the client just when you need it and wasting less resources.
Comet is the way to go for this option:
Comet is a programming technique that enables web servers to send data to the client without having any need for the client to request it. This technique will produce more responsive applications than classic AJAX. In classic AJAX applications, web browser (client) cannot be notified in real time that the server data model has changed. The user must create a request (for example by clicking on a link) or a periodic AJAX request must happen in order to get new data from the server.
I've been searching for how to do this, but my searches aren't turning up anything about what I'm trying to do. Maybe I'm not searching with the right terms, or maybe this isn't possible, but I figured I would ask here for help. This is what I am trying to do:
I have PHP scripts that are called asynchronously: a script is called and just runs, and the calling PHP doesn't wait for a response, so it can go on to do other things and free things up so another asynchronous PHP process can be run.
I would still like to get a result back from these "zombie" scripts, or whatever you want to call them. However, the only way I can think of that I know for sure will work is to have the "zombie" script save its final output to a database and then have my AJAX UI make periodic requests to check whether the needed value exists where it is supposed to be, which would let it get the output from the zombie PHP script.
I am thinking it would be better if the zombie script could somehow push a sort of page refresh to the AJAX UI, which would intercept it, take the data received from PHP, and use it as needed (such as displaying it in a DIV for the user to see). Basically, I'm wondering whether PHP can force this kind of thing, rather than involving a database and making AJAX do repeated requests to check for a specific value.
Thanks for any advice
No, a background script has no way to influence the client's front-end because it has no connection to it.
Starting a background script, having the script write status data into a shared space - be it a database or a memcache or a similar solution - and polling the status through Ajax is usually indeed the best way to go.
One alternative may be Comet. It's a technique where a connection is kept open over a long time, and updated actively from the server side (instead of frequent client-side Ajax polling). I have no practical experience with this but I imagine it most probably needs server side tweaking to be doable in PHP - it's not the best platform for long-running stuff. See this question for some approaches.
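The simplest Comet-style variant is long polling: the request is held open until the status changes or a timeout passes. A rough sketch follows, with a made-up function name and defaults. Keep in mind that each waiting client ties up a PHP worker for the whole wait.

```php
<?php
// Hypothetical helper: call $check repeatedly until it returns something
// other than null, or until the timeout elapses. Returns null on timeout.
function waitForChange(callable $check, float $timeoutSeconds = 25.0, int $intervalMicros = 500000)
{
    $deadline = microtime(true) + $timeoutSeconds;
    do {
        $value = $check();
        if ($value !== null) {
            return $value;
        }
        usleep($intervalMicros); // pause between checks of the shared space
    } while (microtime(true) < $deadline);
    return null;
}

// Possible use in the endpoint the browser calls ($check would read the
// database or memcache entry written by the background script):
// $result = waitForChange($check);
// echo json_encode(['ready' => $result !== null, 'result' => $result]);
```

The browser simply issues the next request as soon as the previous one returns, whether it returned data or timed out.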