I'm using PHP/MySQL, although I think this question is essentially language/db agnostic. I have a PHP script that connects to one API, gets the response data, parses it, and then sends it to a different API for storage in its database. Sometimes this process fails because of an error with one of the APIs. I would therefore like to easily track its success or failure.
I should clarify that "success" in this case is defined as the script getting the data it needs from the first API and successfully having it processed by the second API. Therefore, "failure" could result from three possible things:
First API throws an error
Second API throws an error
My script times out.
This script will run once a day. I'd like to store the success or failure result in a database so that I can easily visit a webpage and see the result. I'm currently thinking of doing the following:
Store the current time in a variable at the start of the script.
Insert that timestamp into the database right away.
Once the script has finished, insert that same timestamp into the database again.
If the script fails, log the reason for failure in the DB.
I'd then gauge success or failure based on whether a single timestamp has two entries in the database, as opposed to just one.
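As a sketch, the approach described above might look like this in PHP; the table, column names, and the two API helper functions are assumptions:

```php
<?php
// Sketch of the timestamp-pair idea. The script_runs table, its columns,
// and fetchFromFirstApi()/sendToSecondApi() are assumed names.
function runDailySync(PDO $pdo): void {
    $runAt = date('Y-m-d H:i:s');

    // 1. Record the start of the run immediately.
    $insert = $pdo->prepare(
        "INSERT INTO script_runs (run_at, stage, reason) VALUES (?, ?, ?)"
    );
    $insert->execute([$runAt, 'start', null]);

    try {
        $data = fetchFromFirstApi();
        sendToSecondApi($data);

        // 2. Same timestamp again: two rows = success, one row = failure.
        $insert->execute([$runAt, 'end', null]);
    } catch (Exception $e) {
        // 3. On failure, log the reason against the 'start' row.
        $pdo->prepare("UPDATE script_runs SET reason = ? WHERE run_at = ? AND stage = 'start'")
            ->execute([$e->getMessage(), $runAt]);
    }
}
```

The success page then only needs to count rows per timestamp.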
Is this the best way to do it, or would something else work better? I don't see any reason why this wouldn't work, but I feel like some recognized standard way of accomplishing this must exist.
A user-declared shutdown function might be an alternative: using register_shutdown_function() you can declare a callback to be executed when the script terminates, whether successfully, user-aborted, or timed out.
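A minimal sketch of that pattern; the logging destination is just an example:

```php
<?php
// Flag the run as unfinished, then clear the flag at the very end.
// The shutdown callback fires whether the script ends normally, is
// aborted, or hits max_execution_time.
$finished = false;

register_shutdown_function(function () use (&$finished) {
    if (!$finished) {
        // Script died before reaching the end: record it somewhere durable.
        error_log('Daily sync terminated abnormally at ' . date('c'));
    }
});

// ... the actual API-to-API work would go here ...

$finished = true; // reached only if nothing above fataled or timed out
```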
You could use a lock file:
at the very beginning of your script, create a lock file somewhere on the filesystem
at the very end of your script, if everything worked, delete it from the filesystem
Then you just have to monitor the directory where you place these files. A lock file's creation date tells you which day's run failed.
You can combine this with a monitoring script that sends alerts if lock files are present with a creation date older than a given interval (say 1 hour, if your script usually runs in a few minutes).
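A minimal sketch, with the lock path as an assumption:

```php
<?php
// Lock-file pattern: the file exists only while a run is in progress
// (or after a failed run). The path is an assumed example.
$lockFile = sys_get_temp_dir() . '/daily-sync.lock';

touch($lockFile);    // created at the very beginning of the run

// ... do the day's work here; an uncaught error skips the unlink below ...

unlink($lockFile);   // removed only when everything succeeded

// A separate monitoring script can then alert on stale lock files:
// if (file_exists($lockFile) && time() - filemtime($lockFile) > 3600) {
//     mail('admin@example.com', 'Daily job stuck',
//          'Lock present since ' . date('c', filemtime($lockFile)));
// }
```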
Related
I have a PHP script to pull user specific data from a 3rd party source and dump it into a local table, which I want to execute every X mins when a user is logged in, but it takes about 30 seconds to run, which I don't want the user to experience. I figured the best way of doing this would be to timestamp each successful pull, then place some code in every page footer that checks the last pull and executes the PHP script in the background if it was more than X minutes ago.
I don't want to use a cron job to do this, as there are global and session variables specific to the user that I need when executing the pull script.
I can see that popen() would allow me to run the PHP script, but I can't seem to find any information relating to whether this would be run as the visitor within their session (and therefore with access to the user specific global or session variables) or as a separate session altogether.
Will popen() solve my problem?
Am I going about this the right way or is there a better method to execute the script at intervals while the user is logged in?
Any help or advice is much appreciated, as always!
Cheers
Andy
One idea might be to store the session data in a table as well.
That way you can easily access it from your background process. You only have to pass session_id() or the user id as an argument so the process knows which user it is currently processing.
No, because PHP still needs to wait on the process started by popen() before it can exit.
Probably not, because the solution doesn't fit the architectural constraints of the system.
Your reasoning for not using a cron job isn't actually sound. You may have given up on exploring that particular path too quickly and painted yourself into a corner, trying to fit a square peg into a round hole.
The real problem you're having is figuring out how to do inter-process communication between the web request and the PHP running from your crontab. This can be solved in a number of ways, and more easily than trying to work against PHP's architectural constraints.
For example, you can store the session ids in your database or some other storage, and access the session files from the PHP script running in your crontab, independently of the web request.
Let's say you determine that a user is logged in based on the last request they made to your server by storing this information in some data store as a timestamp, along with the current session id in the user table.
The cron job can start up every X minutes, look at the database or persistence store for any records that have an active timestamp within the given range, pull those session ids, and do what's needed.
Now the question of how do you actually get this to scale effectively if the processing time takes more than 30 seconds can be solved independently as well. See this answer for more details on how to deal with distributed job managers/queues from PHP.
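A sketch of the cron-side half, assuming a users table with last_seen and session_id columns and a pullThirdPartyData() worker (all assumed names):

```php
<?php
// Cron-side sketch: find recently active users and process each one,
// passing the session/user id explicitly instead of relying on
// web-request session state.
function processActiveUsers(PDO $pdo, int $windowMinutes = 15): void {
    $stmt = $pdo->prepare(
        "SELECT id, session_id FROM users
         WHERE last_seen > (NOW() - INTERVAL ? MINUTE)"
    );
    $stmt->execute([$windowMinutes]);

    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $user) {
        pullThirdPartyData((int) $user['id'], $user['session_id']);
    }
}
```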
I would use Javascript and AJAX requests to call the script from the frontend and handle the result.
You could then set the interval in JavaScript and send an AJAX-Request each interval tick.
AJAX is what you need:
http://api.jquery.com/jquery.ajax/
Then use the "success" callback to do something once those 30 seconds are up.
I have a PHP script which contains many database queries and copies several database tables, and as such it takes quite a long time to complete. The problem I am getting is that it is timing out. However, it appears to complete anyway, which is what is confusing.
The script is supposed to redirect to a view once completed. However, even after extending the time limit to 5 minutes, it gives me the timeout error page. Yet when I check the database, all of the tables have been copied completely, indicating that the script did complete.
Am I missing something easy here? Is there a general reason it would time out as opposed to redirecting to the view? I would post some of the code, but the entire script is approximately 1000 lines of code, so it seems a bit extensive to show here.
Also, I am using CodeIgniter.
Thanks in advance for your help!
It's possible that the PHP script is not timing out, but the browser you're using has given up waiting for any result. If that's the case you'll need to handle the whole thing differently, for example by running the script in the background and reporting periodic updates via AJAX or something.
Think of it this way:
Your browser asks your server for a web page and waits for the results.
Your server runs your PHP script, which then asks MySQL to run a query, and waits for results.
MySQL runs the query and returns a result to PHP.
PHP does some more processing and returns a result to the browser.
At step 3, PHP may have timed out and is no longer there. MySQL didn't know that while it was working, it just did its job and then handed a result back to nothing.
At step 4, the browser may have timed out and dropped the connection. PHP wouldn't know that, so it did its job and then returned a result to nothing.
It's two separate timeouts in this example, but your query was completed either way.
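If it is the browser dropping the connection, one option is to let PHP finish the work regardless and report the result separately; a short sketch:

```php
<?php
// Keep the server-side job alive even when the client has given up
// waiting, and raise PHP's own limit for the long copy. The user can
// then be shown progress via a separate polling request.
ignore_user_abort(true);  // keep running even if the client disconnects
set_time_limit(600);      // raise max_execution_time for this request

// ... long-running table copies would go here ...
```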
I have an MVC framework with a controller in it. The controller downloads images from a server. I need to refresh my database with those images every 5 minutes, so I planned to create a PHP script which downloads the files and persists them to my database. To run this every 5 minutes, I will be setting up a cron job.
Now the question is,
What is the best practice for handling errors inside a PHP script?
Because cron will keep executing every 5 minutes without knowing that the last queried image is already lost and not being saved.
How do I notify myself that something unusual happened and that I need to maintain the DB consistency myself (which I don't mind for a few instances)?
What is the best practice for handling errors inside a PHP script? Because cron will keep executing every 5 minutes without knowing that the last queried image is already lost and not being saved.
Use asserts, as described here: http://php.net/manual/en/function.assert.php
How do I notify myself that something unusual happened and that I need to maintain the DB consistency myself (which I don't mind for a few instances)?
Use mail() inside your asserts.
Use try-catch along with database transactions (if possible). You can dump errors to error_log() and either set that up to generate email or add email to your error handler.
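A sketch of that combination; downloadImage() and the images table are assumed names:

```php
<?php
// Wrap the download + insert in a transaction so a failed run never
// leaves a half-saved row, and log/mail anything unusual.
function fetchAndStoreImage(PDO $pdo): void {
    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    try {
        $pdo->beginTransaction();
        $image = downloadImage(); // assumed helper; should throw on failure
        $pdo->prepare("INSERT INTO images (data, fetched_at) VALUES (?, NOW())")
            ->execute([$image]);
        $pdo->commit();
    } catch (Exception $e) {
        if ($pdo->inTransaction()) {
            $pdo->rollBack();     // undo any partial work
        }
        error_log('Image sync failed: ' . $e->getMessage());
        mail('admin@example.com', 'Image cron failure', $e->getMessage());
    }
}
```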
In addition to the other comments: for cron scripts that could run into problems, or that could take longer than the desired execution interval (so that multiple instances might end up running at once), I have often found it useful to keep a small text file recording the last execution time, success status, etc., which the script can inspect to decide whether it should run as scheduled. It can be as simple as writing a file at script start and deleting it on successful completion, then checking for this file on the next execution to decide whether to run.
I am working in a tool in PHP that processes a lot of data and takes a while to finish. I would like to keep the user updated with what is going on and the current task processed.
What is in your opinion the best way to do it? I've got some ideas but can't decide for the most effective one:
The old way: execute a small part of the script and display a page to the user with a Meta Redirect or a JavaScript timer to send a request to continue the script (like /script.php?step=2).
Sending AJAX requests constantly to read a server file that PHP keeps updating through fwrite().
Same as above but PHP updates a field in the database instead of saving a file.
Does any of those sound good? Any ideas?
Thanks!
Rather than writing to a static file you fetch with AJAX or to an extra database field, why not have another PHP script that simply returns a completion percentage for the specified task. Your page can then update the progress via a very lightweight AJAX request to said PHP script.
As for implementing this "progress" script, I could offer more advice if I had more insight as to what you mean by "processes a lot of data". If you are writing to a file, your "progress" script could simply check the file size and return the percentage complete. For more complex tasks, you might assign benchmarks to particular processes and return an estimated percentage complete based on which process has completed last or is currently running.
UPDATE
This is one suggested method to "check the progress" of an active script which is simply waiting for a response from a request. I have a data mining application that I use a similar method for.
In your script that makes the request you're waiting for (the script whose progress you want to check), store a progress variable for the process, either in a file or a database. I use a database, as I have hundreds of processes running at any time which all need to track their progress, plus another script that lets me monitor them. When the process begins, set the variable to 1.

You can select an arbitrary number of 'checkpoints' the script will pass and calculate the percentage based on the current checkpoint.

For a large request, however, you might be more interested in the approximate percentage of the request that has completed. One possible solution is to know the size of the returned content and set your status variable according to the percentage received at any moment: if you receive the request data in a loop, you can update the status each iteration, or if you are downloading to a flat file you can poll the size of the file. This could be done less accurately with time (rather than file size) if you know the approximate time the request should take to complete and simply compare against the script's current execution time. Obviously neither of these is a perfect solution, but I hope they'll give you some insight into your options.
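As a sketch of the checkpoint idea; the table name, stages, and runStage() are assumptions:

```php
<?php
// The worker records a percentage after each stage; a monitor script
// polls the same row to display progress.
function runWithProgress(PDO $pdo, int $jobId): void {
    $update = $pdo->prepare(
        "UPDATE job_progress SET percent = ? WHERE job_id = ?"
    );

    // Assumed stages mapped to completion percentages.
    $checkpoints = ['fetch' => 25, 'parse' => 50, 'store' => 75, 'done' => 100];
    foreach ($checkpoints as $stage => $percent) {
        runStage($stage);                  // assumed per-stage work
        $update->execute([$percent, $jobId]);
    }
}
```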
I suggest using the AJAX method, but not using a file or a database. You could probably use session values or something like that, that way you don't have to create a connection or open a file to do anything.
In the past, I've just written messages out to the page and used flush() to flush the output buffer. Very simple, but it may not work correctly on every web server or with every web browser (as they may do their own internal buffering).
Personally, I like your second option the best. Should be reliable and fairly simple to implement.
I like option 2 - using AJAX to read a status file that PHP writes to periodically. This opens up a lot of different presentation options. If you write a JSON object to the file, you can easily parse it and display things like a progress bar, status messages, etc...
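A sketch of the writer side, with the file path and JSON fields as assumptions:

```php
<?php
// The long-running job dumps its status as JSON; the AJAX endpoint
// only has to read the file back.
function writeStatus(string $task, int $percent): void {
    file_put_contents(
        sys_get_temp_dir() . '/job-status.json',
        json_encode(['task' => $task, 'percent' => $percent, 'updated' => time()])
    );
}

writeStatus('Copying tables', 40);

// The endpoint the page polls could be as small as:
// header('Content-Type: application/json');
// readfile(sys_get_temp_dir() . '/job-status.json');
```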
A 'dirty' but quick-and-easy approach is to just echo out the status as the script runs along. So long as you don't have output buffering on, the browser will render the HTML as it receives it from the server (I know WordPress uses this technique for its auto-upgrade).
But yes, a 'better' approach would be AJAX, though I wouldn't say there's anything wrong with 'breaking it up' using redirects.
Why not incorporate 1 & 2, where AJAX sends a request to script.php?step=1, checks response, writes to the browser, then goes back for more at script.php?step=2 and so on?
If you can do away with IE, then use server-sent events. It's the ideal solution.
I currently have a class Status, and I call it throughout my code as I perform various tasks. For example, when I upload an image, I call $statusHandler = new Status(121); when I resize it, I call $statusHandler = new Status(122).
Every statusID corresponds to a certain text stored in the database.
Class status retrieves the text, and stores in $_SESSION.
I have another file, getstatus.php, that returns the current status.
My idea was to call getstatus.php every 500 milliseconds with ajax (via jquery), and append the text to the webpage.
This way the user gets (almost) real-time data about what calculations are going on in the background.
The problem is that I only seem to be getting the last status.
I thought that it was just a result of things happening too quickly, so I called sleep after calling new Status. This delayed the entire output of the page, meaning PHP didn't output any text until it had finished running through the code.
Does PHP echo data only after it finishes running through all the code, or does it do it real-time, line-by-line?
If so, can anyone offer any workarounds so that I can achieve what I want?
Thanks.
The default session implementation (flat file) sets a file lock on the session-file when session_start() is called. This lock is not released until the session mechanism shuts down, i.e. when the script has ended or session_write_close() is executed. Another request/thread that wants to access the same session has to wait until this lock has been released.
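Given that lock, a long-running worker should release the session between updates so the polling getstatus.php request can get in; a sketch (the slow tasks are placeholders):

```php
<?php
// Write a status value, then immediately release the session lock so a
// concurrent request can read it; briefly re-acquire to update later.
session_start();
$_SESSION['status'] = 'Uploading image...';
session_write_close();   // release the lock; polling requests aren't blocked

// ... slow upload work would go here ...

session_start();         // re-acquire the session just long enough to update
$_SESSION['status'] = 'Resizing image...';
session_write_close();
```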