I am having trouble figuring out why flock() is not behaving properly in the following scenario.
The following code is placed into two different PHP scripts one "test1.php" and the other "test2.php". The point of the code is to create a file which no other process (which properly uses the flock() code) should be able to write to. There will be many different PHP scripts which try to obtain an exclusive lock on this file, but only one should have access at any given time and all the rest should fail gracefully when they fail to get the lock.
The way I am testing this is very simple. Both "test1.php" and "test2.php" are placed in a web accessible directory on my server. Then from a browser such as Firefox, the first script is executed, and then immediately after, the second script is executed from a different browser tab. This seems to work when the code is run from two different PHP scripts such as "test1.php" and "test2.php", but when the same script ("test1.php" or "test2.php") is run twice, the second run does not immediately return with a failure.
The only reason I can think of for this is that flock() treats all PHP processes running the same file name as the same process. If this is the case, then when "test1.php" or "test2.php" is run twice (from two different browser tabs), PHP sees them as the same process and thus does not fail the lock. But to me, it does not make sense for PHP to be designed like that, thus I am here to see if anyone else can solve this problem for me.
Thanks in advance!
<?php
$file = 'command.bat';
echo "Starting script...";
flush();
$handle = fopen($file, 'w+');
echo "Let's try locking...";
flush();
if (is_resource($handle)) {
    echo "good resource...";
    flush();
    if (flock($handle, LOCK_EX | LOCK_NB) === TRUE) {
        echo "Got lock!";
        flush();
        sleep(100);
        flock($handle, LOCK_UN);
    } else {
        echo "Failed to get lock!";
        flush();
    }
    fclose($handle);
} else {
    echo "bad resource...";
    flush();
}
exit;
Any help with the above is greatly appreciated!
Thank you,
Daniel
I had the same situation and found the problem to be with the browser.
When making multiple requests to the same URL, even if doing so across tabs or windows, the browser is "smart" enough to wait until the first request completes, and then the browser attempts to run the subsequent request(s).
So, while it may look like the lock is not working, what is actually happening is that the browser (both Chrome and Firefox) is waiting for the first request to complete before running the second request.
You can verify that this is the case by opening the same URL once in Chrome and once in Firefox. By doing so, as I did, you would probably see that the lock is indeed working as expected.
flock() has a number of restrictions; it can be unreliable on multi-threaded servers, NFS volumes, etc.
The accepted workaround is apparently to attempt to create a link instead.
Lots of discussion on this topic: http://www.php.net/manual/en/function.flock.php
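For reference, here is a minimal sketch of the link-based approach mentioned above. The file names are made up, and link() requires a filesystem that supports hard links; creating the link is atomic, so only one process can succeed:
<?php
// Sketch only: link() fails if the lock name already exists, so at most one
// process acquires the lock at a time. Paths are hypothetical.
$lockTarget = '/tmp/myjob.lock';
$lockSource = '/tmp/myjob.lock.' . getmypid();

touch($lockSource);
if (@link($lockSource, $lockTarget)) {
    // We hold the lock; do the exclusive work here.
    sleep(100);
    unlink($lockTarget);
} else {
    echo "Failed to get lock!";
}
unlink($lockSource);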
I'm trying to find a way in which I can echo out the output of an exec call, and then flush that to the screen while the process is running. I have written a simple PHP script which accepts a file upload and then converts the file if it is not the appropriate file type using FFMPEG. I am doing this on a windows machine. Currently my command looks like so:
$cmd = "ffmpeg.exe -i ..\..\uploads\\".$filename." ..\..\uploads\\".$filename.".m4v 2>&1";
exec( $cmd, $output);
I need something like this:
while( $output ) {
print_r( $output);
ob_flush(); flush();
}
I've read about using ob_flush() and flush() to clear the output buffer, but I only get output once the process has completed. The command works perfectly; it just doesn't update the page while converting. I'd like to have some output so the person knows what's going on.
I've set the time out
set_time_limit( 10 * 60 ); // 10 minute time out
and would be very grateful if someone could point me in the right direction. I've looked at a number of solutions on Stack Overflow which come close, but none seem to have worked.
Since exec() is a blocking call, you have no way of using output buffers to get status.
Instead, you could redirect the output of the system call to a log file and let the client query the server for progress updates; the server can then parse the last lines of the log file to get the current progress and send it back to the client.
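Here is a rough sketch of that idea on Windows, reusing the ffmpeg command and $filename from the question. The log file name and the "start /B" background trick are my assumptions, not part of the original answer:
<?php
// Redirect ffmpeg's output (stdout and stderr) to a log file and detach,
// so the PHP request returns immediately instead of blocking.
$log = '..\\..\\uploads\\convert.log';   // hypothetical log path
$cmd = 'ffmpeg.exe -i ..\\..\\uploads\\' . $filename . ' ..\\..\\uploads\\' . $filename . '.m4v > ' . $log . ' 2>&1';

// "start /B" runs the command in the background on Windows.
pclose(popen('start /B ' . $cmd, 'r'));

echo "Conversion started; poll the log file for progress.";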
exec() is a blocking call, and will NOT return control to PHP until the external program has terminated. That means you cannot do anything to dump the output on a line-by-line basis, because PHP is suspended while the external app is running.
For what you want, you need to use proc_open(), which gives you pipes you can read from in a loop. e.g.
$descriptors = array(1 => array('pipe', 'w'), 2 => array('pipe', 'w'));
$proc = proc_open('.....', $descriptors, $pipes);
// ffmpeg writes its progress to stderr, so either add 2>&1 to the command
// or read from $pipes[2] instead of $pipes[1].
while ($line = fgets($pipes[1])) {
    print($line);
    flush();
}
proc_close($proc);
There are two problems with this approach:
The first is that, as @Marc B notes, exec will block until it's finished. You'll have to devise some way of measuring progress.
The second is that using ob_flush() in this way amounts to holding the connection between server & client open and dribbling the data out a little at a time. This is not something that the HTTP protocol was designed for and while it might work sometimes, it's not going to work consistently - different browsers and different servers will time out differently. The better way to do it is via AJAX calls: using Javascript's setTimeout() function (or setInterval()), make a call to the server periodically and have the server send back a progress report.
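To illustrate the polling side, a minimal sketch of a progress endpoint is below; the file name progress.php and the log path are hypothetical, and the page's JavaScript would call it every few seconds via setInterval():
<?php
// progress.php - returns the last few lines of the conversion log
// written by the background job, so the client can show current status.
$log = '..\\..\\uploads\\convert.log';   // assumed log file name

if (is_readable($log)) {
    $lines = file($log, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    echo implode("<br />", array_slice($lines, -5));
} else {
    echo 'No progress yet.';
}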
I've been looking for a way to prevent running a PHP script simultaneously, and I found a way (on this site) to prevent this. This is what I came up with (test file).
Link to found solution on stackoverflow: How to prevent PHP script running more than once?
test.php
echo "started: ".microtime()."<br>";
$lock = $_SERVER['DOCUMENT_ROOT'].'/tmp/test.lock';
$f = fopen($lock, 'x');
if($f === false){
die("\nCan't aquire lock\n");
}else{
// Do processing
echo "Working: ".microtime()."<br>";
sleep(5);
echo "Still working: ".microtime()."<br>";
sleep(5);
echo "Ready: ".microtime()."<br>";
fclose($f);
unlink($lock);
}
When running this script for the first time, the output will be like this:
started: 0.87157000 1389879936
Working: 0.87532100 1389879936
Still working: 0.87542000 1389879941
Ready: 0.87551800 1389879946
Now when I run the same script in the same browser simultaneously, both will be executed; however, the second one is executed after the first one. So not simultaneously, but twice in sequence. I didn't expect that, because it should die if the test.lock file already exists.
So running the script in the same browser with two tabs, this is the result:
tab1:
started: 0.87157000 1389879936
Working: 0.87532100 1389879936
Still working: 0.87542000 1389879941
Ready: 0.87551800 1389879946
tab2:
started: 0.92684500 1389879946
Working: 0.92911700 1389879946
Still working: 0.92920300 1389879951
Ready: 0.92930400 1389879956
As you can see, the script in the second tab is started when the script in the first tab has finished. Isn't that weird?
When I do this with two different browsers, the script that is started second is terminated, so it works.
browser 1:
started: 0.62890800 1389880056
Working: 0.63861900 1389880056
Still working: 0.63878800 1389880061
Ready: 0.63893300 1389880066
Browser 2:
started: 0.10137700 1389880058
Warning: fopen(/home/users/domain/tmp/test.lock) [function.fopen]: failed to open stream: File exists in /home/users/domain/test.php on line 8
Can't aquire lock
The question
I'm now able to prevent the script from executing simultaneously, but how do I prevent the second request in the same browser from being executed after the first one has finished?
I'm guessing that two tabs in the same browser count as the same client: the browser will use the same session, and the server will answer requests from the same session sequentially. That is why you can be logged into the same service with multiple tabs (e.g. have several tabs open on Stack Overflow).
Requests from a different session (browser) may be processed simultaneously. I guess this depends on your server.
You can't really prevent the script from being executed twice with a simple lock-file. You can only prevent simultaneous execution, as you have demonstrated.
If you wanted to prevent the same client from executing a script too often, you'd need to keep track of the last time they executed the script (possibly in a cookie / database).
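A rough sketch of that idea, assuming a per-session timestamp file and a made-up 60-second window:
<?php
// Hypothetical rate limit: refuse to run again within 60 seconds for the same session.
session_start();
$stampFile = sys_get_temp_dir() . '/last_run_' . session_id();

if (is_file($stampFile) && (time() - (int)file_get_contents($stampFile)) < 60) {
    die("Already executed recently.");
}
file_put_contents($stampFile, time());

// ... do the actual work here ...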
I have a hefty PHP script.
So much so that I have had to do
ini_set('memory_limit', '3000M');
set_time_limit (0);
It runs fine on one server, but on another I get: Out of memory (allocated 1653342208) (tried to allocate 71 bytes) in /home/writeabo/public_html/propturk/feedgenerator/simple_html_dom.php on line 848
Both are on the same package from the same host, but different servers.
Above problem solved; new problem below for bounty
Update: The script is so big because it crawls a site and parses data from 252 pages, including over 60,000 images, which it makes two copies of. I have since broken it down into parts.
I have another problem now, though. When I am writing the images from the outside site to my server like this:
try {
    $imgcont = file_get_contents($va); // $va is an img src from an array of thousands of srcs
    $h = fopen($writeTo,'w');
    fwrite($h,$imgcont);
    fclose($h);
} catch(Exception $e) {
    $error .= (!isset($error)) ? "error with <img src='" . $va . "' />" : "<br/>And <img src='" . $va . "' />";
}
All of a sudden it goes to a 500 internal server error page and I have to run it again, at which point it works, because files are only copied if they don't already exist. Is there any way I can detect the 500 response code and resend the request to the URL to make it go again, as this is all meant to be an automated process?
If this is memory related, I would personally use copy() rather than file_get_contents(). It supports the file wrappers the same way, and I don't see any advantage in loading the whole file in memory just to write it back on the filesystem.
Otherwise, your error_log might give you more information as of why the 500 happens.
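For illustration, a minimal sketch of that substitution, reusing $va and $writeTo from the question:
// Stream the remote image straight to disk instead of buffering it in memory.
if (!@copy($va, $writeTo)) {
    $error = isset($error) ? $error . "<br/>And <img src='" . $va . "' />" : "error with <img src='" . $va . "' />";
}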
There are three parties involved here:
Remote - The server(s) that contain the images you're after
Server - The computer that is running your php script
Client - Your home computer if you are running the script from a web browser, or the same computer as the server if you are running it from Cron.
Is the 500 error you are seeing being generated by 'Remote' and seen by 'Server' (i.e. the images are temporarily unavailable), or is it being generated by 'Server' and seen by 'Client' (i.e. there is a problem with your script)?
If it is being generated by 'Remote', then see Ali's answer for how to retry.
If it is being generated by your script on 'Server', then you need to identify exactly what the error is - the php error logs should give you more information. I can think of two likely causes:
Reaching PHP's time limit. PHP will only spend a certain amount of time working before returning a 500 error. You can set this to a higher value, or regularly re-set the timer with a call to set_time_limit(), but that won't work if your server is configured in safe mode.
Reaching PHP's memory limit. You seem to have encountered this already, but it's worth making sure your script isn't still eating lots of memory. Consider outputting debug data (possibly only if you set $config['debug_mode'] = true or something). I'd suggest:
try {
    echo 'Getting '.$va.'...';
    $imgcont = file_get_contents($va); // $va is an img src from an array of thousands of srcs
    $h = fopen($writeTo,'w');
    fwrite($h,$imgcont);
    fclose($h);
    echo 'saved. Memory usage: '.(memory_get_usage() / (1024 * 1024)).' <br />';
    unset($imgcont);
} catch(Exception $e) {
    $error .= (!isset($error)) ? "error with <img src='" . $va . "' />" : "<br/>And <img src='" . $va . "' />";
}
I've also added a line to remove the image from memory, in case PHP isn't doing this correctly itself (in theory that line shouldn't be necessary).
You can avoid both problems by making your script process fewer images at a time and calling it regularly - either using Cron on the server (the ideal solution, although not all shared webhosts allow this), or some software on your desktop computer. If you do this, make sure you consider what will happen if there are two copies of the script running at the same time - will they both fetch the same image at the same time?
So it sounds like you're running this process via a web browser. I'm guessing that you may be getting the 500 error from Apache timing out after a certain period, or from the process dying, or something similarly funky. I would suggest you do one of the following:
A) Move the image downloading to a background process: you can run the crawl script in the browser, which will write the URLs of the images to be downloaded to the DB or something, and another script will fire up via cron and fetch all the images. You could also have this script work in batches of 100 or so at a time to keep memory consumption down.
B) Call the script directly from the command line (this is really the preferred method for something like this anyway, and you should still probably separate the image fetching into another script).
C) If the command line is not an option for some reason, have your browser-loaded script touch a file, and have a cron job that runs every minute and checks whether the file exists. It then fires up your script; you can have the output written to a file for you to check later, or have it send an email when it's completed. A rough sketch of this follows below.
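The sketch below uses made-up file names, and assumes cron runs the second script every minute:
<?php
// trigger.php - loaded from the browser; just drops a trigger file and returns.
touch(sys_get_temp_dir() . '/start_image_fetch.trigger');
echo "Image fetch queued.";

<?php
// cron_fetch.php - run every minute by cron; does the real work only if the
// trigger file exists, then records completion in a log.
$trigger = sys_get_temp_dir() . '/start_image_fetch.trigger';
if (is_file($trigger)) {
    unlink($trigger);   // consume the trigger so the job runs once
    // ... fetch the images here ...
    file_put_contents(sys_get_temp_dir() . '/fetch.log', "Completed at " . date('c') . "\n", FILE_APPEND);
}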
Is there any way I can detect the 500 response code and resend the request to the URL to make it go again, as this is all meant to be an automated process?
Here's the simple version of how I would do it:
function getImage($va, $writeTo, $retries = 3)
{
    while ($retries > 0) {
        if ($imgcont = file_get_contents($va)) {
            file_put_contents($writeTo, $imgcont);
            return true;
        }
        $retries--;
    }
    return false;
}
This doesn't create the file unless we successfully get our image file, and will retry three times by default. You will of course need to add any required exception handling, error checking, etc.
I would definitely stop using file_get_contents() and write the files in chunks, like this:
$read = fopen($url, 'rb');
$write = fopen($local, 'wb');
$chunk = 8096;
while (!feof($read)) {
    fwrite($write, fread($read, $chunk));
}
fclose($read);
fclose($write);
This will be nicer to your server, and should hopefully solve your 500 problems. As for "catching" a 500 error, this is simply not possible. It is an irretrievable error thrown by your script and written to the client by the web server.
I'm with Swish, this is not really the kind of task that PHP is intended for - you'd be much better off using some other sort of server-side scripting.
Is there any way I can detect the 500 response code and resend the request to the URL to make it go again?
Have you considered using another library? Fetching files from an external server seems to me more like a job for curl or ftp than file_get_contents(), etc. If the error is external, and you're using curl, you can detect the 500 return code and handle it appropriately without crashing. If not, then maybe you should split your program into two files - one of which fetches a single file/image, and the other that uses curl to repeatedly call the first one. Unless the 500 error means that all PHP execution crashes, you would be able to detect the failure and handle it.
Something like this pseudocode:
file1.php:
foreach(list_of_files as filename){
    do {
        x = call_curl('file2.php', filename);
    } while(x == 500);
}
file2.php:
filename=$_GET['filename'];
results = use_curl_to_get_page(filename);
echo results;
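For completeness, a rough PHP version of file1.php using curl; the host name and the $files list are made up:
<?php
// file1.php - retry each file until file2.php stops answering with a 500.
$files = array('img1.jpg', 'img2.jpg');   // hypothetical list of files

foreach ($files as $filename) {
    do {
        $ch = curl_init('http://example.com/file2.php?filename=' . urlencode($filename));
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        $body = curl_exec($ch);
        $code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
        curl_close($ch);
    } while ($code == 500);
}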
Thanks for all your input. I had separated everything by the time I wrote this question, so the crawler fires the image grabber, etc.
I took on board the solution to split the number of images, and that also helped.
I also added a try/catch around the file read.
This was only being called from the browser during testing, but now that it is all up and running it is going to be a cron job.
Thanks Swish and Benubird for your particularly detailed and educational answers. Unfortunately I had no cooperation with the developers on the backend where the images are coming from (long and complicated story).
Anyway, all good now so thanks. (Swish, how do you call a script from the command line? My knowledge of this area is severely lacking.)
I know this is a bit generic, but I'm sure you'll understand my explanation. Here is the situation:
The following code is executed every 10 minutes. Variable "var_x" is always read from / written to an external text file when it is referred to.
if ( var_x != 1 )
{
var_x = 1;
//
// here is where the main body of the script is.
// it can take hours to completely execute.
//
var_x = 0;
}
else
{
// exit script as it's already running.
}
The problem is: if I simulate a hardware failure (do a hard reset when the script is executing) then the main script logic will never execute again because "var_x" will always be "1". (I already have logic to work out the restore point).
Thanks.
You should lock and unlock files with flock:
$fp = fopen($your_file, 'c+');
if (flock($fp, LOCK_EX | LOCK_NB)) // LOCK_NB so we fall through instead of waiting
{
    //
    // here is where the main body of the script is.
    // it can take hours to completely execute.
    //
    flock($fp, LOCK_UN);
}
else
{
    // exit script as it's already running.
}
Edit:
As flock seems not to work correctly on Windows machines, you have to resort to other solutions. Off the top of my head, an idea for a possible solution:
Instead of writing 1 to var_x, write the process ID retrieved via getmypid. When a new instance of the script reads the file, it should then look up whether a process with this ID is running, and whether that process is a PHP script. Of course, this can still go wrong, as there is the possibility of another PHP script obtaining the same PID after a hardware failure, so the solution is far from optimal.
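A rough sketch of that idea on Windows; the lock file name is an assumption, and the tasklist output check is deliberately simplistic:
<?php
// Hypothetical PID-based lock check using tasklist to see whether the
// recorded process is still alive and looks like a PHP process.
$lockFile = 'var_x.txt';   // assumed lock file holding the previous run's PID

$running = false;
if (is_file($lockFile)) {
    $pid = (int)file_get_contents($lockFile);
    $out = shell_exec('tasklist /FI "PID eq ' . $pid . '"');
    $running = ($out !== null && stripos($out, 'php') !== false);
}

if ($running) {
    exit; // another instance appears to be active
}
file_put_contents($lockFile, getmypid());

// ... long-running main body ...

unlink($lockFile);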
Don't you think this would be better solved using file locks? (When the reset occurs file locks are reset as well)
http://php.net/flock
It sounds like you're doing some kind of manual semaphore for process management.
Rather than writing to a file, perhaps you should use an environment variable instead. That way, in the event of failure, your script will not have a closed semaphore when you restore.
How does the server know that I've closed the browser in code like this?
<?php
$i = 0;
while (1) {
    echo "a";
    flush();
    $fp = fopen("$i.txt", "w");
    fclose($fp);
    sleep(1);
    $i++;
}
?>
If i close the browser, the script stops and no more files are created.
This is because when you try to output something, such as echo "a"; flush();, PHP sees that the request has been aborted, and therefore stops the script.
Just a quick note. This only happens when you output something. I'm guessing this is because PHP was primarily used for templating, and designed mainly for outputting content. Well, if the content is not going to go anywhere, why continue processing the script?
If you don't want it to stop. Do one of the following:
Option A: Don't output anything.
flush() and echo are both considered outputs, along with many other functions. PHP only checks to see if a user has aborted when it goes to send content, so not outputting anything will make sure it doesn't check. Although that is probably not as reliable as...
Option B: use ignore_user_abort(true)
This will make sure that the script continues to run even if the user leaves the page. You can then check with connection_aborted() to find out if the connection has been aborted.
You can read all of this on PHP's Connection Handling Documentation.
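Applied to the loop from the question, a minimal sketch of Option B might look like this:
<?php
// Keep running after the browser disconnects, but check for it explicitly.
ignore_user_abort(true);

$i = 0;
while (1) {
    echo "a";
    flush();
    if (connection_aborted()) {
        error_log("Client disconnected after $i files.");
        break;   // we choose to stop here; we could also keep going
    }
    $fp = fopen("$i.txt", "w");
    fclose($fp);
    sleep(1);
    $i++;
}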