Is there a way for me to check whether a file has been copied before continuing to execute a PHP loop?
I have a for loop, and within the loop it is going to copy a file. Now, I want it to wait until the current file is copied before continuing the loop.
example:
for ($i = 1; $i <= 10; $i++)
{
$temp = $_FILES['tmp_name'];
$extension = '.jpg';
copy("$temp_$i_$extension", "$local_$i_$extension");
// not sure what to do here
if (FILE_DONE_COPYING())
{
CONTINUE_LOOP();
}
else
{
PAUSE_LOOP();
}
}
That's just an example. I have no clue how to do this... can anyone chime in?
That's what copy() does in PHP - it blocks until the file is copied. There's nothing you need to do, except checking the return value to see if the operation was successful.
PHP executes it line by line, step by step, so it waits until copy() has completed:
for ($i = 1; $i <= 10; $i++)
{
    $temp = $_FILES['tmp_name']; // note: $_FILES normally needs a field name, e.g. $_FILES['upload']['tmp_name']
    $extension = '.jpg';
    $result = copy("{$temp}_{$i}_{$extension}", "{$local}_{$i}_{$extension}");
    if ($result) {
        // done
    }
    else {
        // failed
    }
}
copy returns true on success and false on failure. Check for that.
Unless you go through the trouble of using threading and have copy fired asynchronously, PHP will not move to the line after copy until after it has completed.
copy() does wait for completion before continuing execution. It is a synchronous call. But it can return false if it didn't work, and your copy won't work, since $temp_ and $i_ are not defined variables. So maybe you are thinking the copy isn't finishing, when it actually just isn't working at all.
You should use:
copy("{$temp}_{$i}_$extension", "{$local}_{$i}_$extension");
OR
copy($temp.'_'.$i.'_'.$extension, $local.'_'.$i.'_'.$extension);
What makes you think that copy() will return before it has finished?
You could of course compare the filesize of the original file and the copy to be sure the process is complete.
You could use a while loop with sleep() calls to delay checking, and just exit the while loop once the file exists under the new name.
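If you really wanted to poll like that (normally unnecessary, since copy() blocks), a minimal sketch might look like this; the paths are hypothetical and the ten-attempt cap is an arbitrary choice:
$src = '/path/to/source.jpg';
$dst = '/path/to/copy.jpg';

copy($src, $dst);

// Poll until the destination exists and matches the source size.
// clearstatcache() matters because PHP caches filesize() results.
$attempts = 0;
do {
    clearstatcache();
    $done = file_exists($dst) && filesize($dst) === filesize($src);
    if (!$done) {
        sleep(1);
    }
} while (!$done && ++$attempts < 10);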
I know this is an ancient question but I feel I really need to talk about this problem. copy() is a great command - BUT - it does not work all of the time. I can honestly tell you this. Why? Simple - the operating system is at fault. Here are two examples: one using a standard disk drive and one using a RAM disk.
copy() reads CHUNKS of a file and writes these chunks out to the destination. This means it really does NOT just do a file_get_contents(), but instead does the equivalent of fopen(IN), fopen(OUT), while (!EOF(IN)) { fread(IN); fwrite(OUT); }, and then fclose(IN) and fclose(OUT). These commands try to make sure everything goes OK, but if the disk drive buffers what it does, then the file might take a second or two to finish. This can be seen by calling file_exists() on the output file's name: it can come back FALSE (it is NOT there), because the disk drive's hardware has not caught up with the commands.
I even installed the AMD RamDisk software and ran a program using the above commands (both file_get_contents -> file_put_contents and the fopen/fread/fwrite/fclose sequence). The same thing happened there too. Every now and then (not always) the file_exists() function returned FALSE, because the test got there before the file had finished being created. Don't ask me why - it just would do this.
So what do I suggest? Use the sleep() command - maybe three seconds, so sleep(3); - after the copy() command. I also determined that a chmod($destination, 0777); was a good idea, again with a sleep() after it so the change has time to apply (which is probably closer to one second).
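Concretely, and only as an illustration of what this answer recommends ($source and $destination are placeholder paths, and the three-second figure is the author's rule of thumb, not a hard number):
copy($source, $destination);
sleep(3);                   // give the drive's buffers time to flush
chmod($destination, 0777);
sleep(1);                   // give the permission change time to apply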
Now, remember - everyone's hardware is different, so some hardware might work better or faster than mine. This is one of those things: try it if you are having problems, or don't if you are not. It is that simple. It is happening to me, I'm using it, and it works now that the system gets three seconds to breathe - but it might not do anything for you, who has an atomic-powered Willy-Wonka mobile which does the impossible before breakfast.
Got it? Good. :-)
Related
I have an issue with my non-unique hit counter.
The script is as below:
$filename = 'counter.txt';
if (file_exists($filename)) {
$current_value = file_get_contents($filename);
} else {
$current_value = 0;
}
$current_value++;
file_put_contents($filename, $current_value);
When I'm refreshing my website very often (like 10 times per second or even faster), the value in the text file gets reset to 0.
Any guess for fixing this issue?
This is a pretty poor way to maintain a counter, but your problem is probably that when you fire multiple requests at the site, one of the reads catches the file at the moment another process has truncated it for rewriting, so it gets back an empty value and the count restarts from 0.
If you want this to work consistently, you are going to have to lock the file between read and write - see flock() in the PHP manual.
Of course, without the file lock you would also be getting incorrect counts anyway, when two processes manage to read the same value from the file.
Locking the file would also potentially slow your system down as 2 or more processes queue for access to the file.
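For illustration, a minimal sketch of the flock() approach described above; the 'c+' mode creates counter.txt if it is missing without truncating an existing file:
$fp = fopen('counter.txt', 'c+');
if ($fp && flock($fp, LOCK_EX)) {   // blocks until we hold the lock
    $current_value = (int) stream_get_contents($fp);
    ftruncate($fp, 0);              // empty the file before rewriting
    rewind($fp);
    fwrite($fp, (string) ($current_value + 1));
    flock($fp, LOCK_UN);
    fclose($fp);
}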
It would probably be a better idea to store your counter in a database, as they are designed for coping with this kind of quick fire access and ensuring every process is properly queued and released.
Does it help if you add a check that file_get_contents() isn't returning false?
$value = file_get_contents($filename);
if ($value !== false)
{
    $current_value = $value;
}
I've been completely unsuccessful finding an answer to this question. Hopefully someone here can help.
I have a PHP script (a WordPress template, to be specific) that automatically imports and processes images when a user hits it. The problem is that the image processing takes up a lot of memory, particularly if multiple users are accessing the template at the same time and initiating the image processing. My server crashed multiple times because of this.
My solution to this was to not execute the image-processing function if it was already running. Before the function started running, I would check a database entry named image_import_running to see if it was set to false. If it was, the function then ran. The very first thing the function did was set image_import_running to true. Then, after it was all finished, I set it back to false.
It worked great -- in theory. The site hasn't crashed since, I can tell you that. But there are two major problems with it:
If the user closes the page while it's loading, the script never finishes processing the images and therefore never sets image_import_running back to false. The template will never process images again until it's manually set to false.
If the script times out while it's processing images -- and that's a strong possibility if there are many images in the queue -- you have essentially the same problem as No. 1: the script never gets to the point where it sets image_import_running back to false.
To handle No. 1 (the first of the two problems I noticed), I added ignore_user_abort(true) to the script. Did it work? I don't know, because No. 2 is still an issue. That's where I'm stumped.
If I could ask the server whether the script was running or not, I could do something like this:
if($import_running && $script_not_running) {
$import_running = false;
}
But how do I set that $script_not_running variable? Beats me.
I've shared this entire story with you just in case you have some other brilliant solution.
Try using
ignore_user_abort(true); - it will continue to run even if the person leaves and closes the browser.
You might also want to put a number instead of true/false in the DB record, and set a maximum number of processes that can run together.
As others have suggested, it would be best to move the image processing out of the request itself.
As an interim "fix", store a timestamp alongside image_import_running when a processing job begins (e.g., image_import_commenced). This is a very crude mechanism, but if you know the maximum time that a job can run before timing out, the script can check whether that period of time has elapsed.
e.g., if image_import_running is still true but the current time is more than 10 minutes since image_import_commenced, run the processing anyway.
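A rough sketch of that, using WordPress's get_option()/update_option() since this is a WordPress template; the option names come from the question, the 600-second (10-minute) timeout matches the example above, and the process_images() name is an assumption:
$running   = get_option('image_import_running', false);
$commenced = (int) get_option('image_import_commenced', 0);

// Run if no job is marked as running, or if the marker looks stale.
if (!$running || (time() - $commenced) > 600) {
    update_option('image_import_running', true);
    update_option('image_import_commenced', time());

    process_images();   // hypothetical name for the OP's processing function

    update_option('image_import_running', false);
}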
What about setting a transient with an expiry time that would throttle the operation?
if(!get_transient( 'import_running' )) {
set_transient( 'import_running', true, 30 ); // set a 30 second transient on the import.
run_the_import_function();
}
I would rather store the job in a database, flag it as pending, and set a cron job to execute the processing one job at a time.
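A rough sketch of that idea, assuming a hypothetical jobs table with id, url and status columns, run from a cron-invoked script (the credentials and process_images() are placeholders):
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass');

// Take one pending job at a time.
$job = $pdo->query("SELECT id, url FROM jobs WHERE status = 'pending' LIMIT 1")->fetch();
if ($job) {
    $pdo->prepare("UPDATE jobs SET status = 'processing' WHERE id = ?")
        ->execute([$job['id']]);

    process_images($job['url']);   // hypothetical processing function

    $pdo->prepare("UPDATE jobs SET status = 'done' WHERE id = ?")
        ->execute([$job['id']]);
}
A real implementation would claim the row atomically (for example with a single UPDATE ... WHERE status = 'pending' ... LIMIT 1) so two overlapping cron runs cannot grab the same job.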
For me, I just use this simple idea with a text file, for example a run.txt file.
At the top of the script use:
if (file_get_contents('run.txt') != 'run') { // here the script will work
    $file = fopen('run.txt', 'w+');
    fwrite($file, 'run');
    fclose($file);
} else {
    exit(); // if it finds 'run' in run.txt the script will stop
}
And add this at the end of your script file:
$file = fopen('run.txt', 'w+');
fwrite($file, ''); // clears the 'run' word for the next try ;)
fclose($file);
That will check whether the script is already working by checking run.txt's contents: if the word 'run' exists in run.txt, it will not run.
Running a cron would definitely be a better solution. The idea of storing the URL in a table is a good one.
To answer the original question, you may run a ps auxwww command with exec() (check this page: How to get list of running php scripts using PHP exec()?) and move your function into a separate PHP file.
exec("ps auxwww|grep myfunction.php|grep -v grep", $output);
Just add following on the top of your script.
<?php
// Ensures single instance of script run at a time.
$fileName = basename(__FILE__);
$output = (int) shell_exec("ps -ef | grep -v grep | grep $fileName | wc -l");
//echo $output;
if ($output > 2)
{
echo "Already running - $fileName\n";
exit;
}
// Your php script code.
?>
Given a simple piece of code like:
$file = 'hugefile.jpg';
$bckp_file = 'hugeimage-backup.jpg';
copy($file, $bckp_file);
// here comes some manipulation on $bckp_file.
The assumed problem is that if the file is big or huge - let's say a JPG - one would think that it will take the server some time to copy it (by time I mean even a few milliseconds), but one would also assume that the execution of the next line would be much faster.
So in theory I could end up with a "no such file or directory" error when trying to manipulate a file that has not yet been created - or worse, start to manipulate a TRUNCATED file.
My question is: how can I ensure that $bckp_file was created (or in this case, copied) successfully before the NEXT line, which manipulates it?
What are my options to "pause" or "delay" the next line's execution until the file creation/copy is complete?
Right now I can only think of something like:
if (!copy($file, $bckp_file)) {
echo "failed to copy $file...\n";
}
which will only alert me but will not resolve anything (same as having the PHP error)
or
if (copy($file, $bckp_file)) {
// move the manipulation to here ..
}
But this is also not so valid - because let's say the copy was not executed - I will just skip the block without achieving my goal and without errors.
Is that even a problem, or am I overthinking it?
Or does PHP have a built-in mechanism to ensure that?
Any recommended practices?
Any thoughts on the issue?
What are my options to "pause" or "delay" the next line's execution until the file creation/copy is complete?
copy() is a synchronous function meaning that code will not continue after the call to copy() until copy() either completely finishes or fails.
In other words, there's no magic involved.
if (copy(...)) { echo 'success!'; } else { echo 'failure!'; }
Along with synchronous IO, there is also asynchronous IO. It's a bit complicated to explain in technical detail, but the general idea of it is that it runs in the background and your code hooks into it. Then, whenever a significant event happens, the background execution alerts your code. For example, if you were going to async copy a file, you would register a listener to the copying that would be notified when progress was made. That way, your code could do other things in the meantime, but you could also know what progress was being made on the file.
PHP handles file uploads by saving the whole file to a temporary directory on the server before executing any of the script (so you can use $_FILES from the beginning), and it's safe to assume all these functions are synchronous - that is, PHP will wait for each line to execute before moving to the next line.
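For example, by the time the first line of your script runs, the upload is already complete and sitting in PHP's temporary directory (the field name 'image' here is hypothetical):
// $_FILES is fully populated before any script code executes.
$tmp = $_FILES['image']['tmp_name'];   // e.g. /tmp/phpXXXXXX
if (is_uploaded_file($tmp)) {
    move_uploaded_file($tmp, '/path/to/uploads/' . basename($_FILES['image']['name']));
}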
I have a hefty PHP script.
So much so that I have had to do
ini_set('memory_limit', '3000M');
set_time_limit (0);
It runs fine on one server, but on another I get: Out of memory (allocated 1653342208) (tried to allocate 71 bytes) in /home/writeabo/public_html/propturk/feedgenerator/simple_html_dom.php on line 848
Both are on the same package from the same host, but different servers.
Above problem solved; new problem below for bounty
Update: The script is so big because it crawls a site and parses data from 252 pages, including over 60,000 images, of which it makes two copies. I have since broken it down into parts.
I have another problem now, though. When I am writing the image from an outside site to the server like this:
try {
$imgcont = file_get_contents($va); // $va is an img src from an array of thousands of srcs
$h = fopen($writeTo,'w');
fwrite($h,$imgcont);
fclose($h);
} catch(Exception $e) {
$error .= (!isset($error)) ? "error with <img src='" . $va . "' />" : "<br/>And <img src='" . $va . "' />";
}
All of a sudden it goes to a 500 Internal Server Error page and I have to run it again, at which point it works, because files are only copied if they don't already exist. Is there any way I can receive the 500 response code and send it back to the URL to make it go again? As this is all to be an automated process?
If this is memory related, I would personally use copy() rather than file_get_contents(). It supports the file wrappers the same way, and I don't see any advantage in loading the whole file in memory just to write it back on the filesystem.
Otherwise, your error_log might give you more information as to why the 500 happens.
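For example (the URL and path are placeholders), copy() streams the remote file to disk in chunks instead of buffering the whole thing in memory, while still accepting the same URL wrappers as file_get_contents():
// Requires allow_url_fopen, just like file_get_contents() on a URL.
if (!copy('http://example.com/image.jpg', '/path/to/local/image.jpg')) {
    error_log('copy failed for image.jpg');
}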
There are three parties involved here:
Remote - The server(s) that contain the images you're after
Server - The computer that is running your php script
Client - Your home computer if you are running the script from a web browser, or the same computer as the server if you are running it from Cron.
Is the 500 error you are seeing being generated by 'Remote' and seen by 'Server' (i.e. the images are temporarily unavailable);
Or is it being generated by 'Server' and seen by 'Client' (i.e. there is a problem with your script).
If it is being generated by 'Remote', then see Ali's answer for how to retry.
If it is being generated by your script on 'Server', then you need to identify exactly what the error is - the php error logs should give you more information. I can think of two likely causes:
Reaching PHP's time limit. PHP will only spend a certain amount of time working before returning a 500 error. You can set this to a higher value, or regularly re-set the timer with a call to set_time_limit(), but that won't work if your server is configured in safe mode.
Reaching PHP's memory limit. You seem to have encountered this already, but it's worth making sure your script still isn't eating lots of memory. Consider outputting debug data (possibly only if you set $config['debug_mode'] = true or something). I'd suggest:
try {
echo 'Getting '.$va.'...';
$imgcont = file_get_contents($va); // $va is an img src from an array of thousands of srcs
$h = fopen($writeTo,'w');
fwrite($h,$imgcont);
fclose($h);
echo 'saved. Memory usage: '.(memory_get_usage() / (1024 * 1024)).' <br />';
unset($imgcont);
} catch(Exception $e) {
$error .= (!isset($error)) ? "error with <img src='" . $va . "' />" : "<br/>And <img src='" . $va . "' />";
}
I've also added a line to remove the image from memory, in case PHP isn't doing this correctly itself (in theory that line shouldn't be necessary).
You can avoid both problems by making your script process fewer images at a time and calling it regularly - either using Cron on the server (the ideal solution, although not all shared webhosts allow this), or some software on your desktop computer. If you do this, make sure you consider what will happen if there are two copies of the script running at the same time - will they both fetch the same image at the same time?
So it sounds like you're running this process via a web browser. I'm guessing that you may be getting the 500 error from Apache timing out somehow after a certain period of time or the process dies or something funky. I would suggest you do one of the following:
A) Move the image downloading to a background process, you can run the crawl script in the browser which will write the urls of the images to be downloaded to the db or something and another script will fire up via cron and fetch all the images. You could also have this script work in batches of 100 or so at a time to keep memory consumption down
B) Call the script directly from the command line (this is really the preferred method for something like this anyway, and you should still probably separate the image fetching into another script - see the sketch after this list)
C) If the command line is not an option for some reason, have your browser-loaded script touch a file, and have a cron that runs every minute and looks for that file to exist. Then it fires up your script; you can have the output written to a file to check later, or send an email when it's completed.
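For option B, a minimal sketch of guarding the script so it only runs from the command line (the invocations in the comments use placeholder paths):
// Invoked from a shell as:   php /path/to/fetch_images.php
// or from cron, e.g.:        */15 * * * * php /path/to/fetch_images.php
if (php_sapi_name() !== 'cli') {
    exit("Run this script from the command line.\n");
}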
Is there any way I can receive the 500 response code and send it back to the URL to make it go again? As this is all to be an automated process?
Here's the simple version of how I would do it:
function getImage($va, $writeTo, $retries = 3)
{
while ($retries > 0) {
if ($imgcont = file_get_contents($va)) {
file_put_contents($writeTo, $imgcont);
return true;
}
$retries--;
}
return false;
}
This doesn't create the file unless we successfully get our image file, and it will retry three times by default. You will of course need to add any required exception handling, error checking, etc.
I would definitely stop using file_get_contents() and write the files in chunks, like this:
$read  = fopen($url, 'rb');
$write = fopen($local, 'wb');
$chunk = 8096;
while (!feof($read)) {
    fwrite($write, fread($read, $chunk));
}
fclose($read);
fclose($write);
This will be nicer to your server, and should hopefully solve your 500 problems. As for "catching" a 500 error, this is simply not possible. It is an irretrievable error thrown by your script and written to the client by the web server.
I'm with Swish - this is not really the kind of task that PHP is intended for; you'd be much better off using some sort of server-side scripting.
Is there any way I can receive the 500 response code and send it back to the URL to make it go again?
Have you considered using another library? Fetching files from an external server seems to me more like a job for curl or ftp than file_get_contents(), etc. If the error is external and you're using curl, you can detect the 500 return code and handle it appropriately without crashing. If not, then maybe you should split your program into two files - one that fetches a single file/image, and another that uses curl to repeatedly call the first one. Unless the 500 error means that all PHP execution crashes, you would be able to detect the failure and handle it.
Something like this pseudocode:
file1.php:
foreach ($list_of_files as $filename) {
    do {
        $x = call_curl('file2.php', $filename);
    } while ($x == 500);
}
file2.php:
$filename = $_GET['filename'];
$results = use_curl_to_get_page($filename);
echo $results;
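A concrete version of the hypothetical call_curl() helper, using the real curl extension (the base URL is a placeholder):
function call_curl($script, $filename)
{
    $ch = curl_init('http://example.com/' . $script . '?filename=' . urlencode($filename));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);   // capture the body instead of printing it
    curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);  // 500 means the worker failed; the caller retries
    curl_close($ch);
    return $status;
}
For the retry loop to trigger, file2.php would have to respond with an actual 500 status when it fails (e.g. via http_response_code(500)) rather than just printing an error message.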
Thanks for all your input. I had separated everything by the time I wrote this question, so the crawler fires the image grabber, etc.
I took on board the solution to split the number of images, and that also helped.
I also added a try/catch around the file read.
This was only being called from the browser during testing, but now that it is all up and running it is going to be a cron job.
Thanks Swish and Benubird for your particularly detailed and educational answers. Unfortunately I had no cooperation with the developers on the backend where the images are coming from (long and complicated story).
Anyway, all good now, so thanks. (Swish, how do you call a script from the command line? My knowledge of this field is severely lacking.)
I know this is a bit generic, but I'm sure you'll understand my explanation. Here is the situation:
The following code is executed every 10 minutes. The variable var_x is always read from/written to an external text file whenever it's referred to.
if ( var_x != 1 )
{
var_x = 1;
//
// here is where the main body of the script is.
// it can take hours to completely execute.
//
var_x = 0;
}
else
{
// exit script as it's already running.
}
The problem is: if I simulate a hardware failure (do a hard reset while the script is executing), then the main script logic will never execute again, because var_x will always be 1. (I already have logic to work out the restore point.)
Thanks.
You should lock and unlock files with flock():
$fp = fopen($your_file, 'r+');
if (flock($fp, LOCK_EX | LOCK_NB)) // LOCK_NB: fail instead of blocking if another instance holds the lock
{
    //
    // here is where the main body of the script is.
    // it can take hours to completely execute.
    //
    flock($fp, LOCK_UN);
}
else
{
    // exit script as it's already running.
}
Edit:
As flock() seems not to work correctly on Windows machines, you have to resort to other solutions. Off the top of my head, an idea for a possible solution:
Instead of writing 1 to var_x, write the process ID retrieved via getmypid(). When a new instance of the script reads the file, it should then look up a running process with this ID and check whether that process is a PHP script. Of course, this can still go wrong, as there is the possibility of another PHP script obtaining the same PID after a hardware failure, so the solution is far from optimal.
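A sketch of that PID idea for Windows (where the flock() problem applies); tasklist is a standard Windows command, and the lock-file name is hypothetical:
$lockFile = 'var_x.pid';

$oldPid = (int) @file_get_contents($lockFile);
if ($oldPid) {
    // Ask Windows whether a process with that PID is still alive.
    $tasks = shell_exec('tasklist /FI "PID eq ' . $oldPid . '"');
    if (strpos($tasks, (string) $oldPid) !== false) {
        exit; // the previous run is still going
    }
}

file_put_contents($lockFile, getmypid());
//
// long-running main body here
//
unlink($lockFile);
To be closer to the answer's suggestion, you would also check that the matching process is actually php.exe, since the PID may have been reused by an unrelated process.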
Don't you think this would be better solved using file locks? (When the reset occurs file locks are reset as well)
http://php.net/flock
It sounds like you're doing some kind of manual semaphore for process management.
Rather than writing to a file, perhaps you should use an environment variable instead. That way, in the event of failure, your script will not have a closed semaphore when you restore.