I have a temporary folder generated by my business application, and I want the documents inside it to be available for only around 30 minutes. I was tempted to build an index to keep track of when each file was created, but that seems like overkill for temporary files. They are not terribly important, but I would like them to be removed according to the time they were last modified.
What would I need to do this with my Linux server?
The function filemtime() lets you check the last modification time of a file. What you need to do is run a cron job every minute that checks whether each file's age exceeds the threshold and unlink()s it as needed.
$threshold = 30; // minutes until a file becomes eligible for deletion
foreach (glob("app/temp/*.tmp") as $filename) {
    if (file_exists($filename)) {
        // delete the file if it was last modified more than $threshold minutes ago
        if (time() - filemtime($filename) > $threshold * 60) {
            unlink($filename);
        }
    }
}
This should be the most efficient method, as you requested. Change the cron interval to 10 minutes if you can accept less accuracy, in case there are many files.
You'd need nothing more than to call stat on the files and decide whether to unlink them or not based on their mtime.
Call this script every ten minutes or so from cron or anacron.
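For example, a crontab entry along these lines runs the cleanup every ten minutes (the script path is a placeholder, not from the answer):

*/10 * * * * php /path/to/cleanup.php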
Or you could use tmpwatch, a program designed for this purpose.
Related
Is it possible to check how long a file has been inside a directory?
Background: I have a script that moves a file into a processing directory. I want to check how long the file has been in this directory. If it has been there for more than 2 hours, the processing script has probably timed out, so I want to move it out of the directory and run the script again.
filemtime() returns only the last edit time
Thanks!
Use filectime(); see the PHP manual at http://php.net/manual/en/function.filectime.php:
Returns the time the file was last changed, or FALSE on failure. The time is returned as a Unix timestamp.
You may want to write your own utility function:
function fileOlderThanHours($filename, $hours = 2) {
    // filectime() reports the inode change time, which is updated
    // when the file is moved into the directory
    $ctime = filectime($filename);
    return (time() - $ctime) > ($hours * 60 * 60);
}
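A sketch of how you might use it from the watchdog script (both directory paths are made up for illustration):

foreach (glob('/path/to/processing/*') as $file) {
    if (fileOlderThanHours($file, 2)) {
        // the processing script has probably timed out;
        // move the file back out so it can be retried
        rename($file, '/path/to/incoming/' . basename($file));
    }
}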
I have a PHP script I run every 5 minutes with cron from a folder. The folder contains several images, and I add more as time goes on.
I was wondering how I can make the script check, at the start, whether NEW files exist since the last time it ran. If new files exist, the script should go on; if not, it should stop. I tried searching around but I can't find anything regarding PHP.
Does anyone know a quick solution to this problem?
If the new files are also created with a new timestamp, you can use filemtime() to fetch only files that were created/modified in a specified window of time.
Example:
$files = glob("folder/*.jpg");
$files = array_filter($files, function ($file) { return filemtime($file) >= time() - 5*60; /* modified in the last 5 minutes */ });
if ($files)
{
// there are new files! $files is an array with their names
}
To make sure you won't miss any file, you might want to store the time from last run somewhere, so in case cron delays a second or two and new files were created precisely within that window, you won't lose track of them.
Update for comments:
Now, how to store the time of the last check is up to you: you can use a database, a file, some sort of environment variable, etc. Here is an example of something really simple that stores time() in a file:
$last = (int)file_get_contents('folder/timestamp.txt'); // 0 on the very first run
file_put_contents('folder/timestamp.txt', time());

$files = glob("folder/*.jpg");
$files = array_filter($files, function ($file) use ($last) {
    return filemtime($file) > $last;
});
if ($files)
{
    // there are new files! $files is an array with their names
}
Just make sure your PHP script has permission to modify folder/timestamp.txt. With this script it will always process the files modified since the last run, no matter how long ago that was.
Method:
Store the current time in a file or database every time the cron job runs.
When the cron job starts, fetch the time of its last execution from that file or database.
Count the files created after the last execution time.
If the count is greater than 0, let the job proceed; otherwise stop.
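A minimal sketch of these steps using SQLite via PDO for the storage; the database path and table name are assumptions for illustration, not from the answer:

$db = new PDO('sqlite:/path/to/cron_state.db');
$db->exec('CREATE TABLE IF NOT EXISTS cron_state (last_run INTEGER)');

// fetch the last execution time (0 on the very first run)
$last = (int)$db->query('SELECT MAX(last_run) FROM cron_state')->fetchColumn();

// record the current run
$db->prepare('INSERT INTO cron_state (last_run) VALUES (?)')->execute([time()]);

// count the files created after the last execution time
$new = array_filter(glob('folder/*.jpg'), function ($f) use ($last) {
    return filemtime($f) > $last;
});

// only proceed when there is something new
if (count($new) > 0) {
    // process the new files
}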
You could keep track of the time the script was last run and use filemtime to check if the file was updated or created after your last execution.
http://php.net/manual/en/function.filemtime.php
int filemtime ( string $filename )
Use filemtime() as follows; you will get the modification time in date format:
$file_time = date("F d Y H:i:s", filemtime($filename));
How does one, in PHP, delete all files in a directory every 24 hours without deleting the directory itself? It has to be in PHP; it cannot be a cron job.
Can you please provide an explanation behind your code as well? I am still learning PHP.
Thank you!
There is no way to do this in PHP without using PHP. Sorry.
Joking aside, if you want to do this you need some sort of task scheduler (like cron).
That is to say that you could program your personal computer to send the request to the server every 24 hours, but you would either have to do it manually or schedule the task locally.
My point being, you need cron, but it doesn't need to be running on the same host as the PHP files.
Without cron you'd have to add code like this to a commonly-requested page:
$scheduledclean = strtotime("midnight"); // 00:00:00 of the current day
$lastcleanfile = '/path/to/my/app/lastclean';
$lastcleantime = (file_exists($lastcleanfile)) ? filemtime($lastcleanfile) : 0;
$curtime = time();
if( ($curtime > $scheduledclean) && ($lastcleantime < $scheduledclean) ) {
    touch($lastcleanfile); // touch first to prevent multiple executions
    // file cleanup code here
}
On the first request to the page after midnight the cleanup will fire, but the unlucky person who made that request will likely see their page delayed for as long as the cleanup takes. You could mitigate this by running the cleanup as a backgrounded shell command, like shell_exec('rm -rf /path/to/dir/* &');
I did something similar to this a long time ago. It's a terrible idea, but you can have a file which stores the last time your directory was cleared. Each time a user visits a relevant page, check this file in the PHP script (you could also check the modified time). If it is far enough into the past, update the file and run your delete script.
Downsides:
Not guaranteed to run every 24 hours (maybe you get no visitors one day)
Gives one user per day a longer wait
Ugly
As for deleting the files,
function delete_contents( $dirpath ) {
    $dir = opendir( $dirpath );
    if( $dir ) {
        while( ($s = readdir( $dir )) !== false ) {
            // skip the "." and ".." entries, or the recursion never ends
            if( $s === '.' || $s === '..' ) {
                continue;
            }
            // readdir() returns bare names, so build the full path
            $path = $dirpath . DIRECTORY_SEPARATOR . $s;
            if( is_dir( $path ) ) {
                delete_contents( $path );
                rmdir( $path );
            } else {
                unlink( $path );
            }
        }
        closedir( $dir );
    }
}
BE VERY CAREFUL with that. On a crude server setup, delete_contents('/') will delete every file.
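One way to guard against that, sketched here with a made-up base directory, is to refuse to run outside a known path:

$base = realpath('/path/to/my/app/temp'); // hypothetical allowed directory
$target = realpath($dirpath);
if ($target !== false && strpos($target, $base) === 0) {
    delete_contents($target);
}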
Make a PHP script that removes all files in the directory, look for the functions readdir() and unlink() to remove the files.
Set up a cron job to run the script automatically every 24 hours. How exactly you do this depends on your host. There are also web services you can use for this: http://www.google.nl/search?q=cronjobs+webservice
Good luck!
Some background information
The files I would like to download are kept on the external server for a week, and a new XML file (10-50 MB) is created there every hour with a different name. I would like the large file to be downloaded to my server chunk by chunk in the background each time my website is loaded, perhaps 0.5 MB each time, and then have the download resume the next time someone else loads the website. This would require my site to have at least 100 page loads each hour to stay updated, so perhaps a bit more of the file each time if possible. I have researched SimpleXML, XMLReader and SAX parsing, but whatever I do, it seems to take too long to parse the file directly, so I would like a different approach, namely downloading it as described above.
If I download a 30 MB XML file, I can parse it locally with XMLReader in 3 seconds (250k iterations), but when I try to do the same from the external server, limiting it to 50k iterations, it takes 15 seconds to read that small part, so it does not seem possible to parse it directly from that server.
Possible solutions
I think it's best to use cURL, but then again, perhaps fopen(), fsockopen(), copy() or file_get_contents() are the way to go. I'm looking for advice on which functions to use to make this happen, or for different solutions on how I can parse a 50 MB external XML file into a MySQL database.
I suspect a cron job every hour would be the best solution, but I am not sure how well that is supported by web hosting companies, and I have no clue how to set something like that up. But if that's the best solution, and the majority thinks so, I will have to do my research in that area too.
If a Java applet or JavaScript running in the background would be a better solution, please point me in the right direction when it comes to functions/methods/libraries there as well.
Summary
What's the best solution for downloading parts of a file in the background, resuming the download each time my website is loaded, until it's completed?
If the above solution would be moronic to even try, what language/software would you use to achieve the same thing (download a large file every hour)?
Thanks in advance for all answers, and sorry for the long story/question.
Edit: I ended up using this solution to get the files: a cron job scheduling a PHP script. It checks my folder for the files I already have, generates a list of the possible downloads for the last four days, then downloads the next XML file in line.
<?php
$date = new DateTime();
$current_time = $date->getTimestamp();
$four_days_ago = $current_time-345600;
echo 'Downloading: '."\n";
for ($i=$four_days_ago; $i<=$current_time; ) {
$date->setTimestamp($i);
if($date->format('H') !== '00') {
$temp_filename = $date->format('Y_m_d_H') ."_full.xml";
if(!glob($temp_filename)) {
$temp_url = 'http://www.external-site-example.com/'.$date->format('Y/m/d/H') .".xml";
echo $temp_filename.' --- '.$temp_url.'<br>'."\n";
break; // with a break here, this loop will only return the next file you should download
}
}
$i += 3600;
}
set_time_limit(300);
$Start = getTime();
$objInputStream = fopen($temp_url, "rb");
$objTempStream = fopen($temp_filename, "w+b");
stream_copy_to_stream($objInputStream, $objTempStream, (1024*200000));
$End = getTime();
echo '<br>It took '.number_format(($End - $Start),2).' secs to download "'.$temp_filename.'".';
function getTime() {
$a = explode (' ',microtime());
return(double) $a[0] + $a[1];
}
?>
Edit 2: I just wanted to let you know that there is a way to do what I asked; it just wouldn't work in my case. With the amount of data I need, the website would have to get 400+ visitors an hour for it to work properly. But with smaller amounts of data there are some options: http://www.google.no/search?q=poormanscron
You need to have a scheduled, offline task (e.g., cronjob). The solution you are pursuing is just plain wrong.
The simplest thing that could possibly work is a php script you run every hour (scheduled via cron, most likely) that downloads the file and processes it.
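For instance, a crontab entry like the following runs such a script at the top of every hour (the script path is a placeholder, not from the answer):

0 * * * * php /path/to/fetch_xml.php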
You could try fopen:
<?php
$handle = fopen("http://www.example.com/test.xml", "rb");
$contents = stream_get_contents($handle);
fclose($handle);
?>
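Note that stream_get_contents() buffers the whole response in memory, which may be a problem for 10-50 MB files. A sketch of a variant that copies the stream to disk in fixed-size chunks instead (the URL and output file name are placeholders):

<?php
$in = fopen("http://www.example.com/test.xml", "rb");
$out = fopen("test.xml", "wb");
while (!feof($in)) {
    // read and write 8 KB at a time so memory use stays flat
    fwrite($out, fread($in, 8192));
}
fclose($in);
fclose($out);
?>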
I've got a download script which checks a couple of things and then streams a file to the client in 8 KB chunks.
The loop that does the transfer looks like:
$file = @fopen($file_path, "rb");
if ($file) {
    while (!feof($file)) {
        set_time_limit(60); // reset the limit before sending each chunk
        print(fread($file, 1024*8));
        flush();
        // stop if the client has disconnected
        if (connection_status() != 0) {
            @fclose($file);
            die();
        }
    }
    @fclose($file);
}
I wrote a small application which simulated a very slow download. It waits for 2 minutes before continuing the download. I expected that the script would time out given that I've set a 60 second time limit. This does not happen and the download continues until it has finished. It seems that the time spent in print / flush doesn't count towards the script execution time. Is this correct? Is there a better way to send the file to the client / browser such that I can specify a time limit for the print / flush command?
From set_time_limit():
The set_time_limit() function and the configuration directive max_execution_time only affect the execution time of the script itself. Any time spent on activity that happens outside the execution of the script such as system calls using system(), stream operations, database queries, etc. is not included when determining the maximum time that the script has been running. This is not true on Windows where the measured time is real.
So it looks like you can either measure the passage of real time with calls to the time() function, along the lines of:
$start = time();
while (something) {
    // do something
    if (time() - $start > 60) die();
}
Or you can use Windows. I prefer the first option :p
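Applied to the download loop from the question, that approach might look like this sketch (untested, and the 60-second budget is just an example):

$file = @fopen($file_path, "rb");
$start = time();
if ($file) {
    while (!feof($file)) {
        print(fread($file, 1024*8));
        flush();
        // enforce a wall-clock limit regardless of where the time is spent
        if (time() - $start > 60 || connection_status() != 0) {
            @fclose($file);
            die();
        }
    }
    @fclose($file);
}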