I've got a download script which checks a couple of things and then streams a file across in 8 KB chunks.
The loop that does the transfer looks like:
$file = @fopen($file_path, "rb");
if ($file) {
    while (!feof($file)) {
        set_time_limit(60);
        print(fread($file, 1024 * 8));
        flush();
        if (connection_status() != 0) {
            @fclose($file);
            die();
        }
    }
    @fclose($file);
}
I wrote a small application that simulates a very slow download: it waits for 2 minutes before continuing the download. I expected the script to time out given that I've set a 60-second time limit, but this does not happen and the download continues until it has finished. It seems that the time spent in print/flush doesn't count towards the script execution time. Is this correct? Is there a better way to send the file to the client/browser so that I can specify a time limit for the print/flush step?
From the set_time_limit() documentation:
The set_time_limit() function and the configuration directive max_execution_time
only affect the execution time of the script itself. Any time spent on activity
that happens outside the execution of the script such as system calls using system(),
stream operations, database queries, etc. is not included when determining the
maximum time that the script has been running. This is not true on Windows where
the measured time is real.
So it looks like you can either measure the passage of real time with calls to the time() function, along the lines of:
$start = time();
while (something) {
    // do something
    if (time() - $start > 60) die();
}
Or you can use Windows. I prefer the first option :p
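For example, a minimal sketch adapting the download loop from the question (the 60-second budget and 8 KB chunk size are just the values used above):
// Minimal sketch: enforce a wall-clock limit on the whole transfer,
// since time spent in print/flush is not counted by set_time_limit().
$start = time();
$file  = @fopen($file_path, "rb");
if ($file) {
    while (!feof($file)) {
        print(fread($file, 1024 * 8));
        flush();
        // Bail out if the client disconnected or 60 real seconds have passed.
        if (connection_status() != 0 || time() - $start > 60) {
            @fclose($file);
            die();
        }
    }
    @fclose($file);
}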
Related
I'd like to improve the script below, or find out if there is a better way to rewrite it for better results.
I use this in two files, cron1.php and cron2.php, executed every 5 seconds, and I need to prevent them from running twice.
Script execution time depends on the file size; most of the time it takes around 2 seconds, but for huge files it can take 25-30 seconds, which is why I need to prevent a second execution from starting.
Am I on the right track? Any suggestions for improvement?
$fp = fopen("cron.lock", "a+");
if (flock($fp, LOCK_EX | LOCK_NB))
{
    echo "task started\n";
    // Here is my long script
    // Cron runs every 5 seconds
    sleep(2);
    flock($fp, LOCK_UN);
}
else
{
    echo "task already running\n";
    exit;
}
fclose($fp);
I generally do a file operation like dumping getmypid() into the lock file. That way I can tell externally which PID has locked it, which is helpful in some debugging cases.
Finally, unlink the lock file when you are done.
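Putting those suggestions together with the script from the question, a rough sketch might look like this (details such as truncating before writing the PID are my assumption):
$lockFile = "cron.lock";
$fp = fopen($lockFile, "a+");
if (flock($fp, LOCK_EX | LOCK_NB))
{
    // Record which process holds the lock, for easier debugging.
    ftruncate($fp, 0);
    fwrite($fp, getmypid() . "\n");
    echo "task started\n";

    // ... long-running script goes here ...

    flock($fp, LOCK_UN);
    fclose($fp);
    unlink($lockFile); // clean up when done
}
else
{
    echo "task already running\n";
    fclose($fp);
    exit;
}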
I've done a little bit of PHP coding and am familiar with aspects of it.
I have made a PHP script that runs as a cron job; it pulls data from a database and, if certain conditions are met, writes some information to a file.
Because there may be more than one result in the database, a loop runs through each result.
Within that loop, I have another loop which writes data to a file. A cron job then calls this file every minute and runs its contents as a bash script.
So, the PHP loop is set up to check whether the file has anything written to it by using the filesize() function. If the file size is not zero, it sleeps for 10 seconds and checks again. Here is the code:
while (filesize('/home/cron-script.sh') != 0)
{
    sleep(10);
}
Unfortunately, when filesize() is run, it seems to place some kind of lock or something on the file. The cron job can execute the bash script without a problem, and the very last command in the script is to zero out the file:
cat /dev/null > /home/cron-script.sh
But it seems that once the while loop above starts, it locks in the original file size. As an example, I simply put the word "exit" in the cron-script.sh file and then ran this test script:
while(filesize("/home/cron-script.sh") != 0)
{
echo "filesize: " . filesize("/home/cron-script.sh");
sleep(10);
}
The loop is infinite and keeps showing "filesize: 4" after I put in the word "exit". I then issue this command at the terminal:
cat /dev/null > /home/cron-script.sh
This clears the file while the test script above is running, but the script continues to report a file size of 4 and never returns to 0, so the PHP script runs until the execution time limit is reached.
Could anyone give me some advice on how I can resolve this issue? In essence, I just need some way of reading the file size, and if there is any data in the file, it needs to loop through a sleep routine until the file is cleared. The file should clear within one minute (since the cron job calls that cron-script.sh file every minute).
Thank you!
From http://www.php.net/manual/en/function.filesize.php
Note: The results of this function are cached. See clearstatcache() for more details.
To resolve this, remember to call clearstatcache() before calling filesize():
while(filesize("/home/cron-script.sh") != 0)
{
echo "filesize: " . filesize("/home/cron-script.sh");
sleep(10);
clearstatcache();
}
The results of filesize() are cached.
You can use clearstatcache() to clear the cache on each iteration of the loop.
Here's my code:
$cachefile = "cache/ttcache.php";
if (file_exists($cachefile) && ((time() - filemtime($cachefile)) < 900))
{
    include($cachefile);
}
else
{
    ob_start();
    /* resource-intensive loop that outputs
       a listing of the top tags used on the website */
    $fp = fopen($cachefile, 'w');
    fwrite($fp, ob_get_contents());
    fflush($fp);
    fclose($fp);
    ob_end_flush();
}
This code seemed like it worked fine at first sight, but I found a bug, and I can't figure out how to solve it. Basically, it seems that after I leave the page alone for a period of time, the cache file empties (either that, or when I refresh the page, it clears the cache file, rendering it blank). Then the conditional sees the now-blank cache file, sees its age as less than 900 seconds, and pulls the blank cache file's contents in place of re-running the loop and refilling the cache.
I catted the cache file in the command line and saw that it is indeed blank when this problem exists.
I tried setting it to 60 seconds to replicate this problem more often and hopefully get to the bottom of it, but it doesn't seem to replicate if I am looking for it, only when I leave the page and come back after a while.
Any help?
In the caching routines that I write, I almost always check the filesize, as I want to make sure I'm not spewing blank data, because I rely on a bash script to clear out the cache.
if(file_exists($cachefile) && (filesize($cachefile) > 1024) && ((time() - filemtime($cachefile)) < 900))
This assumes your outputted cache file is larger than 1024 bytes, which it usually will be if it's anything relatively large. Adding a lock file would be useful as well, as noted in the comments above, to avoid multiple processes trying to write to the same cache file.
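A rough sketch of how that could look, combining the stricter freshness check with an exclusive lock so only one process rebuilds the cache at a time (the 1024-byte and 900-second thresholds are just the values from above; the locking details are my own sketch):
$cachefile = "cache/ttcache.php";

if (file_exists($cachefile) && (filesize($cachefile) > 1024)
    && ((time() - filemtime($cachefile)) < 900))
{
    include($cachefile);
}
else
{
    ob_start();
    /* resource-intensive loop goes here */
    $fp = fopen($cachefile, 'c');   // open without truncating until we hold the lock
    if ($fp && flock($fp, LOCK_EX)) // only one process writes at a time
    {
        ftruncate($fp, 0);
        fwrite($fp, ob_get_contents());
        fflush($fp);
        flock($fp, LOCK_UN);
    }
    if ($fp) {
        fclose($fp);
    }
    ob_end_flush();
}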
You can double-check the file size with the filesize() function; if it's too small, act as if the cache were old.
If there's no PHP in the file, you may want to use readfile() for performance reasons to just send the file straight back to the end user.
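For example, assuming the cache file holds plain HTML rather than PHP, the read side could be as simple as:
if (file_exists($cachefile) && (filesize($cachefile) > 1024)
    && ((time() - filemtime($cachefile)) < 900))
{
    readfile($cachefile); // stream the cached output straight to the client
}
else
{
    // rebuild the cache as in the question
}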
I have a temporary folder generated by my business application and want the documents within it to be available for only around 30 minutes. I was tempted to build an index to keep track of when each file was created, but that would be a little silly for just temporary files; they are not too important, but I would like them removed according to the time they were last modified.
What would I need to do this on my Linux server?
The filemtime() function will let you check the last modified date of a file. What you need to do is run a cron job every minute and check whether each file's age is greater than the threshold, then unlink() it as needed.
$time = 30; // in minutes, time until file deletion threshold
foreach (glob("app/temp/*.tmp") as $filename) {
    if (file_exists($filename)) {
        if (time() - filemtime($filename) > $time * 60) {
            unlink($filename);
        }
    }
}
This should be the most efficient method, as you requested. Change the cron interval to 10 minutes if you can accept less accuracy, in case there are many files.
You'd need nothing more than to call stat on the files and decide whether to unlink them or not based on their mtime.
Call this script every ten minutes or so from cron or anacron.
Or you could use tmpwatch, a program designed for this purpose.
This first script gets called several times for each user via an AJAX request. It calls another script on a different server to get the last line of a text file. It works fine, but I think there is a lot of room for improvement. I am not a very good PHP coder, so I am hoping that with the help of the community I can optimize this for speed and efficiency.
An AJAX POST request is made to this script:
<?php session_start();
$fileName = $_POST['textFile'];
$result = file_get_contents($_SESSION['serverURL']."fileReader.php?textFile=$fileName");
echo $result;
?>
It makes a GET request to this external script which reads a text file
<?php
$fileName = $_GET['textFile'];
if (file_exists('text/'.$fileName.'.txt')) {
    $lines = file('text/'.$fileName.'.txt');
    echo $lines[sizeof($lines)-1];
}
else {
    echo 0;
}
?>
I would appreciate any help. I think there is more room for improvement in the first script. It makes an expensive function call (file_get_contents), well, at least I think it's expensive!
This script should limit the locations and file types that it's going to return.
Think of somebody trying this:
http://www.yoursite.com/yourscript.php?textFile=../../../etc/passwd (or something similar)
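A rough sketch of one way to do that, using basename() plus a strict whitelist (the text/ directory and .txt extension are taken from the question; the exact pattern is just an example):
<?php
// Rough sketch: strip any path components and only accept simple names,
// so requests like ../../../etc/passwd can never leave the text/ directory.
$fileName = basename($_GET['textFile']);
if (!preg_match('/^[A-Za-z0-9_-]+$/', $fileName)) {
    header("HTTP/1.0 400 Bad Request");
    exit;
}
$path = 'text/' . $fileName . '.txt';
if (file_exists($path)) {
    $lines = file($path);
    echo $lines[count($lines) - 1];
} else {
    echo 0;
}
?>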
Try to find out where the delays occur: does the HTTP request take long, or is the file so large that reading it takes long?
If the request is slow, try caching results locally.
If the file is huge, you could set up a cron job that extracts the last line of the file at regular intervals (or on every change) and saves it to a file that your other script can access directly.
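A rough sketch of such a cron job (the file names are examples only):
<?php
// Rough sketch: snapshot the last line of a possibly large file into a
// small companion file that the web script can read directly.
$source = 'text/somefile.txt';  // example source file
$target = 'text/somefile.last'; // example snapshot file

if (file_exists($source)) {
    $lines = file($source, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines) {
        // Write to a temporary file first, then rename over the target,
        // so readers never see a half-written snapshot.
        file_put_contents($target . '.tmp', end($lines));
        rename($target . '.tmp', $target);
    }
}
?>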
readfile() is your friend here: it reads a file on disk and streams it to the client.
script 1:
<?php
session_start();
// added basic argument filtering
$fileName = preg_replace('/[^A-Za-z0-9_]/', '', $_POST['textFile']);
$fileName = $_SESSION['serverURL'].'text/'.$fileName.'.txt';
if (file_exists($fileName)) {
    // script 2 could be pasted here
    // for the entire file:
    //readfile($fileName);
    // for just the last line:
    $lines = file($fileName);
    echo $lines[count($lines)-1];
    exit(0);
}
echo 0;
?>
This script could further be improved by adding caching to it. But that is more complicated.
Very basic caching could look like this:
script 2:
<?php
$lastModifiedTimeStamp = filemtime($fileName);
if (isset($_SERVER['HTTP_IF_MODIFIED_SINCE'])) {
    $browserCachedCopyTimestamp = strtotime(preg_replace('/;.*$/', '', $_SERVER['HTTP_IF_MODIFIED_SINCE']));
    if ($browserCachedCopyTimestamp >= $lastModifiedTimeStamp) {
        header("HTTP/1.0 304 Not Modified");
        exit(0);
    }
}
header('Content-Length: '.filesize($fileName));
header('Expires: '.gmdate('D, d M Y H:i:s \G\M\T', time() + 604800)); // (3600 * 24 * 7)
header('Last-Modified: '.gmdate('D, d M Y H:i:s \G\M\T', $lastModifiedTimeStamp));
?>
First things first: Do you really need to optimize that? Is that the slowest part in your use case? Have you used xdebug to verify that? If you've done that, read on:
You cannot really optimize the first script usefully: if you need an HTTP request, you need an HTTP request. Skipping the HTTP request could be a performance gain, though, if it is possible (i.e. if the first script can access the same files the second script would operate on).
As for the second script: reading the whole file into memory does look like some overhead, but that is negligible if the files are small. The code looks very readable; I would leave it as is in that case.
If your files are big, however, you might want to use fopen() and its friends fseek() and fread():
# Do not forget to sanitize the file name here!
# An attacker could demand the last line of your password
# file or similar! ($fileName = '../../passwords.txt')
$filePointer = fopen($fileName, 'r');
$i = 1;
$chunkSize = 200;
# Read 200-byte chunks from the end of the file and check
# whether the chunk contains a newline
do {
    fseek($filePointer, -($i * $chunkSize), SEEK_END);
    $line = fread($filePointer, $i++ * $chunkSize);
} while (($pos = strrpos($line, "\n")) === false);
return substr($line, $pos + 1);
If the files are unchanging, you should cache the last line.
If the files are changing and you control the way they are produced, it might or might not be an improvement to reverse the order lines are written, depending on how often a line is read over its lifetime.
Edit:
Your server could figure out what it wants to write to its log, put it in memcache, and then write it to the log. The request for the last line could then be fulfilled from memcache instead of a file read.
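A rough sketch of that idea using the Memcached extension (the server address, key names, and file names are assumptions):
<?php
// Rough sketch, assuming the Memcached extension and a memcached server on localhost.
$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

// Writer side: whenever a line is appended to the log, also store it in memcache.
function appendLogLine(Memcached $mc, $logFile, $line)
{
    file_put_contents($logFile, $line . "\n", FILE_APPEND);
    $mc->set('lastline:' . basename($logFile), $line);
}

// Reader side: the AJAX endpoint serves the last line from memcache and only
// falls back to reading the whole file on a cache miss.
$line = $mc->get('lastline:somefile.txt');
if ($line === false) {
    $lines = file('text/somefile.txt');
    $line = rtrim(end($lines));
}
echo $line;
?>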
The most probable source of delay is that cross-server HTTP request. If the files are small, the cost of fopen/fread/fclose is nothing compared to the whole HTTP request.
(Not long ago I used HTTP to retrieve images to dynamically generate image-based menus. Replacing the HTTP request with a local file read reduced the delay from seconds to tenths of a second.)
I assume that the obvious solution of accessing the file server filesystem directly is out of the question. If not, then it's the best and simplest option.
If it is, you could use caching: instead of getting the whole file, just issue a HEAD request and compare the timestamp to a local copy.
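A rough sketch of that idea with curl (the URL, cache path, and file naming are assumptions):
<?php
// Rough sketch: ask the remote server only for headers, and re-download the
// file only when its Last-Modified time is newer than our local copy.
$url   = $_SESSION['serverURL'] . "fileReader.php?textFile=" . urlencode($fileName);
$local = "cache/" . md5($url);

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD request, no body
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FILETIME, true);       // have curl parse Last-Modified
curl_exec($ch);
$remoteTime = curl_getinfo($ch, CURLINFO_FILETIME); // -1 if the header was missing
curl_close($ch);

if (file_exists($local) && $remoteTime !== -1 && filemtime($local) >= $remoteTime) {
    echo file_get_contents($local);             // local copy is still current
} else {
    $result = file_get_contents($url);          // fetch and refresh the local copy
    file_put_contents($local, $result);
    echo $result;
}
?>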
Also, if you are AJAX-updating a lot of clients based on the same files, you might consider looking at comet (meteor, for example). It's used for things like chats, where a single change has to be broadcast to several clients.