How to avoid a possible missing cache file in PHP? - php

I have a simple caching system as
if (file_exists($cache)) {
echo file_get_contents($cache);
// if coming here when $cache is deleting, then nothing to display
}
else {
// PHP process
}
We regularly delete outdated cache files, e.g. deleting all caches after 1 hour. Although this process is very fast, but I am thinking that a cache file can be deleted right between the if statement and file_get_contents processes.
I mean when if statement checks the existence of cache file, it exists; but when file_get_contents tries to catch it, it is no longer there (deleted by simultaneous cache deleting process).
file_get_contents locks the file to avoid the undergoing delete process during the read process. But the file can be deleted when the if statement sends the PHP process to the first condition (before start of the file_get_contents).
Is there any approach to avoid this? Is the cache deleting system different?
NOTE: I did not face any practical problem, as it is not very probable to catch this event, but logically it is possible, and should happen on heavy loads.

Luckily file_get_contents return FALSE on error, so you could quick-bake it like:
if (FALSE !== ($buffer = file_get_contents())) {
echo $buffer;
return;
}
// PHP process
or similiar. It's a bit the quick and dirty way, considering you want to place the # operator to hide any warnings about non-existent files:
if (FALSE !== ($buffer = #file_get_contents())) {
The other alternative would be to lock, however that might prevent your cache-deletion to not delete the file if you have locked it.
Then left is to stall the cache your own. That means reading the file-creation time in PHP, check that it is < 5 minutes then for the file-deletion processing (5 minutes is exemplary) and then you would know that the file is already stale and for being replaced with fresh content. Re-create the file then. Otherwise read the file in, which probably is better then with readfile instead of file_get_contents and echo.

On failure, file_get_contents returns false, so what about this:
if (($output = file_get_contents($filename)) === false){
// Do the processing.
$output = 'Generated content';
// Save cache file
file_put_contents($filename, $output);
}
echo $output;
By the way, you may want to consider using fpassthru, which is more memory-efficient, especially for larger files. Using file_get_contents on large files (> 100 MB), will probably cause problems (depending on your configuration).
<?php
$fp = #fopen($filename, 'rb');
if ($fp === false){
// Generate output
} else {
fpassthru($fp);
}

Related

How to rename() a file in PHP that needs to remain locked while doing so?

I have a text file which multiple users will be simultaneously editing (limited to an individual line per edit, per user). I have already found a solution for the "line editing" part of the required functionality right here on StackOverflow.com, specifically, the 4th solution (for large files) offered by #Gnarf in the following question:
how to replace a particular line in a text file using php?
It basically rewrites the entire file contents to a new temporary file (with the user's edit included) and then renames the temporary file to the original file once finished. It's great!
To avoid one user's edit causing a conflict with another user's edit if they are both attempting an edit at the same time, I have introduced flock() functionality, as can be seen in my variation on the code here:
$reading = fopen($file, 'r');
$writing = fopen($temp, 'w');
$replaced = false;
if ((flock($reading, LOCK_EX)) and (flock($writing, LOCK_EX))) {
echo 'Lock acquired.<br>';
while (!feof($reading)) {
$line = fgets($reading);
$values = explode("|",$line);
if ($values[0] == $id) {
$line = $id."|comment edited!".PHP_EOL;
$replaced = true;
}
fputs($writing, $line);
}
flock($reading, LOCK_UN);
flock($writing, LOCK_UN);
fclose($reading);
fclose($writing);
} else {
echo 'Lock not acquired.<br>';
}
I've made sure the $temp file always has a unique filename. Full code here: https://pastebin.com/E31hR9Mz
I understand that flock() will force any other execution of the script to wait in a queue until the first execution has finished and the flock() has been released. So far so good.
However, the problem starts at the end of the script, when the time has come to rename() the temporary file to replace the original file.
if ($replaced) {
rename($temp, $file);
} else {
unlink($temp);
}
From what I have seen, rename() will fail if the original file still has a flock(), so I need to release the flock() before this point. However, I also need it to remain locked, or rename() will fail when another user running the same script immediately opens a new flock() as soon as the previous flock() is released. When this happens, it will return:
Warning: rename(temporary.txt,original.txt): Access is denied. (code: 5)
tl;dr: I seem to be in a bit of a Catch-22. It looks like rename() won't work on a locked file, but unlocking the file will allow another user to immediately lock it again before the rename() can take place.
Any ideas?
update: After some extensive research into how flock() works, (in layman's terms, there is no guarantee that another script will respect the "lock", and therefore it is not really a "lock" at all as one would assume from the literal meaning of the word) I have opted for this solution instead which works like a charm:
https://docstore.mik.ua/orelly/webprog/pcook/ch18_25.htm
"Good lock" on your locking adventures.

PHP script file loads indefinitely on browser when flock (LOCK_SH/LOCK_EX)

I came across this link while trying to learn how to lock files to prevent a script reading from a file as another is writing, or two scripts writing to the same file simultaneously.
I created two scripts, readandwritelock.php and readlock.php, the first script to retrieve the file with file_get_contents, append it and then write back to the same file with file_put_contents($file, $data, LOCK_EX), and the second that just retrieves the file with file_get_contents after flock($file, LOCK_SH).
<?php
//readandwritelock.php
$myfile = fopen('15-11-2018.txt', 'r+');
if (flock($myfile, LOCK_SH)) {
echo "Gotten lock<br>";
$current = file_get_contents('15-11-2018.txt');
/*I commented this on my second test to see if file_put_contents will work.
After uncommenting and third test, it does not work anymore.
if (flock($myfile, LOCK_UN)) {
echo "Unlocked<br>";
}*/
$current .= "appending";
if (file_put_contents('15-11-2018.txt', $current, LOCK_EX)) {
echo "Success";
}
else {
echo "Failed";
//browser loads indefinitely so this does not run
}
fclose($myfile);
}
?>
The problem I am facing is that the first try I was able to file_get_contents after getting the lock, and then releasing the lock and proceed to append and file_put_contents($file, $data, LOCK_EX). However on the second try I decided to comment the releasing of the LOCK_SH lock to test and see what would happen. The script file loads indefinitely (Waiting for localhost...) on my browser, so I reverted back the changes for my third try, but this time the script file still loads indefinitely. It's as if the LOCK_SH was never released.
I must be doing something wrong, but I do not know what exactly it is. Could someone explain?
This was tested on XAMPP and macOS High Sierra and Chrome.
<?php
//readlock.php
//works as normal
$myfile = fopen('15-11-2018.txt', 'r');
if (flock($myfile, LOCK_SH)) {
echo "Gotten lock<br>";
$current = file_get_contents('15-11-2018.txt');
echo $current;
if (flock($myfile, LOCK_UN)) {
echo "<br>Unlocked";
}
fclose($myfile);
}
?>
The reason why your browser seems to load indefinitely is because your PHP file never finishes.
First you get a LOCK_SH (a shared or read lock) for your file, which is fine while you are reading the content.
The problem is that you also try to get a LOCK_EX (an exclusive lock) on the same file in the file_put_contents function. Therefore the file_put_contents functions blocks until all other locks (shared AND exclusive ones) are unlocked, which can't work (this is a deadlock).
For your code to work properly, you can either try to get an exlusive lock in the first place
if( flock($myfile, LOCK_EX) ) {
// ...
or you unlock the shared lock before you write
flock($myfile, LOCK_UN);
if ( file_put_contents('15-11-2018.txt', $current, LOCK_EX) ) {
// ...
In general it is a good idea to keep a locks life as short as possible. If you plan to make extensive manipulations to your data between reading and writing, I would recommend to unlock the file right after reading and lock it again right for writing.

PHP, check if the file is being written to/updated by PHP script?

I have a script that re-writes a file every few hours. This file is inserted into end users html, via php include.
How can I check if my script, at this exact moment, is working (e.g. re-writing) the file when it is being called to user for display? Is it even an issue, in terms of what will happen if they access the file at the same time, what are the odds and will the user just have to wait untill the script is finished its work?
Thanks in advance!
More on the subject...
Is this a way forward using file_put_contents and LOCK_EX?
when script saves its data every now and then
file_put_contents($content,"text", LOCK_EX);
and when user opens the page
if (file_exists("text")) {
function include_file() {
$file = fopen("text", "r");
if (flock($file, LOCK_EX)) {
include_file();
}
else {
echo file_get_contents("text");
}
}
} else {
echo 'no such file';
}
Could anyone advice me on the syntax, is this a proper way to call include_file() after condition and how can I limit a number of such calls?
I guess this solution is also good, except same call to include_file(), would it even work?
function include_file() {
$time = time();
$file = filectime("text");
if ($file + 1 < $time) {
echo "good to read";
} else {
echo "have to wait";
include_file();
}
}
To check if the file is currently being written, you can use filectime() function to get the actual time the file is being written.
You can get current timestamp on top of your script in a variable and whenever you need to access the file, you can compare the current timestamp with the filectime() of that file, if file creation time is latest then the scenario occured when you have to wait for that file to be written and you can log that in database or another file.
To prevent this scenario from happening, you can change the script which is writing the file so that, it first creates temporary file and once it's done you just replace (move or rename) the temporary file with original file, this action would require very less time compared to file writing and make the scenario occurrence very rare possibility.
Even if read and replace operation occurs simultaneously, the time the read script has to wait will be very less.
Depending on the size of the file, this might be an issue of concurrency. But you might solve that quite easy: before starting to write the file, you might create a kind of "lock file", i.e. if your file is named "incfile.php" you might create an "incfile.php.lock". Once you're doen with writing, you will remove this file.
On the include side, you can check for the existance of the "incfile.php.lock" and wait until it's disappeared, need some looping and sleeping in the unlikely case of a concurrent access.
Basically, you should consider another solution by just writing the data which is rendered in to that file to a database (locks etc are available) and render that in a module which then gets included in your page. Solutions like yours are hardly to maintain on the long run ...
This question is old, but I add this answer because the other answers have no code.
function write_to_file(string $fp, string $string) : bool {
$timestamp_before_fwrite = date("U");
$stream = fopen($fp, "w");
fwrite($stream, $string);
while(is_resource($stream)) {
fclose($stream);
}
$file_last_changed = filemtime($fp);
if ($file_last_changed < $timestamp_before_fwrite) {
//File not changed code
return false;
}
return true;
}
This is the function I use to write to file, it first gets the current timestamp before making changes to the file, and then I compare the timestamp to the last time the file was changed.

Running concurrent PHP scripts

I'm having the following problem with my VPS server.
I have a long-running PHP script that sends big files to the browser. It does something like this:
<?php
header("Content-type: application/octet-stream");
readfile("really-big-file.zip");
exit();
?>
This basically reads the file from the server's file system and sends it to the browser. I can't just use direct links(and let Apache serve the file) because there is business logic in the application that needs to be applied.
The problem is that while such download is running, the site doesn't respond to other requests.
The problem you are experiencing is related to the fact that you are using sessions. When a script has a running session, it locks the session file to prevent concurrent writes which may corrupt the session data. This means that multiple requests from the same client - using the same session ID - will not be executed concurrently, they will be queued and can only execute one at a time.
Multiple users will not experience this issue, as they will use different session IDs. This does not mean that you don't have a problem, because you may conceivably want to access the site whilst a file is downloading, or set multiple files downloading at once.
The solution is actually very simple: call session_write_close() before you start to output the file. This will close the session file, release the lock and allow further concurrent requests to execute.
Your server setup is probably not the only place you should be checking.
Try doing a request from your browser as usual and then do another from some other client.
Either wget from the same machine or another browser on a different machine.
In what way doesn't the server respond to other requests? Is it "Waiting for example.com..." or does it give an error of any kind?
I do something similar, but I serve the file chunked, which gives the file system a break while the client accepts and downloads a chunk, which is better than offering up the entire thing at once, which is pretty demanding on the file system and the entire server.
EDIT: While not the answer to this question, asker asked about reading a file chunked. Here's the function that I use. Supply it the full path to the file.
function readfile_chunked($file_path, $retbytes = true)
{
$buffer = '';
$cnt = 0;
$chunksize = 1 * (1024 * 1024); // 1 = 1MB chunk size
$handle = fopen($file_path, 'rb');
if ($handle === false) {
return false;
}
while (!feof($handle)) {
$buffer = fread($handle, $chunksize);
echo $buffer;
ob_flush();
flush();
if ($retbytes) {
$cnt += strlen($buffer);
}
}
$status = fclose($handle);
if ($retbytes && $status) {
return $cnt; // return num. bytes delivered like readfile() does.
}
return $status;
}
I have tried different approaches (reading and sending the files in small chunks [see comments on readfile in PHP doc], using PEARs HTTP_Download) but I always ran into performance problems when the files are getting big.
There is an Apache mod X-Sendfile where you can do your business logic and then delegate the download to Apache. The download will not be publicly available. I think, this is the most elegant solution for the problem.
More Info:
http://tn123.org/mod_xsendfile/
http://www.brighterlamp.com/2010/10/send-files-faster-better-with-php-mod_xsendfile/
The same happens go to me and i'm not using sessions.
session.auto_start is set to 0
My example script only runs "sleep(5)", and adding "session_write_close()" at the beginning doesn't solve the problem.
Check your httpd.conf file. Maybe you have "KeepAlive On" and that is why your second request hangs until the first is completed. In general your PHP script should not allow the visitors to wait for long time. If you need to download something big, do it in a separate internal request that user have no direct control of. Until its done, return some "executing" status to the end user and when its done, process the actual results.

Bug in my caching code

Here's my code:
$cachefile = "cache/ttcache.php";
if(file_exists($cachefile) && ((time() - filemtime($cachefile)) < 900))
{
include($cachefile);
}
else
{
ob_start();
/*resource-intensive loop that outputs
a listing of the top tags used on the website*/
$fp = fopen($cachefile, 'w');
fwrite($fp, ob_get_contents());
fflush($fp);
fclose($fp);
ob_end_flush();
}
This code seemed like it worked fine at first sight, but I found a bug, and I can't figure out how to solve it. Basically, it seems that after I leave the page alone for a period of time, the cache file empties (either that, or when I refresh the page, it clears the cache file, rendering it blank). Then the conditional sees the now-blank cache file, sees its age as less than 900 seconds, and pulls the blank cache file's contents in place of re-running the loop and refilling the cache.
I catted the cache file in the command line and saw that it is indeed blank when this problem exists.
I tried setting it to 60 seconds to replicate this problem more often and hopefully get to the bottom of it, but it doesn't seem to replicate if I am looking for it, only when I leave the page and come back after a while.
Any help?
In the caching routines that I write, I almost always check the filesize, as I want to make sure I'm not spewing blank data, because I rely on a bash script to clear out the cache.
if(file_exists($cachefile) && (filesize($cachefile) > 1024) && ((time() - filemtime($cachefile)) < 900))
This assumes that your outputted cache file is > 1024 bytes, which, usually it will be if it's anything relatively large. Adding a lock file would be useful as well, as noted in the comments above to avoid multiple processes trying to write to the same lock file.
you can double check the file size with the filesize() function, if it's too small, act as if the cache was old.
if there's no PHP in the file, you may want to either use readfile() for performance reasons to just spit the file back out to the end user.

Categories