I am trying to lock a file using the flock() function to avoid an overwrite error, but the counter often resets (probably due to a reading issue).
numeri.txt (counter)
4895|533753
frame.php (PHP file)
$filename="numeri.txt";
$contents=file_get_contents($filename);
if(($fp=#fopen($filename,'w'))!==false)
{
if(flock($fp,LOCK_EX))
{
$contents=explode("|",$contents);
$clicks=$contents[0];
$impressions=$contents[1]+1;
fwrite($fp,$clicks."|".$impressions);
flock($fp,LOCK_UN);
}
fclose($fp);
}
Sometimes the counter returns "|1" instead of "4895|533754".
How can I fix it?
If two threads execute your code almost simultaneously, with a small delay between them, the first thread will open the file for writing and erase its contents before acquiring the lock.
The second thread will then read the empty file contents, wait for the lock to be released, and overwrite the correct data.
The solution is to open the file not in "w" mode but in "a" or "c" mode, and then use fseek(), ftruncate() and fwrite() once the lock is held.
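For example, here is a minimal sketch of the counter script rewritten along those lines (using "c+" so the same handle can also be read; adjust error handling as needed):

$filename = "numeri.txt";
if (($fp = fopen($filename, 'c+')) !== false) // 'c+' opens for reading and writing without truncating
{
    if (flock($fp, LOCK_EX))
    {
        // Read the current contents only after the lock is held.
        $parts = explode("|", stream_get_contents($fp));
        $clicks = isset($parts[0]) ? (int)$parts[0] : 0;
        $impressions = isset($parts[1]) ? (int)$parts[1] + 1 : 1;
        // Rewind and truncate before writing the updated counter.
        rewind($fp);
        ftruncate($fp, 0);
        fwrite($fp, $clicks . "|" . $impressions);
        flock($fp, LOCK_UN);
    }
    fclose($fp);
}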
Related
I have a PHP script which receives and saves invoices as files in Linux. Later, a C++ infinite loop based program reads each and does some processing. I want the latter to read each file safely (only after fully written).
PHP side code simplification:
file_put_contents("sampleDir/invoice.xml", "contents", LOCK_EX)
On the C++ side (using the C file API), I must first note that I want to preserve the code which deletes any empty files in the designated invoices folder, as a means to properly deal with the edge case of an empty file being created by other sources (not the PHP script).
Now, here's a C++ side code simplification, too:
FILE* pInvoiceFile = fopen("sampleDir/invoice.xml", "r");
if (pInvoiceFile != NULL)
{
    if (flock(fileno(pInvoiceFile), LOCK_SH) == 0)
    {
        struct stat fileStat;
        fstat(fileno(pInvoiceFile), &fileStat);
        string invoice;
        invoice.resize(fileStat.st_size);
        if (fread(&invoice[0], 1, fileStat.st_size, pInvoiceFile) < 1)
        {
            remove("sampleDir/invoice.xml"); // Edge case resolution
        }
        flock(fileno(pInvoiceFile), LOCK_UN);
    }
    fclose(pInvoiceFile);
}
As you can see, the key concept is the cooperation of the LOCK_EX and LOCK_SH flags.
My problem is that, while this integration has been working fine, yesterday I noticed the edge-case branch run for an invoice which should not have been empty, and so the file was deleted by the C++ program.
PHP manual on file_put_contents mentions the following for the LOCK_EX flag:
Acquire an exclusive lock on the file while proceeding to the writing. In other words, a flock() call happens between the fopen() call and the fwrite() call. This is not identical to an fopen() call with mode "x".
Could the problem be a race condition caused by the LOCK_EX not being acquired until after file_put_contents calls fopen? If so, what could be done to solve this while keeping the edge-case removal code?
Otherwise, may I be doing anything wrong overall?
Your code assumes that the file_put_contents() operation is atomic, and that using LOCK_EX and LOCK_SH is enough to ensure no race conditions between the two programs. This is not the case.
As you can see from the PHP documentation, LOCK_EX is applied after opening the file. This is important, because it leaves a short window of time in which the C++ program can successfully open the file and lock it with LOCK_SH. At that point the file has already been truncated by PHP's fopen(), and it is empty.
What's most likely happening is:
1. PHP code opens the file for writing, truncating it and effectively wiping out its content.
2. C++ code opens the file for reading.
3. C++ code requests the shared lock on the file: the lock is granted.
4. PHP code requests the exclusive lock on the file: the call blocks, waiting for the lock to become available.
5. C++ code reads the file's contents: nothing, the file is empty.
6. C++ code deletes the file.
7. C++ code releases the shared lock.
8. PHP code acquires the exclusive lock.
9. PHP code writes to the file: the data goes to an inode that is no longer linked to any directory entry, so it is effectively lost.
You are left with no file, and the data is gone.
The problem with your code is that the operations you are doing on the file from two different programs are not atomic, and the way you are acquiring the locks does not help in ensuring that those don't overlap.
The only sane way of guaranteeing the atomicity of such an operation on a POSIX compliant system, without even worrying about file locking, is to take advantage of the atomicity of rename(2):
If newpath already exists, it will be atomically replaced, so that there is no point at which another process attempting to access newpath will find it missing.
If newpath exists but the operation fails for some reason, rename() guarantees to leave an instance of newpath in place.
The equivalent rename() PHP function is what you should use in this case. It's the simplest way to guarantee atomic updates to a file.
What I would suggest is the following:
PHP code:
$tmpfname = tempnam("sampleDir", "myprefix"); // Create a temporary file in the same directory, so rename() stays atomic (it is not atomic across filesystems).
file_put_contents($tmpfname, "contents");     // Write to the temporary file.
rename($tmpfname, "sampleDir/invoice.xml");   // Atomically replace the contents of invoice.xml by renaming the file.
// TODO: check for errors in all the above calls, most importantly tempnam().
C++ code:
FILE* pInvoiceFile = fopen("sampleDir/invoice.xml", "r");
if (pInvoiceFile != NULL)
{
    // Size the read buffer from the file's metadata.
    struct stat fileStat;
    fstat(fileno(pInvoiceFile), &fileStat);
    string invoice;
    invoice.resize(fileStat.st_size);
    size_t n = fread(&invoice[0], 1, fileStat.st_size, pInvoiceFile);
    fclose(pInvoiceFile);
    // A zero-byte read still means an empty file: apply the edge-case removal.
    if (n == 0)
        remove("sampleDir/invoice.xml");
}
This way, the C++ program will always either see the old version of the file (if fopen() happens before PHP's rename()) or the new version of the file (if fopen() happens after), but it will never see an inconsistent version of the file.
I wanted to wait for all processes reading a certain file in PHP by obtaining an exclusive lock on that file, and after that delete (unlink) the file. This concerns files like profile pictures which a user can delete or change. The name of the file will be something like the user ID.
My code:
//Obtain lock
$file = fopen("path/to/file", "r"); //(I'm not sure which mode to use here btw)
flock($file, LOCK_EX);
//Delete file
unlink("path/to/file");
The flock() call on line 3 waits for all locks to be released, which is good, but the unlink() function then throws an error: Warning: unlink(path/to/file): Resource temporarily unavailable in path/to/script on line xx
To prevent this I could release the lock before calling unlink, but this means another process could lock on the file again, which would cause the same error.
My questions are:
Is it possible to delete a file in PHP without releasing the lock? That is, without the risk of other processes trying to use the file at the same time.
If not:
Is this possible in Windows at all? How about Unix?
Should I involve my database for this matter and lock on rows in the database instead, or is there a better way?
Another option I can see is repeating this piece of code, including a release of the lock before calling unlink, until unlink succeeds, but this seems a bit messy, right?
Hey, I'm struggling with this too, 2 years later. It seems a bit dumb that you can't acquire an exclusive lock on a file when trying to rename or unlink it, or at least that the documentation for doing so isn't there.
One solution is to open the file for writing, acquire an exclusive lock, clear the contents of the file using ftruncate(), close it, and then unlink it. When reading from the file, you can check its size to make sure the file has contents.
When deleting (untested code):
$fh = fopen('yourfile.txt', 'c'); // 'w' mode truncates file, you don't want to do that yet!
flock($fh, LOCK_EX); // blocking, but you can use LOCK_EX | LOCK_NB for nonblocking and a loop + sleep(1) for a timeout
ftruncate($fh, 0); // truncate file to 0 length
fclose($fh);
unlink('yourfile.txt');
When reading (untested code):
if (!file_exists('yourfile.txt') || filesize('yourfile.txt') <= 0) {
print 'nah.jpg, must be dELeTeD :O';
}
I am trying to understand the right way to synchronize file reads/writes using flock() in PHP.
I have two php scripts.
testread.php:
<?php
$fp = fopen("test.txt", "r");
if (!flock($fp, LOCK_SH))
    echo "failed to lock\n";
else
    echo "lock ok\n";
while (true) sleep(1000);
?>
and testwrite.php:
<?php
$fp = fopen("test.txt", "w");
if (flock($fp, LOCK_EX | LOCK_NB))
{
    echo "acquired write lock\n";
}
else
{
    echo "failed to acquire write lock\n";
}
fclose($fp);
?>
Now I run testread.php and let it hang there. Then I run testwrite.php in another session. As expected, flock() fails in testwrite.php. However, the content of test.txt is cleared when testwrite.php exits. The fact is, fopen() always succeeds even if the file has been locked by another process, and if the file is opened in "w" mode, its content is erased regardless of the lock. So what is the point of flock() here? It doesn't really protect anything.
You are using fopen() with the w mode in testwrite.php. When using the w option, fopen() truncates the file upon opening it (see fopen()).
Because of that, the file in your example gets truncated before you try to obtain the exclusive lock. However, you need an open file handle in order to use flock().
The way out of this dilemma is to use a lock file different from the file you are working on. The flock() manual page mentions this:
Because flock() requires a file pointer, you may have to use a special lock file to protect access to a file that you intend to truncate by opening it in write mode (with a "w" or "w+" argument to fopen()).
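A minimal sketch of that approach, applied to testwrite.php (the lock-file name test.txt.lock is just an assumption for illustration):

<?php
// Lock a separate file instead of the one we are about to truncate.
$lock = fopen("test.txt.lock", "c");
if (flock($lock, LOCK_EX))
{
    // Only open (and truncate) the real file once the lock is held.
    $fp = fopen("test.txt", "w");
    fwrite($fp, "new contents\n");
    fclose($fp);
    flock($lock, LOCK_UN);
}
fclose($lock);
?>

testread.php would then take a shared lock on the same lock file before reading test.txt.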
The accepted answer is overly complicated. You can simply open the file using a "c" argument, which doesn't truncate the file. Then call ftruncate() only if you acquire the lock.
From the documentation:
'c' Open the file for writing only. If the file does not exist, it is created. If it exists, it is neither truncated (as opposed to 'w'), nor the call to this function fails (as is the case with 'x'). The file pointer is positioned on the beginning of the file. This may be useful if it's desired to get an advisory lock (see flock()) before attempting to modify the file, as using 'w' could truncate the file before the lock was obtained (if truncation is desired, ftruncate() can be used after the lock is requested).
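In other words, testwrite.php could look something like this (a sketch with error handling omitted):

<?php
$fp = fopen("test.txt", "c"); // does not truncate the existing contents
if (flock($fp, LOCK_EX | LOCK_NB))
{
    ftruncate($fp, 0); // truncate only after the lock is held
    fwrite($fp, "new data\n");
    flock($fp, LOCK_UN);
}
else
{
    echo "failed to acquire write lock\n";
}
fclose($fp);
?>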
I have a file on a website. A PHP script modifies it like this:
$contents = file_get_contents("MyFile");
// ** Modify $contents **
// Now rewrite:
$file = fopen("MyFile","w+");
fwrite($file, $contents);
fclose($file);
The modification is pretty simple. It grabs the file's contents and adds a few lines. Then it overwrites the file.
I am aware that PHP has a function for appending contents to a file rather than overwriting it all over again. However, I want to keep using this method since I'll probably change the modification algorithm in the future (so appending may not be enough).
Anyway, I was testing this out, making like 100 requests. Each time I call the script, I add a new line to the file:
First call:
First!
Second call:
First!
Second!
Third call:
First!
Second!
Third!
Pretty cool. But then:
Fourth call:
Fourth!
Fifth call:
Fourth!
Fifth!
As you can see, the first, second and third lines simply disappeared.
I've determined that the problem isn't the contents string modification algorithm (I've tested it separately). Something is messed up either when reading or writing the file.
I think it is very likely that the issue is when the file's contents are read: if $contents, for some odd reason, is empty, then the behavior shown above makes sense.
I'm no expert with PHP, but perhaps the fact that I performed 100 calls almost simultaneously caused this issue. What if there are two processes, and one is writing the file while the other is reading it?
What is the recommended approach for this issue? How should I manage file modifications when several processes could be writing/reading the same file?
What you need to do is use flock() (file lock)
What I think is happening is that your script grabs the file while the previous request is still writing to it. Since opening the file in write mode truncates it first, PHP reads an empty string at that moment, and once the later process is done it overwrites the previous contents.
The solution is to have the script usleep() for a few milliseconds when the file is locked and then try again. Just be sure to put a limit on how many times your script can try.
NOTICE:
If another PHP script or application accesses the file, it may not necessarily use/check for file locks. This is because file locks are often seen as an optional extra, since in most cases they aren't needed.
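A rough sketch of that retry idea applied to the script above, using a non-blocking lock (the cap of 10 attempts and the 5 ms delay are arbitrary assumptions):

$fh = fopen("MyFile", 'c+'); // 'c+' opens for read/write without truncating
$locked = false;
for ($i = 0; $i < 10; $i++) { // put a limit on how many times to try
    if (flock($fh, LOCK_EX | LOCK_NB)) {
        $locked = true;
        break;
    }
    usleep(5000); // wait 5 ms before trying again
}
if ($locked) {
    $contents = stream_get_contents($fh);
    // ** Modify $contents **
    rewind($fh);
    ftruncate($fh, 0);
    fwrite($fh, $contents);
    flock($fh, LOCK_UN);
}
fclose($fh);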
So the issue is parallel access to the same file: while one instance is writing to the file, another is reading it before the file has been updated.
Luckily, PHP has a mechanism for locking the file so that no one can read from it until the lock is released and the file has been updated: flock() can be used, and the documentation is here.
You need to create a lock, so that any concurrent requests will have to wait their turn. This can be done using the flock() function. You will have to use fopen(), as opposed to file_get_contents(), but it should not be a problem:
$file = 'file.txt';
$fh = fopen($file, 'r+');
if (flock($fh, LOCK_EX)) { // Get an exclusive lock
    $data = fread($fh, filesize($file)); // Get the contents of the file
    // Do something with the data here...
    rewind($fh); // Move the file pointer back to the start (ftruncate does not move it)
    ftruncate($fh, 0); // Empty the file
    fwrite($fh, $newData); // Write new data to the file
    fclose($fh); // Close the handle and release the lock
} else {
    die('Unable to get a lock on file: '.$file);
}
What's the cleanest way in php to open a file, read the contents, and subsequently overwrite the file's contents with some output based on the original contents? Specifically, I'm trying to open a file populated with a list of items (separated by newlines), process/add items to the list, remove the oldest N entries from the list, and finally write the list back into the file.
fopen(<path>, 'a+')
flock(<handle>, LOCK_EX)
fread(<handle>, filesize(<path>))
// process contents and remove old entries
fwrite(<handle>, <contents>)
flock(<handle>, LOCK_UN)
fclose(<handle>)
Note that I need to lock the file with flock() in order to protect it across multiple page requests. Will the 'w+' flag when fopen()ing do the trick? The php manual states that it will truncate the file to zero length, so it seems that may prevent me from reading the file's current contents.
If the file isn't overly large (that is, you can be confident loading it won't blow PHP's memory limit), then the easiest way to go is to just read the entire file into a string (file_get_contents()), process the string, and write the result back to the file (file_put_contents()). This approach has two problems:
If the file is too large (say, tens or hundreds of megabytes), or the processing is memory-hungry, you're going to run out of memory (even more so when you have multiple instances of the thing running).
The operation is destructive; when the saving fails halfway through, you lose all your original data.
If any of these is a concern, plan B is to process the file and at the same time write to a temporary file; after successful completion, close both files, rename (or delete) the original file and then rename the temporary file to the original filename.
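A sketch of that plan B for the list-trimming case (the file name, the keep-count, and the temp-file prefix are assumptions; error handling is kept minimal):

$path = 'list.txt';
$keep = 100; // how many of the newest entries to retain

$lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
$lines[] = 'new item'; // process/add items here
$lines = array_slice($lines, -$keep); // drop the oldest entries

// Write to a temporary file in the same directory, then atomically swap it in.
$tmp = tempnam(dirname($path), 'list');
if ($tmp !== false && file_put_contents($tmp, implode("\n", $lines) . "\n") !== false) {
    rename($tmp, $path);
} else {
    if ($tmp !== false) unlink($tmp); // saving failed: the original file is untouched
}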
Read:
$data = file_get_contents($filename);
Write:
file_put_contents($filename, $data);
One solution is to use a separate lock file to control access.
This solution assumes that only your script, or scripts you have access to, will want to write to the file. This is because the scripts will need to know to check a separate file for access.
$file_lock = obtain_file_lock();
if ($file_lock) {
$old_information = file_get_contents('/path/to/main/file');
$new_information = update_information_somehow($old_information);
file_put_contents('/path/to/main/file', $new_information);
release_file_lock($file_lock);
}
function obtain_file_lock() {
    $attempts = 10;
    // There are probably better ways of dealing with waiting for a file
    // lock but this shows the principle of dealing with the original
    // question.
    for ($ii = 0; $ii < $attempts; $ii++) {
        $lock_file = fopen('/path/to/lock/file', 'c'); // 'c' also creates the lock file if it is missing
        if (flock($lock_file, LOCK_EX | LOCK_NB)) { // non-blocking, so a failed attempt falls through to the retry
            return $lock_file;
        }
        fclose($lock_file); // close the handle before retrying
        usleep(100000); // give the other process 0.1 seconds to release the lock
    }
    // This is only reached if all attempts fail.
    return false; // callers must deal with that eventuality
}
function release_file_lock($lock_file) {
flock($lock_file, LOCK_UN);
fclose($lock_file);
}
This should prevent a concurrently-running script reading old information and updating that, causing you to lose information that another script has updated after you read the file. It will allow only one instance of the script to read the file and then overwrite it with updated information.
While this hopefully answers the original question, it doesn't give a good solution for making sure all concurrent scripts eventually get the chance to record their information.