PHP allows us to use the x flag when doing fopen:
Create and open for writing only; place the file pointer at the
beginning of the file.
If the file already exists, the fopen() call
will fail by returning FALSE and generating an error of level
E_WARNING.
If the file does not exist, attempt to create it. This is
equivalent to specifying O_EXCL|O_CREAT flags for the underlying
open(2) system call.
Does this mean that no matter how many concurrent fopen requests we have (from different users), it is guaranteed that the file will only be created once and never overwritten?
if ($handle = fopen("part006", "x+b")) {
do_some_processing();
echo "You managed to process.";
/*
can we guarantee that only 1 user (http request)
will ever process the function and see the
message "you managed to process" ?
*/
} else {
echo "You failed to process.";
}
Answer: It is guaranteed that the file will be only created once and never being overwritten , as long as other processes also use O_EXCL. If they do not, the file can be overwritten. So, opening (creating) a file using O_EXCL means not, that the file is somewhat write protected
Explanation: fopen uses the underlying open syscall. From the man page: man 2 open
O_EXCL
If O_CREAT and O_EXCL are set, open() shall fail if the file exists. The check for the existence of the file and the creation of the file if it does not exist shall be atomic with respect to other threads executing open() naming the same filename in the same directory with O_EXCL and O_CREAT set. If O_EXCL and O_CREAT are set, and path names a symbolic link, open() shall fail and set errno to [EEXIST], regardless of the contents of the symbolic link. If O_EXCL is set and O_CREAT is not set, the result is undefined.
Related
I am writing a wee wrapper (in PHP 7.0.33) around a complex binary which takes its input from a named file. As this will be processing secrets, I don't want to commit the data to the filesystem, hence using a FIFO rather than conventional file. The binary will happily read its data from a FIFO (tested using 2 shell sessions - in one I created the fifo and started the binary, in the second I catted a file into the fifo).
However in PHP the call to fopen() blocks, regardless if I specify w, a or c
if (posix_mkfifo("myfifo", 0600)) {
$writer=fopen("myfifo", 'w'); // this blocks
`binary myfifo`;
fputs($writer, $mydata);
}
While I would expect writes to block if nothing is reading the data, I did not expect fopen() to block.
It does appear to work (execution continues, and the binary is started) with "w+" however the binary fails with
QIODevice::read (QFile, "filename"): device not open
To investigate further, I wrote a simple replacement for the binary. Again this works when I cat a file into the FIFO:
$in='';
$fh=fopen($argv[1],'r');
if (is_resource($fh)) {
print "File opened\n";
while (!feof($fh)) {
$in.=fgets($fh);
}
} else {
print "failed to open file\n";
}
file_put_contents("output", $in);
but when I write to the FIFO from the PHP code....
fopen(import): failed to open stream: No such file or directory in ...
By default, opening a FIFO will block until there is at least one reader and writer. The rationale for this is that the kernel has no place to stash the pipe data if no process is there to consume it. man page for fifo:
... the FIFO special file has no contents on the file system, the file system entry merely serves as a reference point so that processes can access the pipe using a name in the file system.
The kernel maintains exactly one pipe object for each FIFO special file that is opened by at least one process. The FIFO must be opened on both ends (reading and writing) before data can be passed. Normally, opening the FIFO blocks until the other end is opened also.
You can bypass this behaviour though. One way is like you've done - by opening the read and write end yourself. The other is to set O_NONBLOCK flag when opening the file (you can set it back to block afterwards). AFAIK you can't do that with fopen. Example with dio library:
<?php
echo "Opening\n";
$writer = dio_open("myfifo", O_CREAT | O_WRONLY | O_NONBLOCK) or die("Could not create FIFO\n");
echo "Open. Writing\n";
dio_write($writer, "DATA");
echo "Done\n";
The caveat with doing this is, if there is no reader, the process above will write the data, then exit immediately and then the data is lost forever.
For the benefit of anyone finding this question in Google and seeking a more detailled solution than comment on semisecure's answer:
if (pcntl_fork()) {
`binary myfifo`;
} else {
$fh=fopen('myfifo', 'w');
fputs($fh, $data);
fclose($fh);
}
(but you might also want to add a SIGALRM on the writer in case "binary" does not execute / flush the pipe).
I have a PHP script which receives and saves invoices as files in Linux. Later, a C++ infinite loop based program reads each and does some processing. I want the latter to read each file safely (only after fully written).
PHP side code simplification:
file_put_contents("sampleDir/invoice.xml", "contents", LOCK_EX)
On the C++ side (with C filesystem API), I must first note that I want to preserve a code which deletes the files in the designated invoices folder which are empty, just as a means to properly deal with the edge case of an empty file being created from other sources (not the PHP script).
Now, here's a C++ side code simplification, too:
FILE* pInvoiceFile = fopen("sampleDir/invoice.xml", "r");
if (pInvoiceFile != NULL)
{
if (flock(pInvoiceFile->_fileno, LOCK_SH) == 0)
{
struct stat fileStat;
fstat(pInvoiceFile->_fileno, &fileStat);
string invoice;
invoice.resize(fileStat.st_size);
if (fread((char*)invoice.data(), 1, fileStat.st_size, pInvoiceFile) < 1)
{
remove("sampleDir/invoice.xml"); // Edge case resolution
}
flock(pInvoiceFile->_fileno, LOCK_UN);
}
}
fclose(pInvoiceFile);
As you can see, the summarizing key concept is the cooperation of LOCK_EX and LOCK_SH flags.
My problem is that, while this integration has been working fine, yesterday I noticed the edge case executed for an invoice which should not be empty, and thus it got deleted by the C++ program.
PHP manual on file_put_contents mentions the following for the LOCK_EX flag:
Acquire an exclusive lock on the file while proceeding to the writing. In other words, a flock() call happens between the fopen() call and the fwrite() call. This is not identical to an fopen() call with mode "x".
Could the problem be caused as a race condition by the LOCK_EX not being established right before file_put_contents calls fopen? If so, what could be done to solve this while keeping the edge case removal code?
Otherwise, may I be doing anything wrong overall?
Your code is assuming that the file_put_contents() operation is atomic, and that using FLOCK_EX and FLOCK_SH is enough to ensure no race conditions between the two programs happen. This is not the case.
As you can see from the PHP doc, the FLOCK_EX is applied after opening the file. This is important, because it leaves a short window of time for the C++ program to successfully open the file and lock it with FLOCK_SH. At that point the file was already truncated by the fopen() done by PHP, and it's empty.
What's most likely happening is:
PHP code opens the file for writing, truncating it and effectively wiping out its content.
C++ code opens the file for reading.
C++ code requests the shared lock on the file: the lock is granted.
PHP code requests the exclusive lock on the file: the call blocks, waiting for the lock to be available.
C++ code reads the file's contents: nothing, the file is empty.
C++ code deletes the file.
C++ code releases the shared lock.
PHP code acquires the exclusive lock.
PHP code writes to the file: the data does not reach the disk because the inode associated with the open file descriptor does not exist anymore.
You are effectively left with no file and the data is lost.
The problem with your code is that the operations you are doing on the file from two different programs are not atomic, and the way you are acquiring the locks does not help in ensuring that those don't overlap.
The only sane way of guaranteeing the atomicity of such an operation on a POSIX compliant system, without even worrying about file locking, is to take advantage of the atomicity of rename(2):
If newpath already exists, it will be atomically replaced, so that there is no point at which another process attempting to access newpath will find it missing.
If newpath exists but the operation fails for some reason, rename() guarantees to leave an instance of newpath in place.
The equivalent rename() PHP function is what you should use in this case. It's the simplest way to guarantee atomic updates to a file.
What I would suggest is the following:
PHP code:
$tmpfname = tempnam("/tmp", "myprefix"); // Create a temporary file.
file_put_contents($tmpfname, "contents"); // Write to the temporary file.
rename($tmpfname, "sampleDir/invoice.xml"); // Atomically replace the contents of invoice.xml by renaming the file.
// TODO: check for errors in all the above calls, most importantly tempnam().
C++ code:
FILE* pInvoiceFile = fopen("sampleDir/invoice.xml", "r");
if (pInvoiceFile != NULL)
{
struct stat fileStat;
fstat(fileno(pInvoiceFile), &fileStat);
string invoice;
invoice.resize(fileStat.st_size);
size_t n = fread(&invoice[0], 1, fileStat.st_size, pInvoiceFile);
fclose(pInvoiceFile);
if (n == 0)
remove("sampleDir/invoice.xml");
}
This way, the C++ program will always either see the old version of the file (if fopen() happens before PHP's rename()) or the new version of the file (if fopen() happens after), but it will never see an inconsistent version of the file.
in my PHP project, I use some kind of counter that appends to an existing (or new) file very often:
$f = fopen($filename, 'ab');
fwrite($f);
fclose($f);
When a new file is created, I have to edit this file's permissions, so another user may access the file as well:
$existed = file_exists($filename);
// Do the append
$f = fopen($filename, 'ab');
fwrite($f);
fclose($f);
// Update permissions
if (!$existed) {
#chmod($filename, 0666);
}
Is there any way to find out, whether 'a' (append) created a new file or appended to an exiting one without using file_exists()? To my understanding, file_exists() retrieves the file stats, which causes some unnecessary overhead compared to a simple file-append. As the function is used very often, I wonder if there's an option to tell if fopen(..., 'a') created a new file without using file_exists()?
Note: This is mostly a question of style and interest, not a true performance issue. But if I am mistaken and fopen() already retrieves the file stats, please let me know!
Update
Okay, it really is a rather academic question. Here're some performance tests run on a windows system (Apache, Win8.1 - no UNIX file permissions) and a linux machine (Nginx, Ubuntu 14.04, virutal machine).
Each test run with 1000 repetitions, file deleted before the first repetition.
Win Linux
simply append one byte 1.8ms 9.4ms
append + clearstatcache() 1.8ms 9.3ms
test fileexists() + append 2.2ms 10.5ms
fileexists() + append + clear 2.2ms 11.0ms
append + chmod() 2.7ms 12.3ms
append + fileexists() -> chmod() 3.3ms 10.6ms
Note: The last one is the only one that uses and IF within the test loop.
The php fopen is just a call to the libc fopen, that automatically creates a file for the modes w,w+,a and a+. As far as I can see, there is no way to get the stat with the permission bits from the returned file pointer.
It seems that PHP stores the stat array for each opened file and you can access it with fstat($fp) with the opened file handle $fp. But the mode field contains inode permission bits. I can't immediately see how "inode permission bits" are related to the "UNIX file mode". The stat system call does not use this term.
You can use "r+" mode to open your file and create it if that fails. If not you need to SEEK to then end to achieve something similar.
But finally it's best to check for existence before you open the file.
No, fopen just returns the resource, it doesn't return or set a flag that indicates whether the file already existed - http://php.net/manual/en/function.fopen.php
EDIT: see the performance test in the edited question.
Why not calling chmod() every times?
Your file_exist() is probably (maybe a little performance test...) more expensive than a chmod().
// Do the append
$f = fopen($filename, 'ab');
fwrite($f);
fclose($f);
// Update persmissions
#chmod($this->filename, 0666);
I am trying to understand the right way to synchronize file read/write using the flock in PHP.
I have two php scripts.
testread.php:
<?
$fp=fopen("test.txt","r");
if (!flock($fp,LOCK_SH))
echo "failed to lock\n";
else
echo "lock ok\n";
while(true) sleep(1000);
?>
and testwrite.php:
<?
$fp=fopen("test.txt","w");
if (flock($fp,LOCK_EX|LOCK_NB))
{
echo "acquired write lock\n";
}
else
{
echo "failed to acquire write lock\n";
}
fclose($fp);
?>
Now I run testread.php and let it hang there. Then I run testwrite.php in another session. As expected, flock failed in testwrite.php. However, the content of the file test.txt is cleared when testwrite.php exits. The fact is, fopen always succeeds even if the file has been locked in another process. If the file is opened with "w" mode, the file content will be erased regardless of the lock. So what is the point of flock here? It doesn't really protect anything.
You are using fopen() with the w mode in testwrite.php. When using the w option fopen() will truncate the file after opening it. (see fopen()).
Because of that the file gets truncated in your example before you try to obtain the exclusive lock. However you'll need an open file descriptor in order to use flock().
The way out of this dilemma is to use a lock file different from the file you are working on. The flock() manual page mentions this:
Because flock() requires a file pointer, you may have to use a special lock file to protect access to a file that you intend to truncate by opening it in write mode (with a "w" or "w+" argument to fopen()).
The accepted answer is overly complicated. You can simply open the file using a "c" argument, which doesn't truncate the file. Then call ftruncate() only if you acquire the lock.
From the documentation:
'c' Open the file for writing only. If the file does not exist, it is
created. If it exists, it is neither truncated (as opposed to 'w'),
nor the call to this function fails (as is the case with 'x'). The
file pointer is positioned on the beginning of the file. This may be
useful if it's desired to get an advisory lock (see flock()) before
attempting to modify the file, as using 'w' could truncate the file
before the lock was obtained (if truncation is desired, ftruncate()
can be used after the lock is requested).
I have the following code:
function lock() {
if($lock = #fopen('lock.txt', 'x')) {
fwrite($lock, getmypid());
fclose($lock);
return true;
}
else {
return false;
}
}
$response = lock();
When I run the code, the file lock.txt is created and the PID is put into the file. However, the function returns false. What in the world is going on?
I need X for fopen because I'm using this function for file locking and control
I took the # off and this is the error that I got:
fopen(lock.txt): failed to open stream: File exists in /xxx on line
22.
The problem is, I know for sure that the file does not exist -- I even went back and deleted it before I ran the code. The code creates the file but still returns false.
I checked to make sure no other code is creating the file. I even waited 30 secs to wait and see if the file reappeared -- it does not appear by itself, it only appears after I execute this code.
The PHP Manual states that for mode x:
Create and open for writing only; place the file pointer at the beginning of the file. If the file already exists, the fopen() call will fail by returning FALSE and generating an error of level E_WARNING. If the file does not exist, attempt to create it. This is equivalent to specifying O_EXCL|O_CREAT flags for the underlying open(2) system call.
The function is returning false because #fopen('lock.txt', 'x') returns false (the file would already exist), which causes $lock = #fopen('lock.txt', 'x') to evaluate to false, triggering the branch to return false;.
fopen() mode X http://php.net/manual/en/function.fopen.php
Create and open for writing only; place the file pointer at the
beginning of the file. If the file already exists, the fopen() call
will fail by returning FALSE and generating an error of level
E_WARNING.
If the file already exists, the fopen() call will fail by returning FALSE and generating an error of level E_WARNING.
You will see the warning if you don't use # error suppression.
You probably want mode W.
Open for writing only; place the file pointer at the beginning of the
file and truncate the file to zero length. If the file does not exist,
attempt to create it.