I would like to implement a quick and efficient serialization mechanism between PHP requests for virtual named resources that would unlock when the script is finished, either normally or due to error. I had eaccelerator_lock() and its corresponding eaccelerator_unlock() in the past, but eaccelerator doesn't implement that function anymore. What I want to do is something like:
lock_function("my-named-resource");
..
my_might_abort_abruptly_function();
..
unlock_function("my-named-resource");
Other PHP scripts calling lock_function() with the exact same parameter should block until this script calls unlock_function() or aborts. The resource name is unknown before the processing (it's a generated string) and can't be constrained to a small set (i.e., the locking mechanism should have good granularity). I would like to avoid try/catch code, because there are circumstances in which catch is not called. Also, any mechanism depending on manual usleep() spinning (instead of native OS blocking) should be avoided.
Mine is the only running application in the server. The system is a CentOS 6 Linux with PHP 5.3.3, Apache 2.2.15 and I have full control over it.
I explored the following alternatives:
semaphores: they are not well implemented in PHP; Linux allows arrays of thousands, while PHP only allocates one per id.
flock(): my resources are virtual, and flock() would only lock whole/real/existing files; I'd need to pre-create thousands of files and choose one to lock with a hash function. The granularity would depend on the number of files.
dio_fcntl(): I could attempt to reproduce the idea of flock() with a single file and fcntl(F_SETLK). This would have the advantage of a good granularity without the need of many files; the file could even be zero bytes long! (F_SETLK can lock beyond the end of the file). Alas! The problem is that the documentation nowhere says that dio_fcntl() will release resources when the script terminates.
database lock: I could implement some key locking in a database with good key locking granularity, although this is too database dependent. It would not be so quick either.
implement my own PHP extension: I'd really like to avoid that path.
The thing is, I think someone somewhere should have thought of this before me. What would be a good choice? Is there another solution I'm not seeing?
Thanks in advance. Guillermo.
You can always go old school and touch a file when your script starts and remove it when complete.
You could register_shutdown_function to remove the file.
The existence or absence of the file would indicate the locked state of the resource.
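A minimal sketch of the lock-file idea above, assuming a writable system temp directory. The function name acquire_lock_file() and the file naming scheme are hypothetical. Note that this approach does not block: a second caller simply gets false, and the file stays behind if the process is killed before the shutdown function runs.

```php
<?php
// Hypothetical helper: try to take a lock by creating a file atomically.
function acquire_lock_file($name)
{
    $path = sys_get_temp_dir() . '/' . sha1($name) . '.lock';
    // Mode 'x' creates the file and fails if it already exists,
    // so creation doubles as the test-and-set step.
    $handle = @fopen($path, 'x');
    if ($handle === false) {
        return false; // someone else holds the lock
    }
    fclose($handle);
    // Remove the file when the script ends, normally or after most fatal errors.
    register_shutdown_function(function () use ($path) {
        @unlink($path);
    });
    return $path;
}
```

A caller that needs to wait would have to retry in a loop, which is the kind of spinning the question wants to avoid.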
It turns out dio_open() does release the resources upon script termination. So I ended up writing the following functions:
$lockfile = $writable_dir."/serialized.lock";
function serialize_access($name)
{
$f = serialize_openfile();
if( !$f ) return false;
$h = serialize_gethash($name);
return dio_fcntl($f, F_SETLKW, array("whence"=>SEEK_SET,"start"=>$h, "length"=>1, "type"=>F_WRLCK)) >= 0;
}
function serialize_release($name)
{
$f = serialize_openfile();
if( !$f ) return false;
$h = serialize_gethash($name);
@dio_fcntl($f, F_SETLK, array("whence"=>SEEK_SET,"start"=>$h, "length"=>1, "type"=>F_UNLCK));
}
function serialize_gethash($name)
{
// Very good granularity (2^31)
return crc32($name) & 0x7fffffff;
}
function serialize_openfile()
{
global $lockfile, $serialize_file;
if( !isset($serialize_file) )
{
$serialize_file = false;
if( extension_loaded("dio") )
{
$serialize_file = @dio_open($lockfile,O_RDWR);
if( !$serialize_file )
{
// Do not attempt to create the file with dio_open()
// because the file permissions get all mangled.
$prev = umask(0);
$temp = fopen($lockfile,"a");
if( $temp )
{
$serialize_file = @dio_open($lockfile,O_RDWR);
fclose($temp);
}
umask($prev);
}
}
}
return $serialize_file;
}
It seems to work very well.
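To illustrate how a resource name maps onto a byte offset in the lock file, here is a small sketch that repeats the hashing step from serialize_gethash(). The helper name lock_offset() and the sample resource name are hypothetical. Two different names can hash to the same offset; that would only cause some unnecessary waiting, not incorrect locking.

```php
<?php
// Hypothetical helper mirroring serialize_gethash(): map a name to the
// byte offset that would be locked with F_SETLKW.
function lock_offset($name)
{
    // Clearing the top bit keeps the offset non-negative and below 2^31.
    return crc32($name) & 0x7fffffff;
}

$a = lock_offset('invoice-1001');
$b = lock_offset('invoice-1001');
var_dump($a === $b);                    // same name, same offset
var_dump($a >= 0 && $a <= 0x7fffffff);  // always within the maskable range
```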
implement my own PHP extension
You might want to check out the ninja-mutex library, which does exactly what you want.
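For comparison, named locks can also be sketched in plain PHP with flock(), creating one lock file per name on demand. This is not the ninja-mutex API; the function names named_lock() and named_unlock() and the use of the system temp directory are assumptions made for illustration. With the default blocking mode the OS suspends the caller until the lock is free, and the lock is dropped when the handle is closed, including at script termination.

```php
<?php
// Hypothetical helper: take an exclusive lock on a per-name lock file.
// Returns the open handle on success, false on failure.
function named_lock($name, $blocking = true)
{
    // One lock file per name, created on demand; sha1() keeps the
    // file name safe whatever characters the resource name contains.
    $path = sys_get_temp_dir() . '/lock-' . sha1($name);
    $handle = fopen($path, 'c');
    if ($handle === false) {
        return false;
    }
    $mode = $blocking ? LOCK_EX : (LOCK_EX | LOCK_NB);
    if (!flock($handle, $mode)) {
        fclose($handle);
        return false;
    }
    return $handle;
}

// Release explicitly; closing the handle also drops the lock.
function named_unlock($handle)
{
    flock($handle, LOCK_UN);
    return fclose($handle);
}
```

The caller has to keep the returned handle in a variable for as long as the lock is needed, because the lock goes away as soon as the handle is closed. The lock files themselves are left in place and would need occasional cleanup.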
Related
I have an issue with my non-unique hit counter.
The script is as below:
$filename = 'counter.txt';
if (file_exists($filename)) {
$current_value = file_get_contents($filename);
} else {
$current_value = 0;
}
$current_value++;
file_put_contents($filename, $current_value);
When I refresh my website very often (like 10 times per second or even faster), the value in the text file gets reset to 0.
Any guess for fixing this issue?
This is a pretty poor way to maintain a counter, but your problem is probably that when you fire multiple requests at the site, one of the calls to file_exists() is getting a false because one of the other processes is removing and recreating the file.
If you want this to work consistently, you are going to have to lock the file between read and write. See flock in the PHP manual.
Of course without the file lock you would also be getting incorrect counts anyway, when 2 processes manage to read the same value from the file.
Locking the file would also potentially slow your system down as 2 or more processes queue for access to the file.
It would probably be a better idea to store your counter in a database, as they are designed for coping with this kind of quick fire access and ensuring every process is properly queued and released.
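If the counter does stay in a file, the locked read-modify-write described above could look roughly like this. It is a sketch, not a drop-in replacement; the function name increment_counter() is hypothetical.

```php
<?php
// Hypothetical helper: increment a counter file under an exclusive lock.
function increment_counter($filename)
{
    // 'c+' opens for read/write and creates the file if missing,
    // without truncating it (truncating before locking would lose data).
    $handle = fopen($filename, 'c+');
    if ($handle === false) {
        return false;
    }
    if (!flock($handle, LOCK_EX)) {
        fclose($handle);
        return false;
    }
    // An empty or missing file counts as 0.
    $value = (int) stream_get_contents($handle) + 1;
    // Rewrite the file from the start while still holding the lock.
    rewind($handle);
    ftruncate($handle, 0);
    fwrite($handle, (string) $value);
    fflush($handle);
    flock($handle, LOCK_UN);
    fclose($handle);
    return $value;
}
```

Because the file is never removed or truncated outside the lock, concurrent requests should each see the value written by the previous one.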
Does it help if you add a check if file_get_contents isn't returning false?
$value = file_get_contents($filename);
if($value !== false)
{
$current_value = $value;
}
I have a function that generates a table with contents from the DB. Some cells have custom HTML which I'm reading in with file_get_contents through a templating system.
The small content is the same but this action is performed maybe 15 times (I have a limit of 15 table rows per page). So does file_get_contents cache if it sees that the content is the same?
file_get_contents() does not have a caching mechanism. However, you can write your own.
Here is a draft :
$cache_file = 'content.cache';
if(file_exists($cache_file)) {
if(time() - filemtime($cache_file) > 86400) {
// too old , re-fetch
$cache = file_get_contents('YOUR FILE SOURCE');
file_put_contents($cache_file, $cache);
} else {
// cache is still fresh, read it back
$cache = file_get_contents($cache_file);
}
} else {
// no cache, create one
$cache = file_get_contents('YOUR FILE SOURCE');
file_put_contents($cache_file, $cache);
}
UPDATE: the previous if condition was incorrect; it is now fixed by comparing against the current time. Thanks @Arrakeen.
Like @deceze says, generally the answer is no. However operating system level caches may cache recently used files to make for quicker access, but I wouldn't count on those being available. If you'd like to cache a file that is being read multiple times per request, consider using a static variable to act as a cache inside a wrapper function.
function my_file_read($filename) {
static $file_contents = array();
if (!isset($file_contents[$filename])) {
$file_contents[$filename] = file_get_contents($filename);
}
return $file_contents[$filename];
}
Calling my_file_read($filename) multiple times will only read the file from disk a single time; subsequent calls will read the value from the static variable within the function. Note that you shouldn't count on this approach for large files or ones used only once per page, since the memory used by the static variable will persist until the end of the request. Keeping the contents of files unnecessarily in static variables is a good way to make your script a memory hog.
Partly. PHP does some caching of its own around the file system functions, but it caches metadata, not file contents: the realpath cache stores resolved paths (sized by the realpath_cache_size directive in php.ini, with entries expiring after realpath_cache_ttl, which defaults to 120 seconds), and the stat cache stores the results of functions such as file_exists() and filemtime() until clearstatcache() is called. This is separate from the caching typically done by browsers for GET requests (the majority of Web accesses) unless the HTTP headers override it. Caching is not a good idea during development work, since your code may act on stale information about a file you have changed.
Is there a way for me to check whether a file has been copied before continuing to execute a PHP loop?
I have a for loop, and within the loop it is going to copy a file. I want it to wait until the current file is copied before continuing the loop.
example:
for ($i = 1; $i <= 10; $i++)
{
$temp = $_FILES['tmp_name'];
$extension = '.jpg';
copy("$temp_$i_$extension", "$local_$i_$extension");
// not sure what to do here
if (FILE_DONE_COPYING())
{
CONTINUE_LOOP();
}
else
{
PAUSE_LOOP();
}
}
That's just an example. I have no clue how to do this... can anyone chime in?
That's what copy() does in PHP - it blocks until the file is copied. There's nothing you need to do, except checking the return value to see if the operation was successful.
PHP is taking it line by line, step by step, so it's waiting until copy() is completed
for ($i = 1; $i <= 10; $i++)
{
$temp = $_FILES['tmp_name'];
$extension = '.jpg';
$result = copy("{$temp}_{$i}_$extension", "{$local}_{$i}_$extension");
if($result){
//done
}
else{
//failed
}
}
copy returns true on success and false on failure. Check for that.
Unless you go through the trouble of using threading and have copy fired asynchronously, PHP will not move to the line after copy until after it has completed.
copy does wait for completion before continuing execution. It is a synchronous call. But it can return false if it didn't work, and your copy won't work since $temp_ and $i_ are not defined variables. So maybe you are thinking the copy isn't finishing, when it actually just isn't working at all.
You should use:
copy("{$temp}_{$i}_$extension", "{$local}_{$i}_$extension");
OR
copy($temp.'_'.$i.'_'.$extension, $local.'_'.$i.'_'.$extension);
What makes you think that copy() will return before it has finished?
You could of course compare filesize of original file and copy to be sure the process is complete.
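A minimal sketch of that size comparison, assuming both paths are local files. The helper name copy_verified() is hypothetical.

```php
<?php
// Hypothetical helper: copy a file and confirm the sizes match.
function copy_verified($source, $destination)
{
    // copy() blocks until it is done and returns false on failure.
    if (!copy($source, $destination)) {
        return false;
    }
    // Drop cached stat results so filesize() reflects the files on disk.
    clearstatcache();
    return filesize($source) === filesize($destination);
}
```

Since copy() already reports failure through its return value, the size check is only an extra sanity test.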
You could use a while loop with sleep calls to delay checking, and just exit the while loop once the file exists under the new name.
I know this is an ancient question but I feel I really need to talk about this problem. COPY is a great command - BUT - it does not work all of the time. I can honestly tell you this. Why? Why does it not always work? Simple - the Operating System is at fault. Here are two examples. One is using a standard disk drive and the second one deals with a Ram disk. The COPY command reads CHUNKS of a file and writes these chunks out to the destination. This means it really does NOT just do a File_get_contents but instead does the fopen(IN), fopen(OUT), while( !EOF(IN) ){ fread(IN), fwrite(OUT) } and then fclose(IN), and fclose(out). It should be noted that these commands try to make sure everything goes ok but if the disk drive buffers what it does - then the file might take a second or two to finish. This can be seen by having a file_exists() on the output file's name. It can come back as FALSE(it IS NOT there). This is because the disk drive's hardware has not caught up with the commands.
I even installed the AMD RamDisk software and ran a program using the above commands (both file_get_contents->file_put_contents and the fopen-fread/fwrite-fclose commands). The same thing happened then also. Every now and then (not always) the file_exists() function returned FALSE because the test got there before the file had finished being created. Don't ask me why - it just would do this.
So what do I suggest? Use the SLEEP() command. Maybe use three(3) seconds (so SLEEP(3);) -after- the COPY() command. I also determined that a CHMOD(, 0777); was a good idea also. With a SLEEP() command after it so it has time to apply the changes. (Which is probably closer to one second).
Now, remember: everyone's hardware is different, so some hardware might work better or faster than the one I am using. This is one of those things to try if you are having problems, and to skip if you are not. It is that simple. This is happening to me, I'm using it, and it works now that the system gets three seconds to breathe. It might not do anything for you if you have an atomic powered Willy-Wonka mobile which does the impossible before breakfast.
Got it? Good. :-)
Read some texts about locking in PHP.
They all, mainly, direct to http://php.net/manual/en/function.flock.php .
This page talks about opening a file on the hard-disk!!
Is it really so? I mean, this makes locking really expensive - it means each time I want to lock I'll have to access the hard-disk )=
Can anyone comfort me with some good news?
Edit:
Due to some replies I've got here, I want to ask this;
Will my script be run by only one thread, or by several? Because if it's only one then I obviously don't need a mutex. Is there a concise answer?
What exactly I'm trying to do
Asked by ircmaxell.
This is the story:
I have two ftp servers. I want to be able to show on my website how many users are online.
So, I thought that these ftp servers will "POST" their stats to a certain PHP script page. Let's assume that the URL of this page is "http://mydomain.com/update.php".
On the website's main page ("http://mydomain.com/index.php") I will display the cumulative statistics (online users).
That's it.
My problem is that I'm not sure whether, when one ftp server updates its stats while another does so too, the info will get mixed.
Like when multi-threading; Two threads increase some "int" variable at the same time. It will not happen as expected unless you sync between them.
So, will I have a problem? Yes, no, maybe?
Possible solution
Thinking hard about it all day long, I have an idea here and I want you to give your opinion.
As said these ftp servers will post their stats, once every 60sec.
I'm thinking about having this file "stats.php".
It will be included at the updating script that the ftp servers go to ("update.php") and at the "index.php" page where visitors see how many users are online.
Now, when an ftp server updates, the script at "update.php" will modify "stats.php" with the new cumulative statistics.
First it will read the stats included at "stats.php", then accumulate, and then rewrite that file.
If I'm not mistaken PHP will detect that the file ("stats.php") is changed and load the new one. Correct?
Well, most of PHP runs in a different process space (there are few threading implementations). The easy one is flock. It's guaranteed to work on all platforms.
However, if you compile in support, you can use a few other things such as the Semaphore extension. (Compile PHP with --enable-sysvsem). Then, you can do something like (note, sem_acquire() should block. But if it can't for some reason, it will return false):
$sem = sem_get(1234, 1);
if (sem_acquire($sem)) {
//successful lock, go ahead
sem_release($sem);
} else {
//Something went wrong...
}
The other options that you have, are MySQL user level locks GET_LOCK('name', 'timeout'), or creating your own using something like APC or XCache (Note, this wouldn't be a true lock, since race conditions could be created where someone else gets a lock between your check and acceptance of the lock).
Edit: To answer your edited question:
It all depends on your server configuration. PHP may be run multi-threaded (where each request is served by a different thread), or it may be run multi-process (where each request is served by a different process).
It's VERY rare that PHP will serve all requests serially, with only one process (and one thread) serving all requests. If you're using CGI, then it's multi-process by default. If you're using FastCGI, it's likely multi-process and multi-thread. If you're using mod_php with Apache, then it depends on the worker type:
mpm_worker will be both multi-process and multi-thread, with the number of processes dictated by the ServerLimit variable.
prefork will be multi-process
perchild will be multi-process as well
Edit: To answer your second edited question:
It's quite easy. Store it in a file:
function readStatus() {
$f = fopen('/path/to/myfile', 'r');
if (!$f) return false;
if (flock($f, LOCK_SH)) {
$ret = fread($f, 8192);
flock($f, LOCK_UN);
fclose($f);
return $ret;
}
fclose($f);
return false;
}
function updateStatus($new) {
// 'c' creates the file if needed but does not truncate it before the lock is held.
$f = fopen('/path/to/myfile', 'c');
if (!$f) return false;
if (flock($f, LOCK_EX)) {
ftruncate($f, 0);
fwrite($f, $new);
flock($f, LOCK_UN);
fclose($f);
return true;
}
fclose($f);
return false;
}
function incrementStatus() {
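// 'rw' is not a valid fopen() mode; 'c+' opens for read/write without truncating.
$f = fopen('/path/to/myfile', 'c+');
if (!$f) return false;
if (flock($f, LOCK_EX)) {
$current = (int) fread($f, 8192);
$current++;
// Truncate and move back to the start before writing the new value.
ftruncate($f, 0);
rewind($f);
fwrite($f, (string) $current);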
flock($f, LOCK_UN);
fclose($f);
return true;
}
fclose($f);
return false;
}
The question is: where will you store the stats that the FTP servers are pushing with POST to your update.php file? If it's a local file, then ircmaxell in the second post has answered you. You can do this with a mutex as well, using the semaphore functions. Another solution is to use a MySQL MyISAM table to store the stats and run something like update info_table set value = value + 1. It should lock the table and serialize your requests, and you will have no problems.
I recently created my own simple implementation of a mutex-like mechanism using the flock function of PHP. Of course the code below can be improved, but it is working for most use cases.
function mutex_lock($id, $wait=10)
{
$resource = fopen(storage_path("app/".$id.".lck"),"w");
$lock = false;
for($i = 0; $i < $wait && !($lock = flock($resource,LOCK_EX|LOCK_NB)); $i++)
{
sleep(1);
}
if(!$lock)
{
trigger_error("Not able to create a lock in $wait seconds");
}
return $resource;
}
function mutex_unlock($id, $resource)
{
$result = flock($resource,LOCK_UN);
fclose($resource);
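// The lock file is left in place: unlinking it here could race with
// another process that has already opened it and is waiting for the lock.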
return $result;
}
Yes, that's true, as PHP is run by Apache, and Apache can organize the threads of execution as it sees fit (see the various worker models). So if you want to access a resource one at a time, you either lock a file (which is good if you are dealing with cron jobs, for example), or you rely on the database's transaction mechanism, ACID features, and resource locking if you are dealing with data.
PHP doesn't support multithreading, every request (and therefore every PHP script) will be executed in only one thread (or even process, depending on the way you run PHP).
I know this is a bit generic, but I'm sure you'll understand my explanation. Here is the situation:
The following code is executed every 10 minutes. Variable "var_x" is always read from and written to an external text file whenever it is referred to.
if ( var_x != 1 )
{
var_x = 1;
//
// here is where the main body of the script is.
// it can take hours to completely execute.
//
var_x = 0;
}
else
{
// exit script as it's already running.
}
The problem is: if I simulate a hardware failure (do a hard reset when the script is executing) then the main script logic will never execute again because "var_x" will always be "1". (I already have logic to work out the restore point).
Thanks.
You should lock and unlock files with flock:
$fp = fopen($your_file, 'c');
// LOCK_NB makes flock() return immediately if another instance holds the lock.
if ($fp && flock($fp, LOCK_EX | LOCK_NB))
{
//
// here is where the main body of the script is.
// it can take hours to completely execute.
//
flock($fp, LOCK_UN);
}
else
{
// exit script as it's already running.
}
Edit:
As flock seems not to work correctly on Windows machines, you have to resort to other solutions. Off the top of my head, an idea for a possible solution:
Instead of writing 1 to var_x, write the process ID retrieved via getmypid. When a new instance of the script reads the file, it should then look for a running process with this ID and check whether that process is a PHP script. Of course, this can still go wrong, as there is the possibility of another PHP script obtaining the same PID after a hardware failure, so the solution is far from optimal.
Don't you think this would be better solved using file locks? (When the reset occurs file locks are reset as well)
http://php.net/flock
It sounds like you're doing some kind of manual semaphore for process management.
Rather than writing to a file, perhaps you should use an environment variable instead. That way, in the event of failure, your script will not have a closed semaphore when you restore.