On my Linux server, I need to synchronize multiple scripts, written in BASH and PHP, so that only one of them is able to start a system-critical job (a series of BASH/PHP commands) that would mess things up if performed simultaneously by two or more scripts. From my experience with multithreading in C++, I'm familiar with the notion of a mutex, but how do I implement a mutex for a bunch of scripts that run in separate processes and, of course, aren't written in C++?
Well, the first solution that comes to mind would be making sure that each of the scripts initially creates a "lock flag" file to let other scripts know that the job is "locked", and then deletes the file after it's done with the job. But, as I see it, the file creation and deletion operations would have to be completely atomic for this approach to work with 100% reliability, and the same requirement would apply to any other synchronization method. And I'm pretty sure that plain file writes and reads are not atomic, at least not across all existing Linux/Unix systems.
So what is the most flexible and reliable way to synchronize concurrent BASH and PHP scripts?
I'm not a PHP programmer, but the documentation says it provides a portable version of flock that you can use. The first example snippet looks pretty close to what you want. Try this:
<?php
$fp = fopen("/tmp/lock.txt", "r+");
if (flock($fp, LOCK_EX)) { // acquire an exclusive lock
    // Do your critical section here, while you hold the lock
    flock($fp, LOCK_UN); // release the lock
} else {
    echo "Couldn't get the lock!";
}
fclose($fp);
?>
Note that by default flock waits until it can acquire the lock. You can use LOCK_EX | LOCK_NB if you want it to exit immediately in the case where another copy of the program is already running.
Using the name "/tmp/lock.txt" may be a security hole (I don't want to think hard enough to decide whether it truly is) so you should probably choose a directory that can only be written to by your program.
You can use flock to atomically lock your flag file. Its -e option acquires an exclusive lock.
From the man page:
By default, if the lock cannot be immediately acquired, flock waits until the lock is available.
So if all your bash/php scripts try to lock the file exclusively, only one can successfully acquire it and the rest will wait for the lock.
If you don't want to wait indefinitely, use -w <seconds> to time out.
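A minimal bash sketch of that pattern (the lock path, timeout, and job script are assumptions):
#!/bin/bash
# Wait up to 10 seconds for an exclusive lock on the flag file, then
# run the critical job while the lock is held. flock exits non-zero
# if the lock could not be acquired within the timeout.
flock -e -w 10 /tmp/job.lock -c ./critical_job.sh
On Linux, PHP's flock() uses the same underlying flock(2) lock, so bash and PHP scripts that lock the same file should exclude each other.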
A fuser-based lock in Bash (it guarantees that no two processes access the protected resource at the same time, but it may occasionally report a failed locking attempt even when no other process is accessing the resource; that case is almost improbable, though):
#!/bin/bash
set -eu

function mutex {
    local file=$1 pid pids

    # Keep the lock file open on fd 9 for the lifetime of this process.
    exec 9>>"$file"

    # Ask fuser which PIDs have the file open. fd 9 is closed inside the
    # group so the command-substitution subshell isn't counted as a lock
    # holder, and stderr is closed to silence fuser's diagnostic output.
    { pids=$(/sbin/fuser -f "$file"); } 2>&- 9>&-

    for pid in $pids; do
        [[ $pid = $$ ]] && continue
        exec 9>&-   # another process holds the file: drop our descriptor
        return 1    # locked by other pid
    done
}
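A hedged usage sketch, assuming the mutex function above is defined in the same script (the lock path and job command are placeholders):
if mutex /tmp/myjob.lock; then
    run_critical_job   # hypothetical command; only reached while we hold the mutex
else
    echo "Another instance holds the lock; exiting." >&2
    exit 1
fi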
Are there methods in PHP, C++ and bash scripts that can make the respective program wait its turn when accessing a file?
I have a web-page written in PHP that gives the user the ability to input 6 values:
URL
URL Refresh Interval
Brightness
Color1 in hex
Color2 in hex
Color3 in hex
These values are written to configuration.txt.
Each time the web page is accessed, configuration.txt is opened, the PHP reads some values from it, and then it is closed.
configuration.txt is also opened and then closed whenever one or more of the above values are submitted.
Next, I have a bash script that regularly wgets the URL from configuration.txt and writes the output to a different file, called url_response.txt.
while true
do
    # read the URL from the first line of the configuration file
    line=$(head -n 1 data/configuration.txt)
    # fetch that URL and overwrite the response file
    wget -q "$line" -O url_response.txt
    sleep 2
done
This script will be put inside a C++ program.
Finally, the same C++ program will have to access url_response.txt to get and parse some strings from it and it will also have to access configuration.txt to get the three colors from it.
I am pretty sure that these 3 programs will intersect at one point and I don't want to find out what happens then.
A common way to avoid race conditions is to use a lock file. When a program tries to read or write to configuration.txt it checks the lock file first.
There are two kinds of locks:
shared lock
exclusive lock
A program can get a shared lock (read lock) as long as no other program holds an exclusive lock. This is used for reading a file: multiple programs can read a file as long as no other program is writing to it.
A program can get an exclusive lock (write lock) only if no other program holds any lock (neither exclusive nor shared). This is used for writing to a file: as long as any program is reading or writing a file, no other program is allowed to write to it.
On a Linux system you can use flock to manage file locks.
Read:
flock --shared lockfile -c read.sh
Write:
flock --exclusive lockfile -c write.sh
Usually these commands wait until the lock is available. With
flock --exclusive --nonblock lockfile -c write.sh
the command will fail immediately instead of waiting.
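A hedged sketch of the writer side with error handling (the lock-file path and write.sh are carried over from above):
#!/bin/bash
# Fail fast instead of queueing behind current readers or writers.
if ! flock --exclusive --nonblock lockfile -c ./write.sh; then
    echo "configuration.txt is currently locked; try again later." >&2
    exit 1
fi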
From the man page:
SYNOPSIS
flock [options] <file|directory> <command> [command args]
flock [options] <file|directory> -c <command>
flock [options] <file descriptor number>
DESCRIPTION
This utility manages flock(2) locks from within shell scripts or the command line.
The first and second forms wrap the lock around the execution of a command, in a manner similar to su(1) or newgrp(1). They lock a specified file or directory, which is created (assuming appropriate permissions) if it does not already exist. By default, if the lock cannot be immediately acquired, flock waits until the lock is available.
The third form uses an open file by its file descriptor number. See the examples for how that can be used.
The relevant man page for C++ is flock(2); for shell scripts it is flock(1).
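To illustrate the third form from the synopsis, a minimal bash sketch (the descriptor number and file names are arbitrary assumptions):
#!/bin/bash
# Open the lock file on an arbitrary high file descriptor...
exec 200>lockfile
# ...and lock that descriptor. Everything from here until the script
# exits (which closes fd 200 and releases the lock) is protected.
flock --exclusive 200
cat configuration.txt   # safe to read while the lock is held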
So you want to avoid one of your readers getting a partial copy of the file as it is being written?
The usual way to avoid this issue is that when you write the file, you write it to a different name in the same folder, then use mv to replace the original file; within the same filesystem that rename is atomic.
Unix ensures that if there are any existing readers of the original file, the old file's data stays available to them (unlinked but still open) until they all close it. Any new readers will see the newly moved file. No reader should ever see a broken file.
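A minimal bash sketch of this write-then-rename pattern (paths and variable names are assumptions):
#!/bin/bash
# Create the temporary file in the SAME directory as the target,
# because mv is only atomic within a single filesystem.
tmp=$(mktemp configuration.txt.XXXXXX)
# Write the complete new content to the temporary file.
printf '%s\n' "$url" "$refresh_interval" "$brightness" > "$tmp"
# Atomically replace the old file. Readers see either the old content
# or the new content, never a partially written file.
mv "$tmp" configuration.txt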
To do the same thing by modifying the file in place, the best you can do is keep some sort of serial number in the file. The writer should update the serial before it writes, and again after. Any reader should read the serial before and after the read; if the serial has changed, the read is invalid and should be repeated. The issue of ensuring the data is not cached also needs addressing. This is fine for occasionally updated files, but it will clearly impair the readers' performance if the content is frequently updated.
I am executing the following bash script on an Ubuntu 16.04 virtual machine at startup, via rc.local.
#!/bin/bash
# Loop forever (until break is issued)
(while true; do
sudo php /var/www/has/index.php external communication
done ) &
As you can see, the bash script executes a PHP script continuously. Over time, the script might take longer to execute. Sometimes scripts like the one above keep starting even though another instance of the same script is already running. So I want to know: how can I prevent a new instance of the PHP script from executing if an existing instance is still running?
You can use file locking to acquire an exclusive lock. If the lock is already held, you can end the script or wait until the lock is released.
I suggest you read up on http://php.net/manual/en/function.flock.php
$fp = fopen("/tmp/lock.txt", "r+");
if (flock($fp, LOCK_EX)) { // acquire an exclusive lock (this blocks until the lock is free)
    // Execute logic
} else {
    echo "Couldn't get the lock!";
}
// Use flock($fp, LOCK_EX | LOCK_NB) instead if the script should
// end immediately when another instance already holds the lock.
I have a web application where the user can choose a list of scripts to execute. The executions are then added to a table in MySQL, and each one has its own state, like "pending", "success", "failed" or "in progress". The user can also choose to stop an execution.
The problem is that only one script can be executed at a time, so the others have to wait until it is finished.
My environment is Linux (Ubuntu) and the scripts are in PHP.
I thought about adding a crontab entry that executes a PHP script. This PHP script would grab the information from the SQL table and check whether another execution is running by looking for an execution with an "in progress" state; if there is one, it would simply exit, otherwise it would start another execution that has the pending state.
Is there any other solution for this?
It's better to use an atomic check. The way you do this with the database is not atomic: after you have checked that no other script is running, but before you have recorded that the current script is starting, another process may perform the same check, and you'll end up with two concurrent scripts running.
Also, if the script terminates abnormally for any reason, it won't update the database, so other scripts won't be able to start at all.
A more reliable way is to use file locking:
$lock_file = 'some_path/process.lock';
$fd = fopen($lock_file, "w");
if (!$fd)
    throw new Exception("Can't open file, check permissions on ".$lock_file, 1);
if (!flock($fd, LOCK_EX | LOCK_NB))
    throw new AlreadyRunningException("Can't lock the file - another script is already running", 0);
Then, after the script job is done, unlock the file:
flock($fd, LOCK_UN);
fclose($fd);
I am trying to stop my cron script from running in parallel. I need it so that if there is no current execution of it, the script will be allowed to run until it is complete, the script times out, or an exception occurs.
I have been trying to use the PHP flock function to engage a file lock, run the script and then release the lock. However, it still looks like I am able to run the script multiple times in parallel. Am I missing something?
By the way, I am developing on Mac OS X with the Mac filesystem; maybe this is the reason the file locks are being ignored? Though the PHP documentation only seems to talk about NTFS filesystems?
// Construct cron lock file path
$cronLockFilePath = realpath(APPLICATION_PATH . '/locks/cron');
// Get cron lock file
$cronLockFile = fopen($cronLockFilePath, 'r');
// Lock cron lock file
if (flock($cronLockFile, LOCK_EX)) {
    echo 'lock';
    sleep(10);
} else {
    echo 'no lock';
}
Your idea is basically correct, but tinkering with file locks generally leads to strange behaviour.
Just create a file on script start and delete it at the end. The presence of the file will indicate whether the cron is already running. Make absolutely sure that the file is deleted at the end, even if the cron runs into an error halfway through.
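A hedged bash sketch of that guarantee, assuming cron can call a small wrapper around the PHP script (all paths are placeholders):
#!/bin/bash
lockfile=/tmp/cron.lock
# Bail out if a previous run is still active. Note that this
# check-then-create is not atomic; the flock-based answers above
# are race-free alternatives.
[ -e "$lockfile" ] && exit 1
# Create the flag file and guarantee its removal on any normal or
# error exit, so a failed run cannot leave a stale lock behind.
touch "$lockfile"
trap 'rm -f "$lockfile"' EXIT
php /path/to/cron-script.php   # hypothetical script path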
From the documentation:
Warning
On some operating systems flock() is implemented at the process level. When using a multithreaded server API like ISAPI you may not be able to rely on flock() to protect files against other PHP scripts running in parallel threads of the same server instance!
You can try to create and delete a file, or write something into it.
I think what you could do is write a regular file somewhere (lock.txt or something) when the script starts to execute, without any flocks, and remove it when the script stops running. Then always check on initialization whether that file already exists; if it does, another instance is running.
I have a multi-part question about a PHP script file. I am creating this file that updates the database every second. There is no other modeling method; it has to be done every second.
Now I am running CentOS and I am new to it. The first noob question is:
How do I run a PHP file via SSH? I read it is just # php path-to/myfile.php. But I tried to echo something, and I don't see it in the output.
Now I don't think that starting the file is going to be a problem. One problem, I guess, will be the following; I don't know if it is even possible, but here goes.
Is it possible for me to be a hundred percent sure that the file is only run once? What happens if I run the file again by accident?
I was wondering further: if I implement a write to a log every second, I can know if everything is running OK. If there is an error or something wrong, the log file will stop.
Is writing to a log file done with fopen, write and close? Isn't this going to take a lot of time? Isn't there an easier method in CentOS?
OK, another big point I have is what happens when I run the file. Is the file run in memory, or does it use the file on disk? Does it respond to changes made in the file, for example to stop the execution of the script?
Can I implement some kind of stop mechanism in the file itself? Or is there a command I can use to stop the file?
Another option I know of is implementing a cronjob that runs every minute, and this cronjob executes the PHP file. The PHP file will loop for one minute, updating everything needed, and then terminate. I implemented this method, but just used a browser: I browsed to my file and opened it. I saw the browser was busy for a minute, but it didn't update anything in the database. Does anyone have an idea what the reason for this can be?
Another question I have: by implementing the cronjob method, what is the command I put in the Plesk panel? Is it the same as the above command, just php and the file name? Or are there special commands like -f, -q, -something?
Sorry for all the noob questions.
If someone can help me, I really appreciate it.
Ciao!
The simplest way to ensure only one copy of your script is running is to use flock() to obtain a file lock. For example:
<?php
$fp = fopen("/tmp/lock.txt", "r+");
if (flock($fp, LOCK_EX)) { // do an exclusive lock
    ftruncate($fp, 0); // truncate file
    fwrite($fp, "Write something here\n");
    flock($fp, LOCK_UN); // release the lock
} else {
    echo "Couldn't get the lock!";
}
fclose($fp);
?>
So basically you'd have a dummy file set up where your script, upon starting, tries to acquire a lock. If it succeeds, it runs. If not, it exits. That way only one copy of your script can be running at a time.
Note: flock() is what is called an advisory locking method, meaning it only works if you use it. So this will stop your own script from being run multiple times but won't do anything about any other scripts, which sounds fine in your situation.
You can't always rely on the lock within the script itself, as stated in the comment to the previous answer. This might be a solution:
#Mins Hours Days Months Day of week
* * * * * lockfile -r 0 /tmp/the.lock; php parse_tweets.php; rm -f /tmp/the.lock
* * * * * lockfile -r 0 /tmp/the.lock; php get_tweets.php; rm -f /tmp/the.lock
This way, even if the script crashes, the lock file will be removed. Taken from here: https://unix.stackexchange.com/a/158459
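Note that lockfile ships with the procmail package. If it is not available, the flock(1) utility used in earlier answers achieves the same serialization and releases the lock automatically when the command exits, so no trailing rm -f is needed; a hedged equivalent:
#Mins Hours Days Months Day of week
* * * * * flock -n /tmp/the.lock php parse_tweets.php
* * * * * flock -n /tmp/the.lock php get_tweets.php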