How to properly implement a custom session persister in PHP + MySQL?

I'm trying to implement a custom session persister in PHP + MySQL. Most of the work is trivial - create your DB table, write your read/write functions, call session_set_save_handler(), etc. There are even several tutorials out there that offer sample implementations. But somehow all these tutorials have conveniently overlooked one tiny detail about session persisters - locking. And that's where the real fun starts!
I looked at the implementation of the session_mysql PECL extension for PHP. It uses MySQL's GET_LOCK() and RELEASE_LOCK() functions. Seems nice, but I don't like the way it does it. The lock is acquired in the read function and released in the write function. But what if the write function never gets called? What if the script somehow crashes, but the MySQL connection stays open (due to pooling or something)? Or what if the script enters a deadly deadlock?
I just had a problem where a script opened a session and then tried to flock() a file over an NFS share, while the other computer (that hosted the file) was also doing the same thing. The result was that the flock()-over-NFS call was blocking the script for about 30 seconds on each call. And it was in a loop of 20 iterations! Since that was an external operation, PHP's script timeouts didn't apply, and the session got locked for over 10 minutes every time this script was accessed. And, as luck would have it, this was the script that got polled by an AJAX shoutbox every 5 seconds... Major showstopper.
I already have some ideas on how to implement it in a better way, but I would really like to hear what other people suggest. I haven't had that much experience with PHP to know what subtle edge cases loom in the shadows which could one day jeopardize the whole thing.
Added:
OK, it seems that nobody has anything to suggest. Here's my idea, then. I'd like some opinions on where this could go wrong.
Create a session table with InnoDB storage engine. This should ensure some proper locking of rows even under clustered scenarios. The table should have the columns ID, Data, LastAccessTime, LockTime, LockID. I'm omitting the datatypes here because they follow quite directly from the data that needs to be stored in them. The ID will be the ID of the PHP session. Data will of course contain the session data. LastAccessTime will be a timestamp which will be updated on each read/write operation and will be used by GC to delete old sessions. LockTime will be a timestamp of the last lock that was acquired on the session, and LockID will be a GUID of the lock.
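For concreteness, here is a minimal schema sketch for such a table (the column names follow the description above; the datatypes are my assumptions):
CREATE TABLE sessions (
    id             VARCHAR(64) NOT NULL PRIMARY KEY, -- the PHP session ID
    data           MEDIUMTEXT  NULL,                 -- serialized session payload
    lastaccesstime DATETIME    NOT NULL,             -- updated on each read/write, used by GC
    locktime       DATETIME    NULL,                 -- when the current lock was acquired
    lockid         CHAR(36)    NULL                  -- GUID of the lock
) ENGINE=InnoDB;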
When a read operation is requested, there will be the following actions taken:
Execute INSERT IGNORE INTO sessions (id, data, lastaccesstime, locktime, lockid) values ($sessid, null, now(), null, null); - this will create the session row if it is not there, but do nothing if it is already present;
Generate a random lock id in the variable $guid;
Execute UPDATE sessions SET lastaccesstime = now(), locktime = now(), lockid = $guid WHERE id = $sessid AND (lockid IS NULL OR locktime < now() - INTERVAL 30 SECOND); - this is an atomic operation which will either obtain a lock on the session row (if it's not locked or the lock has expired), or do nothing.
Check with mysql_affected_rows() whether the lock was obtained. If it was obtained - proceed. If not - re-attempt the operation every 0.5 seconds. If after 40 seconds the lock is still not obtained, throw an exception (a sketch of this retry loop follows below).
When a write operation is requested, execute UPDATE sessions SET lastaccesstime = now(), data = $data, locktime = NULL, lockid = NULL WHERE id = $sessid AND lockid = $guid; This is another atomic operation which will update the session row with the new data and remove the lock if it still holds the lock, but do nothing if the lock was already taken away.
When a gc operation is requested, simply delete all rows whose lastaccesstime is too old.
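Roughly, the read-side locking described above could look like this in PHP (a sketch only; acquire_session_lock is my name, $sessid is assumed to be already escaped, and error handling is omitted):
// Sketch of the lock-acquisition loop from the steps above.
function acquire_session_lock($mysqli, $sessid) {
    // Step 1: create the session row if it is not there yet.
    $mysqli->query("INSERT IGNORE INTO sessions (id, data, lastaccesstime, locktime, lockid)
                    VALUES ('$sessid', NULL, NOW(), NULL, NULL)");

    // Step 2: generate a random lock id.
    $guid = uniqid('', true);

    // Steps 3-4: atomically try to take the lock, retrying for up to 40 seconds.
    $deadline = time() + 40;
    while (time() < $deadline) {
        $mysqli->query("UPDATE sessions
                        SET lastaccesstime = NOW(), locktime = NOW(), lockid = '$guid'
                        WHERE id = '$sessid'
                          AND (lockid IS NULL OR locktime < NOW() - INTERVAL 30 SECOND)");
        if ($mysqli->affected_rows === 1) {
            return $guid; // lock obtained; remember it for the write step
        }
        usleep(500000); // re-attempt every 0.5 seconds
    }
    throw new Exception("could not obtain lock on session $sessid");
}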
Can anyone see flaws with this?

OK. The answer is going to be a bit long - so patience!
First, a disclaimer: whatever I am going to write is based on experiments I have done over the last couple of days. There may be knobs/settings/inner workings I am not aware of. If you spot mistakes or do not agree, then please shout!
Clarification 1: WHEN SESSION DATA IS READ AND WRITTEN
The session data is going to be read exactly once, even if you have multiple $_SESSION reads inside your script. The read from the session store happens on a per-script basis. Moreover, the data fetch happens based on the session_id, not on keys.
Clarification 2: WRITE IS ALWAYS CALLED AT END OF SCRIPT
A) The write to the session save handler is always fired, even for scripts that only "read" from the session and never do any writes.
B) The write is only fired once, at the end of the script, or when you explicitly call session_write_close(). Again, the write is based on the session_id, not on keys.
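Both points are easy to verify with a no-op handler that just logs its calls (an illustrative sketch; run a script with several $_SESSION reads and the log will show exactly one read() and one write()):
// Sketch: a logging save handler to observe when read/write fire.
session_set_save_handler(
    function ($path, $name) { return true; },                        // open
    function () { return true; },                                    // close
    function ($id) { error_log("read($id)"); return ''; },           // read: once, at session_start()
    function ($id, $data) { error_log("write($id)"); return true; }, // write: once, at shutdown
    function ($id) { return true; },                                 // destroy
    function ($maxlifetime) { return true; }                         // gc
);
session_start();
$a = isset($_SESSION['x']) ? $_SESSION['x'] : null; // no extra read() fired here
$b = isset($_SESSION['y']) ? $_SESSION['y'] : null; // nor here
// write() fires exactly once when the script ends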
Clarification 3: WHY WE NEED LOCKING
What is this fuss all about?
Do we really need locks on sessions?
Do we really need a Big Lock wrapping READ + WRITE?
To explain the Fuss
Script 1
1: $x = $_SESSION["X"];
2: sleep(20);
3: if($x == 1 ) {
4: //do something
5: $_SESSION["X"] = 3 ;
6: }
7: exit;
Script 2
1: $x = $_SESSION["X"];
2: if($x == 1 ) { $_SESSION["X"] = 2 ; }
3: exit ;
The inconsistency is that script 1 is doing something based on a session variable (line:3) whose value has been changed by another script while script 1 was already running. This is a skeleton example, but it illustrates the point: you are making decisions based on something that is no longer true.
When you are using PHP's default session locking (request-level locking), script 2 will block on line 1 because it cannot read from the file that script 1 started reading at line 1. So requests to session data are serialized. When script 2 reads a value, it is guaranteed to read the new value.
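As an aside, this is also why a long-running script that no longer needs to write should release the request-level lock early with session_write_close(); a minimal sketch:
session_start();
$x = $_SESSION["X"];   // read what we need
session_write_close(); // persist and release the lock now, not at script end
sleep(20);             // long work no longer blocks other requests
// note: $_SESSION changes made after this point are not persisted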
Clarification 4: PHP SESSION SYNCHRONIZATION IS DIFFERENT FROM VARIABLE SYNCHRONIZATION
A lot of people talk about PHP session synchronization as if it were variable synchronization: the write to the memory location happens as soon as you overwrite the variable's value, and the next read in any script fetches the new value. As we saw in Clarification 1, that is not true. The script uses the values read at the start of the script throughout the script, and even if some other script has changed the values, the running script will not know about the new values until the next refresh. This is a very important point.
Also, keep in mind that values in the session change even with PHP's big locking. Saying things like "the script that finishes first will overwrite the value" is not very accurate. Value change is not bad in itself; what we are guarding against is inconsistency - namely, a value should not change without my knowledge.
Clarification 5: DO WE REALLY NEED THE BIG LOCK?
Now, do we really need the Big Lock (request-level locking)? The answer, as in the case of DB isolation, is that it depends on how you want to do things. With the default implementation of $_SESSION, IMHO, only the big lock makes sense. If I am going to use the value that I read at the beginning throughout my script, then only the big lock makes sense. If I change the $_SESSION implementation to "always" fetch a "fresh" value, then you do not need the big lock.
Suppose we implement a session data versioning scheme, like object versioning. Now script 2's write will succeed, because script 1 has not reached its write point yet. Script 2 writes to the session store and increments the version by 1. Then, when script 1 tries to write to the session, its write will fail (line:5) - I do not think this is desirable, though it is doable.
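For illustration, a versioned write could look like this in SQL (a sketch; the version column is hypothetical and not part of the handlers shown below):
-- at read time, remember the version you started from
SELECT data, version FROM sc_php_session WHERE session_id = :id;

-- at write time, succeed only if nobody wrote in between
UPDATE sc_php_session
SET data = :data, version = version + 1
WHERE session_id = :id AND version = :version_seen_at_read;
-- zero affected rows means another script won the race and this write fails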
===================================
From Clarifications 1 and 2, it follows that no matter how complicated your script, with X reads and Y writes to the session,
the session handler read() and write() methods are only called once
and they are always called
Now, there are custom PHP session handlers on the net that try to do "variable"-level locking, etc. I am still trying to figure some of them out. However, I am not in favor of complex schemes.
Assuming that PHP scripts with $_SESSION are supposed to be serving web pages and are processed in milliseconds, I do not think the additional complexity is worth it. As Peter Zaitsev mentions here, a SELECT ... FOR UPDATE with a COMMIT after the write should do the trick.
Here I am including the code that I wrote to implement locking. It would be nice to test it with some "race simulation" scripts. I believe it should work. There are not many correct implementations to be found on the net. It would be good if you can point out the mistakes. I did this with bare mysqli.
<?php
namespace com\indigloo\core {

    use \com\indigloo\Configuration as Config;
    use \com\indigloo\Logger as Logger;

    /*
     * #todo - examine row level locking between read() and write()
     *
     */
    class MySQLSession {

        private $mysqli ;

        function __construct() {
        }

        function open($path, $name) {
            $this->mysqli = new \mysqli(Config::getInstance()->get_value("mysql.host"),
                    Config::getInstance()->get_value("mysql.user"),
                    Config::getInstance()->get_value("mysql.password"),
                    Config::getInstance()->get_value("mysql.database"));

            if (mysqli_connect_errno()) {
                trigger_error(mysqli_connect_error(), E_USER_ERROR);
                exit(1);
            }

            //remove old sessions
            $this->gc(1440);
            return TRUE;
        }

        function close() {
            $this->mysqli->close();
            $this->mysqli = null;
            return TRUE;
        }

        function read($sessionId) {
            Logger::getInstance()->info("reading session data from DB");
            //start Tx - the row lock is held until write() commits
            $this->mysqli->query("START TRANSACTION");

            $sql = "select data from sc_php_session where session_id = '%s' for update";
            $sessionId = $this->mysqli->real_escape_string($sessionId);
            $sql = sprintf($sql, $sessionId);

            $result = $this->mysqli->query($sql);
            $data = '';

            if ($result) {
                $record = $result->fetch_array(MYSQLI_ASSOC);
                //guard against a missing row (first request of a new session)
                if ($record) {
                    $data = $record['data'];
                }
                $result->free();
            }

            return $data;
        }

        function write($sessionId, $data) {
            $sessionId = $this->mysqli->real_escape_string($sessionId);
            $data = $this->mysqli->real_escape_string($data);

            $sql = "REPLACE INTO sc_php_session(session_id,data,updated_on) VALUES('%s', '%s', now())";
            $sql = sprintf($sql, $sessionId, $data);

            $stmt = $this->mysqli->prepare($sql);

            if ($stmt) {
                $stmt->execute();
                $stmt->close();
            } else {
                trigger_error($this->mysqli->error, E_USER_ERROR);
            }

            //end Tx - releases the row lock taken in read()
            $this->mysqli->query("COMMIT");
            Logger::getInstance()->info("wrote session data to DB");
            return TRUE;
        }

        function destroy($sessionId) {
            $sessionId = $this->mysqli->real_escape_string($sessionId);
            $sql = "DELETE FROM sc_php_session WHERE session_id = '%s'";
            $sql = sprintf($sql, $sessionId);

            $stmt = $this->mysqli->prepare($sql);

            if ($stmt) {
                $stmt->execute();
                $stmt->close();
            } else {
                trigger_error($this->mysqli->error, E_USER_ERROR);
            }

            return TRUE;
        }

        /*
         * #param $age - number in seconds set by session.gc_maxlifetime value
         * default is 1440 seconds, or 24 mins
         *
         */
        function gc($age) {
            $sql = "DELETE FROM sc_php_session WHERE updated_on < (now() - INTERVAL %d SECOND)";
            $sql = sprintf($sql, $age);

            $stmt = $this->mysqli->prepare($sql);

            if ($stmt) {
                $stmt->execute();
                $stmt->close();
            } else {
                trigger_error($this->mysqli->error, E_USER_ERROR);
            }

            return TRUE;
        }
    }
}
?>
To register the object session handler:
$sessionHandler = new \com\indigloo\core\MySQLSession();
session_set_save_handler(array($sessionHandler, "open"),
        array($sessionHandler, "close"),
        array($sessionHandler, "read"),
        array($sessionHandler, "write"),
        array($sessionHandler, "destroy"),
        array($sessionHandler, "gc"));

ini_set('session.use_cookies', 1);
//Defaults to 1 (enabled) since PHP 5.3.0
//no passing of sessionID in URL
ini_set('session.use_only_cookies', 1);

// the following prevents unexpected effects
// when using objects as save handlers
// #see http://php.net/manual/en/function.session-set-save-handler.php
register_shutdown_function('session_write_close');
session_start();
Here is another version done with PDO. This one checks for the existence of the sessionId and does an update or insert accordingly. I have also removed the gc() call from open(), as it unnecessarily fired a SQL query on each page load; stale session cleanup can easily be done via a cron script. This should be the version to use if you are on PHP 5.x. Let me know if you find any bugs!
=========================================
<?php
namespace com\indigloo\core {

    use \com\indigloo\Configuration as Config;
    use \com\indigloo\mysql\PDOWrapper;
    use \com\indigloo\Logger as Logger;

    /*
     * custom session handler to store PHP session data in a MySQL DB
     * we use a -select for update- row level lock
     *
     */
    class MySQLSession {

        private $dbh ;

        function __construct() {
        }

        function open($path, $name) {
            $this->dbh = PDOWrapper::getHandle();
            return TRUE;
        }

        function close() {
            $this->dbh = null;
            return TRUE;
        }

        function read($sessionId) {
            //start Tx - the row lock is held until write() commits
            $this->dbh->beginTransaction();

            $sql = "select data from sc_php_session where session_id = :session_id for update";
            $stmt = $this->dbh->prepare($sql);
            $stmt->bindParam(":session_id", $sessionId, \PDO::PARAM_STR);
            $stmt->execute();

            $result = $stmt->fetch(\PDO::FETCH_ASSOC);
            $data = '';

            if ($result) {
                $data = $result['data'];
            }

            return $data;
        }

        function write($sessionId, $data) {
            $sql = "select count(session_id) as total from sc_php_session where session_id = :session_id";
            $stmt = $this->dbh->prepare($sql);
            $stmt->bindParam(":session_id", $sessionId, \PDO::PARAM_STR);
            $stmt->execute();

            $result = $stmt->fetch(\PDO::FETCH_ASSOC);
            $total = $result['total'];

            if ($total > 0) {
                //existing session
                $sql2 = "update sc_php_session set data = :data, updated_on = now() where session_id = :session_id";
            } else {
                $sql2 = "insert into sc_php_session(session_id,data,updated_on) values(:session_id, :data, now())";
            }

            $stmt2 = $this->dbh->prepare($sql2);
            $stmt2->bindParam(":session_id", $sessionId, \PDO::PARAM_STR);
            $stmt2->bindParam(":data", $data, \PDO::PARAM_STR);
            $stmt2->execute();

            //end Tx - releases the row lock taken in read()
            $this->dbh->commit();
            return TRUE;
        }

        /*
         * destroy is called via session_destroy
         * However it is better to clear the stale sessions via a CRON script
         */
        function destroy($sessionId) {
            $sql = "DELETE FROM sc_php_session WHERE session_id = :session_id";
            $stmt = $this->dbh->prepare($sql);
            $stmt->bindParam(":session_id", $sessionId, \PDO::PARAM_STR);
            $stmt->execute();
            return TRUE;
        }

        /*
         * #param $age - number in seconds set by session.gc_maxlifetime value
         * default is 1440 seconds, or 24 mins
         *
         */
        function gc($age) {
            $sql = "DELETE FROM sc_php_session WHERE updated_on < (now() - INTERVAL :age SECOND)";
            $stmt = $this->dbh->prepare($sql);
            $stmt->bindParam(":age", $age, \PDO::PARAM_INT);
            $stmt->execute();
            return TRUE;
        }
    }
}
?>
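Registration is the same as for the mysqli version above (repeated here for completeness):
$sessionHandler = new \com\indigloo\core\MySQLSession();
session_set_save_handler(array($sessionHandler, "open"),
        array($sessionHandler, "close"),
        array($sessionHandler, "read"),
        array($sessionHandler, "write"),
        array($sessionHandler, "destroy"),
        array($sessionHandler, "gc"));
//no passing of sessionID in URL
ini_set('session.use_only_cookies', 1);
register_shutdown_function('session_write_close');
session_start();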

I just wanted to add (and you may already know) that PHP's default session storage (which uses files) does lock the session files. Obviously, using files for sessions has plenty of shortcomings, which is probably why you are looking at a database solution.

Check with mysql_affected_rows() whether the lock was obtained. If it was obtained - proceed. If not - re-attempt the operation every 0.5 seconds. If after 40 seconds the lock is still not obtained, throw an exception.
I see a problem in blocking script execution with this continual check for a lock. You're suggesting that PHP run for up to 40 seconds looking for this lock every time the session is initialized (if I'm reading that correctly).
Recommendation
If you have a clustered environment, I would highly recommend memcached. It supports a server/client relationship, so all clustered instances can defer to the memcached server. It doesn't have the locking issues you're fearful of, and it's plenty fast. A quote from their page:
Regardless of what database you use (MS-SQL, Oracle, Postgres, MySQL-InnoDB, etc..), there's a lot of overhead in implementing ACID properties in a RDBMS, especially when disks are involved, which means queries are going to block. For databases that aren't ACID-compliant (like MySQL-MyISAM), that overhead doesn't exist, but reading threads block on the writing threads. memcached never blocks.
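With the memcached PECL extension installed, switching the session store over is just configuration (a sketch; "mc-server:11211" is a placeholder for your memcached host and port):
// Sketch: requires the memcached PECL extension.
ini_set('session.save_handler', 'memcached');
ini_set('session.save_path', 'mc-server:11211'); // placeholder host:port
session_start();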
Otherwise, if you're still committed to an RDBMS session store (and worried that locking will become a problem), you could try some sort of sharding based on a sticky session identifier (grasping at straws here.) Knowing nothing else about your architecture, that's about as specific as I can get.

My question is why lock at all? Why not just let the last write succeed? You shouldn't be using session data as a cache, so writes tend to be infrequent, and in practice never trample each other.

Related

Which Method is More Practical or Orthodox When Importing Large Data?

I have a file whose function is to import data into a SQL database from an API. A problem I encountered is that the API can only retrieve a maximum dataset size of 1000, even though I sometimes need to retrieve large amounts of data, ranging from 10 to 200,000 records. My first thought was to create a while loop in which I make calls to the API until all of the data is properly retrieved, after which I can enter it into the database.
$moreDataToImport = true;
$lastId = null;
$query = '';
while ($moreDataToImport) {
    $result = json_decode(callToApi($lastId));
    $query .= formatResult($result);
    $moreDataToImport = !empty($result['dataNotExported']);
    $lastId = getLastId($result['customers']);
}
mysqli_multi_query($con, $query);
The issue I encountered with this is that I was quickly reaching memory limits. The easy solution would be to simply increase the memory limit until it is sufficient. How much memory I need, however, is unknown, because there is always the possibility that I need to import a very large dataset and can theoretically always run out of memory. I don't want to set an infinite memory limit, as the problems with that are unimaginable.
My second solution was, instead of looping through the imported data, to send each batch to my database and then do a page refresh, with a GET request specifying the last ID I left off on.
if (isset($_GET['lastId']))
    $lastId = $_GET['lastId'];
else
    $lastId = null;

$result = json_decode(callToApi($lastId));
$query = formatResult($result);
mysqli_multi_query($con, $query);
if (!empty($result['dataNotExported'])) {
    header('Location: ./page.php?lastId='.getLastId($result['customers']));
}
This solution solves my memory limit issue; however, now I have another issue: browsers, after 20 redirects (depending on the browser), will automatically kill the program to stop a potential redirect loop, then shortly refresh the page. The solution would be to kill the program yourself at the 20th redirect and allow it to do a page refresh, continuing the process.
if (isset($_GET['redirects'])) {
    $redirects = $_GET['redirects'];
    if ($redirects == '20') {
        if ($lastId == null) {
            header("Location: ./page.php?redirects=2");
        }
        else {
            header("Location: ./page.php?lastId=$lastId&redirects=2");
        }
        exit;
    }
}
else
    $redirects = '1';
Though this solves my issues, I am afraid this is more impractical than other solutions, and there must be a better way to do this. Are these, along with possibly running out of memory, my only two choices? And if so, is one more efficient/orthodox than the other?
Do the insert query inside the loop that fetches each page from the API, rather than concatenating all the queries.
$moreDataToImport = true;
$lastId = null;
$query = '';
while ($moreDataToImport) {
    $result = json_decode(callToApi($lastId));
    $query = formatResult($result);
    mysqli_query($con, $query);
    $moreDataToImport = !empty($result['dataNotExported']);
    $lastId = getLastId($result['customers']);
}
Page your work. Break it up into smaller chunks that will be below your memory limit.
If the API only returns 1000 at a time, then only process 1000 at a time in a loop. In each iteration of the loop you'll query the API, process the data, and store it. Then, on the next iteration, you'll be using the same variables so your memory won't skyrocket.
A couple things to consider:
If this becomes a long-running script, you may hit the default script execution time limit - so you'll have to extend that with set_time_limit().
Some browsers will consider scripts that run too long to be timed out and will show the appropriate error message.
For processing upwards of 200,000 pieces of data from an API, I think the best solution is to not make this work dependent on a page load. If possible, I'd put this in a cron job to be run by the server on a regular schedule.
If the dataset is dependent on the request (for example, if you're processing temperatures from one of thousands of weather stations, with the specific station ID set by the user), then consider creating a secondary script that does the work. Calling and forking the secondary script from your primary script will enable your primary script to finish execution while the secondary script executes in the background on your server. Something like:
exec('php path/to/secondary-script.php > /dev/null &');

Prevent PHP from sending multiple emails when running parallel instances

This is more of a logic question than a language question, though the approach might vary depending on the language. In this instance I'm using ActionScript and PHP.
I have a flash graphic that is getting data stored in a mysql database served from a PHP script. This part is working fine. It cycles through database entries every time it is fired.
The graphic is not on a website, but is being used at 5 locations, set to load and run at regular intervals (all 5 locations fire at the same time, or at least within <500ms of each other). This is real-time info, so time is of the essence; currently the script loads and parses at all 5 locations in between 30ms and 300ms (depending on the distance from the server).
I was originally having a pagination problem, where each of the 5 locations would pull a different database entry, since I was moving to the next entry every time the script ran. I solved this by setting the script to only move to the next entry after a certain amount of time had passed, solving the problem.
However, I also need the script to send an email every time it displays a new entry, I only want it to send one email. I've attempted to solve this by adding a "has been emailed" boolean to the database. But, since all the scripts run at the same time, this rarely works (it does sometimes). Most of the time I get 5 emails sent. The timeliness of sending this email doesn't have to be as fast as the graphic gets info from the script, 5-10 second delay is fine.
I've been trying to come up with a solution for this. Currently I'm thinking of spawning a python script through PHP that has a random delay (between 2 and 5 seconds), hopefully alleviating the problem. However, I'm not quite sure how to run the exec() command from PHP without the script waiting for the command to finish. Or is there a better way to accomplish this?
UPDATE: here is my current logic (relevant code only):
//get the top "unread" information from the database
$query = "SELECT * FROM database WHERE Read = '0' ORDER BY Entry ASC LIMIT 1";

//DATA
$emailed = $row["emailed"];
$Entry = $row["databaseEntryID"];

if($emailed == 0)
{
    **CODE TO SEND EMAIL**

    $EmailSent = "UPDATE database SET emailed = '1' WHERE databaseEntryID = '$Entry'";
    $mysqli->query($EmailSent);
}
Thanks!
You need to use some kind of locking, e.g. database locking:
function send_email_sync($message)
{
    //atomically claim the job; only one process will flip the flag
    sql_query("UPDATE email_table SET email_sent=1 WHERE email_sent=0");
    $result = FALSE;
    if(number_of_affected_rows() == 1) {
        send_email_now($message);
        $result = TRUE;
    }
    return $result;
}
The functions sql_query and number_of_affected_rows need to be adapted to your particular database.
Old answer:
Use file-based locking: (only works if the script only runs on a single server)
function send_email_sync($message)
{
    $fd = fopen(__FILE__, "r");
    if(!$fd) {
        die("something bad happened in ".__FILE__.":".__LINE__);
    }
    $result = FALSE;
    if(flock($fd, LOCK_EX | LOCK_NB)) {
        if(!email_has_already_been_sent()) {
            actually_send_email($message);
            mark_email_as_sent();
            $result = TRUE; //email has been sent
        }
        flock($fd, LOCK_UN);
    }
    fclose($fd);
    return $result;
}
You will need to lock the row in your database by using a transaction.
pseudo code:
Start transaction
select row .. for update
update row
commit
if (mysqli_affected_rows($connection) == 1)
    send_email();
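A rough mysqli translation of that pseudo code (a sketch using the table and column names from the question; not tested):
$mysqli->query("START TRANSACTION");
// lock the row: parallel instances block here until the winner commits
$result = $mysqli->query("SELECT emailed FROM database
                          WHERE databaseEntryID = '$Entry' AND emailed = 0
                          FOR UPDATE");
if ($result && $result->num_rows == 1) {
    $mysqli->query("UPDATE database SET emailed = '1' WHERE databaseEntryID = '$Entry'");
    $mysqli->query("COMMIT");
    send_email(); // only the instance that claimed the row gets here
} else {
    $mysqli->query("COMMIT"); // another instance already sent it
}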

query if an entry has changed since last check and continuously check for a time

The High Level Idea:
I have a micro controller that can connect to my site via an HTTP request... I want to feed the device a response as soon as a change is noted in the database.
Since the end device is a client (i.e. a micro controller), I'm unaware of a method to push data to the client without having to set up port forwarding, which is heavily undesired. The problem arises when trying to send data from an external network to an internal one: either A. use port forwarding, or B. have the client device initiate the request, which leads me to the idea of having the device send an HTTP request to a file that polls for changes.
Update:
Much thanks to Ollie Jones; I have implemented some of his suggestions here.
Jason McCreary suggested having a modified column, which is a big improvement as it should increase speed and reliability. Great suggestion! :)
If overworking the database is a concern in this example, maybe the following would work: when the data is inserted into the database, the changes are written to a file; then have the loop continuously check that file for an update. Thoughts?
I have table1 and I want to see if a specific row (based on a UID/key) has been updated since the last time I checked, as well as continuously check for 60 seconds if the record gets updated.
I'm thinking I can do this using the INFORMATION_SCHEMA database.
This database contains information about tables, views, columns, etc.
attempt at a solution:
<?php
$timer = time() + 60; // check for up to 60 seconds
$KEY = $_POST['KEY'];
$done = 0;

if(isset($KEY)) {
    //login stuff
    require_once('Connections/check.php');
    $mysqli = mysqli_connect($hostname_check, $username_check, $password_check, $database_check);
    if (mysqli_connect_errno())
    { echo "Failed to connect to MySQL: " . mysqli_connect_error(); }
    //end login

    //note: $KEY should be escaped or bound as a parameter in real code
    $query = "SELECT data1, data2
              FROM station
              WHERE client = $KEY
              AND noted = 0;";
    $update = "UPDATE station
               SET noted = 1
               WHERE client = $KEY
               AND noted = 0;";

    while($done == 0) {
        $result = mysqli_query($mysqli, $query);
        mysqli_query($mysqli, $update); //do not overwrite $update with the result
        $row_cnt = mysqli_num_rows($result);
        if ($row_cnt > 0) {
            $row = mysqli_fetch_array($result);
            echo 'data1:'.$row['data1'].'/';
            echo 'data2:'.$row['data2'].'/';
            print $row[0];
            $done = 1;
        }
        else {
            $current = time();
            //if I haven't had an update yet, loop back and check again for up to 60 seconds
            if($timer > $current) { $done = 0; sleep(1); }
            //60 seconds passed, end the loop
            else { $done = 1; echo 'done:nochange'; }
        }
    }
    mysqli_close($mysqli);
    echo 'time:'.time();
}
else { echo 'error:nokey'; }
?>
Is this an adequate method? Any suggestions to improve the speed as well as the reliability?
If I understand your application correctly, your client is a microcontroller. It issues an HTTP request to your php / mysql web app once in a while. The frequency of that request is up to the microcontroller, but it seems to be once a minute or so.
The request basically asks, "dude, got anything new for me?"
Your web app needs to send the answer, "not now" or "here's what I have."
Another part of your app is providing the information in question. And it's doing so asynchronously with your microcontroller (that is, whenever it wants to).
To make the microcontroller query efficient is your present objective.
(Note, if I have any of these assumptions wrong, please correct me.)
Your table will need a last_update column, a which_microcontroller column or the equivalent, and a notified column. Just for grins, let's also put in value1 and value2 columns. You haven't told us what kind of data you're keeping in the table.
Your software which updates the table needs to do this:
UPDATE theTable
SET notified = 0, last_update = now(),
    value1 = ?data,
    value2 = ?data
WHERE which_microcontroller = ?microid
It can do this as often as it needs to. The new data values replace and overwrite the old ones.
Your software which handles the microcontroller request needs to do this sequence of queries:
START TRANSACTION;
SELECT value1, value2
FROM theTable
WHERE notified = 0
AND microcontroller_id = ?microid
FOR UPDATE;
UPDATE theTable
SET notified=1
WHERE microcontroller_id = ?microid;
COMMIT;
This will retrieve the latest value1 and value2 items (your application's data, whatever it is) from the database, if it has been updated since last queried. Your php program which handles that request from the microcontroller can respond with that data.
If the SELECT statement returns no rows, your php code responds to the microcontroller with "no changes."
This all assumes microcontroller_id is a unique key. If it isn't, you can still do this, but it's a little more complicated.
Notice we didn't use last_update in this example. We just used the notified flag.
If you want to wait until sixty seconds after the last update, it's possible to do that. That is, if you want to wait until value1 and value2 stop changing, you could do this instead.
START TRANSACTION;
SELECT value1, value2
FROM theTable
WHERE notified = 0
AND last_update <= NOW() - INTERVAL 60 SECOND
AND microcontroller_id = ?microid
FOR UPDATE;
UPDATE theTable
SET notified=1
WHERE microcontroller_id = ?microid;
COMMIT;
For these queries to be efficient, you'll need this index:
(microcontroller_id, notified, last_update)
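Creating it is a one-liner (a sketch; adjust the table name to yours):
ALTER TABLE theTable
    ADD INDEX idx_micro_notified (microcontroller_id, notified, last_update);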
In this design, you don't need to have your PHP code poll the database in a loop. Rather, you query the database when your microcontroller checks in for an update.
If all table1 changes are handled by PHP, then there's no reason to poll the database. Add the logic you need at the PHP level when you're updating table1.
For example (assuming OOP):
public function update() {
    if ($row->modified > (time() - 60)) {
        // perform code for modified in last 60 seconds
    }
    // run mysql queries
}

Running multiple PHP scripts at the same time (database loop issue)

I am running 10 PHP scripts at the same time, processing in the background on Linux.
For Example:
$i = 1; // initialize the counter (missing in the original)
while ($i <= 10) {
    exec("/usr/bin/php-cli run-process.php > /dev/null 2>&1 & echo $!");
    sleep(10);
    $i++;
}
In run-process.php, I am having a problem with a database loop. One of the processes might already have updated the status field to 1, but it seems the other PHP processes are not seeing it. For example:
$SQL = "SELECT * FROM data WHERE status = 0";
$query = $db->prepare($SQL);
$query->execute();
while ($row = $query->fetch(PDO::FETCH_ASSOC)) {
$SQL2 = "SELECT status from data WHERE number = " . $row['number'];
$qCheckAgain = $db->prepare($SQL2);
$qCheckAgain->execute();
$tempRow = $qCheckAgain->fetch(PDO::FETCH_ASSOC);
//already updated from other processs?
if ($tempRow['status'] == 1) {
continue;
}
doCheck($row)
sleep(2)
}
How do I ensure the processes are not re-doing the same data again?
When you have multiple processes, you need to have each process take "ownership" of a certain set of records. Usually you do this by doing an update with a LIMIT clause, then selecting the records that were just "owned" by the script.
For example, have a field that specifies whether the record is available for processing (i.e. a value of 0 means it is available). Then your update would set the value of the field to the script's process ID, or some other number unique to the process. Then you select on the process ID. When you're done processing, you can set it to a "finished" number, like 1. Update, select, update, repeat.
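A sketch of that claim-then-select pattern with PDO (using the data table and status column from the question; the process ID stands in for a unique owner id):
$pid = getmypid(); // unique enough per process for this sketch
// claim up to 10 available rows atomically
$db->exec("UPDATE data SET status = $pid WHERE status = 0 LIMIT 10");
// work only on the rows this process owns
$rows = $db->query("SELECT * FROM data WHERE status = $pid")->fetchAll(PDO::FETCH_ASSOC);
foreach ($rows as $row) {
    doCheck($row);
}
// mark them as finished
$db->exec("UPDATE data SET status = 1 WHERE status = $pid");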
The reason why your scripts execute the same query multiple times is the parallelisation you are creating. Process 1 reads from the database, process 2 reads from the database, and both start to process their data.
Databases provide transactions in order to get rid of such race conditions. Have a look at what PDO provides for handling database transactions.
I am not entirely sure of how/what you are processing.
You could introduce a LIMIT clause and pass an offset as a parameter, so the first process does the first 10, the second does the next 10, and so on.
You need a lock, such as SELECT ... FOR UPDATE.
InnoDB supports row-level locks.
See http://dev.mysql.com/doc/refman/5.0/en/innodb-locking-reads.html for details.

How do I prevent a PHP script from running twice at the same time?

I am using a script that syncs data between two databases (a local one and a hosted domain). I made the sync occur when someone logs out of the CMS. Unfortunately, I have seen cases where two logouts happen close enough together that the sync script is executed twice. It's not the part of the system I'm trying to change, but I was expecting an easy way to make that script (sync.php) not execute if it's already running.
I have come up with this test script:
$db = new DB_Mysql();

if ( $db->transactionBegun() == "0" )
{
    $_Continue = true;
    $_Continue = $_Continue && $db->squery( "START TRANSACTION" );
    $_Continue = $_Continue && $db->squery( "SET @TransactionBegun = true" );

    sleep( 10 );

    $_Continue = $_Continue && $db->squery( "INSERT INTO tbConfiguration VALUES( NOW(), NOW() )" );
    $_Continue = $_Continue && $db->squery( "SET @TransactionBegun = false" );

    if ( $_Continue )
    {
        $db->query( "COMMIT" );
    } else {
        $db->query( "ROLLBACK" );
    }
}
The transactionBegun method consists of:
function transactionBegun()
{
    $_ResultSet = $this->query("SELECT @TransactionBegun AS TransactionBegun")->fetch_all(true);
    return $_ResultSet[ 'TransactionBegun' ];
}
For some strange alien reason, I can run the script twice (at the same time) and both entries will be made.
Any ideas??
Edit:
I forgot to mention that I am using mysql_pconnect() and that the server is hosted on a Windows machine. The whole system is a cash register system using a tablet PC. The only client accessing the server is "localhost".
The variables are per-connection, not per-user. With two independent scripts, you'll have two separate, independent connections, each with its own variable space.
If you need to truly handle locking out parallel usage, you'll need to use a server-side lock acquired via GET_LOCK() or use a transaction mode that does exclusive locking on the resource.
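For example, with GET_LOCK() (a sketch using the question's DB_Mysql wrapper; 'sync_lock' is an arbitrary lock name, 0 means "do not wait", and run_sync() is a placeholder for the actual sync work):
$_ResultSet = $db->query("SELECT GET_LOCK('sync_lock', 0) AS GotLock")->fetch_all(true);
if ($_ResultSet['GotLock'] == 1) {
    //we are the only running instance; do the sync
    run_sync();
    $db->query("SELECT RELEASE_LOCK('sync_lock')");
} else {
    //another instance holds the lock; skip this run
}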
Use file-based locks.
When the script begins, have it check for a file like "sync_db.lock". If it exists, another instance of the script is running. At this point, you can choose to sleep for a few seconds and begin again, or simply give up.
If it doesn't exist, create the file and complete the DB transaction. When the transaction is completed, simply delete the file.
To avoid issues with failed threads, write the current timestamp to the file. When the script checks for the file, read its contents and see if a given time period has passed. Don't forget to overwrite the timestamp if your script continues with the transaction.
A little pseudo code:
check to see if a file (/var/tmp/.trans.lck) is there
if it is, exit
if not, create the lock file
when finished, remove the lock file
