How exactly do transactions with PHP PDO work with concurrency?

I'm making a webapp where there'll be multiple users interacting with each other and reading/making decisions on/modifying shared data.
I've read that transactions are atomic, which is what I need. However, I'm not sure how this works with PHP's PDO::beginTransaction().
I mean atomic as in: if one transaction is editing some data, all other transactions modifying/reading that data will need to wait until the first transaction finishes. I don't want two scripts each reading a value, incrementing the old one, and effectively storing only one increment. The second script should have to wait for the first one to finish.
In almost all the examples I've seen, the queries are executed consecutively (for example, PHP + MySQL transactions examples). A lot of what I'm doing requires:
querying and fetching
checking that data, and acting on it, as part of the same transaction
Will the transaction still work atomically if there is PHP code between queries?
I know you should prepare your statements outside the transaction, but is it okay to prepare them inside? Basically, I'm worried about PHP activity ruining the atomicity of a transaction.
Here's an example (this one doesn't require checking a previous value). I have a very basic inbox system which stores mail as a serialized array (if someone has a better recommendation please let me know). So I query for it, append the new message, and store it. Will it work as expected?
$getMail = $con->prepare('SELECT messages FROM inboxes WHERE id=?');
$storeMail = $con->prepare('UPDATE inboxes SET messages=? WHERE id=?');
$con->beginTransaction();
// fetch the current mailbox contents
$getMail->execute(array($recipientID));
$result = $getMail->fetch();
$result = unserialize($result[0]);
// append the new message and write the mailbox back
$result[] = $msg;
$storeMail->execute(array(serialize($result), $recipientID));
$con->commit();

Transactions are atomic only with respect to other database connections trying to use the same data: other connections will see either none of the changes made by your transaction, or all of them. "Atomic" means no other database connection will see an in-between state with some data updated and some not.
PHP code between queries won't break atomicity, and it doesn't matter whether you prepare your statements inside or outside the transaction.
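Note that atomicity alone doesn't give you the wait-for-the-first-transaction behavior the question describes: two transactions can both read the old value before either writes. For that you also need to lock the row you read. A minimal sketch, assuming an InnoDB table, using SELECT ... FOR UPDATE with the names from the question:
$getMail = $con->prepare('SELECT messages FROM inboxes WHERE id=? FOR UPDATE');
$storeMail = $con->prepare('UPDATE inboxes SET messages=? WHERE id=?');
$con->beginTransaction();
// FOR UPDATE locks the row: a second transaction running the same
// SELECT will block here until this transaction commits or rolls back.
$getMail->execute(array($recipientID));
$result = $getMail->fetch();
$messages = unserialize($result[0]);
$messages[] = $msg;
$storeMail->execute(array(serialize($messages), $recipientID));
$con->commit();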

Related

Using PDO, is there a way to handle a transaction across two drivers?

So, let's say I'm using two drivers at the same time (specifically mysql and sqlite3).
I have a set of changes that must be commit()ted on both connections only if neither DBMS failed, or rollBack()ed if one or the other did fail:
<?php
interface DBList
{
    function addPDO(PDO $connection);

    // calls ->rollBack() on all the PDO instances
    function rollBack();

    // calls ->commit() on all the PDO instances
    function commit();

    // calls ->beginTransaction() on all the PDO instances
    function beginTransaction();
}
Question is: will it actually work? Does it make sense?
"Why not use just mysql?" you would say! I'm not a masochist! I need mysql for the classic fruition via my application, but I also need to keep a copy of a table that is always synchronized and that is also downloadable and portable!
Thank you a lot in advance!
I suspect you put the cart before the horse! If
- the two databases are in sync
- a transaction commits successfully on one DB
- no OS-level error occurs
then the transaction will also commit successfully on the second DB.
So what you would want to do is:
- Start the transaction on MySQL
- Record all data-changing SQL (see later)
- Commit the transaction on MySQL
- If the commit works, run the recorded SQL against SQLite
- if not, roll back MySQL
Caveat: The assumption above is only valid, if the sequence of transactions is identical on both DBs. So you would want to record the SQL into a MySQL table, which is subject to the same transaction logic as the rest. This does the serialization work for you.
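A minimal sketch of that record-and-replay idea, assuming PDO connections $mysql and $sqlite and a hypothetical replay_log table living in the same MySQL database (the statement list and table names are illustrative, not a definitive implementation):
$statements = array(
    array('INSERT INTO t (a, b) VALUES (?, ?)', array(1, 'x')),
    array('UPDATE t SET b = ? WHERE a = ?', array('y', 1)),
);

$mysql->beginTransaction();
try {
    foreach ($statements as $pair) {
        list($sql, $params) = $pair;
        $mysql->prepare($sql)->execute($params);
        // Record the statement in a table covered by the same transaction,
        // so the replay log commits (or rolls back) together with the data.
        $mysql->prepare('INSERT INTO replay_log (sql_text, params) VALUES (?, ?)')
              ->execute(array($sql, json_encode($params)));
    }
    $mysql->commit();
} catch (PDOException $e) {
    $mysql->rollBack();
    throw $e;
}

// Only after MySQL has committed, replay the same statements against SQLite.
$sqlite->beginTransaction();
foreach ($statements as $pair) {
    list($sql, $params) = $pair;
    $sqlite->prepare($sql)->execute($params);
}
$sqlite->commit();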
You're mistaking PDO for a database server. PDO is just an interface, pretty much like the database console. It doesn't perform any data operations of its own: it cannot insert or select data, it cannot take data locks or run transactions. All it can do is send your command to the database server and bring back the results, if any. It's just an interface; it doesn't have transactions of its own.
So, instead of such fictional cross-driver transactions you can use regular ones.
Start two, one per driver, and then roll them back accordingly. By the way, with PDO you don't have to roll back manually. Just set PDO to exception mode, write your queries, and add a commit at the end. If one of the queries fails, all started transactions will be rolled back automatically when the script terminates.
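A minimal sketch of that approach, assuming $mysql and $sqlite are PDO connections already set to PDO::ERRMODE_EXCEPTION (the accounts table is made up for illustration):
$mysql->beginTransaction();
$sqlite->beginTransaction();

try {
    $mysql->exec("UPDATE accounts SET balance = balance - 10 WHERE id = 1");
    $sqlite->exec("UPDATE accounts SET balance = balance - 10 WHERE id = 1");

    // Commit both only if every statement succeeded.
    $mysql->commit();
    $sqlite->commit();
} catch (PDOException $e) {
    // Optional: uncommitted transactions are also rolled back
    // automatically when the script dies, as described above.
    if ($mysql->inTransaction())  $mysql->rollBack();
    if ($sqlite->inTransaction()) $sqlite->rollBack();
    throw $e;
}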

Executing Async prepared statements in PHP with postgreSQL and ignoring its results

I have some kind of (basic) logging of user actions in a PostgreSQL database.
In order to gain performance, I execute all log inserts asynchronously, so the script can continue without waiting until the log entry is created.
I use prepared statements everywhere to prevent SQL injection, and load them on an as-needed basis.
The problem comes when there are pending results to be fetched from a previous async query when I prepare a statement. (PostgreSQL says there are pending results to be fetched prior to preparing a new statement.)
So as a workaround, I gather all pending results (if any) and ignore them, to make PHP and PostgreSQL happy, before preparing any statement.
But with that workaround (as I see it), I lose the performance I could gain by executing asynchronously, as I have to gather the results anyway.
Is there any way to asynchronously execute a prepared statement and deliberately tell postgres to ignore its results?
Inside my PostgreSQL class, I am executing prepared statements with
pg_send_execute($this->resource, $name, $params);
and preparing them with
// Just in case there are pending results from an earlier async query,
// drain and discard them before preparing (workaround)
while (pg_get_result($this->resource) !== FALSE);
$stmt = pg_prepare($this->resource, $stmtname, $query);
Any help will be appreciated.
UPDATE: All asynchronous queries I am using are only INSERT ones, so it should be safe (theoretically) to ignore their results.
The only thing that is asynchronous is your communication with the PostgreSQL server; the database itself still has to process everything sequentially.
My proposal:
If you have to use PostgreSQL for logging, use a separate database connection for logging purposes, and put a connection pool between your script and the database: authentication in PostgreSQL is costly and takes some time, and pooling will cut that down. Acquiring a second connection still takes some time, but with a pool it will be faster than without one.
Depending on your reliability requirements, you should use autocommit (to never lose a log entry when PHP crashes). You may also want to use an UNLOGGED table (available since PostgreSQL 9.1) if you don't care about reliability on the database end (inserts are faster since the data skips the WAL), or if you don't use replication or don't need the logs replicated.
As a speed optimization, your log table should have no indexes because they would have to be updated on each insert. If you need them, create a second table and move data in a batch (every X minutes or every hour).
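A minimal sketch of the separate-logging-connection idea, assuming a hypothetical user_actions table (the connection strings and names are illustrative):
$main = pg_connect('host=localhost dbname=app');
$log  = pg_connect('host=localhost dbname=app', PGSQL_CONNECT_FORCE_NEW);

pg_prepare($log, 'log_action', 'INSERT INTO user_actions (action) VALUES ($1)');

function log_async($log, $action)
{
    // Drain any results left over from a previous async INSERT; since
    // these are INSERTs only, discarding the results is safe.
    while (pg_get_result($log) !== FALSE);
    pg_send_execute($log, 'log_action', array($action));
}

log_async($log, 'user logged in');
// The main connection stays free for regular, synchronous queries.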

mysqli_fetch_assoc - what happens if the data is changed in the meanwhile?

In PHP I'm using mysqli_fetch_assoc() in a while-loop to get every record in a certain query.
I'm wondering what happens if the data is changed while running the loop (by another process or server), so that the record doesn't match the query any more. Will it still be fetched?
In other words, is the array of records that are fetched fixed, when you do query()? Or is it not?
Update:
I understand that it's a feature that the resultset is not changed when the data is changed, but what if you actually WANT that? In my loop I'm not interested in records that are already updated by another server. How do I check for that, without doing a new query for each record that I fetch??
UPDATE:
Detailed explanation:
I'm working on a kind of search-engine scraper that searches for values in a database. This is done by a few servers at the same time. Items that have already been scraped shouldn't be searched any more. I can't really control which server searches which item, so I was hoping I could check the status of an item while fetching the recordset. Since it's a big dataset, I don't transfer the entire resultset before searching; I fetch each record when I need it...
Introduction
I'm wondering what happens if the data is changed while running the loop (by another process or server), so that the record doesn't match the query any more. Will it still be fetched?
Yes.
In other words, is the array of records that are fetched fixed, when you do query()? Or is it not?
Yes.
A DBMS would not be worth its salt were it vulnerable to race conditions between table updates and query resultset iteration.
Certainly, as far as the database itself is concerned, your SELECT query has completed before any data can be changed; the resultset is cached somewhere in the layers between your database and your PHP script.
In-depth
With respect to the ACID principle *:
In the context of databases, a single logical operation on the data is called a transaction.
User-instigated TRANSACTIONs can encompass several consecutive queries, but 4.33.4 and 4.33.5 in ISO/IEC 9075-2 describe how this takes place implicitly on the per-query level:
The following SQL-statements are transaction-initiating SQL-statements, i.e., if there is no current SQL-transaction, and an SQL-statement of this class is executed, then an SQL-transaction is initiated, usually before execution of that SQL-statement proceeds:
All SQL-schema statements
The following SQL-transaction statements:
<start transaction statement>.
<savepoint statement>.
<commit statement>.
<rollback statement>.
The following SQL-data statements:
[..]
<select statement: single row>.
<direct select statement: multiple rows>.
<dynamic single row select statement>.
[..]
[..]
In addition, 4.35.6:
Effects of SQL-statements in an SQL-transaction
The execution of an SQL-statement within an SQL-transaction has no effect on SQL-data or schemas [..]. Together with serializable execution, this implies that all read operations are repeatable within an SQL-transaction at isolation level SERIALIZABLE, except for:
1) The effects of changes to SQL-data or schemas and its contents made explicitly by the SQL-transaction itself.
2) The effects of differences in SQL parameter values supplied to externally-invoked procedures.
3) The effects of references to time-varying system variables such as CURRENT_DATE and CURRENT_USER.
Your wider requirement
I understand that it's a feature that the resultset is not changed when the data is changed, but what if you actually WANT that? In my loop I'm not interested in records that are already updated by another server. How do I check for that, without doing a new query for each record that I fetch??
You may not.
Although you can control the type of buffering performed by your connector (in this case, MySQLi), you cannot override the above-explained low-level fact of SQL: no INSERT or UPDATE or DELETE will have an effect on a SELECT in progress.
Once the SELECT has completed, the results are independent; it is the buffering of transport of this independent data that you can control, but that doesn't really help you to do what it sounds like you want to do.
This is rather fortunate, frankly, because what you want to do sounds rather bizarre!
* Strictly speaking, MySQL offers full ACID compliance only for tables using the non-default storage engines InnoDB, BDB and Cluster, and MyISAM does not support [user-instigated] transactions. Still, the "I" should remain applicable here; MyISAM would be essentially useless otherwise.
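For completeness, here is what the buffering control mentioned above looks like; a minimal sketch, assuming a hypothetical items table with a scraped flag (process_item is a placeholder). MYSQLI_USE_RESULT streams rows to PHP as you fetch them instead of buffering the whole resultset, but the rows still reflect the state of the data when the SELECT ran:
$mysqli = new mysqli('localhost', 'user', 'pass', 'db');

// Unbuffered query: rows are transferred one at a time as you fetch them.
$result = $mysqli->query('SELECT id, value FROM items WHERE scraped = 0',
                         MYSQLI_USE_RESULT);
while ($row = $result->fetch_assoc()) {
    // Even unbuffered, a row updated by another server after the SELECT
    // began is still delivered here with its original values.
    process_item($row);
}
$result->free();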

Can I use a database value right after I insert it?

Can I insert something into a MySQL database using PHP and then immediately make a call to access that, or is the insert asynchronous (in which case the possibility exists that the database has not finished inserting the value before I query it)?
What I think the OP is asking is this:
<?php
$id = $db->insert(...);
// in this case, $row will always have the data you just inserted!
$row = $db->select(... where id = $id ...);
?>
In this case, if you do an insert, you will always be able to access the last inserted row with a select. That doesn't change even if a transaction is used here.
If the value is inserted in a transaction, it won't be accessible to any other transaction until your original transaction is committed. Other than that it ought to be accessible at least "very soon" after the time you commit it.
There are normally two ways of using MySQL (and most other SQL databases, for that matter):
Transactional. You start a transaction (either implicitly or by issuing something like 'BEGIN'), issue commands, and then either explicitly commit the transaction, or roll it back (failing to take any action before cutting off the database connection will result in automatic rollback).
Auto-commit. Each statement is automatically committed to the database as it's issued.
The default mode may vary, but even if you're in auto-commit mode, you can "switch" to transactional just by issuing a BEGIN.
If you're operating transactionally, any changes you make to the database will be local to your db connection/instance until you issue a commit. Issuing a commit should block until the transaction is fully committed, so once it returns without error, you can assume the data is there.
If you're operating in auto-commit (and your database library isn't doing something really strange), you can rely on data you've just entered to be available as soon as the call that inserts the data returns.
Note that best practice is to always operate transactionally. Even if you're only issuing a single atomic statement, it's good to be in the habit of properly BEGINing and COMMITing a transaction. It also saves you from trouble when a new version of your database library switches to transactional mode by default and suddenly all your one-line SQL statements never get committed. :)
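A minimal sketch of that habit with PDO, assuming a connection $db in exception mode and a made-up users table:
$db->beginTransaction();
$db->prepare('INSERT INTO users (name) VALUES (?)')->execute(array('alice'));
$id = $db->lastInsertId();
$db->commit(); // blocks until the transaction is fully committed

// Once commit() returns without error, the row is visible to any connection.
$stmt = $db->prepare('SELECT * FROM users WHERE id = ?');
$stmt->execute(array($id));
$row = $stmt->fetch();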
Mostly the answer is yes. You would have to do some special work to force a database call to be asynchronous in the way you describe, and as long as you're doing it all in the same thread, you should be fine.
What is the context in which you're asking the question?

Dependent insertion of data into MySql table

I have 2 tables:
user_tb.username
user_tb.point
review_tb.username
review_tb.review
I am coding in PHP (CodeIgniter). I am trying to insert data into review_tb with the review the user submitted, and if that succeeds, I will award the user some points.
Well, this looks like a very simple process. We first insert the review into review_tb with the username, use PHP to check whether the query executed without problems, and if it succeeded we proceed with updating the points in user_tb.
Yea, but here comes the problem. What if inserting into review_tb is a success, but the second query, updating user_tb, is NOT a success? Can we somehow "undo" the review_tb query, or "revert" the change that we made to review_tb?
It's kind of like "all or nothing".
The purpose of this is to keep all data in the database in sync; in real life we will be managing a database with more tables, inserting interdependent data into each of them.
Please give some enlightenment on how we can do this in PHP or CodeIgniter or just MySql query.
If you want "all or nothing" behavior for your SQL operations, you are looking for transactions; here is the relevant page from the MySQL manual: 12.4.1. START TRANSACTION, COMMIT, and ROLLBACK Syntax.
Wikipedia describes those this way:
A database transaction comprises a unit of work performed within a database management system (or similar system) against a database, and treated in a coherent and reliable way independent of other transactions. Transactions in a database environment have two main purposes:
To provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure, when execution stops (completely or partially) and many operations upon a database remain uncompleted, with unclear status.
To provide isolation between programs accessing a database concurrently. Without isolation the programs' outcomes are typically erroneous.
Basically:
- you start a transaction
- you do what you have to; i.e., your first insert, and your update
- if everything is OK, you commit the transaction
- else, if there is any problem with any of your queries, you roll back the transaction, and it will cancel everything you did in that transaction
There is a manual page about transactions and CodeIgniter here.
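A minimal sketch using CodeIgniter's transaction functions and the table names from the question (the 10-point award is an assumption):
$this->db->trans_start();
$this->db->insert('review_tb', array('username' => $username, 'review' => $review));
// award points within the same transaction
$this->db->set('point', 'point + 10', FALSE)
         ->where('username', $username)
         ->update('user_tb');
$this->db->trans_complete();

if ($this->db->trans_status() === FALSE) {
    // both queries were rolled back; neither table was changed
}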
Note that, with MySQL, not every engine supports transactions; of the two most used engines, MyISAM doesn't support transactions, while InnoDB does.
Can't you use transactions? If you did both inserts inside the same transaction, then either both succeed or neither does.
Try something like
BEGIN;
INSERT INTO review_tb(username, review) VALUES(x, y);
INSERT INTO user_tb(username, point) VALUES(x, y);
COMMIT;
Note that you need to use a database engine that supports transactions (such as InnoDB).
If you have InnoDB support, use it. But when that's not possible, you can use code similar to the following:
$result = mysql_query("INSERT INTO ...");
if (!$result) return false;

$result = mysql_query("INSERT INTO somewhereelse");
if (!$result) {
    // manually undo the first insert, since MyISAM cannot roll back
    mysql_query("DELETE FROM ...");
    return false;
}
return true;
This cleanup might still fail, but it works whenever the insert query fails because of duplicates or constraints. For unexpected terminations, the only way is to use transactions.
