My question is more philosophical than technical.
A few words about Doctrine's EM. It closes the connection and clears itself if any exception occurs while it works: a failed connection to the database (a common case in long-running consumers with a low number of incoming tasks), an error in an SQL statement, or anything else related to the DB server or the EM itself. After this, the EM instance is completely unusable.
So, a real-world example: I have a queue and a consumer that runs as a console worker and waits for tasks. The consumer has the following dependencies:
EntityManager (EM)
Service1 -> depends on EM and Doctrine Repository1
Service2 -> depends on EM and Doctrine Repository2
ServiceN -> depends on EM and Doctrine RepositoryN
If the EM fails, Service(1-N) and Repository(1-N), which depend on it, will also throw errors when called, because the EM no longer works correctly. What should I do in this case?
1. "Let it crash": the worker stops with an error and is later restarted by supervisord. This leads to an increased number of useless errors in logs/stderr.
2. Do some magic with $connection->ping() on each iteration. In fact, ping() just executes SELECT 1;, so this leads to an increased number of useless queries to the DB server.
3. Same as before, but create a new EM in the consumer when the old one fails: execute ping() on each iteration and, if it fails, create a new EM. But then all services used in the consumer must be re-created too, so I need a factory for each of them. This increases the number of classes and complicates the consumer logic: either re-create all services (and their dependencies) on each iteration with the new or old EM, or detect that the EM was re-created and re-create the dependent services only in that case. It also leads to an abstraction leak: the consumer should not know which EM instance it uses, old or new, and should not have to do these ugly things.
What is the best way to deal with these things?
I would like to share some thoughts here.
1. "Leads to an increased number of useless errors in logs/stderr": I do not think these are useless errors. If your software throws an exception, you should know about it. A log file is at its best when it contains no exceptions, but that is rarely the case. In any event, any database exception, and the rate at which it occurs, should be investigated.
2. I would not rely on re-establishing the connection myself, but instead rely on the Doctrine API to reinitialize itself. This answer has some details on how to do that for several Doctrine2 versions.
3. I think this is too much logic to implement and will only complicate matters.
If I were to choose, I would go with option #1 (let-it-crash) because it is the simplest of all and it does not hide anything from us.
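As a rough illustration of option #1, here is a minimal sketch of a worker loop that does not try to repair anything: it logs the failure and returns a non-zero exit code so that a supervisor such as supervisord can restart the process with a fresh EM. The function name runWorker, the array of tasks, and the handler are hypothetical stand-ins for the real queue and services.

```php
<?php
declare(strict_types=1);

/**
 * Processes tasks until one fails, then reports the failure and returns a
 * non-zero exit code so that a process supervisor can restart the worker.
 *
 * @param iterable $tasks   hypothetical task source (a real queue in practice)
 * @param callable $handler hypothetical task handler that would use the EM
 */
function runWorker(iterable $tasks, callable $handler): int
{
    try {
        foreach ($tasks as $task) {
            $handler($task);
        }
    } catch (Throwable $e) {
        // Do not try to repair the EntityManager: log the error and give up.
        error_log(sprintf('[worker] fatal: %s: %s', get_class($e), $e->getMessage()));
        return 1;
    }

    return 0;
}

// Usage: the second task fails, so the worker stops after the first one.
$processed = [];
$exitCode = runWorker(['a', 'b', 'c'], function (string $task) use (&$processed): void {
    if ($task === 'b') {
        throw new RuntimeException('connection lost');
    }
    $processed[] = $task;
});

// A real console worker would end with: exit($exitCode);
```

The logged exception is exactly the signal argued for above: it stays visible instead of being swallowed by reconnect logic.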
I have the following relationships:
A Student has a "one-to-one" relationship with a Person. A Person has a "many-to-many" relationship with Address.
I want to persist the data in this order: first create the Person, then create the Addresses, and then create the Student.
But I want to roll back the transaction if any error occurs while persisting to any of these tables. For example, if I save the Person and the Addresses and the Student fails, I want to roll back everything.
How do I handle this with Eloquent?
Thanks.
Laravel provides a very simple way to handle this kind of situation.
DB::transaction(function () {
    // all your code
});
From the Laravel documentation:
To run a set of operations within a database transaction, you may use the transaction method on the DB facade. If an exception is thrown within the transaction Closure, the transaction will automatically be rolled back. If the Closure executes successfully, the transaction will automatically be committed. You don't need to worry about manually rolling back or committing while using the transaction method.
If you want to do it manually, you can begin and commit the transaction yourself (calling DB::rollBack() if something goes wrong):
DB::beginTransaction();
//your codes
DB::commit();
To learn more about transactions, read the official Laravel documentation.
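To make the commit-or-rollback behavior concrete without a Laravel installation, here is a sketch of the same pattern on plain PDO with an in-memory SQLite database (this assumes the pdo_sqlite extension is available). It is not Laravel's implementation; the helper runInTransaction and the table layout are assumptions made for the example.

```php
<?php
declare(strict_types=1);

// Sketch only: a plain-PDO helper with the commit/rollback behavior the
// documentation describes. This is NOT how Laravel implements it.
function runInTransaction(PDO $pdo, callable $work)
{
    $pdo->beginTransaction();
    try {
        $result = $work($pdo);
        $pdo->commit();
        return $result;
    } catch (Throwable $e) {
        // Any exception undoes everything done inside the closure.
        $pdo->rollBack();
        throw $e;
    }
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT NOT NULL)');
$pdo->exec('CREATE TABLE students (id INTEGER PRIMARY KEY, person_id INTEGER NOT NULL)');

try {
    runInTransaction($pdo, function (PDO $pdo): void {
        $pdo->exec("INSERT INTO persons (name) VALUES ('Ann')");
        // Simulate the Student insert failing after the Person was saved.
        throw new RuntimeException('student could not be saved');
    });
} catch (RuntimeException $e) {
    // The Person insert was rolled back together with the failed step.
}

$personCount = (int) $pdo->query('SELECT COUNT(*) FROM persons')->fetchColumn();
```

With Eloquent, the Person, Address, and Student saves would simply go inside the closure passed to DB::transaction.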
I need to modify all queries that are executed through Zend\Db before they are sent to the database.
Basically, I need to add an additional WHERE condition to all selects, updates, and deletes, and an additional column and value to all inserts.
I was thinking about writing my own TableGateway feature for that. The problem is that I would like to avoid being restricted to TableGateway alone and to have this functionality when using Zend\Db\Adapter and TableGateway at the same time.
You can have a look at some of the events dispatched from the table gateway, if that makes sense in your context: http://framework.zend.com/apidoc/2.4/namespaces/Zend.Db.TableGateway.Feature.EventFeature.html
There is a preSelect event that is triggered, and you can probably listen to it.
I ended up writing a custom DB adapter that handles all the logic. I'll probably release it as open source if I find time to clean up the code.
In my obsolete procedural code (which I would now like to translate into OOP) I have simple database transaction code like this:
mysql_query("BEGIN");
mysql_query("INSERT INTO customers SET cid=$cid,cname='$cname'");
mysql_query("INSERT INTO departments SET did=$did,dname='$dname'");
mysql_query("COMMIT");
If I build OOP classes Customer and Department for mapping customers and departments database tables I can insert table records like:
$customer=new Customer();
$customer->setId($cid);
$customer->setName($cname);
$customer->save();
$department=new Department();
$department->setId($did);
$department->setName($dname);
$department->save();
My Customer and Department classes internally use another DB class for querying the database.
But how do I make $customer->save() and $department->save() parts of one database transaction?
Should I have one outer class that starts and ends the transaction, with the Customer and Department classes instantiated inside it, or should the transaction somehow be started in Customer (like Customer.startTransaction()) and ended in Department (like Department.endTransaction())? Or...
An additional object is the way to go. Something like this:
$customer=new Customer();
$customer->setId($cid);
$customer->setName($cname);
$department=new Department();
$department->setId($did);
$department->setName($dname);
$transaction = new Transaction();
$transaction->add($customer);
$transaction->add($department);
$transaction->commit();
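You can see that there is no longer any call to save() on $customer and $department. The $transaction object takes care of that.
The implementation can be as simple as this (note that the mysql_* functions used here were removed in PHP 7; the same idea works with mysqli or PDO):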
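class Transaction
{
    /** @var array entities queued for saving */
    private $stack = array();

    public function add($entity)
    {
        $this->stack[] = $entity;
    }

    public function commit()
    {
        mysql_query("BEGIN");
        try {
            // Save every queued entity inside the same transaction.
            foreach ($this->stack as $entity) {
                $entity->save();
            }
            mysql_query("COMMIT");
        } catch (Exception $e) {
            // Undo the partial work if any save() throws.
            mysql_query("ROLLBACK");
            throw $e;
        }
    }
}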
How to make $customer->save() and $department->save() parts of a database transaction?
You don't have to do anything besides start the transaction.
In most DBMS interfaces, the transaction is "global" to the database connection. If you start a transaction, then all subsequent work is automatically done within the scope of that transaction. If you commit, you have committed all changes since the last transaction BEGIN. If you rollback, you discard all changes since the last BEGIN (there's also an option to rollback to the last transaction savepoint).
I've only used one database API that allowed multiple independent transactions to be active per database connection simultaneously (that was InterBase / Firebird). But this is so uncommon, that standard database interfaces like ODBC, JDBC, PDO, Perl DBI just assume that you only get one active transaction per db connection, and all changes happen within the scope of the one active transaction.
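The connection-wide scope of a transaction, and the savepoint option mentioned above, can be sketched with PDO and an in-memory SQLite database (assuming the pdo_sqlite extension is available). The table and the savepoint name are made up for the example.

```php
<?php
declare(strict_types=1);

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE customers (cid INTEGER PRIMARY KEY, cname TEXT)');

// Everything executed on this connection from here on belongs to the
// one active transaction.
$pdo->beginTransaction();
$pdo->exec("INSERT INTO customers (cid, cname) VALUES (1, 'kept')");

// A savepoint marks a point inside the transaction to roll back to.
$pdo->exec('SAVEPOINT before_second');
$pdo->exec("INSERT INTO customers (cid, cname) VALUES (2, 'discarded')");
$pdo->exec('ROLLBACK TO SAVEPOINT before_second');

// Commit keeps only the work that was not rolled back.
$pdo->commit();

$names = $pdo->query('SELECT cname FROM customers ORDER BY cid')
    ->fetchAll(PDO::FETCH_COLUMN);
```

Only the first insert survives the commit; the second one was discarded by the rollback to the savepoint.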
Should I have one outer class starting/ending transaction with Customer and Department classes instantiated in it or transaction should be started somehow in Customer (like Customer.startTransaction()) and ended in Department (like Department.endTransaction())? Or...
You should start a transaction, then invoke domain model classes like Customer and Department, then afterwards, either commit or rollback the transaction in the calling code.
The reason for this is that domain model methods can call other domain model methods. You never know how many levels deep these calls go, so it's really difficult for the domain model to know when it's time to commit or rollback.
For some pitfalls of doing this, see How do detect that transaction has already been started?
But they don't have to know that. Customer and Department should just do their work, inserting and deleting and updating as needed. Once they are done, the calling code decides if it wants to commit or rollback the whole set of work.
In a typical PHP application, a transaction is usually the same amount of work as one PHP request. It's possible, though uncommon, to do more than one transaction during a given PHP request, and it's not possible for a transaction to span across multiple PHP requests.
So the simple answer is that your PHP script should start a transaction near the beginning of the script, before invoking any domain model classes, then commit or rollback at the end of the script, or once the domain model classes have finished their work.
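A minimal sketch of that arrangement, using PDO with in-memory SQLite (assuming the pdo_sqlite extension): the hypothetical Customer and Department classes below only perform their own inserts on a shared connection, while the calling code alone decides whether to commit or roll back.

```php
<?php
declare(strict_types=1);

// Hypothetical domain classes: each one only does its own insert on the
// shared connection and knows nothing about transaction boundaries.
final class Customer
{
    private PDO $db;
    private int $id;
    private string $name;

    public function __construct(PDO $db, int $id, string $name)
    {
        $this->db = $db;
        $this->id = $id;
        $this->name = $name;
    }

    public function save(): void
    {
        $stmt = $this->db->prepare('INSERT INTO customers (cid, cname) VALUES (?, ?)');
        $stmt->execute([$this->id, $this->name]);
    }
}

final class Department
{
    private PDO $db;
    private int $id;
    private string $name;

    public function __construct(PDO $db, int $id, string $name)
    {
        $this->db = $db;
        $this->id = $id;
        $this->name = $name;
    }

    public function save(): void
    {
        $stmt = $this->db->prepare('INSERT INTO departments (did, dname) VALUES (?, ?)');
        $stmt->execute([$this->id, $this->name]);
    }
}

$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE customers (cid INTEGER PRIMARY KEY, cname TEXT)');
$db->exec('CREATE TABLE departments (did INTEGER PRIMARY KEY, dname TEXT)');
$db->exec("INSERT INTO departments (did, dname) VALUES (7, 'Existing')");

// The calling code owns the transaction.
$db->beginTransaction();
try {
    (new Customer($db, 1, 'Acme'))->save();
    (new Department($db, 7, 'Sales'))->save(); // duplicate key: throws PDOException
    $db->commit();
} catch (PDOException $e) {
    $db->rollBack(); // the Customer insert is discarded as well
}

$customerCount = (int) $db->query('SELECT COUNT(*) FROM customers')->fetchColumn();
```

Because both objects share one connection, the failed Department insert also undoes the Customer insert, without either class containing any transaction code.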
You are migrating to OOP, and that's great, but soon you will find yourself migrating to an architecture with a well-differentiated data access layer, including a more sophisticated way of separating data from control. I guess you are currently using some kind of data access object, which is a great first pattern, but you can certainly go further. Some of the answers here already lead you in that direction. You shouldn't think of your objects as the basis of your architecture, with some helper objects to query the database. Instead, you should think of a full-featured layer, with all the generic classes needed to take care of communication with the database, that you will reuse in all your projects; on top of it sit the business-level objects, like Customer or Department, that know as little as possible about the database implementation.
For this, you will certainly have an outer class handling transactions, but probably also another one taking care of security, another one building queries behind a single API regardless of the database engine, and even a class that reads objects in order to store them in the database, so that the object itself doesn't even know it is meant to end up in a database.
Achieving this would be long, hard work, but afterwards you would have a custom, widely reusable layer that makes your projects more scalable, more stable, and more trustworthy. That would be great: you would learn a lot and feel quite good about it. You would have some kind of DBAL or ORM.
But that still wouldn't be the best solution, since there are people who have already spent years doing exactly that, and it would be hard to match what they already have.
So what I recommend, for any medium-sized project, is that you take database abstraction as seriously as you can and pick any open-source ORM that is easy to use; in the end you will save time and get a much better system.
For example, Doctrine has a very nice way of handling transactions and concurrency, in two ways: implicit, which automatically takes care of normal operations, or explicit, when you need to take over and control transaction demarcation yourself. Check it out here. There are also more advanced possibilities, such as transaction nesting.
The most famous and reliable ORMs are
Doctrine, and
Propel
I mostly use Doctrine, since it has a module that integrates with Zend Framework 2, which I like, but Propel has some aspects that I like a lot.
You would probably have to refactor some things, and you may not feel like doing it at this point, but I can say from experience that this is one of those things you don't even want to think about, and then years later you start using it and realize how much time you wasted :-) I recommend that you consider this, if not now, then in your very next project.
UPDATE
Some thoughts after Tomas' comment.
It's true that for smaller projects (especially if you are not very familiar with ORMs, or your model is very complex) integrating a vendor ORM can be a big effort.
But what I can say, after years of developing projects of every size, is that for any medium-sized one I would use at least a custom, lighter and more flexible home-made ORM: a set of generic classes and as few business-oriented repositories as possible, where an entity knows its own table and probably other related tables, and where you can encapsulate some SQL or custom query calls, all centered on that entity (for example, the entity's main table, the table of pictures associated with it, and so on). The goal is to give the controller a single interface to the data, so that the model's API is independent of the database engine at every level. Just as importantly, the controller doesn't have to be aware of any DBMS aspects, such as the use of transactions, which exist only to guarantee behavior that is purely model-related and at a scandalously low level, tied closely to the technical needs of the DBMS. I mean, your controller may know that it is storing things in a database, but it certainly doesn't need to know what a transaction is.
This is certainly a philosophical discussion, and there can be many equally valid points of view.
For any custom ORM, I would recommend starting by looking for a DAO/DTO generator that can create the main classes from your database, so that you only need to adapt them where your needs differ from a normal create-read-update-delete. This reminds me that you can also search for PHP CRUD and find some useful and fun tools.
In our Symfony2 project, we would like to ensure that modifications across resources are transactional. For example, something like:
namespace ...;

use .../TransactionManager;

class MyService
{
    protected $tm;

    public function __construct(TransactionManager $tm)
    {
        $this->tm = $tm;
    }

    /**
     * #ManagedTransaction
     */
    public function doSomethingAcrossResources()
    {
        ...
        // where tm is the transaction manager
        // tm is exposing a Doctrine EntityManager adapter here
        $this->tm->em->persist($entity);
        ...
        // tm is exposing a redis adapter here
        $this->tm->redis->set('foo', 'bar');

        if ($somethingWentWrong) {
            throw new Exception('Something went terribly wrong');
        }
    }
}
So there are a couple of things to note here:
Every resource will need an adapter exposing its API (e.g. a Doctrine adapter, Redis adapter, Memcache adapter, File adapter, etc.)
If something goes wrong (an Exception is thrown), nothing should get written to any managed resource (i.e. roll back everything).
If nothing goes wrong, all resources will get updated as expected.
The doSomethingAcrossResources function does not have to worry about undoing changes it made to non-transactional resources such as files and Memcache. This is key, because otherwise this code would likely become a tangled mess of only writing to redis at the appropriate time, etc.
The #ManagedTransaction annotation will take care of the rest (starting, committing, and rolling back the required transactions, based on the adapters, etc.)
In the simplest implementation, the tm can simply manage a queue and dequeue all items serially. If an exception is thrown, it simply won't dequeue anything. So the adapters are the transaction manager's knowledge of how to commit each item in the queue.
If an exception occurs during a dequeue, then the transaction manager will look to its adapters for how to roll back the already dequeued items (probably placed on a rollback stack). This might get tricky for resources like the EntityManager, which would need to manage a transaction internally in order to roll back the changes easily. A redis adapter, however, might cache the previous value during an update, or, for an ADD, simply issue a DELETE during a rollback.
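The queue-plus-rollback-stack idea described above could be sketched roughly as follows. Everything here is hypothetical: the ResourceAdapter interface, the array-backed ArrayStoreAdapter standing in for a redis adapter, and the TransactionManager class itself. Note that the undo steps are best-effort compensation rather than true atomicity, since other clients could observe the intermediate state.

```php
<?php
declare(strict_types=1);

// Hypothetical adapter contract: apply() performs one queued operation and
// returns a callable that undoes it.
interface ResourceAdapter
{
    public function apply(array $operation): callable;
}

// Hypothetical key-value adapter, backed by a plain array for this demo.
final class ArrayStoreAdapter implements ResourceAdapter
{
    public array $data = [];

    public function apply(array $operation): callable
    {
        [$key, $value] = $operation;
        if ($value === null) {
            throw new InvalidArgumentException("no value given for $key");
        }

        // Remember the previous state so the write can be undone later.
        $hadKey = array_key_exists($key, $this->data);
        $previous = $hadKey ? $this->data[$key] : null;
        $this->data[$key] = $value;

        return function () use ($key, $hadKey, $previous): void {
            if ($hadKey) {
                $this->data[$key] = $previous; // restore the old value
            } else {
                unset($this->data[$key]);      // undo an add with a delete
            }
        };
    }
}

final class TransactionManager
{
    private array $queue = [];

    public function enqueue(ResourceAdapter $adapter, array $operation): void
    {
        $this->queue[] = [$adapter, $operation];
    }

    public function commit(): void
    {
        $rollbackStack = [];
        try {
            // Dequeue serially; each applied item pushes its undo step.
            foreach ($this->queue as [$adapter, $operation]) {
                $rollbackStack[] = $adapter->apply($operation);
            }
        } catch (Throwable $e) {
            // Undo the already applied items in reverse order.
            foreach (array_reverse($rollbackStack) as $undo) {
                $undo();
            }
            throw $e;
        } finally {
            $this->queue = [];
        }
    }
}

// Usage: the third operation fails, so the first two are undone.
$store = new ArrayStoreAdapter();
$store->data['foo'] = 'old';

$tm = new TransactionManager();
$tm->enqueue($store, ['foo', 'bar']);
$tm->enqueue($store, ['fresh', 'value']);
$tm->enqueue($store, ['broken', null]);

try {
    $tm->commit();
} catch (InvalidArgumentException $e) {
    // $store->data is back to its state before commit()
}
```

If the service method throws before commit() is ever called, nothing has been dequeued and no resource is touched, which matches the simplest case described above.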
Does a transaction manager like this already exist? Is there a better way of achieving these goals? Are there caveats that I may be overlooking?
Thanks!
It turns out that we ended up not needing to ensure atomicity across our resources. We do want to be atomic with our database interactions when multiple rows / tables are involved, but we decided to use an event driven architecture instead.
If, say, updating redis fails inside of an event listener, we will stop propagation, but it's not the end of the world -- allowing us to inform the user of a successful operation (even if side effects were not successful).
We can run background jobs to occasionally update redis as needed. This enables us to focus the core business logic in a service method, and then dispatch an event upon success, allowing non-critical side effects (updating the cache, sending emails, updating Elasticsearch, etc.) to take place, isolated from one another, outside the main business logic.
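A rough sketch of that arrangement, under stated assumptions: the SideEffectDispatcher class below is a hypothetical, minimal stand-in and not Symfony's EventDispatcher, and the event name and saveOrder function are made up. A failing listener stops propagation and is recorded so that a background job could repair it later, while the core operation still reports success.

```php
<?php
declare(strict_types=1);

// Hypothetical minimal dispatcher (NOT Symfony's EventDispatcher).
final class SideEffectDispatcher
{
    private array $listeners = [];

    /** Messages of failed side effects, e.g. for a background job to retry. */
    public array $failures = [];

    public function on(string $event, callable $listener): void
    {
        $this->listeners[$event][] = $listener;
    }

    public function dispatch(string $event, array $payload): void
    {
        foreach ($this->listeners[$event] ?? [] as $listener) {
            try {
                $listener($payload);
            } catch (Throwable $e) {
                // A failed side effect stops propagation but never reaches
                // the caller, so the core operation is unaffected.
                $this->failures[] = $e->getMessage();
                break;
            }
        }
    }
}

// Core business logic: runs first, then announces its success.
function saveOrder(SideEffectDispatcher $dispatcher, int $id): bool
{
    // ... the database work (in its own transaction) would go here ...
    $dispatcher->dispatch('order.saved', ['id' => $id]);

    return true;
}

$dispatcher = new SideEffectDispatcher();
$emailsSent = [];

$dispatcher->on('order.saved', function (array $payload): void {
    throw new RuntimeException('redis unavailable');
});
$dispatcher->on('order.saved', function (array $payload) use (&$emailsSent): void {
    $emailsSent[] = $payload['id'];
});

$ok = saveOrder($dispatcher, 42);
```

Whether a failure should stop the remaining listeners or let them continue is a design choice; replacing the break with a continue would keep the listeners fully independent of each other.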