How to use the Repository Pattern to handle complex reads (SELECTs)? - php

I've seen a lot of $repo->findAll() or $repo->findById($id) examples, but I'm looking for how to expand on this for more complex reads.
For example, let's say I have a datagrid that represents a SELECT query with several JOINs in it. I'm going to need to do these things:
Sorting
Filtering (WHERE conditions, some of which happen on the JOIN clauses of the query)
Columns (I don't want to SELECT *, so I need to specify the fields I want)
Limit (pagination)
Count (I need to know the total number of rows across all pages. Perhaps I do this in a separate repo method/query.)
I'm not sure I'm comfortable using an existing query builder package because I'm not sure how testable and database-agnostic it would be (in other words, it might be too flexible). I do know that I do NOT want to use an ORM for this project. I'm using the Data Mapper + Repository approach instead.
How would I do this using the Repository Pattern?

(Sometimes, I believe the "Answer" to a question involves "lowering expectations".)
I believe you are asking too much for a "Repository Pattern". There are many 3rd party software packages that attempt to isolate the user from MySQL. They generally have limitations. Often a limitation is in scaling -- they are not designed to work with huge datasets in complex ways.
Whenever I use the Repository Pattern, it seems that I am doing little more than encapsulating one (or a few) SQL statements and putting the encapsulated method (subroutine) in a separate file. Oh, I believe in doing it. I just don't believe in magic.
Let me pick apart two of your 'requirements'. They are good for encapsulating, but not necessarily good for the Repository Pattern.
Pagination using OFFSET and LIMIT... For simple datasets, this works fine. But I watched a project melt down after they did this. They required the obvious parameters (offset and limit) and did the obvious thing (construct and execute SELECT ... OFFSET $offs LIMIT $lim). Then they built a web page that had 126,000 "pages" worth of data. Then something did Next, Next, Next, ... until the system melted down.
The problem was depending on offset and limit instead of "Next" and "Prev", and "remembering where you left off". (I have a blog on that topic.) Note that the "solution" cannot be performed in the encapsulated routine, but involves UI changes and user expectation changes, plus code.
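A hedged sketch of the "remember where you left off" approach (keyset pagination; table and column names invented):

```sql
-- OFFSET forces the server to read and discard every earlier row:
--   SELECT id, title FROM items ORDER BY id LIMIT 10 OFFSET 1259990;
-- Seeking from the last id the user saw stays fast at any "page":
SELECT id, title
FROM items
WHERE id > @last_seen_id      -- value remembered from the previous page
ORDER BY id
LIMIT 10;
-- "Next" passes the last id of this page back in; "Prev" works the same
-- way with id < ... ORDER BY id DESC.
```

Note this is exactly why the fix needs UI changes: a keyset can only go to the next or previous page, not jump straight to page 5,000.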
The other one I want to comment on is SQL_CALC_FOUND_ROWS... So simple, so easy. But so deadly. As recently as this week I was advising someone whose data had grown so much that he was having performance problems due to that counting technique. Many of the possible solutions involve more than can be stuck in a Repository Pattern. For example, the typical search engine long ago punted on getting the exact count and, instead, "managed the user expectations" by showing "10 items out of about 1,340,000". That, doubtless, took a lot of code in a lot of places, not just a simple enhancement to one SQL statement. It probably took multiple servers.
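The usual replacements are a separate COUNT(*) query (as the question itself suggests), or a capped count that "manages expectations" (table and column names invented):

```sql
-- A separate count query; SQL_CALC_FOUND_ROWS forces the full result to be
-- materialized, whereas this can often be satisfied from an index alone:
SELECT COUNT(*) FROM items WHERE status = 'published';

-- Or cap the scan and display "1000+" instead of an exact number:
SELECT COUNT(*) FROM (
    SELECT 1 FROM items WHERE status = 'published' LIMIT 1001
) AS capped;
```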
So, encapsulate - Yes. Repository Pattern - only somewhat. And become an expert in raw SQL.
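To make "encapsulate, yes" concrete for the datagrid in the question, here is a minimal sketch (all table, column, and function names invented) in which the repository owns the JOIN-heavy SQL and callers pass only criteria. The columns, filters, sort, and paging from the question all become parameters:

```php
<?php
// Hypothetical sketch: the repository keeps the SQL in one place.
// $where holds placeholder fragments only; values are bound later via PDO.
// $orderBy/$limit/$offset must come from a whitelist/cast, since they are
// interpolated into the SQL text rather than bound.
function buildGridSql(array $columns, array $where, string $orderBy, int $limit, int $offset): string
{
    $sql = 'SELECT ' . implode(', ', $columns)
         . ' FROM orders o JOIN customers c ON c.id = o.customer_id';
    if ($where !== []) {
        $sql .= ' WHERE ' . implode(' AND ', $where);
    }
    return $sql . " ORDER BY $orderBy LIMIT $limit OFFSET $offset";
}

// A repository method would wrap this and bind the values:
//   $stmt = $pdo->prepare(buildGridSql(['o.id', 'c.name'], ['o.status = ?'], 'o.id ASC', 25, 0));
//   $stmt->execute(['active']);
// with a sibling countAll() method issuing the separate COUNT query.
```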

Related

Chaining MySQL commands Vs. Raw queries

I have built a lot of websites in the past using my own CMS/framework, and I have developed a simple way of executing queries. Recently I have started playing with other frameworks such as CodeIgniter. They offer raw query inputs such as…
$this->db->query("SELECT * FROM news WHERE newsId=1;");
But they also offer chaining of MySQL command via PHP methods.
$this->db->select("*")->from("news")->where("newsId=?");
The question is: what is the main difference, and what are the benefits of each option?
I know the latter option prevents MySQL injection, but to be honest you can do exactly the same by using $this->db->escape().
So in the end, from what I can see, the latter option only serves to make you use more letters on your keyboard, which you would think would slow you down.
I think the implementation of Active Record in CodeIgniter is suitable for small and easy queries.
When you need complex queries with lots of joins, it is more clear to just write the query itself.
I don't think that an extra layer of abstraction will ever give you better performance, if you have a certain skill in SQL.
Most recent PHP frameworks use the AR (Active Record) / DAO (data access object) pattern, because it's really faster to work with than raw queries. Nowadays the AR technique is usually built on top of PDO (PHP Data Objects).
Why is Active Record really faster?
It's true that writing queries by hand is a good habit for a developer, but some problems make it tough:
1. When you write a large INSERT or UPDATE query, it's sometimes hard to match every column to its value, but AR makes it easy: you just build an array first and then execute it.
2. It doesn't matter which DB you use.
3. Sometimes a query is really hard to read or write when it has many conditions, but in AR you can chain many calls to build one query.
4. AR saves you time on repetitive statements.
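A hedged CodeIgniter-style sketch of point 1 (table and data invented): the builder takes an array, so you never have to line up a long column/VALUES list by hand.

```php
<?php
$data = [
    'title'   => 'My title',
    'name'    => 'My name',
    'content' => 'My content',
];

$this->db->insert('news', $data);
// builds roughly: INSERT INTO news (title, name, content) VALUES (...)

$this->db->where('newsId', 1);
$this->db->update('news', $data);
// builds roughly: UPDATE news SET title = ..., ... WHERE newsId = 1
```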
I can't speak for CodeIgniter (what I've seen of it seems rather slung-together, frankly), but there are a few reasons such systems may be used:
as part of an abstraction layer which supports different DBMS back-ends, where for instance ->offset(10)->limit(10) would automatically generate the correct variant of OFFSET, LIMIT, and similar clauses for MySQL vs PostgreSQL etc
as part of an "ORM" system, where the result of the query is automatically mapped into Model objects of an appropriate class based on the tables and columns being queried
to abstract away from the exact names of tables and columns for backwards-compatibility, or installation requirements (e.g. the table "news" might actually be called "app1_news" in a particular install to avoid colliding with another application)
to handle parameterised queries, as in your example; although largely unrelated to this kind of abstraction, they provide more than just escaping, as the DBMS (MySQL or whatever is in use) knows which parts of the query are fixed and which are variable, which can be useful for performance as well as security
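A hedged PDO sketch of that last point (connection details invented): the fixed query shape is sent once, and the value travels separately as data.

```php
<?php
// Prepared statement: the DBMS sees the query shape and the variable part
// as distinct things, unlike string escaping, which merely interpolates a
// quoted value into the SQL text.
$pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');
$stmt = $pdo->prepare('SELECT * FROM news WHERE newsId = ?');
$stmt->execute([$newsId]);             // bound as data, never parsed as SQL
$row = $stmt->fetch(PDO::FETCH_ASSOC);
```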

RedBean ORM performance [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I would like to know whether the RedBean ORM can be used in performance-oriented scenarios, like social networking web apps, and whether it stays stable when thousands of records are pulled by multiple users at the same time. I'd also like to know whether RedBean consumes a lot of memory.
Can anyone offer a comparison study of Doctrine-Propel-Redbean?
I feel Tereško's answer is not quite right.
Firstly it does not address the original question. It's indeed a case against ORMs, and I agree with the problems described in his answer. That's why I wrote RedBeanPHP. Just because most ORMs fail to make your life a bit easier does not mean the concept of an object relational mapping system is flawed. Most ORMs try to hide SQL, which is why JOINs get so complex; they need to re-invent something similar in an object oriented environment. This is where RedBeanPHP differs, as it does not hide SQL. It creates readable, valid SQL tables that are easy to query. Instead of a fabricated query language RedBeanPHP uses plain old SQL for record and bean retrieval. In short; RedBeanPHP works with SQL rather than against it. This makes it a lot less complex.
And yes, the performance of RedBeanPHP is good. How can I be so sure? Because unlike other ORMs, RedBeanPHP distinguishes between development mode and production mode. During the development cycle the database is fluid; you can add entries and they will be added dynamically. RedBeanPHP creates the columns and indexes, guesses the data types, etc. It even stretches columns if you need more bytes (a higher data type) after a while. This makes RedBeanPHP extremely slow, but only during development, when speed should not be an issue. Once you are done developing, you freeze the database with a single call, R::freeze(), and no more checks are done. What you are left with is a pretty straightforward database layer on your production server. And because not much is done, performance is good.
Yes, I know, I am the author of RedBeanPHP so I am biased. However I felt like my ORM was being viewed in the same light as the other ORMs, which prompted me to write this. If you want to know more, feel free to consult the RedBeanPHP website, and here is a discussion on performance.
At our company we use RedBeanPHP for embedded systems as well as financial business systems, so it seems to scale rather well.
Together, me and the RedBeanPHP community are sincerely trying to make the ORM world a better place; you can read the mission statement here.
Good luck with your project and I hope you find the technical solution you are looking for.
@tereško, if it's possible, can you give the pros and cons of ORM with respect to pure SQL, according to your experience? I will also google the topic at the same time. – Jaison Justus
Well .. explaining this in 600 characters would be hard.
One thing I must clarify: this is about ORMs in PHP, though I am pretty sure it applies to some Ruby ORMs too, and maybe others.
In brief, you should avoid them, but if you have to use an ORM, then you will be better off with Doctrine 2.x; it's the lesser evil (it implements something similar to Data Mapper instead of Active Record).
Case against ORMs
The main reason why some developers like to use ORMs is also the worst thing about them: it is easy to do simple things in an ORM, with very minor performance costs. This is perfectly fine.
1. Exponential complexity
The problem originates in people using the same tool for everything; it's the "if all you have is a hammer..." type of issue. This results in technical debt.
At first it is easy to write new DB-related code. And maybe, because you have a large project, management decides in the first weeks to hire more people (because doing it later would cause additional issues - read The Mythical Man-Month if you are interested in the details). And you end up preferring people with ORM skills over general SQL.
But as the project progresses, you will begin to use the ORM to solve increasingly complex problems. You will start to hack around some limitations, and eventually you may end up with problems which just cannot be solved even with all the ORM hacks you know ... and now you do not have the SQL experts, because you did not hire them.
Additionally, most popular ORMs implement Active Record, which means that your application's business logic is directly coupled to the ORM. Adding new features will take more and more time because of that coupling. And for the same reason, it is extremely hard to write good unit tests for them.
2. Performance
I already mentioned that even simple uses of an ORM (working with a single table, no JOIN) have some performance costs. This is due to the fact that they use the wildcard * for selecting data. When you need just a list of article IDs and titles, there is no point in fetching the content.
ORMs are really bad at working with multiple tables, when you need data based on multiple conditions. Consider the problem:
Database contains 4 tables: Projects, Presentations, Slides and Bulletpoints.
Projects have many Presentations
Presentations have many Slides
Slides have many Bulletpoints
And you need to find content from all the Bulletpoints in the Slides tagged as "important" from 4 latest Presentations related to the Projects with ids 2, 4 and 8.
This is a simple JOIN to write in pure SQL, but in any ORM implementation that I have seen, this will result in a 3-level nested loop, with queries at every level.
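For illustration, the pure-SQL version of that query might look roughly like this (all table and column names invented, including a created_at column to define "latest"):

```sql
SELECT bp.content
FROM (
    -- the 4 latest presentations for projects 2, 4 and 8
    SELECT id
    FROM presentations
    WHERE project_id IN (2, 4, 8)
    ORDER BY created_at DESC
    LIMIT 4
) AS latest
JOIN slides       s  ON s.presentation_id = latest.id
JOIN bulletpoints bp ON bp.slide_id = s.id
WHERE s.tag = 'important';
```

One statement, one round trip, instead of nested loops issuing a query at every level.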
P.S. There are other reasons and side effects, but they are relatively minor ... I cannot remember any other important issues right now.
I differ from #tereško here - ORMs can make database queries easier to write and easier to maintain. There is some great work going into Propel and Doctrine, in my opinion - take advantage of them! There are a number of performance comparisons on the web, and check out NotORM as well (I've not used it but they do some comparisons to Doctrine, if I recall correctly).
If you get to a point where your throughput requires you to do raw SQL then optimise at that point. But in terms of reducing your bug count and increasing your productivity, I think that your savings will fund a better server anyway. Of course, your mileage may vary.
I don't know RedBean, incidentally, but I am mildly of the view that Propel is faster than Doctrine in most cases, since the classes are pre-generated. I used Propel when it was the only option and have stuck with it, though I certainly wouldn't be averse to using Doctrine.
2018 update
Propel 2 is still in alpha after a number of years, and is in need of a number of large refactoring projects, which sadly were not getting done. Although the maintainers say that this alpha is good to use in production, since it has good test coverage, they have now started on Propel 3. Unfortunately, this has not actually had any releases yet, at the time of my writing this, despite the repository being a year old.
While I think Propel was a great project, I wonder if it is best to use something else for the time being. It could yet rise from the ashes!
I would go with a "Horses for Courses" approach that mixes and matches both worlds. I have built a few large-scale applications using RedBean, so my comments will focus purely on RedBean and not on other ORMs.
IS RedBean ORM SLOW?
Well, it depends on how you use it. In certain scenarios it's faster than a traditional query, because RedBean caches the result for a few seconds. Reusing the query will produce the result faster. Have a look at the log using R::debug(true); it always shows
"SELECT * FROM `table` -- keep-cache"
Scenario 1: Fetching All (*)
In RedBean if you query
$result = R::findOne('table', ' id = ?', array($id));
This is represented as
$result = mysql_query("SELECT * FROM table WHERE id = " . $id);
You may argue: if the table has multiple columns, why should you query (*)?
Scenario 2: Single column
Fetching a single column
R::getCol( 'SELECT first_name FROM accounts' );
Like I mentioned, "Horses for Courses": developers should not simply rely on findOne, findAll, findFirst, and findLast, but should also carefully draft what they really need.
Scenario 3: Caching
When you don't need caching, you can disable it at the application level, which isn't an ideal situation:
R::$writer->setUseCache(false);
RedBean suggests that if you don't want to disable caching at the application level you should use traditional query with no-cache parameter like $result = R::exec("SELECT SQL_NO_CACHE * FROM TABLE");
This perfectly solves the problem of fetching real-time data from table by completely discarding query cache.
Scenario 4: Rapid Development
Using ORM makes your application development really fast, developers can code using ORM 2-3x faster than writing SQL.
Scenario 5: Complex Queries & Relationships
RedBean presents a really nice way of implementing complex queries and one-to-many or many-to-many relationships
Plain SQL for complex queries
$books = R::getAll('SELECT
    book.title AS title,
    author.name AS author,
    GROUP_CONCAT(category.name) AS categories
FROM book
JOIN author ON author.id = book.author_id
LEFT JOIN book_category ON book_category.book_id = book.id
LEFT JOIN category ON book_category.category_id = category.id
GROUP BY book.id');
foreach ($books as $book) {
    echo $book['title'];
    echo $book['author'];
    echo $book['categories'];
}
Or the RedBean way of handling many-to-many relationships:
list($vase, $lamp) = R::dispense('product', 2);
$tag = R::dispense( 'tag' );
$tag->name = 'Art Deco';
//creates product_tag table!
$vase->sharedTagList[] = $tag;
$lamp->sharedTagList[] = $tag;
R::storeAll( [$vase, $lamp] );
Performance Issues
Arguments like "ORMs are typically slow, consume more memory, and tend to make an application slow" are, I think, not talking about RedBean.
We have tested it with both MySQL and Postgres; trust me, performance was never the bottleneck.
There is no denying that an ORM adds a little overhead and tends to make your application slower (just a little). Using an ORM is primarily a trade-off between developer time and slightly slower runtime performance. My strategy is to first build the application end-to-end with the ORM, then, based on test cases, tweak the speed-critical modules to use straight data access.

Multi-tiered / Hierarchical SQL : How does Reddit do it? Which is the most efficient way? And what databases make it simpler?

I've been reading up a bit on how multi-tiered commenting systems are built:
http://articles.sitepoint.com/article/hierarchical-data-database/2
I understand the two methods talked about in that article. In fact I went down the recursive path myself, and I can see how the "Modified Preorder Tree Traversal" method is very useful as well, but I have a few questions:
How well do these two methods perform in a large environment like Reddit's, where you can have thousands and thousands of multi-tiered comments?
Which method does Reddit use? It simply seems very costly, to me, to have to update thousands of rows if they use the MPTT method. I'm not deluding myself into thinking I am building a system to handle Reddit's traffic; this is simply curiosity.
There's another way of retrieving comments like this ... JOINs via SQL that return the rows with IDs defining their parents. How much slower/faster/better/worse would it be to simply take these unformatted results, loop through them and add them into a formatted array using my language of choice (PHP)?
After reading that sitepoint article, I believe I understand that Oracle offers this functionality in a much simpler, easier to use way, and MySQL does not. Are there any free databases that offer something similar to Oracle?
On a side note, how is SQL pronounced? I'm getting the feeling I've been wrong for the past several years by saying 'sequel' instead of 's - q - l', although "My Sequel" rolls easier off the tongue than "My S Q L"!
MPTT is easier to fetch (a single SQL query), but more expensive to update. Simply delegate the update to a background process (that's what queue managers are for). Also note that most of that update is a single SQL UPDATE command. It might take long to process, but a smart RDBMS could make the transaction visible (in cache) to new (read-only) queries before it's committed to disk.
I'd bet Reddit uses MPTT, not only doing the "hard" update in the background but quite likely also doing a simple render into an in-memory cache. This way, the posting user can see his post immediately, without having to wait for so many rows to update. Also, SSDs do help in getting high transaction rates.
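For concreteness, a hedged sketch of what the MPTT trade-off looks like in SQL (table and column names invented):

```sql
-- Nested set / MPTT: one query fetches an entire comment subtree, in display order.
SELECT child.id, child.body
FROM comments AS parent
JOIN comments AS child
  ON child.lft BETWEEN parent.lft AND parent.rgt
WHERE parent.id = 42          -- root comment of the thread
ORDER BY child.lft;           -- preorder traversal = display order

-- The expensive part is insertion: every node to the "right" must shift.
-- To insert a new last child under a parent whose rgt is @R:
UPDATE comments SET rgt = rgt + 2 WHERE rgt >= @R;
UPDATE comments SET lft = lft + 2 WHERE lft >  @R;
-- then INSERT the new node with lft = @R, rgt = @R + 1
```

Those two UPDATEs are the "thousands of rows" the question worries about, and the single large command this answer suggests pushing to a background process.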
That's called the Adjacency Model (or sometimes adjacency list). It's the more obvious way to do it, and simpler to update (it doesn't modify existing records), but FAR more inefficient to read: you have to do a recursive walk of the tree, with an SQL query at each node. That's what kills you: the number of small queries.
PostgreSQL has recursive SELECTs, which do in the server what you envision in PHP. It's better than PHP because it's closer to the data; but it still has the same (huge) number of random-access disk seeks.
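A hedged sketch of such a recursive SELECT over a plain parent_id (adjacency) table; MySQL 8.0+ accepts the same WITH RECURSIVE syntax (names invented):

```sql
-- The server walks the tree itself, instead of PHP issuing one query per node.
WITH RECURSIVE thread AS (
    SELECT id, parent_id, body, 0 AS depth
    FROM comments
    WHERE parent_id IS NULL          -- top-level comments

    UNION ALL

    SELECT c.id, c.parent_id, c.body, t.depth + 1
    FROM comments c
    JOIN thread t ON c.parent_id = t.id
)
SELECT * FROM thread;
```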
You should have a closer look at the links under "Further reading" at the end of that article. The "Four ways to work with hierarchical data" article on evolt linked there provides another way to approach this problem (the flat table). Since that approach is extremely easy to implement for a threaded discussion board, I wouldn't be surprised if Reddit uses it (or a variation on the theme).
I do like MPTT (aka nested set) though, and have used it for hierarchies that are (almost) static.

Most concise and complete Recursive Association Support in CakePHP?

I'm currently mucking about with CakePHP, deciding if I'll use it in an upcoming web application.
The problem is, I've got several tables which at some point share relevant data with each other. If I were to write all the code myself I would use an SQL query using rather a lot of different joins and subqueries. But from what I understand CakePHP only supports joins between two tables.
So for example, I have Users, Profile, Rank, and Rating tables, and I want to get the profile, rank, and ratings of one particular user. CakePHP will do the trick by using multiple, separate SELECT statements. But this would be possible using one query with multiple joins. Performance is expected to be quite important, so not being too wasteful with SQL queries is a major prerequisite.
I've found two hacks (one behaviour and one using bindModel) and a similar StackOverflow thread.
I'm undecided whether to use the behaviour or the bindModel hack. Could anybody shed any light as to what is the best approach - viz. what integrates best in the overall CakePHP structure (are features like pagination still available)? Or is there another approach which is ultimately better? The SO thread mentions a method using containables.
Hope I'm not wrong in opening a separate question for this, but the older thread lists some solutions, but the answer isn't that clear to me for the aforementioned reasons.
The easiest way to do this is to not bother with reducing the SQL queries and to implement some form of caching instead.
The next solution (skipping over the Containable behavior, since it doesn't actually reduce your queries) is to do some ad-hoc joins in the find calls directly. Pushing these into the model so that you can call them from a central place is recommended. A good article on this technique is in the Bakery here: http://bakery.cakephp.org/articles/view/quick-tip-doing-ad-hoc-joins-in-model-find
The best solution I have found to date is Rafael Bandeira's Linkable behavior: http://blog.rafaelbandeira3.com/2008/11/16/linkable-behavior-taking-it-easy-in-your-db/ which lets you use a custom key in the options array that defines, in a clear fashion, the fields and relationships to join on, and uses the ad-hoc join technique described above instead of sequential queries.
Good luck with your project.
The link to this bakery article in the other StackOverflow article you mentioned, is probably the better method for doing ad-hoc joins (without bindModel or a custom behavior). You can already specify joins inline (including extra tidbits such as the type of join) in the options for any find() method calls, but those can be greatly simplified by creating a new find "type" that requires less writing in the find() options. That's what the article discusses.
I also used to use raw SQL for some queries, but found that it can lead to unforeseen incompatibilities with databases that are supported by CakePHP. However, this may not be much of an issue if you are not writing a web application to be used by the masses.
I have had similar problems and because performance was a huge factor I decided to simply use raw SQL rather than try and fiddle with solutions purely to maintain "cake-ness". Plus sometimes it's just nice to know where the bottleneck is (even though a debug mode of 2 does help somewhat). Migrating the db won't ever be an issue.
I decided to go for performance over the convenience of auto-pagination, sorting etc. Really you can code these yourself - you did so in the past I'm sure.
The bindModel solution however does interest me. This is what I would go for next time I come across this problem.

What's your experience with Doctrine ORM? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
What's your experience with Doctrine?
I've never been much of an ORM kind of guy; I mostly managed with just a basic db abstraction layer like ADOdb.
But I understood all the concepts and benefits of it. So when a project came along that needed an ORM, I thought I'd give one of the ORM frameworks a try.
I had to decide between Doctrine and Propel, and I chose Doctrine because I didn't want to handle the Phing requirement.
I don't know what I did wrong. I came in with the right mindset, and I am by no means a 'junior' PHP kiddie. But I've been fighting the system every step of the way. There's a lot of documentation, but it all feels a little disorganized. And simple stuff like YAML-to-db table creation just wouldn't work; it would bork out without even an error or anything. A lot of other stuff works a little funky, requiring just that extra bit of tweaking before working.
Maybe I made some sort of stupid newbie assumption here, and once I find out what it is I'll have the aha moment. But right now I'm totally hating the system.
Are there any tips anyone can give, or can anyone point me to a good resource on the subject, or some authoritative site/person about this? Or maybe just recommend another ORM framework that "just works"?
I have mixed feelings. I am a master at SQL only because it is so easy to verify. You can test SELECT statements quickly until you get the results right. And to refactor is a snap.
In Doctrine, or any ORM, there are so many layers of abstraction that it almost seems OCD (obsessive/compulsive). In my latest project, in which I tried out Doctrine, I hit several walls. It took me days to figure out a solution for something that I knew I could have written in SQL in a matter of minutes. That is soooo frustrating.
I'm being grumpy. The community for SQL is HUGE. The community/support for Doctrine is minuscule. Sure you could look at the source and try to figure it out ... and those are the issues that take days to figure out.
Bottom line: don't try out Doctrine, or any ORM, without planning in a lot of time for grokking on your own.
I think mtbikemike sums it up perfectly: "It took me days to figure out a solution for something that I knew I could have written in SQL in a matter of minutes." That was also my experience. SAD (Slow Application Development) is guaranteed. Not to mention ugly code and limitations around every corner. Things that should take minutes take days and things that would normally be more complicated and take hours or days are just not doable (or not worth the time). The resulting code is much more verbose and cryptic (because we really need another query language, DQL, to make things even less readable). Strange bugs are all around and most of the time is spent hunting them down and running into limitations and problems. Doctrine (I only used v2.x) is akin to an exercise in futility and has absolutely no benefits. It's by far the most hated component of my current system and really the only one with huge problems. Coming into a new system, I'm always going back and forth from the db to the entity classes trying to figure out which name is proper in different places in the code. A total nightmare.
I don't see a single pro to Doctrine, only cons. I don't know why it exists, and every day I wish it didn't (at least in my projects).
We have been using Propel with Symfony for 2 years and Doctrine with Symfony for more than 1 year. I can say that moving to an ORM with an MVC framework was the best step we've made. I would recommend sticking with Doctrine even though it takes some time to learn how to work with it. In the end you'll find your code more readable and flexible.
If you're searching for some place where to start, I would recommend Symfony Jobeet tutorial http://www.symfony-project.org/jobeet/1_4/Doctrine/en/ (chapters 3, 6 covers the basics) and of course Doctrine documentation.
As I wrote above, we have been using Doctrine for some time now. To make our work more comfortable we developed a tool called ORM Designer (www.orm-designer.com), where you can define the DB model in a graphical user interface (no more YAML files :-), which by the way aren't bad at all). You can also find some helpful tutorials there.
My experiences sound similar to yours. I've only just started using Doctrine, and have never used Propel. However, I am very disappointed in Doctrine. Its documentation is terrible: poorly organised and quite incomplete.
Propel and Doctrine use PDO. PDO has a lot of open bugs with the Oracle database, all of them related to CLOB fields. Please keep this in mind before starting a new project if you are working with Oracle. The bugs have been open for years. Doctrine and PDO will crash working with Oracle and CLOBs.
I'm using Doctrine in a medium-sized project where I had to work from pre-existing databases I don't own. It gives you a lot of built-in features, but I have one major complaint.
Since I had to generate my models from the databases and not vice versa, my models are too close to the database: the fields have very similar names to the database columns, and to get objects you have to query in what is essentially SQL (where do I put that code, and how do I test it?), etc.
In the end I had to write a complex wrapper for doctrine that makes me question if it wouldn't have been easier to just use the old dao/model approach and leave doctrine out of the picture. The jury is still out on that. Good luck!
Using Doctrine 2.5 in 2015. It was seemingly going well, until I wanted to use two entities (in a JOIN). [It's better now, after I got the hang of DQL.]
Good:
generating SQL for me
use of Foreign Keys and Referential Integrity
InnoDB generation by default
updates made to SQL with doctrine command line tool
Okay:
being hyper-aware of naming and mapping: how to name entities and how to map them to actual tables
The Bad
takes a lot of time - learning the custom API of the query builder, or figuring out how to do a simple JOIN while wondering if better techniques are out there... Simple JOINs seem to require writing custom functions if you want to do object-oriented queries.
[update on first impression above] -- I chose to use DQL as it is most similar to SQL
It seems to me that the tool is great in concept, but its proper execution demands much of a developer's time to get on board. I am tempted to use it for entity SQL generation but then use PDO for actual input/output, only because I haven't yet learned how to do foreign keys and referential integrity in plain SQL. But learning those seems to be a much easier task than learning Doctrine's ins and outs, even for simple stuff like the entity equivalent of a JOIN.
Doctrine in Existing Projects
I am (just starting to) use Doctrine to develop new features on an existing project. So instead of adding a new MySQL table for the feature, for example, I have added entities (which created the tables for me using Doctrine schema generation). I am holding off on using Doctrine with existing tables until I get to know it better.
If I were to use it on existing tables, I would first ... clean the tables up, which includes:
adding an id column as a primary/surrogate key
using an InnoDB / transaction-capable table engine
map it appropriately to an entity
run Doctrine validate tool (doctrine orm:validate-schema)
This is because Doctrine makes certain assumptions about your tables. And because you are essentially going to drive your tables via code. So your code and your tables have to be in as much as 1:1 agreement as possible. As such, Doctrine is not suitable for just any "free-form" tables in general.
But then, you might be able to, with some care and in some cases, get away with little things like an extra column not being accounted for in your entities (I do not think that Doctrine checks unless you ask it to). You will have to construct your queries knowing what you are getting away with, i.e. when you request an "entity" as a whole, Doctrine requests all fields of the entity specifically by column name. If your actual schema contains more column names, I don't think Doctrine will mind (it does not, as I have verified by creating an extra column in my schema).
So yes, it is possible to use Doctrine, but I'd start small and with care. To start with, you will most likely have to convert your tables to support transactions and to have a surrogate primary key. For things like foreign keys and referential integrity, you'll have to work with Doctrine on polishing your entities and matching them up perfectly. You may have to let Doctrine rebuild your schema with its own index names so that it can handle foreign keys and referential integrity properly. You are essentially giving up some control of your tables to Doctrine, so it has to know the schema in its own way (e.g. being able to use its own index names).
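As a rough illustration of that 1:1 agreement, a cleaned-up legacy table might map to an entity like the sketch below. It uses Doctrine's attribute mapping; the Customer name, table, and columns are hypothetical, not from the original post:

```php
<?php
// Hypothetical entity for an existing `customers` table after the cleanup
// steps above: surrogate `id` primary key, columns mapped 1:1.

use Doctrine\ORM\Mapping as ORM;

#[ORM\Entity]
#[ORM\Table(name: 'customers')]
class Customer
{
    // The surrogate primary key Doctrine expects.
    #[ORM\Id]
    #[ORM\GeneratedValue]
    #[ORM\Column(type: 'integer')]
    private ?int $id = null;

    #[ORM\Column(type: 'string', length: 255)]
    private string $email = '';

    public function getId(): ?int
    {
        return $this->id;
    }

    public function getEmail(): string
    {
        return $this->email;
    }

    public function setEmail(string $email): self
    {
        $this->email = $email;
        return $this;
    }
}
```

With a mapping like this in place, doctrine orm:validate-schema can then confirm that the entity and the actual table agree.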
A little late to the party, but let me throw in my two cents here. I will make connections with Laravel, because that is the framework I use.
Active Record vs. Data Mapping vs. Proper OOP
Laravel and many other frameworks love Active Record. It might be great for simple applications, and it saves you time on trivial DB management. However, from the OOP perspective it is a pure anti-pattern: SoC (Separation of Concerns) just got killed. It creates coupling between the model attributes and SQL column names, which is terrible for extensions and future updates.
As your project grows (and yes, it will!), Active Record will become more and more of a pain. Don't even think of updating the SQL structure easily; remember, you have the column names all over your PHP code.
I was hired for a project that aims to be quite big down the road. I saw the limits of ActiveRecord. I sat back for 3 weeks and rewrote everything using a Data Mapper, which separates DB from the layers above.
Now, back to the Data Mapper and why I didn't choose Doctrine.
The main idea of Data Mapper is, that it separates your database from your code. And that is the correct approach from the OOP perspective. SoC rules! I reviewed Doctrine in detail, and I immediately didn't like several aspects.
The mapping. Why in the world would anyone use comments as commands? I consider this an extremely bad practice. Why not just use a PHP class to store the mapping relations?
YAML or XML for the map. Again, why? Why waste time parsing text files when a regular PHP class can be used? Plus, a class can be extended and inherited, and can contain methods, not just data.
If we have a mapper and a model carrying data, then it should be the mapper storing the model. Methods such as $product->save() are just not good. The model handles data; it should not care about storing anything to the DB. That is very tight coupling. If we spend time building a mapper, then why not have $mapper->save($product)? By definition, it is the mapper that should know how to save the data.
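A minimal sketch of that separation over PDO (the Product and ProductMapper names, table, and columns are made up for illustration): the model holds data, and only the mapper knows SQL.

```php
<?php
// Hypothetical sketch: the mapper, not the model, owns persistence.

class Product
{
    public function __construct(
        public ?int $id,
        public string $name,
        public float $price
    ) {}
}

class ProductMapper
{
    public function __construct(private PDO $pdo) {}

    // $mapper->save($product): the mapper decides between INSERT and UPDATE.
    public function save(Product $product): void
    {
        if ($product->id === null) {
            $stmt = $this->pdo->prepare(
                'INSERT INTO products (name, price) VALUES (:name, :price)'
            );
            $stmt->execute(['name' => $product->name, 'price' => $product->price]);
            $product->id = (int) $this->pdo->lastInsertId();
        } else {
            $stmt = $this->pdo->prepare(
                'UPDATE products SET name = :name, price = :price WHERE id = :id'
            );
            $stmt->execute([
                'name'  => $product->name,
                'price' => $product->price,
                'id'    => $product->id,
            ]);
        }
    }

    public function getById(int $id): ?Product
    {
        $stmt = $this->pdo->prepare(
            'SELECT id, name, price FROM products WHERE id = :id'
        );
        $stmt->execute(['id' => $id]);
        $row = $stmt->fetch(PDO::FETCH_ASSOC);

        return $row
            ? new Product((int) $row['id'], $row['name'], (float) $row['price'])
            : null;
    }
}
```

Swapping databases or renaming a column now touches only the mapper; Product itself never changes.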
Tools such as Doctrine or Eloquent save time at the beginning, no doubt about it. But here is the tricky question each of us must answer individually: what is the right compromise between development time, future updates, price, simplicity, and following OOP principles? In the end, it is up to you to decide.
My own DataMapper instead of Doctrine
I ended up developing my own DataMapper, and I have already used it for several of my small projects. It works very nicely and is easy to extend and reuse. Most of the time we just set up parameters, and no new code is required.
Here are the key principles:
The Model carries data, similar to Laravel's model. Example variable for the following: $model.
The ModelMap contains a field that maps the attributes of the Model to the columns of the table in the SQL database. The ModelMap knows the table name, the id column, etc. It knows which attributes should be transformed to JSON and which should be hidden (e.g. deleted_at). The ModelMap also contains aliases for columns that share a name across joined tables. Example variable: $modelMap.
The ModelDataMapper is a class that accepts a Model and a ModelMap in the controller and provides the store/getById/deleteById functionality. You simply call $modelMapper->store($model) and that's all.
The base DataMapper also handles pagination, searching, and converting arrays to JSON; it adds timestamps, checks for soft deletes, etc. For simple usage the base DataMapper is enough; for anything more complex it is easy to extend through inheritance.
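To make the ModelMap idea concrete, here is a hypothetical sketch (the ProductMap name and its attribute/column pairs are invented, not the author's actual code): a plain PHP class holding the mapping, which a DataMapper could use to build its SELECT list.

```php
<?php
// Hypothetical ModelMap: a plain PHP class instead of YAML/XML or comments.

class ProductMap
{
    public string $table = 'products';
    public string $idColumn = 'id';

    /** Model attribute => SQL column. */
    public array $columns = [
        'id'        => 'id',
        'name'      => 'name',
        'createdAt' => 'created_at',
    ];

    /** Attributes stored as JSON text in the database. */
    public array $jsonColumns = ['tags'];

    /** Attributes hidden from output (e.g. soft-delete bookkeeping). */
    public array $hidden = ['deleted_at'];

    // Build the SELECT list, aliasing columns back to attribute names.
    public function selectList(): string
    {
        $parts = [];
        foreach ($this->columns as $attr => $col) {
            $parts[] = ($col === $attr) ? $col : "$col AS $attr";
        }
        return implode(', ', $parts);
    }
}
```

Because the map is a class, it can be extended and can carry behavior, which is exactly the argument made above against YAML/XML maps.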
Using Doctrine ORM in 2016, with roughly 2 to 2.5 years of experience.
Inherent Inconsistency
SELECT i, p
FROM \Entity\Item i
JOIN i.product p
WHERE ...
Assume the entities are Item and Product. They are connected via Item.product_id to Product.id, and Product contains Product.model, which we want to display along with Item.
Here is the retrieval of the same product.model from the database, using the above DQL but varying the SELECT clause:
//SELECT i, p
$ret[0]->getProduct()->getModel();
//SELECT i as item, p as product
$ret[0]['item']->getProduct()->getModel();
//SELECT i as item, p.model as model
$ret[0]['model'];
The point I am making is this:
Output ResultSet structure can change drastically depending on how you write your DQL/ORM SELECT statement.
It can go from an array of objects, to an array of associative arrays of objects, to an array of associative arrays, depending on how you SELECT. Imagine having to change your SQL, and then having to go back and redo all the code that reads the result set. Ouch! Even if it's only a few lines of code, you depend on the structure of the result set; there is no full decoupling or common standard.
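One way to contain the damage (a general sketch, not a Doctrine feature): route every read of the result set through a single mapping function, so only one spot in the code knows the current shape. The row layout below imitates the "SELECT i as item, p.model as model" case above, with invented data.

```php
<?php
// Rows shaped the way Doctrine might return them for
// "SELECT i as item, p.model as model" (illustrative data, not a real query).
$rows = [
    ['item' => ['id' => 1], 'model' => 'X100'],
    ['item' => ['id' => 2], 'model' => 'X200'],
];

// The ONLY place that knows the result-set shape. If the DQL changes,
// only this closure needs updating; callers keep receiving plain strings.
$toModelName = fn (array $row): string => $row['model'];

$models = array_map($toModelName, $rows);
```

This does not remove the inconsistency, but it shrinks the blast radius of a SELECT rewrite to one function.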
What Doctrine is good at
In some ways it removes the need to deal with SQL and with crafting and maintaining your own tables. It's not perfect: sometimes it fails, and you have to go to the MySQL command line and type SQL to adjust things to the point where both you and Doctrine are happy with the column types. You don't have to define your own foreign keys or indexes; it is done for you auto-magically.
What Doctrine is bad at
Whenever you need to translate any significantly advanced SQL to DQL/ORM, you may struggle. Separately from that, you may also hit inconsistencies like the one above.
Final thoughts
I love Doctrine for creating/modifying tables for me and for converting table data to Objects, and persisting them back, and for using prepared statements and other checks and balances, making my data safer.
I love the feeling of persistent storage being taken care of by Doctrine from within the object oriented interface of PHP. I get that tingly feeling that I can think of my data as being part of my code, and ORM takes care of the dirty stuff of interacting with the database. Database feels more like a local variable and I have gained an appreciation that if you take care of your data, it will love you back.
I hate Doctrine for its inconsistencies and tough learning curve, and having to look up proprietary syntax for DQL when I know how to write stuff in SQL. SQL knowledge is readily available, DQL does not have that many experts out in the wild, nor an accumulated body of knowledge (compared to SQL) to help you when you get stuck.
I'm not an expert with Doctrine - just started using it myself and I have to admit it is a bit of a mixed experience. It does a lot for you, but it's not always immediately obvious how to tell it to do this or that.
For example, when trying to use YAML files with automatic relationship discovery, a many-to-many relationship did not translate correctly into the PHP model definition. No errors, as you mention, because it just did not treat it as many-to-many at all.
I would say that you probably need time to get your head around the various ways of doing things and how the elements interact. Having the time to do things one step at a time helps, dealing with issues one at a time in relative isolation. Trying to do too much at once can be overwhelming and might make it harder to find where something is going wrong.
After some research into the various ORM libraries for PHP, I decided on PHP ActiveRecord (see phpactiverecord). My decision came down to the little-to-no configuration, light-weight nature of the library, and the lack of code generation. Doctrine is simply too powerful for what I need; what PHP ActiveRecord doesn't do I can implement in my wrapper layer. I would suggest taking a moment and examining what your requirements are in an ORM and see if either a simple one like PHP ActiveRecord offers what you need or if a home-rolled active record implementation would be better.
For now I'm using the Symfony framework with Doctrine ORM. How about using Doctrine together with plain queries?
For example, following KnpUniversity, I can create a custom repository method like:
public function countNumberPrintedForCategory(Category $category)
{
    $conn = $this->getEntityManager()
        ->getConnection();

    $sql = '
        SELECT SUM(fc.numberPrinted) as fortunesPrinted,
               AVG(fc.numberPrinted) as fortunesAverage,
               cat.name
        FROM fortune_cookie fc
        INNER JOIN category cat ON cat.id = fc.category_id
        WHERE fc.category_id = :category
    ';

    $stmt = $conn->prepare($sql);
    $stmt->execute(array('category' => $category->getId()));

    return $stmt->fetch();
    ... lines 30 - 37
}
I just use Doctrine entities for things like creating and processing forms. When I need a more complex query, I just write a plain statement and take the values I need. As in this example, I can also pass an entity as a variable and take its values when building the query. I think this solution is easy to understand; it takes less time to build forms and pass data to them, and writing complex queries this way is not as hard as writing them with Doctrine.
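For comparison, here is a Doctrine-free sketch of the same kind of aggregate query over plain PDO (SQLite in memory; the table and data are invented so the snippet is self-contained):

```php
<?php
// Stand-alone illustration of the "plain statement" approach.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

$pdo->exec('CREATE TABLE fortune_cookie (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    category_id INTEGER,
    numberPrinted INTEGER
)');
$pdo->exec('INSERT INTO fortune_cookie (category_id, numberPrinted)
            VALUES (1, 10), (1, 30), (2, 5)');

// Parameterized aggregate query, just like the repository method above.
$stmt = $pdo->prepare(
    'SELECT SUM(numberPrinted) AS fortunesPrinted,
            AVG(numberPrinted) AS fortunesAverage
     FROM fortune_cookie
     WHERE category_id = :category'
);
$stmt->execute(['category' => 1]);
$row = $stmt->fetch(PDO::FETCH_ASSOC);
// For category 1: fortunesPrinted sums to 40, fortunesAverage is 20
```

The prepared statement with a bound :category parameter gives the same injection safety the Doctrine connection provides, without DQL.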
