I have a class (PersistenceClass), that takes an array of data (posts) and parses that data and puts it into a DB (via doctrine). The field content needs to be parsed by a second class (SyntaxClass) before it is set into the doctrine entity.
Now the problem is, that the SyntaxClass has to set references in the content to other posts (just a link with and ID). So it needs access to the DB, and also needs to search in the persisted but not yet flushed entities from the PersistenceClass.
I can inject a doctrine EM into SyntaxClass and find my references in DB, although I dont like it very much. But the bigger problem is, how I can access the only persisted, but not flushed entities from the PersistenceClass ? I could make an Array of that objects and put it as an parameter to the parser method like:
SyntaxClass->parseSyntax($content, $persistedObjects);
But that does not look very clean. Aside from that, I dont know if it is somehow possible to search in the data of the persisted objects?
Your question is full of sub-question, so, first I'll try to make some things clear.
First, the naming convention you used is a bit abiguos and this not helps, me and also other people that may work on your code in future (maybe you'll grow and need to hire more developers! :P ). So, let's start with some nomenclature.
What you are calling PersistenceClass may be something like this:
class PersistenceClass
{
public function parse(array $posts)
{
foreach ($posts as $post) {
// 1. Parse $post
// 2. Parse content with SyntaxClass
// 3. Persist $post in the database
}
}
}
The same applies also for SyntaxClass: it receives the $content and parses it in some ways, then sets the references and then persists.
This is just to set some boundaries.
Now, go to your questions.
I can inject a doctrine EM into SyntaxClass and find my references in
DB, although I dont like it very much.
This is exactly what you have to do! The OOP development works this way.
But, and here come the problems with naming conventions, the way you inject the entity manager depends on the structure of your classes.
A good design should use services.
So, what currently are PersistenceClass and SyntaxClass in reality should be called PersistenceService and SyntaxService (also if I prefere call them PersistenceManager and SyntaxManager, because in my code I always distinguish between managers and handlers - but this is a convention of mine, so I'll not write more about it here).
Now, another wrong thing that I'm imaging you are doing (only reading your question, I'M IMAGING!): you are instantiating SyntaxService (you currently named SyntaxClass) from inside PersistenceService (your currently named PersistenceClass). This is wrong.
If you need a fresh instance of SyntaxService for each post, then you should use a factory class (say SyntaxFactory), so calling SyntaxFactory::create() you'll get a fresh instance of SyntaxService. Is the factory itself that injects the entity manager in the newly created SyntaxClass.
If you don't need a fresh instance each, time, instead, you'll declare SyntaxClass simply as a service and will pass it to PersistenceService by injection. Below this last simpler example:
# app/config/service.yml
services:
app.service.persistence:
class: ...\PersistenceService
# Pass the SyntaxInstance directly or a factory if you need one
aguments: ["#doctrine.orm.default_entity_manager", "#app.service.syntax"]
app.service.syntax:
class: ...\SyntaxService
aguments: ["#doctrine.orm.default_entity_manager"]
But the bigger problem is, how I can access the only persisted, but
not flushed entities from the PersistenceClass ?
Now the second question: how to search for {persisted + flushed} and {persisted + not flushed} entities?
The problem is that you cannot use the ID as the search parameter as the persisted but not flushed entities doesn't have one before the flushing.
The solution may be to create another service: SearchReferencesService. In it you'll inject the entity manager too (as shown before).
So this class has a method search() that does the search.
To search for the entities persisted but not flushed, the UnitOfWork gives you some interesting methods: getScheduledEntityInsertions(), getScheduledEntityUpdates(), getScheduledEntityDeletions(), getScheduledCollectionDeletions() and getScheduledCollectionUpdates().
The array of which you are speaking about is already there: you need to only cycle it and compare object by object, basing the search on fields other than the ID one (as it doesn't exist yet).
Unfortunately, as you didn't provided more details about the nature of your search, it is not possible for me to be more precise about how to do this search, but only tell you you have to search using the unit of work and connecting to the database if null results are returned by the first search. Also the order in which you'll do this search (before in the database and then in the unit of work or viceversa) is up to you.
Hope this will help.
Related
I've got a script that fetches data from a database using doctrine. Sometimes it needs to fetch the data for the same entity, the second time however it uses the identity map and therefor might go out of sync with the database (another process can modify the entities in the db). One solution that we tried was to set the query hint Query::HINT_REFRESH before we run the DQL query. We however would like to use it also with simple findBy(..) calls but that doesn't seem to work? We would also like to be able to set it globally per process so that all the doctrine SELECT queries that are run in that context would actually fetch the entities from the DB. We tried to set the $em->getConfiguration()->setDefaultQueryHint(Query::HINT_REFRESH, true); but again that doesn't seem to work?
Doctrine explicitly warns you that it is not meant to be used without a cache.
However if want to ignore this, then Cerad's comment (also mentioned in in this answer) sound right. If you want to do it on every query though you might look into hooking into a doctrine event, unfortunately there is no event for preLoad, only postLoad, but if you really don't care about performance you could create a postLoad listener which first gets the class and id of the loaded entity, calls clear on the entity manager and finally reloads it. Sounds very wrong to me though, I wash my hands of it :-)
I'm writing an API where I have a Controller that POSTs a new object, GETs it back and can PUT/PATCH updates to it. The problem is that there's a difference in properties between the two different actions. For example, when I POST a new object I want to ensure that the 'id' of it is returned so that it can be used to identify it for the GET/PUT/PATCH endpoints. It doesn't matter if it comes back via the GET (it's just a duplication of data at that point) but I certainly don't want it passed for the PUT or PATCH as the id is immutable.
So what's the best way to mark this up in swagger so that I can have different versions of the same Definition? I've seen that you can use 'allOf' to add Definitions to other properties, but I'm wondering if there's a way of saying 'not these properties in the definition'?
If I could do the latter I could make one Definition of the object as a whole and simply knock out the things that aren't required to be returned or submitted when referencing it at the Controller. Is this possible? Am I making sense?
(Just to make things more interesting, my swagger.json file is being generated by swagger-php based on Annotations in my controller and entity files)
I am facing the same issue while writing our spec file, and didn't know how to fix it but what I used is the property "readOnly":true, this way the documentation says that this is a readonly property, you can only read it through GET/POST methods but you cannot send it via PATCH/ PUT.. hope this helps
I'm just wondering if someone can help me understand how to make the best use of objects in PHP.
My understanding of a PHP object is that is should represent an entity, providing methods to get and alter the properties of that entity. For example an object entitled Post would hold all the properties of a single post, which could be accessed and modified as appropriate.
What causes me some confusion is that libraries like CodeIgniter don't use objects in this manor. They treat classes more like wrappers for a group of functions. So a 'Posts' class in CodeIgniter would not hold properties of one post, it would provide functions for fetching, editing and deleting posts.
So what happens if I want to get every post out of a database and put it into a Post object? My understanding of it is I would in fact need two classes 'Posts' and 'Post', one that defines the Post object and one that handles fetching the Posts from the database and putting them into Post objects.
Do these two types of class have a name ('Proper' objects / Collections of functions)? And is it common to have two classes working together like this or have I completely misunderstood how to use objects?
Instead of having a Post object would it make more sense to have a method in my Posts class called getSinglePost($id) that just returned an array?
Hopefully that question makes sense, looking forwards to getting some feedback.
For an introduction, see What is a class in PHP?
For the answer, I'll just address your questions in particular. Search for the terms in bold to learn more about their meaning.
My understanding of a PHP object is that is should represent an entity, providing methods to get and alter the properties of that entity.
Entities are just one possible use for objects. But there is also Value Objects, Service Objects, Data Access Objects, etc. - when you go the OO route, everything will be an object with a certain responsibility.
What causes me some confusion is that libraries like CodeIgniter don't use objects in this manor.
Yes, Code Igniter is not really embracing OOP. They are using much more of a class-based-programming approach, which is more like programming procedural with classes and few sprinkles of OOP.
They treat classes more like wrappers for a group of functions. So a 'Posts' class in CodeIgniter would not hold properties of one post, it would provide functions for fetching, editing and deleting posts.
That is fine though. A posts class could be Repository, e.g. an in-memory collection of Post Entities that has the added responsibility to retrieve and persist those in the background. I'd be cautious with Design Patterns and Code Igniter though since they are known to use their own interpretation of patterns (for instance their Active Record is more like a Query Object).
So what happens if I want to get every post out of a database and put it into a Post object?
Lots of options here. A common approach would be to use a Data Mapper, but you could also use PDO and fetch the data rows directly into Post objects, etc.
My understanding of it is I would in fact need two classes 'Posts' and 'Post', one that defines the Post object and one that handles fetching the Posts from the database and putting them into Post objects.
That would be the aforementioned Repository or Data Mapper approach. You usually combine these with a Table Data Gateway. However, an alternative could also be to not have a Posts class and use an Active Record pattern, which represents a row in the database as an object with business and persistence logic attached to it.
Do these two types of class have a name ('Proper' objects / Collections of functions)? And is it common to have two classes working together like this or have I completely misunderstood how to use objects?
Yes, they work together. OOP is all about objects collaborating.
Instead of having a Post object would it make more sense to have a method in my Posts class called getSinglePost($id) that just returned an array?
That would be a Table Data Gateway returning Record Sets. It's fine when you don't have lots of business logic and can spare the Domain Model, like in CRUD applications
Class should ideally has the same interpretation as anywhere else in PHP as well. Class starts with abstraction, refining away what you don't need. So it's entirely up to you to define the class the way you want it.
Codeigniter does have a strange way of initiating and accessing objects. Mainly because they are loaded once and used afterwards, prevents it from having functionality around data. There are ways around it and normal handling of classes still possible. I usually use a auto loader and use normal classes.
"So what happens if I want to get every post out of a database and put it into a Post object? My understanding of it is I would in fact need two classes 'Posts' and 'Post',"
You are essentially referring to a MODEL to access the data ("posts") and an Entity to represent the "post". So you would load the model once and use it to load up as many entities as you would like.
$this->load->model("posts");
$this->posts->get_all(); // <- This can then initiate set of objects of type "Post" and return. Or even standard classes straight out from DB.
Your understanding of an object is correct. A post is a single object of a class Post. But of course you need a function, that retrieves posts from a database or collects them from somewhere else. Therefore you have so called Factory classes. That's what can cause some confusion.
Factories can be singletons, which normally means that you have one instance of this class. But you don't need to instantiate a factory at all (and instead use static functions to access the functionality):
$posts = PostFactory::getPosts();
And then the function:
static function getPosts() {
$list = array();
$sql = "select ID from posts order by datetime desc"; // example, ID is the primary key
// run your sql query and iterate over the retrieved IDs as $id
{
...
$post = new Post($id);
array_push($list, $post);
}
return $list;
}
Inside this factory you have a collection of "access"-functions, which do not fit elsewhere, like object creation (databasewise) and object retrieval. For the second part (retrieval) it is only necessary to put the function into a factory, if there is no "parent" object (in terms of a relation). So you could have an entity of class Blog, you instantiate the blog and then retrieve the posts of the blog via the blog instance and don't need a separate factory.
The naming is only there to help you understand. I wouldn't recommend to call a class Post and it's factory Posts since they can easily be mixed up and the code is harder to read (you need to pay attention to details). I usually have the word "factory" mixed in the class name, so I know that it is actually a factory class and others see it too.
Furthermore you can also have Helper classes, which don't really relate to any specific entity class. So you could have a PostHelper singleton, which could hold functionality, which doesn't fit neither in the object class nor in the factory. Although I can't think of any useful function for a Post object. An example would be some software, which calculates stuff and you have a Helper, which performs the actual calculation using different types of objects.
I am trying to use Dependency Injection as much as possible, but I am having trouble when it comes to things like short-lived dependencies.
For example, let's say I have a blog manager object that would like to generate a list of blogs that it found in the database. The options to do this (as far as I can tell) are:
new Blog();
$this->loader->blog();
the loader object creates various other types of objects like database objects, text filters, etc.
$this->blogEntryFactory->create();
However, #1 is bad because it creates a strong coupling. #2 still seems bad because it means that the object factory has to be previously injected - exposing all the other objects that it can create.
Number 3 seems okay, but if I use #3, do I put the "new" keywords in the blogEntryFactory itself, OR, do I inject the loader into the blogEntryFactory and use the loader?
If I have many different factories like blogEntryFactory (for example I could have userFactory and commentFactory) it would seem like putting the "new" keyword across all these different factories would be creating dependency problems.
I hope this makes sense...
NOTE
I have had some answers about how this is unnecessary for this specific blog example, but there are, in fact, cases where you should use the Abstract Factory Pattern, and that is the point I am getting at. Do you use "new" in that case, or do something else?
I'm no expert, but I'm going to take a crack at this. This assumes that Blog is just a data model object that acts as a container for some data and gets filled by the controller (new Blog is not very meaningful). In this case, Blog is a leaf of the object graph, and using new is okay. If you are going to test methods that need to create a Blog, you have to simultaneously test the creation of the Blog anyway, and using a mock object doesn't make sense .. the Blog does not persist past this method.
As an example, say that PHP did not have an array construct but had a collections object. Would you call $this->collectionsFactory->create() or would you be satisfied to say new Array;?
In answer to the title: yes, abstract factories typically use new. For example, see the MazeFactory code on page 92 of the GoF book. It includes, return new Maze; return new Wall; return new Room; return new Door;
In answer to the note: a design that uses abstract factories to create data models is highly suspect. The purpose is to vary the behavior of the factory's products while making their concrete implementations invisible to clients. Data models with no behavior do not benefit from an abstract factory.
I have a table called Cat, and an PHP class called Cat. Now I want to make a CatDataMapper class, so that Cat extends CatDataMapper.
I want that Data Mapper class to provide basic functionality for doing ORM, and for creating, editing and deleting Cat.
For that purpose, maybe someone who knows this pattern very well could give me some helpful advice? I feel it would be a little bit too simple to just provide some functions like update(), delete(), save().
I realize a Data Mapper has this problem: First you create the instance of Cat, then initialize all the variables like name, furColor, eyeColor, purrSound, meowSound, attendants, etc.. and after everything is set up, you call the save() function which is inherited from CatDataMapper. This was simple ;)
But now, the real problem: You query the database for cats and get back a plain boring result set with lots of cats data.
PDO features some ORM capability to create Cat instances. Lets say I use that, or lets even say I have a mapDataset() function that takes an associative array. However, as soon as I got my Cat object from a data set, I have redundant data. At the same time, twenty users could pick up the same cat data from the database and edit the cat object, i.e. rename the cat, and save() it, while another user still things about setting another furColor. When all of them save their edits, everything is messed up.
Err... ok, to keep this question really short: What's good practice here?
From DataMapper in PoEA
The Data Mapper is a layer of software
that separates the in-memory objects
from the database. Its responsibility
is to transfer data between the two
and also to isolate them from each
other. With Data Mapper the in-memory
objects needn't know even that there's
a database present; they need no SQL
interface code, and certainly no
knowledge of the database schema. (The
database schema is always ignorant of
the objects that use it.) Since it's a
form of Mapper (473), Data Mapper
itself is even unknown to the domain
layer.
Thus, a Cat should not extend CatDataMapper because that would create an is-a relationship and tie the Cat to the Persistence layer. If you want to be able to handle persistence from your Cats in this way, look into ActiveRecord or any of the other Data Source Architectural Patterns.
You usually use a DataMapper when using a Domain Model. A simple DataMapper would just map a database table to an equivalent in-memory class on a field-to-field basis. However, when the need for a DataMapper arises, you usually won't have such simple relationships. Tables will not map 1:1 to your objects. Instead multiple tables could form into one Object Aggregate and viceversa. Consequently, implementing just CRUD methods, can easily become quite a challenge.
Apart from that, it is one of the more complicated patterns (covers 15 pages in PoEA), often used in combination with the Repository pattern among others. Look into the related questions column on the right side of this page for similar questions.
As for your question about multiple users editing the same Cat, that's a common problem called Concurrency. One solution to that would be locking the row, while someone edits it. But like everything, this can lead to other issues.
If you rely on ORM's like Doctrine or Propel, the basic principle is to create a static class that would get the actual data from the database, (for instance Propel would create CatPeer), and the results retrieved by the Peer class would then be "hydrated" into Cat objects.
The hydration process is the process of converting a "plain boring" MySQL result set into nice objects having getters and setters.
So for a retrieve you'd use something like CatPeer::doSelect(). Then for a new object you'd first instantiate it (or retrieve and instance from the DB):
$cat = new Cat();
The insertion would be as simple as doing: $cat->save(); That'd be equivalent to an insert (or an update if the object already exists in the db... The ORM should know how to do the difference between new and existing objects by using, for instance, the presence ort absence of a primary key).
Implementing a Data Mapper is very hard in PHP < 5.3, since you cannot read/write protected/private fields. You have a few choices when loading and saving the objects:
Use some kind of workaround, like serializing the object, modifying it's string representation, and bringing it back with unserialize
Make all the fields public
Keep them private/protected, and write mutators/accessors for each of them
The first method has the possibility of breaking with a new release, and is very crude hack, the second one is considered a (very) bad practice.
The third option is also considered bad practice, since you should not provide getters/setters for all of your fields, only the ones that need it. Your model gets "damaged" from a pure DDD (domain driven design) perspective, since it contains methods that are only needed because of the persistence mechanism.
It also means that now you have to describe another mapping for the fields -> setter methods, next to the fields -> table columns.
PHP 5.3 introduces the ability to access/change all types of fields, by using reflection:
http://hu2.php.net/manual/en/reflectionproperty.setaccessible.php
With this, you can achieve a true data mapper, because the need to provide mutators for all of the fields has ceased.
PDO features some ORM capability to
create Cat instances. Lets say I use
that, or lets even say I have a
mapDataset() function that takes an
associative array. However, as soon as
I got my Cat object from a data set, I
have redundant data. At the same time,
twenty users could pick up the same
cat data from the database and edit
the cat object, i.e. rename the cat,
and save() it, while another user
still things about setting another
furColor. When all of them save their
edits, everything is messed up.
In order to keep track of the state of data typically and IdentityMap and/or a UnitOfWork would be used keep track of all teh different operations on mapped entities... and the end of the request cycle al the operations would then be performed.
keep the answer short:
You have an instance of Cat. (Maybe it extends CatDbMapper, or Cat3rdpartycatstoreMapper)
You call:
$cats = $cat_model->getBlueEyedCats();
//then you get an array of Cat objects, in the $cats array
Don't know what do you use, you might take a look at some php framework to the better understanding.