I have a model that contains several types of products that are all stored in different MySQL databases, but all have one "parent" product that is stored in another table. The parent table is called "products" and contains amongst others the variables:
id
type
price
name
An example of "children" would be "books" which would contain amongst others:
id
meta_id
pages
Another "child" could be "dvds":
id
meta_id
tracks
where the meta_id of the child is equal to the id of the parent.
In old fashioned MySQL I would get all books by using:
SELECT
p.id, p.type, p.price, p.name, b.pages
FROM
products p
LEFT JOIN
books b
ON
p.id=b.meta_id
I know how to read & write data to & from one database table using Zend, extending the Zend_Db_Table_Abstract, and using a Mapper & a Model. I'm just not sure how to do this if I have to read/write objects that are stored in multiple database tables. How do I set this up? What model/pattern should I use? I'm sure this is pretty standard stuff, but I've been searching for days for clear examples, and I just can't seem to figure it out.
I had exactly the same confusion as you, and there is a great page here - Zend Framework Data Models - which explains how to solve this exact problem. You'll see ZF has excellent facilities to handle this sort of thing (short of using an ORM like Doctrine).
Also, when you're querying multiple tables, it is useful to be aware of the integrity check, as mentioned here Zend Framework Db Select Join table help
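To make the parent/child layout concrete, here is a minimal sketch in Python with SQLite standing in for MySQL (not Zend code). The table and column names come from the question; save_book and find_book are hypothetical mapper methods invented for illustration. The idea is that the mapper writes the parent row and the child row in one transaction, and reads them back with a join.

```python
import sqlite3

# SQLite stands in for MySQL here; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, type TEXT, price REAL, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY,
                        meta_id INTEGER REFERENCES products(id),
                        pages INTEGER);
""")


def save_book(conn, name, price, pages):
    """Hypothetical mapper method: write the parent row, then the child row."""
    with conn:  # one transaction, so both rows are written or neither is
        cur = conn.execute(
            "INSERT INTO products (type, price, name) VALUES ('book', ?, ?)",
            (price, name),
        )
        product_id = cur.lastrowid
        conn.execute(
            "INSERT INTO books (meta_id, pages) VALUES (?, ?)", (product_id, pages)
        )
    return product_id


def find_book(conn, product_id):
    """Hypothetical mapper method: read one book back as a single dict."""
    row = conn.execute(
        """
        SELECT p.id, p.type, p.price, p.name, b.pages
        FROM products p
        JOIN books b ON p.id = b.meta_id
        WHERE p.id = ?
        """,
        (product_id,),
    ).fetchone()
    if row is None:
        return None
    return dict(zip(("id", "type", "price", "name", "pages"), row))


book_id = save_book(conn, "Example Title", 12.5, 320)
book = find_book(conn, book_id)
```

A similar mapper per child type (dvds and so on) would keep the two-table detail out of the model objects.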
I am using codeigniter and MySQL to build an ecommerce web application.
This one requires three levels of categories, so I have created 3 tables. These are:
category
category_id, category_name
subcategory
subcategory_id,subcategory_name,subcategory_category_id
subsubcategory
subsubcategory_id,subsubcategory_name,subsubcategory_subcategory_id
Here they are linked as parent of one another. Finally I have the product table
product
product_id, product_name, product_subsubcategory_id
Now, I need an SQL query on this to fetch all products of any specific category.
Something like
$this->Mdl_data->select_products_by_category($category_id);
Please help me with this. I have tried solving it in PHP, but it was too slow, with lots of nested loops.
If you need to select all products, that match some specific category, try this request:
SELECT p.product_id, p.product_name, p.product_subsubcategory_id FROM category c
JOIN subcategory sc ON sc.subcategory_category_id = c.category_id
JOIN subsubcategory ssc ON ssc.subsubcategory_subcategory_id = sc.subcategory_id
JOIN product p ON p.product_subsubcategory_id = ssc.subsubcategory_id
WHERE c.category_id = 1;
But you should think about changing your database structure to make your requests faster and simpler.
Edit: Answering the comment about how to improve DB.
The current design of the database looks correct according to the actual data relations: 1-many for cat-subcat and 1-many for subcat-subsubcat. But this leads to complicated (and possibly slow) queries in use.
One way I see is to implement a many-many relation with an additional restriction. You can create an additional table cat-subcat, just as you would if you needed many-many. But in that table you can put a unique constraint on subcat_id, so every subcat can belong to only 1 cat and it becomes in fact a 1-many relation. In this case you can move both up and down the hierarchy. This approach will reduce the number of JOINs in your query only by 1, but the whole logic of the query will be easier to understand.
Another way. As I understand it, this is the query for a web-store filter. So new products will be inserted much more seldom than they are viewed by category. You can just add subcat_id and cat_id fields to your product, which is not a good idea from the point of view of data structure, but for this particular situation it might be a good solution. Every time a new product is inserted into the DB, you should check the correctness of those 2 fields in PHP or whatever you use on the server. But when products are searched by category, you will have a simple request without JOINs at all.
Both approaches are based on the idea to sacrifice some space for speeding up and simplifying the queries, that are frequently used. Maybe there is even better solution, but I can't find it right now.
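The second approach could look roughly like the sketch below, written in Python with SQLite standing in for MySQL and CodeIgniter. The columns product_subcategory_id and product_category_id are assumed names for the redundant fields, and insert_product is a hypothetical helper that fills them in at insert time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE subcategory (subcategory_id INTEGER PRIMARY KEY,
                              subcategory_name TEXT,
                              subcategory_category_id INTEGER);
    CREATE TABLE subsubcategory (subsubcategory_id INTEGER PRIMARY KEY,
                                 subsubcategory_name TEXT,
                                 subsubcategory_subcategory_id INTEGER);
    -- product_subcategory_id and product_category_id are the redundant columns
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        product_name TEXT,
        product_subsubcategory_id INTEGER,
        product_subcategory_id INTEGER,
        product_category_id INTEGER
    );
    CREATE INDEX idx_product_category ON product (product_category_id);

    INSERT INTO category VALUES (1, 'Books'), (2, 'Music');
    INSERT INTO subcategory VALUES (10, 'Fiction', 1), (20, 'Rock', 2);
    INSERT INTO subsubcategory VALUES (100, 'Crime', 10), (200, 'Indie', 20);
""")


def insert_product(conn, name, subsubcategory_id):
    """Look up the ancestors once at insert time and store them on the product."""
    subcategory_id, category_id = conn.execute(
        """
        SELECT sc.subcategory_id, sc.subcategory_category_id
        FROM subsubcategory ssc
        JOIN subcategory sc ON sc.subcategory_id = ssc.subsubcategory_subcategory_id
        WHERE ssc.subsubcategory_id = ?
        """,
        (subsubcategory_id,),
    ).fetchone()
    with conn:
        conn.execute(
            "INSERT INTO product (product_name, product_subsubcategory_id,"
            " product_subcategory_id, product_category_id) VALUES (?, ?, ?, ?)",
            (name, subsubcategory_id, subcategory_id, category_id),
        )


def select_products_by_category(conn, category_id):
    """No JOINs needed: the category id is stored on the product row."""
    rows = conn.execute(
        "SELECT product_name FROM product"
        " WHERE product_category_id = ? ORDER BY product_id",
        (category_id,),
    )
    return [name for (name,) in rows]


insert_product(conn, "Crime novel", 100)
insert_product(conn, "Indie album", 200)
books = select_products_by_category(conn, 1)
```

The cost is that moving a subcategory to another category means updating every affected product row as well.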
We have a php/mysql system with about 5 core entities. We now need to add the ability for customers to create custom fields for some of these entities on a per project basis.
They would contain a label, key, type, default value, and possible allowed values.
This is so they could add a custom date field, or a custom dropdown to the UI and save this value against the specific entity.
What is the best approach for storing this kind of data in a mySQL database? I need to store both the config for the field, and then the current value for a specific entity.
I've had a look at various options here.. https://ayende.com/blog/3498/multi-tenancy-extensible-data-model
But this is not really at a tenancy level, more a project level.
I was thinking...
A CustomFields table to hold the configuration of a field against an entity type and project id.
A CustomFieldValues table to hold the value saved against the field - a row per field ( entity_id | field_id | field_value)
Then we create relationships between the entities and these custom values when retrieving the entities.
The issue with this is that there will be as many rows in the Values table as there are custom fields, so saving an entity will result in X extra rows. On top of that, these are versioned, so once a new version is created, there will be another X rows created for that new version.
Also, you can't index the fields on name, and joins would become pretty complex, I think, as you have to join to the configuration and the values to build the key-value pair to return against the entity. And how would you select based on a custom field name, when the field name is actually a value?
I don't want to add dynamic columns to the table, as this will affect ALL the entities in the whole system, not just the ones in the current client / project.
The other option is to store the values in a JSON column.
This could be on the entity row itself customFields or similar. This would prevent the extra rows per field, but also has issues with lack of indexing etc, and still need to join to the config table. However, you could perform queries by the property name if the key=value was stored in the JSON... WHERE entity.customFields->"$.myCustomFieldName" > 1.
Storing the field name in the JSON does mean you cannot change it once created without a lot of pain.
If anyone has any advice on approaches for this, or articles to point me at, that would be much appreciated. I'm sure this has been solved many times before...
JSON records: No! A thousand times no! If you do that, just wait until somebody actually uses your system for a few tens of millions of records, then asks you to search on one of your extra fields. Your support people will curse your name.
Key-value store. Probably yes. There's a very widely deployed existence proof of this design: WordPress. It has a table called wp_postmeta, containing metadata fields applying to wp_posts (blog pages and posts). It's proven successful.
You will need to do some multiple joining to use this stuff. For example, to search on height and eye-color, you'd need
SELECT p.person_id, p.first, p.last, h.value height, e.value eye_color
FROM person p
LEFT JOIN attrib h ON p.person_id = h.person_id AND h.key='height'
LEFT JOIN attrib e ON p.person_id = e.person_id AND e.key='eye_color'
WHERE e.value='green' AND CAST(h.value AS SIGNED) < 160
As the CAST in that WHERE clause shows, you'll have some struggles with data type as well.
You'll need LEFT JOIN operations in this sort of attribute lookup; ordinary inner JOIN operations will suppress rows with missing attributes, and that might not work for you.
But, if you do a good job with indexes, you'll be able to get decent performance from this approach.
The table structure envisioned in my example doesn't have your table describing each additional field, but you know how to add that. It also doesn't have explicit support for multi-project / multitenant data separation. But you can add that as well.
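Here is a small runnable sketch of that attribute lookup, in Python with SQLite standing in for MySQL (so the cast is written AS INTEGER rather than MySQL's AS SIGNED). The sample rows are made up, and the composite primary key on (person_id, key) is an assumption about what "a good job with indexes" could mean here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, first TEXT, last TEXT);
    -- "key" is quoted because it is an SQL keyword; the composite primary key
    -- doubles as the index used by both joins below
    CREATE TABLE attrib (person_id INTEGER, "key" TEXT, value TEXT,
                         PRIMARY KEY (person_id, "key"));
    INSERT INTO person VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Fox');
    INSERT INTO attrib VALUES
        (1, 'height', '155'), (1, 'eye_color', 'green'),
        (2, 'height', '180'), (2, 'eye_color', 'green'),
        (3, 'eye_color', 'green');
""")

# Search on two attributes: one join of attrib per attribute.
matches = conn.execute("""
    SELECT p.person_id, p.first, p.last, h.value AS height, e.value AS eye_color
    FROM person p
    LEFT JOIN attrib h ON p.person_id = h.person_id AND h."key" = 'height'
    LEFT JOIN attrib e ON p.person_id = e.person_id AND e."key" = 'eye_color'
    WHERE e.value = 'green' AND CAST(h.value AS INTEGER) < 160
    ORDER BY p.person_id
""").fetchall()

# Without a WHERE clause, the LEFT JOIN keeps people whose attribute is missing.
heights = dict(conn.execute("""
    SELECT p.person_id, h.value
    FROM person p
    LEFT JOIN attrib h ON p.person_id = h.person_id AND h."key" = 'height'
"""))
```

Person 3 has no height row, so it shows up in heights with a NULL value but drops out of matches, because a comparison against NULL is never true.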
I'm currently working on an app backend (business directory). Main "actor" is an "Entry", which will have:
- main category
- subcategory
- tags (instead of unlimited sub-levels of division)
I'm pretty new to OOP but I still want to use it here. The database is MySql and I'll be using PDO.
In an attempt to figure out what database table structure should I use in order to support the above classification of entries, I was thinking about a solution that Wordpress uses - establish relationship between an entry and cats/subcats/tags through several tables (terms, taxonomies, relationships). What keeps me from this solution at the moment is the fact that each relationship of any kind is represented by a row in the relationships table. Given 50,000 entries I would have, attaching to a particular entry: main cat, subcat and up to 15 tags might slow down the app (or I am wrong)?
I then learned a bit about Table Data Gateway, which seemed an excellent solution because I liked the idea of having one table per class, but then I read there is virtually no way of successfully combating the impedance mismatch between OOP and the relational model.
Are there any other approaches that you may see fit for this situation? I think I will be going with:
tblentry
tblcategory
tblsubcategory
tbltag
structure. Relationships would be based on the parent IDs, but I'm wondering whether that is enough. Can I use foreign keys and cascade delete options here (that is something I am not too familiar with, and it seems to me a more intuitive way of having relationships between the elements in tables)?
having a table where you store the relationship between your table is a good idea, and through indexes and careful thinking you can achieve very fast results.
since each entry must represent a different kind of link between two entities (subcategory to main entry, tag to subcategory) you need at least (and at the very most) three fields:
id1 (or the unique id of the first entity)
linkid (linking to a fourth table where each link is described)
id2 (or the unique id of the second entity)
those three fields can and should be indexed.
now the fourth table to achieve this kind of many-to-many relationship will describe the nature of the link. since many different type of relationship will exist in the table, you can't keep what the type is (child of, tag of, parent of) in the same table.
that fourth table (reference) could look like this:
id   nature      table1   table2
1    parent of   entry    tags
2    tag of      tags     entry
the table 1 field tells you which table the first id refers to, likewise with table2
the id is the number stored in the linkid field, between the two ids, in your relationship table. only the id field should be indexed. the nature field is more for the human reader than for joining tables or organizing data
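A minimal sketch of this layout, in Python with SQLite standing in for MySQL. The table names relationship, reference, entry and tags, the sample rows, and the two composite indexes are assumptions made for illustration; tags_for_entry is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reference (id INTEGER PRIMARY KEY, nature TEXT,
                            table1 TEXT, table2 TEXT);
    CREATE TABLE relationship (id1 INTEGER,
                               linkid INTEGER REFERENCES reference(id),
                               id2 INTEGER);
    -- one index per lookup direction (an assumption, not from the answer)
    CREATE INDEX idx_rel_forward ON relationship (linkid, id1);
    CREATE INDEX idx_rel_backward ON relationship (linkid, id2);
    CREATE TABLE entry (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);

    INSERT INTO reference VALUES (1, 'parent of', 'entry', 'tags'),
                                 (2, 'tag of', 'tags', 'entry');
    INSERT INTO entry VALUES (1, 'Cafe Example'), (2, 'Bike Shop Example');
    INSERT INTO tags VALUES (1, 'coffee'), (2, 'wifi'), (3, 'repairs');
    -- link type 2 means: tags.id (id1) is a tag of entry.id (id2)
    INSERT INTO relationship VALUES (1, 2, 1), (2, 2, 1), (3, 2, 2);
""")

TAG_OF = 2  # id of the 'tag of' row in the reference table


def tags_for_entry(conn, entry_id):
    """Return the tag names linked to one entry through the relationship table."""
    rows = conn.execute(
        """
        SELECT t.name
        FROM relationship r
        JOIN tags t ON t.id = r.id1
        WHERE r.linkid = ? AND r.id2 = ?
        ORDER BY t.name
        """,
        (TAG_OF, entry_id),
    )
    return [name for (name,) in rows]


cafe_tags = tags_for_entry(conn, 1)
```

Because id1 and id2 can point at different tables depending on linkid, the database cannot enforce them with ordinary foreign keys; that check has to live in the application.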
what do you think would be performance-wise the better way to get the category-names of a news-system:
add an extra field for the cat-names inside a table, which already contains a field for the cat-ids
no extra field for the cat-names, but cat-ids; read the cat-names (comma-separated string: "cat1,cat2,cat3,cat4") into the php-file from an existing config-file and then build the cat-names with the help of the db-field "cat-ids", an array and a for-loop?
Thanx in advance,
Jayden
edit: cant seem to add a "hi" or "hallo" on top of the post, the editor just deletes it...
If you are measuring milliseconds and the disk IO of your system is not extremely slow, then option 2 would yield better performance. But we are talking about a negligible gain in execution time. Since you will already be querying the DB to get the news item, it would be highly efficient to just get the category name at the same time. I would add a mapping table of category-name-id to category-names, and then join on that when getting news items.
From a flexibility standpoint, and from the standpoint of eliminating as many possible sources of error, I would also go with my above idea, since it adds flexibility to your system and keeps all your data in one spot. Changing the name of a category would require editing one column in the database instead of editing a php config file or, if option 1 was used, updating each and every news record.
So my best advice: add a table with category-name-id to category-names mappings and then have the news-items contain the id of the category they belong to.
For performance you could then cache the data you retrieve about existing categories and other data so you don't have to poll the DB for that information all the time.
For instance. You could, instead of joining at all, get all the categories from the category table I described above. Cache it in the application and only get it once the cache is invalidated. i.e. a timeout occurs or the data in the db is manipulated.
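The caching idea could be sketched as follows, in Python with SQLite standing in for PHP and MySQL. CategoryCache, its ttl parameter and the categories table layout are all hypothetical names chosen for illustration.

```python
import sqlite3
import time


class CategoryCache:
    """Hypothetical helper: keeps the id -> name mapping in memory for ttl seconds."""

    def __init__(self, conn, ttl=300.0, clock=time.monotonic):
        self.conn = conn
        self.ttl = ttl
        self.clock = clock
        self._names = None
        self._loaded_at = 0.0
        self.loads = 0  # counts database reads, for illustration only

    def invalidate(self):
        """Call this whenever the category table is changed."""
        self._names = None

    def names(self):
        expired = self.clock() - self._loaded_at > self.ttl
        if self._names is None or expired:
            rows = self.conn.execute("SELECT category_id, name FROM categories")
            self._names = dict(rows)
            self._loaded_at = self.clock()
            self.loads += 1
        return self._names


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE categories (category_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO categories VALUES (?, ?)", [(1, "Weather"), (2, "Local")]
)

cache = CategoryCache(conn)
first = cache.names()[2]
second = cache.names()[1]  # served from memory, no second query

conn.execute("UPDATE categories SET name = 'Regional' WHERE category_id = 2")
cache.invalidate()  # data changed, so drop the cached copy
renamed = cache.names()[2]
```

In a PHP application the same role would typically be played by a shared cache rather than an in-process object, since each request starts with fresh memory.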
I think of two possible ways.
Have a category table, a articles table and a relationship table, and have a many-to-many relationship between categories and articles (as described in the relationship table).
If you feel smart today, give each category a distinct power of two (1, 2, 4, 8, 16 etc), and add them up in a field on the articles table. If an article has a category value of 11, it has categories 1+2+8.
I like the first solution better, quite frankly.
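For reference, the second idea works like the sketch below. The category names and flag values are made up; the only requirement is that each flag is a distinct power of two.

```python
# Hypothetical flag values: each category is a distinct power of two.
CATEGORY_FLAGS = {"weather": 1, "local": 2, "sports": 4, "politics": 8}


def encode(names):
    """Combine category names into one integer for the articles table."""
    mask = 0
    for name in names:
        mask |= CATEGORY_FLAGS[name]
    return mask


def decode(mask):
    """Recover the category names from the stored integer."""
    return [name for name, flag in CATEGORY_FLAGS.items() if mask & flag]


mask = encode(["weather", "local", "politics"])  # 1 + 2 + 8
names = decode(mask)
```

This also shows why it is fragile: the number of categories is limited by the integer width, and finding all articles in a category requires a bitwise test on every row, which an ordinary index cannot speed up.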
I would create a categories table like this:
Categories
-----------
category_id name
-------------------------
1 Weather
2 Local
3 Sports
Then create a junction table, so each article can have 0 or more categories:
Article_Categories
-------------------
article_id category_id
-----------------------------
1 2
1 3
2 1
To get the articles with their categories (comma delimited) from MySQL server, you can use GROUP_CONCAT():
SELECT a.*, GROUP_CONCAT(c.name) AS cats
FROM Articles a
LEFT JOIN Article_Categories ac
ON ac.article_id = a.article_id
LEFT JOIN Categories c
ON c.category_id = ac.category_id
GROUP BY a.article_id
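The junction-table query can be tried out with the following sketch, which uses Python and SQLite (which also provides GROUP_CONCAT) in place of MySQL. The Articles table layout and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Articles (article_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE Categories (category_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Article_Categories (article_id INTEGER, category_id INTEGER,
                                     PRIMARY KEY (article_id, category_id));
    INSERT INTO Articles VALUES (1, 'Derby result'), (2, 'Storm warning'),
                                (3, 'Untagged');
    INSERT INTO Categories VALUES (1, 'Weather'), (2, 'Local'), (3, 'Sports');
    INSERT INTO Article_Categories VALUES (1, 2), (1, 3), (2, 1);
""")

rows = conn.execute("""
    SELECT a.article_id, a.title, GROUP_CONCAT(c.name) AS cats
    FROM Articles a
    LEFT JOIN Article_Categories ac ON ac.article_id = a.article_id
    LEFT JOIN Categories c ON c.category_id = ac.category_id
    GROUP BY a.article_id
    ORDER BY a.article_id
""").fetchall()

# The order of names inside cats is not guaranteed, so compare as sets.
# An article without categories gets NULL (None) instead of a string.
cats_by_article = {
    article_id: set(cats.split(",")) if cats else set()
    for article_id, _title, cats in rows
}
```

Thanks to the LEFT JOINs, the article with no categories still appears in the result.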
Add an additional table, that will save lots of issues in future for you. It is just the recommended way.
By the way, that idea of multiple id's in one field, don't try that way. It will give lots of code and issues which are totally unnecessary. If you really find performance issues you can always decide to take a step further and de-normalize or cache some of the data. There are lots of caching options available.
I think your first option is the suitable one, because it makes sense given the relationships in your data. And when you want to display the category name with your news, you can simply get everything with a single select query with a join.
So I recommend the Option 1 you mentioned.
Performance can also be measured in two ways: execution performance and development performance. I feel both are in a good position with your option 1. You don't need to do much, just one query. If you go for option 2, then you have to load from the config file, explode it on commas, then search using array elements, which is time-consuming.
I may be wrong, but since you already query the database, it's probably faster if you add a name field there..
Please also take into account that having the name in the same table as the ID provides consistency - if you have a config file you'll have to add a new category there plus in the table.
Also think of possible errors that may put wrong data into your config file - if this'd be the case your category names might get messed up..
We have a large number of data in many categories with many properties, e.g.
category 1: Book
properties: BookID, BookName, BookType, BookAuthor, BookPrice
category 2: Fruit
properties: FruitID, FruitName, FruitShape, FruitColor, FruitPrice
We have many categories like book and fruit. Obviously we can create many tables for them (MySQL e.g.), and each category a table. But this will have to create too many tables and we have to write many "adapters" to unify manipulating data.
The difficulties are:
1) Every category has different properties and this results in a different data structure.
2) The properties of every category may have to be changed at any time.
3) Hard to manipulate data if each category a table (too many tables)
How do you store such kind of data?
You can separate the database into two parts: Definition Tables and Data Tables. Basically, the Definition Tables are used to interpret the Data Tables, where the actual data is stored (some would say that the definition tables are more elegant if represented in XML).
The following is the basic idea.
Definition Tables:
TABLE class
class_id (int)
class_name (varchar)
TABLE class_property
property_id (int)
class_id (int)
property_name (varchar)
property_type (varchar)
Data Tables:
TABLE object
object_id (int)
class_id (int)
TABLE object_property
object_id (int)
property_id (int)
property_value (varchar)
It would be best if you could also create additional Layer to interpret the structure so as to make it easier for the Data Layer to operate on the data. And you must of course take into consideration performance, ease of query, etc.
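Such an interpreting layer might look like this minimal sketch, in Python with SQLite standing in for MySQL. It assumes object_property also carries an object_id column so that values can be tied to an object; load_object and the CONVERTERS mapping are hypothetical, as are the sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- definition tables
    CREATE TABLE class (class_id INTEGER PRIMARY KEY, class_name TEXT);
    CREATE TABLE class_property (property_id INTEGER PRIMARY KEY, class_id INTEGER,
                                 property_name TEXT, property_type TEXT);
    -- data tables
    CREATE TABLE object (object_id INTEGER PRIMARY KEY, class_id INTEGER);
    CREATE TABLE object_property (object_id INTEGER, property_id INTEGER,
                                  property_value TEXT,
                                  PRIMARY KEY (object_id, property_id));

    INSERT INTO class VALUES (1, 'Book');
    INSERT INTO class_property VALUES (1, 1, 'BookName', 'str'),
                                      (2, 1, 'BookPrice', 'float');
    INSERT INTO object VALUES (1, 1);
    INSERT INTO object_property VALUES (1, 1, 'Example Title'), (1, 2, '9.99');
""")

# Maps the property_type stored in the definition table to a Python converter.
CONVERTERS = {"str": str, "int": int, "float": float}


def load_object(conn, object_id):
    """Rebuild one object as a dict, converting each value by its declared type."""
    rows = conn.execute(
        """
        SELECT cp.property_name, cp.property_type, op.property_value
        FROM object_property op
        JOIN class_property cp ON cp.property_id = op.property_id
        WHERE op.object_id = ?
        """,
        (object_id,),
    )
    return {name: CONVERTERS[ptype](value) for name, ptype, value in rows}


book = load_object(conn, 1)
```

Adding a property to a category then means inserting a row into class_property rather than altering a table.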
Just my two cents, I hope it could be of any help.
Regards.
If your data collection isn't too big, the Entity-Attribute-Value (EAV) model may fit the bill nicely.
In a nutshell, this structure allows the definition of Categories, the list of [required or optional] Attributes (aka properties) the entities in such a category include, etc., in a set of tables known as the meta-data: the logical schema of the data, if you will. The entity instances are stored in two tables, a header table and a values table, whereby each attribute is stored in a single [SQL] record of the latter table (aka "vertical" storage: what used to be a record in the traditional DBMS model is made of several records of the values table).
This format is very practical in particular for its flexibility: it allows both late and on-going changes in the logical schema (addition of new categories, additions/changes in the attributes of a given category etc.), as well the implicit data-driven handling of the underlying catalog's logical schema, at the level of the application. The main drawbacks of this format are the [somewhat] more sophisticated, abstract, implementation and, mainly, some limitations with regards to scaling etc. when the catalog size grows, say in the million+ entities range.
See the EAV model described in more detail in this SO answer of mine.
Triggered by this question and other similar ones, I wrote a blog post on how to handle such cases using a graph database. In short, graph databases don't have the problem "how to force a tree/hierarchy into tables" as there's simply no need for it: you store your tree structure as it is. They're not good at everything (like for example creating reports) but this is a case where graph databases shine.