MySQL - EAV or XML?

I'm weighing up a design for a recent project. The site has categories (nested set model) and products. The main problem is that I must implement many category properties, and these properties differ from category to category. Products that belong to a specific category then have values for that category's properties.
My problem is how to store these properties. I read about the EAV model, but its main drawbacks are that it is basically slow, needs many additional tables, and is not suitable for larger data sets. The more I read about it, the less convinced I am.
My second thought is XML. The categories table would have an XML column that holds the property names, and the products table would have an XML column that holds the property values.
Now the question: which solution is better and more flexible? Another requirement is to let users search by category-specific properties. I'm assuming the database will hold many products, so query performance is very important.

Related

Handling categories and sub categories MySQL or JSON

I am trying to find the best way to handle categories and subcategories. I have 20 categories and 50 subcategories. Which of these is the best way to do so:
1) Save the data in a JSON file and read its content directly on the client side.
2) Save the data in a single database table, use a parent id to express the relation, and run a foreach over the result array inside another foreach over the same array.
3) Save the data in two database tables, make one SQL call for the parent categories and another for the subcategories, use the parent id to express the relation, and run a foreach over the subcategory array inside a foreach over the parent array.
4) Save the data in two database tables, make one SQL call for the parent categories, and then make multiple SQL calls to the database inside its foreach.
I tried to find a best practice for handling categories but couldn't find any article on it.

The solution depends purely on how complex your database schema is, how other entities relate to the categories, and how you intend to read the information.
The JSON approach would be faster, but it becomes a problem for queries that need to link categories to additional information.
Another approach I have used with good performance is storing all categories in a single table, without storing the relationships in that table.
A second table stores the relationships as graph edges. This is quite advantageous if you have cyclic relationships within the categories, or categories with more than one parent.
The schema would look like:
categories ( id, name )
category_edges ( parent_id, child_id )
I used OQGRAPH in my implementation to make the relationship queries faster, but that was with MariaDB and not MySQL.
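As a rough sketch of the edge-table schema above, the following uses Python's built-in sqlite3 module as a stand-in for MySQL or MariaDB and walks the edges with a recursive common table expression instead of OQGRAPH. The sample category names and the descendants helper are illustrative assumptions, not part of the original implementation.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL/MariaDB.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE category_edges (
    parent_id INTEGER NOT NULL REFERENCES categories(id),
    child_id  INTEGER NOT NULL REFERENCES categories(id),
    PRIMARY KEY (parent_id, child_id)
);
-- Hypothetical sample data: 'Smartphones' has two parents.
INSERT INTO categories VALUES
    (1, 'Electronics'), (2, 'Phones'), (3, 'Smartphones'), (4, 'Gifts');
INSERT INTO category_edges VALUES (1, 2), (2, 3), (4, 3);
""")


def descendants(conn, root_id):
    """Return the names of all categories reachable from root_id."""
    rows = conn.execute(
        """
        WITH RECURSIVE sub(id) AS (
            SELECT child_id FROM category_edges WHERE parent_id = ?
            UNION  -- UNION (not UNION ALL) drops duplicates, so cycles terminate
            SELECT e.child_id
            FROM category_edges e
            JOIN sub s ON e.parent_id = s.id
        )
        SELECT c.name FROM categories c JOIN sub ON sub.id = c.id
        ORDER BY c.name
        """,
        (root_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(descendants(conn, 1))
print(descendants(conn, 4))
```

Recursive queries of this kind need MySQL 8.0 or later; on older versions the traversal would have to happen in application code or through an engine such as OQGRAPH.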
Hope this helps.

Best approach for Doctrine2 with productdata sharded per category

I'm using Doctrine2 in a Symfony2 project and I have to build against an already existing MySQL database with product data. There are about 40 product categories and millions of products. Each category has its own table in MySQL, all with the same schema.
Simplified:
Every table also has its own primary key ID field, so it's possible to have several products with the same ID, but coming from a different category.
All references to these products are now by category ID + product ID. The current software selects the data from the correct table based on a defined mapping between the category ID and the MySQL table. This mapping lives in a separate categories MySQL table and rarely changes.
I've looked at inheritance mapping in Doctrine2, and it seems the best solution would be to make a mapped superclass. After that I can make 40 subclasses that extend this superclass, one for each category. I would still need some kind of mapping between the category ID and the correct subclass entity.
Is there a better approach to this? Because with this solution I would have to find the correct subclass entity based on the category ID.

Single Category Table For Entire Site Or Different For Different Use Case

The product table has a category, the media table has a category, and the ticket table has a category.
Each of these has a HasMany relation with the category table. There are two ways of doing it:
Have a common Category table, probably with a type column, plus intermediary tables such as MediaCategory, etc.
Have separate tables such as MediaCategory, each with the same structure as Category.

I think the first one is better in terms of integrity.

If the categories are not shared, then it is best (in most cases) to have separate tables for each category type.
Here's the rationale:
The database is the gatekeeper of your data's relational integrity. In a well-written program there should not be any foreign key violation exceptions, i.e. the code does not rely on the database to keep the relational integrity of the data. However, should a bug creep in, the relations in the database make the bug less likely to cause data corruption.
When using separate tables for media, products, etc. and their valid categories, the relational integrity can be easily maintained with a foreign key relationship; essentially any record in the media table can belong to any category in the media categories table. Ensuring the relation:
"Records in media table can belong to any category of type 'media' in
the categories table"
is less straightforward at the database level.
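A minimal sketch of the separate-tables variant, using Python's built-in sqlite3 module in place of MySQL: the foreign key on media points only at media_categories, so a product category id is rejected by the database itself. The table names, column names, and the insert_media helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when asked to; InnoDB does so by default.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE media_categories   (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product_categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE media (
    id          INTEGER PRIMARY KEY,
    title       TEXT NOT NULL,
    category_id INTEGER NOT NULL REFERENCES media_categories(id)
);
-- Hypothetical sample data.
INSERT INTO media_categories   VALUES (1, 'Video');
INSERT INTO product_categories VALUES (7, 'Furniture');
""")


def insert_media(conn, title, category_id):
    """Insert a media row; return False if the database rejects the category."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO media (title, category_id) VALUES (?, ?)",
                (title, category_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(insert_media(conn, "Intro clip", 1))   # category 1 is a media category
print(insert_media(conn, "Stray record", 7)) # 7 exists only as a product category
```

With a single shared categories table and a type column, the same guarantee would need something extra, for example a composite foreign key that includes the type, which is the added complexity the answer refers to.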
That being said, a problem whose solution is duplication of data structures makes the whole underlying structure suspect. This may not be so in your case, but you should look at the underlying use-cases that require the categories to be introduced and see whether they are better served in a different manner (say Free Text Search indexed keywords.)

Need advice on building the database: all in one table or split?

I am developing an application for a real-estate company. The problem I am facing is implementing the database. I am unsure which approach to adopt and would appreciate help reasoning through the database design.
Here is my situation.
a) I have to store the property details in the database.
b) A property belongs to one of approximately 4-5 categories, for example residential, commercial, industrial, etc.
c) The categories have sub-categories. For example, the residential category has sub-categories such as Apartment / Independent House / Villa / Farm House / Studio Apartment, etc. In the same way, commercial, industrial, and agricultural have sub-categories too.
d) Each sub-category has to store different values. For example, a residence has features like bedrooms / kitchens / hall / bathrooms, etc. The features depend on the sub-category.
For an example of how I would want my application to work, you can have a look at this site.
http://www.magicbricks.com/bricks/postProperty.html
I could think of solutions like these.
a) Create four to five tables, one per existing category (the problem is that the number of categories might increase in the future).
b) Create separate tables for the features, location, price, and description, and merge the common fields into one property table. For example, every property has common attributes such as location, total area, etc.
What would you advise given the current situation?
Thank you.

In order to implement this properly you need to know (read) about database normalization.
Every entity needs its own table. You will have tables for:
objects (real estate objects)
categories
transactionTypes
... etc.
If you have hierarchical categories, strictly organised in a tree structure, you may want to implement this as a tree structure, all stored in one table. If there are possibilities of overlaps, then it means you need to have different tables for each, like:
propertyTypes
propertyRatings
propertyAvailability
... etc.

Generally you could have a table for each property "type" containing the "type" specific information but also have a corresponding "common" table that would contain common fields between all types such as "price", "address", etc...
This is how MLS data is structured.
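The common-table-plus-type-table layout described above might look roughly like this. It is a sketch using Python's built-in sqlite3 module rather than MySQL, and every table name, column name, and sample listing here is a hypothetical choice for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Fields shared by every property type.
CREATE TABLE properties (
    id            INTEGER PRIMARY KEY,
    property_type TEXT    NOT NULL,
    price         INTEGER NOT NULL,
    address       TEXT    NOT NULL
);
-- One table per type, keyed by the common table's id.
CREATE TABLE residential_details (
    property_id INTEGER PRIMARY KEY REFERENCES properties(id),
    bedrooms    INTEGER NOT NULL,
    bathrooms   INTEGER NOT NULL
);
CREATE TABLE commercial_details (
    property_id     INTEGER PRIMARY KEY REFERENCES properties(id),
    floor_area_sqft INTEGER NOT NULL
);
""")


def add_residential(conn, price, address, bedrooms, bathrooms):
    """Insert the common row and its residential row in one transaction."""
    with conn:
        cur = conn.execute(
            "INSERT INTO properties (property_type, price, address) "
            "VALUES ('residential', ?, ?)",
            (price, address),
        )
        conn.execute(
            "INSERT INTO residential_details VALUES (?, ?, ?)",
            (cur.lastrowid, bedrooms, bathrooms),
        )
    return cur.lastrowid


def find_residential(conn, min_bedrooms, max_price):
    """Search on one type-specific column and one common column."""
    rows = conn.execute(
        """
        SELECT p.address
        FROM properties p
        JOIN residential_details r ON r.property_id = p.id
        WHERE r.bedrooms >= ? AND p.price <= ?
        ORDER BY p.price
        """,
        (min_bedrooms, max_price),
    ).fetchall()
    return [address for (address,) in rows]


add_residential(conn, 250000, "12 Oak St", 3, 2)
add_residential(conn, 180000, "4 Elm Rd", 1, 1)
add_residential(conn, 400000, "9 Pine Ave", 4, 3)
print(find_residential(conn, 2, 300000))
```

Type-specific columns stay ordinary, indexable columns, at the cost of a new table whenever a new property type is introduced.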

Categories of properties is yet another example of the gen-spec design pattern.
For a prior discussion on gen-spec here is the link.

How to store data with many categories and many properties efficiently?

We have a large amount of data in many categories with many properties, e.g.
category 1: Book
properties: BookID, BookName, BookType, BookAuthor, BookPrice
category 2: Fruit
properties: FruitID, FruitName, FruitShape, FruitColor, FruitPrice
We have many categories like book and fruit. Obviously we could create one table per category (in MySQL, for example), but that means creating too many tables and writing many "adapters" to manipulate the data in a unified way.
The difficulties are:
1) Every category has different properties, which results in a different data structure.
2) The properties of any category may have to change at any time.
3) It is hard to manipulate the data if each category is a table (too many tables).
How do you store such kind of data?

You can separate the database into two parts: Definition Tables and Data Tables. Basically, the Definition Tables are used to interpret the Data Tables, where the actual data is stored (some would say that the definition tables are more elegant if represented in XML).
The following is the basic idea.
Definition Tables:
TABLE class
    class_id (int)
    class_name (varchar)
TABLE class_property
    property_id (int)
    class_id (int)
    property_name (varchar)
    property_type (varchar)
Data Tables:
TABLE object
    object_id (int)
    class_id (int)
TABLE object_property
    object_id (int)
    property_id (int)
    property_value (varchar)
It would be best if you could also create an additional layer that interprets this structure, to make it easier for the data layer to operate on the data. And you must of course take performance, ease of querying, etc. into consideration.
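One way such an interpreting layer could look is sketched below, with Python's built-in sqlite3 module standing in for MySQL. It assumes object_property also carries an object_id column linking each value to its object, and the CASTS mapping, the load_object helper, and the sample book are hypothetical names and data introduced for this example.

```python
import sqlite3

# Maps the property_type stored in class_property to a Python converter.
CASTS = {"int": int, "float": float, "varchar": str}

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE class (class_id INTEGER PRIMARY KEY, class_name TEXT);
CREATE TABLE class_property (
    property_id   INTEGER PRIMARY KEY,
    class_id      INTEGER,
    property_name TEXT,
    property_type TEXT
);
CREATE TABLE object (object_id INTEGER PRIMARY KEY, class_id INTEGER);
CREATE TABLE object_property (
    object_id      INTEGER,  -- assumed link back to the object table
    property_id    INTEGER,
    property_value TEXT,
    PRIMARY KEY (object_id, property_id)
);
-- Hypothetical sample data: one Book with two properties.
INSERT INTO class VALUES (1, 'Book');
INSERT INTO class_property VALUES
    (1, 1, 'BookName', 'varchar'),
    (2, 1, 'BookPrice', 'float');
INSERT INTO object VALUES (10, 1);
INSERT INTO object_property VALUES (10, 1, 'Dune'), (10, 2, '9.5');
""")


def load_object(conn, object_id):
    """Rebuild one object as a dict, casting each value by its declared type."""
    rows = conn.execute(
        """
        SELECT cp.property_name, cp.property_type, op.property_value
        FROM object_property op
        JOIN class_property cp ON cp.property_id = op.property_id
        WHERE op.object_id = ?
        """,
        (object_id,),
    ).fetchall()
    return {name: CASTS[ptype](value) for name, ptype, value in rows}


print(load_object(conn, 10))
```

Because every value is stored as text, the definition tables are the only place that knows a price is numeric, which is why a layer like this tends to be needed.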
Just my two cents, I hope it could be of any help.
Regards.

If your data collection isn't too big, the Entity-Attribute-Value (EAV) model may fit the bill nicely.
In a nutshell, this structure allows the definition of categories, the list of [required or optional] attributes (aka properties) that the entities in each category include, etc., in a set of tables known as the meta-data: the logical schema of the data, if you will. The entity instances are stored in two tables, a header table and a values table, whereby each attribute is stored in a single [SQL] record of the latter table (aka "vertical" storage: what used to be one record in the traditional DBMS model is made of several records in the values table).
This format is very practical, in particular for its flexibility: it allows both late and ongoing changes to the logical schema (addition of new categories, additions/changes to the attributes of a given category, etc.), as well as implicit, data-driven handling of the underlying catalog's logical schema at the level of the application. The main drawbacks of this format are the [somewhat] more sophisticated, abstract implementation and, mainly, some limitations with regard to scaling when the catalog size grows, say into the million+ entities range.
See the EAV model described in more detail in this SO answer of mine.
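To make the "vertical" storage concrete, here is a small sketch of searching EAV data on several attributes at once, again with Python's built-in sqlite3 module standing in for MySQL. The entity and entity_value tables, the index, and the find_entities helper are assumptions for illustration and omit the meta-data tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (id INTEGER PRIMARY KEY, category TEXT NOT NULL);
CREATE TABLE entity_value (
    entity_id INTEGER NOT NULL REFERENCES entity(id),
    attribute TEXT    NOT NULL,
    value     TEXT    NOT NULL,
    PRIMARY KEY (entity_id, attribute)  -- one value per attribute per entity
);
CREATE INDEX idx_attribute_value ON entity_value (attribute, value);
-- Hypothetical sample data.
INSERT INTO entity VALUES (1, 'Fruit'), (2, 'Fruit'), (3, 'Book');
INSERT INTO entity_value VALUES
    (1, 'FruitColor', 'red'), (1, 'FruitShape', 'round'),
    (2, 'FruitColor', 'red'), (2, 'FruitShape', 'long'),
    (3, 'BookType', 'novel');
""")


def find_entities(conn, filters):
    """Return ids of entities that match every attribute/value pair given."""
    if not filters:
        raise ValueError("at least one filter is required")
    clause = " OR ".join(["(attribute = ? AND value = ?)"] * len(filters))
    params = [item for pair in filters.items() for item in pair]
    rows = conn.execute(
        f"""
        SELECT entity_id
        FROM entity_value
        WHERE {clause}
        GROUP BY entity_id
        HAVING COUNT(*) = ?  -- every filter matched exactly once
        ORDER BY entity_id
        """,
        params + [len(filters)],
    ).fetchall()
    return [entity_id for (entity_id,) in rows]


print(find_entities(conn, {"FruitColor": "red", "FruitShape": "round"}))
print(find_entities(conn, {"FruitColor": "red"}))
```

Each extra search criterion adds another pass over the values table, and values are compared as text, which hints at where the scaling limits come from.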

Triggered by this question and other similar ones, I wrote a blog post on how to handle such cases using a graph database. In short, graph databases don't have the problem "how to force a tree/hierarchy into tables" as there's simply no need for it: you store your tree structure as it is. They're not good at everything (like for example creating reports) but this is a case where graph databases shine.
