Backend app in OO PHP: Structuring classes/tables efficiently

Backend app in OO PHP: Structuring classes/tables efficiently - php

I'm currently working on an app backend (business directory). Main "actor" is an "Entry", which will have:
- main category
- subcategory
- tags (instead of unlimited sub-levels of division)
I'm pretty new to OOP but I still want to use it here. The database is MySql and I'll be using PDO.
In an attempt to figure out what database table structure should I use in order to support the above classification of entries, I was thinking about a solution that Wordpress uses - establish relationship between an entry and cats/subcats/tags through several tables (terms, taxonomies, relationships). What keeps me from this solution at the moment is the fact that each relationship of any kind is represented by a row in the relationships table. Given 50,000 entries I would have, attaching to a particular entry: main cat, subcat and up to 15 tags might slow down the app (or I am wrong)?
I then learned a bit about Table Data Gateway which seemed an excellent solution because I liked the idea of having one table per a class but then I read there is virtually no way of successful combating the impedence missmatch between the OOP and relational-mapping.
Are there any other approaches that you may see fit for this situation? I think I will be going with:
tblentry
tblcategory
tblsubcategory
tbltag
structure. Relationships would be based on the parent IDs but I+'m wondering is that enough? Can I be using foreign key and cascade delete options here (that is something I am not too familiar with and it seems to me as a more intuitive way of having relationships between the elements in tables)?

having a table where you store the relationship between your table is a good idea, and through indexes and careful thinking you can achieve very fast results.
since each entry must represent a different kind of link between two entities (subcategory to main entry, tag to subcategory) you need at least (and at the very most) three fields:
id1 (or the unique id of the first entity)
linkid (linking to a fourth table where each link is described)
id2 (or the unique id of the second entity)
those three fields can and should be indexed.
now the fourth table to achieve this kind of many-to-many relationship will describe the nature of the link. since many different type of relationship will exist in the table, you can't keep what the type is (child of, tag of, parent of) in the same table.
that fourth table (reference) could look like this:
id nature table1 table2
1 parent of entry tags
2 tag of tags entry
the table 1 field tells you which table the first id refers to, likewise with table2
the id is the number between the two fields in your relationship table. only the id field should be indexed. the nature field is more for the human reader then for joining tables or organizing data

Related

CakePHP database design: associations for job board

Hy everyone.
I'm actually building a job board with CakePHP and a little help for designing the database will be appreciated!
I have a table jobs with differents foreigns keys:
id, recruiter_id, title, sector_id, division_id, experience_id etc.
The associated table (sectors, divisions and experiences) have the same configuration id, name and job_count and sometimes on or two other fields (like company_count for sectors).
So I would like to know if there is better way to design these tables. I thought for putting the three of them in one table named lists with the keys: id, value and list_name. With this configuration I have just one request to do to get all the list and not 3.
My question is what is the "good way" solution ? May be there's another one ?

Seems kind of repetitive to have them in separate tables, when really they're all the same thing - properties of a job, and would have VERY similar table structures.
I would think you could create a single table for "job_properties" or something.
Each property could have a unique slug (if you wanted) or just use it's id.
// job_properties table example
id
slug // (optional or could be called "key" if you prefer)
type // (optional - "sector", "division", "min_exp")
name // (for use on the names of things like "marketing" or "technology")
value // (int - for use on things like minimum experience)
Then each Job would hasMany JobProperty. It would also allow any job to have more than one sector if that is ever needed.
This would allow you to pull based on if a job has a particular property or set of properties and seems overall cleaner and more consolidated while not making it too obfuscated.

I think a found a solution by using a system of taxonomy. I created a table terms which contain the list of all terms that can be associated (sector, division, type of contrat, etc.).
Table terms id, name, type
And I created a second table term_relationships which contain all the association including the name of the model that is associated.
Tabe term_relationships id, ref, ref_id, term_id
"ref" refers to the associated model (example: Job or Applicant in my case), the "ref_id" refers to the associated data (which job or which applicant) and term_id refers to which terms is associated. I think is the most evolutive and cleaner solution.
Thanks all for your help (especially Grafikart from where I get the idea) and hope that this topic can help someone else !

Foreign key vs. SET data type MySQL

I have a MySQL database set up with a list of all my movies, which I imported from a MS Access database. One field contains the possible values for the genre of the movie, movies can have more than one genre, so I need a data type which supports this feature. In access I could link one table 'genre' to the field 'genre' in my table 'movies', so I could choose none, one ore multiple genres per movie. When I switched to MySQL I used the SET data type to define all the possible values. So far everything is running perfectly.
I am now trying to set up a table in html/php to show the mysql table. I want the table to be able to sort on: title, genre, quality, rating, etc. But for the sorting on genre, I would need the possible values from the set data type. I don't know if it is possible to get the values with some php command/code, but after I lurked around on the web for a while, I didn't see many applications where they use the SET data type for obvious negative reasons.
So I started looking into the Foreign Key possibility. The problem I have here is that -for as far as I know- the key can only contain one possible value, which puts me right back at the start of my problem. I do like the idea of a foreign key, because it would make it way easier for me to add a new genre to the list.
Is there a possibility I am overlooking? Is it possible to either get the values from the SET type to php or to use a foreign key with multiple possibilities for one record?
I know I can also put every genre in my php script manually, but I'd like to have it all on one place. So that if I add a movie with a genre I haven't defined yet, I can just update it at one place and everything else adapts to it.

Dagon is absolutely right here - you have an issue with the structure of the tables in your back end. You are wanting to model a many to many relationship when at the moment with your current back end the best you can do is a one to many relationship.
To review:
You have individual films that can have many genres
And you have individual genres that are related to many films
Relational databases actually don't model many to many relationships with one relationship they use recursion of the one to many relationship and create two joins.
To model a many to many relationship you need three tables
A film table (which I think you already have)
A genre table (which I think you already have)
A junction table which as Dagon suggests will consist of two fields film id and genre id.
You then set up two separate one to many relationships. One from the film table to the junction table and one from the genre table to the junction table.
Now if you want to know all the genres a film is in you simply filter the junction table on the relevant film id and if you want to know all the films with a certain genre you filter the junction table on the genre id.
Set up lookups to relate your genre ids to textual descriptions and bang you are free to change the textual description as much as you want and the great thing if you've done it right it will upgrade every single value in your forms.
This is an absolute fundamental concept of the algebra of sets behind the design of SQL and relational database design.

Advice on database design for portfolio website

So I'm a visual designer type guy who has learned a respectable amount of PHP and a little SQL.
I am putting together a personal multimedia portfolio site. I'm using CI and loving it. The problem is I don't know squat about DB design and I keep rewriting (and breaking) my tables. Here is what I need.
I have a table to store the projects:
I want to do fulltext searcheson titles and descriptions so I think this needs to be MyISAM
PROJECTS
id
name (admin-only human readable)
title (headline for visitors to read)
description
date (the date the project was finished)
posted (timestamp when the project was posted)
Then I need tags:
I think I've figured this out. from researching.
TAGS
tag_id
tag_name
PROJECT_TAGS
project_id (foreign key PROJECTS TABLE)
tag_id (foreign key TAGS TABLE)
Here is the problem I have FOUR media types; Photo Albums, Flash Apps, Print Pieces, and Website Designs. no project can be of two types because (with one exception) they all require different logic to be displayed in the view. I am not sure whether to put the media type in the project table and join directly to the types table or use an intermediate table to define the relationships like the tags. I also thinking about parent-types/sub-types i.e.; Blogs, Projects - Flash, Projects - Web. I would really appreciate some direction.
Also maybe some help on how to efficiently query for the projects with the given solution.

The first think to address is your database engine, MyISAM. The database engine is how MySQL stores the data. For more information regarding MyISAM you can view: http://dev.mysql.com/doc/refman/5.0/en/myisam-storage-engine.html. If you want to have referential integrity (which is recommended), you want your database engine to be InnoDB (http://dev.mysql.com/doc/refman/5.0/en/innodb-storage-engine.html). InnoDB allows you to create foreign keys and enforce that foreign key relationship (I found out the hard way the MyISAM does not). MyISAM is the default engine for MySQL databases. If you are using phpMyAdmin (which is a highly recommended tool for MySQL and PHP development), you can easily change the engine type of the database (See: http://www.electrictoolbox.com/mysql-change-table-storage-engine/).
With that said, searches or queries can be done in both MyISAM and InnoDB database engines. You can also index the columns to make search queries (SELECT statements) faster, but the trade off will be that INSERT statements will take longer. If you database is not huge (i.e. millions of records), you shouldn't see a noticeable difference though.
In terms of your design, there are several things to address. The first thing to understand is an entity relationship diagram or an ERD. This is a diagram of your tables and their corresponding relationships.
There are several types of relationships that can exist: a one-to-one relationship, a one-to-many relationship, a many-to-many relationship, and a hierarchical or recursive relationship . A many-to-many relationship is the most complicated and cannot be produced directly within the database and must be resolved with an intermittent table (I will explain further with an example).
A one-to-one relationship is straightforward. An example of this is if you have an employee table with a list of all employees and a salary table with a list of all salaries. One employee can only have one salary and one salary can only belong to one employee.
With that being said, another element to add to the mix is cardinality. Cardinality refers to whether or not the relationship may exist or must exist. In the previous example of an employee, there has to be a relationship between the salary and the employee (or else the employee may not be paid). This the relationship is read as, an employee must have one and only one salary and a salary may or may not have one and only one employee (as a salary can exist without belonging to an employee).
The phrases "one and only one" refers to it being a one-to-one relationship. The phrases "must" and "may or may not" referring to a relationship requiring to exist or not being required. This translates into the design as my foreign key of salary id in the employee table cannot be null and in the salary table there is no foreign key referencing the employee.
EMPLOYEE
id PRIMARY KEY
name VARCHAR(100)
salary_id NOT NULL UNIQUE
SALARY
id PRIMARY KEY
amount INTEGER NOT NULL
The one-to-many relationship is defined as the potential of having more than one. For example, relating to your portfolio, a client may have one or more projects. Thus the foreign key field in the projects table client_id cannot be unique as it may be repeated.
The many-to-many relationship is defined where more than one can both ways. For example, as you have correctly shown, projects may have one or more tags and tags may assigned to one or more projects. Thus, you need the PROJECT_TAGS table to resolve that many-to-many.
In regards to addressing your question directly, you will want to create a separate media type table and if any potential exists whatsoever where a project is can be associated to multiple types, you would want to have an intermittent table and could add a field to the project_media_type table called primary_type which would allow you to distinguish the project type as primarily that media type although it could fall under other categories if you were to filter by category.
This brings me to recursive relationships. Because you have the potential to have a recursive relationship or media_types you will want to add a field called parent_id. You would add a foreign key index to parent_id referencing the id of the media_type table. It must allow nulls as all of your top level parent media_types will have a null value for parent_id. Thus to select all parent media_types you could use:
SELECT * FROM media_type WHERE parent_id IS NULL
Then, to get the children you loop through each of the parents and could use the following query:
SELECT * FROM media_type WHERE parent_id = {$media_type_row->id}
This would need to be in a recursive function so you loop until there are no more children. An example of this using PHP related to hierarchical categories can be viewed at recursive function category database.
I hope this helps and know it's a lot but essentially, I tried to highlight a whole semester of database design and modeling. If you need any more information, I can attach an example ERD as well.

Another posibble idea is to add columns to projects table that would satisfy all media types needs and then while editting data you will use only certain columns needed for given media type.
That would be more database efficient (less joins).
If your media types are not very different in columns you need I would choose that aproach.
If they differ a lot, I would choose #cosmicsafari recommendation.

Why don't you take whats common to all and put that in a table & have the specific stuff in tables themelves, that way you can search through all the titles & descriptions in one.
Basic Table
- ID int
- Name varchar()
- Title varchar()
etc
Blogs
-ID int (just an auto_increment key)
-basicID int (this matches the id of the item in the basic table)
etc
Have one for each media type. That way you can do a search on all the descriptions & titles at the one time and load the appropriate data when the person clicked through the link from a search page. (I assume thats the sort of functionality you mean when you say you want to be able to let people search.)

Should I split the mySQL tables?

Lets take the example from Yelp: http://www.yelp.com/boston
You can see that it's a website with several different categories, each category containing a listing of places. Should I include all the different places/listing in a single table, or let each category have its own tables?
EDIT: this means having tables 'places_restaurants' and 'places_nightlife', instead of just having the single table 'places' and every entry of every different category will be stored in one huge table... Will this affect performance?

One table per category will require that you CREATE a table every time there's a new category. I'd prefer CATEGORY and PLACE tables, with a one-to-many or many-to-many relationship between them.

You should keep all of the categories in the same table and then have a CategoryID which actually maps each category to the specific / desired category. Your application should be built in a way that is inherently extensible which creating tables each time is definitely not.

It depends. You could normalize the database so that all categories are in their own table, and only referred to from other tables by a foreign key. But there are some arguments that performance outweighs normalization, and so it may be beneficial to keep category names both in their own table of record, and also to include a category name column in other, frequently-joined tables.
If you took the second approach, you would need to ensure data integrity by implementing UPDATE and DELETE triggers such that whenever a category changes in the table of record (presumably, not often), that other tables containing copies of category names also get updated.

It still depends on the application ,also, all the categories is a many to many fields with a main table and of course beliving u have some unique columns in each table

how to store data with many categories and many properties efficiently?

We have a large number of data in many categories with many properties, e.g.
category 1: Book
properties: BookID, BookName, BookType, BookAuthor, BookPrice
category 2: Fruit
properties: FruitID, FruitName, FruitShape, FruitColor, FruitPrice
We have many categories like book and fruit. Obviously we can create many tables for them (MySQL e.g.), and each category a table. But this will have to create too many tables and we have to write many "adapters" to unify manipulating data.
The difficulties are:
1) Every category has different properties and this results in a different data structure.
2) The properties of every categoriy may have to be changed at anytime.
3) Hard to manipulate data if each category a table (too many tables)
How do you store such kind of data?

You can separate the database into two parts: Definition Tables and Data Tables. Basically the Definition Tables is used to interpret the Data Tables where the actual data is stored (some would say that the definition tables is more elegant if represented in XML).
The following is the basic idea.
Definition Tables:
TABLE class
class_id (int)
class_name (varchar)
TABLE class_property
property_id (int)
class_id (int)
property_name (varchar)
property_type (varchar)
Data Tables:
TABLE object
object_id (int)
class_id (varchar)
TABLE object_property
property_id (int)
property_value (varchar)
It would be best if you could also create additional Layer to interpret the structure so as to make it easier for the Data Layer to operate on the data. And you must of course take into consideration performance, ease of query, etc.
Just my two cents, I hope it could be of any help.
Regards.

If your data collection isn't too big, the Entity-Attribute-Value (EAV) model may fit nicely the bill.
In a nutshell, this structure allows the definition of Categories, the list of [required or optional] Attributes (aka properties) the entities in such category include etc, in a set of tables known as the meta-data, the logical schema of the data, if you will. The entity instances are stored in two tables a header and a values tables, whereby each attribute is stored in a single [SQL] record of the later table (aka "vertical" storage: what used to be a record in traditional DBMS model is made of several records of the value table).
This format is very practical in particular for its flexibility: it allows both late and on-going changes in the logical schema (addition of new categories, additions/changes in the attributes of a given category etc.), as well the implicit data-driven handling of the underlying catalog's logical schema, at the level of the application. The main drawbacks of this format are the [somewhat] more sophisticated, abstract, implementation and, mainly, some limitations with regards to scaling etc. when the catalog size grows, say in the million+ entities range.
See the EAV model described in more details in this SO answer of mine.

Triggered by this question and other similar ones, I wrote a blog post on how to handle such cases using a graph database. In short, graph databases don't have the problem "how to force a tree/hierarchy into tables" as there's simply no need for it: you store your tree structure as it is. They're not good at everything (like for example creating reports) but this is a case where graph databases shine.

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.