CakePHP database design: associations for job board - php

Hi everyone.
I'm currently building a job board with CakePHP, and a little help designing the database would be appreciated!
I have a jobs table with several foreign keys:
id, recruiter_id, title, sector_id, division_id, experience_id, etc.
The associated tables (sectors, divisions and experiences) all have the same structure (id, name and job_count), sometimes with one or two extra fields (like company_count for sectors).
So I would like to know if there is a better way to design these tables. I thought of putting the three of them in one table named lists with the columns id, value and list_name. With this configuration I would need just one query to get all the lists instead of three.
My question is: what is the "right" solution? Or maybe there's another one?

Seems kind of repetitive to have them in separate tables, when really they're all the same thing - properties of a job, and would have VERY similar table structures.
I would think you could create a single table for "job_properties" or something.
Each property could have a unique slug (if you wanted) or just use its id.
// job_properties table example
id
slug // (optional or could be called "key" if you prefer)
type // (optional - "sector", "division", "min_exp")
name // (for use on the names of things like "marketing" or "technology")
value // (int - for use on things like minimum experience)
Then each Job would hasMany JobProperty. It would also allow any job to have more than one sector if that is ever needed.
This would allow you to pull based on if a job has a particular property or set of properties and seems overall cleaner and more consolidated while not making it too obfuscated.
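A minimal sketch of this single-table idea, written in Python with SQLite so that it runs on its own (the thread itself uses CakePHP and MySQL). The job_id column, the function names and the sample rows are assumptions for illustration; a hasMany association needs some such foreign key on job_properties.

```python
import sqlite3


def build_job_board():
    """Create an in-memory database with jobs and one job_properties table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE jobs (
            id    INTEGER PRIMARY KEY,
            title TEXT NOT NULL
        );
        CREATE TABLE job_properties (
            id     INTEGER PRIMARY KEY,
            job_id INTEGER NOT NULL REFERENCES jobs(id),  -- assumed FK for hasMany
            type   TEXT NOT NULL,   -- "sector", "division", "min_exp"
            name   TEXT,            -- e.g. "marketing", "technology"
            value  INTEGER          -- e.g. minimum years of experience
        );
    """)
    conn.executemany(
        "INSERT INTO jobs (id, title) VALUES (?, ?)",
        [(1, "PHP developer"), (2, "Brand manager")],
    )
    conn.executemany(
        "INSERT INTO job_properties (job_id, type, name, value) VALUES (?, ?, ?, ?)",
        [
            (1, "sector", "technology", None),
            (1, "min_exp", None, 3),
            (2, "sector", "marketing", None),
            (2, "sector", "technology", None),  # a job with two sectors
        ],
    )
    return conn


def jobs_with_property(conn, prop_type, name):
    """Return the titles of all jobs that have the given property."""
    rows = conn.execute(
        """
        SELECT DISTINCT jobs.title
        FROM jobs
        JOIN job_properties ON job_properties.job_id = jobs.id
        WHERE job_properties.type = ? AND job_properties.name = ?
        ORDER BY jobs.title
        """,
        (prop_type, name),
    )
    return [title for (title,) in rows]
```

All three former lookup tables are read with one query shape, and the second job shows that a job can carry more than one sector without a schema change.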

I think I found a solution by using a taxonomy system. I created a terms table which contains the list of all terms that can be associated (sector, division, type of contract, etc.).
Table terms: id, name, type
And I created a second table, term_relationships, which contains all the associations, including the name of the model that is associated.
Table term_relationships: id, ref, ref_id, term_id
"ref" refers to the associated model (for example Job or Applicant in my case), "ref_id" refers to the associated record (which job or which applicant) and term_id refers to the term that is associated. I think this is the cleanest and most extensible solution.
Thanks all for your help (especially Grafikart, where I got the idea from), and I hope this topic can help someone else!
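The two tables above could look roughly like this. This is only a sketch in Python with SQLite (not CakePHP); the column types, the sample rows and the terms_for helper are assumptions, while the table and column names come from the post.

```python
import sqlite3


def build_taxonomy():
    """Create terms and term_relationships with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE terms (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            type TEXT NOT NULL      -- "sector", "division", "contract", ...
        );
        CREATE TABLE term_relationships (
            id      INTEGER PRIMARY KEY,
            ref     TEXT NOT NULL,      -- associated model: "Job", "Applicant"
            ref_id  INTEGER NOT NULL,   -- id of the job or applicant
            term_id INTEGER NOT NULL REFERENCES terms(id)
        );
    """)
    conn.executemany(
        "INSERT INTO terms (id, name, type) VALUES (?, ?, ?)",
        [(1, "Technology", "sector"), (2, "Sales", "division"), (3, "Permanent", "contract")],
    )
    conn.executemany(
        "INSERT INTO term_relationships (ref, ref_id, term_id) VALUES (?, ?, ?)",
        [("Job", 10, 1), ("Job", 10, 3), ("Applicant", 10, 2)],
    )
    return conn


def terms_for(conn, ref, ref_id):
    """Return (type, name) pairs for one record of one model."""
    rows = conn.execute(
        """
        SELECT terms.type, terms.name
        FROM term_relationships
        JOIN terms ON terms.id = term_relationships.term_id
        WHERE term_relationships.ref = ? AND term_relationships.ref_id = ?
        ORDER BY terms.id
        """,
        (ref, ref_id),
    )
    return rows.fetchall()
```

Note that Job 10 and Applicant 10 share the same ref_id without clashing, because the ref column is always part of the lookup. The trade-off is that the database cannot enforce a foreign key on ref_id, since it may point at different tables.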

Related

PHP + MySQL - one table holding reference to IDs from multiple tables

I just would like to know what are the most common approaches to get a table to hold a reference to IDs from multiple tables.
I have a system with modules like customers, suppliers, orders, etc. and I would like to add a "Notes" functionality to all of those modules to be able to add/read notes.
As one customer/supplier/order can have multiple notes, I have chosen the one-to-many relation way and so the notes in their table should refer to the particular item id in a separate column.
But as I will refer to IDs from multiple tables, their IDs will be overlapping and I need a way to say in which particular table to search for that ID.
I don't want to create exactly the same notes module for each of my modules, so I would rather keep all notes in one table. The notes differ only in which module they belong to.
Should I:
- store the particular table name in the notes table? But that name can change later and the system will break.
- introduce something like a unique ID or a hash for all of my modules, which would be unique across the different tables, and store that ID in the notes table?
- create a separate notes table for every module and not worry about code/class/table duplication?
Thanks for your ideas!
We do something similar with notes that can be attached to many objects. Each of our objects has a unique class id (we store each type of object in its own table), and we store the unique class id + specific object id in the notes table.
We then just have to maintain a lookup of unique class id -> table name. By using the unique class id + object id as the key we ensure that the same id in different tables isn't an issue.
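As a rough illustration of the class id + object id key, here is a sketch in Python with SQLite. The class ids, table names, sample notes and the CLASS_TABLES lookup are all hypothetical; the answer does not say where its lookup is stored, so keeping it in application code is an assumption.

```python
import sqlite3

# Hypothetical lookup of class id -> table name. If a table is renamed,
# only this mapping changes; the rows in notes stay valid.
CLASS_TABLES = {1: "customers", 2: "orders"}


def build_notes_db():
    """Create two module tables plus one shared notes table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders    (id INTEGER PRIMARY KEY, reference TEXT);
        CREATE TABLE notes (
            id        INTEGER PRIMARY KEY,
            class_id  INTEGER NOT NULL,   -- which kind of object
            object_id INTEGER NOT NULL,   -- which row of that kind
            body      TEXT NOT NULL
        );
        CREATE INDEX notes_owner ON notes (class_id, object_id);
    """)
    # Customer 7 and order 7 deliberately share the same id.
    conn.execute("INSERT INTO customers (id, name) VALUES (7, 'Acme')")
    conn.execute("INSERT INTO orders (id, reference) VALUES (7, 'PO-1')")
    conn.executemany(
        "INSERT INTO notes (class_id, object_id, body) VALUES (?, ?, ?)",
        [(1, 7, "Prefers email"), (2, 7, "Ship before Friday"), (1, 7, "VIP")],
    )
    return conn


def notes_for(conn, class_id, object_id):
    """Return the note bodies attached to one object, oldest first."""
    rows = conn.execute(
        "SELECT body FROM notes WHERE class_id = ? AND object_id = ? ORDER BY id",
        (class_id, object_id),
    )
    return [body for (body,) in rows]
```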

Foreign key vs. SET data type MySQL

I have a MySQL database set up with a list of all my movies, which I imported from an MS Access database. One field contains the possible values for the genre of the movie. Movies can have more than one genre, so I need a data type which supports this. In Access I could link a 'genre' table to the 'genre' field in my 'movies' table, so I could choose none, one or multiple genres per movie. When I switched to MySQL I used the SET data type to define all the possible values. So far everything is running perfectly.
I am now trying to set up a table in HTML/PHP to show the MySQL table. I want to be able to sort it on title, genre, quality, rating, etc. But for sorting on genre, I would need the possible values from the SET data type. I don't know if it is possible to get those values with some PHP code, and after looking around on the web for a while, I didn't find many applications that use the SET data type, presumably because of its drawbacks.
So I started looking into the foreign key possibility. The problem I have here is that, as far as I know, a foreign key column can only contain one value, which puts me right back at the start of my problem. I do like the idea of a foreign key, because it would make it much easier for me to add a new genre to the list.
Is there a possibility I am overlooking? Is it possible to either get the values from the SET type to php or to use a foreign key with multiple possibilities for one record?
I know I can also put every genre in my php script manually, but I'd like to have it all on one place. So that if I add a movie with a genre I haven't defined yet, I can just update it at one place and everything else adapts to it.
Dagon is absolutely right here - you have an issue with the structure of the tables in your back end. You are wanting to model a many to many relationship when at the moment with your current back end the best you can do is a one to many relationship.
To review:
You have individual films that can have many genres
And you have individual genres that are related to many films
Relational databases don't model a many-to-many relationship with a single relationship; they break it down into two one-to-many relationships, joined through a third table.
To model a many to many relationship you need three tables
A film table (which I think you already have)
A genre table (which I think you already have)
A junction table which as Dagon suggests will consist of two fields film id and genre id.
You then set up two separate one to many relationships. One from the film table to the junction table and one from the genre table to the junction table.
Now if you want to know all the genres a film is in you simply filter the junction table on the relevant film id and if you want to know all the films with a certain genre you filter the junction table on the genre id.
Set up lookups to relate your genre ids to textual descriptions, and you are free to change a textual description as much as you want; if you've done it right, the change shows up everywhere that genre is displayed in your forms.
This is an absolute fundamental concept of the algebra of sets behind the design of SQL and relational database design.
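The three-table layout described above can be sketched as follows, in Python with SQLite rather than PHP and MySQL so that it runs standalone. The table names, column names and sample rows are assumptions; only the shape (film table, genre table, junction table of film id and genre id) comes from the answer.

```python
import sqlite3


def build_films_db():
    """Create films, genres and the junction table linking them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE films  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE genres (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
        CREATE TABLE film_genre (
            film_id  INTEGER NOT NULL REFERENCES films(id),
            genre_id INTEGER NOT NULL REFERENCES genres(id),
            PRIMARY KEY (film_id, genre_id)   -- each pairing stored once
        );
    """)
    conn.executemany("INSERT INTO films (id, title) VALUES (?, ?)",
                     [(1, "Film A"), (2, "Film B")])
    conn.executemany("INSERT INTO genres (id, name) VALUES (?, ?)",
                     [(1, "Horror"), (2, "Sci-Fi"), (3, "Action")])
    conn.executemany("INSERT INTO film_genre (film_id, genre_id) VALUES (?, ?)",
                     [(1, 1), (1, 2), (2, 2), (2, 3)])
    return conn


def genres_of(conn, film_id):
    """Filter the junction table on the film id."""
    rows = conn.execute(
        """
        SELECT genres.name
        FROM film_genre
        JOIN genres ON genres.id = film_genre.genre_id
        WHERE film_genre.film_id = ?
        ORDER BY genres.name
        """,
        (film_id,),
    )
    return [name for (name,) in rows]


def films_in(conn, genre_id):
    """Filter the junction table on the genre id."""
    rows = conn.execute(
        """
        SELECT films.title
        FROM film_genre
        JOIN films ON films.id = film_genre.film_id
        WHERE film_genre.genre_id = ?
        ORDER BY films.title
        """,
        (genre_id,),
    )
    return [title for (title,) in rows]
```

Because films only store genre ids, renaming a genre is a single UPDATE on the genres table, and the list of possible genres for a sort menu is just SELECT name FROM genres.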

Backend app in OO PHP: Structuring classes/tables efficiently

I'm currently working on an app backend (business directory). Main "actor" is an "Entry", which will have:
- main category
- subcategory
- tags (instead of unlimited sub-levels of division)
I'm pretty new to OOP but I still want to use it here. The database is MySql and I'll be using PDO.
In an attempt to figure out what database table structure should I use in order to support the above classification of entries, I was thinking about a solution that Wordpress uses - establish relationship between an entry and cats/subcats/tags through several tables (terms, taxonomies, relationships). What keeps me from this solution at the moment is the fact that each relationship of any kind is represented by a row in the relationships table. Given 50,000 entries I would have, attaching to a particular entry: main cat, subcat and up to 15 tags might slow down the app (or I am wrong)?
I then learned a bit about Table Data Gateway, which seemed an excellent solution because I liked the idea of having one table per class, but then I read that there is virtually no way of successfully overcoming the impedance mismatch between the object-oriented and relational models.
Are there any other approaches that you may see fit for this situation? I think I will be going with:
tblentry
tblcategory
tblsubcategory
tbltag
structure. Relationships would be based on the parent IDs, but I'm wondering whether that is enough. Can I use foreign keys and cascade delete options here (that is something I am not too familiar with, but it seems to me a more intuitive way of defining relationships between the elements in tables)?
having a table where you store the relationship between your table is a good idea, and through indexes and careful thinking you can achieve very fast results.
since each entry must represent a different kind of link between two entities (subcategory to main entry, tag to subcategory) you need at least (and at the very most) three fields:
id1 (or the unique id of the first entity)
linkid (linking to a fourth table where each link is described)
id2 (or the unique id of the second entity)
those three fields can and should be indexed.
now the fourth table to achieve this kind of many-to-many relationship will describe the nature of the link. since many different type of relationship will exist in the table, you can't keep what the type is (child of, tag of, parent of) in the same table.
that fourth table (reference) could look like this:
id  nature     table1  table2
1   parent of  entry   tags
2   tag of     tags    entry
the table 1 field tells you which table the first id refers to, likewise with table2
the id is the value stored in the linkid field of your relationship table. only the id field should be indexed. the nature field is more for the human reader than for joining tables or organizing data

Advice on database design for portfolio website

So I'm a visual designer type guy who has learned a respectable amount of PHP and a little SQL.
I am putting together a personal multimedia portfolio site. I'm using CI and loving it. The problem is I don't know squat about DB design and I keep rewriting (and breaking) my tables. Here is what I need.
I have a table to store the projects:
I want to do fulltext searches on titles and descriptions, so I think this needs to be MyISAM.
PROJECTS
id
name (admin-only human readable)
title (headline for visitors to read)
description
date (the date the project was finished)
posted (timestamp when the project was posted)
Then I need tags:
I think I've figured this out from researching.
TAGS
tag_id
tag_name
PROJECT_TAGS
project_id (foreign key PROJECTS TABLE)
tag_id (foreign key TAGS TABLE)
Here is the problem: I have FOUR media types: Photo Albums, Flash Apps, Print Pieces, and Website Designs. No project can be of two types because (with one exception) they all require different logic to be displayed in the view. I am not sure whether to put the media type in the projects table and join directly to the types table, or to use an intermediate table to define the relationships, as with the tags. I am also thinking about parent types/sub-types, e.g. Blogs, Projects - Flash, Projects - Web. I would really appreciate some direction.
Also maybe some help on how to efficiently query for the projects with the given solution.
The first thing to address is your database engine, MyISAM. The database engine determines how MySQL stores the data. For more information regarding MyISAM you can view: http://dev.mysql.com/doc/refman/5.0/en/myisam-storage-engine.html. If you want to have referential integrity (which is recommended), you want your database engine to be InnoDB (http://dev.mysql.com/doc/refman/5.0/en/innodb-storage-engine.html). InnoDB allows you to create foreign keys and enforces the foreign key relationship (I found out the hard way that MyISAM does not). MyISAM was the default engine before MySQL 5.5; InnoDB has been the default since then. If you are using phpMyAdmin (which is a highly recommended tool for MySQL and PHP development), you can easily change the engine type of a table (See: http://www.electrictoolbox.com/mysql-change-table-storage-engine/).
With that said, ordinary searches and queries can be done in both the MyISAM and InnoDB engines. FULLTEXT indexes, however, were MyISAM-only before MySQL 5.6; from 5.6 on, InnoDB supports them as well. You can also index the columns to make search queries (SELECT statements) faster, but the trade-off is that INSERT statements will take longer. If your database is not huge (i.e. millions of records), you shouldn't see a noticeable difference though.
In terms of your design, there are several things to address. The first thing to understand is an entity relationship diagram or an ERD. This is a diagram of your tables and their corresponding relationships.
There are several types of relationships that can exist: a one-to-one relationship, a one-to-many relationship, a many-to-many relationship, and a hierarchical or recursive relationship. A many-to-many relationship is the most complicated: it cannot be represented directly within the database and must be resolved with an intermediate table (I will explain further with an example).
A one-to-one relationship is straightforward. An example of this is if you have an employee table with a list of all employees and a salary table with a list of all salaries. One employee can only have one salary and one salary can only belong to one employee.
With that being said, another element to add to the mix is cardinality, and in particular optionality: whether the relationship may exist or must exist. In the previous example of an employee, there has to be a relationship between the salary and the employee (or else the employee may not be paid). Thus the relationship is read as: an employee must have one and only one salary, and a salary may or may not have one and only one employee (as a salary can exist without belonging to an employee).
The phrase "one and only one" refers to it being a one-to-one relationship. The phrases "must" and "may or may not" refer to whether the relationship is required to exist. This translates into the design as follows: the salary_id foreign key in the employee table cannot be null, and the salary table has no foreign key referencing the employee.
EMPLOYEE
id PRIMARY KEY
name VARCHAR(100)
salary_id NOT NULL UNIQUE
SALARY
id PRIMARY KEY
amount INTEGER NOT NULL
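To see what NOT NULL and UNIQUE on salary_id buy you, here is a small sketch of the two tables above in Python with SQLite (chosen only so the example is runnable on its own). The INTEGER type on salary_id, the REFERENCES clause, the sample salary and the try_insert_employee helper are assumptions added for illustration.

```python
import sqlite3


def build_payroll():
    """Create the SALARY and EMPLOYEE tables with one sample salary."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        PRAGMA foreign_keys = ON;   -- SQLite only enforces FKs when asked to
        CREATE TABLE salary (
            id     INTEGER PRIMARY KEY,
            amount INTEGER NOT NULL
        );
        CREATE TABLE employee (
            id        INTEGER PRIMARY KEY,
            name      VARCHAR(100),
            -- NOT NULL: an employee must have a salary.
            -- UNIQUE:   a salary belongs to at most one employee.
            salary_id INTEGER NOT NULL UNIQUE REFERENCES salary(id)
        );
        INSERT INTO salary (id, amount) VALUES (1, 50000);
    """)
    return conn


def try_insert_employee(conn, name, salary_id):
    """Insert an employee; return False if a constraint rejects the row."""
    try:
        conn.execute(
            "INSERT INTO employee (name, salary_id) VALUES (?, ?)",
            (name, salary_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

The first employee given salary 1 is accepted. A second employee with the same salary_id is rejected by UNIQUE, an employee with no salary_id is rejected by NOT NULL, and a salary_id that does not exist is rejected by the foreign key.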
A one-to-many relationship is one where a record on one side may be related to more than one record on the other. For example, relating to your portfolio, a client may have one or more projects. Thus the client_id foreign key field in the projects table cannot be unique, as it may be repeated.
A many-to-many relationship is one where there can be more than one in both directions. For example, as you have correctly shown, projects may have one or more tags and tags may be assigned to one or more projects. Thus, you need the PROJECT_TAGS table to resolve that many-to-many.
To address your question directly, you will want to create a separate media type table. If there is any potential whatsoever for a project to be associated with multiple types, you would want an intermediate table, project_media_type, and could add a field to it called primary_type. That would let you mark one media type as the project's primary type, even though the project could fall under other categories if you were to filter by category.
This brings me to recursive relationships. Because you have the potential for a recursive relationship among media types, you will want to add a field called parent_id. You would add a foreign key on parent_id referencing the id of the media_type table. It must allow nulls, as all of your top-level parent media types will have a null value for parent_id. Thus, to select all parent media types you could use:
SELECT * FROM media_type WHERE parent_id IS NULL
Then, to get the children you loop through each of the parents and could use the following query:
SELECT * FROM media_type WHERE parent_id = {$media_type_row->id}
This would need to be in a recursive function so you loop until there are no more children. An example of this using PHP related to hierarchical categories can be viewed at recursive function category database.
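A sketch of such a recursive function, in Python with SQLite instead of PHP so that it can be run as is. The sample rows and the children_of and build_tree names are made up for illustration; the media_type table with a nullable parent_id follows the description above. The query uses a bound parameter rather than interpolating the id into the SQL string.

```python
import sqlite3


def build_media_types():
    """Create a self-referencing media_type table with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE media_type (
            id        INTEGER PRIMARY KEY,
            name      TEXT NOT NULL,
            parent_id INTEGER REFERENCES media_type(id)  -- NULL for top level
        );
    """)
    conn.executemany(
        "INSERT INTO media_type (id, name, parent_id) VALUES (?, ?, ?)",
        [(1, "Projects", None), (2, "Blogs", None), (3, "Flash", 1), (4, "Web", 1)],
    )
    return conn


def children_of(conn, parent_id):
    """Return (id, name) rows whose parent is parent_id (None = top level)."""
    if parent_id is None:
        cursor = conn.execute(
            "SELECT id, name FROM media_type WHERE parent_id IS NULL ORDER BY id"
        )
    else:
        cursor = conn.execute(
            "SELECT id, name FROM media_type WHERE parent_id = ? ORDER BY id",
            (parent_id,),
        )
    return cursor.fetchall()


def build_tree(conn, parent_id=None):
    """Recurse until a media type has no more children."""
    return [
        {"name": name, "children": build_tree(conn, type_id)}
        for type_id, name in children_of(conn, parent_id)
    ]
```

This issues one query per node, which is fine for a handful of media types; for large trees a single query plus in-memory grouping would be cheaper.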
I hope this helps and know it's a lot but essentially, I tried to highlight a whole semester of database design and modeling. If you need any more information, I can attach an example ERD as well.
Another possible idea is to add columns to the projects table that cover the needs of all media types, and then, while editing data, use only the columns needed for the given media type.
That would be more database efficient (fewer joins).
If your media types do not differ much in the columns they need, I would choose that approach.
If they differ a lot, I would choose @cosmicsafari's recommendation.
Why don't you take what's common to all and put that in one table, and keep the specific stuff in tables of their own? That way you can search through all the titles and descriptions in one go.
Basic Table
- ID int
- Name varchar()
- Title varchar()
etc
Blogs
-ID int (just an auto_increment key)
-basicID int (this matches the id of the item in the basic table)
etc
Have one for each media type. That way you can search all the descriptions and titles at once and load the appropriate data when the person clicks through the link from a search page. (I assume that's the sort of functionality you mean when you say you want to let people search.)
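A rough sketch of this shared-table layout, in Python with SQLite. The description and body columns, the sample rows and both helper functions are assumptions for illustration; the basic table and the blogs table with a basicID column follow the outline above. A plain LIKE stands in for the fulltext search.

```python
import sqlite3


def build_portfolio():
    """Create the shared basic table and one media-specific table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE basic (
            id          INTEGER PRIMARY KEY,
            name        TEXT,
            title       TEXT,
            description TEXT
        );
        CREATE TABLE blogs (
            id      INTEGER PRIMARY KEY,
            basicID INTEGER NOT NULL REFERENCES basic(id),
            body    TEXT    -- hypothetical blog-only column
        );
    """)
    conn.executemany(
        "INSERT INTO basic (id, name, title, description) VALUES (?, ?, ?, ?)",
        [
            (1, "blog-launch", "Launch notes", "Notes about the site launch"),
            (2, "album-paris", "Paris photos", "A photo album"),
        ],
    )
    conn.execute("INSERT INTO blogs (basicID, body) VALUES (1, 'Full post text')")
    return conn


def search_titles(conn, text):
    """Search every media type at once through the shared table."""
    rows = conn.execute(
        """
        SELECT id, title FROM basic
        WHERE title LIKE '%' || ? || '%' OR description LIKE '%' || ? || '%'
        ORDER BY id
        """,
        (text, text),
    )
    return rows.fetchall()


def blog_details(conn, basic_id):
    """Load the blog-specific data once a search result is clicked."""
    return conn.execute(
        """
        SELECT basic.title, blogs.body
        FROM blogs
        JOIN basic ON basic.id = blogs.basicID
        WHERE blogs.basicID = ?
        """,
        (basic_id,),
    ).fetchone()
```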

database table design for some unknown data

So, not having come from a database design background, I've been tasked with designing a web app where the end user will be entering products, and specs for their products. Normally I think I would just create rows for each of the types of spec that they would be entering. Instead, they have a variety of products that don't share the same spec types, so my question is, what's the most efficient and future-proof way to organize this data? I was leaning towards pushing a serialized object into a generic "data" row, but then are you able to do full-text searches on this data? Any other avenues to explore?
split products and specifications into two tables like this:
products
id name
specifications
id name value product_id
get all the specifications of a product when you know the product id:
SELECT name,
value
FROM specifications
WHERE product_id = ?;
add a specification to a product when you know the product id, the specification's name and the value of said specification:
INSERT INTO specifications(
name,
value,
product_id
) VALUES(
?,
?,
?
);
so before you can add specifications to a product, this product must exist. also, you can't reuse specifications for several products. that would require a somewhat more complex solution :) namely...
three tables this time:
products
id name
specifications
id name value
products_specifications
product_id specification_id
get all the specifications of a product when you know the product id:
SELECT specifications.name,
specifications.value
FROM specifications
JOIN products_specifications
ON products_specifications.specification_id = specifications.id
WHERE products_specifications.product_id = ?;
now, adding a specification becomes a little bit more tricky, cause you have to check if that specification already exists. so this will be a little heavier than the first way of doing this, since there are more queries on the db, and there's more logic in the application.
first, find the id of the specification:
SELECT id
FROM specifications
WHERE name = ?
AND value = ?;
if no id is returned, this means that said specification doesn't exist, so it must be created:
INSERT INTO specifications(
name,
value
) VALUES(
?,
?
);
next, either use the id from the select query, or get the last insert id to find the id of the newly created specification. use that id together with the id of the product that's getting the new specification, and link the two together:
INSERT INTO products_specifications(
product_id,
specification_id
) VALUES(
?,
?
);
however, this means that you have to create one row for every specific specification. e.g. if you have size for shoes, there would be one row for every known shoe size
specifications
id name value
1 size 7
2 size 7½
3 size 8
and so on. i think this should be enough though.
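the select / insert / link steps of the three-table version can be wrapped in one function. below is a sketch in Python with SQLite, just so it runs without a MySQL server. the UNIQUE constraint on (name, value), the sample products and the function names are my assumptions, not part of the schema above.

```python
import sqlite3


def build_catalog():
    """Create products, specifications and the linking table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE specifications (
            id    INTEGER PRIMARY KEY,
            name  TEXT NOT NULL,
            value TEXT NOT NULL,
            UNIQUE (name, value)    -- assumed: one row per distinct spec
        );
        CREATE TABLE products_specifications (
            product_id       INTEGER NOT NULL REFERENCES products(id),
            specification_id INTEGER NOT NULL REFERENCES specifications(id),
            PRIMARY KEY (product_id, specification_id)
        );
        INSERT INTO products (id, name) VALUES (1, 'Running shoe'), (2, 'Boot');
    """)
    return conn


def add_specification(conn, product_id, name, value):
    """Find or create the specification, then link it to the product."""
    row = conn.execute(
        "SELECT id FROM specifications WHERE name = ? AND value = ?",
        (name, value),
    ).fetchone()
    if row is None:
        # Not found: create it and take the last insert id.
        cursor = conn.execute(
            "INSERT INTO specifications (name, value) VALUES (?, ?)",
            (name, value),
        )
        spec_id = cursor.lastrowid
    else:
        spec_id = row[0]
    # OR IGNORE makes linking the same pair twice harmless.
    conn.execute(
        "INSERT OR IGNORE INTO products_specifications "
        "(product_id, specification_id) VALUES (?, ?)",
        (product_id, spec_id),
    )
    return spec_id


def specs_of(conn, product_id):
    """Return (name, value) pairs for one product."""
    rows = conn.execute(
        """
        SELECT specifications.name, specifications.value
        FROM specifications
        JOIN products_specifications
          ON products_specifications.specification_id = specifications.id
        WHERE products_specifications.product_id = ?
        ORDER BY specifications.name, specifications.value
        """,
        (product_id,),
    )
    return rows.fetchall()
```

two products that both get size 8 end up sharing one specifications row. if several writers can add the same specification at the same moment, the select-then-insert pair needs a transaction or a retry on the unique constraint.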
You could take a look at using an EAV model.
I've never built a products database, but I can point you to a data model for that. It's one of over 200 models available for the taking, at Database Answers. Here is the model
If you don't like this one, you can find 15 different data models for Product oriented databases. Click on "Data Models" to get a list and scroll down to "Products".
You should pick up some good design ideas there.
This is a pretty common problem - and there are different solutions for different scenarios.
If the different types of product and their attributes are fixed and known at development time, you could look at the description in Craig Larman's book (http://www.amazon.com/Applying-UML-Patterns-Introduction-Object-Oriented/dp/0131489062/ref=sr_1_1/002-2801511-2159202?ie=UTF8&s=books&qid=1194351090&sr=1-1) - there's a section on object-relational mapping and how to handle inheritance.
This boils down to "put all the possible columns into one table", "create one table for each sub class" or "put all base class items into a common table, and put sub class data into their own tables".
This is by far the most natural way of working with a relational database - it allows you to create reports, use off-the-shelf tools for object relational mapping if that takes your fancy, and you can use standard concepts such as "not null", indexing etc.
Of course, if you don't know the data attributes at development time, you have to create a flexible database schema.
I've seen 3 general approaches.
The first is the one described by davogotland. I built a solution on similar lines for an ecommerce store; it worked great, and allowed us to be very flexible about the product database. It performed very well, even with half a million products.
Major drawbacks were creating retrieval queries - e.g. "find all products with a price under x, in category y, whose manufacturer is z". It was also tricky bringing in new developers - they had a fairly steep learning curve.
It also forced us to push a lot of relational concepts into the application layer. For instance, it was hard to create foreign keys to other tables (e.g. "manufacturer") and enforce them using standard SQL functionality.
The second approach I've seen is the one you mention - storing the variable data in some kind of serialized format. This is a pain when querying, and suffers from the same drawbacks with the relational model. Overall, I'd only want to use serialization for data you don't have to be able to query or reason about.
The final solution I've seen is to accept that the addition of new product types will always require some level of development effort - you have to build the UI, if nothing else. I've seen applications which use a scaffolding style approach to automatically generate the underlying database structures when a new product type is created.
This is a fairly major undertaking - only really suitable for major projects, though the use of ORM tools often helps.
