I have a MySQL table with clients in it: the usual data (names, addresses, phone numbers, etc.). I also have a field called 'roles', in which a client ticks off what they like to do, e.g. coding, graphic design, illustrations. This data gets pushed into the field serialized, with each role's code. The following is an example:
a:3:{s:4:"_wfa";s:2:"on";s:3:"_CS";s:2:"on";s:3:"_CM";s:2:"on";}
On a 'viewall' page, I need to output all the details for every user that has ticked a specific box. As an example, I need to output all users that have ticked the '_wfa' box.
I hope this makes sense; I can't seem to figure out how to do it.
I hope someone can shed some light on this.
Cheers,
You should never have more than one value in a single column of a row. Store the roles in their own database table, with the user's ID, and you will be able to simply ask MySQL for the users with a role as desired.
CREATE TABLE user_roles (user_id INT, role_name VARCHAR(100));
INSERT INTO user_roles (user_id, role_name) VALUES (1, '_wfa');
INSERT INTO user_roles (user_id, role_name) VALUES (1, '_CS');
INSERT INTO user_roles (user_id, role_name) VALUES (1, '_CM');
SELECT users.id FROM users INNER JOIN user_roles ON users.id = user_roles.user_id WHERE user_roles.role_name = '_wfa';
You should normalise that into a table. Having it serialised means you cannot use any of the benefits of SQL on it, and also that parsing it requires PHP (or custom code in another language).
MySQL, or any database, cannot unserialize data serialized by an external programming language. The only way to get the data out is to pull it from the database and unserialize it in PHP before you can use it.
The only way to get any value out of using a database is to store data in it, using tables and native data types to enforce data consistency. Normalization and referential integrity work to minimize data duplication while enforcing business rules.
Transitioning to SQL, objects become tables (they're like arrays). Object attributes become columns, but when an object contains an array of other objects, that attribute gets promoted to a table of its own. Normalization means taking things like roles and making a code table for them that you can refer to from other tables.
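To make that concrete, here is a minimal sketch of the code-table idea (the names below are illustrative assumptions, not from the question): a lookup table of role codes plus a junction table referencing it, instead of repeating the role string for every user.

-- Sketch only: table and column names are assumptions.
CREATE TABLE roles (
  role_id INT AUTO_INCREMENT PRIMARY KEY,
  role_code VARCHAR(10) NOT NULL UNIQUE -- e.g. '_wfa', '_CS', '_CM'
);
CREATE TABLE user_role_map (
  user_id INT NOT NULL,
  role_id INT NOT NULL,
  PRIMARY KEY (user_id, role_id),
  FOREIGN KEY (role_id) REFERENCES roles (role_id)
);
-- All users who ticked the '_wfa' box:
SELECT u.*
FROM users u
INNER JOIN user_role_map urm ON urm.user_id = u.id
INNER JOIN roles r ON r.role_id = urm.role_id
WHERE r.role_code = '_wfa';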
I have this MySQL table, where contact_id is unique for each user_id.
history:
- hist_id: int(11) auto_increment primary key
- user_id: int(11)
- contact_id: int(11)
- name: varchar(50)
- phone: varchar(30)
From time to time, the server will receive a new list of contacts for a specific user_id and needs to update this table, inserting, deleting or updating data that differs from the previous information.
For example, the current data is:
So, the server receives this data:
And the new data is:
As you can see, the first row (John) was updated, the second row (Mary) was deleted, and another row (Jeniffer) was inserted.
Today what I am doing is deleting all rows with a specific user_id, and inserting the new data. But the autoincrement field (hist_id) is getting bigger and bigger...
Note: the table has about 80 thousand records, and this update will occur 30 times a day or more.
I have some (related) questions:
1. In this scenario, do you think deleting all records from a specific user_id and inserting updated data is a good approach?
2. What about removing the autoincrement field? I don't need it, but I think it is not a good idea to have a table without a primary key.
3. Or maybe the better approach is to loop new data, selecting each user_id / contact_id for comparing values to update?
P.S. By "better approach" I mean the most efficient way.
Thank you so much for any help!
In this scenario, do you think deleting all records from a specific user_id and inserting updated data is a good approach?
Short Answer
No. You should be taking advantage of 'upsert', which is short for 'insert on duplicate key update'. What this means is that if the key pair you're inserting already exists, the specified columns are updated with the specified data. You then shorten your logic and reduce increments. Here's an example, using your table structure, that should work. This also assumes that you have set the (user_id, contact_id) pair to unique.
INSERT INTO history (user_id, contact_id, name, phone)
VALUES
(1, 23, 'James Jr.', '(619)-543-6222')
ON DUPLICATE KEY UPDATE
name=VALUES(name),
phone=VALUES(phone);
This query should retain the contact_id but overwrite the preexisting data with the new data.
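Note that for ON DUPLICATE KEY UPDATE to fire, the (user_id, contact_id) pair needs a unique index. If the table doesn't have one yet, something like this would add it (the key name is a placeholder):

ALTER TABLE history ADD UNIQUE KEY uq_user_contact (user_id, contact_id);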
What about removing the autoincrement field? I don't need it, but I think it is not a good idea to have a table without a primary key.
Primary keys do not imply auto-incremented values. I could have a varchar field as the primary key containing names of fruits and vegetables. Is this optimized for performance? Probably not. There are many situations that might call for auto-increment, and there are definite reasons to avoid it. It all depends on how you wish to access the data and how this can impact future expansion. In your situation, I would start over on the table structure and re-think how you wish to store and access the data. Do you want to write more logic to control the data, or do you want the data to flow naturally by itself? You've made a history table that, at first glance, is functioning more like a hybrid many-to-one crosswalk. Without looking at the remaining table structure, I can't necessarily say on a whim that it's not a good idea. What I can say is that I would do this a bit differently. I will answer this more specifically in the next question.
Or maybe the better approach is to loop new data, selecting each user_id / contact_id for comparing values to update?
I would avoid looping through the data in order to update it. That is a job for SQL and it does this job well. Sometimes we might find ourselves in a situation where we must do this, either to extract data in a specific format or to repair data in some way; however, avoid doing this for inserting or updating the data. It can negatively impact performance, and you will likely paint yourself into a corner.
Back to what I said toward the end of your second question, which will help you see what I am talking about. I am going to assume that user_id is an auto-incremented primary key in your user table. I will do some guesstimation here and show you an example of how you can redesign your user, contact and phone number structure. The following is a quick model I threw together that shows the foreign key relationships between the tables.
Note: The column names and overall data arrangement could be done differently but I did this quickly to give you a decent example of a normalized database structure. All of the foreign keys have a structural layout which separates your data in a way that enables you to control the flow of data as it enters and leaves your system. Here's the screenshot of the database model I threw together using MySQL Workbench.
[Screenshot of the database model omitted; source: xonos.net]
Here's the SQL so that you can look at it more closely.
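(The original dump is not reproduced here; what follows is a minimal reconstruction based on the description below, so table and column names are assumptions rather than the author's exact schema.)

-- Sketch only: reconstructed from the description; names and types are assumptions.
CREATE TABLE person (
  person_id INT AUTO_INCREMENT PRIMARY KEY,
  name_first VARCHAR(50) NOT NULL,
  name_last VARCHAR(50) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE users (
  user_id INT AUTO_INCREMENT PRIMARY KEY,
  person_id INT NOT NULL,
  FOREIGN KEY (person_id) REFERENCES person (person_id)
    ON UPDATE CASCADE ON DELETE CASCADE
) ENGINE=InnoDB;

CREATE TABLE contacts (
  contact_id INT AUTO_INCREMENT PRIMARY KEY,
  person_id INT NOT NULL,
  FOREIGN KEY (person_id) REFERENCES person (person_id)
    ON UPDATE CASCADE ON DELETE CASCADE
) ENGINE=InnoDB;

CREATE TABLE phone_numbers (
  phone_id INT AUTO_INCREMENT PRIMARY KEY,
  person_id INT NOT NULL,
  phone VARCHAR(30) NOT NULL,
  FOREIGN KEY (person_id) REFERENCES person (person_id)
    ON UPDATE CASCADE ON DELETE CASCADE
) ENGINE=InnoDB;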
You'll notice that the "person" table is extracted from users but shares data with contacts. This enables you to store all "people" in one place, all "users" in another and all "contacts" in another. Now, why would we do this? The number one reason can be explained in two scenarios.
1.) Say we have someone - in this example I'll call him "Jim Bean". "Jim Bean" works for the company, so he is a user of the system. But "Jim Bean" happens to own a side business and does contract work for the company at the same time, so he is both a contact and a user of the system. In a more "flat table" environment, we would have two records for Jim Bean containing the same data, which could quickly become outdated or incorrect.
2.) Let's say that Jim did some bad things and the company wants nothing to do with him anymore. They don't want any record of him - as if he never existed. All we have to do is delete Jim Bean from the person table. That's it. Since the foreign key relationships have "CASCADE" on update/delete, this automatically propagates and clears out the rows in the other tables related to him.
I highly recommend that you do some reading on normalized data structure. It has saved me many hours once I got the hang of it and I will never go back.
I have a MySQL database that stores user emails and news articles that my service provides. I want users to be able to save/bookmark articles they would like to read later.
My plan for accomplishing this was to have a column, in the table where I store the users' emails, that holds comma-delimited strings of unique IDs, where the unique IDs are values assigned to each article as it is added into the database. These articles are stored in a separate table, and I use UUID_SHORT() to generate the unique IDs of type BIGINT.
For example, let's say in the table where I store my articles, I have
ArticleID OtherColumn
4419350002044764160 other stuff
4419351050184556544 other stuff
In the table where I store user data, I would have
UserEmail ArticlesSaved OtherColumn
example1@email.com 4419350002044764160,4419351050184556544,... other stuff
example2@email.com 4419350002044764160,4419351050184556544,... other stuff
to indicate the first two users have saved the articles with IDs 4419350002044764160 and 4419351050184556544.
Is this a proper way to store something like this on a database? If there is a better method, could someone explain it please?
One other option I was thinking of was having a separate table for each user, where I could store the IDs of the articles they saved in a column, though the answer to this post says that this is not very efficient: Database efficiency - table per user vs. table of users
I would suggest one table for the users and one table for their bookmarked articles.
users:
- id: int, auto-increment
- user_email: varchar(50)
preferences:
- id: int, auto-increment
- article_index: (datatype that you find accurate according to your structure)
- id_user: int
This way it will be easy for a user to bookmark and unbookmark an article. Connecting the two tables is done with id in users and id_user in preferences. Make sure that each row in preferences/bookmarks is one article (don't do anything comma-separated). Doing it this way will save you much time and many complications - I promise!
A typical query to fetch a user's bookmarked pages would look something like this.
SELECT u.id, p.article_index, p.id_user
FROM users u
LEFT JOIN preferences p ON u.id = p.id_user
WHERE u.id = 1; -- user id goes here; make sure it's an int, and apply appropriate security to your queries
"Proper" is a squirrely word, but the approach you suggest is pretty flawed. The resulting database no longer satisfies even first normal form, and that predicts practical problems even if you don't immediately see them. Some of the problems you would be likely to encounter are
the number of articles each user can "save" will be limited by the data type of the ArticlesSaved column;
you will have issues around duplicate "saved" article IDs; and
queries about which articles are saved will be more difficult to formulate and will probably run slower; in part because
you cannot meaningfully index the ArticlesSaved column.
The usual way to model a many-to-many relationship (such as between users and articles) is via a separate table. In this case, such a table would have one row for each (user, saved article) pair.
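A minimal sketch of such a table (a numeric user id is assumed here in place of the email key, and the names are placeholders):

CREATE TABLE saved_articles (
  user_id INT NOT NULL,
  article_id BIGINT UNSIGNED NOT NULL, -- matches the UUID_SHORT() BIGINT ids
  PRIMARY KEY (user_id, article_id) -- one row per (user, saved article); also rules out duplicate saves
);
-- All articles saved by user 1:
SELECT a.*
FROM articles a
INNER JOIN saved_articles sa ON sa.article_id = a.ArticleID
WHERE sa.user_id = 1;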
Saving data in CSV format in a database field is (almost) never a good idea. You should have 3 tables:
1 table describing users, with everything concerning the user directly
1 table describing articles, with data about them
1 table with 2 columns, "userid" and "articleid", linking both. If a user bookmarks 10 articles, this table will have 10 records with a different articleid each time.
I have two tables in MySQL. When I insert/delete values in the first table, I want the values to be duplicated in table 2 to keep them "aligned".
table1:
id - username
1 - test_user
table2:
Same id as table1 and same username as table1 (kept in sync on insert/delete)
I want to keep the data between the tables aligned without doing multiple queries. I've read about triggers, but I'm not sure if that's the correct road; I am a beginner.
I said two tables, but I will need to do this with multiple tables.
You can use MySQL triggers. That way you can automatically insert/update/delete data in the second table.
MySql Using Triggers
When you INSERT new records, given that you don't want to do two inserts for some reason, using a trigger to insert into the second table will work. For UPDATE and DELETE you might want to look at the CASCADE option with foreign keys. If all you are doing is keeping the data consistent between tables, that's exactly what cascade is for.
When you create table2 you just add a foreign key like this:
FOREIGN KEY (id, username)
REFERENCES table1(id, username) ON UPDATE CASCADE ON DELETE CASCADE
Then whenever you alter table1 the changes will automatically get pushed through to table2.
A couple of prerequisites for this to work:
You have to use a storage engine that supports foreign keys, something like InnoDB and not MyISAM
You need to have an index on (id, username) in table1; the foreign key needs to match a key in the parent table
You should read the doc page for foreign keys. There are a couple other ways you can tweak them, and you should figure out what works best for your purposes.
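Putting those prerequisites together, a sketch of the full setup might look like this (column types are assumptions):

CREATE TABLE table1 (
  id INT AUTO_INCREMENT,
  username VARCHAR(50) NOT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY idx_id_username (id, username) -- the parent key the foreign key references
) ENGINE=InnoDB;

CREATE TABLE table2 (
  id INT NOT NULL,
  username VARCHAR(50) NOT NULL,
  FOREIGN KEY (id, username) REFERENCES table1 (id, username)
    ON UPDATE CASCADE ON DELETE CASCADE
) ENGINE=InnoDB;

Remember that CASCADE only covers updates and deletes; new rows still need the trigger (or a second insert) mentioned above.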
You can certainly put triggers on your table1 to make parallel changes to your other tables as your application changes table1.
See here for the documentation: http://dev.mysql.com/doc/refman/5.0/en/trigger-syntax.html
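As a rough sketch, such a trigger might look like this (assuming table2 mirrors table1's id and username columns):

DELIMITER $$
CREATE TRIGGER table1_after_insert
AFTER INSERT ON table1
FOR EACH ROW
BEGIN
  -- copy the newly inserted row into table2
  INSERT INTO table2 (id, username) VALUES (NEW.id, NEW.username);
END$$
DELIMITER ;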
But, you should think over your design. It will take multiple queries to do your inserts and updates; they'll just be done "behind your back" on the server. They'll still take time. Triggers can really slow things down.
Also, triggers are a little bit fragile. If you add a column to a table, you'll have to rework your triggers. Triggers are generally a pain in the neck to keep in a source-control system and a huge pain in the neck to test, so using them will make your application more troublesome to maintain.
Could you think of another approach to handling this need for duplication? Could you, for example, use a view or a join to present the data you need to your application program without actually duplicating tables and the rows in them? If you figure out how to do that you'll be much happier in the long run.
CREATE VIEW table2 AS
SELECT *
FROM table1;
will produce a "fake" table2 with the contents of table1.
Or if you're hoping to view only the test users in a second table, a view can do that for you too, for example:
CREATE VIEW table3 AS
SELECT *
FROM table1
WHERE usertype = 'test_user' ;
If you're using duplicate tables for "backup," that's a bad way to make sure your information is safe. Instead, you need to back up your MySQL server instance.
Formal relational database design principles teach us to avoid duplicating data, and instead use views and joins to structure the data the way applications need to see it.
I am trying to build a robust PHP function that allows me to traverse over my normalized database. My MySQL database has 6 tables with the following column names (I am only including the primary and foreign keys, as well as some limited table columns, for simplicity) so that you can see how they are related.
tableA:
- partID (primary key)
tableABJunction:
- itemID (foreign key)
- partID (foreign key)
tableB:
- itemID (primary key)
- itemName
sales:
- customerID (foreign key)
- itemID (foreign key)
partDate:
- itemID (foreign key)
customer:
- customerID (primary key)
- nameFirst
- nameLast
When I need to generate a query such as "What are the names of the customers that ordered itemID = 12?", I first have to query the sales table for all customerIDs where itemID = 12, and then query the customer table to find out their first and last names. Sometimes I may need to return data from all 6 tables, based on a query asking for all information pertaining to the customer named John Smith. Is there an easy way to build a function to handle this variety of queries, without having to build a query for every possible type of search?
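(For concreteness, that two-step example collapses into a single join over the schema above:)

SELECT c.nameFirst, c.nameLast
FROM customer c
INNER JOIN sales s ON s.customerID = c.customerID
WHERE s.itemID = 12;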
Currently, my approach is to pass the following to php via AJAX:
web_conditionArray (contains the column names and values of the data provided, such as nameFirst => 'John', nameLast => 'Smith'); web_resultArray (contains the table name and the columns that I am requesting: sales => 'itemID, itemName').
The issue that I am having with this approach is finding a way to store the relationships between all of the MySQL tables and their foreign keys, so that my PHP program knows how to link the tables together to run the correct query to get from the data provided for one table to the data requested from another. Any suggestions, or a better way to solve this? I was initially thinking of a doubly linked list, but the flow from table to table is not linear, given that there is a fork where tableB links to both the sales and partDate tables.
I tried to be as specific as I could in describing this situation without writing a novel; however, please let me know if you need any additional information to refine my question further.
Looking at your table structure, I imagine it would be possible to construct logic to calculate the relationships between tables, and dynamically construct queries, but it seems to me that that would be far more work than manually constructing queries for your particular database. I'm assuming that your tables have many more fields in them, but that you've only included the most important, and have definitely included all primary and foreign keys.
Based on that, you have only three information objects in your database: Parts, Items and Customers. You should, therefore, not need more than 12 manually constructed queries to make your system work. You just need to ensure that you simplify your queries to work with whole information objects, and use the PHP layer to filter them later.
So, you reduce your query logic to:
"Fetch me all [Parts, Items or Customers] (and possibly also all [Parts, Items or Customers]) related to [Part, Item or Custromer] (and possibly [Part, Item or Customer])"
This results in the following queries:
All Customers for a Part
All Customers for an Item
All Customers for a Part and an Item
All Items for a Part
All Items for a Customer
All Items for a Part and a Customer
All Parts for an Item
All Parts for a Customer
All Parts for a Customer and an Item
All Parts and Customers for an Item
All Customers and Items for a Part
All Items and Parts for a Customer
(This is the full list of logical relationships - some may not make any sense practically, which makes your life easier)
So, your PHP script needs to perform the following tasks:
Identify which object(s) are required for the criteria of the query. This is based on the fields supplied.
Construct a WHERE clause for your query which identifies the primary key for the criteria objects from the fields passed.
Identify which object(s) are required for the result of the query, based on the fields requested.
Select the query based on the criteria and return objects, and insert the constructed WHERE clause.
Perform the query, extracting all information available about the requested objects
Filter the results, extracting only the required information
Return the final results.
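For example, the "All Parts for an Item" query from the list above could be one such prepared template, with the criteria value bound from the PHP-constructed WHERE clause (a sketch against the schema in the question):

-- Template: All Parts for an Item; the placeholder is bound by the PHP layer.
SELECT a.*
FROM tableA a
INNER JOIN tableABJunction j ON j.partID = a.partID
INNER JOIN tableB b ON b.itemID = j.itemID
WHERE b.itemName = ?;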
First, know that my answer will most likely be downvoted to hell (as this methodology is constantly downvoted despite its correctness). DBAs want you to believe that just because a complex query can be done with a SQL statement, it should be (like how server-siders think all client-side work should be done server-side, or how client-siders think layouts should be done with client-side code instead of CSS). No. Complex queries are for people sitting at command lines who need on-demand data grabbing for specific, non-routine reasons. For processing speed, SELECTing, UPDATEing, and DELETEing should always be done off the PK server-side.
It sounds like you have a set of legitimately large tables.
Assuming it's large and speed is the primary concern (and not development time), use only a primary key and no other indexes, because the more indexes you have, the more those indexes need to be reindexed by the database - when really, the comparisons that DBAs would have you do are faster server-side.
The primary key will take some finagling, but it's the most important thing past data types and lengths. For instance, the non-FK, independent tables like tableA, tableB, and customer should probably have an auto-increment INT PK (generally, remember that computers think in terms of integers), but the ones with multiple FKs should probably have no auto-increment INT and instead a composite PK, with the less variant SELECTed FK first. For example, on my site I store vote totals on links by userID and linkID. If a user's logged in, they'll need to know how many votes they've placed on a link, and the userID is the one less likely to change, so that's first in my PK on that table. Counting this on demand, database-side or server-side, was a performance nightmare.
For just a few lines of code, you will GREATLY improve speed. Sorting on the PK via PHP will cut latency by 50%. Absorbing JOINs into PHP will decrease the rate of latency spikes. Having no on-demand MySQL calculations will keep your site from becoming paralyzed.
If you step away from the dogma that, just because a SQL statement can get you the results, you should use a SQL statement instead of a server-side language (C++ being the fastest), you'll see performance skyrocket.
If you can be more specific with the tables you're trying to obfuscate, I can get more specific, but you probably get the idea.
AJAX has changed the game and forced refocus. CSS for layouts; js for client-side programming; server-side for...server-side processing; database for storing everything that lasts longer than a moment.
Bring on the downvotes! LOL
I am trying to create a site where users can register and create a profile; therefore I am using two MySQL tables within a database, e.g. users and user_profile.
The users table has an auto increment primary key called user_id.
The user_profile table has the same primary key, called user_id; however, it is not auto increment.
*see note for why I have multiple tables.
When a user signs up, data from the registration form is inserted into users, and then last_insert_id() is inserted into the user_id field of the user_profile table. I use transactions to ensure this always happens.
My question is, is this bad practice?
Should I have a unique auto increment primary key for the user_profile table, even though one user can only ever have one profile?
Maybe there are other downsides to creating a database like this?
I'd appreciate it if anyone could explain why this is a problem or if it's fine; I'd like to make sure my database is as efficient as possible.
Note: I am using separate tables for users and user_profile because user_profile contains fields that are potentially null and will also be requested much more often than the users table, due to the data being displayed on a public profile.
Maybe this is also bad practice and they should be lumped in one table?
I find this a good approach; I'd give bonus points if you use a foreign key relation, and preferably cascade when deleting the user from the user table.
As to separating the core user data in one table and the optional profile data in another - good job. Nothing is more annoying than a 50-field dragonish entry with 90% empty values.
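A minimal sketch of that suggestion (the non-key columns are placeholders):

CREATE TABLE users (
  user_id INT AUTO_INCREMENT PRIMARY KEY,
  email VARCHAR(255) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE user_profile (
  user_id INT PRIMARY KEY, -- same key as users, no auto_increment
  bio TEXT NULL, -- placeholder profile field
  FOREIGN KEY (user_id) REFERENCES users (user_id) ON DELETE CASCADE
) ENGINE=InnoDB;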
It is generally frowned upon, but as long as you can provide the reasoning for the 1 to 1 relationship I'm sure it is fine.
I have used them when I have hundreds of columns (and it would be more logical to split them out into separate tables), or when I need a thinner table to speed up full scans.
In your case I would use a single table and create a couple of views.
see: http://dev.mysql.com/doc/refman/5.0/en/create-view.html
In general, a single-table approach is more logical, quicker, simpler, and uses less space.
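A sketch of the single-table-plus-views idea (column names are assumptions):

CREATE TABLE users (
  user_id INT AUTO_INCREMENT PRIMARY KEY,
  email VARCHAR(255) NOT NULL,
  password_hash CHAR(60) NOT NULL,
  display_name VARCHAR(50) NULL, -- nullable profile fields live in the same table
  bio TEXT NULL
);
-- thin view for authentication
CREATE VIEW user_auth AS
  SELECT user_id, email, password_hash FROM users;
-- view exposing only the public profile data
CREATE VIEW user_profile AS
  SELECT user_id, display_name, bio FROM users;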
I don't think it's a bad practice. Sometimes it's quite useful, especially if you want one class to deal with authentication without loading all the profile data. You can then modify how your authentication works, build web services and so on, with little care about maintaining the data structures for profile information, which is likely to change as your project evolves.
This is very good practice.
It's right at the core of writing good, modular, normalised relational database structures.