It's a daily struggle to work with the previous programmer's code... And now, apparently, also his database.
Problem description
So here we've got a table to store the availability of a user, and normally you would assign a unique id to every row of data. Except... he didn't. He made user_id the first column of the primary key (probably a composite key).
So the user changes his availability for each weekday (monday to friday) and every timeslot in that week.
This is made into one row each:
user_id,day,hour_nr,hour_type,location_id
But, as you might see coming, I can't manually insert fake data for development purposes. I'm trying to add a period and a college year (it's for an educational institution). Adding the columns worked fine, but because the old data didn't require them, they're all set to 0.
The new row will consist of:
user_id,day,hour_nr,hour_type,location_id,period_id,collegeyear_id
I've tried uploading data containing the period and college year information to the table, but I get an instant error telling me that there is a duplicate entry.
That's correct, there is a duplicate, but there already were duplicates as well.
Question
And so the question is: how do I force this without altering the table's keys? I'd rather not alter the indexed properties of the composite primary key.
Lastly, I know this is wrong and I know that it should have been done differently. Again it's not my work or design and I don't have any time on hand to fix or alter it during this project.
Edit
As requested, hereby a snapshot of the table with data and a snapshot of how it should be
The snapshot shows different headers than mentioned, they're the same but in Dutch.
Current data snapshot (I forgot to put the last 2 columns that are in the Desired data result snapshot on the snapshot but they're already there containing nothing but 0's)
Desired data result
I do need an INSERT; the data has to be added, not altered. Another fix for this issue is fine too, of course, but the data has to be added.
Fix
So in a perfect example of tunnel vision I fixed and therefore answered my own question.
Instead of looking blindly at inserting the data I should have looked more towards the composite key part. I've added the 2 new columns to the key and now all is fine and dandy.
I said that I didn't want to mess with the keys but that was pointed towards the already existing keys not adding to the composite key.
I still dislike the fact that there isn't a single unique id but it is workable.
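The fix described above can be sketched roughly as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name `availability`, the column types, and the original key columns (user_id, day, hour_nr) are assumptions, since the post does not list them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the composite key now includes the two new columns.
# On MySQL, the equivalent change to an existing table would be along the lines of
#   ALTER TABLE availability DROP PRIMARY KEY,
#     ADD PRIMARY KEY (user_id, day, hour_nr, period_id, collegeyear_id);
conn.execute("""
    CREATE TABLE availability (
        user_id        INTEGER NOT NULL,
        day            INTEGER NOT NULL,
        hour_nr        INTEGER NOT NULL,
        hour_type      INTEGER NOT NULL,
        location_id    INTEGER NOT NULL,
        period_id      INTEGER NOT NULL DEFAULT 0,
        collegeyear_id INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (user_id, day, hour_nr, period_id, collegeyear_id)
    )
""")

rows = [
    (7, 1, 1, 2, 3, 0, 0),  # legacy row: period and college year are 0
    (7, 1, 1, 2, 3, 1, 5),  # same user/day/hour, but a real period and year
]
conn.executemany("INSERT INTO availability VALUES (?, ?, ?, ?, ?, ?, ?)", rows)

# A row that repeats the full key is still rejected.
try:
    conn.execute("INSERT INTO availability VALUES (7, 1, 1, 1, 4, 1, 5)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM availability").fetchone()[0]
```

The legacy rows and the new rows coexist because they now differ in at least one key column, while true duplicates are still refused.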
Q.
And so the question is: how do I force this without altering the table's keys? I'd rather not alter the indexed properties of the composite primary key.
A.
You cannot force the primary key to have multiple values of the same ID.
The best thing for you to do would be to add an extra column with a new ID and reference that within the software.
A primary key is a special relational database table column (or combination of columns) designated to uniquely identify all table records.
A primary key’s main features are:
It must contain a unique value for each row of data.
It cannot contain null values.
A primary key is either an existing table column or a column that is specifically generated by the database according to a defined sequence.
Resource: Techopedia
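The suggestion to add an extra ID column could look something like the sketch below. It uses Python's sqlite3 module in place of MySQL, and the table and column names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical layout: a generated id is the primary key, so the remaining
# columns are free to repeat.
conn.execute("""
    CREATE TABLE availability (
        id      INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id INTEGER NOT NULL,
        day     INTEGER NOT NULL,
        hour_nr INTEGER NOT NULL
    )
""")

# The same user/day/hour combination twice: allowed, each row gets its own id.
for _ in range(2):
    conn.execute("INSERT INTO availability (user_id, day, hour_nr) VALUES (7, 1, 1)")

ids = [row[0] for row in conn.execute("SELECT id FROM availability ORDER BY id")]

# Reusing an existing primary key value is what the database refuses.
try:
    conn.execute("INSERT INTO availability (id, user_id, day, hour_nr) VALUES (1, 7, 1, 1)")
    id_reused = True
except sqlite3.IntegrityError:
    id_reused = False
```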
Related
I have this MySQL table, where the column contact_id is unique for each user_id.
history:
- hist_id: int(11) auto_increment primary key
- user_id: int(11)
- contact_id: int(11)
- name: varchar(50)
- phone: varchar(30)
From time to time, the server will receive a new list of contacts for a specific user_id and needs to update this table, inserting, deleting or updating whatever differs from the previous information.
For example, the current data is:
So, the server receives this data:
And the new data is:
As you can see, the first row (John) was updated, the second row (Mary) was deleted and another row (Jeniffer) was added.
What I am doing today is deleting all rows with a specific user_id and inserting the new data. But the autoincrement field (hist_id) is getting bigger and bigger...
Obs: The table has about 80 thousand records, and this update will occur 30 times a day or more.
I have some (related) questions:
1. In this scenario, do you think deleting all records from a specific user_id and inserting updated data is a good approach?
2. What about removing the autoincrement field? I don't need it, but I think it is not a good idea to have a table without a primary key.
3. Or maybe the better approach is to loop new data, selecting each user_id / contact_id for comparing values to update?
PS. For better approach I mean the most efficient way
Thank you so much for any help!
In this scenario, do you think deleting all records from a specific user_id and inserting updated data is a good approach?
Short Answer
No. You should be taking advantage of an 'upsert', which in MySQL is written as INSERT ... ON DUPLICATE KEY UPDATE. What this means is that if the key pair you're inserting already exists, the specified columns are updated with the specified data instead. You then shorten your logic and reduce increments. Here's an example, using your table structure, that should work. This assumes that you have a UNIQUE index over the pair (user_id, contact_id).
INSERT INTO history (user_id, contact_id, name, phone)
VALUES
(1, 23, 'James Jr.', '(619)-543-6222')
ON DUPLICATE KEY UPDATE
name=VALUES(name),
phone=VALUES(phone);
This query should retain the contact_id but overwrite the pre-existing data with the new data.
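To see the effect on hist_id, here is a small runnable sketch. It uses Python's built-in sqlite3 module rather than MySQL, so the statement is written with SQLite's `ON CONFLICT ... DO UPDATE` clause (available in SQLite 3.24 and later) instead of `ON DUPLICATE KEY UPDATE`; the first phone number is a made-up placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE history (
        hist_id    INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id    INTEGER NOT NULL,
        contact_id INTEGER NOT NULL,
        name       TEXT,
        phone      TEXT,
        UNIQUE (user_id, contact_id)  -- the key pair the upsert relies on
    )
""")

# SQLite spelling of the upsert; "excluded" refers to the row that failed to insert.
UPSERT = """
    INSERT INTO history (user_id, contact_id, name, phone)
    VALUES (?, ?, ?, ?)
    ON CONFLICT (user_id, contact_id) DO UPDATE SET
        name  = excluded.name,
        phone = excluded.phone
"""

conn.execute(UPSERT, (1, 23, "James", "(619)-543-0000"))      # inserts a new row
conn.execute(UPSERT, (1, 23, "James Jr.", "(619)-543-6222"))  # updates that row

rows = conn.execute(
    "SELECT hist_id, user_id, contact_id, name, phone FROM history"
).fetchall()
```

There is still a single row for the pair (1, 23), and it keeps the hist_id it was first given.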
What about removing the autoincrement field? I don't need it, but I think it is not a good idea to have a table without a primary key.
Primary keys do not imply auto-incremented values. I could have a varchar field as the primary key containing names of fruits and vegetables. Is this optimized for performance? Probably not. There are many situations that call for auto increment, and there are definite reasons to avoid it. It all depends on how you wish to access the data and how this can impact future expansion. In your situation, I would start over on the table structure and re-think how you wish to store and access the data. Do you want to write more logic to control the data, OR do you want the data to flow naturally by itself? At first glance, you've made a history table that functions more like a hybrid many-to-one crosswalk. Without looking at the remaining table structure, I can't say offhand that it's a bad idea. What I can say is that I would do this a bit differently. I will answer this more specifically in the next question.
Or maybe the better approach is to loop new data, selecting each user_id / contact_id for comparing values to update?
I would avoid looping through the data in order to update it. That is a job for SQL, and it does this job well. Sometimes we find ourselves in a situation where we must loop, either to extract data in a specific format or to repair data in some way. However, avoid doing this for inserting or updating data. It can negatively impact performance, and you will likely paint yourself into a corner.
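As a rough sketch of letting SQL do the work, the whole contact list for one user can be synchronised with one batched upsert and one DELETE, with no per-row SELECT. This is an illustration only: it uses Python's sqlite3 module (so the upsert uses SQLite's `ON CONFLICT` syntax), the function name `sync_contacts` is hypothetical, and the contact ids and phone numbers are placeholders. The names come from the example in the question.

```python
import sqlite3

def sync_contacts(conn, user_id, contacts):
    """Make history match `contacts`, a list of (contact_id, name, phone)."""
    with conn:  # one transaction: either everything is applied or nothing
        conn.executemany(
            """INSERT INTO history (user_id, contact_id, name, phone)
               VALUES (?, ?, ?, ?)
               ON CONFLICT (user_id, contact_id) DO UPDATE SET
                   name = excluded.name, phone = excluded.phone""",
            [(user_id, cid, name, phone) for cid, name, phone in contacts],
        )
        keep = [cid for cid, _, _ in contacts]
        if keep:
            # Remove contacts that are no longer in the incoming list.
            marks = ", ".join("?" for _ in keep)
            conn.execute(
                f"DELETE FROM history WHERE user_id = ? AND contact_id NOT IN ({marks})",
                [user_id, *keep],
            )
        else:
            conn.execute("DELETE FROM history WHERE user_id = ?", (user_id,))

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE history (
        hist_id INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id INTEGER NOT NULL, contact_id INTEGER NOT NULL,
        name TEXT, phone TEXT,
        UNIQUE (user_id, contact_id)
    )
""")
conn.executemany(
    "INSERT INTO history (user_id, contact_id, name, phone) VALUES (?, ?, ?, ?)",
    [(1, 10, "John", "111"), (1, 11, "Mary", "222"), (2, 10, "Other", "999")],
)

# John changes, Mary disappears, Jeniffer is new; user 2 is untouched.
sync_contacts(conn, 1, [(10, "John", "333"), (12, "Jeniffer", "444")])

result = conn.execute(
    "SELECT user_id, contact_id, name, phone FROM history ORDER BY user_id, contact_id"
).fetchall()
```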
Back to what I said toward the end of your second question, which will help you see what I am talking about. I am going to assume that user_id is an auto-incremented primary key in your user table. I will do some guesswork here and show you an example of how you could redesign your user, contact and phone number structure. The following is a quick model I threw together that shows the foreign key relationships between the tables.
Note: The column names and overall data arrangement could be done differently but I did this quickly to give you a decent example of a normalized database structure. All of the foreign keys have a structural layout which separates your data in a way that enables you to control the flow of data as it enters and leaves your system. Here's the screenshot of the database model I threw together using MySQL Workbench.
(source: xonos.net)
Here's the SQL so that you can look at it more closely.
You'll notice that the "person" table is extracted from users but shares data with contacts. This enables you to store all "people" in one place, all "users" in another and all "contacts" in another. Now, why would we do this? The number one reason can be explained in two scenarios.
1.) Say we have someone, in this example I'll call him "Jim Bean". "Jim Bean" works for the company, so he is a user of the system. But, "Jim Bean" happens to own a side business and does contact work for the company at the same time. So, he is both a contact and a user of the system. In a more "flat table" environment, we would have two records for Jim Bean that contain the same data which could become outdated or incorrect, quickly.
2.) Let's say that Jim did some bad things and the company wants nothing to do with him anymore. They don't want any record of him, as if he never existed. All that we have to do is delete Jim Bean from the Person table. That's it. Since the foreign key relationship has "CASCADE" on update/delete, this automatically propagates and clears out the rows related to him in the other tables.
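The second scenario can be tried out with a small sketch. It is not the model from the screenshot: the column names here are assumptions, and it runs on Python's sqlite3 module, where foreign key enforcement has to be switched on per connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to

# Hypothetical, simplified versions of the person / users / contacts tables.
conn.executescript("""
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        full_name TEXT NOT NULL
    );
    CREATE TABLE users (
        user_id   INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL
            REFERENCES person (person_id) ON UPDATE CASCADE ON DELETE CASCADE
    );
    CREATE TABLE contacts (
        contact_id INTEGER PRIMARY KEY,
        person_id  INTEGER NOT NULL
            REFERENCES person (person_id) ON UPDATE CASCADE ON DELETE CASCADE
    );
    INSERT INTO person   VALUES (1, 'Jim Bean');
    INSERT INTO users    VALUES (1, 1);  -- Jim as a user
    INSERT INTO contacts VALUES (1, 1);  -- Jim as a contact
""")

# One delete on person; the cascade removes the dependent rows.
conn.execute("DELETE FROM person WHERE person_id = 1")

remaining = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("person", "users", "contacts")
}
```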
I highly recommend that you do some reading on normalized data structure. It has saved me many hours once I got the hang of it and I will never go back.
I have a website connected to a database. In one of its tables, one entity attribute that is not the primary key needs to be unique in that table.
Currently, I am querying the database before inserting a value into that column to check, if the value already exists. If it does, the value gets altered by my script and the same procedure starts again until no result gets back, which means it doesn't exist yet in the database.
While this works, I feel it's a great performance hog: even when the value is unique, the database needs to be queried at least two times, once for checking and once for writing.
To improve performance and to make my (possibly buggy or unnecessary) code obsolete, I have the idea to mark the column as a Unique Key and to use a try/catch block for the writing/error handling process. That way, the database engine handles the uniqueness, which seems a bit more reasonable than my query-write procedure.
Is this a good idea or are Unique Keys not made for this behavior? What is the typical use case of a Unique Key in a SQL database?
INSERT INTO `table` (uniquerow) VALUES (1) ON DUPLICATE KEY UPDATE uniquerow = 1;
With this statement, you insert if the value is unique and update if the key already exists.
A unique constraint guarantees that a value (or a tuple of values) occurs at most once in the table, without the column(s) having to be the primary key.
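The try/catch idea from the question might look like the sketch below. It is written in Python with sqlite3 instead of PHP and MySQL; the `pages` table, the `slug` column and the rule for altering a taken value (appending `-2`, `-3`, ...) are all assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, slug TEXT NOT NULL UNIQUE)")

def insert_unique_slug(conn, base, max_attempts=100):
    """Insert `base`, or `base-2`, `base-3`, ... if taken; return what was stored."""
    for attempt in range(1, max_attempts + 1):
        candidate = base if attempt == 1 else f"{base}-{attempt}"
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute("INSERT INTO pages (slug) VALUES (?)", (candidate,))
            return candidate
        except sqlite3.IntegrityError:
            # The UNIQUE constraint rejected the value: alter it and try again.
            continue
    raise RuntimeError(f"no free value found for {base!r}")

first = insert_unique_slug(conn, "hello")
second = insert_unique_slug(conn, "hello")
third = insert_unique_slug(conn, "hello")
```

When the value is free, this costs a single write and no separate check query; the database only reports back when there is a clash.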
I'm using MySQL as the database for an online examination system. Here the question number is the primary key, so when a question in the middle is deleted, that number is wasted (just like in a queue data structure). I want the following question numbers to be decremented automatically. Is this possible using PHP and MySQL? If yes, then please write the solution.
Please do not do this!
The number of the primary key has only one function: Uniquely identifying a record. It does not matter what number it is and if there are gaps in between.
If you want to sort your data use another column like a datetime column or an extra ranking index.
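One way to get gap-free question numbers without touching the primary key is to compute the displayed number when reading the rows. A minimal sketch, using Python's sqlite3 module in place of PHP and MySQL, with a hypothetical `questions` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE questions (question_id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO questions (question_id, body) VALUES (?, ?)",
    [(1, "Question A"), (2, "Question B"), (3, "Question C")],
)

# Deleting from the middle leaves a gap in the ids: 1, 3.
conn.execute("DELETE FROM questions WHERE question_id = 2")

# The number shown to the user is derived while reading, not stored.
numbered = [
    (display_no, question_id, body)
    for display_no, (question_id, body) in enumerate(
        conn.execute("SELECT question_id, body FROM questions ORDER BY question_id"),
        start=1,
    )
]
```

The stored ids stay stable, so anything that references a question keeps pointing at the right row.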
I have keys for a project in which I am trying to test a licensing system (just for fun and learning). One part I thought I'd run into trouble with is how to distribute the keys. I have about 100 keys in a database, and I'm trying to figure out the best way to distribute them. The database is laid out as follows:
ID (Auto Increment) | key
Using the PDO library, what is the most effective way to hand them out? Should I go in chronological order by ID? But even if I did, how would I keep going in chronological order after deleting the key that was given out? Or should I pick a random ID? I have no clue what the most effective way to distribute these keys is.
If I understand your question correctly...
You might try this query through PDO:
SELECT * FROM `table-name`
ORDER BY `ID` ASC
Then when you step through the rows in a while() loop from the execution's return, it will be in chronological order like you asked.
As for losing IDs: if you delete the key with ID 10, the IDs in the returned rows will jump from 9 to 11. When you add a new key, ID 10 will not be reused unless you explicitly specify that ID when inserting.
EDIT: From the phrasing of your question, it sounds like you may be concerned about how you set up the ID's for the keys. Maybe you understand this already, but since you have Auto Increment, your IDs will be automatically generated when you insert new keys, so a new key would be assigned an ID of (ID of last inserted key) + 1.
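Handing out keys in ID order, deleting each one as it is issued, could be sketched like this. The example uses Python's sqlite3 module rather than PDO; the table name `license_keys`, the column name `license_key` (chosen because `key` is a reserved word in MySQL) and the sample key values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE license_keys (
        id          INTEGER PRIMARY KEY AUTOINCREMENT,
        license_key TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO license_keys (license_key) VALUES (?)",
    [("AAA",), ("BBB",), ("CCC",)],
)

def issue_next_key(conn):
    """Return the key with the lowest id and delete it, or None if none are left."""
    with conn:  # select and delete in one transaction
        row = conn.execute(
            "SELECT id, license_key FROM license_keys ORDER BY id ASC LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("DELETE FROM license_keys WHERE id = ?", (row[0],))
        return row[1]

issued = [issue_next_key(conn), issue_next_key(conn)]

# A newly added key continues the sequence; deleted ids are not reused.
new_id = conn.execute(
    "INSERT INTO license_keys (license_key) VALUES (?)", ("DDD",)
).lastrowid
```

Gaps left by deleted rows do not matter here, because the query always takes whatever the lowest remaining ID is. With several clients issuing keys at once, the select and delete would also need to be protected against two clients picking the same row.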
Chronology isn't exactly a feature of PDO, or for that matter whatever database driver you are using... it's more a matter of your schema.
Typically, a commonly employed field in any database structure is a "timestamp" or "created" field that holds the time the record was created in the database. These fields can be of the MySQL type TIMESTAMP (stored internally as seconds since the Unix epoch and converted to the session time zone on retrieval) or DATETIME (stored as given, without time zone conversion); depending on the driver, either may come back as a string or as the language's native date-time object. Even though monotonically increasing primary keys imply a certain amount of chronological order when sorted, a timestamp field can record the exact time a record was created at the server, as well as update on change using ON UPDATE CURRENT_TIMESTAMP. So I would suggest adding this to your schema.
With such a field in your database, you can always sort your queries using:
ORDER BY timestamp_field_name ASC
Also, if by "distribute" you mean that some data will be publicly accessible by using this key as a query parameter of some sort, I wouldn't use the monotonic primary key, for the exact reason you described. That applies especially if this is a "licensing" proof of concept; if you mean a DRM-type thing, it should probably produce a complex string. Hashed timestamps or the PHP uniqid function can produce values that can be stored in a VARCHAR database field with a UNIQUE key constraint. This is if I have understood your described goal.
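A sketch of that schema idea follows: a creation timestamp for ordering plus a hard-to-guess string in a UNIQUE column. Note that the technique is swapped here: instead of hashed timestamps or PHP's uniqid, it uses Python's `secrets.token_hex`, and it runs on sqlite3 rather than MySQL. Table and column names are assumptions.

```python
import secrets
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE license_keys (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        token      TEXT NOT NULL UNIQUE,                   -- the value handed out
        created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP -- set by the database
    )
""")

def create_key(conn):
    """Store a new random token; retry in the unlikely case of a collision."""
    while True:
        token = secrets.token_hex(16)  # 32 hexadecimal characters
        try:
            with conn:
                conn.execute("INSERT INTO license_keys (token) VALUES (?)", (token,))
            return token
        except sqlite3.IntegrityError:
            continue

tokens = [create_key(conn) for _ in range(3)]

# Chronological order; id breaks ties between rows created in the same second.
ordered = [
    row[0]
    for row in conn.execute(
        "SELECT token FROM license_keys ORDER BY created_at ASC, id ASC"
    )
]
```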
I've got a PHP script pulling a file from a server and plugging the values in it into a Database every 4 hours.
This file can, and most likely will, change within the 4 hours (or whatever timeframe I finally choose). It's a list of properties and their owners.
Would it be better to check the file and compare it to each DB entry and update any if they need it, or create a temp table and then compare the two using an SQL query?
None.
What I'd personally do is run the INSERT command using ON DUPLICATE KEY UPDATE (assuming your table is properly designed and that you are using at least one piece of information from your file as UNIQUE key which you should based on your comment).
Reasons
Creating temp table is a hassle.
Comparing is a hassle too. You need to select a record, compare a record, if not equal update the record and so on - it's just a giant waste of time to compare a piece of info and there's a better way to do it.
It would be so much easier if you just insert everything you find and if a clash occurs - that means the record exists and most likely needs updating.
That way you took care of everything with 1 query and your data integrity is preserved also so you can just keep filling your table or updating with new records.
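As a rough illustration of the single-statement import, the sketch below loads a small comma-separated listing twice. It is written in Python with sqlite3 rather than PHP and MySQL, so the upsert uses SQLite's `ON CONFLICT ... DO UPDATE` form of the idea; the file layout, the `properties` table and the sample values are assumptions.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE properties (
        property_ref TEXT PRIMARY KEY,  -- the piece of file data used as the key
        owner        TEXT NOT NULL
    )
""")

def import_listing(conn, text):
    """Insert every line of the listing; existing properties get their owner updated."""
    rows = list(csv.reader(io.StringIO(text)))
    with conn:
        conn.executemany(
            """INSERT INTO properties (property_ref, owner) VALUES (?, ?)
               ON CONFLICT (property_ref) DO UPDATE SET owner = excluded.owner""",
            rows,
        )

import_listing(conn, "P-1,Alice\nP-2,Bob\n")               # first run
import_listing(conn, "P-1,Alice\nP-2,Carol\nP-3,Dave\n")   # four hours later

result = conn.execute(
    "SELECT property_ref, owner FROM properties ORDER BY property_ref"
).fetchall()
```

No comparison step is needed: unchanged rows are rewritten with the same values, changed rows are updated, and new rows are inserted.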
I think it would be best to download the file and update the existing table, maybe using REPLACE or REPLACE INTO. "REPLACE works exactly like INSERT, except that if an old row in the table has the same value as a new row for a PRIMARY KEY or a UNIQUE index, the old row is deleted before the new row is inserted." http://dev.mysql.com/doc/refman/5.0/en/replace.html
Presumably you have a list of columns that will have to match in order for you to decide that the two things match.
If you create a UNIQUE index over those columns, then you can use either INSERT ... ON DUPLICATE KEY UPDATE (manual) or REPLACE INTO ... (manual).
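The two options differ in one way that matters when the table also has an auto-increment ID: REPLACE deletes the old row and inserts a new one, while an upsert updates the row in place. The sketch below shows this using Python's sqlite3 module, where REPLACE behaves the same way and the upsert is spelled `ON CONFLICT ... DO UPDATE`; table, columns and values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE properties (
        id           INTEGER PRIMARY KEY AUTOINCREMENT,
        property_ref TEXT NOT NULL UNIQUE,
        owner        TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO properties (property_ref, owner) VALUES (?, ?)",
    [("P-1", "Alice"), ("P-2", "Bob")],  # these get ids 1 and 2
)

# REPLACE: the old P-1 row (id 1) is deleted and a new row is inserted.
conn.execute("REPLACE INTO properties (property_ref, owner) VALUES ('P-1', 'Zoe')")

# Upsert: the existing P-2 row is updated and keeps its id.
conn.execute("""
    INSERT INTO properties (property_ref, owner) VALUES ('P-2', 'Yan')
    ON CONFLICT (property_ref) DO UPDATE SET owner = excluded.owner
""")

result = conn.execute(
    "SELECT id, property_ref, owner FROM properties ORDER BY property_ref"
).fetchall()
```

If other tables reference the ID, or if a growing ID is a concern, the in-place update is usually the safer of the two.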