I am looking for the best approach to save changes to a list of checkboxes to the database. Here is my setup:
I have a list of checkboxes that are either checked or unchecked, based on some database entries, as such:
Animals I like:
0 Cats
X Dogs
0 Birds
X Elephants
Now the user can completely change his/her mind and select Birds and deselect Dogs and I want to save this change in the database as efficiently as possible.
The way the db is structured, I have a user table (with a user_id) and an animal table (with an animal_id). The "likes" are tracked in a pivot table (because it is a many-to-many relationship).
Here are a few approaches I have considered, but I am interested in any other/better/more efficient ones:
1) On save, delete all entries in the pivot table for this user and enter only the checked ones.
This has the advantage that I don't need to compare the before/after choices. The disadvantage is that I delete and re-create entries that have not changed (i.e. elephants in the above example). If I attach a creation timestamp to the elephant like, for example, it will change on every save, even when I don't change that selection.
2) On save, I query the db to get a list of all original likes. Then I compare this list to the new likes. Every time I encounter an original like that is not in the new list, I remove it. If I encounter a new like that is not in the original list, I add it.
This has the advantage of only writing the actual changes to the db, but it seems like an awful lot of queries. If the list of animals is long and many changes are made, the looping could result in a lot of db transactions.
So, what would be the best practice to solve this issue? I mean, it must be a common problem and I don't want to reinvent the wheel here.
Option 1 would be the best, but since you don't want your timestamps disturbed, you CAN build a somewhat more efficient system than "check each record individually".
1) fetch a list of the user's choices from the db, $original.
2) fetch the list of choices from the submitted form, $submitted.
3) use array_diff() to figure out what's changed and what you need to do to the database:
e.g.
$original = array(2,4,6,8);
$submitted = array(4,7,8);
// so 2 and 6 have been removed, and 7's been added.
$unchanged = array_intersect($original, $submitted); // 4, 8
$removed = array_diff($original, $submitted); // 2, 6
$added = array_diff($submitted, $original); // 7
$sqlDelete = "DELETE FROM pivot_table WHERE user_id = $userId AND animal_id IN (2, 6);"; // remove 2&6 for this user only ($userId: the current user's id)
$sqlInsert = "INSERT INTO pivot_table (animal_id, ...) VALUES (7, ...)"; // insert 7
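For what it's worth, here is a minimal sketch of the same diff idea end to end. It is written in Python against an in-memory SQLite database purely so it can be run standalone; it is not the PHP/MySQL stack from the question. The table name user_animal, its columns, and the function sync_likes are assumptions made for illustration.

```python
import sqlite3


def sync_likes(conn, user_id, submitted_ids):
    """Apply only the differences between stored and submitted likes."""
    rows = conn.execute(
        "SELECT animal_id FROM user_animal WHERE user_id = ?", (user_id,)
    ).fetchall()
    original = {row[0] for row in rows}
    submitted = set(submitted_ids)

    removed = original - submitted  # stored before, no longer checked
    added = submitted - original    # newly checked

    # One transaction: both statements commit together or roll back together.
    with conn:
        conn.executemany(
            "DELETE FROM user_animal WHERE user_id = ? AND animal_id = ?",
            [(user_id, animal_id) for animal_id in removed],
        )
        conn.executemany(
            "INSERT INTO user_animal (user_id, animal_id) VALUES (?, ?)",
            [(user_id, animal_id) for animal_id in added],
        )
    return removed, added


# Hypothetical pivot table with a creation timestamp per like.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_animal ("
    " user_id INTEGER, animal_id INTEGER,"
    " created_at TEXT DEFAULT CURRENT_TIMESTAMP,"
    " PRIMARY KEY (user_id, animal_id))"
)
conn.executemany(
    "INSERT INTO user_animal (user_id, animal_id) VALUES (1, ?)",
    [(2,), (4,), (6,), (8,)],
)

removed, added = sync_likes(conn, 1, [4, 7, 8])
remaining = sorted(
    row[0]
    for row in conn.execute("SELECT animal_id FROM user_animal WHERE user_id = 1")
)
```

Rows 4 and 8 are never touched, so their created_at values survive the save, which was the objection to the delete-everything approach.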
When you originally present the choices to the user you've done a database query to display what s/he has already checked. Now you can compare the changes to the original. If something changed, update, otherwise, leave it alone.
So the situation I have is this. I have two tables, one for users and the other storing the email(s) for those users. The email table is related to users by a foreign key user_id. I am trying to set up an archive system where the user can archive these users and keep them in the production db. But I need to delete the original and re-insert the user to give them a new id. This way they can archive again later and the data won't be corrupt and any new information will then be archived also. This is just a small example; I am actually doing this with 10 tables, some of them related to each other. The amount of data involved can be quite large, and when the archive starts it can take several minutes to complete because for every user I am also checking, archiving, deleting and re-inserting that user. The way I have it set up now, there can be literally several thousand queries to accomplish the complete archive.
So I am trying to rewrite this so that the number of query calls is limited to one, by getting all the users, looping through them and building an insert query that inserts many records with one call. Everything is great for the users, but when I want to do this for their emails I run into an issue. As I'm looping through the users I save the id (the original id) to an array, hoping to use it later to update the new email records with the new user_id that was created. I can get the first id from my big insert and I know how many records there were, so I know all the new ids. I can set up a for loop to create a new array with the new ids, thereby matching the original id array.
So the question is, is there a way to set up an update statement that would allow for a multiple update statement in one call? The indexes from the two arrays will match.
update email_table
set user_id = {new_id_array}
where user_id = {old_id_array}
I have seen several different options out there but nothing that quite does what I'm trying to do. Any help is very much appreciated.
The simplest way to do what you need I think is to have some table containing old_id <-> new_id relations.
Once you have that data somewhere in your database you just need to join:
https://www.db-fiddle.com/f/bYump2wDn5n2hCxCrZemgt/1
UPDATE email_table e
JOIN replacement r
ON e.user_id = r.old_id
SET e.user_id = r.new_id;
But if you still want to work with plain lists, you need to generate a query that maps old ids to new ids with ELT and FIELD:
https://www.db-fiddle.com/f/noBWqJERm2t399F4jxnUJR/1
UPDATE email_table e
SET e.user_id = ELT(FIELD(e.user_id, 1, 2, 3, 5), 7, 8, 9, 10)
WHERE e.user_id IN (1,2,3,5);
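As a rough, runnable illustration of the mapping-table route, here is a sketch in Python with an in-memory SQLite database. That is an assumption made only to keep the example self-contained: SQLite has no MySQL-style UPDATE ... JOIN, so a correlated subquery stands in for the join. The sample emails and the extra user 42 are made up; the table and column names follow the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE email_table (email TEXT, user_id INTEGER);
    CREATE TABLE replacement (old_id INTEGER PRIMARY KEY, new_id INTEGER);
    INSERT INTO email_table VALUES
        ('a@example.com', 1), ('b@example.com', 2),
        ('c@example.com', 5), ('d@example.com', 42);
""")

# The two parallel arrays from the question: index i of one maps to index i of the other.
old_ids = [1, 2, 3, 5]
new_ids = [7, 8, 9, 10]

with conn:
    # One multi-row insert fills the mapping table...
    conn.executemany(
        "INSERT INTO replacement (old_id, new_id) VALUES (?, ?)",
        zip(old_ids, new_ids),
    )
    # ...and one UPDATE rewrites every affected email row.
    conn.execute("""
        UPDATE email_table
        SET user_id = (SELECT new_id FROM replacement
                       WHERE old_id = email_table.user_id)
        WHERE user_id IN (SELECT old_id FROM replacement)
    """)

result = dict(conn.execute("SELECT email, user_id FROM email_table"))
```

Because everything happens in a single UPDATE statement, each row is remapped exactly once, so overlapping old and new ids should not cause chained replacements.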
I have a drupal site, and am trying to use php to grab some data from my database. What I need to do is to display, in a user's profile, how many times they were the first person to review a venue (exactly like Yelp's "First" tally). I'm looking at two options, and trying to decide which is the better way to approach it.
First Option: The first time a venue is reviewed, save the value of the reviewer's user ID into a table in the database. This table will be dedicated to storing the UID of the first user to review each venue. Then, use a simple query to display a count in the user's profile of the number of times their UID appears in this table.
Second Option: Use a set of several more complex queries to display the count in the user's profile, without storing any extra data in the database. This will rely on several queries which will have to do something along the lines of:
Find the ID for each review the user has created
Check the ID of the venue contained in each review
Find the first review for each venue based on the venue ID stored in the review
Get the User ID of the author for the first review
Check which, if any, of these Author UIDs match the current user's UID
I'm assuming that this would involve creating an array of the IDs in step one, and then somehow executing each step for each item in the array. There would also be 3 or 4 different tables involved in the query.
I'm relatively new to writing SQL queries, so I'm wondering if it would be better to perform the set of potentially longer queries, or to take the small database hit and use a much much smaller count query instead. Is there any way to compare the advantages of either, or is it like comparing apples and oranges?
The volume of extra data stored will be negligible; the simplification to the processing will be significant. The data won't change (the first person to review a venue won't change), so there is a negligible update burden. Go with the extra data and simpler query.
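To make the "extra data, simpler query" option concrete, here is a small sketch in Python with SQLite (chosen only so it runs standalone; the question's stack is Drupal/PHP/MySQL). The table first_review and the helpers record_review and first_count are hypothetical names. SQLite's INSERT OR IGNORE plays the role that INSERT IGNORE would play in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# venue_id is the primary key, so each venue can hold only one "first reviewer".
conn.execute(
    "CREATE TABLE first_review ("
    " venue_id INTEGER PRIMARY KEY,"
    " uid INTEGER NOT NULL)"
)


def record_review(conn, venue_id, uid):
    """Call on every new review; only the first one per venue is stored."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO first_review (venue_id, uid) VALUES (?, ?)",
            (venue_id, uid),
        )


def first_count(conn, uid):
    """The simple count query shown on the user's profile."""
    return conn.execute(
        "SELECT COUNT(*) FROM first_review WHERE uid = ?", (uid,)
    ).fetchone()[0]


record_review(conn, 10, 1)  # user 1 is first on venue 10
record_review(conn, 10, 2)  # ignored: venue 10 already has a first reviewer
record_review(conn, 11, 2)  # user 2 is first on venue 11
record_review(conn, 12, 1)  # user 1 is first on venue 12
```

An index on uid would likely be worth adding once the table grows, since the profile query filters on that column.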
I've got a problem that I just can't seem to find the answer to. I've developed a very small CRM-like application in PHP that's driven by MySQL. Users of this application can import new data to the database via an uploaded CSV file. One of the issues we're working to solve right now is duplicate, or more importantly, near duplicate records. For example, if I have the following:
Record A: [1, Bob, Jones, Atlanta, GA, 30327, (404) 555-1234]
and
Record B: [2, Bobby, Jones, Atlanta, GA, 30327, Bob's Shoe Store, (404) 555-1234]
I need a way to see that these are both similar, take the record with more information (in this case record B) and remove record A.
But here's where it gets even more complicated. This must be done upon importing new data, and it must also be available as a function I can execute to remove duplicates from the database at any time. I have been able to put something together in PHP that gets all duplicate rows from the MySQL table and matches them up by phone number, or by using implode() on all columns in the row and then using strlen() to decide the longest record.
There has got to be a better way of doing this, and one that is more accurate.
Do any of you have any brilliant suggestions that I may be able to implement or build on? It's obvious that when importing new data I'll need to open their CSV file into an array or temporary MySQL table, do the duplicate/similar search, then recompile the CSV file or add everything from the temporary table to the main table. I think. :)
I'm hoping that some of you can point out something that I may be missing that can scale somewhat decently and that's somewhat accurate. I'd rather present a list of duplicates we're 'unsure' about to a user that's 5 records long, not 5,000.
Thanks in advance!
Alex
If I were you I'd give a UNIQUE key to name, surname and phone number since in theory if all these three are equal then it means that it is a duplicate. I am thinking so because a phone number can have only one owner. Anyways, you should find a combination of 2-3 or maybe 4 columns and assign them a unique key. Once you have such a structure, run something like this:
-- assuming that you have defined something like the following in your CREATE TABLE:
UNIQUE(phone, name, surname)
-- then you should perform something like:
INSERT INTO your_table (phone, name, surname) VALUES ($val1, $val2, $val3)
ON DUPLICATE KEY UPDATE phone = IFNULL($val1, phone),
name = IFNULL($val2, name),
surname = IFNULL($val3, surname);
So basically, if the inserted row collides with the unique key, this code updates the existing row rather than inserting a new one. IFNULL returns its first argument unless that argument is NULL, in which case it returns the second, which here is the value that already exists in your table. Hence, it will update your row with as much information as possible. Note that the key columns are already equal whenever the duplicate fires, so the pattern pays off once you also list the non-key columns (a company name, for instance) in both the INSERT and the UPDATE clause.
I don't think there are any brilliant solutions. You need to decide which data fields you can rely on, and in what priority, for detecting similarity: for example phone, some kind of ID, or a uniform address or official name.
You can store cleaned-up values (reduced to a common format, such as digits only for phones or a concatenated full address) alongside each row and use them for the similarity search when adding records.
Then you need to decide, based on data completeness, whether to update the existing row with the more complete fields or to delete the old row and add the new one.
I don't know of any ready-made solution for such a variable task, and I doubt one exists.
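A very rough sketch of the "cleaned-up value plus completeness" idea, in Python rather than the asker's PHP so it can be run on its own. The field names and the rule "the longer non-empty value wins" are assumptions; the latter is a crude stand-in for a real completeness rule, in the spirit of the strlen() approach from the question.

```python
import re


def normalize_phone(raw):
    """Reduce a phone number to its digits so formatting differences vanish."""
    return re.sub(r"\D", "", raw or "")


def merge_records(a, b):
    """Combine two records for the same contact, keeping the longer value per field."""
    merged = {}
    for key in a.keys() | b.keys():
        va, vb = a.get(key) or "", b.get(key) or ""
        merged[key] = va if len(va) >= len(vb) else vb
    return merged


def dedupe(records):
    """Group records by normalized phone; records without a phone are kept as-is."""
    by_phone, no_phone = {}, []
    for rec in records:
        key = normalize_phone(rec.get("phone"))
        if not key:
            no_phone.append(dict(rec))
        elif key in by_phone:
            by_phone[key] = merge_records(by_phone[key], rec)
        else:
            by_phone[key] = dict(rec)
    return list(by_phone.values()) + no_phone


record_a = {"first": "Bob", "last": "Jones", "city": "Atlanta",
            "phone": "(404) 555-1234"}
record_b = {"first": "Bobby", "last": "Jones", "city": "Atlanta",
            "company": "Bob's Shoe Store", "phone": "404-555-1234"}
unique = dedupe([record_a, record_b])
```

Matching on phone alone will over-merge people who share a number, so in practice the grouping key would probably combine several normalized fields, with borderline matches going to the 'unsure' list for a human to review.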
I'm having some trouble designing a +1/-1 voting system in PHP; it should vaguely resemble the SO voting system. On average, each item will get roughly 100 to 1,000 votes and will be viewed by many.
I don't know whether I should use:
A database table dedicated for voting, which has the userid and their vote... store their vote as a boolean, then calculate the "sum" of the votes in MySQL.
A text field in the "item" table, containing the uids that already voted (in a serialized array), and also a numeric field that contains the total sum of the votes.
A numeric field in the "item" table, that contains the total sum of the votes, then store whether or not the user voted in a text field (with a serialized array of the poll id).
Something completely different (please post some more ideas!)
I'd probably go with option 3 that you've got listed above. By putting the total number of votes as another column in the item table you can get the total number of votes for an item without doing any more sql queries.
If you need to store which user voted on which item I'd probably create another table with the fields of item, user and vote. item would be the itemID, user would be the userID, vote would contain + or - depending on whether it's an up or down vote.
I'm guessing you'll only need to access this table when a user is logged in to show them which items they've voted on.
I recommend storing the individual votes in one table.
In another table, store the summary information such as question/poll ID and tally.
Do one insert into the individual votes table.
For the summary table you can do this:
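$votedUpOrDown = ($voted == 1) ? 1 : -1; // compare with ==; a single = would assign 1 and always count an up-vote
$query = 'UPDATE summary SET tally = tally + '.$votedUpOrDown.' WHERE pollid = '.(int) $pollId; // cast keeps the id numeric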
I'd go with a slight variant of the first option:
A database table dedicated for voting, which has the userid and their vote... store their vote as a boolean, then calculate the "sum" of the votes in MySQL.
Replace the boolean with an integer: +1 for an up-vote and -1 for a down-vote.
Then, instead of computing the sum over and over again, keep a running total somewhere; every time there is an up-vote, add one to the total and subtract one every time there is a down-vote. You could do this with an insert-trigger in the database or you could send an extra UPDATE thing SET vote_total = vote_total + this_vote to the database when adding new votes.
You'd probably want a unique constraint on the thing/userid pair in the vote tracking table too.
Keeping track of individual votes makes it easy to keep people from voting twice. Keeping a running total makes displaying the total quick and easy (and presumably this will be the most common operation).
Adding a simple sanity checker that you can run to ensure that the totals match the votes would be a nice addition as well.
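Pulling those pieces together, here is a hedged sketch of the votes table, the unique constraint, the running total and the sanity checker. It uses Python with an in-memory SQLite database instead of PHP/MySQL purely so it is runnable as-is; the table names item and vote and the helpers cast_vote and totals_consistent are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (
        item_id    INTEGER PRIMARY KEY,
        vote_total INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE vote (
        item_id INTEGER NOT NULL,
        user_id INTEGER NOT NULL,
        value   INTEGER NOT NULL CHECK (value IN (1, -1)),
        UNIQUE (item_id, user_id)  -- one vote per user per item
    );
    INSERT INTO item (item_id) VALUES (1);
""")


def cast_vote(conn, item_id, user_id, value):
    """Record a vote and bump the running total; return False on a repeat vote."""
    try:
        with conn:  # insert and update commit together or not at all
            conn.execute(
                "INSERT INTO vote (item_id, user_id, value) VALUES (?, ?, ?)",
                (item_id, user_id, value),
            )
            conn.execute(
                "UPDATE item SET vote_total = vote_total + ? WHERE item_id = ?",
                (value, item_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def totals_consistent(conn):
    """Sanity check: every running total equals the sum of its individual votes."""
    mismatches = conn.execute("""
        SELECT COUNT(*) FROM item
        WHERE vote_total <> (SELECT COALESCE(SUM(value), 0)
                             FROM vote WHERE vote.item_id = item.item_id)
    """).fetchone()[0]
    return mismatches == 0


results = [
    cast_vote(conn, 1, 100, 1),
    cast_vote(conn, 1, 101, 1),
    cast_vote(conn, 1, 102, -1),
    cast_vote(conn, 1, 100, 1),  # same user again: rejected by the unique constraint
]
total = conn.execute("SELECT vote_total FROM item WHERE item_id = 1").fetchone()[0]
```

Displaying the score only reads item.vote_total, so the common case stays a single cheap lookup.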
serialized array: Please don't do that. Such things make it very difficult to root around the database by hand to check and fix things. Serialized data structures also make it very difficult (impossible in some cases) to properly constrain your data with foreign keys, check constraints, unique constraints, and what have you. Storing serialized data structures in the database is usually a bad idea unless the database doesn't need to know anything about the data other than how to give it back to you. Packing an array into a text column is a recipe for broken and inconsistent data in your database: broken code is easy to fix, broken data is often forever.
I had a hard time summing up my question. Basically:
There's a table called "files". Files holds an entry called "grades". It is used to identify the particular grade level a file might be useful for. Because a file can be useful for > 1 grade level, I store things like this
If it's only good for 3rd grade
grades: 3
If it's good for 3rd, 4th and 5th:
grades: 3,4,5
etc etc.
When putting together a SQL query to retrieve these files, I ran into a weird issue- Basically a user can say "I only want things that are good for 2nd and 3rd grade". So I should look for files that have "2,3" in the Grades area. Easy! BUT!
It could also have "1,2,3" or "2,3,4" or "2,4".
I'm getting a headache just thinking about it. It's easy enough to parse those entries via the commas to get "1" and "2", but what's the most efficient way to match a SQL record to the query? It seems like a waste to get EVERY RECORD in the DB, parse them down and then match them up again.
Is it better to go back to square one and create a DB called "files" and individual tables for each grade? That also seems like a waste- Writing multiple records for one file.
What's the solution here? I'm a little flummoxed.
several options here...
1) store the grades as an integer where each grade corresponds to a bit. grade 1 = bit 0, grade 2 = bit 1, grade 3 = bit 2, and so on. then grades 1,2,3 would correspond to binary 00000111 (7) and grades 2,4 would be binary 00001010 (10) etc; then querying becomes a simple matter of doing an AND comparison... if you want all rows where grades 2 and 4 are selected (and possibly others) then select * from files where (grades & 10) = 10
2) if there are only a relatively few grades you could store each as a boolean column.
3) store the grades in a separate table and then the relationship between grades and files in a 3rd join table (since it is a many to many relationship).
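Option 1 is easy to get subtly wrong, so here is a tiny Python sketch of the bit arithmetic (Python only for the sake of a runnable example; the helper names grades_to_mask and covers and the sample files are made up). The important part is that "has all of these grades" means the AND result must equal the wanted mask, not merely be non-zero.

```python
def grades_to_mask(grades):
    """Pack grade numbers into one integer: grade 1 -> bit 0, grade 2 -> bit 1, ..."""
    mask = 0
    for grade in grades:
        mask |= 1 << (grade - 1)
    return mask


def covers(file_mask, wanted_mask):
    """True when the file is marked for every wanted grade (and possibly others)."""
    return (file_mask & wanted_mask) == wanted_mask


# Hypothetical files and the grades they are useful for.
files = {
    "a": grades_to_mask([2, 3]),
    "b": grades_to_mask([1, 2, 3]),
    "c": grades_to_mask([2, 4]),
    "d": grades_to_mask([3]),
}

wanted = grades_to_mask([2, 3])
matches = sorted(name for name, mask in files.items() if covers(mask, wanted))
```

The trade-off is that a bitmask column generally cannot use an ordinary index for this kind of test, so the database still scans rows; the join table in option 3 tends to scale better.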
To elaborate on what @emh said. Best option IMHO, would be having a grades table that connects to the files table on the file id (#3). You can then store the connection between grade and file in a new row each time (if the connection doesn't already exist)
tbl_file_grades
-----------
file_id
grade
When you're doing the search, you can join the two tables and filter the search by the grade column.
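SELECT files.file_info FROM files
INNER JOIN tbl_file_grades ON files.file_id = tbl_file_grades.file_id
WHERE tbl_file_grades.grade IN (1, 2) -- list every requested grade here
GROUP BY files.file_id, files.file_info
HAVING COUNT(DISTINCT tbl_file_grades.grade) = 2 -- must equal the number of requested grades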
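One thing to watch with the join table: a single row holds a single grade, so "good for grade 2 AND grade 3" has to be checked across rows, for instance by filtering with IN and then counting the distinct grades found per file. Below is a hedged sketch in Python with an in-memory SQLite database (used only to make the example runnable); the sample file names and the helper files_for_all_grades are assumptions, while the table and column names follow the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE files (file_id INTEGER PRIMARY KEY, file_info TEXT);
    CREATE TABLE tbl_file_grades (
        file_id INTEGER,
        grade   INTEGER,
        PRIMARY KEY (file_id, grade)
    );
    INSERT INTO files VALUES
        (1, 'fractions.pdf'), (2, 'spelling.pdf'), (3, 'maps.pdf');
    INSERT INTO tbl_file_grades VALUES
        (1, 2), (1, 3),          -- grades 2,3
        (2, 1), (2, 2), (2, 3),  -- grades 1,2,3
        (3, 2), (3, 4);          -- grades 2,4
""")


def files_for_all_grades(conn, grades):
    """Return file_info for files that are marked for every grade in `grades`."""
    placeholders = ", ".join("?" for _ in grades)
    sql = f"""
        SELECT files.file_info FROM files
        INNER JOIN tbl_file_grades ON files.file_id = tbl_file_grades.file_id
        WHERE tbl_file_grades.grade IN ({placeholders})
        GROUP BY files.file_id
        HAVING COUNT(DISTINCT tbl_file_grades.grade) = ?
        ORDER BY files.file_id
    """
    return [row[0] for row in conn.execute(sql, [*grades, len(set(grades))])]


both = files_for_all_grades(conn, [2, 3])
only_four = files_for_all_grades(conn, [4])
```

Dropping the HAVING clause turns the same query into "good for grade 2 OR grade 3", which may be what some searches actually want.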
I'm not sure whether the extra table for grades is necessary. That would depend on your needs. It seems like if you're happy without it now, then it isn't all that important to have.
And also, most important, welcome to SO.