I'm going to develop a stock maintenance system using PHP and MySQL. It will run on a server machine, so many users can update the stock data (in/out) at the same time.
I'm currently working on this system and have the following problem.
User A opens record “A”. ex- val=10
User B opens record “A”. ex - val=10
User A saves changes to record “A”. ex - val=10+2=12 (adds 2 items, so stock should be 12)
User B saves changes to record “A”. ex - here I need B to see record "A" with val=12, so that B's update is val=12+3=15 (adds 3 items, so the final stock should be 15). Instead, B still holds val=10 and saves val=10+3=13.
In this example, User A’s changes are lost – replaced by User B’s changes.
I know MySQL's InnoDB provides row-level locking. My question is:
Does the InnoDB engine handle concurrency control, and is that enough on its own to avoid the "lost update" problem, or do I need extra coding to avoid it?
If it is enough, please tell me how InnoDB handles my example above (the lost update).
(sorry for my bad english)
thanks
InnoDB allows concurrent access, so User A and User B can both be working from the same data. Row-level locking only serializes the individual UPDATE statements: User A updates the row based on the value he/she read, then User B does the same with the value read earlier -- ultimately resulting in the loss of User A's change.
You should consider an alternative if every update is vital to keep. For example, if both users are updating a blog article, you could make a new table that holds all these edits. Both users' edits would be preserved, regardless of when they retrieved the article content. When the article is retrieved, you can check when the most recent edit occurred and retrieve that instead.
Look, there's something called "versioning".
The idea is simple:
When a user opens a record, he also gets the version number.
When he saves changes to that record, the update at the SQL level is conditional: it happens ONLY if the record's current version is still the one he read. The same update also increases the version by one.
This way ensures you're not writing to a "stale" copy of your record.
Hope it's clear.
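The versioning idea above can be sketched as follows. This is only a minimal illustration: it uses Python's built-in sqlite3 module as a stand-in for PHP + MySQL/InnoDB, and the table stock, its version column and the helper functions are hypothetical names, not something from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL/InnoDB; all names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (id INTEGER PRIMARY KEY, val INTEGER, version INTEGER)")
conn.execute("INSERT INTO stock VALUES (1, 10, 1)")

def read_record(record_id):
    """Return (val, version) as the user sees them when opening the record."""
    return conn.execute(
        "SELECT val, version FROM stock WHERE id = ?", (record_id,)
    ).fetchone()

def save_record(record_id, new_val, seen_version):
    """Write only if the row still has the version that was read; bump the version."""
    cur = conn.execute(
        "UPDATE stock SET val = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_val, record_id, seen_version),
    )
    conn.commit()
    return cur.rowcount == 1  # False means the caller held a stale copy

val_a, ver_a = read_record(1)                # User A opens the record (val=10)
val_b, ver_b = read_record(1)                # User B opens the record (val=10)
saved_a = save_record(1, val_a + 2, ver_a)   # A saves first: accepted
saved_b = save_record(1, val_b + 3, ver_b)   # B's copy is stale: rejected
if not saved_b:
    val_b, ver_b = read_record(1)            # B re-reads (val=12) and retries
    saved_b_retry = save_record(1, val_b + 3, ver_b)
final_val, final_version = read_record(1)
```

What to do after a rejected save (retry automatically, or show the user the new value first) is an application decision; the sketch simply retries.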
You could also have the client poll the server: keep a record of when the row was last updated, and if User B updates the record before User A saves, notify User A that the record has changed and that his changes won't take effect, or refresh the displayed values dynamically.
You can use two tables for this purpose. First - StockItems with item name, id, and count. Second - StockActivities with item id and operation amount.
To add or remove items from stock you need to insert records to the second table StockActivities, with item id and quantity that is added / removed.
item id:1, qnt: +10
item id:1, qnt: +1
item id:10, qnt: -2
The count field of the StockItems table should be read-only for users and should be calculated from the StockActivities table.
For example, you can create an AFTER INSERT trigger on the StockActivities table that updates the count field of the affected stock item.
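A rough sketch of the two-table layout with such a trigger is shown here. It runs against SQLite through Python's sqlite3 module rather than MySQL, so the trigger syntax differs slightly from what MySQL expects; the item names, starting counts and the trigger name are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.executescript("""
CREATE TABLE StockItems (
    id INTEGER PRIMARY KEY,
    name TEXT,
    count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE StockActivities (
    id INTEGER PRIMARY KEY,
    item_id INTEGER NOT NULL REFERENCES StockItems(id),
    qnt INTEGER NOT NULL
);
-- Keep StockItems.count in step with every inserted stock movement.
CREATE TRIGGER stock_activity_ai AFTER INSERT ON StockActivities
BEGIN
    UPDATE StockItems SET count = count + NEW.qnt WHERE id = NEW.item_id;
END;
INSERT INTO StockItems (id, name, count) VALUES (1, 'widget', 0), (10, 'gadget', 5);
""")

# Users never touch StockItems.count directly; they only record movements.
for item_id, qnt in [(1, 10), (1, 1), (10, -2)]:
    conn.execute(
        "INSERT INTO StockActivities (item_id, qnt) VALUES (?, ?)", (item_id, qnt)
    )
conn.commit()

counts = dict(conn.execute("SELECT id, count FROM StockItems"))
```

Because every change is an inserted row, two users adding stock at the same time produce two activity rows, and neither overwrites the other.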
Judging by the comments left, I think it prudent to respond with some pointers I have come across, in case someone needs them.
If you only want to update a value by an offset, you can do this quite easily and atomically. Assume the following data:
+----+--------+-------+
| id | name   | price |
+----+--------+-------+
|  1 | Foo    |    49 |
|  2 | Bar    |   532 |
|  3 | Foobar |    24 |
+----+--------+-------+
We can now run the following queries to add one to the price:
select id, price from prices where name like "Foo";
-- Later in the application
update prices set price=50 where id=1;
This is the non-atomic way to do this: it is only correct if nothing changes or fetches the row between the two queries. A more atomic way to do this is the following.
select id, price from prices where name like "Foo";
-- Later in the application
update prices set price=price+1 where id=1;
Here, the update increments the price in a single statement, so no other client can come in and modify the row between the read and the write.
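Applied to the stock example from the question, the relative update might look like the sketch below. It uses Python's sqlite3 module in place of PHP/MySQL, and the table stock and the function adjust_stock are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.execute("CREATE TABLE stock (id INTEGER PRIMARY KEY, val INTEGER NOT NULL)")
conn.execute("INSERT INTO stock VALUES (1, 10)")

def adjust_stock(record_id, delta):
    """Apply a relative change; the database reads and writes the row in one statement."""
    conn.execute("UPDATE stock SET val = val + ? WHERE id = ?", (delta, record_id))
    conn.commit()

adjust_stock(1, 2)  # User A adds 2 items
adjust_stock(1, 3)  # User B adds 3 items, whatever value B last saw on screen
final_val = conn.execute("SELECT val FROM stock WHERE id = 1").fetchone()[0]
```

The application sends only the difference (+2, +3), never the value it read earlier, so there is no stale value to write back.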
Additionally, there are methods of updating data safely, where the nature of the update is not a simple addition or subtraction. Let's say, here, that we have the following data:
+----+----------+---------------------+
| id | job_name | last_run            |
+----+----------+---------------------+
|  1 | foo_job  | 2016-07-13 00:00:00 |
|  2 | bar_job  | 2016-07-14 00:00:00 |
+----+----------+---------------------+
In this case, we have multiple different clients, where all clients can do any job. We then need a way to dispatch work to one client, and only one client.
We can either use a transaction, where we will error out if the record has been updated or we can use a technique called CAS, or Compare and Swap.
Here's how we do this in MySQL:
update jobs set last_run=NOW() where id=1 and last_run='2016-07-13 00:00:00'
Then, in the data returned from mysql, we can tell the number of rows affected. If we have affected a row, then we have successfully updated it, and the job is ours. If there were no rows updated, then another machine has updated it, claiming the job there.
This works because any update from our application changes the column, and since the column's previous value is a condition of the update, a concurrent change makes the update match no rows, leaving the application to decide what happens next.
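Checking the affected-row count could look roughly like this. The sketch uses Python's sqlite3 module as a stand-in for MySQL, passes the new timestamp in as a parameter instead of calling NOW(), and try_claim is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, job_name TEXT, last_run TEXT)")
conn.execute("INSERT INTO jobs VALUES (1, 'foo_job', '2016-07-13 00:00:00')")

def try_claim(job_id, seen_last_run, now):
    """Compare-and-swap: succeeds only if last_run still has the value we read."""
    cur = conn.execute(
        "UPDATE jobs SET last_run = ? WHERE id = ? AND last_run = ?",
        (now, job_id, seen_last_run),
    )
    conn.commit()
    return cur.rowcount == 1  # one affected row means the job is ours

seen = "2016-07-13 00:00:00"  # both clients read this value earlier
first_claim = try_claim(1, seen, "2016-07-14 00:00:00")   # first client wins
second_claim = try_claim(1, seen, "2016-07-14 00:00:05")  # second client matches no row
```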
I have a table dislikes which contains two columns, idone and idtwo.
These are unique user ids, for example:
| idone | idtwo |
-----------------
| 5 | 4 |
This means that the user with id=5 does not like the user with id=4. What I have in PHP is an array containing the ids of all the users that the current user has selected as disliked.
So say dislikes={1,2,3}, this means that the current user does not like user 1,2, or 3. There is an unknown number of users in the database.
So if user 1 chooses to dislike user 2 and user 3 (this is done via HTML dropdown), I pass dislike={2,3} to a PHP page which processes this data.
I want the PHP page to then add entries (1,2) and (1,3). Here is the first problem: how can I make sure to add only unique entries?
Also, say that user 1 decides he no longer dislikes user 2. Then I pass dislike={3} to the PHP page and must somehow remove all entries (1,!3), i.e. all entries in which user 1 dislikes anyone except user 3. How can I achieve this? Or is there a better way?
Since you're using MySQL the easiest thing is probably to use REPLACE INTO instead of INSERT with a primary key or unique index on the pair of columns (idone, idtwo).
Alternatively, on update, you can run a transaction that does any one of:
Remove existing rows for this user, add all rows, commit
Select existing rows, remove the rows from your local set that you would duplicate, add only new rows, commit
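The first transactional variant (remove the user's rows, add all rows, commit) can be sketched like this. It is written against SQLite via Python's sqlite3 module, which happens to accept REPLACE INTO as well; set_dislikes and get_dislikes are hypothetical helper names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.execute(
    "CREATE TABLE dislikes (idone INTEGER NOT NULL, idtwo INTEGER NOT NULL, "
    "PRIMARY KEY (idone, idtwo))"  # the key on the pair rules out duplicates
)

def set_dislikes(user_id, disliked_ids):
    """Replace the user's whole dislike list inside one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("DELETE FROM dislikes WHERE idone = ?", (user_id,))
        conn.executemany(
            "REPLACE INTO dislikes (idone, idtwo) VALUES (?, ?)",
            [(user_id, other) for other in disliked_ids],
        )

def get_dislikes(user_id):
    rows = conn.execute(
        "SELECT idtwo FROM dislikes WHERE idone = ? ORDER BY idtwo", (user_id,)
    )
    return [row[0] for row in rows]

set_dislikes(1, [2, 3, 3])      # the repeated 3 is absorbed by the primary key
after_first = get_dislikes(1)
set_dislikes(1, [3])            # user 1 no longer dislikes user 2
after_second = get_dislikes(1)
```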
For a project I am working on, I need the ability (as Stack Overflow has) to save all previous edits (revisions) of posts.
Note that a post can have 1-to-N associations (for example, 1 post with 5 associated images).
How would you suggest I design the database for this?
Of course the ID of the post should stay the same so that URLs don't break:
site/post/123 (whichever revision it is)
Each revision of a post must be manually approved, so you can't directly show the last revision inserted. How would you suggest I design the db?
I have thought of:
Table: Post
postID | reviewID | isApproved | authorID | text
And the image table (for example image, but it could be everything)
Secondary Table: Image
imageID | postID | reviewID | imagedata
Actually, I would split the post table in two, with the approved revisions in one and the latest (not yet approved) revision in another. The rationale is that any non-approved revision which is not the latest would be superseded by the next one (unless you really want to keep track of all the intermediate modifications, approved or not).
Table: OldPost
postID | reviewID | authorID | text
Table: PendingPost
postID | authorID | text
In that layout, whenever a new revision is approved, it must be moved to the approved ones, but you don't have to filter them out when displaying the whole history, and conversely, you won't have to filter out the approved revisions in the approval part of your site.
You could even refine the layout with yet another dedicated table for the latest approved revision (so three tables for the post in total, not counting attachments). This partitioning would improve the overall performance of your site for the most common queries, at the cost of more complex queries when you need all the data (less frequent operations).
Table: CurrentPost
postID | authorID | text
As you can see, this table structure is the same as the one for pending posts, so the updates would be trivial.
Moving a revision to the old post table requires finding out the revision count, but you would have to do that anyway with a more classic db layout.
Regarding the attachment table, the layout seems to work.
Separate all aspects of a post into global information and versionable information. In other words, decide which things can change in a revision and which always apply to every revision. These become the fields of your two tables, one for posts and one for revisions. The revisions table also needs a column specifying which post the revision belongs to and a column for whether the revision is approved, and the posts table needs a column specifying what the current revision is.
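One possible reading of this posts/revisions split is sketched below, using SQLite through Python's sqlite3 module instead of MySQL. Every column name (current_revision_id, is_approved, body) and helper function is an assumption made for the example, not something the answer specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.executescript("""
CREATE TABLE posts (                  -- global, never-versioned information
    post_id INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL,
    current_revision_id INTEGER       -- the revision currently shown
);
CREATE TABLE revisions (              -- versionable information
    revision_id INTEGER PRIMARY KEY,
    post_id INTEGER NOT NULL REFERENCES posts(post_id),
    is_approved INTEGER NOT NULL DEFAULT 0,
    body TEXT NOT NULL
);
INSERT INTO posts (post_id, author_id) VALUES (123, 7);
""")

def submit_revision(post_id, body):
    cur = conn.execute(
        "INSERT INTO revisions (post_id, body) VALUES (?, ?)", (post_id, body)
    )
    conn.commit()
    return cur.lastrowid

def approve_revision(revision_id):
    """Mark the revision approved and make it the post's visible revision."""
    with conn:
        conn.execute(
            "UPDATE revisions SET is_approved = 1 WHERE revision_id = ?", (revision_id,)
        )
        conn.execute(
            "UPDATE posts SET current_revision_id = ? "
            "WHERE post_id = (SELECT post_id FROM revisions WHERE revision_id = ?)",
            (revision_id, revision_id),
        )

def visible_body(post_id):
    row = conn.execute(
        "SELECT r.body FROM posts p "
        "JOIN revisions r ON r.revision_id = p.current_revision_id "
        "WHERE p.post_id = ?",
        (post_id,),
    ).fetchone()
    return row[0] if row else None

first = submit_revision(123, "first draft")
approve_revision(first)
second = submit_revision(123, "unreviewed edit")  # pending, so not shown yet
shown = visible_body(123)
```

The post id (and therefore the URL) never changes; only current_revision_id moves when a revision is approved.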
I created a commenting system that allow users to submit comments on each item.
It turned into bit of a project/scope creep and now I need to implement the ability for users to edit their original comments and keep track of those comments.
All comments are located in the comments table
comments: id, comment, item_id, timestamp
Now that revisions must be tracked, I created a new table titled revisions:
comment_id, revision_id, timestamp
All comments (new or old) are entered into the comments table. If the user decides to revise an existing comment, the revision is entered as a new record in comments and then recorded in the revisions table. Once the new comment is entered into the comments table, the id that was created is stored in revisions.revision_id, and revisions.comment_id is populated with the id of the original comment the user revised (hope I didn't lose you).
Now I've come to the problem I need help with: I need to display a list of all comments for a specific item, which would have a query of something like
select * from comments where item_id = 1
Now that I have added the revisions table, I need to retrieve a list of comments for the specific item (just like the above query does) and (here's the kicker) if any comment has been revised, I need to return the most recent version of that comment.
What is the best way of accomplishing this?
I thought about running two queries, one to retrieve all the comments in the comments table, store in an array, and another query to return all records within the revisions table where I would set revisions.comment_id to be distinct and would only want to return the more recent one
the revisions query might look something like this
select distinct comment_id, revision_id, timestamp
from revisions order by timestamp desc
What is the best way of only displaying the most recent version of each comment (some will have revisions and most won't)?
I am not a sql expert, so it might be accomplished using sql or will I need to run two different queries, store data into separate arrays, then run thru each array, compare and strip out the older versions of that comment? example (part in theory) below
// Drop every comment that has been superseded by a revision
foreach ($revisions as $r):
    foreach ($comments as $key => $c):
        if ($c['id'] == $r['comment_id']) unset($comments[$key]);
    endforeach;
endforeach;
return $comments; // return the comments array after it was stripped of the older comments
I imagine that a single query returning only the most recent version of each comment would be the best practice. If so, could you provide the appropriate query? Otherwise, is running two queries into two arrays and stripping values out of the comments array the best way, or is there a better one?
Thanks in advance.
First off, I'll add two alternative approaches and then I'll edit with a query to deal with your current schema.
Option 1 - Add a deleted flag to your comments. When a comment is revised, do as you already do but also mark the original as deleted. Then you just need WHERE deleted = 0 wherever you want active comments.
Option 2 - Change your revision table to be a clone of the comment table, plus an additional field for when the revision was made. Now, whenever you revise a comment, don't create a new record in comment; just update the existing row and add a new row to the revisions table. This is easily maintained with a trigger and is a very standard auditing pattern.
EDIT Option 3 - A query to cope with your schema.
As described, if I make a comment, then edit it twice (with no other activity), I get something like this...
id | comment | item_id | timestamp
----+--------------+---------+-----------
1 | Hello, | 1 | 13:00
2 | World! | 1 | 14:00
3 | Hello, World | 1 | 15:00
comment_id | revision_id | timestamp
-----------+-------------+-----------
1 | 2 | 14:00
2 | 3 | 15:00
Based on this, the live comment is the only one without an entry in the revisions table (as a comment_id)...
SELECT *
FROM comments
WHERE NOT EXISTS (SELECT * FROM revisions WHERE comment_id = comments.id)
AND item_id = ?   -- bind the item id here
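To check the query against the sample rows above, here is a small sketch using Python's sqlite3 module in place of MySQL. The fourth comment ("Never edited") is an extra row added for the example, to show that comments without any revision are returned too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.executescript("""
CREATE TABLE comments (id INTEGER PRIMARY KEY, comment TEXT, item_id INTEGER, timestamp TEXT);
CREATE TABLE revisions (comment_id INTEGER, revision_id INTEGER, timestamp TEXT);
INSERT INTO comments VALUES
    (1, 'Hello,', 1, '13:00'),
    (2, 'World!', 1, '14:00'),
    (3, 'Hello, World', 1, '15:00'),
    (4, 'Never edited', 1, '15:30');   -- extra row, not in the sample data
INSERT INTO revisions VALUES (1, 2, '14:00'), (2, 3, '15:00');
""")

def live_comments(item_id):
    """Return comments for the item that have not been superseded by a revision."""
    rows = conn.execute(
        "SELECT id, comment FROM comments "
        "WHERE item_id = ? "
        "AND NOT EXISTS (SELECT 1 FROM revisions WHERE comment_id = comments.id) "
        "ORDER BY id",
        (item_id,),
    )
    return rows.fetchall()

result = live_comments(1)
```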
I have a users table that has the following fields: userid, phone, and address. Since this is user data, I'm letting the user change them whenever he wants. Problem is I'd like to keep track of those changes and preserve the old data too. Here's some of the ideas I considered:
appending the new data to the old data and using a separator like a pipe. When retrieving the field, I would check for the existence of that separator and if exists, get the chars after it as the new data. (feels cumbersome and doesn't feel right)
setting up a different changes table with the following fields: userid, fieldname, fieldcontent. When/if a user changes data (any data), I would log the event in this separate table under the user's userid, and the name/id of the field and the old content of the field, then I can now overwrite his old data in users with the new. If I want to find all changes made by this user, I would search the changes table by his userid. Problem with this is that I'm mixing all data changes (of all fields) into one table and so the fieldcontent field in changes has to be text to accommodate the varying field types. This still seems better than the first idea, but still not sure if I'm doing the right thing.
What other ideas are there or known best practices to keep old data?
Thanks in advance
Whatever you do don't do the first one.
The changes table is a better approach. It's also called an audit or history table. I wouldn't do a history of key-value pairs however. Instead do a history per relevant table. You can do this in application code or via database triggers. Basically whenever an insert, update or delete happens you record which happened and what data was changed.
Table user:
id
username
email address
phone
address
Table user_history:
id
change_type (I, U or D for insert, update or delete)
user_id (FK user.id)
email address
phone
address
date/time of change
optionally, also store who changed the record
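A trigger-based version of this history table might look like the sketch below. It runs on SQLite through Python's sqlite3 module, so the trigger syntax is SQLite's rather than MySQL's; the table is called users here, the trigger names and changed_at column are hypothetical, and storing the new values for inserts and updates but the old values for deletes is one possible choice, not something the answer prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, email TEXT, phone TEXT, address TEXT);
CREATE TABLE user_history (
    id INTEGER PRIMARY KEY,
    change_type TEXT NOT NULL,                 -- 'I', 'U' or 'D'
    user_id INTEGER NOT NULL,
    email TEXT, phone TEXT, address TEXT,
    changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TRIGGER users_ai AFTER INSERT ON users BEGIN
    INSERT INTO user_history (change_type, user_id, email, phone, address)
    VALUES ('I', NEW.id, NEW.email, NEW.phone, NEW.address);
END;
CREATE TRIGGER users_au AFTER UPDATE ON users BEGIN
    INSERT INTO user_history (change_type, user_id, email, phone, address)
    VALUES ('U', NEW.id, NEW.email, NEW.phone, NEW.address);
END;
CREATE TRIGGER users_ad AFTER DELETE ON users BEGIN
    INSERT INTO user_history (change_type, user_id, email, phone, address)
    VALUES ('D', OLD.id, OLD.email, OLD.phone, OLD.address);
END;
""")

conn.execute(
    "INSERT INTO users (id, username, email, phone, address) "
    "VALUES (1, 'ann', 'ann@example.com', '123-456-7890', '555 Johnson Street')"
)
conn.execute("UPDATE users SET address = '123 Sesame Street' WHERE id = 1")
conn.execute("DELETE FROM users WHERE id = 1")
conn.commit()

history = conn.execute(
    "SELECT change_type, address FROM user_history ORDER BY id"
).fetchall()
```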
A very simple way that we have used to track such changes is this:
users_history
userid
changenumber smallint not null
changedate datetime not null
changeaddr varchar(32) not null
phone NULL,
address NULL
primary key on (userid, changenumber)
Each time you INSERT or UPDATE a record in the users table, simply INSERT a new record in the users_history table. changenumber starts at 1 and increments from there. changedate and changeaddr could be used to track when and where.
If a field value has not changed, feel free to put NULL in the respective users_history table field.
At the end of the day, your app does not need to change or store bulky history data in the users table, but you have all of it at your fingertips.
Edit:
This does preserve the old data. See the following example where the user started with a given address and phone, then 4 days later updated the address, and 5 days after that updated the phone. You have everything.
Current users record:
100 | 234-567-8901 | 123 Sesame Street
Sample History Table
100 | 1 | 2009-10-01 12:00 | 123-456-7890 | 555 Johnson Street
100 | 2 | 2009-10-05 13:00 | NULL | 123 Sesame Street
100 | 3 | 2009-10-10 15:00 | 234-567-8901 | NULL
The simplest way to implement this is to have another table just for history, holding a snapshot. You don't need to mirror all the fields, just
change_id // row id (just for easy management later on if you need to delete a specific row; otherwise it's not really necessary)
user_id // Original user id
change_time // time of change
data // serialized data before change.
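A minimal sketch of this snapshot approach follows, with JSON as the serialization format and SQLite (via Python's sqlite3 module) standing in for MySQL. The table name user_changes and the update_user helper are hypothetical; in PHP the same idea would use json_encode or serialize.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.row_factory = sqlite3.Row      # rows can be converted to dicts by column name
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY, phone TEXT, address TEXT);
CREATE TABLE user_changes (
    change_id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    change_time TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    data TEXT NOT NULL               -- serialized row as it was before the change
);
INSERT INTO users VALUES (100, '123-456-7890', '555 Johnson Street');
""")

def update_user(userid, **changes):
    """Snapshot the current row, then apply the change, in one transaction."""
    if not changes or not set(changes) <= {"phone", "address"}:
        raise ValueError("unknown or missing column")
    with conn:
        before = conn.execute(
            "SELECT * FROM users WHERE userid = ?", (userid,)
        ).fetchone()
        if before is None:
            raise LookupError("no such user")
        conn.execute(
            "INSERT INTO user_changes (user_id, data) VALUES (?, ?)",
            (userid, json.dumps(dict(before))),
        )
        assignments = ", ".join(f"{col} = ?" for col in changes)
        conn.execute(
            f"UPDATE users SET {assignments} WHERE userid = ?",
            (*changes.values(), userid),
        )

update_user(100, address="123 Sesame Street")
snapshot = json.loads(conn.execute("SELECT data FROM user_changes").fetchone()[0])
current = dict(conn.execute("SELECT * FROM users WHERE userid = 100").fetchone())
```

The trade-off is that serialized history is easy to write but awkward to query by individual field.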
I have been browsing this site for the answer but I'm still a little unsure how to plan a similar system in its database structure and implementation.
In PHP and MySQL it would be clear that some achievements are earned immediately (when a specialized action is taken, in SO case: Filled out all profile fields), although I know SO updates and assigns badges after a certain amount of time. With so many users & badges wouldn't this create performance problems (in terms of scale: high number of both users & badges).
So the database structure I assume would be something as simple as:
Badges     | Badges_User    | User
----------------------------------------------
bd_id      | bd_id          | user_id
bd_name    | user_id        | etc
bd_desc    | assigned(bool) |
           | assigned_at    |
But as some people have said, it would be better to have an incremental-style approach so that a user who has 1,000,000 forum posts won't slow any function down.
Would it then be another table for badges that could be incremental or just a 'progress' field in the badges_user table above?
Thanks for reading and please focus on the scalability of the desired system (like SO thousands of users and 20 to 40 badges).
EDIT: to iron out some confusion: I had assigned_at as a Date/Time. The criteria for awarding each badge would be best placed inside prepared queries/functions for each badge, wouldn't they? (better flexibility)
I think the structure you've suggested (without the "assigned" field as per the comments) would work, with the addition of an additional table, say "Submissions_User", containing a reference to user_id & an incrementing field for counting submissions. Then all you'd need is an "event listener" as per this post and methinks you'd be set.
EDIT: For the achievement badges, run the event listener upon each submission (only for the user making the submission of course), and award any relevant badge on the spot. For the time-based badges, I would run a CRON job each night. Loop through the complete user list once and award badges as applicable.
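The per-submission listener could be sketched roughly as below, using SQLite through Python's sqlite3 module as a stand-in for MySQL. The min_submissions column, the two example badges and the on_submission function are assumptions made for illustration; INSERT OR IGNORE is SQLite syntax (MySQL spells it INSERT IGNORE).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL
conn.executescript("""
CREATE TABLE Badges (bd_id INTEGER PRIMARY KEY, bd_name TEXT, min_submissions INTEGER);
CREATE TABLE Submissions_User (
    user_id INTEGER PRIMARY KEY,
    submissions INTEGER NOT NULL DEFAULT 0    -- incrementing counter
);
CREATE TABLE Badges_User (
    bd_id INTEGER NOT NULL,
    user_id INTEGER NOT NULL,
    assigned_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (bd_id, user_id)              -- a badge is awarded at most once
);
INSERT INTO Badges VALUES (1, 'First post', 1), (2, 'Regular', 3);
""")

def on_submission(user_id):
    """Event listener: bump the counter, then award any badge now earned."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO Submissions_User (user_id) VALUES (?)", (user_id,)
        )
        conn.execute(
            "UPDATE Submissions_User SET submissions = submissions + 1 WHERE user_id = ?",
            (user_id,),
        )
        conn.execute(
            "INSERT OR IGNORE INTO Badges_User (bd_id, user_id) "
            "SELECT b.bd_id, s.user_id FROM Badges b "
            "JOIN Submissions_User s ON s.submissions >= b.min_submissions "
            "WHERE s.user_id = ?",
            (user_id,),
        )

for _ in range(3):
    on_submission(42)
earned = [
    row[0]
    for row in conn.execute(
        "SELECT bd_id FROM Badges_User WHERE user_id = 42 ORDER BY bd_id"
    )
]
```

Only the submitting user's counter is touched per event, so the cost does not grow with the total number of users or with how many posts a user already has.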
regarding the sketch you included: get rid of the boolean column on badges_user. it makes no sense there: that relation is defined in terms of the predicate "user user_id earned the badge bd_id at assigned_at".
as for your overall question: define the schema to be relational without regard for speed first (that'll get you rid of half of potential perf. problems, possibly in exchange for different perf. problems), index it properly (what's proper depends on the query patterns), then if it's slow, derive a (still relational) design from that that's faster. like you may need to have some aggregates precomputed, etc.
I would keep a similar type structure to what you have
Badges(badge_id, badge_name, badge_desc)
Users(user_id, etc)
UserBadges(badge_id, user_id, date_awarded)
And then add tracking table(s) depending on what you want to track and at what level of detail... then you can update the table accordingly and set triggers on it to "award" the badges
User_Activity(user_id, posts, upvotes, downvotes, etc...)
You can also track stats from the other direction too and trigger badge awards
Posts(post_id, user_id, upvotes, downvotes, etc...)
Some other good points are made here
I think this is one of those cases where your many-to-many table (Badges_User) is appropriate.
But with a small alteration so that unassigned badges aren't stored.
I assume assigned_at is a date and/or time.
Default is that the user does not have the badges.
Badges     | Badges_User    | User
----------------------------------------------
bd_id      | bd_id          | user_id
bd_name    | user_id        | etc
bd_desc    | assigned_at    |
           |                |
This way only badges actually awarded are stored.
A Badges_User row is only created when a user gets a badge.
Regards
Sigersted