Data is being deleted from table - php

I've got a really horrible problem on my hands: for some reason data is being wiped from a particular table's column and I have no idea why.
The table is called events. Inside the table there are a number of columns (name, date, etc.). Recently I added a field called 'schedule', which is of type TEXT.
On the web page, you can edit the event in different tabs and create a schedule (fields using jQuery clone, etc.). When the schedule data gets saved into the database, the HTML $_POST data gets converted into a JSON array using json_encode:
$db->prepared['schedule'] = json_encode(
array(
"scheduleday" => ($_POST['scheduleday'] ?? array()),
"scheduletime" => ($_POST['scheduletime'] ?? array()),
"scheduledescription" => ($_POST['scheduledescription'] ?? array()),
"schedulevenue" => ($_POST['schedulevenue'] ?? "")
)
);
The resulting json array could look like this, for example:
{"scheduleday":{"1":"2020-08-25","2":"2020-08-26"},"scheduletime":{"1":["19:30","20:00 - 20:50"],"2":["14:00 - 14:50","14:50 - 15:00","15:00 - 15:50","15:50 - 16:00","16:00 - 16:50"]},"scheduledescription":{"1":["Introduction","John Smith"],"2":["John Smith","Break","John Smith teaching","Break","John Smith teaching"]},"schedulevenue":"1 Acia Avenue"}
Straight after this is INSERTed or UPDATEd into the MySQL database, it is sent to a function that uses fpdf to turn the JSON array into a PDF which can be downloaded.
This works well, apart from one problem. Random (or what appear to be random) schedules are going missing. We can create a schedule one day and then a few days or a week later, suddenly the file will have been deleted and the column's value in the table wiped. No other data in the table is getting deleted, nothing apart from the schedule.
Here's what I've tried to find out what's going on:
SQL Trigger
I've set a trigger on the event table which dumps data into a table called event_trigger AFTER an update is run. This is the code for the trigger:
BEGIN
INSERT INTO `event_trigger` (oldID,name,old_schedule,new_schedule) VALUES
(OLD.id,
OLD.name,
OLD.schedule,
NEW.schedule);
END
This has been helpful in the sense that I now know if there is an event with a schedule missing, because the NEW.schedule will be empty. However, the last two times a schedule has gone missing (yesterday and today, in fact) it's been outside of office hours, at 18:45 and 19:22, so no one should be performing any updates on events at those times.
.txt File Log
The other thing I've done is, on the page where the events are updated, add a text log which dumps the prepared variables, the SQL statement, the user and the user ID to a text file. Unfortunately this isn't helping, because when a schedule goes missing, nothing gets logged there. All that tells me is that it's happening somewhere else.
I don't know how to narrow this down further. There is no user activity at the times when an event's schedule is deleted. The closest I've gotten is using the TRIGGER, but the information from the trigger is not enough; I can't get the IP, SQL statement, user ID or anything like that. Just the OLD and NEW variables.
Can anyone help me think of ways to investigate? I'd be so grateful. This has been going on for over a month now, and it's infuriating because I just cannot see why it is happening.
The only extra option I can think of is turning on full SQL logs, but I am reluctant to do that as it will slow the server down immensely.

Have you examined your web server's logs? If this data corruption comes in via your web application, you should be able to see the exact time and originating IP address of the unfortunate event. You may also be able to figure out which page of your web app caused the problem.
To track things down more accurately in your database server, add, to your event_trigger table, these columns:
ts (when the trigger fired)
user (the MySQL host and user that issued the UPDATE that fired the trigger)
query (the text of the UPDATE query)
Then change your trigger to say
BEGIN
INSERT INTO event_trigger
(oldID,name,old_schedule,new_schedule, ts, user, query)
VALUES
(OLD.id,
OLD.name,
OLD.schedule,
NEW.schedule,
NOW(),
-- USER() is the account of the connected client; CURRENT_USER() inside a trigger returns the trigger's DEFINER
USER(),
(SELECT INFO FROM information_schema.PROCESSLIST WHERE ID = CONNECTION_ID())
);
END
Then, your event_trigger table will show you the user (for example webapp@localhost or cybercreep@cracker.example.com) that issued the UPDATE, when it was issued, and what the exact query was.
(With the exact query you can search your code base to try to track down what function is running amok.)
Once you know what database user is issuing the query, you can consider suspending access for that particular user after business hours, and see who complains. But keep in mind that the database user is probably the generic username used by your web application, so it may not tell you much.
It's very likely that this is a legitimate web app user misusing, by mistake, your system.
Pro tip : Put automatic timestamp columns in all your application tables so you can keep track of changes. Add
ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
columns to your tables, and MySQL takes care of updating them.
Pro tip 2 (harder): add an action log table to your database, and make your web app insert a row into it each time a user takes an action. Then you can run queries like "who did what yesterday between 19:00 and 19:30?" Customer support people love to be able to do that.
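A minimal sketch of what such an action log could look like from PHP. The table name action_log, its columns, and the helper names logAction and actionsBetween are all made up for illustration, and an in-memory SQLite database stands in for MySQL so the snippet runs on its own (on MySQL the id column would be declared AUTO_INCREMENT).

```php
<?php
// Hypothetical helper: writes one row per user action. Names are assumptions.
function logAction(PDO $db, int $userId, string $action, string $details = '', ?string $when = null): void
{
    $stmt = $db->prepare(
        'INSERT INTO action_log (user_id, action, details, created_at)
         VALUES (:user_id, :action, :details, :created_at)'
    );
    $stmt->execute([
        ':user_id'    => $userId,
        ':action'     => $action,
        ':details'    => $details,
        ':created_at' => $when ?? date('Y-m-d H:i:s'),
    ]);
}

// Answers "who did what between two points in time?"
function actionsBetween(PDO $db, string $fromTs, string $toTs): array
{
    $stmt = $db->prepare(
        'SELECT user_id, action, details, created_at
           FROM action_log
          WHERE created_at BETWEEN :from_ts AND :to_ts
          ORDER BY created_at'
    );
    $stmt->execute([':from_ts' => $fromTs, ':to_ts' => $toTs]);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

// Demo on in-memory SQLite (a stand-in for MySQL).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE action_log (
    id         INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL,
    action     VARCHAR(64) NOT NULL,
    details    TEXT,
    created_at DATETIME NOT NULL
)');

logAction($db, 14, 'event.update', 'event 42: schedule saved', '2020-08-25 19:22:00');
logAction($db, 7, 'event.view', 'event 42', '2020-08-25 21:00:00');

// Only the 19:22 update by user 14 falls inside this window.
$rows = actionsBetween($db, '2020-08-25 19:00:00', '2020-08-25 19:30:00');
print_r($rows);
```

Calling logAction from every page that can modify an event (not just the main edit page) is what would make an unexpected after-hours write show up here.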

Related

What do you think of this approach for logging changes in mysql and have some kind of audit trail

I've been reading through several topics now and did some research about logging changes to a mysql table. First let me explain my situation:
I've a ticket system with a table: 'ticket'
As of now I've created triggers which will enter a duplicate entry in my table: 'ticket_history' which has "action" "user" and "timestamp" as additional columns. After some weeks of testing I'm not entirely happy with that setup, since every change creates a full copy of my row in the history table. I do understand that disk space is cheap and I should not worry about it but in order to retrieve some kind of log or nice looking history for the user is painful, at least for me. Also, with the trigger I've written I get a new row in the history even if there is no change. But this is just a design flaw of my trigger!
Here my trigger:
BEFORE UPDATE ON ticket FOR EACH ROW
BEGIN
INSERT INTO ticket_history
SET
idticket = NEW.idticket,
time_arrival = NEW.time_arrival,
idticket_status = NEW.idticket_status,
tmp_user = NEW.tmp_user,
action = 'update',
timestamp = NOW();
END
My new approach in order to avoid having triggers
After spending some time on this topic I came up with an approach I would like to discuss and implement. But first I have some questions about it:
My idea is to create a new table:
id  sql_fwd    sql_bwd    keys    values  user  timestamp
---------------------------------------------------------
1   UPDATE...  UPDATE...  status  5       14    12345678
2   UPDATE...  UPDATE...  status  4       7     12345678
The flow would look like this in my mind:
At first I would select something or more from the DB:
SELECT keys FROM ticket;
Then I display the data in 2 input fields:
<input name="key" value="value" />
<input type="hidden" name="key" value="value" />
Hit submit and give it to my function:
I would start with a SELECT again: SELECT * FROM ticket;
and make sure that the hidden input field == the value from the latest select. If so I can proceed and know that no other user has changed something in the meanwhile. If the hidden field does not match I bring the user back to the form and display a message.
Next I would build the SQL Queries for the action and also the query to undo those changes.
$sql_fwd = "UPDATE ticket
SET idticket_status = 1
WHERE idticket = '".$c_get['id']."';";
$sql_bwd = "UPDATE ticket
SET idticket_status = 0
WHERE idticket = '".$c_get['id']."';";
Having that I run the UPDATE on ticket and insert a new entry in my new table for logging.
With that I can try to catch possible overwrites while two users are editing the same ticket in the same time and for my history I could simply look up the keys and values and generate some kind of list. Also having the SQL_BWD I simply can undo changes.
My questions to that would be:
Would it be noticeable doing an additional select every time I want to update something?
Do I lose some benefits I would have with triggers?
Are there any big disadvantages?
Are there any functions on my MySQL server or in PHP which already do something like that?
Or might there be a much easier way to do something like that?
Would a slight change to the trigger I already have be enough?
If I understand this right, MySQL only performs an update if the value has changed, but the trigger is executed anyway, right?
If I'm able to change the trigger, can I still somehow prevent data being overwritten on the MySQL server when 2 users try to edit the ticket at the same time, or would I do this in PHP anyway?
Thank you for the help already
Another approach...
When a worker starts to make a change...
Store the time and worker_id in the row.
Proceed to do the tasks.
When the worker finishes, fetch the last worker_id that touched the record; if it is himself, all is well. Clear the time and worker_id.
If, on the other hand, another worker slips in, then some resolution is needed. This gets into your concept that some things can proceed in parallel.
Comments could be added to a different table, hence no conflict.
Changing the priority may not be an issue by itself.
Other things may be messier.
It may be better to have another table for the time & worker_ids (& ticket_id). This would allow for flagging that multiple workers are currently touching a single record.
As for History versus Current, I (usually) like to have 2 tables:
History -- blow-by-blow list of what changes were made, when, and by whom. This table is only INSERTed into.
Current -- the current status of the ticket. This table is mostly UPDATEd.
Also, I prefer to write the History directly from the "database layer" of the app, not via Triggers. This gives me much better control over the details of what goes into each table and when. Plus the 'transactions' are clear. This gives me confidence that I am keeping the two tables in sync:
BEGIN; INSERT INTO History...; UPDATE Current...; COMMIT;
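As a rough illustration of that pattern from a PHP database layer, here is a sketch using PDO. The function name changeTicketStatus and the columns user_id and changed_at are assumptions (ticket, ticket_history, idticket, idticket_status and action come from the question), and in-memory SQLite stands in for MySQL so the example runs on its own. On MySQL both tables would need a transactional engine such as InnoDB for the rollback to mean anything.

```php
<?php
// Hypothetical "database layer" function: the history row and the
// current row are written inside one transaction, so they stay in sync.
function changeTicketStatus(PDO $db, int $ticketId, int $newStatus, int $userId): void
{
    $db->beginTransaction();
    try {
        $history = $db->prepare(
            'INSERT INTO ticket_history (idticket, idticket_status, user_id, action, changed_at)
             VALUES (?, ?, ?, ?, ?)'
        );
        $history->execute([$ticketId, $newStatus, $userId, 'update', date('Y-m-d H:i:s')]);

        $current = $db->prepare('UPDATE ticket SET idticket_status = ? WHERE idticket = ?');
        $current->execute([$newStatus, $ticketId]);

        $db->commit();
    } catch (Throwable $e) {
        $db->rollBack(); // neither table keeps a half-finished change
        throw $e;
    }
}

// Demo on in-memory SQLite (a stand-in for MySQL).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE ticket (
    idticket        INTEGER PRIMARY KEY,
    idticket_status INTEGER NOT NULL
)');
$db->exec('CREATE TABLE ticket_history (
    id              INTEGER PRIMARY KEY,
    idticket        INTEGER NOT NULL,
    idticket_status INTEGER NOT NULL,
    user_id         INTEGER NOT NULL,
    action          VARCHAR(16) NOT NULL,
    changed_at      DATETIME NOT NULL
)');
$db->exec('INSERT INTO ticket (idticket, idticket_status) VALUES (1, 0)');

changeTicketStatus($db, 1, 5, 14);
```

Because the application writes the history row itself, it can record the logged-in application user, which a trigger has no way of knowing.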
I've answered a similar question before. You'll see some good alternatives in that question.
In your case, I think you're merging several concerns - one is "storing an audit trail", and the other is "managing the case where many clients may want to update a single row".
Firstly, I don't like triggers. They are a side effect of some other action, and for non-trivial cases, they make debugging much harder. A poorly designed trigger or audit table can really slow down your application, and you have to make sure that your trigger logic is coordinated between lots of developers. I realize this is personal preference and bias.
Secondly, in my experience, the requirement is rarely "show the status of this one table over time" - it's nearly always "allow me to see what happened to the system over time", and if that requirement exists at all, it's usually fairly high priority. With a ticketing system, for instance, you probably want the name and email address of the users who created and changed the ticket status; the name of the category/classification, perhaps the name of the project etc. All of those attributes are likely to be foreign keys on to other tables. And when something does happen that requires audit, the requirement is likely "let me see immediately", not "get a database developer to spend hours trying to piece together the picture from 8 different history tables". In a ticketing system, it's likely a requirement for the ticket detail screen to show this.
If all that is true, then I don't think history tables populated by triggers are a good idea - you have to build all the business logic into two sets of code, one to show the "regular" application, and one to show the "audit trail".
Instead, you might want to build "time" into your data model (that was the point of my answer to the other question).
Since then, a new style of data architecture has come along, known as CQRS. This requires a very different way of looking at application design, but it is explicitly designed for reactive applications; these offer much nicer ways of dealing with the "what happens if someone edits the record while the current user is completing the form" question. Stack Overflow is an example - we can see, whilst typing our comments or answers, whether the question was updated, or other answers or comments are posted. There's a reactive library for PHP.
I do understand that disk space is cheap and I should not worry about it but in order to retrieve some kind of log or nice looking history for the user is painful, at least for me.
A large history table is not necessarily a problem. Huge tables only use disk space, which is cheap. They slow things down only when making queries on them. Fortunately, the history is not something you'd use all the time, most likely it is only used to solve problems or for auditing.
It is useful to partition the history table, for example by month or week. This allows you to simply drop very old records, and more importantly, since the history of the previous months has already been backed up, your daily backup schedule only needs to back up the current month. This means a huge history table will not slow down your backups.
With that I can try to catch possible overwrites while two users are editing the same ticket in the same time
There is a simple solution:
Add a column "version_number".
When you select with intent to modify, you grab this version_number.
Then, when the user submits new data, you do:
UPDATE ...
SET all modified columns,
version_number=version_number+1
WHERE ticket_id=...
AND version_number = (the value you got)
If someone came in-between and modified it, then they will have incremented the version number, so the WHERE will not find the row. The query will return a row count of 0. Thus you know it was modified. You can then SELECT it, compare the values, and offer conflict resolution options to the user.
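Here is a small sketch of that version check in PHP with PDO. The function name saveTicketStatus is made up, the column names follow the question, and in-memory SQLite stands in for MySQL so the snippet is runnable as-is.

```php
<?php
// Returns true if the update was applied, false if someone else
// changed the row since $expectedVersion was read.
function saveTicketStatus(PDO $db, int $ticketId, int $newStatus, int $expectedVersion): bool
{
    $stmt = $db->prepare(
        'UPDATE ticket
            SET idticket_status = :status,
                version_number  = version_number + 1
          WHERE idticket = :id
            AND version_number = :version'
    );
    $stmt->execute([
        ':status'  => $newStatus,
        ':id'      => $ticketId,
        ':version' => $expectedVersion,
    ]);
    // 0 rows affected means the WHERE did not match: the version moved on.
    return $stmt->rowCount() === 1;
}

// Demo on in-memory SQLite (a stand-in for MySQL).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE ticket (
    idticket        INTEGER PRIMARY KEY,
    idticket_status INTEGER NOT NULL,
    version_number  INTEGER NOT NULL DEFAULT 1
)');
$db->exec('INSERT INTO ticket (idticket, idticket_status) VALUES (1, 0)');

// Two users both loaded the form while the row was at version 1.
$first  = saveTicketStatus($db, 1, 5, 1); // true: version matched, row is now at version 2
$second = saveTicketStatus($db, 1, 4, 1); // false: stale version, nothing was overwritten
var_dump($first, $second);
```

In a real form the version number read at load time would travel in a hidden input, and a false return is the point where the app would re-SELECT the row and show the conflict to the user.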
You can also add columns like who modified it last, and when, and present this information to the user.
If you want the user who opens the modification page to lock out other users, it can be done too, but this needs a timeout (in case they leave the window open and go home, for example). So this is more complex.
Now, about history:
You don't want to have, say, one large TEXT column called "comments" where everyone enters stuff, because it will need to be copied into the history every time someone adds even a single letter.
It is much better to view it like a forum: each ticket is like a topic, which can have a string of comments (like posts), stored in another table, with the info about who wrote it, when, etc. You can also historize that.
The drawback of using a trigger is that the trigger does not know about the user who is logged in, only the MySQL user. So if you want to record who did what, you will have to add a column with the user_id as I proposed above. You can also use Rick James' solution. Both would work.
Remember though that MySQL triggers don't fire on foreign key cascade deletes... so if the row is deleted in this way, it won't work. In this case doing it in the application is better.

Getting the time on a MySQL DB table entry

I have an application form that should have been closed last Friday. Due to things I can't control, it wasn't closed, and now people have applied after last Friday. Unfortunately, those people can't come to the event, so we have to contact them to tell them the bad news.
My problem is that when I made the form I was too fast and forgot to add a timestamp column to the table.
Is there some way (either using php, phpmyadmin or sql-commands) to find out who has applied after a certain time (i.e. when the row was added to the table)?
The Database is a MySQL-database.
Does the table have an autoincrementing ID field?
If so, figure out the value of that ID field for the last person you were able to accept, then use this query in an SQL client like phpmyadmin.
SELECT *
FROM tablename
WHERE tablename.id > last_accepted_id
That will show all the rows for the people to whom you owe bad news.
This may be possible if you have query logging on (it is not on by default).
The log file location is configured in the MySQL .conf file:
/etc/*.conf
On older MySQL versions the setting looks like this:
log=/tmp/yourlogfilename.log
(Newer versions use general_log and general_log_file instead.) If this was already set, you can go to that log and see when the queries were run and what they were.

Getting a MySQL database difference

I have a mysql database. What I'd like to do is perform an arbitrary action on it, and then figure out what changed. Something like this:
//assume connection to db already established
before();//saves db state
perform_action();//does stuff to db
diff();//prints what happened
I'd want it to output something like:
Row added in table_0 [details]
Row added in table_1 [details]
Row modified in table_5 [details]
Row deleted in table_2 [details]
Any ideas?
To further clarify: You know how on stackoverflow, if you check a post's edits, you can see red lines/green highlights indicating what's been changed? I want something like that, but for mysql databases.
Instead of copying your whole database in order to save the state for a later diff, you might be better off by using triggers:
http://dev.mysql.com/doc/refman/5.0/en/triggers.html
When you setup appropriate triggers, you can log changes to a table - for example, you can setup a trigger that automatically logs the old values and the new values for every update. To see the changes, query the table that was filled by the trigger.
Of course, the trigger is not restricted to changes made by your application, it will also log updates done by other applications. But this is also the case if you diff the old version of the database with the new version of the database.
I think normally your application would log any interesting changes as it makes them. Or you would set up history tables for everything with datetimes.
To do it the way you describe, you could dump the contents of the database into a file before and after your action and do a diff on the two files. In php, you can check out xdiff: http://us.php.net/manual/en/book.xdiff.php
If this is something you're doing only occasionally in controlled circumstances to test some queries you're not sure about, you can dump and diff on the command line.
One way is to parse the log files, which will give you the exact SQL statements executed in your database. I'm not exactly sure how to separate SQL statements made by your application from those of other applications (if that's the case).
The only thing I can think of is to do some combination of a few somewhat hacky things:
Save a [temporary?] table of row IDs, to check for new rows. If you need to know what was in deleted or modified rows before, you'll need to copy the whole DB, which would be rather messy.
Have each row have a datestamp that gets modified on update; grab rows for whom the updated datestamp is newer than when the analysis started.
Have a layer between your application and the database (if you have something like the classic $db->query(), it would make this easy), log queries sent, which can then be looked at.
I suppose the real question is whether you want to know what queries are being executed against the DB, or whether you want to know what the queries you're running are actually doing.
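The "layer between your application and the database" idea could look roughly like the sketch below. LoggingDb is a made-up wrapper class, not an existing library, and it keeps the log in memory for simplicity; writing each entry to a file or table would be the obvious next step. In-memory SQLite stands in for MySQL so it runs on its own.

```php
<?php
// Hypothetical thin wrapper: every statement goes through query(),
// which records the SQL, its parameters and the affected row count.
final class LoggingDb
{
    private PDO $pdo;
    private array $log = [];

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    public function query(string $sql, array $params = []): PDOStatement
    {
        $stmt = $this->pdo->prepare($sql);
        $stmt->execute($params);
        $this->log[] = [
            'sql'    => $sql,
            'params' => $params,
            // Meaningful for INSERT/UPDATE/DELETE; not reliable for other statements.
            'rows'   => $stmt->rowCount(),
        ];
        return $stmt;
    }

    public function getLog(): array
    {
        return $this->log;
    }
}

// Demo on in-memory SQLite (a stand-in for MySQL).
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db = new LoggingDb($pdo);

$db->query('CREATE TABLE items (id INTEGER PRIMARY KEY, name VARCHAR(64))');
$db->query('INSERT INTO items (name) VALUES (?)', ['foo']);
$db->query('UPDATE items SET name = ? WHERE name = ?', ['bar', 'foo']);

foreach ($db->getLog() as $entry) {
    printf("%s %s (rows: %d)\n", $entry['sql'], json_encode($entry['params']), $entry['rows']);
}
```

This only sees queries that go through the wrapper, so it answers "what did my code do?" rather than "what happened to the database?".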

MYSQL moving information with php

I am wondering if it is possible to automate or by button press to move mysql table information from one table to another table deleting it from the first table and putting it in another table? Using php.
My MySQL table is big and the page that adds the information to that table has 70 queries on it, which slows the page refresh times. I need to move information from the first table to the second at a certain time of day every day so that those queries don't have to look through all of my giant 27k row table.
Is this possible?
Also if someone could help me with my comment on this page I would be grateful.
PHP doesn't have a constantly running server you can schedule background tasks with.
If you have access to the server you can set up a cron job (or scheduled task under windows) to run the PHP script for you.
Or (and this isn't so nice) you can put the script on the webserver and call it manually at the appropriate time by entering the URL in your browser.
A 27k row table is small by SQL standards, as long as it is properly indexed.
For instance, if you don't care about data from yesterday, you can add an indexed date column and filter with WHERE myDate > NOW() - INTERVAL 1 DAY, and SQL will automatically restrict the query to the rows younger than 24 hours.
I am wondering if it is possible to automate or by button press to move mysql table information from one table to another table deleting it from the first table and putting it in another table? Using php.
You can initiate it from PHP, but what you ask is effectively MySQL's domain.
It can be accomplished in two statements:
Use an INSERT INTO statement to copy the rows from the old table to the new one
Use a DELETE statement to remove those same rows from the old table
My preference would be that this occurs in a stored procedure for sake of a transaction and ease of execution (in case you want it initiated by CRON/etc) because it would be easier to call one thing vs a couple or more.
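For comparison, here is a sketch of the same two statements wrapped in a transaction from PHP rather than in a stored procedure, which is a different choice from the one preferred above. The table names entries and entries_archive, their columns, and the function name moveOldRows are all hypothetical, and in-memory SQLite stands in for MySQL so the example runs on its own.

```php
<?php
// Copies rows older than $cutoff into the archive table, then deletes
// them from the live table. Returns the number of rows moved.
// $cutoff should be a fixed point in the past so both statements
// select exactly the same rows.
function moveOldRows(PDO $db, string $cutoff): int
{
    $db->beginTransaction();
    try {
        $copy = $db->prepare(
            'INSERT INTO entries_archive (id, payload, created_at)
             SELECT id, payload, created_at FROM entries WHERE created_at < ?'
        );
        $copy->execute([$cutoff]);

        $delete = $db->prepare('DELETE FROM entries WHERE created_at < ?');
        $delete->execute([$cutoff]);
        $moved = $delete->rowCount();

        $db->commit();
        return $moved;
    } catch (Throwable $e) {
        $db->rollBack(); // rows are never lost or duplicated halfway
        throw $e;
    }
}

// Demo on in-memory SQLite (a stand-in for MySQL).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE entries (id INTEGER PRIMARY KEY, payload TEXT, created_at DATETIME NOT NULL)');
$db->exec('CREATE TABLE entries_archive (id INTEGER PRIMARY KEY, payload TEXT, created_at DATETIME NOT NULL)');
$db->exec("INSERT INTO entries (payload, created_at) VALUES
    ('a', '2020-08-23 10:00:00'),
    ('b', '2020-08-24 10:00:00'),
    ('c', '2020-08-25 10:00:00')");

$moved = moveOldRows($db, '2020-08-25 00:00:00');
echo "moved $moved rows\n";
```

A script like this is what a daily cron job would run; the set-based INSERT ... SELECT avoids moving rows one by one.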
27k rows is not a very big table and MySQL should work OK with that. Do you have all the required indexes? Did you use EXPLAIN on your slow queries?
As for the question about moving data from one table to another - create a php script that will be run by CRON and will move rows one by one. What's the problem here?

Locking a MySQL database so only one person at once can run a query?

I am having a few issues when people are trying to access a MySQL database and they are trying to update tables with the same information.
I have a webpage written using PHP. In this webpage is a query to check if certain data has been entered into the database. If the data hasn't, then I proceed to insert it. The trouble is that if two people try at the same time, the check might say the data has not been entered yet, but by the time the insert takes place it has been entered by the other person.
What is the best way to handle this scenario? Can I lock the database to process my queries first and then another person's?
Read up on database transactions. That's probably a better way to handle what you need than running LOCK TABLES.
Manually locking tables is the worst thing you could ever do. What happens if the code to unlock them never runs (because the PHP fails, or the user never clicks through to the next step, walks away from the PC, etc.)?
A common mistake devs make in a web app is to have a datagrid full of text boxes of data to edit, with a save button per row or on the whole table. Obviously if the person opens this on Friday and comes back Monday, the data could be stale and they could be saving over new data. One easy way to minimize this is to instead have EDIT buttons on each row; clicking the button loads an editing form, so they are hopefully loading fresh data and can only submit one row change at a time.
But even more importantly, you should include a datetime field as a hidden input, and when they try to submit the data, look at the date, decide how old is too old, and warn or deny the user accordingly.
You're looking for LOCK.
http://dev.mysql.com/doc/refman/5.0/en/lock-tables.html
This can be run as a simple query (mysql_query in old code, or MySQLi::query/PDO today, since the mysql_* functions were removed in PHP 7).
I'd say it's better to lock the specific tables that need to be locked rather than the whole database, as otherwise nothing will be able to be read. I think you're looking for something like:
-- A lock type (READ or WRITE) is required.
-- Do not issue START TRANSACTION here: it implicitly releases table locks.
LOCK TABLES items WRITE;
INSERT INTO items (name, label) VALUES ('foo', 'bar');
UNLOCK TABLES;
Or in PHP:
// $mysqli is an open mysqli connection
$mysqli->query('LOCK TABLES items WRITE');
$mysqli->query("INSERT INTO items (name, label) VALUES ('foo', 'bar')");
$mysqli->query('UNLOCK TABLES');
You could check if data has been changed before you edit something. In that case if someone has edited data while other person is doing his edit, he will be informed about it.
Kind of like stackoverflow handles commenting.
