Got a big problem that's confusing as hell. I'm on Laravel (3.2.5, now 3.2.7), using the Eloquent ORM to update a PostgreSQL database.
Here's what I'm doing:
I have a db full of data
I'm pulling info from an external API to update my db full of data
I run a script that loads the db data into arrays, does the same with the API data, and compares the two
I fill a data object with an array full of changes
I "save" it
nothing happens -.-
// Diff the refreshed data against the original, then mass-assign just the changes
$updateLinks = array_diff($dbLinkArray, $dbLinkArrayOriginal);
$dbLink->fill($updateLinks);
Log::info('1st LOG Original: '.$dbLinkArrayOriginal['link_text'].' New: '.$dbLinkArray['link_text']);
Log::info('2nd Log Dirty: '.implode(', ', $dbLink->get_dirty()));
$dbLink->save();
Log::info('3rd Log Supposed to be changed: '.implode(', ',array_keys($updateLinks)));
I employed some logging and the debug toolbar to figure out wtf happened. Here's the info:
All the SQL queries run with the correct information. When the query is run via phpPgAdmin, it updates as it should. The problem is that the query updates EVERY column in the row instead of just the changed ones. Using "update" instead of "fill/save" produces the same problem.
None of the table data ever actually gets updated.
The 1st log shows that the link_text values aren't equal. That's expected, since it means the link_text needs updating. However, seeing the same difference the next time I run the script is a clear indicator that nothing was updated. The log shows the same info every time, with just as many log events.
The 2nd log shows that the ENTIRE object is dirty rather than just what was supposed to be updated, which is why the SQL query touches every column.
The 3rd log spits out exactly what's supposed to be updated: 3-5 columns max, all in the correct format.
Any idea why, first of all, the database is not getting updated even though Laravel marks the SQL as being run and shows the correct query?
Also, any idea why the ENTIRE object is dirty and the query tries to update the entire object (23+ columns) instead of only the changes (3-5 columns)?
For your second question (why all columns update instead of just the dirty ones): the Laravel documentation states:
By default, all attribute key/value pairs will be stored during mass-assignment. However, it is possible to create a white-list of attributes that will be set. If the accessible attribute white-list is set then no attributes other than those specified will be set during mass-assignment.
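In other words, you can restrict fill() to the columns you actually intend to change. A minimal sketch, assuming a model named Link and hypothetical column names (in Laravel 3 the white-list is, as far as I recall, the static $accessible property):

class Link extends Eloquent {

    // Only these attributes may be set through fill() / mass-assignment.
    // Column names are just an example - replace them with your own.
    public static $accessible = array('link_text', 'link_url', 'link_order');

}

// Anything in $updateLinks that isn't white-listed is silently ignored.
$dbLink->fill($updateLinks);
$dbLink->save();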
Does this help you?
Kind regards,
Hendrik
Related
I've got a really horrible problem on my hands: for some reason data is being wiped from a particular table's column and I have no idea why.
The table is called events. It has a number of columns (name, date, etc.), but recently I added a field called 'schedule', which is of type TEXT.
On the web page you can edit the event in different tabs and create a schedule (fields added using jQuery clone, etc.). When the schedule data gets saved into the database, the HTML $_POST data gets converted into a JSON array using json_encode:
$db->prepared['schedule'] = json_encode(
array(
"scheduleday" => ($_POST['scheduleday'] ?? array()),
"scheduletime" => ($_POST['scheduletime'] ?? array()),
"scheduledescription" => ($_POST['scheduledescription'] ?? array()),
"schedulevenue" => ($_POST['schedulevenue'] ?? "")
)
);
The resulting json array could look like this, for example:
{"scheduleday":{"1":"2020-08-25","2":"2020-08-26"},"scheduletime":{"1":["19:30","20:00 - 20:50"],"2":["14:00 - 14:50","14:50 - 15:00","15:00 - 15:50","15:50 - 16:00","16:00 - 16:50"]},"scheduledescription":{"1":["Introduction","John Smith"],"2":["John Smith","Break","John Smith teaching","Break","John Smith teaching"]},"schedulevenue":"1 Acia Avenue"}
Straight after this is INSERTed or UPDATEd into the MySQL database, it is sent to a function that uses fpdf to turn the JSON array into a PDF which can be downloaded.
This works well, apart from one problem. Random (or what appears to be random) schedules are going missing. We can create a schedule one day and then, a few days or a week later, suddenly the PDF file will have been deleted and the schedule column in the table wiped. No other data in the table is getting deleted, nothing apart from the schedule.
Here's what I've tried to find out what's going on:
SQL Trigger
I've set a trigger on the event table which dumps data into a table called event_trigger AFTER an update is run. This is the code for the trigger:
BEGIN
INSERT INTO `event_trigger` (oldID,name,old_schedule,new_schedule) VALUES
(OLD.id,
OLD.name,
OLD.schedule,
NEW.schedule);
END
This has been helpful in the sense that I now know when an event's schedule goes missing, because the NEW.schedule will be empty. However, the last two times a schedule has gone missing (yesterday and today, in fact) it's been outside of office hours, at 18:45 and 19:22, so no one should have been performing any updates on events at those times.
.txt File Log
The other thing I've done is, on the page where the events are updated, I've put a text log which dumps the prepared variables, the SQL statement, the user and the user ID to a text file. Unfortunately this isn't working, because when a schedule goes missing it's not getting logged in there. All that tells me is that it's happening somewhere else.
I don't know how to narrow this down further. There is no user activity at the times when a schedule is wiped. The closest I've gotten is using the TRIGGER, but I am very limited as the information from the trigger is not enough; I can't get the IP, SQL statement, user ID or anything like that. Just the OLD and NEW values.
Can anyone help me think of ways to investigate? I'd be so grateful. This has been going on for over a month now, and it's infuriating because I just cannot see why it is happening.
The only extra option I can think of is turning on full SQL logs, but I am reluctant to do that as it will slow the server down immensely.
Have you examined your web server's logs? If this data corruption comes in via your web application, you should be able to see the exact time and originating IP address of the unfortunate event. You may also be able to figure out which page of your web app caused the problem.
To track things down more accurately in your database server, add, to your event_trigger table, these columns:
timestamp (when the trigger fired)
user (the MySQL host and user that issued the UPDATE that fired the trigger)
query (the text of the UPDATE query)
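Adding them could look like this (the types are only a suggestion; adjust to taste):

ALTER TABLE event_trigger
  ADD COLUMN ts TIMESTAMP NULL,
  ADD COLUMN user VARCHAR(255) NULL,
  ADD COLUMN query TEXT NULL;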
Then change your trigger to say
BEGIN
INSERT INTO event_trigger
(oldID,name,old_schedule,new_schedule, ts, user, query)
VALUES
(OLD.id,
OLD.name,
OLD.schedule,
NEW.schedule,
NOW(),
CURRENT_USER(),
(SELECT INFO FROM information_schema.PROCESSLIST WHERE id=CONNECTION_ID())
);
END
Then, your event_trigger table will show you the user (webapp@localhost or cybercreep@cracker.example.com) that issued the UPDATE, when it was issued, and what the exact query was.
(With the exact query you can search your code base to try to track down what function is running amok.)
Once you know what database user is issuing the query, you can consider suspending access for that particular user after business hours, and see who complains. But keep in mind that the database user is probably the generic username used by your web application, so it may not tell you much.
It's very likely that this is a legitimate web app user misusing, by mistake, your system.
Pro tip: Put automatic timestamp columns in all your application tables so you can keep track of changes. Add
ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
columns to your tables, and MySQL takes care of updating them.
Pro tip 2 (harder) add an action log table to your database, and make your web app insert a row to it each time a user takes an action. Then you can run queries like "who did what yesterday between 19:00 and 19:30?" Customer support people love to be able to do that.
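A minimal sketch of what such an action log might look like (table and column names are only a suggestion; your web app would INSERT a row into it on every user action):

CREATE TABLE action_log (
  id         INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  user_id    INT UNSIGNED NOT NULL,
  action     VARCHAR(64)  NOT NULL,   -- e.g. 'event.update'
  target_id  INT UNSIGNED NULL,       -- id of the row that was touched
  details    TEXT         NULL,       -- e.g. the JSON that was saved
  created_at TIMESTAMP    NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- "Who did what yesterday between 19:00 and 19:30?"
SELECT * FROM action_log
WHERE created_at BETWEEN '2020-08-25 19:00:00' AND '2020-08-25 19:30:00';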
Hey guys, I have researched and tested a few methods for logging user activity, such as when a user updates his profile details or when a user updates his status in a task.
What I require to log :
User ID from session
Table being updated
Field Name
Old Value
New Value
Timestamps
Method 1:
Run an additional query along with the insert/update/delete query to store details.
Method 2:
Using http://packalyst.com/packages/package/regulus/activity-log
In both the above methods I have to write extra code for each create/update/delete. I would like to know if there exists a better way to handle this problem.
You want to store revisions of the data being manipulated by the user.
This calls for Revisionable.
Revisionable works using trait implementation. Every action made by the user will have the old and new value of the column stored in a separate table. You can then query that table to get the changes made by the user.
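Roughly, usage looks like this (a sketch based on my reading of the VentureCraft README; check the package docs for the exact namespace and column names):

use Venturecraft\Revisionable\RevisionableTrait;

class Task extends Eloquent {
    // Every change to this model gets written to the revisions table.
    use RevisionableTrait;
}

// Later, inspect what changed:
foreach ($task->revisionHistory as $revision) {
    // key / old_value / new_value are columns on the revisions table
    echo $revision->key . ': ' . $revision->old_value . ' -> ' . $revision->new_value;
}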
Please note that the Revisionable version quoted above doesn't store INSERT actions.
A few days ago I created such a package which, unlike VentureCraft's, logs only static data (tables and values): no FKs, no model names, etc.
It also handles the revisions in a different manner, which makes it much easier to, for example, compare two given versions, since it doesn't log a single field change per row but all the data involved per row.
Check this out: Sofa/Revisionable
This is pretty young and will be improved.
It's also not Eloquent specific, but it works out of the box with Laravel 4. You simply download it, adjust config if needed, add a few lines of code to your models and it's ready to go.
I am writing information from an XML feed to a database for use on our site. We have found the xml feeds can be inconsistent, so writing info to the database has been a good solution for us.
Ideally I want to cron a file once a day that parses the XML and then writes it to the database. What methodology should I use to eliminate the data from the previous day, since I no longer need it once the cron runs and the new daily records are imported?
Bad:
cron file -> delete old records -> write new records
What if the XML is not quite right or there is a problem with the script? Then we've blown away the old data and can't get any new data at the moment.
If the XML info is bad, at least I can then write some PHP on the front end to still display the older data, but with the dates modified or something.
What type of checks and fail safes would be best for my application? I need to update the records each day but only delete the old records if I know for sure we have good new data to import.
I would suggest a backup in the form of a mysql dump. Essentially, the dump is a snapshot of a database at a given time. So if you start the process and something goes wrong, you can revert it back to the point it was at before you started. The workflow would be something along the lines of:
Create dump -> try {Delete old records -> Create new records } catch (Load dump back into database)
If you are using MySQL, more information on dumps can be found at: http://dev.mysql.com/doc/refman/5.1/en/mysqldump.html
Most other databases have some form of dump as well.
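A rough sketch of that workflow in PHP (the credentials and table name are placeholders, and import_new_feed() is a hypothetical function standing in for your delete/insert logic):

// 1. Snapshot the table we are about to rewrite.
exec('mysqldump -u dbuser -pdbpass mydb feed_items > /tmp/feed_items.sql', $output, $rc);
if ($rc !== 0) {
    die('Backup failed, aborting import.');
}

try {
    // 2. Delete the old records and insert the new ones.
    import_new_feed();
} catch (Exception $e) {
    // 3. Something went wrong: restore the snapshot.
    exec('mysql -u dbuser -pdbpass mydb < /tmp/feed_items.sql');
}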
Create a guid for your table by hashing a couple of the fields together - whichever ones are persistent between updates. For example, if you are updating inventory you might use the distributor and sku as the input for your guid.
Then when you update just use a mysql REPLACE query to exchange the old data for new data.
REPLACE
Or use an INSERT...on duplicate key update
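For example, with a hypothetical inventory table whose guid column has a UNIQUE index ($pdo is assumed to be an existing PDO connection):

// The guid stays the same between imports because distributor + sku don't change.
$guid = md5($item['distributor'] . '|' . $item['sku']);

$sql = "INSERT INTO inventory (guid, distributor, sku, price)
        VALUES (?, ?, ?, ?)
        ON DUPLICATE KEY UPDATE price = VALUES(price)";

$stmt = $pdo->prepare($sql);
$stmt->execute(array($guid, $item['distributor'], $item['sku'], $item['price']));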
The nice thing about this is if your script fails for some reason you can safely run it again without getting extra rows pushed into your table.
If you are worried about bad XML data being pushed into your db, just validate all your data before pushing it into your table, and anything that shouldn't go in simply gets skipped.
You might want to take an SQL backup at the beginning of the script - if your table somehow gets really messed up you can always go back and restore from a safe backup.
Basically, I am trying to create an interface that will tell an administrator "Hey, we ran this query, and we weren't so sure about it, so if it broke things click here to undo it".
The easiest way I can think to do this is to somehow figure out what tables and cells an identified "risky" query writes to, and store this data along with some bookkeeping data in a "backups" table, so that if necessary the fields can be repopulated with their original contents.
How do I go about figuring out which fields get overwritten by a particular (possibly complicated) mysql command?
Edit: "risky" in terms of completing successfully but doing unwanted things, not in terms of throwing an error or failing and leaving the system in an inconsistent state.
I suggest the following things:
- add an AFTER UPDATE trigger to every table you want to monitor
- create a copy of every table (example: [yourtable]_backup) you want to monitor
- in all AFTER UPDATE triggers, add code: INSERT INTO yourtable_backup VALUES(OLD.field1, OLD.field2..., OLD.fieldN)
How it works: the AFTER UPDATE trigger detects an update of the table, and backups the old values into the backup table
Important: both MyISAM and InnoDB tables support triggers in MySQL; it's transactions and foreign keys that require InnoDB.
You may add a timestamp field to the backup tables to know when each row was inserted.
Documentation: http://dev.mysql.com/doc/refman/5.5/en/create-trigger.html
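A minimal sketch of that setup for one hypothetical table yourtable(id, field1, field2):

-- 1. Backup table with the same fields, plus a timestamp (no primary key,
--    since the same id may be backed up many times).
CREATE TABLE yourtable_backup (
    id           INT NOT NULL,
    field1       VARCHAR(255),
    field2       VARCHAR(255),
    backed_up_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);

-- 2. Copy the old row values whenever a row is updated.
DELIMITER $$
CREATE TRIGGER yourtable_after_update AFTER UPDATE ON yourtable
FOR EACH ROW
BEGIN
    INSERT INTO yourtable_backup (id, field1, field2)
    VALUES (OLD.id, OLD.field1, OLD.field2);
END$$
DELIMITER ;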
I have a mysql database. What I'd like to do is perform an arbitrary action on it, and then figure out what changed. Something like this:
//assume connection to db already established
before();//saves db state
perform_action();//does stuff to db
diff();//prints what happened
I'd want it to output something like:
Row added in table_0 [details]
Row added in table_1 [details]
Row modified in table_5 [details]
Row deleted in table_2 [details]
Any ideas?
To further clarify: You know how on stackoverflow, if you check a post's edits, you can see red lines/green highlights indicating what's been changed? I want something like that, but for mysql databases.
Instead of copying your whole database in order to save the state for a later diff, you might be better off by using triggers:
http://dev.mysql.com/doc/refman/5.0/en/triggers.html
When you setup appropriate triggers, you can log changes to a table - for example, you can setup a trigger that automatically logs the old values and the new values for every update. To see the changes, query the table that was filled by the trigger.
Of course, the trigger is not restricted to changes made by your application, it will also log updates done by other applications. But this is also the case if you diff the old version of the database with the new version of the database.
I think normally your application would log any interesting changes as it makes them. Or you would set up history tables for everything with datetimes.
To do it the way you describe, you could dump the contents of the database into a file before and after your action and do a diff on the two files. In PHP, you can check out xdiff: http://us.php.net/manual/en/book.xdiff.php
If this is something you're doing only occasionally in controlled circumstances to test some queries you're not sure about, you can dump and diff on the command line.
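A rough sketch of that in PHP, assuming the mysqldump binary is on the PATH, placeholder credentials, and the xdiff PECL extension installed (--skip-extended-insert keeps each row on its own line so the diff stays readable):

// Dump the database before and after the action, then diff the two dumps.
exec('mysqldump --skip-extended-insert -u dbuser -pdbpass mydb', $before);
perform_action(); // does stuff to db
exec('mysqldump --skip-extended-insert -u dbuser -pdbpass mydb', $after);

echo xdiff_string_diff(implode("\n", $before), implode("\n", $after));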
One way is to parse the log files, which will give you the exact SQL statements executed in your database. I'm not exactly sure how to separate the SQL statements made by your application from those of other applications (if that's the case).
The only thing I can think of is to do some combination of a few somewhat hacky things:
Save a [temporary?] table of row IDs, to check for new rows. If you need to know what was in deleted or modified rows before, you'll need to copy the whole DB, which would be rather messy.
Have each row have a datestamp that gets modified on update; grab rows for whom the updated datestamp is newer than when the analysis started.
Have a layer between your application and the database (if you have something like the classic $db->query(), this is easy) and log the queries sent, which can then be looked at - see the sketch below.
I suppose the real question is whether you want to know what queries are being executed against the DB, or what the queries you're running are actually doing.
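For the query-logging layer, a minimal sketch (the class name is made up; it assumes an existing mysqli connection):

class LoggingDb
{
    private $conn;

    public function __construct(mysqli $conn)
    {
        $this->conn = $conn;
    }

    public function query($sql)
    {
        // Record every statement with a timestamp before running it.
        file_put_contents('/tmp/query.log', date('c') . ' ' . $sql . "\n", FILE_APPEND);
        return $this->conn->query($sql);
    }
}

// Usage: swap your existing handle for the wrapper.
$db = new LoggingDb($mysqli);
$db->query("UPDATE table_5 SET col = 'x' WHERE id = 1");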