Drupal 7 database API very slow compared to PHP mysqli_connect() - php

I am trying to loop over some data coming to me from a SOAP request and insert the records into a custom table in my Drupal install.
At first I created a custom module and used standard mysqli_connect() syntax to connect to the database and loop through the records and insert them. This was working great and fetched and inserted my remote data in about 2 seconds without a hitch.
I then remembered that Drupal has a database API (I am fairly new to Drupal), so I decided to do it right and use the API instead. I converted my code to what I believe is the correct approach per the API docs, but now the process takes more like 5 or 6 seconds, sometimes randomly hangs and doesn't complete at all, and I get weird session errors. The records end up inserting fine, it just takes forever.
I'm wondering if I am doing it wrong. I would also like to wrap the inserts in a transaction, because I will first be deleting ALL of the records in the destination table and then inserting the new data, and since I am deleting first, I want to be able to roll back if the inserts fail for whatever reason.
I did not add transaction code to my original PHP-only code, but I did attempt it with the Drupal API, although completely removing the transaction/try/catch code doesn't seem to affect the speed or the issues at all.
Anyway here is my original code:
$data = simplexml_load_string($jobsXml);
$connection = mysqli_connect("localhost","user","pass","database");
if (mysqli_connect_errno($connection))
{
echo "Failed to connect to MySQL: " . mysqli_connect_error();
exit();
}
// delete * current jobs
mysqli_query($connection,'TRUNCATE TABLE jobs;');
$recordsInserted = 0;
foreach ($data->NewDataSet->Table as $item) {
//escape and cleanup some fields
$image = str_replace('http://www.example.com/public/images/job_headers/', '', $item->job_image_file);
$specialty_description = mysqli_real_escape_string($connection, $item->specialty_description);
$job_board_title = mysqli_real_escape_string($connection, $item->job_board_title);
$job_board_subtitle = mysqli_real_escape_string($connection, $item->job_board_subtitle);
$job_state_code = ($item->job_country_code == 'NZ') ? 'NZ' : $item->job_state_code;
$sql = "
INSERT INTO jobs (
job_number,
specialty,
specialty_description,
division_code,
job_type,
job_type_description,
job_state_code,
job_country_code,
job_location_display,
job_board_type,
job_image_file,
job_board_title,
job_board_subtitle
) VALUES (
$item->job_number,
'$item->specialty',
'$specialty_description',
'$item->division_code',
'$item->job_type',
'$item->job_type_description',
'$job_state_code',
'$item->job_country_code',
'$item->job_location_display',
'$item->job_board_type',
'$image',
'$job_board_title',
'$job_board_subtitle'
)
";
if (!mysqli_query($connection,$sql))
{
die('Error: ' . mysqli_error($connection) . $sql);
}
$recordsInserted++;
}
mysqli_close($connection);
echo $recordsInserted . ' records inserted';
And this is my Drupal code. Can anyone tell me if I am doing this wrong, or not doing it in the most efficient way?
$data = simplexml_load_string($jobsXml);
// The transaction opens here.
$txn = db_transaction();
// delete all current jobs
$records_deleted = db_delete('jobs')
->execute();
$records_inserted = 0;
try {
$records = array();
foreach ($data->NewDataSet->Table as $item) {
$records[] = array(
'job_number' => $item->job_number,
'specialty' => $item->specialty,
'specialty_description' => $item->specialty_description,
'division_code' => $item->division_code,
'job_type' => $item->job_type,
'job_type_description' => $item->job_type_description,
'job_state_code' => ($item->job_country_code == 'NZ') ? 'NZ' : $item->job_state_code,
'job_country_code' => $item->job_country_code,
'job_location_display' => $item->job_location_display,
'job_board_type' => $item->job_board_type,
'job_image_file' => str_replace('http://www.example.com/public/images/job_headers/', '', $item->job_image_file),
'job_board_title' => $item->job_board_title,
'job_board_subtitle' => $item->job_board_subtitle,
);
$records_inserted++;
}
$fields = array(
'job_number',
'specialty',
'specialty_description',
'division_code',
'job_type',
'job_type_description',
'job_state_code',
'job_country_code',
'job_location_display',
'job_board_type',
'job_image_file',
'job_board_title',
'job_board_subtitle'
);
$query = db_insert('jobs')
->fields($fields);
foreach ($records as $record) {
$query->values($record);
}
$query->execute();
} catch (Exception $e) {
// Something went wrong somewhere, so roll back now.
$txn->rollback();
// Log the exception to watchdog.
watchdog_exception('Job Import', $e);
echo $e;
}
echo $records_deleted . ' records deleted<br>';
echo $records_inserted . ' records inserted';

How big is the dataset you are trying to insert? If the dataset is very large then perhaps you are running into query size issues. Try looping over the records and inserting them one by one like you did with plain PHP.
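A minimal sketch of that per-row approach with the Drupal 7 API, reusing the transaction from the question (the (string) casts are an addition here, so the query builder receives plain scalar values instead of SimpleXMLElement objects):
$data = simplexml_load_string($jobsXml);
$txn = db_transaction();
try {
  // Delete all current jobs inside the transaction so a failed import can roll back.
  db_delete('jobs')->execute();
  foreach ($data->NewDataSet->Table as $item) {
    db_insert('jobs')
      ->fields(array(
        'job_number' => (string) $item->job_number,
        'specialty' => (string) $item->specialty,
        'specialty_description' => (string) $item->specialty_description,
        'division_code' => (string) $item->division_code,
        'job_type' => (string) $item->job_type,
        'job_type_description' => (string) $item->job_type_description,
        'job_state_code' => ($item->job_country_code == 'NZ') ? 'NZ' : (string) $item->job_state_code,
        'job_country_code' => (string) $item->job_country_code,
        'job_location_display' => (string) $item->job_location_display,
        'job_board_type' => (string) $item->job_board_type,
        'job_image_file' => str_replace('http://www.example.com/public/images/job_headers/', '', (string) $item->job_image_file),
        'job_board_title' => (string) $item->job_board_title,
        'job_board_subtitle' => (string) $item->job_board_subtitle,
      ))
      ->execute();
  }
}
catch (Exception $e) {
  // Roll back the delete and any partial inserts, then log the exception.
  $txn->rollback();
  watchdog_exception('job_import', $e);
}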

Related

Looking for a more efficient way to store CSV data on my database

I'm provided with a .txt file every day which contains semicolon-separated data. Users of my app are meant to upload this file to the database daily.
Currently, I'm reading and storing the information as such:
$array = array();
$csv = str_getcsv($request->file, "\n");
foreach ($csv as &$row) {
$row = str_getcsv($row, ";");
$array[] = $row;
}
array_splice($array, 0, 1);
foreach ($array as &$row) {
$query = Table::firstOrNew(['col2' => $row[1], 'col3' => $row[2]]);
$query->col1 = $row[0];
$query->col2 = $row[1];
$query->col3 = $row[2];
$query->col4 = $row[3];
$query->col5 = $row[4];
// [...]
$query->col72 = $row[71];
$query->col73 = $row[72];
$query->save();
}
The thing with this method is that it takes too long to run successfully (the volume of data is about 5000 entries a day, which takes ~2 minutes to complete with the above code). As you can see, the number of columns is huge, the data has to be read as if dealing with a .CSV file, and I cannot discard any of the columns.
Not to mention this problem increases in great magnitude if, for some reason, a user (or more) has to upload multiple days or even a month worth of data at once.
I need to figure out a better way to handle this situation. I've searched around for a solution but the best I could find was that I should use a for loop instead of foreach, which didn't really solve the issue.
You are checking, for each row, whether it exists, updating it if so and inserting it if not, right?
If so, you can't optimize this code to run much faster unless you have a unique column for each row and run raw queries with ON DUPLICATE KEY UPDATE; see this:
Insert into a MySQL table or update if exists
A second solution is to delete all the old records that belong to that file (or user, or some unique key that can't be uploaded twice) and then insert the new rows in chunks with the insert method;
it will be much faster. Example:
DB::beginTransaction();
try {
Table::where('unique_file_rows_identified_column', $something)->delete();
foreach(array_chunk($array, 1000) as $rows) {
$rows_to_insert = [];
foreach($rows as $row){
$rows_to_insert[] = [
'col1' => $row[0],
'col2' => $row[1],
'col3' => $row[2],
'col4' => $row[3],
'col5' => $row[4],
// [...]
'col72' => $row[71],
'col73' => $row[72],
];
}
Table::insert($rows_to_insert);
}
} catch (\Exception $e){ // If something went wrong and exception is thrown deleted rows will be restored
DB::rollBack();
dd($e->getMessage());
}
DB::commit();
This will run only 5 insert queries if the file contains 5000 rows, and the data will be inserted much faster.
Would it be an option to let the database do the work for you?
LOAD DATA INFILE '/tmp/user_data.csv' INTO TABLE test FIELDS TERMINATED BY ';';
https://dev.mysql.com/doc/refman/8.0/en/load-data.html
You need to be sure that the CSV is valid of course.
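If you want to trigger that from the Laravel side, a rough sketch (assuming the uploaded file has been saved to /tmp/user_data.csv, that the MySQL user has the FILE privilege, and that test is replaced by your real table name; switch to LOAD DATA LOCAL INFILE with local_infile enabled if the file lives on the web server rather than the DB server):
// IGNORE 1 LINES skips the header row, like the array_splice() call in the question.
DB::unprepared("
    LOAD DATA INFILE '/tmp/user_data.csv'
    INTO TABLE test
    FIELDS TERMINATED BY ';'
    LINES TERMINATED BY '\\n'
    IGNORE 1 LINES
");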

Multiple tabs, Multiple while

I'm running multiple PHP scripts that contain a while loop. The loop inserts into and reads from a MySQL database.
It is a long-running process, so it takes up to 2 hours.
What I need to do is open the script in multiple tabs in the same browser.
When I do this and open the script in multiple tabs, I can't open more than 6 tabs. Any tab beyond the 6th just keeps loading and shows nothing.
When going to another browser it works, but when I reach 6 tabs it does the same thing.
Code :
<?php
ini_set('memory_limit', -1);
ob_implicit_flush(TRUE);
set_time_limit(0);
$sqlselect = "SELECT * FROM old_Users WHERE age < 18";
$content2 = mysqli_query($conn, $sqlselect);
while ($row = mysqli_fetch_assoc($content2)) {
    $sql = "INSERT INTO New_Table_Users (first_name, last_name, ID) VALUES ('" . $row["firstname"] . "','" . $row["lastname"] . "','" . $row["idd"] . "')";
    mysqli_query($conn, $sql);
}
?>
The problem is not the RAM or CPU, because whenever I open a new browser it works fine, but when I try to open the 7th tab it just keeps loading...
So to open 12 tabs I would need 2 browsers, each with 6 tabs open...
Any help would be really appreciated.
Dagon's solution is the best, but in case you need to process things in PHP and still be able to insert at a fast pace:
Use PDO (sorry, I don't like mysqli, nor while loops) to do it faster than you breathe. This will insert all the data with very few queries (batches). It could even be done with a single insert for all of it.
WARNING: this technique is fast, but know your limits. It needs RAM, or you have to lower the number of simultaneous inserts.
Depending on the size of what you are inserting, limit the number of rows per insert according to your RAM capacity. With 3 params, as you have (very, very few), batches of 10000 sound reasonable. Try various sizes to see how your database and server handle it.
ini_set('memory_limit', -1);
set_time_limit(0);
$table = 'New_Table_Users'; // inserted table name
$nb_max_insert = 10000; // number of maximum simultaneous insert
$age=18;// param age
$stmt = $conn->prepare("SELECT * FROM old_Users Where age < ?");
$stmt->bindParam(1, $age, PDO::PARAM_INT); // prepare binder
try {
$stmt->execute();
$result = $stmt->fetchAll(PDO::FETCH_ASSOC);
} catch (PDOException $e) {
var_dump('error main');
}
if (count($result) !== 0) {
    $data = array(); // extract only the needed columns (yes, you are using * in your query... mheeeee)
    foreach ($result as $key => $el) {
        $row['first_name'] = $el['firstname']; // source column names as in old_Users
        $row['last_name'] = $el['lastname'];
        $row['ID'] = $el['idd'];
        array_push($data, $row);
    }
    $batches = array_chunk($data, $nb_max_insert); // split data into batches
    foreach ($batches as $key => $batch) {
        $question_marks = array(); // reset per batch: one "(?,?,?)" group per row
        $insert_values = array();  // reset per batch: flat list of bound values
        foreach ($batch as $d) {
            $question_marks[] = '(' . implode(',', array_fill(0, count($d), '?')) . ')'; // placeholder group for PDO
            $insert_values = array_merge($insert_values, array_values($d)); // what to insert
        }
        $sql = "INSERT INTO $table (" . implode(",", array_keys($row)) . ") VALUES " . implode(',', $question_marks); // concat the query
        $stmt = $conn->prepare($sql);
        try {
            $stmt->execute($insert_values);
        } catch (PDOException $e) {
            var_dump('error batch');
        }
    }
}
Note: I am using this to insert millions of rows into huge tables, across PHP 7 pthreads (12 CPUs x 20 cores), reaching the server's limit of 1024 async connections, with 3x 12 GB of RAM and a RAID x4 of 1 TB SSDs. So I guess it should work for you too....

Codeigniter's insert_batch() with thousands of inserts has missing records

I’m using insert_batch() to mass insert 10,000+ rows into a table in a database. I’m running some tests and I have noticed that sometimes all of the 10,000+ rows get inserted correctly, but on some occasions I end up missing 100+ rows in my table's total count.
The field data in the records is fine, as I'm using the same data for each of my tests, and most of the time I have no problem. For example, I tried 20 times to insert the same data into my database; 19 times all rows were inserted correctly, but the one other time I was missing 100 or maybe more rows.
The function for the insert_batch() follows:
protected function save_sms_to_database() {
//insert_Batch
$datestring = "%Y-%m-%d %h:%m:%s";
$time = time();
$datetime = mdate($datestring, $time);
$this->date_sent = $datetime;
foreach ($this->destinations as $k => $v) {
$sms_data[$k] = array(
'campaign_id' => $this->campaign_id,
'sender_id' => $this->from,
'destination' => $v,
'token' => md5(time() . 'smstoken' . rand(1, 99999999999)),
'message' => $this->body,
'unicode' => $this->unicode,
'long' => $this->longsms,
'credit_cost' => $this->eachMsgCreditCost,
'date_sent' => $this->date_sent,
'deleted' => 0,
'status' => 1,
'scheduled' => $this->scheduled,
);
}
$this->ci->db->insert_batch('outgoingSMS', $sms_data);
if ($this->ci->db->affected_rows() > 0) {
// outgoingSMS data were successfully inserted
return TRUE;
} else {
log_message('error', $this->campaign_id.' :: Could not insert sms into database');
log_message('error', $this->ci->db->_error_message());
return FALSE; // sms was not inserted correctly
}
}
How can I debug insert_batch() for such an occasion?
I have made some changes to DB_active_rec.php to do some logging during insert_batch(), and so far I can't reproduce the problem to see what is going wrong. But since the problem appeared 2-3 times early on, and I did not make any major changes to my logic that would have fixed it, I can't leave it like this, as I don't trust CodeIgniter's insert_batch() function for production.
I'm also adding codeigniter's insert_batch() function:
public function insert_batch($table = '', $set = NULL)
{
$countz = 0;
if ( ! is_null($set))
{
$this->set_insert_batch($set);
}
if (count($this->ar_set) == 0)
{
if ($this->db_debug)
{
//No valid data array. Folds in cases where keys and values did not match up
return $this->display_error('db_must_use_set');
}
return FALSE;
}
if ($table == '')
{
if ( ! isset($this->ar_from[0]))
{
if ($this->db_debug)
{
return $this->display_error('db_must_set_table');
}
return FALSE;
}
$table = $this->ar_from[0];
}
// Batch this baby
for ($i = 0, $total = count($this->ar_set); $i < $total; $i = $i + 100)
{
$sql = $this->_insert_batch($this->_protect_identifiers($table, TRUE, NULL, FALSE), $this->ar_keys, array_slice($this->ar_set, $i, 100));
//echo $sql;
$this->query($sql);
$countz = $countz + $this->affected_rows();
}
$this->_reset_write();
log_message('info', "Total inserts from batch:".$countz);
return TRUE;
}
The last log_message() with the total inserts from the batch also shows the problem: when I end up with fewer inserts than expected, the unexpected number of inserts shows up there as well.
I will have to think of something else for inserting thousands of rows into my database, with or without CodeIgniter.
Does anyone have any clue about this kind of problem? Maybe it has something to do with the hard drive or the memory of the system lacking performance? It's an old PC with 1 GB of RAM.
EDIT: As requested I'm putting here an example INSERT statement with 9 rows that is being produced by codeigniter's insert_batch() function
INSERT INTO `outgoingSMS` (`campaign_id`, `credit_cost`, `date_sent`, `deleted`, `destination`, `long`, `message`, `scheduled`, `sender_id`, `status`, `token`, `unicode`) VALUES ('279',1,'2013-08-02 02:08:34',0,'14141415151515',0,'fd',0,'sotos',1,'4d270f6cc2fb32fb47f81e8e15412a36',0), ('279',1,'2013-08-02 02:08:34',0,'30697000000140',0,'fd',0,'sotos',1,'9d5a0572f5bb2807e33571c3cbf8bd09',0), ('279',1,'2013-08-02 02:08:34',0,'30697000000142',0,'fd',0,'sotos',1,'ab99174d88f7d19850fde010a1518854',0), ('279',1,'2013-08-02 02:08:34',0,'30697000000147',0,'fd',0,'sotos',1,'95c48b96397b21ddbe17ad8ed026221e',0), ('279',1,'2013-08-02 02:08:34',0,'306972233469',0,'fd',0,'sotos',1,'6c55bc3181be50d8a99f0ddba1e783bf',0), ('279',1,'2013-08-02 02:08:34',0,'306972233470',0,'fd',0,'sotos',1,'d9cae1cbe7eaecb9c0726dce5f872e1c',0), ('279',1,'2013-08-02 02:08:34',0,'306972233474',0,'fd',0,'sotos',1,'579c34fa7778ac2e329afe894339a43d',0), ('279',1,'2013-08-02 02:08:34',0,'306972233475',0,'fd',0,'sotos',1,'77d68c23422bb11558cf6fa9718b73d2',0), ('279',1,'2013-08-02 02:08:34',0,'30697444333',0,'fd',0,'sotos',1,'a7fd63b8b053b04bc9f83dcd4cf1df55',0)
That was a completed insert.
insert_batch() tries to avoid exactly your problem - trying to insert more data than MySQL is configured to process at a time. I'm not sure if MySQL's option for that is max_allowed_packet or something else, but the problem with it is that it sets a limit in bytes and not in number of rows.
If you'll be editing DB_active_rec.php, mysql_driver.php or whichever file is appropriate ... try changing that 100 count in the for() loop. 50 should be a safer choice.
Other than that, FYI - affected_rows() won't return the correct value if you're inserting more than 100 rows via insert_batch(), so it's not reliable to use it as a success/error check. That's because insert_batch() inserts your data by 100 records at a time, while affected_rows() would only return data for the last query.
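In the meantime, if you need a reliable success check, one rough workaround (a sketch built on the question's code, not anything insert_batch() itself guarantees) is to chunk the data yourself below that 100-row limit, so each insert_batch() call issues a single query and affected_rows() can be verified per chunk:
$expected = count($sms_data);
$inserted = 0;
foreach (array_chunk($sms_data, 50) as $chunk) {
    // 50 rows stay under insert_batch()'s internal 100-row split, so this is one INSERT.
    $this->ci->db->insert_batch('outgoingSMS', $chunk);
    $inserted += $this->ci->db->affected_rows();
}
if ($inserted !== $expected) {
    log_message('error', $this->campaign_id . " :: expected {$expected} rows, inserted {$inserted}");
    return FALSE;
}
return TRUE;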
Another way to address this is to go into the /system/database/ directory, open the file DB_query_builder.php, and look at the insert_batch() signature:
public function insert_batch($table, $set = NULL, $escape = NULL, $batch_size = 1000)
You can set the batch size to whatever your requirements call for.
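For example, with a query builder that exposes that fourth parameter (as in the signature above), the call from the question could become something like:
$this->ci->db->insert_batch('outgoingSMS', $sms_data, NULL, 50); // smaller batches per generated INSERT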

Efficient php code to insert array data into mysql table?

So I have a flatfile db in the format of
username:$SHA$1010101010101010$010110010101010010101010100101010101001010:255.255.255.255:1342078265214
Each record is on a new line... about 5000+ lines. I want to import it into a MySQL table. Normally I'd do this using phpMyAdmin and "file import", but now I want to automate the process by using PHP to download the db via FTP, then clean up the existing table data and upload the updated db.
id (AUTO_INCREMENT) | username | password | ip | lastlogin
The script I've got below works for the most part, although PHP will generate an error:
"PHP Fatal error: Maximum execution time of 30 seconds exceeded". I believe I could just increase this time, but on a remote server I doubt I'll be allowed to, so I need to find a better way of doing this.
Only about 1000 records get inserted into the database before that timeout...
The code I'm using is below. I will say right now that I'm not a pro in PHP; this was mainly gathered up and cobbled together. I'm looking for some help to make it more efficient, as I've heard that doing inserts like this is just bad. It really sounds bad as well: there's a lot of disk scratching when I run this script on my local PC. I mean, why does it want to kill the HDD for such a seemingly simple task?
<?php
require ('Connections/local.php');
$wx = array_map('trim',file("auths.db"));
$username = array();
$password = array();
$ip = array();
$lastlogin = array();
foreach($wx as $i => $line) {
$tmp = array_filter(explode(':',$line));
$username[$i] = $tmp[0];
$password[$i] = $tmp[1];
$ip[$i] = $tmp[2];
$lastlogin[$i] = $tmp[3];
mysql_query("INSERT INTO authdb (username,password,ip,lastlogin) VALUES('$username[$i]', '$password[$i]', '$ip[$i]', '$lastlogin[$i]') ") or die(mysql_error());
}
?>
Try this, with bound parameters and PDO.
<?php
require ('Connections/local.php');
$wx = array_map('trim',file("auths.db"));
$username = array();
$password = array();
$ip = array();
$lastlogin = array();
try {
$dbh = new PDO("mysql:host=$ip;dbname=$database", $dbUsername, $dbPassword);
$dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
} catch(PDOException $e) {
echo 'ERROR: ' . $e->getMessage();
}
$mysql_query = "INSERT INTO authdb (username,password,ip,lastlogin) VALUES(:username, :password, :ip, :lastlogin)";
$statement = $dbh->prepare($mysql_query);
foreach($wx as $i => $line) {
set_time_limit(0);
$tmp = array_filter(explode(':',$line));
$username[$i] = $tmp[0];
$password[$i] = $tmp[1];
$ip[$i] = $tmp[2];
$lastlogin[$i] = $tmp[3];
$params = array(":username" => $username[$i],
":password" => $password[$i],
":ip" => $ip[$i],
":lastlogin" => $lastlogin[$i]);
$statement->execute($params);
}
?>
Instead of sending queries to the server one by one, in the form
insert into table (x,y,z) values (1,2,3)
You should use extended insert syntax, as in:
insert into table (x,y,z) values (1,2,3),(4,5,6),(7,8,9),...
This will increase insert performance by miles. However, you need to be careful about how many rows you insert in one statement, since there is a limit to how large a single SQL statement can be. So I'd say start with packs of 100 rows and see how it goes, then adjust the pack size accordingly. Chances are your insert time will go down to around 5 seconds, putting it way under the max_execution_time limit.
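A rough sketch of that approach for the auths.db file above, using PDO placeholders (assuming the same $dbh connection from the previous answer; the pack size of 100 is just a starting point):
$wx = array_map('trim', file("auths.db"));
$packSize = 100; // rows per extended INSERT; tune against max_allowed_packet
foreach (array_chunk($wx, $packSize) as $pack) {
    $placeholders = array();
    $values = array();
    foreach ($pack as $line) {
        $tmp = explode(':', $line);
        if (count($tmp) < 4) {
            continue; // skip empty or malformed lines
        }
        $placeholders[] = '(?, ?, ?, ?)';
        $values[] = $tmp[0]; // username
        $values[] = $tmp[1]; // password
        $values[] = $tmp[2]; // ip
        $values[] = $tmp[3]; // lastlogin
    }
    if (!$placeholders) {
        continue;
    }
    $sql = 'INSERT INTO authdb (username, password, ip, lastlogin) VALUES '
         . implode(',', $placeholders);
    $dbh->prepare($sql)->execute($values);
}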

Migrating databases using phpMyAdmin's tracking mechanism

In a development database, I have phpMyAdmin Tracking enabled on all tables. It logs all the changes I make to the tables' structures (in this case I'm not interested in data tracking.) So far so good.
What I want to do then is to take out a report, for ALL tracked tables, with the changes made from a specific version (or a date would even work,) so that I can run the resulting SQL on my production database, when upgrading to new versions, and make sure that the databases are identical, without the worry of the errors that come with manual handling of this.
However, there is no function that I can find that generates such a report. All the tracking reports are for individual tables, and if I have to click through all the tables (20+), that takes away the benefit of this function. Not all tables change, but I don't want to keep track of what's changed myself; that's what I want phpMyAdmin to do for me.
I have tried to make my own query against the pma_tracking table where the changes are stored, and had partial success. The problem is that all changes for one version are stored as one BLOB, and with each new version a DROP TABLE / CREATE TABLE statement is made. I can't drop tables on the production db since there is data there (I'm not recreating the database every time, only adding incremental changes); I just want to upgrade the structure, and the only time I want CREATE TABLE statements is when I actually create a new table in the database. So I thought I could filter those out with SQL, but the changes are stored as a BLOB, so I would have to parse and mess with the blob text, which seems overly complicated.
So, as a summary, this is what I'm looking for:
An automated tracking system/workflow that logs all structure updates, and can create incremental SQL reports for the whole database from a version or point in time.
I'd prefer to not use any additional third party apps (I'd like to use phpMyAdmin or MySQL only), if possible
Also, I would love comments on the workflow, if someone has ideas of a better one. Any help appreciated.
The algorithm for parsing the BLOB field of the "pma_tracking" table is located in the getTrackedData method of the PMA_Tracker class, in the libraries/Tracker.class.php source file.
Starting from that code, I've written a simple PHP script to extract all the data definition statements (except the "DROP TABLE" statements) from the "pma_tracking" table.
For example, suppose that you want to get the list of all the changes of all the tables of the "test" database since version "1":
<?php
$link = mysqli_init();
// Adjust hostname, username, password and db name before use!
$db = mysqli_real_connect($link, "localhost", "myuser", "mypass", "phpmyadmin")
or die(mysqli_connect_error());
// Adjust also target db name and tracking version
$db_name = "test";
$version = "1";
$sql = "SELECT schema_sql FROM pma_tracking
WHERE db_name='{$db_name}' AND version>='{$version}'
ORDER BY version,date_created";
$result = mysqli_query($link, $sql) or die(mysqli_error($link));
while ($myrow = mysqli_fetch_assoc($result)) {
$log_schema_entries = explode('# log ', $myrow['schema_sql']);
foreach ($log_schema_entries as $log_entry) {
if (trim($log_entry) != '') {
$statement = trim(strstr($log_entry, "\n"));
if (substr($statement, 0, 11) != "DROP TABLE ") {
echo "{$statement}\n";
}
}
}
}
?>
By redirecting the script output to a file, you'll obtain a SQL command file with (almost) all the statements needed to replicate the schema changes on the target (e.g. production) database; this file must be executed by specifying the "-f" (force) MySQL option:
-f, --force Continue even if we get an SQL error.
By doing so, MySQL will ignore all the "Table already exists" errors that get thrown each time a CREATE TABLE statement for an existing table is encountered, thus creating only the tables that don't yet exist in the target database.
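For example (the extraction script's filename here is just illustrative): run php pma_extract_changes.php > schema_changes.sql to produce the file, then apply it with mysql -f -u someuser -p targetdb < schema_changes.sql.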
This kind of approach obviously has some drawbacks:
ALL the DROP TABLE commands will be ignored (not only those automatically inserted by phpMyAdmin), so if you have deleted a table in the source database, that table won't be deleted in the target database.
ALL the script errors will be ignored, so it may not be 100% reliable.
A final word of advice: always do a full backup of your target database before proceeding!
I don't know how you could solve this problem using phpMyAdmin, but there are other tools that might help you achieve the effect you're looking for. Liquibase is one of them. I've used it a few times in the past and it was pretty good. It takes a little while to get the hang of it, but I think it might help you.
I'm not too familiar with SQL tools, so I cannot recommend anything to help you out there, but I can try and help with a custom workflow...
Create a table called structure_log
Create a PHP script called print_structure.php that prints whatever info you desire to a file on the server, saves the file under a timestamp (this will be your version number), and records the name in the structure_log table
Create a crontab that runs print_structure.php however often you desire
Create a PHP script called delete_dups.php that grabs the last two records from your structure_log table, compares those two files, and if they are the same (representing no change to structures), deletes the one with the latest timestamp (filename) and removes that record from the structure_log table
Create a crontab that runs delete_dups.php half as often as the one that runs print_structure.php
This will build up a versioning folder on your server. You can manually run the print_structure.php script whenever you want and compare the output against the latest version log in your server folder, to see whether the database you just ran it on is the same as it was the last time the version check ran.
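A minimal sketch of what such a print_structure.php might look like (the structure_log table with a single filename column and the /path/to/versions/ output directory are assumptions based on the workflow above):
<?php
// Dump SHOW CREATE TABLE for every table into a timestamped file and log its name.
$pdo = new PDO('mysql:host=localhost;dbname=mydb', 'user', 'pass'); // adjust credentials
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$dump = '';
foreach ($pdo->query('SHOW TABLES')->fetchAll(PDO::FETCH_COLUMN) as $table) {
    $create = $pdo->query('SHOW CREATE TABLE `' . $table . '`')->fetch(PDO::FETCH_NUM);
    $dump .= $create[1] . ";\n\n";
}
$filename = time() . '.sql'; // the timestamp doubles as the version number
file_put_contents('/path/to/versions/' . $filename, $dump);
$stmt = $pdo->prepare('INSERT INTO structure_log (filename) VALUES (?)');
$stmt->execute(array($filename));
?>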
I've had some success with MySQL Workbench:
Import (reverse engineer) your dev database into workbench. You can do this by either exporting your schema to an SQL file and loading it into workbench, or workbench will get the schema directly from the server.
Next, generate your diff file with the "Synchronise model" option. You select the production database, then which tables to sync, and workbench generates an SQL file you can run to sync both models.
A word of caution: the first time, there will likely be quite a few apparently unneeded changes while the DB is updated to Workbench "style". For subsequent updates, the tool is rather reliable, though I would never let an automated tool have free range over my production DB ;-)
Always check the SQL file for errors; in some cases, dropping a column and then adding another of the same name but a different type will generate an ALTER on the column which will fail.
I don't have anything that creates an incremental diff between two databases but here's the script I use to compare two MySQL databases:
<?php
//------------------------------------------------------------------------------
// Define the variables we'll be using.
//------------------------------------------------------------------------------
$db1_con = NULL;
$db1_constraints = array();
$db1_dbname = 'db1';
$db1_host = 'localhost';
$db1_password = 'password1';
$db1_tables = array();
$db1_username = 'username1';
$db2_con = NULL;
$db2_constraints = array();
$db2_dbname = 'db2';
$db2_host = '123.123.123.123';
$db2_password = 'password2';
$db2_tables = array();
$db2_username = 'username2';
//------------------------------------------------------------------------------
// Connect to the databases.
//------------------------------------------------------------------------------
try{
$db1_con = new PDO("mysql:host=$db1_host;dbname=information_schema", $db1_username, $db1_password);
$db1_con->setAttribute(PDO::ATTR_EMULATE_PREPARES, FALSE); // Try to use the driver's native prepared statements.
$db1_con->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); // Let's use exceptions so we can try/catch errors.
}catch(PDOException $e){
echo "<p>Connection failed for $db1_host: " . $e->getMessage() . '</p>';
exit;
}
try{
$db2_con = new PDO("mysql:host=$db2_host;dbname=information_schema", $db2_username, $db2_password);
$db2_con->setAttribute(PDO::ATTR_EMULATE_PREPARES, FALSE); // Try to use the driver's native prepared statements.
$db2_con->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION); // Let's use exceptions so we can try/catch errors.
}catch(PDOException $e){
echo "<p>Connection failed for $db2_host: " . $e->getMessage() . '</p>';
exit;
}
if (NULL !== $db1_con && NULL !== $db2_con){
echo "<h2>Column Analysis</h2>";
$sql = 'SELECT * FROM information_schema.COLUMNS WHERE TABLE_SCHEMA = ? ORDER BY TABLE_NAME, ORDINAL_POSITION';
$statement1 = $db1_con->prepare($sql);
$statement1->bindValue(1, $db1_dbname);
$statement2 = $db2_con->prepare($sql);
$statement2->bindValue(1, $db2_dbname);
if (TRUE === $statement1->execute()){
while ($row = $statement1->fetch(PDO::FETCH_ASSOC)){
$db1_tables[$row['TABLE_NAME']][$row['COLUMN_NAME']] = array();
foreach ($row AS $key => $value){
$db1_tables[$row['TABLE_NAME']][$row['COLUMN_NAME']][$key] = $value;
}
}
}
if (TRUE === $statement2->execute()){
while ($row = $statement2->fetch(PDO::FETCH_ASSOC)){
$db2_tables[$row['TABLE_NAME']][$row['COLUMN_NAME']] = array();
foreach ($row AS $key => $value){
$db2_tables[$row['TABLE_NAME']][$row['COLUMN_NAME']][$key] = $value;
}
}
}
foreach ($db1_tables AS $table => $info){
if (!isset($db2_tables[$table])){
echo "<p>Table <strong>$table</strong> does not exist in the SECOND database!</p>";
}else{
foreach ($info AS $column => $data){
if (!isset($db2_tables[$table][$column])){
echo "<p>Column <strong>$column</strong> does not exist in table <strong>$table</strong> in the SECOND database!</p>";
}else{
if (count($data)){
foreach ($data AS $key => $value){
if ($db1_tables[$table][$column][$key] !== $db2_tables[$table][$column][$key]){
echo "<p>Column <strong>$column</strong> in table <strong>$table</strong> has differing characteristics for <strong>$key</strong> (". $db1_tables[$table][$column][$key] ." vs. ". $db2_tables[$table][$column][$key] .")</p>";
}
}
}
}
}
}
}
foreach ($db2_tables AS $table => $info){
if (!isset($db1_tables[$table])){
echo "<p>Table <strong>$table</strong> does not exist in the FIRST database!</p>";
}else{
foreach ($info AS $column => $data){
if (!isset($db1_tables[$table][$column])){
echo "<p>Column <strong>$column</strong> does not exist in table <strong>$table</strong> in the FIRST database!</p>";
}else{
if (count($data)){
foreach ($data AS $key => $value){
if ($db2_tables[$table][$column][$key] !== $db1_tables[$table][$column][$key]){
echo "<p>Column <strong>$column</strong> in table <strong>$table</strong> has differing characteristics for <strong>$key</strong> (". $db2_tables[$table][$column][$key] ." vs. ". $db1_tables[$table][$column][$key] .")</p>";
}
}
}
}
}
}
}
echo "<h2>Constraint Analysis</h2>";
$sql = 'SELECT * FROM information_schema.KEY_COLUMN_USAGE WHERE TABLE_SCHEMA = ? ORDER BY TABLE_NAME, ORDINAL_POSITION';
$statement1 = $db1_con->prepare($sql);
$statement1->bindValue(1, $db1_dbname);
$statement2 = $db2_con->prepare($sql);
$statement2->bindValue(1, $db2_dbname);
if (TRUE === $statement1->execute()){
while ($row = $statement1->fetch(PDO::FETCH_ASSOC)){
foreach ($row AS $key => $value){
$db1_constraints[$row['TABLE_NAME']][$row['COLUMN_NAME']][$key] = $value;
}
}
}
if (TRUE === $statement2->execute()){
while ($row = $statement2->fetch(PDO::FETCH_ASSOC)){
foreach ($row AS $key => $value){
$db2_constraints[$row['TABLE_NAME']][$row['COLUMN_NAME']][$key] = $value;
}
}
}
foreach ($db1_constraints AS $table => $info){
foreach ($info AS $column => $data){
if (isset($db2_constraints[$table][$column])){
if (count($data)){
foreach ($data AS $key => $value){
if ('CONSTRAINT_NAME' !== $key && $db1_constraints[$table][$column][$key] !== $db2_constraints[$table][$column][$key]){
echo "<p>Column <strong>$column</strong> in table <strong>$table</strong> has differing characteristics for <strong>$key</strong> (". $db1_constraints[$table][$column][$key] ." vs. ". $db2_constraints[$table][$column][$key] .")</p>";
}
}
}
}else{
echo "<p>Column <strong>$column</strong> in table <strong>$table</strong> is missing a constraint in the SECOND database!</p>";
}
}
}
foreach ($db2_constraints AS $table => $info){
foreach ($info AS $column => $data){
if (isset($db1_constraints[$table][$column])){
if (count($data)){
foreach ($data AS $key => $value){
if ('CONSTRAINT_NAME' !== $key && $db2_constraints[$table][$column][$key] !== $db1_constraints[$table][$column][$key]){
echo "<p>Column <strong>$column</strong> in table <strong>$table</strong> has differing characteristics for <strong>$key</strong> (". $db2_constraints[$table][$column][$key] ." vs. ". $db1_constraints[$table][$column][$key] .")</p>";
}
}
}
}else{
echo "<p>Column <strong>$column</strong> in table <strong>$table</strong> is missing a constraint in the FIRST database!</p>";
}
}
}
}
?>
Edited to add code that shows differences in constraints as well.
