Find the diff between a database and a mysql file - php

I have a database and a MySQL dump file of the same database with some modifications (some tables might have been added, some dropped, and some table definitions changed). How can I find the changes between the two databases and their tables using PHP (structure only)?

I did something like this some time ago. It boils down to having two databases, dumping each to a file, and comparing the files with diff, adjusting for known differences such as database names.
There are also tools that do this interactively by querying two databases and letting you sync the differences.
I don't know of a tool that compares a dump file with a database, though.
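The dump-and-diff approach can be sketched as follows. This is a minimal sketch in Python, assuming both structure-only dumps (for example from `mysqldump --no-data`) have already been read into strings. The function names `normalize_dump` and `schema_diff`, the database names, and the sample dumps are illustrative and not part of any existing tool.

```python
import difflib
import re


def normalize_dump(dump, db_name):
    """Strip known, irrelevant differences from a structure-only dump."""
    lines = []
    for line in dump.splitlines():
        line = line.rstrip()
        # Skip blank lines and SQL comments, which differ between dumps.
        if not line or line.startswith("--"):
            continue
        # Table-level AUTO_INCREMENT counters depend on data, not structure.
        line = re.sub(r"\s*AUTO_INCREMENT=\d+", "", line)
        # Replace the database name with a placeholder so it is never a change.
        line = line.replace("`" + db_name + "`", "`DB`")
        lines.append(line)
    return lines


def schema_diff(old_dump, old_db, new_dump, new_db):
    """Return unified-diff lines between two normalized dumps."""
    old_lines = normalize_dump(old_dump, old_db)
    new_lines = normalize_dump(new_dump, new_db)
    return list(difflib.unified_diff(old_lines, new_lines, "old", "new", lineterm=""))


# Made-up sample dumps: only the length of `name` really differs.
OLD_DUMP = """-- dump of the live database
USE `app_live`;
CREATE TABLE `users` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(50) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=42;
"""

NEW_DUMP = """-- dump loaded from the .sql file
USE `app_file`;
CREATE TABLE `users` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=97;
"""

for diff_line in schema_diff(OLD_DUMP, "app_live", NEW_DUMP, "app_file"):
    print(diff_line)
```

Normalizing before diffing keeps the output limited to real structural changes. Which differences count as "known" (comments, counters, database names) is a judgment call that depends on how the dumps were produced.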

Here is a MySQL procedure that does exactly that: http://www.artfulsoftware.com/infotree/qrytip.php?id=624
Hope it helps (PS: there are a lot of useful queries here: www.artfulsoftware.com/infotree/queries.php )
PS: To get the data in PHP, just run the query and grab the info, then do whatever you want with it.
PS2: Import your old database dump into another database, and then use this procedure to compare the databases.

Related

is there a tool or any quick way to extract db tables automatically from my php codes?

I recently started re-engineering a PHP/MySQL project that was created about 7 years ago. I have only the PHP and HTML code, with no MySQL database and no documentation of the database structure.
Is there any tool that can help me extract the tables of my database from the PHP files? The PHP files contain INSERT, SELECT, and UPDATE queries.
I am thinking of a tool (such as a crawler) that takes my PHP files as input and produces SQL CREATE TABLE statements as output.
I'm afraid there is no such tool. There's no guarantee that your PHP code even makes reference to every table or every column. You might see code like this:
mysqli_query($conn, "INSERT INTO mytable VALUES ('1234', 'abc', NULL, DEFAULT)");
What are the column names? What are their data types? Is the first column an integer, or is it just the developer's habit to put numbers inside string delimiters? What's the default value referenced by the fourth column? How could an automated tool infer these things by scanning this code?
Many details such as triggers, constraints, and indexes, are not referenced at all by PHP code, but they're necessary to make the application work.
If the database has any stored procedures, the PHP code wouldn't have any knowledge of the logic inside the procedure.
mysqli_query($conn, "CALL myprocedure(1234, 'north')");
The same problem exists for the query in VIEW definitions.
Reverse engineering this project is going to be a time-consuming forensic task, plus a lot of guesswork.
It really illustrates the importance of including the current schema dump with the source code of a project.

What is the procedure to normalize a database with PHP?

I just took over a pretty terrible database design that relies heavily on comma-separated values to store data. I know, I know, it is hell.
The database is MySQL, and I currently access it using MySQL Workbench.
I already have an idea of what to remove and which new relation tables are needed.
So my question is: how should I go about migrating the comma-separated data to the new tables? Are there any tools that specialize in normalizing a database?
Edit:
The server code is in PHP.
Define your new tables and attributes first.
Then use PHP, Python, or your favorite language with MySQL calls to write a one-time converter that loops over the old tables, reads their records, and inserts the proper records into the new tables.
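The core of such a converter might look roughly like this. The sketch is in Python and leaves out the MySQL calls. The names (`split_csv_column`, posts, tags) are hypothetical, and `OLD_ROWS` stands in for rows fetched from the old table as (id, comma-separated string) pairs.

```python
def split_csv_column(old_rows):
    """Turn (id, 'a,b,c') rows into lookup values and (id, value_id) link rows."""
    tag_ids = {}   # tag name -> surrogate id for the new lookup table
    links = []     # (post_id, tag_id) rows for the new relation table
    seen = set()   # guards against duplicate links such as 'php,php'
    for post_id, csv_value in old_rows:
        for raw in (csv_value or "").split(","):
            tag = raw.strip()
            if not tag:
                continue  # skip empty pieces such as 'a,,b' or NULL columns
            tag_id = tag_ids.setdefault(tag, len(tag_ids) + 1)
            if (post_id, tag_id) not in seen:
                seen.add((post_id, tag_id))
                links.append((post_id, tag_id))
    return tag_ids, links


# Made-up rows as they might come back from the old table.
OLD_ROWS = [(1, "php, mysql"), (2, "mysql,,sql "), (3, None), (4, "php,php")]

TAG_IDS, LINKS = split_csv_column(OLD_ROWS)
print(TAG_IDS)
print(LINKS)
```

In a real migration, the two results would be written to the new tables with parameterized INSERT statements, ideally inside a transaction so a failed run can be rolled back.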
It appears you are looking for standard practices. There are varying degrees of denormalized databases out there. The ones I have come across were normalized with custom code and tools.
SQL Server Integration Services (SSIS) can be used in some cases. In your case, I'd build a migration script that involves:
creation of normalized tables
creating stored procedure or PHP script(s) to read data from denormalized table, transform it and load it into normalized table
creating a log table or log file
performing the migration in sandbox; write logs while doing so
version control the script
correct the proc/script as needed
create another sandbox
run the full script on sandbox
if successful, run the full script on prod (with logging)
SSIS is used for ETL in many organizations; it is a standard tool in the Microsoft BI stack and can also be used to migrate data between non-Microsoft databases.
An open-source ETL tool called Talend might also help in transforming your data. Personally, I believe a PHP script will be the fastest and easiest way to manipulate the data.

Record or Compare mysql data before and after operation -reverse engineering

My problem is that I'm using a HUGE web application (a school system) with no documentation of its internal logic. I need to make a bulk update of a particular value, but I don't know which tables in the MySQL database contain the relevant data. The app itself runs on PHP. Is there an easy way to compare the database before and after I perform an operation, so I can see which tables are affected? I tried running a diff tool on SQL dumps taken before and after, but the database is so huge that this is impractical. I'm wondering whether there is something better, or whether I can configure PHP to log every MySQL operation along with the file that triggers it.
You may want to run the performance tool in MySQL Workbench and look at the performance reports and statement analysis. This works if you pick a time when the system is not being used and then run a function in the web app that updates the tables with the values you need to change. Look at the performance report before and after your experiment and look for the SQL statements that show activity. It's not perfect, but it will at least help you home in on the data you're looking for. The big 'gotcha' is when the value you want to change is derived dynamically during the query. In that case you'll have to understand how the derivation works and which source columns it uses. But, again, this gives you a brute-force starting place.
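A different approach from the one described above, offered only as a sketch: instead of diffing full dumps, record one checksum per table before and after the operation (MySQL's `CHECKSUM TABLE` statement can produce these) and compare the two snapshots. The Python below assumes the snapshots have already been collected into dictionaries. The table names and checksum values are made up for illustration.

```python
def changed_tables(before, after):
    """Compare two {table_name: checksum} snapshots taken around an operation."""
    report = {"added": [], "removed": [], "modified": []}
    for table in sorted(set(before) | set(after)):
        if table not in before:
            report["added"].append(table)
        elif table not in after:
            report["removed"].append(table)
        elif before[table] != after[table]:
            report["modified"].append(table)
    return report


# Made-up snapshots: only `grades` changed during the operation.
BEFORE = {"students": 1111, "grades": 2222, "teachers": 3333}
AFTER = {"students": 1111, "grades": 9999, "teachers": 3333}

print(changed_tables(BEFORE, AFTER))
```

This only narrows the search down to tables. Finding the affected rows and columns still requires looking at the data or the logged statements.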

Generate .sql file vs execute queries with php

I am trying to import some data from a table in one database into another database.
I cannot just copy the rows, because the two tables have different formats.
With the fetched data from one database, I am able to create insert queries.
I want to know which is better:
Execute those queries in PHP itself by creating a new connection to second database.
Write all queries to .sql file and then import it directly in second database.
I am looking at the aspects of performance and ease of implementation.
Note: I am expecting the data in the table to be more than ten thousand rows
If you go with the first option, there is a chance you could make some mistakes.
I'd recommend the second option: write all the queries to a .sql file and then import it directly into the second database.
I would certainly go for the second option. Why use PHP for a one-time action?
You can solve this in the database with SQL alone.
I would go for the second option.
Then I would:
Get an overview of both table structures.
Export the data from the first table in a flat file format like CSV.
If necessary, transform the data from the first table to the second using a script or a tool.
Import the modified data into the second table.
The database vendors have good tools for exporting, manipulating and importing data.
If only the names of the tables differ, the import features of vendor tools often have good functionality for mapping data from one table to another. In my own case I've used Oracle SQL Developer, but let me know your vendor and I can point you in the right direction.
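The export, transform, and import steps could be sketched like this in Python. The column names and the `COLUMN_MAP` mapping are hypothetical. The sketch assumes the first table has been exported to CSV with a header row and that only column names and column selection differ between the two tables.

```python
import csv
import io

# Hypothetical mapping: column name in the source table -> name in the target table.
COLUMN_MAP = {"full_name": "name", "mail": "email"}


def transform_csv(source_text, column_map):
    """Rename and select columns of a CSV export so it matches the target table."""
    reader = csv.DictReader(io.StringIO(source_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=list(column_map.values()), lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Keep only the mapped columns, under their new names.
        writer.writerow({new: row[old] for old, new in column_map.items()})
    return out.getvalue()


# Made-up export of the first table.
SOURCE = "id,full_name,mail\n1,Ann,ann@example.com\n2,Bob,bob@example.com\n"

print(transform_csv(SOURCE, COLUMN_MAP))
```

The resulting file can then be loaded with the target database's own CSV import feature, which is usually faster than executing one INSERT per row.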

Simulate MySQL connection to analyze queries to rebuild table structure (reverse-engineering tables)

I have just been tasked with recovering/rebuilding an extremely large and complex website that had no backups and was fully lost. I have a complete (hopefully) copy of all the PHP files; however, I have absolutely no clue what the database structure looked like (other than that it certainly had at least 50 or so tables, so it is fairly complex). All data has been lost and the original developer was fired about a year ago in a fiery feud (so I am told). I have been a PHP developer for quite a while and am plenty comfortable trying to sort through everything and get the application/site back up and running, but the lack of a database will be a huge struggle. So, is there any way to simulate a MySQL connection to some software that will capture all incoming queries and attempt to use the requested field and table names to rebuild the structure?
It seems to me that if I start clicking through the application and it passes a query for
SELECT name, email, phone FROM contact_table WHERE contact_id='1'
...there should be a way to capture that info and assume there was a table called "contact_table" that had at least 4 fields with those names... If I can do that repetitively, each time adding some sample data to the discovered fields and then moving on to another page, then eventually I should have a rough copy of most of the database structure (at least all public-facing parts). This would be MUCH easier than manually reading all the code and pulling out every reference, reading all the joins and subqueries, and sorting through it all manually.
Anyone ever tried this before? Any other ideas for reverse-engineering the database structure from PHP code?
mysql> SET GLOBAL general_log=1;
With this configuration enabled, the MySQL server writes every query to a log file (datadir/hostname.log by default), even those queries that have errors because the tables and columns don't exist yet.
http://dev.mysql.com/doc/refman/5.6/en/query-log.html says:
The general query log can be very useful when you suspect an error in a client and want to know exactly what the client sent to mysqld.
As you click around in the application, it should generate SQL queries, and you can keep a terminal window open running tail -f on the general query log. As you see queries that reference tables or columns that don't exist yet, create those tables and columns. Then repeat clicking around in the app.
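Turning a logged query into a draft table can be partly mechanized. The following Python sketch handles only simple single-table SELECT statements like the one in the question. The regular expressions, the function names, and the placeholder `TEXT` type are assumptions, since real column types cannot be inferred from a query.

```python
import re

SELECT_RE = re.compile(
    r"SELECT\s+(?P<cols>.+?)\s+FROM\s+`?(?P<table>\w+)`?"
    r"(?:\s+WHERE\s+(?P<where>.+))?",
    re.IGNORECASE | re.DOTALL,
)
# Naive: a name directly followed by a comparison operator. It does not
# understand string literals, aliases, functions, or subqueries.
WHERE_COL_RE = re.compile(
    r"\b([A-Za-z_]\w*)`?\s*(?:=|<|>|!=|LIKE\b|IN\b)", re.IGNORECASE
)


def draft_table(query):
    """Return (table, columns) guessed from a simple SELECT, or None."""
    match = SELECT_RE.match(query.strip())
    if not match or match.group("cols").strip() == "*":
        return None  # SELECT * reveals no column names
    columns = [c.strip(" `") for c in match.group("cols").split(",")]
    # Columns compared in the WHERE clause exist too, even if not selected.
    for name in WHERE_COL_RE.findall(match.group("where") or ""):
        if name not in columns:
            columns.append(name)
    return match.group("table"), columns


def create_statement(table, columns):
    """Emit a draft CREATE TABLE; TEXT is only a placeholder type."""
    body = ",\n".join("  `{}` TEXT".format(c) for c in columns)
    return "CREATE TABLE `{}` (\n{}\n);".format(table, body)


QUERY = "SELECT name, email, phone FROM contact_table WHERE contact_id='1'"
TABLE, COLUMNS = draft_table(QUERY)
print(create_statement(TABLE, COLUMNS))
```

For the query from the question this yields a four-column draft of contact_table. Every type, key, and default still has to be filled in by hand, which is where the difficulties listed next come in.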
A number of things may make this task even harder:
If the queries use SELECT *, you can't infer the names of columns or even how many columns there are. You'll have to inspect the application code to see what column names are used after the query result is returned.
If INSERT statements omit the list of column names, you can't know what columns there are or how many. On the other hand, if INSERT statements do specify a list of column names, you can't know if there are more columns that were intended to take on their default values.
Data types of columns won't be apparent from their names, nor string lengths, nor character sets, nor default values.
Constraints, indexes, primary keys, foreign keys won't be apparent from the queries.
Some tables may exist (for example, lookup tables), even though they are never mentioned by name by the queries you find in the app.
Speaking of lookup tables, many databases have sets of initial values stored in tables, such as all possible user types and so on. Without the knowledge of the data for such lookup tables, it'll be hard or impossible to get the app working.
There may have been triggers and stored procedures. Procedures may be referenced by CALL statements in the app, but you can't guess what the code inside triggers or stored procedures was intended to be.
This project is bound to be very laborious, time-consuming, and involve a lot of guesswork. The fact that the employer had a big feud with the developer might be a warning flag. Be careful to set the expectations so the employer understands it will take a lot of work to do this.
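PS: I'm assuming you are using a recent version of MySQL, such as 5.1 or later. If you use MySQL 5.0 or earlier, the general log cannot be switched on at runtime; instead, add the log option to your /etc/my.cnf (it takes an optional log file name as its value) and restart mysqld.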
Crazy task. Is the code such that the DB queries are at all abstracted? Could you replace the query functions with something which would log the tables, columns and keys, and/or actually create the tables or alter them as needed, before firing off the real query?
Alternatively, it might be easier to do some text processing, regex matching, grep/sort/uniq on the queries in all of the PHP files. The goal would be to get it down to a manageable list of all tables and columns in those tables.
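The text-processing idea can be sketched in Python as a rough equivalent of grep/sort/uniq. The regular expression is an assumption about how the queries are written (table names directly after FROM, JOIN, INTO, or UPDATE). It will miss dynamically built table names and can pick up ordinary English words in comments. The PHP snippet is made up for illustration.

```python
import re
from collections import Counter

TABLE_RE = re.compile(
    r"\b(?:FROM|JOIN|INTO|UPDATE)\s+`?([A-Za-z_]\w*)`?", re.IGNORECASE
)


def count_tables(php_source):
    """Count candidate table names that follow common SQL keywords."""
    return Counter(name.lower() for name in TABLE_RE.findall(php_source))


# Made-up PHP source; in practice, read every .php file of the project.
PHP_SOURCE = '''
$r = mysql_query("SELECT name FROM contact_table WHERE contact_id='1'");
mysql_query("INSERT INTO orders (contact_id) VALUES (1)");
mysql_query("UPDATE contact_table SET phone='555' WHERE contact_id=1");
mysql_query("SELECT o.id FROM orders o JOIN contact_table c ON c.contact_id = o.contact_id");
'''

for table, hits in count_tables(PHP_SOURCE).most_common():
    print(table, hits)
```

The resulting list is a starting inventory of tables. Column names need a second pass over the queries that mention each table.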
I once had a similar task, fortunately I was able to find an old backup.
If you could find a way to extract the queries, say by regex matching all of the occurrences of mysql_query or whatever extension was used to query the database, you could then use something like php-sql-parser to parse the queries, and hopefully from that you would be able to get a list of most tables and columns. However, that is only half the battle. The other half is determining the data type of every single column, and that would be nearly impossible to do automatically from PHP. It would basically require you to inspect the code line by line. There are best practices, but who's to say that the old dev followed them? Whether a column called "date" should be stored as DATE, DATETIME, INT, or VARCHAR(50) with some sort of ugly manual string handling can only be determined by looking at the actual code.
Good luck!
You could build some triggers with the BEFORE action time, but unfortunately this will only work for INSERT, UPDATE, or DELETE commands.
http://dev.mysql.com/doc/refman/5.0/en/create-trigger.html
