Is this the right logic for a database migration (using PHP)?

This is mostly a sanity check: I want to see whether my idea of a migration matches how more experienced folks would do it.
I have two databases, one with my current website stuff and one with my new development stuff. I now want to load all of the current website's data into the new development database.
These databases have different schemas: different structures in terms of column names, and a bit more decoupling of the data within the new development database's tables.
To do this I am using PHP and SQL, whereby I'm querying specific tables with SQL into multidimensional arrays to get all the data needed for my new development tables, checking for repeat data and ordering it.
So now I have a multidimensional array full of data extracted from the old tables, ready for my new database table. I have renamed all the keys in the multidimensional array to match the column names of the new database, so technically the array is a copy of what I want to insert.
Then I insert that multidimensional array into the new database and, Bob's your uncle, a database migration?
Does this sound right, and is there some suggested reading you might point me to?
Regards Mike
EDIT
By using multidimensional arrays to collect all the data I want to put into my new database, won't I be double-handling the data and therefore using a lot more resources in my migration script?

I have never tried this myself, but I am pretty certain you can access two databases at the same time. That being said, you can extract from DB1, do your checks, changes, etc., then just insert into the new DB.
Here is a Stack Overflow question that shows how to connect to two DBs.
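A minimal sketch of that approach using PDO; the DSNs, credentials, and the table/column names (`users`, `members`, etc.) are placeholders, not anything from the question:

```php
<?php
// Two simultaneous connections, one per database. All connection
// details and table/column names below are placeholders.
$oldDb = new PDO('mysql:host=localhost;dbname=old_site', 'user', 'pass');
$newDb = new PDO('mysql:host=localhost;dbname=new_site', 'user', 'pass');

// Read from one connection, write through the other in the same script.
$insert = $newDb->prepare(
    'INSERT INTO members (legacy_id, full_name) VALUES (?, ?)'
);
foreach ($oldDb->query('SELECT id, name FROM users') as $row) {
    // Checks, changes, de-duplication etc. on $row would go here.
    $insert->execute([$row['id'], $row['name']]);
}
```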

So I've researched creating a migration script, and I thought I'd explain a general outline of how to implement it for anyone else who has to do this. Bear in mind I'm only using basic procedural PHP, no classes or functions. I'm going to focus on one particular table, and from that you can extrapolate to the whole database.
1) Create a PHP file specifically for collating the data of this one table (e.g. table1.php)
2) Create all the SQL statements you'll need to extract the relevant information for that particular table
3) For each SQL statement, loop over the result set and put the fetched data into an array
4) Then create a loop and an SQL statement for inserting the data from the arrays you just populated into the new database. If you want to check for repeat data, implement that check within this loop and SQL statement.
5) Note: you can add a timer and counters to track how long the migration took, how many rows were transferred, and how many duplicates were found.
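The steps above can be sketched as a single script; everything here (connection details, table and column names) is illustrative rather than taken from a real schema:

```php
<?php
// table1.php -- illustrative sketch of steps 1-5. All connection
// details, table names, and column names are placeholders.
$oldDb = new PDO('mysql:host=localhost;dbname=old_site', 'user', 'pass');
$newDb = new PDO('mysql:host=localhost;dbname=new_site', 'user', 'pass');

$start = microtime(true);
$inserted = 0;
$duplicates = 0;

// Steps 2-3: extract the relevant rows into an array.
$rows = $oldDb->query('SELECT title, body FROM old_articles')
              ->fetchAll(PDO::FETCH_ASSOC);

// Step 4: loop and insert, skipping repeat data.
$exists = $newDb->prepare('SELECT COUNT(*) FROM articles WHERE title = ?');
$insert = $newDb->prepare('INSERT INTO articles (title, body) VALUES (?, ?)');
foreach ($rows as $row) {
    $exists->execute([$row['title']]);
    if ($exists->fetchColumn() > 0) {
        $duplicates++;
        continue;
    }
    $insert->execute([$row['title'], $row['body']]);
    $inserted++;
}

// Step 5: timer and counters.
printf("Inserted %d rows, skipped %d duplicates, in %.2fs\n",
       $inserted, $duplicates, microtime(true) - $start);
```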
This may be obvious to most people, and might be considered wrong by others, but my original plan of collating the data in a "table-equivalent multidimensional array" and then inserting that array into the table meant I was double-handling the data (I think). So I assumed it would be more efficient, and a lot simpler, to do it this way.
I hope this basic outline helps anyone considering doing the same thing for the first time. If anyone has thoughts on how to make this operation more effective, please feel free to rip this explanation apart; it's only what I've implemented myself through trial and error, as I have no real experience with this.
Regards Mike


Recreate a database using existing php code

So I have an old website which was coded over an extended period of time but has been inactive for three or so years. I have the full PHP source for the site, but the problem is I no longer have a backup of the database. I'm wondering what the best approach to recreating the database would be? It is a large site, so manually going through each PHP file and trying to keep track of which tables are referenced is no small task. I've tried googling for the answer but have had no luck. Does anyone know of any tools available to help extract this information from the PHP and at least give me the basis of a database skeleton? Otherwise, has anyone ever had to do this? Any tips to help me along and possibly speed up the process? It is a MySQL database I'm trying to recreate.
The way I would do it:
Write a dummy subset of mysqli (or whatever interface was used to access the DB) that intercepts all DB accesses.
Replace all DB accesses with your dummy version.
The basic idea is to emulate the DB so that the PHP code runs long enough to activate the various DB accesses, which in turn will allow you to analyze the way the DB is built and used.
From within these dummy functions:
print the SQL code used
regenerate just enough dummy results to let the rest of the code run, based on the tables and fields mentioned in the query parameters and the PHP code that retrieves them (you won't learn much from a SELECT *, but you can see what fields the PHP code expects to get from it)
once you have understood enough of the DB structure, recreate the tables and let the original code work on them little by little
have the previous designer flogged to death for not having provided a way to recreate the DB programmatically
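A sketch of what one such dummy function might look like; the wrapper name and the fake row it returns are assumptions, since the real keys have to be guessed from how the surrounding PHP code reads its results:

```php
<?php
// Dummy replacement for the site's query function: log every query,
// then return just enough fake data for the calling code to keep
// running. dummy_query() and the fake row's keys are assumptions.
$loggedQueries = [];

function dummy_query(string $sql): array
{
    global $loggedQueries;
    $loggedQueries[] = $sql;   // record the SQL for later analysis

    // A single fake row; expand it as the calling code demands
    // more fields (each missing-index notice reveals a column name).
    return [['id' => 1, 'name' => 'dummy']];
}
```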
There are currently two possible answers, depending on information you haven't provided.
1) You can't do this.
PHP is a dynamically typed language. You could scan your SQL statements for field and table names, but the result will not be complete: if there is a SELECT * FROM table, you can't see the fields, so you need to check where the PHP code accesses them, which may be by name or by index. You can count yourself lucky if it is by name, because then you can extract the field names. Finally, the data types will be missing, as will other details: which columns are indexed, what the primary keys are, constraints, etc.
2) Easy, yes you can!
Because your PHP uses a modern framework that contains an ORM, which created the database for you; the meta information is included in the PHP classes/design.
Just check the framework's manual for how to recreate the database.

Search MySQL for rows where a JSON key is not empty

I've got a JSON string stored in a MySQL DB, and now I'm trying to find a way to check whether a specific key contains any value at all.
I've been googling around, but most solutions point to finding a specific value, whereas I want to check whether there's any value there or not.
This is to be implemented in some sort of PHP check function, and if there's a way to get all results from MySQL in one query instead of doing multiple queries, that'd be great.
Example:
row 1 {"Name":"Jane","Group":"","customernumber":"12345"}
row 2 {"Name":"Mike","Group":"Sales","customernumber":"23456"}
row 3 {"Name":"Steve","Group":"","customernumber":"34567"}
The resulting array would contain Mike with his details, and so on.
A little help, please?
EDIT
I didn't choose to store the data like this; it's the CMS I'm working with that stores custom forms this way.
I've got about 400 DB entries, and I thought of letting MySQL do the filtering, since I don't know whether storing that many results in a PHP array would be bad for performance; several users are going to view pages that use these results, causing quite frequent requests.
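One way to push that filter into MySQL in a single query, assuming MySQL 5.7+ (for the JSON functions) and illustrative table/column names:

```php
<?php
// Let MySQL do the filtering in one query. Requires MySQL 5.7+ for
// JSON_EXTRACT/JSON_UNQUOTE; the table name (entries) and JSON column
// (data) are placeholders for the CMS's actual schema.
$db  = new PDO('mysql:host=localhost;dbname=cms', 'user', 'pass');
$sql = "SELECT *
        FROM entries
        WHERE JSON_UNQUOTE(JSON_EXTRACT(data, '$.Group')) <> ''";
$rows = $db->query($sql)->fetchAll(PDO::FETCH_ASSOC);
// $rows now contains only entries like Mike's, where Group is
// non-empty (a missing key yields NULL, which is filtered out too).
```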

Simulate MySQL connection to analyze queries to rebuild table structure (reverse-engineering tables)

I have just been tasked with recovering/rebuilding an extremely large and complex website that had no backups and was fully lost. I have a complete (hopefully) copy of all the PHP files however I have absolutely no clue what the database structure looked like (other than it is certainly at least 50 or so tables...so fairly complex). All data has been lost and the original developer was fired about a year ago in a fiery feud (so I am told). I have been a PHP developer for quite a while and am plenty comfortable trying to sort through everything and get the application/site back up and running...but the lack of a database will be a huge struggle. So...is there any way to simulate a MySQL connection to some software that will capture all incoming queries and attempt to use the requested field and table names to rebuild the structure?
It seems to me that if I start clicking through the application and it passes a query for
SELECT name, email, phone from contact_table WHERE
contact_id='1'
...there should be a way to capture that info and assume there was a table called "contact_table" that had at least 4 fields with those names... If I can do that repetitively, each time adding some sample data to the discovered fields and then moving on to another page, then eventually I should have a rough copy of most of the database structure (at least all public-facing parts). This would be MUCH easier than manually reading all the code and pulling out every reference, reading all the joins and subqueries, and sorting through it all manually.
Anyone ever tried this before? Any other ideas for reverse-engineering the database structure from PHP code?
mysql> SET GLOBAL general_log=1;
With this configuration enabled, the MySQL server writes every query to a log file (datadir/hostname.log by default), even those queries that have errors because the tables and columns don't exist yet.
http://dev.mysql.com/doc/refman/5.6/en/query-log.html says:
The general query log can be very useful when you suspect an error in a client and want to know exactly what the client sent to mysqld.
As you click around in the application, it should generate SQL queries, and you can have a terminal window open running tail -f on the general query log. As you see queries run by that reference tables or columns that don't exist yet, create those tables and columns. Then repeat clicking around in the app.
A number of things may make this task even harder:
If the queries use SELECT *, you can't infer the names of columns or even how many columns there are. You'll have to inspect the application code to see what column names are used after the query result is returned.
If INSERT statements omit the list of column names, you can't know what columns there are or how many. On the other hand, if INSERT statements do specify a list of column names, you can't know if there are more columns that were intended to take on their default values.
Data types of columns won't be apparent from their names, nor string lengths, nor character sets, nor default values.
Constraints, indexes, primary keys, foreign keys won't be apparent from the queries.
Some tables may exist (for example, lookup tables), even though they are never mentioned by name by the queries you find in the app.
Speaking of lookup tables, many databases have sets of initial values stored in tables, such as all possible user types and so on. Without the knowledge of the data for such lookup tables, it'll be hard or impossible to get the app working.
There may have been triggers and stored procedures. Procedures may be referenced by CALL statements in the app, but you can't guess what the code inside triggers or stored procedures was intended to be.
This project is bound to be very laborious, time-consuming, and involve a lot of guesswork. The fact that the employer had a big feud with the developer might be a warning flag. Be careful to set the expectations so the employer understands it will take a lot of work to do this.
PS: I'm assuming you are using a recent version of MySQL, such as 5.1 or later. If you use MySQL 5.0 or earlier, you should just add log=1 to your /etc/my.cnf and restart mysqld.
Crazy task. Is the code such that the DB queries are at all abstracted? Could you replace the query functions with something which would log the tables, columns and keys, and/or actually create the tables or alter them as needed, before firing off the real query?
Alternatively, it might be easier to do some text processing, regex matching, grep/sort/uniq on the queries in all of the PHP files. The goal would be to get it down to a manageable list of all tables and columns in those tables.
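A rough sketch of that text-processing route in PHP; the regex is deliberately simple and will miss dynamically built queries, so treat its output as a starting list, not a complete schema:

```php
<?php
// Scan source text for table names that follow FROM / JOIN /
// INSERT INTO / UPDATE inside query strings.
function extract_table_names(string $source): array
{
    preg_match_all(
        '/\b(?:FROM|JOIN|INSERT\s+INTO|UPDATE)\s+`?(\w+)`?/i',
        $source,
        $matches
    );
    $tables = array_values(array_unique($matches[1]));
    sort($tables);
    return $tables;
}

// Example over one line of application code:
$src = '$r = mysql_query("SELECT name FROM contact_table '
     . 'JOIN orders ON orders.cid = contact_table.id");';
print_r(extract_table_names($src));  // contact_table, orders
```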
I once had a similar task, fortunately I was able to find an old backup.
If you could find a way to extract the queries, say by regex-matching all occurrences of mysql_query or whatever extension was used to query the database, you could then use something like php-sql-parser to parse the queries, and hopefully from that you would be able to get a list of most tables and columns. However, that is only half the battle. The other half is determining the data types for every single column, and that would be rather impossible to do automatically from PHP. It would basically require you to inspect the code line by line. There are best practices, but who's to say the old dev followed them? Whether a column called "date" should be stored as DATE, DATETIME, INT, or VARCHAR(50) with some sort of ugly manual string handling can only be determined by looking at the actual code.
Good luck!
You could build some triggers with the BEFORE action time, but unfortunately this will only work for INSERT, UPDATE, or DELETE commands.
http://dev.mysql.com/doc/refman/5.0/en/create-trigger.html

Increment Database Table Names

I'm looking to create a PHP script that creates a new table within a database, tied to a label; within that table there would be rows of data relating to the status of the label. However, I'm not sure how to get the PHP script (or MySQL) to increment the name of the table. All I can find is plenty of detail on auto-incrementing columns for rows.
Thoughts?
You're doing it wrong. If you have scripts that create and delete regular tables during the project's live phase, more often than not it is an indicator of bad design.
If you're keen on OOP, you may think of a table as a class definition and each row as an object (or an entity, if you wish); I know it is a stretch, but it has some similarities.
Take some time to read about database normalization and database design; this project, and every one after it, will benefit much more than if you spend the time researching a working solution for the current problem you are facing.
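For contrast, a sketch of the normalized layout this answer is pointing at: one fixed pair of tables instead of a new table per label. The connection details and all table/column names are invented for illustration:

```php
<?php
// Normalized alternative: a new label is an INSERT, never a new table.
// Connection details and all table/column names are illustrative.
$db = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

// One row per label...
$db->exec('CREATE TABLE labels (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
)');

// ...and every label\'s status rows in one shared table,
// linked back by label_id.
$db->exec('CREATE TABLE label_statuses (
    id       INT AUTO_INCREMENT PRIMARY KEY,
    label_id INT NOT NULL,
    status   VARCHAR(100) NOT NULL,
    FOREIGN KEY (label_id) REFERENCES labels(id)
)');

// "Creating a new label" now touches no DDL at all:
$db->prepare('INSERT INTO labels (name) VALUES (?)')->execute(['batch-42']);
```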

Storing arrays in MySQL?

On the Facebook FQL pages it shows the FQL table structure; there was a screenshot here showing some of it (screenshot gone).
You will notice that some items are an array, such as meeting_sex, meeting_for, and current_location. I am just curious: do you think they are storing these as arrays in MySQL, or just returning them as arrays? From this data it really looks like they are stored as arrays. If you think they are, or if you have done something similar, what is a good way to store these items as an array in one table field and then retrieve them as an array in a PHP page?
The correct way to store an array in a database is by storing it as a table, where each element of the array is a row in the table.
Everything else is a hack, and will eventually make you regret your decision to try to avoid an extra table.
There are two options for storing as an array:
The first, which you mentioned, is to make one, or several, tables and enumerate each possible key you intend to store. This is best for searching and for keeping the data meaningful.
However, for what you want to do, use serialize(). Note: DO NOT EVER EVER EVER try to search against this data in its native string form. It is much faster (and saner) to just reload it, call unserialize(), and then search for your criteria in PHP than to develop some crazy search pattern to do your bidding.
EDIT: If it were me, and this were something I was seriously developing for others to use (or even for myself to use, to be completely honest), I would probably create a second lookup table to store all the keys as columns; Heck, if you did that, mysql_fetch_assoc() could give you the array you wanted just by running a quick second query (or you could extract them out via a JOINed query). However, if this is just quick-and-dirty to get whatever job done, then a serialized array may be for you. Unless you really, really don't care about ever searching that data, the proper column-to-key relationship is, I think most would agree, superior.
I guarantee you that Facebook is not storing that data in arrays inside their database.
The thing you have to realize about FQL is that you are not querying Facebook's main data servers directly. FQL is a shell, designed to provide you access to basic social data without letting you run crazy queries on real servers that have performance requirements. Arbitrary user-created queries on the main database would be functional suicide.
FQL provides a well-designed data return structure that is convenient for the type of data that you are querying, so as such, any piece of data that can have multiple associations (such as "meeting_for") gets packaged into an array before it gets returned as an API result.
As other posters have mentioned, the only way to store a programming language structure (such as an array or an object) inside a database (which has no concept of these things), is to serialize it. Serializing is expensive, and as soon as you serialize something, you effectively make it unusable for indexing and searching. Being a social network, Facebook needs to index and search almost everything, so this data would never exist in array form inside their main schemas.
Usually the only time you ever want to store serialized data inside a database is if it's temporary, such as session data, or where you have a valid performance requirement to do so. Otherwise, your data quickly becomes useless.
Split it out into other tables. You can serialize it but that will guarantee that you will want to query against that data later. Save yourself the frustration later and just split it out now.
You can serialize the array, insert it, and then unserialize it when you retrieve it.
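A minimal sketch of that round trip in pure PHP; in real code the string would go into a TEXT column via a parameterized INSERT:

```php
<?php
// serialize() turns the array into a plain string that fits in a TEXT
// column; unserialize() restores it after a SELECT.
$meetingFor = ['Friendship', 'Networking'];

$stored = serialize($meetingFor);
// $stored === 'a:2:{i:0;s:10:"Friendship";i:1;s:10:"Networking";}'

$restored = unserialize($stored);
// $restored is the original array again -- but note MySQL cannot
// usefully index or search the serialized string form.
```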
They might be using multiple tables with many-to-many relationships, but use joins and MySQL's GROUP_CONCAT function to return the values as an array for those columns in one query.
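A sketch of that approach; the schema (`users`, `meeting_purposes`) is invented for illustration:

```php
<?php
// Many-to-many storage, single-query retrieval: GROUP_CONCAT folds the
// joined rows back into one delimited value per user. Table and column
// names are illustrative.
$db  = new PDO('mysql:host=localhost;dbname=social', 'user', 'pass');
$sql = 'SELECT u.name,
               GROUP_CONCAT(m.purpose) AS meeting_for
        FROM users u
        JOIN meeting_purposes m ON m.user_id = u.id
        GROUP BY u.id, u.name';
foreach ($db->query($sql) as $row) {
    // e.g. "Friendship,Networking" back into a PHP array:
    $meetingFor = explode(',', $row['meeting_for']);
}
```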
