Easiest way to move data to other database - php

I have two versions of my blog: the first is written in PHP and uses MySQL; the second, the new one, is written in Python and uses Postgres.
My goal is to move the data from one to the other. The table names and the schema differ between the two.
My idea was to create ORM models for the old site and then, in a loop, read the data through the ORM and write it to the new database, since I already have ORM models for the new site.
It would look something like:
old_articles = OldArticle.objects.all()
for old_article in old_articles:
    new_article = NewArticle()
    new_article.title = old_article.name
    new_article.content = old_article.body
    new_article.save()
An ORM would easily abstract away the differences between the databases and, in my opinion, this could actually work. Or are there better ways?

If this migration will only be done once, I wouldn't go the ORM route. It is possible to export standards-compliant SQL dumps from MySQL, and those dumps can be imported into PostgreSQL. Once the data is in PostgreSQL, run your migration queries to make the schema changes, or load the data into temporary 'import' tables and copy it into the tables of the new schema/layout.
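The 'import' table step can be sketched roughly as follows. This is only an illustration: the table and column names (`import_articles`, `articles`, and so on) are hypothetical, and Python's built-in `sqlite3` stands in for PostgreSQL so that the snippet runs anywhere. The `INSERT ... SELECT` statement is the part that carries over.

```python
import sqlite3


def copy_articles(conn):
    """Copy rows from the staging table into the new schema, renaming columns."""
    with conn:  # one transaction: commit on success, roll back on error
        cursor = conn.execute(
            "INSERT INTO articles (title, content) "
            "SELECT name, body FROM import_articles"
        )
    return cursor.rowcount


# Stand-in database; all table and column names here are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE import_articles (name TEXT, body TEXT)")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, content TEXT)"
)
conn.executemany(
    "INSERT INTO import_articles (name, body) VALUES (?, ?)",
    [("Hello", "First post"), ("Again", "Second post")],
)
copied = copy_articles(conn)
```

Doing the copy inside the database keeps the renaming logic in one reviewable SQL statement, which fits a migration that is run only once.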
Test all your migration queries and write a scenario containing all steps to take, which queries to run and in what order. Also include manual steps that need to be performed.
Once you're sure that the migration scenario is correct and fully tested, put your old blog in 'maintenance mode' (sorry, we're offline, we'll be back soon) and do it for real!
Most important: test your scenario, validate the result, and take your time. You should never hurry these things!
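One simple way to validate the result is to compare row counts between each source table and its target after the copy. A rough sketch, again with `sqlite3` as a stand-in and hypothetical table names; a real check would probably also compare sample rows or checksums.

```python
import sqlite3


def count_mismatches(conn, table_pairs):
    """Return (old, new, old_count, new_count) for every pair whose counts differ."""
    mismatches = []
    for old_table, new_table in table_pairs:
        # Table names come from our own trusted list, never from user input.
        old_count = conn.execute(f'SELECT COUNT(*) FROM "{old_table}"').fetchone()[0]
        new_count = conn.execute(f'SELECT COUNT(*) FROM "{new_table}"').fetchone()[0]
        if old_count != new_count:
            mismatches.append((old_table, new_table, old_count, new_count))
    return mismatches


# Stand-in data: two imported rows, but only one made it into the new table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE import_articles (name TEXT);
    CREATE TABLE articles (title TEXT);
    INSERT INTO import_articles VALUES ('a'), ('b');
    INSERT INTO articles VALUES ('a');
""")
problems = count_mismatches(conn, [("import_articles", "articles")])
```

An empty list means the counts agree; anything else is a reason to stop before switching the new blog on.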

There are a lot of tools for this sort of thing. I would stick to something that is already implemented and well tested. The PostgreSQL wiki has a list of tools for exactly this purpose:
http://wiki.postgresql.org/wiki/Converting_from_other_Databases_to_PostgreSQL

Related

Where should I put the initial data for my CakePHP application?

I'm starting to use the console migrations for CakePHP.
I would like to know where I should put the initial data for my application, for example when I run it on a developer machine for the first time and need to populate some tables.
As far as I can see, the official book recommends the "CakeSchema callbacks", but the method "public function after()" inside schema.php is overwritten every time I run:
cake schema generate
Also this doesn't look like a clean approach.
Where should I put this kind of instruction?
I'm running CakePHP 2.4
Thanks!
You can use the Migrations plugin for this: https://github.com/CakeDC/migrations
It lets you define migrations that create tables and fields as well as insert data into your tables.

PHPUnit - database testing, how to manage that

Well, as in the title: I can think of three ways to manage testing the database output (I'm using an ORM in my application and PDO in unit tests). Which is the best one? How do you handle this?
1. Create a data set with the data I want specifically for testing, and change the code (in the test classes) so that it reads XML instead of the ORM arrays.
2. Create a setUp() method, set the attribute containing the ORM array, and work on that.
3. Same as the second point, but with another database, created specifically for testing.
You may want to read PHPUnit's chapter on database testing.
I use PDO via my own thin wrapper that supports nested transactions via savepoints. In the bootstrap, I create a test database with the entire structure of production along with very basic seed data. In each setUp() I begin a transaction, and in each tearDown() I roll it back.
Each test imports the subset of data it needs from raw SQL files. From there, the ORM is tested using real inserts, etc. But this only works because my DB driver supports nested transactions: if the code under test begins and commits its own transactions and checks for success or failure, everything still works.
If you don't have nested transaction support, you could set up and tear down the entire database on each test, but that will be slower. And note that you don't always have to test against a real database... it depends on what sort of things you are testing.
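The begin-in-setUp, roll-back-in-tearDown pattern described above can be sketched like this. It is only an illustration in Python with the standard `unittest` and `sqlite3` modules rather than PHPUnit and PDO; the `users` table and the savepoint name are made up, and a savepoint plays the role of the wrapper's nested transaction.

```python
import sqlite3
import unittest

# Shared test database, created once (the "bootstrap" step) with seed data.
# isolation_level=None lets us issue BEGIN/ROLLBACK/SAVEPOINT ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('seed')")


class UserTableTest(unittest.TestCase):
    def setUp(self):
        conn.execute("BEGIN")  # outer transaction wraps the whole test

    def tearDown(self):
        conn.execute("ROLLBACK")  # discard everything the test wrote

    def count(self):
        return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

    def test_insert_is_visible_inside_test(self):
        conn.execute("INSERT INTO users VALUES ('alice')")
        self.assertEqual(self.count(), 2)

    def test_nested_transaction_via_savepoint(self):
        conn.execute("SAVEPOINT inner_tx")  # code under test "begins"
        conn.execute("INSERT INTO users VALUES ('bob')")
        conn.execute("ROLLBACK TO SAVEPOINT inner_tx")  # and "rolls back"
        conn.execute("RELEASE SAVEPOINT inner_tx")
        self.assertEqual(self.count(), 1)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(UserTableTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because every test ends in a rollback, the seed data is the only thing left in the table afterwards, and tests cannot leak rows into each other.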
In my tests, I use a test database.
MySQL has a few test databases on their site. I find Sakila rather complicated, so I use the World database.

using both active record and doctrine in codeigniter

I was curious to know whether it's OK to use CodeIgniter's Active Record queries alongside Doctrine in some cases. Sometimes I find Active Record an easier and quicker way to get things done than writing a Doctrine query. For example, consider the following case, where I need to return the total number of rows in a table. In Doctrine:
$query = $this->em->createQueryBuilder()
    ->select("count(c)")
    ->from($this->entity, "c")
    ->getQuery();
return $query->getSingleScalarResult();
vs via active record:
return $this->db->count_all_results($this->table);
You can see how easy it is in Active Record. There may be more such cases. So, are there any pros or cons to using both?
Also, will they use two different DB connections to perform their operations?
You can use both Doctrine and Active Record at the same time. Swordfish has highlighted some of the problems. In addition, if you bring in a new developer team, the learning curve will be steeper.
I suggest choosing one and sticking with it. IMO, both are equally good. You should choose based on your current project and personal preference. You can find a very good comparison here:
What Does Doctrine Add Above Active Record - CodeIgniter?
Regarding the two queries you mentioned: if you use DQL, the Doctrine version can look just as simple:
$query = $em->createQuery('SELECT COUNT(u.id) FROM Entities\User u');
$count = $query->getSingleScalarResult();
"I was curious to know whether it's OK if I use CodeIgniter's Active Record query besides using Doctrine in some cases, simultaneously"
Yes, it is OK. There's no rule against it.
"Also, will they use two different DB connections to perform their operations?"
That depends on how you use them. Connection pooling in PHP does not work the way it does in other languages. You might have to write a custom class to hook them both up to a common connection, but it's not something I'd spend my time on if I were in a hurry.
Regarding Pros and Cons
It is good to stick with Active Record as far as CodeIgniter is concerned: it is cleaner, efficient, ships as part of CodeIgniter, and provides almost everything you might need. You can also get the extended Active Record class from the CodeIgniter forums, which builds on the base class to provide some complex join functionality.
But technically, using two layers is not an issue, other than the fact that it gets messy and makes two separate connections.
Here is a link for Doctrine integration in CI:
https://github.com/mitul69/codeigniter-doctrine-integration
I will add more documentation in a couple of days.

Need a simple ORM or DBAL for existing PHP app

I am working on extending an existing PHP application. Unfortunately for me, the existing app is a mess. It's all spaghetti code with raw mysql_* calls. Groan. No way that I am going to do that in the parts that I am extending.
So, I am looking for a simple ORM or DBAL that I can easily drop in and start using. Desired features:
It must work on an existing database schema. Preferably with minimal or no additional configuration. The existing database schema is the same quality as the existing PHP code (no sensible naming conventions, not normalised, etc.). I don't want to spend days converting the database schema manually into annotated object properties a la Doctrine 2.
It must be able to work alongside the existing raw mysql_* queries. I have no idea how hydrating ORMs like Doctrine 2 or Propel behave when scripts are manually manipulating the data in the database behind their backs, but I assume it's not pretty.
It must run on PHP 5.2.x. I'd love to use PHP 5.3 but I have zero interest in going over the existing 125K lines of spaghetti code mess to make sure it runs on PHP 5.3.
Relationships not required. In the few places I need to get to relational data, I'll be happy to call an extra find() or query() or whatever myself.
Bonus points if it has some trigger support (e.g. beforeSave, afterSave). Not a requirement, but just nice to have.
Edit: Someone put me out of my misery. I just found out that the 125K lines of spaghetti code also change the database schema. E.g., add an extra option somewhere and a whole slew of ALTER TABLE statements start flying. I could probably fill a year's worth of TheDailyWTF with this codebase. So, one more requirement:
Must be able to cope with a changing database schema automatically (e.g. adding columns).
I have been looking at a few solutions, but I am unsure how well they would work given the requirements. Doctrine 2, RedBeanPhp and the like all require PHP 5.3, so they are out. There's a legacy version of RedBeanPhp for PHP 5.2.x but I don't know if it would work with a messy, existing database schema. NotORM looks okay for getting data out but I don't know if it can be configured for the existing database schema, and how you can easily put data back into the database.
Ideally I would like something simple. E.g:
$user = User::find($id);
$user->name = 'John Woo';
$user->save();
Or:
$articles = ORM::find('article')->where('date', '2010-01-01');
foreach ($articles as $article) {
    echo $article->name;
}
Any tips or even alternative solutions are welcome!
I use Idiorm:
http://github.com/j4mie/idiorm/
It also has an Active Record implementation in the form of Paris.
With regard to your edit: Idiorm copes with changing schemas, and its syntax almost exactly matches what you ask for in your question.
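To illustrate why a mapping-free record class copes with a changing schema, here is a small Python sketch. It is not how Idiorm is implemented; the `Record` class, the `user` table, and the `nickname` column are all hypothetical. The point is only that columns are read from each fetched row, so nothing needs updating when a column appears.

```python
import sqlite3


class Record:
    """Tiny active-record sketch; columns are discovered from each fetched row."""

    def __init__(self, conn, table, values):
        self.conn, self.table, self.values = conn, table, dict(values)

    @classmethod
    def find(cls, conn, table, row_id):
        # SELECT * picks up any column added since the code was written.
        cursor = conn.execute(f'SELECT * FROM "{table}" WHERE id = ?', (row_id,))
        row = cursor.fetchone()
        if row is None:
            return None
        columns = [description[0] for description in cursor.description]
        return cls(conn, table, zip(columns, row))

    def save(self):
        # Write back every known column except the primary key.
        columns = [name for name in self.values if name != "id"]
        assignments = ", ".join(f'"{name}" = ?' for name in columns)
        params = [self.values[name] for name in columns] + [self.values["id"]]
        with self.conn:
            self.conn.execute(
                f'UPDATE "{self.table}" SET {assignments} WHERE id = ?', params
            )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO user (id, name) VALUES (1, 'Jon')")
# Legacy code alters the schema behind our back.
conn.execute("ALTER TABLE user ADD COLUMN nickname TEXT")

user = Record.find(conn, "user", 1)
user.values["name"] = "John Woo"
user.values["nickname"] = "JW"
user.save()
```

The trade-off is that nothing validates column names up front; a typo only surfaces when the UPDATE fails.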
How closely did you look into Doctrine? I am using Doctrine 1.2 for this kind of thing. It is quite easy to set up and allows you to start off with an existing schema. It automatically figures out the relations between tables that have foreign key constraints.
It has extensive trigger and behaviour support, so the bonus points can be spent as well, and it has relational support too, so your additional queries are not necessary. It has beautiful lazy loading, and it comes with a flexible query language (called DQL) that allows you to do almost exactly the same things you can do in SQL with only a fraction of the effort.
Your example will look like this:
/* To just find one user */
$user = Doctrine::getTable('User')->findOneById($id);
/* Alternative - illustrating DQL */
$user = Doctrine_Query::create()
    ->from('User u')
    ->where('u.id = ?', array($id))
    ->fetchOne();
$user->name = 'John Woo';
$user->save();
It must be able to work alongside the existing raw mysql_* queries. I have no idea how hydrating ORMs like Doctrine 2 or Propel behave when scripts are manually manipulating the data in the database behind their backs, but I assume it's not pretty.
Well, that is technically impossible to manage automatically: a SQL database simply does not push changes back to your ORM, so to pick up changes made in the background you need to perform an additional query one way or another. Fortunately, Doctrine makes this very easy for you:
/** @var User $user */
/* Change a user using some raw mysql queries in my spaghetti function */
$this->feedSpaghetti($user->id);
/* Reload changes from database */
$user->refresh();
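The same stale-object situation can be shown in a few lines of Python. This is a language-neutral sketch, not Doctrine's implementation; `CachedUser` and the `users` table are hypothetical names, and `refresh()` here is simply the extra query mentioned above.

```python
import sqlite3


class CachedUser:
    """Holds a snapshot of one row; refresh() re-reads it from the database."""

    def __init__(self, conn, user_id):
        self.conn, self.id = conn, user_id
        self.refresh()

    def refresh(self):
        row = self.conn.execute(
            "SELECT name FROM users WHERE id = ?", (self.id,)
        ).fetchone()
        self.name = row[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'John')")

user = CachedUser(conn, 1)
# Raw SQL elsewhere changes the row; the in-memory object cannot know.
conn.execute("UPDATE users SET name = 'John Woo' WHERE id = 1")
stale_name = user.name  # still the old snapshot
user.refresh()
fresh_name = user.name  # now matches the database
```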

Entity Framework-like ORM NOT for .NET

What I really like about Entity Framework is its drag-and-drop way of building the whole model layer of your application. You select the tables, it joins them, and you're done. If you update the database schema, right click -> update and you're done again.
This seems to me miles ahead of the competing ORMs, like the mess of XML that (N)Hibernate requires or the hard-to-update Django models.
Without concentrating on the fact that sometimes more control over the mapping process may be good, are there similar one-click (or one-command) solutions for other (mainly open source, like Python or PHP) programming languages or frameworks?
Thanks
SQLAlchemy database reflection gets you half way there. You'll still have to declare your classes and relations between them. Actually you could easily autogenerate the classes too, but you'll still need to name the relations somehow so you might as well declare the classes manually.
The code to set up your database would look something like this:
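from sqlalchemy import create_engine, MetaData
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

# database_url is assumed to be defined elsewhere.
engine = create_engine(database_url)
metadata = MetaData()
metadata.reflect(bind=engine)  # load all table definitions from the database
Base = declarative_base(metadata=metadata)

class Order(Base):
    __table__ = metadata.tables['orders']

class OrderLine(Base):
    __table__ = metadata.tables['orderlines']
    # Relies on a foreign key from orderlines to orders in the reflected schema.
    order = relationship(Order, backref='lines')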
In production code, you'd probably want to cache the reflected database metadata somehow, for instance by pickling it to a file:
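import os
import pickle

from sqlalchemy import create_engine, MetaData

engine = create_engine(database_url)
if os.path.exists('metadata.cache'):
    # Reuse the previously reflected schema instead of querying the database.
    with open('metadata.cache', 'rb') as cache_file:
        metadata = pickle.load(cache_file)
else:
    metadata = MetaData()
    metadata.reflect(bind=engine)
    with open('metadata.cache', 'wb') as cache_file:
        pickle.dump(metadata, cache_file)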
I do not like “drag and drop” creation of data access code.
At first sight it seems easy, but then you make a change to the database and have to update the data access code. This is where it becomes hard, as you often have to redo what you have done before, or hand-edit the code the drag/drop designer created. Often, when you change one field mapping with a drag/drop designer, the output file has unrelated line changes, so you cannot use your source control system to confirm you have made the intended change (and not changed anything else).
However, having to create and edit XML configuration files is not nice either: every time you refactor your code or change your database schema, you have to update the mapping file. It is also very hard to get started with mapping files, and tracking down what looks like a simple problem can take ages.
There are two other options:
Use a code generator like CodeSmith that comes with templates for many ORM systems. When (not if) you need to customize the output you can edit the template, but the simple cases are taken care of for you. That way you just rerun the code generator every time you change the database schema and get a repeatable result.
And/or use a fluent interface (e.g. Fluent NHibernate) to configure your ORM system. This avoids the need for the XML config file, and in most cases you can use naming conventions to link fields to columns etc. This will be harder to start with than a drag/drop designer but will pay off in the long term if you do much refactoring of the code or database.
Another option is to use a model that you generate both your database and code from. The “model” is your source code and is kept under version control. This is called “Model Driven Development” and can be great if you have lots of classes that have simpler patterns, as you only need to create the template for each pattern once.
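As a rough illustration of the model-driven idea, the sketch below derives both a `CREATE TABLE` statement and a row class from one model definition. Everything in it is hypothetical (the `MODEL` layout, the `orders` table, the helper names); it is written in Python with the standard library and is not tied to any particular ORM or generator.

```python
import sqlite3
from dataclasses import make_dataclass

# The single source of truth: table name -> {column: (SQL type, Python type)}.
MODEL = {
    "orders": {"id": ("INTEGER PRIMARY KEY", int), "customer": ("TEXT", str)},
}


def create_table_sql(table, columns):
    """Render the CREATE TABLE statement for one model entry."""
    column_defs = ", ".join(
        f"{name} {sql_type}" for name, (sql_type, _) in columns.items()
    )
    return f"CREATE TABLE {table} ({column_defs})"


def make_row_class(table, columns):
    """Build a dataclass whose fields mirror the table's columns."""
    fields = [(name, py_type) for name, (_, py_type) in columns.items()]
    return make_dataclass(table.capitalize(), fields)


conn = sqlite3.connect(":memory:")
classes = {}
for table, columns in MODEL.items():
    conn.execute(create_table_sql(table, columns))  # database side
    classes[table] = make_row_class(table, columns)  # code side

conn.execute("INSERT INTO orders VALUES (1, 'alice')")
Orders = classes["orders"]
order = Orders(*conn.execute("SELECT id, customer FROM orders").fetchone())
```

Adding a column then means editing `MODEL` once; both the schema and the class follow from it, which is what makes the result repeatable under version control.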
I have heard iBATIS is good. A few companies fall back to iBATIS when their programming teams cannot get up to speed with Hibernate (a time issue).
Personally, I still like Linq2Sql. Yes, the first time someone needs to delete and redrag over a table seems like too much work, but it really is not. And the time that it doesn't update your class code when you save is really a pain, but you simply control-a your tables and drag them over again. Total remakes are very quick and painless. The classes it creates are extremely simple. You can even create multiple table entities if you like with SPs for CRUD.
Linking SPs to CRUD is similar to EF: you simply set up your SP with the same parameters as your table, then drag it over your table, and poof, it matches the data types.
A lot of people go out of their way to take IQueryable away from the repository, but you can limit what you link in Linq2Sql, so IQueryable is not too bad.
Come to think of it, I wonder if there is a way to restrict the relations (and foreign keys).
