Migrating Between Relational Databases


In this post we show how to migrate off of any RDBMS, onto any other RDBMS. We use a case study of Oracle -> MySQL, but we can move between any 2 combinations (SQL Server -> Postgres, MySQL -> SQLite…etc.)

TL;DR? Jump to code snippet solution.

The New England Patriots Case Study

With NEP, I was responsible for projects spanning various fields, ranging from Web Development to Machine Learning. After learning that we kept the majority of our data in an Oracle 9i RDBMS, and learning that Oracle pulled support for the system in July 2010, naturally the first move to make was to migrate onto a different RDBMS, or to upgrade.

In short, we chose to migrate from Oracle onto MySQL as our target RDBMS. So began my journey to migrate off of a very outdated database, and onto a modern solution.

The Project Requirements

As most projects do, the project grew from a simple database migration, to a database migration requiring cleaning data, altering schema, identifying and dropping unused/legacy columns etc… We identified the following requirements when searching for a solution:

  1. No Loss of Information
    • Data, schema, constraints and indexes all must be migrated without any loss of information.
  2. Automated Solution
    • Avoid .sql scripts and manual scripting.
    • ‘Push-button’ solution, rather than several small procedures.
  3. High Degree of Customizability.
    • Adding custom rules for schema transformations.
    • Adding custom rules for data transformations.
  4. Performance
    • Fast enough to test multiple runs per day.

The Immediate Solutions Failed…

  1. MySQL Workbench involved a great deal of manual configuring, and no ‘automated’ way to migrate from Oracle -> MySQL.
    • (Why would Oracle allow Enterprise -> Open-source migrations within their own product-line anyway?)
  2. DBConvert lacked support for Oracle versions before 10.x.x.
    • (Pricing also starts at $150 for Personal use, $1000 for Enterprise use)
  3. A few Ruby gems (taps) lacked support for many of the Oracle column types, and the cx_Oracle driver in general.

The Solution

I decided to dust off an old college project, burn the midnight oil to revitalize it, and leverage it to solve this problem.

The project, etlalchemy, is an open-sourced Python application which sits atop SQLAlchemy, and allows ETL (Extract, Transform, Load) functionality between any 2 SQL databases. The tool presents a “Simple over Complex” solution to the problem, allowing you to Migrate any SQL Database with 4 Lines of Code. (More advanced features are also available).

To install the tool:

pip install etlalchemy
# On El Capitan:
### pip install --ignore-installed etlalchemy

To run the tool:

from etlalchemy import ETLAlchemySource, ETLAlchemyTarget
# Migrate from SQL Server onto PostgreSQL
src = ETLAlchemySource("mssql+pyodbc://user:passwd@DSN_NAME")
tgt = ETLAlchemyTarget("postgresql://user:passwd@hostname/dbname",
                          drop_database=True)
tgt.addSource(src)
tgt.migrate()

Since solving my problem with the first pass of etlalchemy, I have spent months adding support for PostgreSQL, MySQL, Oracle, SQL Server, and SQLite. This means that with 4 lines of code, you can do things like migrate from SQL Server onto MySQL, migrate from Oracle onto PostgreSQL, or in my case migrate Oracle onto MySQL. Another helpful usecase is to migrate a remote MySQL database onto a local SQLite database to test an application on your local machine.

Happy Database Migrating!