Today's guest post hails from community contributor Piotr Solnica of Code Benders, an agile software collective based in Bend, Oregon. In his free time Piotr supports OSS, is a DataMapper](http://datamapper.org/) core contributor, an author of DataMapper plugins and adapters and one of the developers of rubytime.org and coderack.org.
Ruby on Rails makes it easy to bootstrap a greenfield project. Unfortunately not all projects are greenfield, so sometimes you need to work with a legacy database schema. ActiveRecord is a great choice when you have a full control over your database from the very beginning, but what should you do if you need to connect to a database with a schema that isn't in-line with Rails conventions? In this post I will explain how to setup a Rails application that uses DataMapper as the ORM. I'll also guide you through configuring your models to work with a legacy schema. If you are not familiar with DataMapper I recommend reading the “Getting Started” section of the official documentation.
A legacy schema
For the sake of this post let's use an example schema of a blog application using MySQL:
Here's what's “wrong” with it:
- table names are singularized
- column names use camel-case naming convention prefixed with a table name
- post primary key is an md5 sum of concatenated values of “author” and “publishTime”
- post => user foreign key has a custom name “author”
- “postPublishTime” is stored as a string and we would like to use DateTime object
Creating a rails application with DataMapper as ORM
This is actually a straight-forward task. All you need to do is to use a template provided by DataMapper when generating an application skeleton as follows:
rails new blog-legacy -m http://datamapper.org/templates/rails.rb
By default the skeleton uses sqlite as the database. To change that, simply edit Gemfile and change ‘dm-sqlite-adapter' to ‘dm-mysql-adapter' and run bundle install. To generate empty models run:
DataMapper gives you two ways of defining table and column names. You can either set it manually in every model or you can configure global naming conventions for all models.Here's how to set table and field names explicitly in the model:
With many properties it would add clutter to our models, that is why it is better to set global naming conventions for resources and fields. Those conventions will be applied to every model and will eliminate the need to define storage names and set :field options for every property.
To set global naming conventions, we can create datamapper.rb initializer like this:
Now we can clean up our User model like this:
Dealing with natural primary keys
In our example schema, a post's primary key is an MD5 sum of concatenated fields “postAuthorId” and “postPublishTime”. If we want to have the value generated in Ruby we can use :default options in property declaration:
Now whenever you create a post and don't provide :id it will be generated automatically based on the values of :author and :publish_time properties.
As you can see this doesn't look very nice in the model. So, we could create a custom property with a generalized functionality to clean this up! This way we can re-use this code in different scenarios whenever we want an MD5 sum of concatenated fields. You might be surprised how easy this is to do:
Now we can define post :id like this:
How easy was that?!?
Associations with custom foreign keys
Our database has a basic 1:M relationship between User and Post models. The only caveat is that foreign key on post table pointing to user id has a name that doesn't follow our global naming conventions. To configure the association we must override our defaults:
Fine tuning - typecasting properties
Post :publish_time is a String property as values are stored in a VARCHAR column. Our goal is to work with DateTime objects, not strings. To achieve that we could simply override property accessor and do typecasting like this:
While this approach works, it is not a recommended approach. DataMapper is based on the concept that a model's logic should be decoupled from the database schema. A much better and cleaner way is to abstract away that logic in to a custom property object. An extra benefit comes with that too - you can re-use that code in many applications as it's not tied to a specific model. Here's another approach:
All right! We've got all we need to work with our legacy schema. Let's take a look at the final version of the models:
At first sight it is hard to tell that the underlying database has a funky schema, isn't it? These models look almost as if you were starting a fresh project.
Let's take a look at how we could work with them:
This creates a following insert statement:
Let's create some posts too:
Further reading and resources
This post explains basics as well as some more advanced things you can do with DataMapper. If you would like to learn more, please take a look at the documentation, subscribe to the Mailing List or jump into the #datamapper IRC channel.
Other useful links: