Engine Yard Blog RSS Feed

Today's guest post hails from community contributor [Piotr Solnica](http://www.codebenders.com/people/piotr-solnica/) of [ Code Benders](http://www.codebenders.com/), an agile software collective based in Bend, Oregon. In his free time Piotr supports OSS, is a [DataMapper](http://datamapper.org/) core contributor, an author of DataMapper plugins and adapters and one of the developers of [rubytime.org](http://rubytime.org/) and [coderack.org](http://coderack.org/).

Ruby on Rails makes it easy to bootstrap a greenfield project. Unfortunately not all projects are greenfield, so sometimes you need to work with a legacy database schema. ActiveRecord is a great choice when you have a full control over your database from the very beginning, but what should you do if you need to connect to a database with a schema that isn't in-line with Rails conventions? In this post I will explain how to setup a Rails application that uses DataMapper as the ORM. I'll also guide you through configuring your models to work with a legacy schema. If you are not familiar with DataMapper I recommend reading the “Getting Started” section of the official documentation.

A legacy schema

For the sake of this post let's use an example schema of a blog application using MySQL:


CREATE TABLE `post` (
 `postId` varchar(45) NOT NULL,
 `postAuthorId` int(11) DEFAULT NULL,
 `postTitle` varchar(45) DEFAULT NULL,
 `postContent` text,
 `postPublishTime` varchar(45) DEFAULT NULL,
 PRIMARY KEY (`postId`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

CREATE TABLE `user` (
 `userId` int(11) NOT NULL AUTO_INCREMENT,
 `userName` varchar(45) DEFAULT NULL,
 `userEmail` varchar(45) DEFAULT NULL,
 PRIMARY KEY (`userId`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;


Here's what's “wrong” with it:
  • table names are singularized
  • column names use camel-case naming convention prefixed with a table name
  • post primary key is an md5 sum of concatenated values of “author” and “publishTime”
  • post => user foreign key has a custom name “author”
  • “postPublishTime” is stored as a string and we would like to use DateTime object
###Creating a rails application with DataMapper as ORM This is actually a straight-forward task. All you need to do is to use a template provided by DataMapper when generating an application skeleton as follows: `rails new blog-legacy -m http://datamapper.org/templates/rails.rb` By default the skeleton uses sqlite as the database. To change that, simply edit Gemfile and change ‘dm-sqlite-adapter' to ‘dm-mysql-adapter' and run bundle install. To generate empty models run:

script/rails g model user
script/rails g model post


###Naming conventions
DataMapper  gives you two ways of defining table and column names. You can either  set it manually in every model or you can configure global naming  conventions for all models.Here's how to set table and field names explicitly in the model:


class User
  include DataMapper::Resource

  storage_names[:default] = "user"

  property :id, Serial, :field => "userId"
  property :name,  String, :field => "userName"
  property :email, String, :field => "userEmail"
end


With  many properties it would add clutter to our models, that is why it is better to set global naming conventions for resources and fields. Those  conventions will be applied to every model and will eliminate the need  to define storage names and set :field options for every property.

To set global naming conventions, we can create datamapper.rb initializer like this:


# config/initializers/datamapper.rb
Rails::DataMapper.configuration.resource_naming_convention[:default] = lambda do |name|
  name.downcase.singularize
end

# note: this will be available in dm-rails 1.1.1
Rails::DataMapper.configuration.field_naming_convention[:default] = lambda do |property|
  "#{property.model.name.downcase}#{property.name.to_s.camelize}"
end


Now we can clean up our User model like this:

class User include DataMapper::Resource

property :id, Serial property :name, String property :email, String end


Much better!
###Dealing with natural primary keys
In  our example schema, a post's primary key is an MD5 sum of concatenated  fields “postAuthorId” and “postPublishTime”. If we want to have the  value generated in Ruby we can use :default options in property  declaration:


class Post
  include DataMapper::Resource

  property :id, String, :key => true, :default => lambda { |resource, property|
    Digest::MD5.hexdigest("#{author}#{publish_time}") }

  # ...
end


Now  whenever you create a post and don't provide :id it will be generated  automatically based on the values of :author and :publish_time  properties.

As  you can see this doesn't look very nice in the model. So, we could  create a custom property with a generalized functionality to clean this  up! This way we can re-use this code in different scenarios whenever we  want an MD5 sum of concatenated fields. You might be surprised how easy  this is to do:


 module ::DataMapper
   class Property
     class MD5 < String
    key true
    length 32

    accept_options :fields

    default lambda { |resource, property|
          Digest::MD5.hexdigest(property.options[:fields].join) }
     end
   end
 end


Now we can define post :id like this:

class Post include DataMapper::Resource

property :id, MD5, :fields => [ :author, :publish_time ] end


How easy was that?!?
###Associations with custom foreign keys
Our  database has a basic 1:M relationship between User and Post models. The  only caveat is that foreign key on post table pointing to user id has a  name that doesn't follow our global naming conventions. To configure  the association we must override our defaults:


class User
  include DataMapper::Resource

  property :id, Serial
  property :name,  String
  property :email, String

  # We need to set child_key otherwise DataMapper would use :user_id
  has n, :posts, :child_key => :author_id
end

class Post
  include DataMapper::Resource

  property :id, MD5, :fields => [ :author, :publish_time ]
  property :title,  String
  property :content, Text

  # This creates a property called :author_id for you. We also set “User” as the parent model
  belongs_to :author, “User”
end


Easy enough.
###Fine tuning - typecasting properties
Post  :publish_time is a String property as values are stored in a VARCHAR  column. Our goal is to work with DateTime objects, not strings. To  achieve that we could simply override property accessor and do  typecasting like this:


class Post
  # …

  def publish_time
    DateTime.parse(super)
  end
end


While  this approach works, it is not a recommended approach. DataMapper is  based on the concept that a model's logic should be decoupled from the  database schema. A much better and cleaner way is to abstract away that  logic in to a custom property object. An extra benefit comes with that  too - you can re-use that code in many applications as it's not tied to a  specific model. Here's another approach:


module DataMapper
  class Property
    class DateTimeString < String
      def load(value)
        ::DateTime.parse(value)
      end

      def dump(value)
        value.to_s
      end
    end
  end
end


Better?
###Wrap-up
All right! We've got all we need to work with our legacy schema. Let's take a look at the final version of the models:


class User
  include DataMapper::Resource

  property :id,            Serial
  property :name,  String
  property :email, String

  has n, :posts, :child_key => :author_id
end

class Post
  include DataMapper::Resource

  property :id, MD5, :fields => [ :author, :publish_time ]  

  property :publish_time, DateTimeString
  property :title,       String
  property :content,  Text

  belongs_to :author, "User"
end


At  first sight it is hard to tell that the underlying database has a funky  schema, isn't it? These models look almost as if you were starting a  fresh project.

Let's take a look at how we could work with them:


user = User.create(:name => "John Doe", :email => "john.doe@example.com")
=> #


This creates a following insert statement:

INSERT INTO user (userName, userEmail) VALUES ('John Doe', 'john.doe@example.com')


Let's create some posts too:


user.posts.create(:title => "Hello World!", :content => "DataMapper is awesome :)", :publish_time => DateTime.now)
=>  #


And SQL:

INSERT INTO post (postId, postPublishTime, postTitle, postContent, postAuthorId) VALUES ('4a585935e1d2dd21da146f6d3bcfc41b', '2011-05-12T13:39:04+02:00', 'Hello World!', 'DataMapper is awesome :)', 2)

Further reading and resources

This post explains basics as well as some more advanced things you can do with DataMapper. If you would like to learn more, please take a look at the documentation, subscribe to the Mailing List or jump into the #datamapper IRC channel.

Other useful links:

__Author Bio__ [Piotr Solnica](http://www.codebenders.com/people/piotr-solnica/) is a team member at[ Code Benders](http://www.codebenders.com/), an agile software collective based in Bend, Oregon. When Piotr's not writing great code, he's writing awesome code. He has been working as a web developer for 6+ years and has solid experience with Ruby-based technologies and frameworks. His proficiencies include a range of JavaScript libraries such as Prototype, jQuery, YUI and MooTools. Piotr is an unobtrusive JavaScript evangelist, an HTML/CSS purist and a linux geek. In his free time he supports OSS, is a DataMapper core contributor, an author of DataMapper plugins and adapters and one of the developers of [rubytime.org](http://rubytime.org/) and [coderack.org](http://coderack.org/).

Tagged:

comments powered by Disqus