Engine Yard Blog RSS Feed

In TDD and BDD we write small, focused, technical tests, sometimes called micro-tests. One of the core ideas is that these tests should run fast, really fast—each one measured in milliseconds. If you're writing plain Ruby code, that's pretty easy to accomplish. However, when you're using something like Rails or Merb and DataMapper or ActiveRecord, it can get a bit more challenging.

Why do we end up with model tests/specs that run slowly? In looking at how we write and test our code, let's ignore controllers and views, and focus on models. After all, models are where you put your business logic.

There can be several reasons for a slow test suite: database access, unnecessary objects, a massive setup that takes a while to load—just to name a few. This time though, I want us to focus on a new reason, one that's a result of operating in the context of ActiveRecord or DataMapper. One that tricks us into thinking we're doing well. One that exposes a few somewhat major flaws in the design of AR and DM.

With ActiveRecord and DataMapper, each model class is responsible for its own persistence, hence DHH using the name "Active Record." It was inspired by Fowler's writing on the pattern in Patterns of Enterprise Application Architecture and is the enabler of our model addiction.

OverActive Record

Used properly, the active record pattern is great for persisting data. These objects are great at handling their own persistence. Accessors, associations, and validation make a good wrapper around the transactional nature of a database. This is fine when the model is a simple data object, but we all run into problems when we start adding other behavior to the model class.

Weighted Companion Cake

Weighted Companion Cake: The cube was the cake!

(Compliments to flickr user Mandrake88)

Recall the Single Responsibility Principle: "There should never be more than one reason for a class to change." As we commonly use them, ActiveRecord and DataMapper1 classes almost always violate this principle. There are usually at least two responsibilities handled by every AR/DM model: persistence and business logic. Carrying around that persistence behavior, and all the dependencies that go along with it, is what bloats and slows down our specs.

There have been various attempts at dealing with this: in memory databases, stubbing parts of the DM/AR frameworks, etc. None of these are ideal. Either they don't speed things up as much as we'd like, they're awkward, or they bulk up the specs an objectionable amount.

Is There a Better Way? Why, Yes! Yes There Is.

DM/AR lets you define a model's properties, its persistent parts, and generates the associated accessors and mutators. All access to the persistent properties is handled through these accessors and mutators. This gives us a perfect seam along which to split the class. All of the persistence-dependent functionality is on one side of this seam, and the business logic of the model is on the other.

Dragon Cake

We need a good saving throw against this frosting.

Courtesy of Mike's Amazing Cakes

Pretend... It's a Cake.

Yes, you read that correctly. Keep on reading—it makes sense!

Each of our models is like a frosted cake. We have the nice fluffy cake batter, baked with our properties, validations, and persistence. Then we frost the outside with our business logic. We can take a simple ActiveRecord cake and make it into almost anything, with a good amount of frosting.

Now herein lies the problem: the frosting starts to take on a life of its own. Before we know it we're baking a few 100 ActiveRecord cakes each time we need to test the frosting behavior... There are two really good ways to get around this: cupcakes and cardboard.

Strategy 1: Oh Look! Cupcakes!

When we build a web application with ActiveRecord or DataMapper, it's really hard to think outside the persistence box. If something is persisted on that model, or if it uses something on that model, we automatically add the code to the model. This leads quickly to a gigantic cake that does everything.

This is especially bad with primary models like User. The poor bloated User gets tons of code stuffed into it because our app is usually focused around users. Users do things in our app. Users get affected by things. They control the app and most things belong to the user in some way. But this doesn't mean our User class should be 1000 lines.

This cake does too much.

This cake does too much (yes that's cake)

The solution is to break out chunks of code into their own classes. There are many common patterns that we can find in our classes: Factory, Provider, Policy, Strategy, and plenty of types that are custom to each domain (Cake Decorator?). Let's lump these all into something that we'll call Cupcake Classes.

At Engine Yard, one of the places we found an opportunity for a Cupcake was our Deployment initialize. The method had to do a lot of data mutating before it could create an instance. Blank strings needed to become nil, string representations of a type needed to be converted to the actual type, and defaults needed to be applied to fill out the incoming data. The initialize method was a ridiculous 50 lines long. This needed to become its own class.

The result is a really clean initialize:


  def initialize(params={})
    merger = Deployment::DefaultsMerger.new(params)
    merger.each_attribute { |att, val| send("#{att}=", val) }
  end

...and a class that knows all about how to merge incoming params and defaults. It has a clear API, and it's easily testable. We know what comes in and what comes out and we don't have to create a database record every time we test it.


class Deployment
  class DefaultsMerger

    def initialize(params)
      @attributes = merge_with_defaults(params)
      scrub_blank 0, :data_volume_size, :app_server_count
      scrub_blank nil, :db_volume_id, :data_volume_id
      coerce_value(:instance_size, :db_instance_size) do |val|
        Instance::Size.coerce(val)
      end
      # ...
    end

    def each_attribute
      @attributes.each do |k,v|
        yield k, safe_to_i(v)
      end
    end

  private

    def defaults
      Mash.new({
        :data_volume_size => Volume::DEFAULT_SIZE,
        :app_server_count => 1,
        # ...
      })
    end
    # ...
  end
end

Cupcake Cake of the Hungry Hungry Caterpillar

A maintainable and testable model should be a collection of cupcakes. (This model is a bit buggy)

Cupcake Coding Rule of Thumb: Messy Code Makes for Good Cupcakes!

When you want to clean up your models, look for messy code. Long methods or sets of utility methods (methods that barely, if at all, touch instance state) are just begging to be cut up into smaller classes.

What tipped us off was that the initialize method was long, nasty, and really had nothing to do with the class it was in. Mostly, it dealt with params, a hash. Almost every line in the old method did hash operations, not operations dealing with the actual model.

We clearly needed a class that would know how to work with merging our defaults into an incoming hash. The resulting class would also improve our real test coverage. It's much easier to test a bunch of branches when they don't have to pass through the model life cycle.


describe Deployment::DefaultsMerger do
  describe_attribute :data_volume_size do
    it_defaults_to Volume::DEFAULT_SIZE
    it_converts ["", nil, 0, "0"], :to => 0
    it_converts [123, "123"],      :to => 123
  end

  describe_attribute :app_server_count do
    it_defaults_to 1
    it_converts ["", nil, 0, "0"], :to => 0
    it_converts [2, "2"],          :to => 2
  end
  # ...
end

With a focused class, it's easy to write focused specs. With a few custom methods in your specs, you have beautifully concise descriptions of behavior to match your simple code.

Cardboard Cake

Looks yummy!

Thanks to flickr user hazellterry

Strategy 2: Cardboard Cake

The next way to improve your models and your testing is to extract the model's behavior from the persistence. The idea is to test the frosting without having to bake a cake every time. Our example class after the behavior has been extracted:


class Oven
  include DataMapper::Resource
  include Oven::Behavior

  ## Properties

  property :id,           Serial
  property :oven_type,    String
  property :temp_setting, String

  has 1, :heating_element
  has 1, :latch
  ## ...

end

Here's the frosting, or behavior, that has been extracted from it:


class Oven
  module Behavior

    def check_lock
      latch.engage! if dangerous_temperature?
    end

    def dangerous_temperature?
      Oven::Temperatures.dangerous? temp_setting
    end

    def turn_on(temp = 350) # ...
    def turn_off # ...
    def preheating? # ...
  end
end

Once this is done, you can write specs that focus on the logic in isolation. Instead of testing every aspect of behavior on top of the persistence framework (slow), you can isolate the behavior (fast). Sort of like making the cake out of cardboard so you can test our frosting without the baking time (See? Told you it would make sense!).

How Do We Achieve This Isolation? Mock Object Trickery, But of Course!

Let's create a mock to stand in for the model object being tested. Let's call these Cake Mocks. Then, we'll extend the mock object with with the Behavior we're testing.


@oven = mock("Oven").extend(Oven::Behavior)

It still acts like an oven, but instead we've frosted the outside of a cardboard cake with the same business logic that we would add to a real ActiveRecord object. The trick is to have the mock object extend the module that contains the business logic. The result is an object containing the behavior of the model but not its persistence.

Now we can stub accessors on the mock (but only those involved in the example) and set expectations on the mutators. Then the example can call methods on the mock oven as if it was a real Oven instance. Those calls invoke the actual business logic methods that we want to test. When those methods access the internal behavior of our persistence layer, our mock answers with the methods we stubbed. The mock merely fills in the center of the cake.


describe Oven::Behavior do
  describe "preheating" do
    before(:each) do
      @latch = mock("latch")
      @oven = mock("Oven", :latch => @latch).extend(Oven::Behavior)
    end

    it "isn't locked when temperature setting is less than 500" do
      @oven.stub!(:temp_setting).and_return(350)
      @latch.should_receive(:engage!).never
      @oven.check_lock
    end

    it "isn't locked when temperature setting is 500" do
      @oven.stub!(:temp_setting).and_return(500)
      @latch.should_receive(:engage!).never
      @oven.check_lock
    end

    it "is locked when temperature setting is greater than 500" do
      @oven.stub!(:temp_setting).and_return(505)
      @latch.should_receive(:engage!).once
      @oven.check_lock
    end
  end
end

This cake mock style encourages another good testing practice: integration testing. When the model is completely mixed up with behavior, the need for integration testing is easier to ignore, but no less important. With separated behavior and persistence we can write simple integration tests that make sure the seam between the cake and the frosting is still intact.

Give Your Models Some Attention

When looking at models this way, whether you're breaking them up in to cupcakes or separating the frosting, it's easier to see the difference between behavior and persistence. If the logic can stand on its own, make a new class. If it's more like an extension of the model, make a module and mix it back in. Allow ActiveRecord and DataMapper models to take care of what they do best without all the extra weight.

Take a few minutes and look at the models in your project. If the classes weren't backed by the database, would you still have all that stuff jammed into one place? Are your models more complex than they need to be? If so... do something about it!


[1] DataMapper may be named after Fowler's DataMapper pattern, but that's where the similarity ends. "DataMapper" is little more than a variation of ActiveRecord. The objects are responsible for their own persistence, while the DataMapper pattern uses a separate object (the mapper) that manages persistence: "A layer of Mappers (473) that moves data between objects and a database while keeping them independent of each other and the mapper itself."


Tagged:

comments powered by Disqus