Since December, Rails has undergone a fairly significant internal refactoring in quite a number of areas. While it was quite tricky at first, we mere mortals have started to hone a process for diving into a new area of the codebase and emerging some time later with a much improved area that does basically the same thing. Here’s the approach we’ve adopted and advocate:
First, refactoring needs to be refactoring, not revision. By that I mean that while you are in the process of invasively improving some code, it is not the appropriate time to also change the functionality of that code. If you do both at the same time, it will be difficult to track down whether a bug in the code is the result of refactoring or functionality changes.
We’ve held fast to this requirement for the Rails 3 work Carl and I have been doing, which has resulted in an extremely stable edge, despite making fairly invasive changes.
Second, any kind of significant refactoring without tests is folly. The first thing you should do is take a look at the test suite for the area in question and beef it up if necessary.
Thankfully, Rails has a fairly reasonable test suite, and the addition of Sam Ruby’s Agile Web Development on Rails test suite has provided an additional level of confidence in the changes we’re making.
Third, once you’re ready to dive in, read through the code carefully. It can be tempting to just go in and hack away at a particularly egregious part of the codebase, but you’ll frequently be changing code that exists for a reason.
Something I’ve noticed both in Rails applications and in Rails itself is that code that looks very strange at the beginning of a period of refactoring tends to exist for a reason.
__Fourth, as you proceed, make very small changes, then run the full test suite after every change. __Commit often. What you want to look for is cases where the boundary APIs around the code you’re writing are messy (so you have multiple ways in to a particular class or area of code where one would suffice).
One Rails example would be rendering a template in ActionView from ActionController. When I started in December, ActionController called into ActionView using a number of public and private APIs, so making any changes around those boundaries was very tricky. Some things we wanted to do, like improve the way layouts were selected, was too complex because of the number of ways templates and layouts were rendered.
The very first thing I did in the early days of the merge was work toward reducing the number of ways that ActionController told ActionView to render a template. In the end, we settled on just a single API:
render_template_from_controller, which takes a Template object from the template to render, and a Template object for the layout. Once this was done, it became a lot easier to make changes on either side of the boundary, without fear that a small change in ActionView could break any number of things in ActionController.
Of course, this assumes that you understand what your boundaries are. This is something that’s learned over time, but a fundamental requirement in good refactoring is having functionality broken up into units that are easy to understand, with small surface area. This is commonly achieved using classes, which is a good starting point, but Ruby has other tricks up its sleeve as you get more advanced, like judicious use of modules (and the new Rails
Fifth, once you have reasonable boundaries, dive in and start making changes. A pretty good rule of thumb is to clean up cases where a public API has started being used for private, internal use. This might mean that changing the internals of your code breaks the public functionality (which, again, should be sacrosanct during this process). Have a zero-tolerance policy for failing tests as you make small changes, especially as you separate out public and private functionality.
One example of this in Rails was extensive usage of ActionView’s public render method by private functionality. As a result, the public render method had snippets of code inside to handle special cases (like
render :file taking a Template object). The solution in this case was to extract out the private functionality, and have the public
render method as well as the private internals call the new extracted methods. This ensures that internal functionality is kept internally, where it can be refactored more easily.
Sixth, don’t be afraid to
git reset --hard if you find yourself sinking into quicksand, with rising confusion due to changes you made. Over the course of working with Rails, I’ve lost an hour or more at a time to changes made too rapidly and carelessly, and the only advice I can give is to give up on ratholes as early as you notice them.
So that’s it. Six easy steps to refactoring Rails.