Engine Yard

Developer Blog

Ruby Posts

What happened to the Rails 4 Queue API?

By | April 11th, 2013 at 11:04AM

The Queue API in Rails 4 is supposed to be an abstraction layer for background processing. It ships with a basic implementation, but developers are expected to swap out the default backend for something more production-ready, like Resque or Sidekiq. This standardization should then allow Rails plug-ins (and Rails itself) to perform work asynchronously where it makes sense, without having to worry about supporting all of the popular backends.

In preparation for my talk titled “How to fail at Background Jobs”, I’ve been following activity on the Rails 4 Queue API. Recently, the Queue API was removed from the master branch, and pushed off until Rails 4.1 at the earliest.

What follows is my third party attempt to report on why. My main source of information is this commit on GitHub, but I’ll also attempt to draw some conclusions based on my own experience with queueing systems.

Another interesting source of information comes in the comments with the very first commit to add a Queueing API to Rails: https://github.com/rails/rails/commit/adff4a706a5d7ad18ef05303461e1a0d848bd662

Basically, I see three failures with the Queue API as currently implemented in the “jobs” branch: https://github.com/rails/rails/tree/jobs

1. The API

The API as implemented is a “nice idea”, but it’s actually very un-Rails-like when compared to things like ActiveRecord.

Here’s an example of enqueuing a job in the existing implementation:

class SignupNotification
 
  def initialize(user)
    @user = user
  end
 
  def run
    puts "Notifying #{@user.name}..."
  end
 
end
 
Rails.queue[:important_jobs].push(SignupNotification.new(user))

For illustrative purposes only, a more Rails-like API might look like this:

class SignupNotification
  connect_to_queue :important_jobs
 
  def run(user_id)
    user = User.find(user_id)
    puts "Notifying #{user.name}..."
  end
 
end
 
SignupNotification.async.run(user.id)

The name of the queue should be a concern of the job (not the place that enqueued it). Imagine you wanted to change the queue name: you'd have to change every place that enqueues the job to reference the new name.
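To make the idea concrete, here is a minimal in-memory sketch of how a job-owned queue name could work. Everything in it is hypothetical: `Queueable`, `AsyncProxy`, `queue_name` and the `QUEUES` hash are invented for illustration, and `connect_to_queue` and `async` are not part of Rails. `QUEUES` merely stands in for a real backend such as Resque or Sidekiq.

```ruby
# Hypothetical sketch: none of these names exist in Rails.
# QUEUES stands in for a real backend such as Resque or Sidekiq.
QUEUES = Hash.new { |hash, name| hash[name] = [] }

module Queueable
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    attr_reader :queue_name

    # Class-level declaration: the job, not the caller, names its queue.
    def connect_to_queue(name)
      @queue_name = name
    end

    # Returns a proxy that records the call instead of running it.
    def async
      AsyncProxy.new(self)
    end
  end

  class AsyncProxy
    def initialize(job_class)
      @job_class = job_class
    end

    # Stores only the class name and the arguments on the job's own queue.
    def run(*args)
      QUEUES[@job_class.queue_name] << { "class" => @job_class.name, "args" => args }
    end
  end
end

class SignupNotification
  include Queueable
  connect_to_queue :important_jobs

  def run(user_id)
    puts "Notifying user #{user_id}..."
  end
end

# The caller never mentions the queue name.
SignupNotification.async.run(42)
```

With this shape, renaming the queue means editing the single `connect_to_queue` line in the job class; no call site changes.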

Also, notice we have to do a little extra work in our implementation of run to fetch the user by ID instead of having our queuing system marshal it for us. This leads me to the next failure…

2. Marshal vs. JSON

Jobs are generally run in a different process from where they were enqueued. This means serialization. The simplest choice for doing this would seem to be Ruby’s built-in Marshal. Rails took the approach of marshalling an entire job object, while most other libraries serialize the job arguments and job class name to JSON. It’s a best practice in most other systems to store as little information as possible about the job in the queue itself. A queue is an ordering system; information should be stored in a database.
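As a rough sketch of the "class name plus arguments" approach (not the code of any particular library), the round trip can look like the following. The `SignupNotification` class and the `"class"`/`"args"` keys are assumptions made for this example, and the job returns a string instead of sending anything.

```ruby
require "json"

# Hypothetical job class used only for this illustration.
class SignupNotification
  def run(user_id)
    "Notifying user #{user_id}..."
  end
end

# Enqueue side: store only the class name and primitive arguments.
payload = JSON.generate({ "class" => "SignupNotification", "args" => [42] })

# Worker side (normally another process): rebuild the job from the payload.
data   = JSON.parse(payload)
job    = Object.const_get(data["class"]).new
result = job.run(*data["args"])

puts payload
puts result
```

The payload is a short, human-readable string, and the worker looks up anything else it needs (such as the user record) from the database when the job runs.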

The Marshal approach is a slightly nicer API for the developer, but it quickly breaks down in practice. Care must be taken not to marshal objects with too many relationships to other objects, or objects that hold Procs (which cannot be marshalled in Ruby unless you are using the niche implementation MagLev).

Finally, marshalling is not as nice for Ops. Monitoring a running queue in production is much easier when you can easily inspect the contents of jobs. JSON is a much more portable format.
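The Proc limitation is easy to reproduce on MRI. In this small sketch, `CallbackJob` is a made-up class whose only purpose is to hold a block in an instance variable:

```ruby
# Hypothetical job that captures a callback in an instance variable.
class CallbackJob
  def initialize(&callback)
    @callback = callback
  end
end

job = CallbackJob.new { puts "done" }

marshal_error = nil
begin
  # Marshal walks the instance variables and fails when it reaches the Proc.
  Marshal.dump(job)
rescue TypeError => e
  marshal_error = e
  puts "Cannot enqueue: #{e.message}"
end
```

A job that only carries a class name and primitive arguments never runs into this, because there is nothing in it that cannot be serialized.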

3. Solving the Wrong Problem

It seems one of the major goals of the Rails 4 Queue is to always send e-mails in the background. We could debate whether action_mailer really belongs as part of a Model-View-Controller framework in the first place, but I digress.

Let me re-word that a bit: One of the major goals of Rails 4 Queue is to ensure that the sending of e-mails does not adversely impact web response time.

Generally, this sort of thing is done using a background job system like Resque: you make a job that sends your e-mail. But Rails core thinks we can do better than that. We don’t need a background job system if we can just make our web application server do the work after it has finished sending the response to the client.

Here’s some terribly ugly and hacky code to demonstrate my point.

Example using thin: https://gist.github.com/jacobo/5164180

Example using Unicorn: https://gist.github.com/jacobo/5164192

If you run these rack apps and hit them with curl, you’ll see that the “e-mail processing” does not interfere with the client receiving a response. But, it does tie up these single-threaded web servers. They won’t serve the next request until they are finished with the previous after-request job.

Another approach might be to use threads, but unless you are on JRuby or Rubinius, you would likely slow down your response processing, because your e-mail sending thread will likely start executing and use up processing power that would otherwise be used to generate the response.

The only good way to solve this problem is to make changes to Rack itself, but I’ve yet to see a proposal on exactly what these might be.

In Conclusion

I’m hoping to see the discussion continue. Maybe there’s even an opportunity for other community members to step up and propose ideas about what the Rails 4 Queue API should look like. I think getting this right and shipping it will be a huge win for Rails developers everywhere who are currently duplicating effort working on a myriad of background job processing extensions and customizations coupled to their current backend queueing library.


Learning Rails (and Ruby)

By | April 8th, 2013 at 2:04PM

I know PHP. I mean, I really know PHP. Not just the syntax, or the idioms and idiosyncrasies, but why. I can tell you why something works the way it does, under the hood; and I was probably around when the decision was made to do it that way. Thirteen years with any language is a long time.

But it hasn’t always been PHP. Two years into my PHP journey, I took a small detour and taught myself ColdFusion, which had just transitioned to running on top of the Java EE platform. Which also meant that I dug into Java because you could extend ColdFusion with Java components.

And then of course there was the inevitable delving into JavaScript, plus a healthy dose of CSS, semantic web technologies (RDF, OWL, and SPARQL), XML, XPath, and XSL (XSL:FO and XSLT), and let’s not forget SQL. Heck, I can write (and have written) DTDs!

More recently, after starting to work for Engine Yard on the Orchestra PHP Platform, I learned Python. (Yes, we use Python for parts of our PHP stack. Why? It’s the best tool for the job.)

I didn’t list most of the keywords on my résumé to make myself sound fancy. I did it because I think it’s fair to say that, at this point, I qualify as a polyglot.

I have always explored new tools (be they daemons, utilities, libraries, languages or services) and judged them on a few criteria:

  • How well is the tool written?

  • What is its security track record?

  • How many open bugs does it have?

  • How has the community responded to previous issues (are they open, friendly, courteous, prompt)?

  • Does it have enough features for what I need?

  • Does it have too many features for what I need?

Ultimately, it comes down to: Is it the right tool for the task?

Because of this, ultimately when I come to write a web site, PHP is my tool of choice. Know thy tool well, and it shall treat you well.

Then along came Engine Yard, and I was exposed to just a ton of fantastic engineers who happen to choose Ruby as their tool of choice.

Even still, more than a year after working for Engine Yard, I had yet to pick up Ruby, or Rails. Sure, I’ve read a lot of Ruby code, for code reviews and out of interest for how something has been done. I’ve even hacked a little on some Rails stuff, but it was mostly copy-and-paste-and-hum-a-few-bars.

Then along came Distill, and we needed a site. With about 3 weeks to work on it, while still working on other tasks, and with no requirements for technology choice, I would normally have just picked up PHP, and probably Zend Framework 2 and knocked it out in a few days.

Instead, given that I’ve been discussing Distill, and what we want to achieve (a focus on solutions, not technologies) for months, I decided to get into the spirit with what looked like an opportunity to try out Rails (and Ruby). This was a small project, with limited feature set and scope that I could quickly fall back to PHP if I ran into too many issues. Luckily, surrounded by literally dozens of fantastic experienced developers, I had a lot of folks I could ask questions of — but as you’ll see, I didn’t really need much help.

Implementation Details

This blog post is not meant to focus on how I learned Ruby or Rails, but what I learned from the experience. However, I did want to cover some of this too.

Coming from PHP, I definitely encountered some WTFs:

  • Parentheses are optional on method calls, and often not used: but in some cases [such as nested calls] you need them.

  • There are lots of ways to do “if not”. These include: if !<condition>, if not <condition> and unless <condition>.

  • Method names can contain ? and !, and there is a convention of ending the names of methods that return a boolean with ?. It’s not an operator, it’s part of the name, e.g. foo.empty?. Methods ending in ! usually modify the object they are called upon: foo.downcase! modifies foo, while foo.downcase returns the result. For this reason they are known as destructive methods.

  • Returns can be implicit, a method returns the result of the last statement run (and most code I’ve seen does this)
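The points above fit into a few lines of plain Ruby. This is only an illustrative sketch; the `blank?` helper and the variable names are made up for the example.

```ruby
# Predicate method: the ? is part of the name, and the value of the
# last expression is returned implicitly.
def blank?(text)
  text.strip.empty?
end

name = "  Distill "

# Three equivalent ways to write "if not".
a = "has a name" if !blank?(name)
b = "has a name" if not blank?(name)
c = "has a name" unless blank?(name)

# Non-destructive vs. destructive: downcase returns a copy,
# downcase! changes the receiver in place.
title = "RAILS".dup
copy  = title.downcase
title.downcase!

# Parentheses are optional on a simple call...
parts = "a,b".split ","
# ...but a chained call needs them: without parentheses,
# `"a,b".split ",".length` would parse as `"a,b".split(",".length)`.
count = "a,b".split(",").length
```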

Then you get something like this (actual code at some point for the Distill website. It may have changed by the time it goes live):

class Speaker < ActiveRecord::Base
  belongs_to :user
  has_many :proposals
 
  attr_accessible :user, :bio, :email, :name, :id, :website,  :photo
 
  has_attached_file :photo, :styles => { :medium => "300x300>", :thumb => "120x120>" }
  validates_attachment_content_type :photo, :content_type => /^image\/(png|gif|jpeg)/
 
  validates :bio, :email, :name, :photo, :presence => true
  validates :email, :format => {
      :with => /\A[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]+\z/,
      :message => "Must be a valid email address"
  }
end

Let’s break this down:

  • Line 1: We define a class, Speaker that extends (<) the Base class in (::) the ActiveRecord module.

  • Lines 2, 3, 5, 7, 8, 10, 11: are all method calls

  • Line 15: we close the class definition (end)

“But wait, you said method calls?” Indeed I did! “That’s crazy talk! We’re still in the class definition!”

First, it’s important to note that Ruby has an implicit self. Inside a class body it behaves much like self:: in PHP, calling methods statically (that’s the easiest equivalency). This means that belongs_to :user can also be expressed as self.belongs_to :user. What’s weird here is that these methods (which are inherited) are being called during the definition of the class. Such methods can be defined (e.g. def self.foo) and called (after definition) within the definition of that same class, or inherited from its parent. These methods modify the class object itself.

Aside: while writing this blog post, I actually fully realized what the previous section means, and I had a tweet exchange with fellow Engine Yarder @mkb which helped solidify what’s going on which you can see here — to summarize: classes are defined by executing code, this means you can programmatically define classes, and that you can even work with it during definition.
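Here is a stripped-down sketch of the mechanism. It is not how ActiveRecord implements validates; `Model` and `required_fields` are invented stand-ins that only show a class-level method being called while the class body executes.

```ruby
class Model
  # A class-level method, callable while a subclass body is being executed.
  def self.validates(*fields)
    required_fields.concat(fields)
  end

  # State is kept on the class object itself (a class-level instance variable).
  def self.required_fields
    @required_fields ||= []
  end
end

class Speaker < Model
  # The class body is ordinary code; self is the class being defined.
  puts "self is #{self} while the class body runs"

  validates :bio, :email   # same as self.validates(:bio, :email)
  validates :name          # a second call, not a redefined property
end

puts Speaker.required_fields.inspect
```

Each validates call appends to the list stored on Speaker, which is why calling it twice is perfectly normal.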

So what I thought was a property (validates) that was somehow magically defined twice is actually a method call. There’s that lack of parentheses on method calls that I mentioned earlier.

So what happened?

Well, I built a website. A secure, readable (code), usable website. Nothing more than I could have built in PHP, but there were several things that I did that sort of blew my mind.

For the majority of this app, we’re talking simple CRUD. Show a form, take the input, store the input, and display it later. There were no complex data structures that I had to worry about. Ruby/Rails or PHP/Zend Framework 2: it didn’t really matter. I probably could have written this in a bash script.

I read through most of Getting Started with Rails, adapting it to my needs as I went along.

The two parts I thought would be the most challenging:

  1. OAuth with multiple backends (Github, Facebook, and Twitter)

  2. Storing uploaded images on S3.

OAuth with Multiple Backends

To solve this challenge, I used the time-honored practice of Googling. In doing so, I stumbled across Devise and Omniauth, two gems that implement user authentication and OAuth, respectively.

Integrating these gems for someone who barely knows Ruby or Rails was actually quite tricky, but I got it done and was quite amazed! It didn’t just handle the OAuth, it handled routes, views, forms, database schema – pretty much everything. I did have to write custom handlers to deal with pushing the user into the database and handling the data sent back by the service (e.g. name/email), but nothing too difficult.

S3 Image Uploads

Again, I solved this with Google, and came up with the Paperclip gem, a file “attachment” extension to ActiveRecord.

Through paperclip, I went from not knowing how to implement file uploads in Ruby/Rails, to handling it, storing the details in the database, creating multiple thumbnails, and pushing it to S3 within minutes. Essentially, after config (which is just specifying the storage adapter of S3, the credentials, and path), and a rake call, these few lines handled everything:

has_attached_file :photo, :styles => { :medium => "300x300>", :thumb => "120x120>" }
validates_attachment_content_type :photo, :content_type => /^image\/(png|gif|jpeg)/

This names the attachment as “photo”, specifies the versions we want (300×300 and 120×120) and validates that they uploaded a png, gif or jpeg.

What did I get out of this?

Well, I still have a ton to learn. Ruby isn’t just a different syntax, which was (mostly) my experience with Python. That being said, I feel I can now intelligently talk about some of the benefits of Rails compared to its PHP counterparts.

One of the most significant advantages is the library of amazing gems that work out of the box with Rails and can bring a lot more than more generic PHP libraries, both due to the widespread usage of Rails in the Ruby community and the fact that folks tend to stick to its standard tooling. For example, there is an OAuth component for Zend Framework 2, but it doesn’t presume that you’re using ZF2 as your controller/router and set up the routes, nor does it generate views, or hook into a specific auth mechanism or database adapter. This, I believe, is what makes Rails (and therefore Ruby) a great tool for rapid development. (If you’re hunting for gems, The Ruby Toolbox lists gems by what they do, and their popularity, which is quite handy!)

I also think we need to separate the framework from the language. Just like PHP, or Python, Ruby is a general purpose language. PHP however was built from the ground up with a primary focus on running in a web environment. But that really doesn’t mean much more than it just provides easy access to the web environment (GET/POST/Cookies, built in session handling, PUT/POST raw data, server environment, etc). What this means to me is that I could have built the Distill site in any of these three.

One major factor PHP has going for it on the web is that its shared-nothing architecture is great for horizontal scaling, and it seems better at handling concurrency out of the box, though Ruby is making great strides in this area (with projects like Rubinius and JRuby). That isn’t to say Ruby or Python can’t, or don’t, scale; it’s just that scaling comes much further along in the learning curve because there’s more to getting it right. Of course, working with (and deploying on) the Engine Yard Cloud means this was a non-issue for me.

So, the language doesn’t matter all that much, but what about the framework? Could I have built the same website using Zend Framework 2? Yes. Would it have been easier? In some aspects, specifically the fact I don’t know Ruby or Rails that well, sure. However I don’t think I could have built out the OAuth and S3 storage as covered here as quickly and easily given all other factors were equal. Now, going back to Zend Framework 2, I find I’m doing a lot more busy-work, such as generating forms, scaffolding, schema updates, etc, at least as a starting point.

Does this mean I’m switching to Ruby/Rails? Unlikely. PHP is still the preferred ink in my pen, simply because knowing it so well means it’s an effortless tool to transform my ideas into reality.

Will I turn to Ruby/Rails again? Maybe not for my own projects — I tend to work with friends from the PHP community. But when I’m working with my excellent teammates at Engine Yard, absolutely — for me, the strongest thing Ruby/Rails has going for it, is the community knowledge I have access to. No matter the answer to this question however, learning a new language — and more importantly, learning best practices with that language — hopefully makes me a better developer.

It is very easy to latch on to a language, and a community, and think we are learning, because we’re looking at periphery technologies like database servers, cache storage and web services… and we are learning, but it’s through a single point of view (“The PHP Way” or “The Ruby Way”); a set of blinders that are based on everything we are comfortable with.

Getting out of your comfort zone — learning a new language — the core part of what brings all of our technologies together, capturing, and discussing ideas with other communities is where you’ll really grow your ability to solve problems: with the best solutions, and the best tools. So get out to meetups, or conferences, listen to podcasts, and read articles on other languages, even if you’re not interested in using that language.

I’m excited to have finally put enough time into learning enough about this great tool to work on some amazing technology with my fellow Engine Yarders, and to be able to apply the general concepts I’ve learned throughout my journey into another medium with more people.

 Distill is a conference to explore the development of inspired applications. The call for papers is now open and tickets go on sale in April.


Engine Yard Expands Support For Rubinius

By | April 4th, 2013 at 1:04PM

I am very pleased to announce that Engine Yard is sponsoring Dirkjan Bussink of Critical Codes to work on Rubinius.

Engine Yard has been a generous supporter of open source Ruby projects, including multiple Ruby implementations and Ruby on Rails, for many years. Indeed, they originally hired Evan Phoenix, the creator of Rubinius, in 2007, and have sponsored my work on Rubinius since 2008. Their sponsorship improves all aspects of the Ruby community, for developers writing Ruby code and for people everywhere who use applications written in Ruby or Rails. I’d like to thank Engine Yard for making Ruby and other open source technologies better for everyone.

Dirkjan has been a contributor to numerous open source projects, and to Rubinius in particular, for many years. He is eager, helpful and all around a joy to work with. We are lucky to have him helping with Rubinius.

With the accolades and appreciation dispensed, I’d like to cover some of what is coming for Rubinius.

Rubinius is an implementation of Ruby. At present, it supports 1.8.7 and 1.9.3 language modes, with support for Ruby 2.0 coming soon. Rubinius is a drop-in replacement for MRI (Matz’s Ruby Implementation), including support for C-extensions. Rubinius includes a modern, generational garbage collector, just-in-time compiler to native code using LLVM, and full support for multi-core and multi-CPU hardware with no global interpreter lock.

We are working toward the 2.0 final release for Rubinius. Dirkjan recently visited the Engine Yard office in Portland, OR for a week so we could talk about current and future development in person. I blogged summaries of our discussions: Welcome Dirkjan! and PDX Summit Recap. If you are interested in technical aspects of Rubinius, please see those posts.

Rubinius is available as an Early Access feature in Engine Yard Cloud. If you are currently using Engine Yard Cloud and are interested in learning more about how Rubinius may benefit you, please contact us. There are professional services available to help evaluate the benefits of Rubinius. Engine Yard also offers a free trial if you are not currently using Cloud. We are also working on Rubinius support in other platforms. Dirkjan is available to contract to assist evaluating and migrating to Rubinius.

The future is concurrent. We see this every day with industry’s use of technologies like Erlang, Clojure, and Node.js. Rubinius has been built from the beginning to bring Ruby into this concurrent world.

We will be writing more about the technology in Rubinius in the coming weeks. In the meantime, try your application, library, or gem on Rubinius. And don’t forget to test on Rubinius on Travis CI. That provides us invaluable feedback. If you have a moment, drop by our #rubinius IRC channel and say hello to Dirkjan.


RVM Autolibs: Automatic Dependency Handling and Ruby 2.0

By | March 27th, 2013 at 12:03PM

Last month marked a very important milestone for Rubyists: the release of Ruby 2.0.0. It comes with a new RubyGems and new dependencies, including OpenSSL. Previously, RVM did little to resolve dependencies; it only installed LibYAML, because RubyGems requires it to function properly. The situation changes with OpenSSL, as it’s a bigger dependency. Initially, for Ruby 2.0.0-rc1, RVM installed OpenSSL itself. However, compiling OpenSSL is not as easy a task as compiling LibYAML, and it duplicates the effort distribution maintainers already put into building a working OpenSSL.

A new approach

To make this work, RVM takes a new approach. It will now work with the system package manager to install required libraries. This is no easy feat, as different systems have different names for packages, with some of them being available by default and some not available at all.

It’s easy when it’s easy

It’s easy to use an existing package manager on a system that has one. The trouble begins when a distribution does not have a default package manager, which is the case for OS X. There are a number of package managers for it, and none of them is popular enough to be the de facto standard. With this in mind, RVM needs to find an existing package manager, and install one when none is available.

Sensible defaults

When autolibs was first added, RVM assumed users wanted to have all the work done for them. Unfortunately, we quickly hit the reality that some users know better and still prefer to install dependencies manually. There had to be a compromise to fit both needs. In the end, RVM will by default detect available libraries and fail if they are not available. Users now have the option to switch to other modes, including “do it all for me” and “let me do it myself”.

Do it all for me

Users who want to get the libraries installed automatically can use autolibs mode 4, a.k.a. enable. This tells RVM to find a package manager (installing one if necessary), install all dependencies, and finally use them for compiling rubies. If no package manager is available (on OS X), Homebrew will be installed. However, users can select which package manager will be installed with the autolibs modes osx_port, osx_fink and smf. The smf package manager is for the lesser-known RailsInstaller’s SM Framework.

For systems with a default package manager mode 4 is the same as mode 3, which means install missing packages.

Let me do it myself

For users who do not want RVM to install anything automatically, there are two modes that will come in handy. Mode 1 instructs RVM to pick up the libraries it finds and just show warnings if any are missing. In case even the automatic detection is too much, it can be turned off with mode 0. Unfortunately, there is a caveat. Given that the code is more dynamic, there is no longer a static list showing what is required. This means that some libraries are picked depending on the current system state. So if users do not want to use the automated modes (3 or 4), RVM can only report what is missing, not all the dependencies that might be required on similar distributions.

Some tricks

To install RVM with Ruby, Ruby on Rails and all the required libraries (aka. the poor man’s RailsInstaller):

 \curl -L https://get.rvm.io | bash -s stable --rails --autolibs=enable

To use RVM in a deployment where sudo requires extra handling, as in Capistrano:

    task :install_requirements do
      sudo "rvm --autolibs=4 requirements #{rvm_ruby_string}"
    end
    task :install_ruby do
      run "rvm --autolibs=1 install #{rvm_ruby_string}"
    end

You can find more details about autolibs in our docs https://rvm.io/rvm/autolibs.

Let us know

We have been testing the autolibs code for some time now, but as always, bringing it to a wider audience uncovers new cases, reveals new flaws, or simply creates misunderstandings. We are keen to get those fixed, so please report issues to RVM’s issue tracker https://github.com/wayneeseguin/rvm/issues or talk to us on IRC http://webchat.freenode.net/?channels=rvm

 Thanks for using RVM, and may the autolibs feature improve your Ruby experience.

Other Announcements

Officially opening RVM 2.0 work.

RVM 1.19 was the last release in which we included new features (Autolibs); all new feature requests will be deferred to RVM 2.0. We will still provide support, fix bugs and update all software versions until RVM 2.0 is released and marked stable. But to allow work on RVM 2.0 to proceed, we need to freeze the feature set available in RVM 1.x.

Updates to the website!

RVM has long had an unorganized website where information simply accumulates, and it has become hard to work with for both maintainers and users. Since we are opening up development on RVM 2.0, we are also opening up development on a brand new site! We hope to clean up and simplify the way you interact with the site, implement a cleaner design using Twitter’s Bootstrap, and make the documentation more like man pages so that it can be ported back and forth between RVM and the website, making everything more seamless not only for us, but for users also.

 


Introducing Gentoo 12.11, the New & Improved Engine Yard Distribution

By | March 19th, 2013 at 10:03AM

The distribution is one of the most crucial components of the Engine Yard stack. Much has changed since the company was founded, and the distribution needed to change with it. On behalf of the Distribution Team, including Gordon Malm, Kirk Haines and myself, I am pleased to announce the Early Availability of the new Engine Yard distribution. Even with the changes made, the team has worked hard to closely match the underlying system with what users have familiarized themselves with. With this in mind I’d like to take the time to point out the main changes which we feel are beneficial to you, our customer.

Enhanced Ruby Support

While Engine Yard has recently added support for a number of new languages, including PHP and NodeJS, it still has a strong Ruby presence. Since Ruby was first released, many new implementations have come out, and with them the need to better support existing and future implementations. The new distribution’s Ruby architecture improves the support of these implementations through a more modular backend.

To make for an even more customized experience for users, RVM is now available on all new distribution installations. A big thanks to Michal Papis, the RVM lead, who has been instrumental in helping make this happen. This has been a request from many customers, and we’re excited to be able to deliver on it.

More User Focus

Work on the new distribution allowed the team to start with a cleaner slate, which meant that more focused, user-centered customizations could be made. Packages such as Nginx and PHP were re-evaluated to ensure that they were customized to fit the needs of a majority of our customers. Supported versions were re-evaluated as well for major packages, allowing our support team to support the new distribution more efficiently. Finally, the Linux kernel has been updated to the 3.4 series and the configuration options have been re-evaluated. One of the most prominent changes is the move to EXT4 as the default filesystem.

Hardened Toolchain

There has been substantial progress in the area of compiler-based security over the years. The new distribution utilizes a hardened toolchain to provide the benefits of this effort. Such protections include:

  • Stack Smashing Protection (SSP) for mitigation against stack overflows

  • Position Independent Executables (PIEs) for mitigation against attacks requiring executable code be located at a specific address

  • FORTIFY_SOURCE for mitigation against attacks resulting from the overflow of fixed length buffers and some format-string attacks

  • RELRO for mitigation of attacks against various sections in an ELF binary

  • BIND_NOW for mitigation of attacks that rely on loading shared objects at the time of function execution

These changes help to provide additional security for the system, reducing the possible attack vectors that could be utilized by an exploit.

Improved Testing

Testing an operating system is an extremely difficult process, and requires constant adaptation. Work on the new distribution has led to an increase in the creation of runtime tests for ensuring the reliability of the system. Core packages that had test suites were evaluated to ensure as much code level reliability as possible. I would in particular like to thank the Engine Yard QA team, who has played an instrumental role in helping us with this goal. However, testing is once again a constant effort and we look forward to helping improve the quality of the testing process.

Conclusion

These are just a few of the many improvements that have been made to the new distribution to better help serve our customers. Our work does not end here however, and we look forward to improving our processes even further to better serve your needs. On behalf of the distribution team we thank you for being Engine Yard customers, and look forward to working with you now and in the future.

To get started with early access for the new distribution, please refer to the Use Engine Yard Gentoo 12.11 Early Access documentation on the Engine Yard website.
