Everything You Need to Know About Unicorn

Unicorn's been a topic I've been interested in learning about for a while now; numerous Engine Yard customers and developer friends use it, love it, and recommend it. Thankfully, the opportunity to do so recently presented itself. I spent some time poking around free resources looking for answers to my questions, and it wasn't as easy as I'd hoped... so I decided to go straight to the source.

First, I spent a bunch of time going over the Unicorn README file. It's comprehensive, but when I was done I still had questions, so I put them all together and emailed the Unicorn development team. They were gracious enough to reply with detailed answers to all my questions, and now that I'm in the know, I figured this would be a great resource to share with the rest of you. It's not our usual style of blog post, but it's solid information just the same, in what's hopefully an easily consumable format.

I've organized the questions into topical sections. The topics are: Clients, Debugging, Process Management, Load Balancing, Thread-safety, Rack Support and Rack Wrapper, Log Files, Binary Upgrades, Forking, Listening Interfaces, Configuration, Asynchronous Transfers, The Binary, and Dependencies. There's a lot, so read through it all, or skip straight to the section that interests you.

Clients

What are "fast-clients"?

Clients that can make full (or close to full) use of the network bandwidth available to the server. Clients on a LAN (or the same host) usually fit this description, as they don't have to trickle data to the server over a slow link.

What's a slow client, by comparison?

A client with high latency or limited bandwidth that forces the server to sit idle and wait for data in the request or writable buffer space in the response. Accept filters in FreeBSD and deferred accept in Linux mitigate this problem for slow legitimate clients, but a dedicated attack can still get around those.

Slowloris is perhaps the most prominent example of the damage slow clients can cause, but similar tools have been floating around privately for years, including the Unicorn author's own "David" tool, which he only made public after Slowloris.

Clients that sit around with idle keepalive connections are also a huge problem for simple servers like Unicorn (and traditional Apache prefork), so Unicorn does not support keepalive.

The Unicorn author also works on the Rainbows! server, which is designed specifically to handle talking directly to slow clients and high-latency apps (Comet/WebSockets) without nginx in front.

What is a "low-latency, high-bandwidth connection"?

Anything on localhost or the local area network that doesn't make the server sit idle, unable to service other requests.

Debugging

Can you give me an example of how to debug?

Reproducibility is critical to debugging. Processes are inherently simpler, as the process state is always well-defined on a per-request basis and isolated from other requests as much as possible by the OS. One example is to help track down a memory leak related to a specific class of requests:

A non-Rubyist admin noticed that among a pool of workers, some used significantly more memory than others. Since the log file format always logged the PID serving each request, they were able to quickly narrow down which endpoints were prone to leaking memory (without even looking at the code).

In a server where requests are all served within the same process, it would've been much harder to narrow down which endpoints were using up memory. In a server where a single process handles multiple clients simultaneously, it would've required thorough inspection of the source code to track down which requests were leaking memory.
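To make the PID-per-request idea concrete, here is a minimal sketch of a Rack-style middleware that tags every request line with the worker's PID. The class name PidLogger, the log format, and the stand-in app are all assumptions made up for illustration; they are not part of Unicorn.

```ruby
require "logger"
require "stringio"

# Hypothetical middleware: logs the worker PID alongside each request.
class PidLogger
  def initialize(app, logger)
    @app = app
    @logger = logger
  end

  def call(env)
    status, headers, body = @app.call(env)
    @logger.info("pid=#{Process.pid} #{env['REQUEST_METHOD']} #{env['PATH_INFO']} -> #{status}")
    [status, headers, body]
  end
end

# Stand-in app and an in-memory log, just to exercise the middleware.
log = StringIO.new
inner = ->(env) { [200, { "content-type" => "text/plain" }, ["ok"]] }
app = PidLogger.new(inner, Logger.new(log))

app.call("REQUEST_METHOD" => "GET", "PATH_INFO" => "/reports")
puts log.string
```

With lines like these in the log, matching a bloated worker's PID against the endpoints it served is a grep away.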

Process Management

"Unicorn will reap and restart workers that die from broken apps. There's no need to manage multiple processes or ports yourself. Unicorn can spawn and manage any number of worker processes you choose to scale to your backend."

Does that mean that Unicorn doesn't need monit or god?

No server needs things like monit or god; it all depends on your comfort level, your app, and your support requirements. It's always possible—albeit unlikely—for the master process to die, but things like monit and god aren't immune to dying, either. Developers use those tools, and similar ones, like Bluepill, with Unicorn.
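The reap-and-restart behavior can be sketched with nothing but fork and wait. This is a simplified illustration of the general technique, not Unicorn's actual master loop; the worker here crashes immediately to stand in for a broken app, and the loop stops after three respawns so the example terminates.

```ruby
WORKERS = 2

# Hypothetical worker: a real one would loop serving requests.
# This one dies at once to simulate a broken app.
def spawn_worker(nr)
  fork { exit!(1) }
end

workers = {}   # pid => worker number
WORKERS.times { |nr| workers[spawn_worker(nr)] = nr }

respawns = 0
while respawns < 3
  pid, _status = Process.wait2          # blocks until any child exits
  nr = workers.delete(pid)              # which worker slot just died?
  workers[spawn_worker(nr)] = nr        # put a fresh process in that slot
  respawns += 1
end

# Reap the last generation so no zombies are left behind.
workers.each_key { |pid| Process.wait(pid) }
puts "respawned #{respawns} workers"
```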

Load Balancing

"Load balancing is done entirely by the operating system kernel. Requests never pile up behind a busy worker process."

So there isn't a mongrel queue issue?

No, there is no Mongrel queue issue on a single machine. A single queue is shared across worker processes, and the workers only pull off the queue when they're available to do work. There's still a potential queue issue in a cluster behind a load balancer, but the risk is mitigated, since most servers are multicore and run multiple worker processes. The queue is also tunable via the :backlog option of the listen directive.
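The single shared queue falls out of how forked processes inherit a listening socket. Here's a rough sketch using only Ruby's standard library, assuming a Unix-like system; it isn't Unicorn code, and the port, backlog value, and reply text are arbitrary.

```ruby
require "socket"

server = TCPServer.new("127.0.0.1", 0)   # port 0: let the OS pick a free port
server.listen(64)                        # the kernel's queue length (backlog)
port = server.addr[1]

# Every worker inherits the same listening socket from the parent.
pids = 2.times.map do
  fork do
    trap("TERM") { exit!(0) }
    loop do
      client = server.accept             # only an idle worker gets here
      client.gets                        # read the one-line "request"
      client.puts "served by #{Process.pid}"
      client.close
    end
  end
end

# The kernel hands each connection to whichever worker is waiting in accept.
replies = 4.times.map do
  TCPSocket.open("127.0.0.1", port) do |sock|
    sock.puts "hello"
    sock.gets.chomp
  end
end
puts replies

pids.each { |pid| Process.kill("TERM", pid); Process.wait(pid) }
server.close
```

A busy worker simply isn't sitting in accept, so nothing queues up behind it.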

Thread-safety

Why is thread-safety good?

The utility of thread-safety really depends on the particulars of your situation. It gives you much more flexibility with what your app can run, and under ideal conditions, threads are memory efficient and relatively inexpensive. Thus, allowing apps to work with threads is good for experienced programmers.

On the other hand, making things thread-safe by default can hurt performance in single-threaded situations. Even contention-free locks can end up adding significant overhead due to memory barriers. Both MRI and Python core developers have come to this same conclusion.

Rack Support and Rack Wrapper

What rack applications are supported?

Pretty much anything that passes Rack::Lint (and sometimes, even a few that don't).
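For reference, a Rack application is just an object that responds to call(env) and returns a status, a headers hash, and a body that responds to each. A minimal sketch (the greeting and the hand-built env hash are made up for illustration; a real server supplies a much fuller env):

```ruby
# A Rack-style app as a plain lambda.
app = lambda do |env|
  body = "Hello from #{env['PATH_INFO']}\n"
  headers = { "content-type" => "text/plain", "content-length" => body.bytesize.to_s }
  [200, headers, [body]]
end

# Calling it by hand, the way a server would for each request.
status, headers, body = app.call("REQUEST_METHOD" => "GET", "PATH_INFO" => "/")
puts status
body.each { |chunk| print chunk }
```

In a config.ru file, the same lambda would be handed to the server with "run app".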

What Ruby on Rails versions does the wrapper support?

The manpage says everything from 1.2.x to 2.3.x, and there are integration tests for those versions.

Log Files

"Builtin reopening of all log files in your application via USR1 signal. This allows logrotate to rotate files atomically and quickly via rename instead of the race condition prone and slow copytruncate method."

What is the USR1 signal?

USR1 is the first user-defined signal, which usually gives applications the most flexibility in determining what a signal handler for it would do. To send a USR1 signal to Unicorn, use the standard kill(1) command:

kill -USR1 $PROCESS_ID

Nginx also uses the USR1 signal for reopening log files. Most of the signals Unicorn accepts map directly to the nginx ones for ease-of-learning. Unicorn also takes steps to ensure multi-line log entries from one request all stay within the same file.
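The rename-then-reopen dance looks roughly like this in plain Ruby. It's a sketch of the general technique under simplifying assumptions (one log file, one process signaling itself, a temp directory standing in for the real log path), not Unicorn's implementation.

```ruby
require "tmpdir"
require "fileutils"

dir = Dir.mktmpdir
path = File.join(dir, "app.log")
log = File.open(path, "a")
log.sync = true

reopen_requested = false
Signal.trap("USR1") { reopen_requested = true }   # keep the handler tiny

log.puts "before rotation"
File.rename(path, "#{path}.1")        # what logrotate does: an atomic rename
Process.kill("USR1", Process.pid)     # "kill -USR1", sent to ourselves
sleep 0.05 until reopen_requested     # wait for the handler to run

log.reopen(path, "a")                 # same IO object, fresh file at the old path
log.sync = true
log.puts "after rotation"

rotated = File.read("#{path}.1")
current = File.read(path)
puts rotated, current
FileUtils.remove_entry(dir)
```

Until the reopen, writes keep landing in the renamed file, so nothing is lost and nothing needs to be truncated.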

Binary Upgrades

What are binary upgrades?

Binary upgrades replace Unicorn itself, the version of Ruby, or even system libraries, including the system C library. For users that depend on copy-on-write functionality, it's also the only way to upgrade the application.

How do you upgrade?

The upgrade procedure is the same as nginx, and is also documented here (bottom of page).

What happens after upgrading?

After ensuring the old processes are terminated gracefully (via SIGQUIT), that same code should ensure that the app behaves as expected. If the app is broken, another "upgrade" is required which may involve switching back to a known good version.

Forking

What is the preload_app directive?

The preload_app directive loads the application before forking workers, so it can share any loaded data structures. By default, workers each load a private copy of their app for out-of-the-box compatibility with existing servers.
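The effect of loading before forking can be seen in miniature with fork alone. In this sketch, APP_DATA is a made-up stand-in for a loaded application; it's built once in the parent and is then visible in every child without being rebuilt.

```ruby
# "Preload": build an expensive structure once, before forking.
APP_DATA = (1..100_000).map { |i| "row #{i}" }.freeze

reader, writer = IO.pipe

pids = 2.times.map do
  fork do
    reader.close
    # The child sees the parent's data without rebuilding it.
    writer.puts "#{Process.pid} sees #{APP_DATA.size} rows"
    writer.close
    exit!(0)
  end
end

writer.close
lines = reader.read.lines     # returns once every child has closed its end
pids.each { |pid| Process.wait(pid) }
puts lines
```

Whether the underlying memory pages actually stay shared depends on the Ruby implementation's garbage collector, which is where copy-on-write friendliness comes in.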

What's a use case for using the preload_app directive?

preload_app can dramatically speed up startup times. It can also make it easy to share memory across processes when using Ruby Enterprise Edition. REE also uses tcmalloc on some platforms, like Linux, instead of a generic malloc, which improves performance for most server workloads independently of copy-on-write.

Listening Interfaces

What's an example configuration for how to set this up and how this can be used for debugging an application?

You can set up a worker to listen on a specific address so that you can do things like strace the worker while hitting that address and see what happens. Unicorn's sample configuration has a commented-out example, shortened and uncommented here:

after_fork do |server, worker|
  # per-process listener ports for debugging/admin/migrations
  addr = "127.0.0.1:#{9293 + worker.nr}"
  server.listen(addr, :tries => -1, :delay => 5)
end

Without a dedicated listener, strace slows a process down enough that it usually "loses" the race against the other workers to accept() a connection, and so never sees the request.

Configuration

Is there a good example configuration to help me get started?

The examples here cover many settings, including comments. The simplest case is with preload_app=false. Here's a short example:

worker_processes 16
pid "/path/to/app/shared/pids/unicorn.pid"
stderr_path "/path/to/app/shared/log/unicorn.stderr.log"
stdout_path "/path/to/app/shared/log/unicorn.stdout.log"

In contrast, preload_app=true can significantly complicate things, as it requires disconnecting/reconnecting to the database and other connections to avoid unintended resource sharing. All configuration settings are documented in the RDoc of the Unicorn::Configurator class.
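As a rough illustration of that disconnect/reconnect step, here's a configuration fragment for the preload_app=true case. It assumes a Rails app using ActiveRecord; other connections (caches, message queues, and so on) would need the same treatment.

```ruby
# unicorn.conf.rb (fragment): assumes a Rails app using ActiveRecord
preload_app true

before_fork do |server, worker|
  # The master's database connection must not be shared with workers.
  defined?(ActiveRecord::Base) and ActiveRecord::Base.connection.disconnect!
end

after_fork do |server, worker|
  # Each worker opens its own connection after the fork.
  defined?(ActiveRecord::Base) and ActiveRecord::Base.establish_connection
end
```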

In addition to Unicorn::Configurator settings, there's also the rackup config file (usually config.ru) used by all Rack applications independently of the underlying server. There's also system/kernel tuning, which the Unicorn documentation touches on here.

The Binary

What is the unicorn executable? What is the unicorn_rails executable?

The unicorn executable is a Rack-only tool modeled after Rack's "rackup" and is recommended for Rack applications. unicorn_rails was made to be an easier transition for users of pre-Rack versions of Rails. The manpage encourages Rails 3 users to use plain unicorn instead.

What's the difference?

From the unicorn_rails manpage, some conventions of unicorn_rails are modeled after script/server found in Rails. It creates directories under "tmp" like script/server and the -E/--environment switch sets RAILS_ENV instead of RACK_ENV.

Dependencies

Are there any dependencies? Gems, system packages, etc.

Rack is the only gem Unicorn currently depends on. Unicorn does not set hard dependencies on any released version of Rack. Unicorn depends on MRI 1.8 or 1.9 on a Unix-like platform. There have been commits to make the C/Ragel HTTP parser work with Rubinius, but there have been some other issues in the pure Ruby code. Building from git requires Ragel (but the distributed source tarball/gems do not). The project does not distribute precompiled binaries.

Unicorn uses RDoc for most of the documentation and John MacFarlane's Pandoc (a Haskell tool) for the Markdown manpages. Pandoc was the most prominent Markdown to manpage converter at the time, as Ryan Tomayko's ronn had not appeared when the manpages did.

What are the requirements? Operating system, ram, etc.

Most POSIX-like platforms are supported. Unicorn depends on a bunch of Unix-y things like fork(), the ability to share file descriptors with children, signals, pipes, unlinked open files, etc.

Unicorn has been deployed to and tested on various Linux distros heavily. The Unicorn mailing list has gotten reports and patches for OpenBSD compatibility, too, so that should work. Unicorn does not depend on any exotic system calls not provided natively by MRI.

RAM usage depends heavily on the application/libraries, version of Ruby, word size of the architecture, and number of worker processes configured. It shouldn't take significantly more or less than any other Ruby web server.

Conclusion

I didn't start out with a specific problem, more like a void in my knowledge base, and now that that void is gone, I'm pretty pleased with the robust capabilities of Unicorn. It's not going to be the tool-of-choice for every use case, but clearly, it'll do wonders in a lot of them. As always, leave questions and comments here!