
Ruby's a fantastic language; we love it because it's flexible, readable and concise, to name just a few reasons. It is also incredibly complex as far as language syntax (grammar) is concerned. This sometimes leads to some dark, seedy corners... but examining the stranger aspects of Ruby's syntax helps us better understand the power of the language. This entry shows some of those stranger aspects and reflects on how rarely we see them used in real life.

Warning: Most of the code you see in this post should never be used in real life by actual programmers. The snippets are meant to provide insight into the power of Ruby, and more obviously, are for entertainment value.

1. Expressions are Cool

One of the great things about Ruby is that everything is just executable code. In fact, everything is an expression. This makes it nice because you can then programmatically build up a set of methods in a class body, like this:


class Numeric
  MM_SIZE = 0.00025

  [:mm, :cm, :dm, :m, :dam, :hm, :km, :Mm].inject(MM_SIZE) do |size, unit|
    define_method(unit) { self * size }
    size * 10
  end
end

Ignoring the fact that I just modified Numeric, you have to admit that being able to do stuff like this makes Ruby a great language: one with the flexibility to define methods programmatically, because everything is just code and expressions.
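To make the "class body is just code" point concrete without reopening Numeric, here is a minimal sketch. The `Length` and `Empty` classes and the unit values are hypothetical, invented only for illustration; the point is that a loop inside the class body can define methods, and that the class body itself evaluates to a value.

```ruby
# Length is a hypothetical class, used here instead of reopening Numeric.
class Length
  # Size of each unit in meters (illustrative values).
  UNITS = { mm: 0.001, m: 1.0, km: 1000.0 }.freeze

  attr_reader :meters

  def initialize(meters)
    @meters = meters
  end

  # The class body is running code, so a loop can define one reader per unit.
  UNITS.each do |unit, size|
    define_method("in_#{unit}") { meters / size }
  end
end

distance = Length.new(2500.0)
distance.in_m   # => 2500.0
distance.in_km  # => 2.5

# A class body is itself an expression: it evaluates to its last statement.
value = class Empty; 42; end
value  # => 42
```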

Over the years of supporting JRuby, I began to realize that the grammar is a bit too generous. At this point you can argue that it's up to programmers whether they hang themselves with any bizarre language feature. So what's the fuss?


def foo(a, b=def foo(a); "F"; end)
  a
end

p foo("W", 1) + foo("T") + foo("bar")      # => "WTF"

Is there a sane reason to allow a def as the value of an optional argument? Since we have a language where everything is an expression, this is just life in the Ruby lane. Luckily, def returns nil (or, since Ruby 2.1, just the method's name as a Symbol) and not UnboundMethod or something else that's considered useful; otherwise there would be some pretty weird code floating around.
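The mechanics behind that trick can be sketched in a tamer form: a default value is an arbitrary expression that runs only when the argument is omitted. The names below (`greet`, `$default_evaluations`, `hypothetical_method`) are made up for this illustration.

```ruby
$default_evaluations = 0  # global counter, only for this demonstration

# The default for `greeting` is an arbitrary expression; here it bumps the counter.
def greet(name, greeting = ($default_evaluations += 1; "Hello"))
  "#{greeting}, #{name}"
end

with_explicit = greet("Ada", "Hi")  # default expression is skipped
with_default  = greet("Bob")        # default expression runs once

with_explicit         # => "Hi, Ada"
with_default          # => "Hello, Bob"
$default_evaluations  # => 1

# The value of a `def` expression depends on the Ruby version.
def_value = def hypothetical_method; end
def_value  # :hypothetical_method on Ruby 2.1+, nil on older versions
```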

Before I go on, I wondered about variable scoping...


def foo(a, b = self.class.send(:define_method, :foo) {|_| ": I captured #{a}" })
  a
end

p foo("foo") + foo("bar")  # => "foo: I captured foo"

Of course it makes sense that the optional argument value of 'b' evaluates in a scope where it can capture 'a'... but wow! The potential for strange Ruby code is just amazing!
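The same scoping rule has an everyday, much less alarming form: defaults are evaluated left to right, so each one can see the parameters before it. A small sketch, with `rectangle` as a hypothetical method:

```ruby
# Each default may refer to any parameter that appears to its left.
def rectangle(width, height = width, area = width * height)
  [width, height, area]
end

square   = rectangle(3)        # => [3, 3, 9]
oblong   = rectangle(3, 4)     # => [3, 4, 12]
override = rectangle(3, 4, 0)  # => [3, 4, 0]
```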

2. It Makes Sense But...

Other times the grammar has oddities which seem to make sense on the surface but generally confuse you when you start to stray off the path of idiomatic Ruby. Heredocs are a great example:


a = <<EOF
hooray
multi-lines
EOF

This is how we normally see them. The idiomatic example may lead us to define a heredoc like this: on encountering a heredoc statement (<<EOF), take all lines after that statement until the heredoc marker appears on a line by itself, and use those lines as a multi-line string. Certainly, that would still work for this example:


{:skybox => "/data/skyboxes/mountains/",
 :floors => [
  {:location =>  [0,0,0],:data => <<EOL, :texture => "data/texture/wall.jpg"},
........BCB.........
.........P..........
....................
........DDD.........
.......D....D.......
....................
....D.....D....D....
....................
..D....D....D....D..
...BBBBBBBBBBBBBB...
EOL
  {:location =>  [0,10,0],:data => <<EOL, :texture => "data/texture/wall.jpg"},
# More elided ...
}

Heredocs in this code are interleaved in the middle of a hash literal... This is odd, but the definition still holds up. Let's break the definition:


def foo(a,b)
  p a, b
end

foo(<<ONE, <<TWO)
This is one
ONE
This is two
TWO

Two heredocs on the same line break the definition. I think with a little work we could extend the definition to cover multiple heredocs on the same line, but then consider this case:



a = <<ONE
This is one. #{<<TWO}
This is two. #{<<THREE}
This is three.
THREE
TWO
ONE
# a => "This is one. This is two. This is three.\n\n\n"

Coming up with an easy-to-read definition is starting to get rough. Maybe the right thing to do is to accept that you can do weird things with heredocs without trying to explain them all in one common definition. Do we really want people to use heredocs this way? Is it cool that we can do stuff like this?
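A working rule that seems to cover the simpler cases, offered as a sketch rather than a formal definition: a heredoc's body starts on the line after the one where its marker appears, bodies are consumed in the order the markers appear, and the rest of the opening line is parsed as ordinary code. The `pair` method below is a hypothetical helper used to show this.

```ruby
# Hypothetical helper that simply returns its two arguments.
def pair(a, b)
  [a, b]
end

# Method calls can follow each marker because the rest of the line is normal code.
first, second = pair(<<ONE.strip, <<TWO.upcase)
This is one
ONE
This is two
TWO

first   # => "This is one"
second  # => "THIS IS TWO\n"
```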

3. Mystical String Concatenation

Ruby's grammar is huge. Super huge, and sometimes you see something in the grammar that makes you wonder how it got there. For me the one I wonder about the most is what I call the "Mystical String Concatenation" feature:


string        : string1
              | string string1 {
                  $$ = support.literal_concat(getPosition($1), $1, $2);
              }

Since string1 must be a type of string literal this translates into the following Ruby syntax:


a = "foo" "bar"
p a # => "foobar"

Is this really useful? The performance benefit is that the parser will concatenate the string before any execution happens.... but this only works for string literals! A programmer could just rewrite the string as "foobar". Perhaps someone wanted to inspect strings into an eval statement?


one = "foo"
two = "bar"
a = eval "#{one.inspect} #{two.inspect}"

It mystifies me...
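For what it's worth, one place adjacent literals do turn up is in splitting a long string across lines. The sketch below shows that, along with the eval variant from above; all variable names are illustrative.

```ruby
# A backslash at the end of the line continues the expression,
# so the two literals are adjacent and the parser joins them.
message = "adjacent literals " \
          "are joined by the parser"

# Each literal keeps its own quoting rules before the join:
# single quotes do not interpolate, double quotes do.
mixed = 'single #{untouched} ' "double #{1 + 1}"

# The eval variant from the text: inspect produces literal source text.
one = "foo"
two = "bar"
joined = eval("#{one.inspect} #{two.inspect}")

message  # => "adjacent literals are joined by the parser"
mixed    # => "single \#{untouched} double 2"
joined   # => "foobar"
```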

Conclusion

Ruby is beautiful, powerful, and in some cases, wacky. I've only just scratched the surface of the weird convoluted things you can do with Ruby's syntax. What's most interesting to me though is that Ruby programmers rarely touch these strange bits unless they're trying to be cute...

Compared to another language I used to always use (it starts with a P and is four letters, but I seem to have completely forgotten its name...), it's surprising how rarely real Ruby code ventures into the land of the indecipherable.

Both Ruby and the previously mentioned forgotten language are very powerful, and can get a lot of the same things done, but it seems that Ruby code ends up being written in a largely idiomatic way. Ruby may be a rat's nest for writing a coherent specification of the language, but the Ruby that people write using this underspecified language ends up looking really nice.

