Note: Today's post is from guest contributor John Mettraux. John is a developer living east of Macau. He likes reading poetry and obscure Prussian theories since, sometimes, he can disguise some of them as software.

Wayne: We've started using ruote to execute our provisioning process.

Stamford: I've heard of it, it's some kind of state machine workflow thing, isn't it?

Wayne: Not really, it's more like an interpreter: you pass it a process definition, much like a program, and it runs it for you. A state machine is declarative, while a program is, well, imperative.

Stamford: What does the provisioning process "definition" look like then?

Wayne: Here is my first version:

Ruote.process_definition 'provisioning' do
  customer :task => 'choose configuration'
  automation :task => 'setup system'
  accounting :task => 'invoice customer'
  customer :msg => 'system ready notification'
end

Stamford: What are those customer, automation, accounting? Method calls?

Wayne: No, well... Yes, it's Ruby code; when it's executed, it produces a kind of 'Abstract Syntax Tree'. For instance:

pdef = Ruote.process_definition do
  customer :task => 'choose configuration'
  automation :task => 'setup system'
end

will store this tree in the variable pdef:

[ "define", {}, [
  [ "customer", { "task" => "choose configuration" }, [] ],
  [ "automation", { "task" => "setup system" }, [] ] ] ]

ruote will interpret this tree, not the Ruby code.
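As an aside, the way a Ruby block can turn into such a tree can be sketched in a few lines. This is not ruote's actual implementation: TreeBuilder and sketch_process_definition are hypothetical names, and the sketch assumes that every unknown method call inside the block simply becomes a node.

```ruby
# Illustrative only: a tiny tree builder, not ruote's own code.
class TreeBuilder
  attr_reader :children

  def initialize
    @children = []
  end

  # Any unknown method call (customer, automation, ...) becomes a node:
  # [ name, attributes with string keys, child nodes ].
  def method_missing(name, attributes = {}, &block)
    node_children = block ? TreeBuilder.build(&block) : []
    @children << [name.to_s, attributes.transform_keys(&:to_s), node_children]
    nil
  end

  def respond_to_missing?(_name, _include_private = false)
    true
  end

  # Evaluates the block against a fresh builder and returns its nodes.
  def self.build(&block)
    builder = new
    builder.instance_eval(&block)
    builder.children
  end
end

# Hypothetical counterpart of Ruote.process_definition.
def sketch_process_definition(&block)
  ['define', {}, TreeBuilder.build(&block)]
end

tree = sketch_process_definition do
  customer :task => 'choose configuration'
  automation :task => 'setup system'
end

p tree
```

The point of the sketch is only that the Ruby code runs once, at definition time, and leaves behind plain arrays, hashes and strings.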

Stamford: It means I could also produce this tree from other "sources"?

Wayne: Exactly, ruote comes with an XML reader, with a 1 to 1 mapping to its Ruby DSL, but you could imagine going further and interpret other "formats" to produce ruote ASTs.
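As an aside, because the tree is made only of arrays, hashes and strings, any format that can express those can serve as a source. The sketch below assumes JSON as one such format and uses a hypothetical tree_node? helper to check the shape; it says nothing about which readers ruote itself ships with.

```ruby
require 'json'

# The same tree as above, written as JSON text.
source = <<~JSON
  ["define", {}, [
    ["customer",   {"task": "choose configuration"}, []],
    ["automation", {"task": "setup system"}, []]
  ]]
JSON

# Hypothetical check: a node is [ name, attributes, children ], recursively.
def tree_node?(node)
  node.is_a?(Array) && node.size == 3 &&
    node[0].is_a?(String) && node[1].is_a?(Hash) &&
    node[2].is_a?(Array) && node[2].all? { |child| tree_node?(child) }
end

tree = JSON.parse(source)
puts tree_node?(tree)
```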

Stamford: OK, so ruote will take as input a process definition, ... Those method calls, well, I don't like them, the method names should be verbs, not nouns...

Wayne: You're right. They're nouns because they are participant names. The actual action is passed as value of the :task attribute. Ruote is a workflow engine, what you see is a sequence of work passing from the participant 'customer' to 'automation' up until 'customer' again for a final notification.
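As an aside, "work passing from participant to participant" can be pictured with a very small interpreter loop. This is a sketch under heavy assumptions: run_sequence and the plain-hash workitem are hypothetical, and every participant here merely records what it was asked to do, whereas a real engine hands work off and waits for replies.

```ruby
# A tree in the shape shown earlier: [ name, attributes, children ].
tree = ['define', {}, [
  ['customer',   { 'task' => 'choose configuration' }, []],
  ['automation', { 'task' => 'setup system' }, []],
  ['accounting', { 'task' => 'invoice customer' }, []],
  ['customer',   { 'msg' => 'system ready notification' }, []]
]]

# Hypothetical helper: hands the workitem to each participant in turn.
# The participant name is the noun, the :task or :msg value is the action.
def run_sequence(tree, workitem)
  _name, _attributes, children = tree
  children.each do |participant, attributes, _children|
    action = attributes['task'] || attributes['msg']
    workitem['trail'] << "#{participant}: #{action}"
  end
  workitem
end

workitem = run_sequence(tree, { 'customer_id' => 'xyz', 'trail' => [] })
puts workitem['trail']
```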

Stamford: Got it, but for the last one, customer again, you went with a :msg instead of a :task.

Wayne: Yes, since it's a notification and not a task. Of course ruote itself can't really distinguish between English nouns and verbs, it's a convention.

Stamford: OK, so now, what are "participants"? Am I right, when I say they take "part" in the work flow?

Wayne: Exactly. The simplest of the participants take blocks of Ruby code:

engine.register_participant 'accounting' do |workitem|
  Invoice.new(
    workitem.fields['customer_id'],
    workitem.fields['items']
  ).save!
end

More complicated participant implementations are Ruby classes that have a consume(workitem) and a cancel(id, flavour) method. The consume method is equivalent to the block of a block participant. The cancel method is called when the process instance is cancelled and the actions initiated by the participant need to be rolled back or suspended somehow.
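As an aside, the shape of such a class can be sketched without ruote at all. Only the two method names, consume and cancel, come from the description above; InvoicingParticipant, SketchWorkitem and the in-memory bookkeeping are hypothetical, and a real participant would also hand the workitem back to the engine when done.

```ruby
# Stand-in for a ruote workitem; the real class is richer than this.
SketchWorkitem = Struct.new(:id, :fields)

# Hypothetical participant: remembers the invoices it has opened so that
# cancel can roll them back.
class InvoicingParticipant
  attr_reader :open_invoices

  def initialize
    @open_invoices = {}
  end

  # Same job as the block of a block participant.
  def consume(workitem)
    @open_invoices[workitem.id] = {
      customer_id: workitem.fields['customer_id'],
      items: workitem.fields['items']
    }
  end

  # Called on cancellation: undo what consume started.
  # The flavour argument is accepted but ignored in this sketch.
  def cancel(id, _flavour)
    @open_invoices.delete(id)
  end
end

participant = InvoicingParticipant.new
workitem = SketchWorkitem.new(
  'wi-1', { 'customer_id' => 'xyz', 'items' => ['small instance'] })

participant.consume(workitem)
opened = participant.open_invoices.keys

participant.cancel('wi-1', nil)
```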

Stamford: What about human participants?

Wayne: Storing workitems for "human consumption"? Ruote comes with a StorageParticipant that you can use to build up more complex task / workitem lists. As you've seen ruote hands workitems to participants...

Stamford: You sound like a salesman, are you telling me there is a place where tasks for humans get dumped and we can just pick them up there and do the work?

Wayne: Right.

Stamford: (looks at the process definition Wayne is working on) A process definition... indeed it looks like a program... Your version is too simplistic. Remember, we do immediately charge the customer, the bank system tells us if the customer has got credit.

Wayne: OK, let me do a few adaptations then.

Ruote.process_definition 'provisioning' do

  customer :task => 'choose configuration'
  accounting :task => 'check credit'

  customer :msg => 'credit notification'
  terminate :if => "${credit} != 'ok'"

  concurrence do
    sequence do
      automation :task => 'setup system'
      customer :msg => 'system ready'
    end
    accounting :task => 'charge customer'
  end
end

Stamford: With this "concurrence", you're letting the 'setup system' task happen while the 'charge customer' is running too?

Wayne: Yes, oh wait, let's not charge until we're sure the system is up and running.

Ruote.process_definition 'provisioning' do

  customer :task => 'choose configuration'
  accounting :task => 'check credit'

  customer :msg => 'credit notification'
  terminate :if => "${credit} != 'ok'"

  automation :task => 'setup system'
  customer :msg => 'system ready'

  accounting :task => 'charge customer'
end

Stamford: Very flat now, but what happens when the 'setup system' task fails?

Wayne: By default, ruote will stop the failed process branch and log the error. The administrator then has the opportunity to investigate, fix the cause and then either kill the process instance or replay it at the point of failure (now that the obstacle is gone).

Stamford: How does the administrator get notified?

Wayne: It's a bit of an advanced subject, but I wired a notification "participant" like this:

engine.on_error = 'sms_admin_notification'

engine.register_participant(
  'sms_admin_notification', Acme::SmsNotifier, :number => '38047272')

Stamford: Simply handing a workitem to the notifier.

Wayne: Yes, I could have wired a sub-process for more complex behaviours, notify multiple participants and so on...

Stamford: What happens when the process changes?

Wayne: You write a new process definition.

Stamford: And you restart the application?

Wayne: Well, not really, remember, you feed the engine an AST of the process definition.

pdef = Ruote.process_definition do
  customer :task => 'choose configuration'
  automation :task => 'setup system'
end

# ...

engine.launch(pdef, 'customer_id' => 'xyz')

Stamford: Yes, pdef is changed at some point...

Wayne: The engine is passed the process definition at launch time, like you pass a path to an executable to your operating system and it executes it in a process. You can call launch multiple times:

id0 = engine.launch(pdef_x, 'customer_id' => 'xyz')
id1 = engine.launch(pdef_x, 'customer_id' => 'xyz')
id2 = engine.launch('http://pdefs.inner.example.org/def1')

launch will return immediately.

Stamford: What's this 'customer_id' you're passing?

Wayne: Ah, it's one of the initial fields for the workitem that the process instance circulates among its participants.

Stamford: So if I understand correctly, this "workflow engine" is like an operating system?

Wayne: You could say so, it can host multiple process instances, launched from various process definitions.

Stamford: Still, my question about change...

Wayne: OK, I didn't answer it completely. Put simply, the process definition changes, but running process instances keep following the definition as it was passed at launch time. Ruote doesn't keep a 'registry' of process definitions; it runs whatever is passed at launch time, so there is no registry to consult while running the instance. Process instances therefore follow their initial track.
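As an aside, "instances follow the definition as passed at launch time" can be pictured as each instance keeping its own copy of the tree. SnapshotEngine and tree_of are hypothetical names and this is not how ruote stores things internally; the sketch only shows the observable behaviour.

```ruby
# Hypothetical engine stand-in: keeps, per instance, a copy of the tree
# it was launched with.
class SnapshotEngine
  def initialize
    @instances = {}
    @counter = 0
  end

  # Deep-copies the tree so later edits to the definition do not leak in.
  def launch(tree, fields = {})
    @counter += 1
    id = "instance-#{@counter}"
    @instances[id] = {
      tree: Marshal.load(Marshal.dump(tree)),
      fields: fields.dup
    }
    id
  end

  def tree_of(id)
    @instances.fetch(id)[:tree]
  end
end

pdef = ['define', {}, [
  ['customer', { 'task' => 'choose configuration' }, []],
  ['automation', { 'task' => 'setup system' }, []]
]]

engine = SnapshotEngine.new
old_id = engine.launch(pdef, 'customer_id' => 'xyz')

# The procedure changes: accounting now checks credit before the setup.
pdef[2].insert(1, ['accounting', { 'task' => 'check credit' }, []])
new_id = engine.launch(pdef, 'customer_id' => 'abc')

puts engine.tree_of(old_id)[2].size # steps of the instance launched earlier
puts engine.tree_of(new_id)[2].size # steps of the instance launched afterwards
```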

Stamford: And if a process instance must change? With a state machine, I could just change the set of states and transitions and all the objects that follow it would adapt instantly.

Wayne: What's wrong with a tool that follows a plan until its end? "The new procedure will enter into action by March, cases started before March will follow the old procedure" sounds familiar?

Stamford: OK, there are cases where I want processes to follow the initial plan, but what about when I need to change running business processes?

Wayne: There are a few variants and they depend on the definitions and on the level of change. Basically, a) you cancel the process instance and restart it or b) you re-apply the changed branch of the process instance and pass the new process definition fragment to the branch.

Stamford: Sounds good. Now why are you against state machines?

Wayne: Sorry, I am not against them. I think they are great for protocols and lifecycles, whenever there is 1 state being tracked. As you can see from the process definition we've worked on, we're affecting multiple states.

Stamford: Not really, the states are "choosing configuration", "checking credit background", "preparing system", "system prepared", ...

Wayne: I mean, the process modifies the state of the user record, it creates an invoice record, it creates system resources, ... It's not about the state of a single resource. In fact we use state machines to track the lifecycle of some of our resources, but ruote orchestrates the transitions across several of those resources.

Stamford: I could implement a "workflow" resource and attach a dedicated state machine to each instance.

Wayne: How would you then model such a workflow?

Ruote.process_definition do
  alice :task => 'prepare batch'
  concurrence do
    bob :task => 'deal with normal items'
    charly :task => 'deal with items of category c'
    doug :task => 'deal with items of category d'
  end
  alice :task => 'finalize batch'
end

When in the concurrence, the workflow is in three states simultaneously.
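As an aside, the "three states simultaneously" idea can be sketched with plain Ruby threads standing in for the three participants. This is an analogy, not ruote's execution model: the branch table and the Queue used to collect results are assumptions made for the illustration.

```ruby
# Sketch only: threads stand in for participants working at the same time.
done = Queue.new

branches = {
  'bob'    => 'deal with normal items',
  'charly' => 'deal with items of category c',
  'doug'   => 'deal with items of category d'
}

# All three branches are in progress until every thread has been joined.
threads = branches.map do |participant, task|
  Thread.new { done << "#{participant}: #{task}" }
end
threads.each(&:join)

# Only once the three branches have replied does the flow move on.
finished = Array.new(done.size) { done.pop }.sort
puts finished
puts 'alice: finalize batch'
```

A single "current state" field cannot describe the moment between the first and the last join, which is the difficulty Wayne is pointing at.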

Stamford: Well, I would make a single "deal with items" state and put the task separation logic in the task controller.

Wayne: OK, it's doable, to each his own mindset. I tend to see "workflow" as "work flow", control flow for work items. Workflows are first class for me, I want to be able to start, pause, cancel them. Your "workflow" resource plus state machine approach is certainly valid. I've seen other schools where the resource is the "task" and the state takes its value from a set of user names or role names.

Stamford: I still like state machines.

Wayne: You like domino effects. I prefer to have limited local domino effects and workflows that trigger them, here and there. I see the flow as an external, separate force that changes the state, not as a cascade of state changes and their secondary effects.

Stamford: ...

Wayne: I want a "work flow", I don't want a network of emergent side-effects.

Stamford: You're quite lyrical today. Now, how does this "ruote" thing fit in Rails?

Wayne: It's a Ruby library, isn't that a sufficient answer?

Stamford: Yes, but...

Wayne: Sorry. There is a ruote-kit Rack application that you can put in your Rails application; it exposes processes, workitems and schedules with HTML and JSON representations. There is also a ruote-on-rails example that wraps ruote and ruote-kit and is more polished, though quite beta right now.

Stamford: So I could look at this ruote-kit, and run ruote inside of my Rails application?

Wayne: Yes, though, perhaps, it'd be better to consider ruote as an external thing, like a database and run ruote workers outside of your Rails application. Well, the easy variant, for small organizations, is one Rails application with one ruote system embedded. The other extreme would be a ruote storage shared by one or more ruote workers and accessed by one or more Rails applications.

Stamford: Like I have this work queue and its workers and I sometimes place work there and fetch it back later?

Wayne: Exactly. The "queue" for ruote is a "storage". There are multiple implementations, from in-memory, to CouchDB or Redis. The Redis storage is getting popular. A Rails application can integrate ruote-kit and point to a storage to query ruote about its process instances, cancel them, launch new ones, list workitems, pass them back, and so on.

Stamford: I'm not forced to use ruote-kit and its resources, can I directly tap into ruote via a storage?

Wayne: Yes. You could connect to a Redis storage with something like:

require 'redis'
require 'ruote'
require 'ruote-redis'

engine = Ruote::Engine.new(
  Ruote::Redis::Storage.new(::Redis.new(:db => 15, :thread_safe => true)))

Stamford: Engine, engine, ... Shouldn't it be "Dashboard"?

Wayne: Yes, you're probably right, especially since the active part, the worker, is not present.

Stamford: Can I write?

require 'redis'
require 'ruote'
require 'ruote-redis'

engine = Ruote::Engine.new(Ruote::Worker.new(
  Ruote::Redis::Storage.new(::Redis.new(:db => 15, :thread_safe => true))))

Wayne: Ah, yes, of course. Your application would thus host a worker. This kind of code is common for one app, one engine environments.

Stamford: Could I launch a workflow when some request comes in? Or when one of my models/records changes?

Wayne: Yes, from a controller or even better from your model.

Stamford: How about "human intervention" in my workflows?

Wayne: Present your users with a list of workitems they're allowed to modify and proceed.

engine.register_participant(/^user_.+/, Ruote::StorageParticipant)

# ...

workitems = engine.storage_participant.by_participant('user_' + username)

Stamford: Proceed?

Wayne: This storage participant is a task list. When your user is done with a workitem, hand it back to the storage participant so that the workitem can resume its trip in the process instance.

workitem.fields['message'] = 'Kilroy was here'
engine.storage_participant.reply(workitem)
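As an aside, the task-list behaviour described here can be sketched in memory. The method names by_participant and reply mirror the calls above; SketchTaskList, its consume method, the hash-based workitem and the on_reply callback are hypothetical and stand in for the real storage and engine.

```ruby
# Hypothetical in-memory task list, for illustration only.
class SketchTaskList
  def initialize(&on_reply)
    @workitems = []
    @on_reply = on_reply
  end

  # The engine side parks a workitem here instead of running code.
  def consume(workitem)
    @workitems << workitem
  end

  # What a user interface would call to list one user's tasks.
  def by_participant(name)
    @workitems.select { |workitem| workitem[:participant] == name }
  end

  # Removes the workitem from the list and hands it back to the flow.
  def reply(workitem)
    @workitems.delete(workitem)
    @on_reply.call(workitem)
  end
end

resumed = []
task_list = SketchTaskList.new { |workitem| resumed << workitem }
task_list.consume(
  { participant: 'user_alice', fields: { 'customer_id' => 'xyz' } })

workitem = task_list.by_participant('user_alice').first
workitem[:fields]['message'] = 'Kilroy was here'
task_list.reply(workitem)
```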

Stamford: OK. Workitems transiting in a process instance, sometimes surfacing in a storage participant... Now, what are ruote's drawbacks?

Wayne: We've seen that it needs at least one worker. That worker consumes a bit of your CPU and may even require its own process.

Stamford: Right, like all those systems that have to run something out of the request/reply loop.

Wayne: Spoken like a true web developer...

Stamford: Another drawback please.

Wayne: We've been talking for a while now, let's add "not so straightforward". A Rails developer tends to equate workflow with "development workflow" or "state machine workflow", and is not so used to differentiating "business entity lifecycle" from "business workflow".

Stamford: Enough preaching.

