State Machine Executor Part 4 — Timeouts, exceptions, suspending

This is the fourth part of the State Machine Executor series. For your convenience you can find other parts in the table of contents in State Machine Executor Part 1 — Introduction

Let’s discuss how to improve reliability of our state machines.

How machines are executed

In part 1, we defined the contract for triggering a single transition. Each transition returns instructions what actions to execute and what transition to call next. We then run in a loop until the state machine is completed.

We can modify this mechanism to deal with crashes, errors, and other undersired effects. Let’s revisit the loop that we defined in part 2:

We read the store before entering the loop. In each loop iteration, we pass the store to the transition, and then update the state and execute actions. We’re now going to modify this solution.

Suspending

The first thing to support is suspension of the state machine. If the machine decides that it needs to wait, it can indicate that in the TransitionResult:

We can now include that in the loop handling:

We can then proceed with the state machine when the time comes. We can obviously extend that to support sleep or waiting for some condition.

Exceptions

We need to handle unexpected crashes as well. We simply catch the exception and then we need to let the state machine know it happened. We can do that by redirecting the state machine to a well-known transition:

We can obviously extend that to give access to the exception or add any additional details.

Timeouts

We would also like to terminate the state machine if it runs for tool long. There are two ways to do that: we can terminate it the hard way by interrupting the thread (in a preemtive way), or we can wait for it to complete the transition (in a cooperative way). No matter what happens, we may want to redirect the state machine to a well-known transition for handling timeouts:

Notice that we stop processing after the timeout transition. Had we not do that, we would run in an infinite loop. If you don’t want to terminate the processing, then make sure you don’t run into rerouting the state machine constantly.

Summary

Next time, we’re going to see how to deal with data streaming and why it’s needed.