This is the fifth part of the State Machine Executor series. For your convenience you can find other parts in the table of contents in State Machine Executor Part 1 — Introduction

Being able to describe side effects instead of executing them may sound great, but it has one significant drawback – the side effects need to be described completely before they are handed off to the executor. Building an object describing the action may cause significant memory usage. Let’s see how to fix that.

Streaming

We’d like to be able to stream the data. Let’s say that we have the following Action describing web request to execute:

class HttpAction {
	public string Url;
	public string Method;
	public byte[] Body;
}

class HttpAction {

public string Url;

public string Method;

public byte[] Body;

}

See the Body field. It holds the entire payload to be sent. Creating such a payload and storing it in memory will increase the memory usage and decrease scalability. To avoid that, we should have something like this:

class HttpAction {
	public string Url;
	public string Method;
	public Stream<byte> Body;
}

class HttpAction {

public string Url;

public string Method;

public Stream<byte> Body;

}

Looks great, but it doesn’t solve any problem. Remember that the state machine must create the action object and hand it over to the executor. The state machine won’t be able to run any code until the action is executed. This means that the Stream must be filled with the data, so we still have the problem with high memory usage.

Instead of passing the stream, we could pass a stream generator. That could be a lambda or some other interface with yield keyword:

class HttpAction {
	public string Url;
	public string Method;
	public IEnumerable<byte> Body;
}

class HttpAction {

public string Url;

public string Method;

public IEnumerable<byte> Body;

}

Looks slightly better, but still has issues. If Body wraps any local variables into a closure, then the memory will not be released until the stream is read. Not to mention that it’s much harder to persist the HttpAction object to provide reliability.

Solution

To solve the problem, we need to effectively stream the data. However, since the actions are executed after the state machine is done, we need to stream the data somewhere else – to a local file.

The executor can provide the following abstraction:

class Env{
	public FileWrapper CreateFile();
	public FileWrapper ReadFile(string identifier);
}

class FileWrapper {
	public string Identifier;
	public File FileHandle;
	public void Commit();
}

class Env{

public FileWrapper CreateFile();

public FileWrapper ReadFile(string identifier);

}

class FileWrapper {

public string Identifier;

public File FileHandle;

public void Commit();

}

Now, the state machine can call CreateFile to get a temporary file. Next, the state machine can stream the content to the file. Finally, the state machine calls Commit to indicate to the executor that the file is ready to be persisted. The executor can then upload the file to the persistent store.

Last but not least, we need to modify the action definition:

class HttpAction {
	public string Url;
	public string Method;
	public string BodyFileIdentifier;
}

class HttpAction {

public string Url;

public string Method;

public string BodyFileIdentifier;

}

The action executor can now stream the body from the file. If something fails, the file can be retrieved from the persistent storage and the action can be retried.

This solution is not perfect, though. The data is streamed twice which slows everything down. That’s an obvious trade-off.