.NET Inside Out Part 7 — Generating Func from a bunch of bytes in C#

This is the seventh part of the .NET Inside Out series. For your convenience, you can find the other parts in the table of contents in Part 1 – Virtual and non-virtual calls in C#.

In Capturing thread creation to catch exceptions, we generated a method from an array of bytes and replaced the default Thread constructor to capture exceptions. Today we are going to extend that example to create a “library” for creating any Action or Func from a bunch of bytes. Let’s go.

Idea

As we know, memory is just a pile of bytes. It is up to us whether we treat something as code or as data. We also know that we can allocate an object on the stack using TypedReferences, so we can allocate it anywhere. Nothing stops us from generating a byte array with some machine code and jumping there. The thing is, we would like to have decent helper methods for doing that, so we don’t need to handcraft jumps and calls all the time with a debugger in hand.

So today we are going to use .NET mechanisms to create a dynamic function from a bunch of bytes. How can we generate code in C#? We can use Reflection.Emit. With this tool in hand we can emit IL opcodes, compile them dynamically at runtime, and get a delegate to call the code. We are going to do something similar, but instead of IL opcodes we will use machine code.
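For comparison, here is a minimal sketch of the Reflection.Emit route for a function that adds 4 to its argument. The names EmitDemo and BuildAddFour are placeholders of mine, not part of any library.

```csharp
using System;
using System.Reflection.Emit;

public static class EmitDemo
{
    // Emits IL for "return value + 4" and lets the runtime compile it.
    public static Func<int, int> BuildAddFour()
    {
        var method = new DynamicMethod("AddFour", typeof(int), new[] { typeof(int) });
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);  // push the argument
        il.Emit(OpCodes.Ldc_I4_4); // push the constant 4
        il.Emit(OpCodes.Add);      // add the two values
        il.Emit(OpCodes.Ret);      // return the sum
        return (Func<int, int>)method.CreateDelegate(typeof(Func<int, int>));
    }

    public static void Main()
    {
        Console.WriteLine(BuildAddFour()(5)); // prints 9
    }
}
```

The rest of this post aims for the same end result, a delegate we can call, but with raw machine code in place of the IL.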

So we assume that the user has an array of bytes representing code to execute (basically any code conforming to the architecture, calling convention and savoir vivre) and wants us to return a delegate for this code. To do that, we will create a new delegate dynamically and modify the code of the method it points to, so that it jumps to the code provided by the user. This is the idea at a glance.

Let’s go.

Implementation

We start with the following class:

We want the user to inherit from it and add a stub method which we will use to do the jumping. So the user needs to create something like this:

The method conforms to the Func<int, int> signature, so we can create a delegate of that type. The method needs to be named Stub and have at least 6 bytes of body. We will not (!) execute its code, so the user can put anything there. The Target field is used internally as a pointer to the actual machine code. With this skeleton the type checks happen at runtime, which is not ideal since we would like to have them during compilation, but it is good enough.
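A minimal sketch of that skeleton could look as follows. Only Stub and Target are names from this post; StubBase, FuncIntInt, BindingDemo and CanBind are placeholders of mine, and typing Target as IntPtr is an assumption. The sketch only demonstrates the runtime signature check and does not patch anything.

```csharp
using System;

// Hypothetical base class: it only carries the address of the machine code.
public class StubBase
{
    public IntPtr Target; // assumed to be pointer-sized
}

// One subclass per delegate signature; this one matches Func<int, int>.
public class FuncIntInt : StubBase
{
    // Placeholder body. In the scheme described above it gets overwritten,
    // so it only has to compile to enough bytes of machine code.
    public int Stub(int value)
    {
        Console.WriteLine("placeholder");
        return value + 1;
    }
}

public static class BindingDemo
{
    // Returns true when Stub on the given instance can be bound to TDelegate.
    public static bool CanBind<TDelegate>(StubBase stub)
    {
        // throwOnBindFailure: false makes CreateDelegate return null on a mismatch.
        return Delegate.CreateDelegate(typeof(TDelegate), stub, "Stub", false, false) != null;
    }

    public static void Main()
    {
        var stub = new FuncIntInt();
        Console.WriteLine(CanBind<Func<int, int>>(stub));       // signature matches
        Console.WriteLine(CanBind<Func<string, string>>(stub)); // signature does not match
    }
}
```

The first call should succeed and the second should fail. The mismatch is detected only when the delegate is created, never by the compiler.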

So the user will call our library in the following manner:

So the user gives us the correct types (the delegate type and the type with the method for plumbing) and a bunch of bytes. In this example, the machine code simply takes the parameter (which is in the edx register according to the 32-bit x86 .NET calling convention), adds 4 and returns the result. Please note that the Stub method is not static, because our delegate must have a target which we use to do the jump. This has a very funny implication which we will see later.
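One possible encoding of such machine code is sketched below. The class name AddFourCode is a placeholder of mine, and the bytes assume a 32-bit x86 process where the integer argument arrives in edx and the result is returned in eax.

```csharp
using System;

public static class AddFourCode
{
    // x86 (32-bit) machine code for "return value + 4".
    // Assumption: the argument is in edx and the result goes to eax.
    public static byte[] Build()
    {
        return new byte[]
        {
            0x89, 0xD0,       // mov eax, edx
            0x83, 0xC0, 0x04, // add eax, 4
            0xC3              // ret
        };
    }

    public static void Main()
    {
        // Only prints the bytes; nothing is executed here.
        Console.WriteLine(BitConverter.ToString(Build()));
    }
}
```

These six bytes are the kind of array the user would hand over together with the delegate type and the stub type.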

Okay, let’s use some helper code:

Most of this code should be obvious if you are familiar with WinAPI and .NET memory management. If not, please refer to my other posts or just google it. Basically, it is used to pin memory and unlock it, so that the OS allows us to do the hard part.

Okay, let’s now move on to the meat:

We first pin the array of bytes (with the machine code) and calculate its address in memory. Since the array object starts with two integers (the array type and the size), the actual data begins 8 bytes further.

Next, we unlock the page with the array so we can execute the code there.
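These two steps can be sketched as below. This is a sketch under assumptions, not the helper code of this post: MemoryHelpers, Pin and UnlockPage are placeholder names, and instead of adding 8 to the object address by hand it uses GCHandle.AddrOfPinnedObject, which already returns the address of the first element. VirtualProtect is the real WinAPI function, so the unlocking part is guarded to run on Windows only.

```csharp
using System;
using System.Runtime.InteropServices;

public static class MemoryHelpers
{
    private const uint PAGE_EXECUTE_READWRITE = 0x40;

    [DllImport("kernel32.dll", SetLastError = true)]
    private static extern bool VirtualProtect(
        IntPtr lpAddress, UIntPtr dwSize, uint flNewProtect, out uint lpflOldProtect);

    // Pins the array so the GC cannot move it and returns the address of element 0.
    // The handle is deliberately never freed: the code has to stay in place
    // for as long as a delegate may jump into it.
    public static IntPtr Pin(byte[] data)
    {
        GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        return handle.AddrOfPinnedObject();
    }

    // Marks the pages covering [address, address + size) as executable. Windows only.
    public static bool UnlockPage(IntPtr address, int size)
    {
        uint oldProtection;
        return VirtualProtect(address, (UIntPtr)(uint)size, PAGE_EXECUTE_READWRITE, out oldProtection);
    }

    public static void Main()
    {
        byte[] code = { 0x89, 0xD0, 0x83, 0xC0, 0x04, 0xC3 };
        IntPtr address = Pin(code);

        // Sanity check: the address really points at the first byte of the array.
        Console.WriteLine(Marshal.ReadByte(address) == code[0]);

        if (Environment.OSVersion.Platform == PlatformID.Win32NT)
        {
            Console.WriteLine(UnlockPage(address, code.Length));
        }
    }
}
```

Changing the protection affects whole pages, so anything else living on the same page becomes executable as well.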

Next, we create an instance of the stub class and set its Target field to the address of the machine code. Then we create the delegate, and this is where .NET performs the first runtime type check (the second one is at the end, where we cast the delegate).

Finally, the most important part. We first get a pointer to the stub function (which the user provides in the subclass) and overwrite its body. In that function we want to read the Target field of the delegate target (target with a lowercase “t” is the instance to which the delegate is bound, whereas Target with an uppercase “T” is the address of our machine code) and jump to that code. We could use jmp here, but an ordinary jump requires a relative address, so instead of recalculating it we simply push the address on the stack and return from the function. We could also put the address in a register and do jmp eax, whichever you prefer.
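To make the two variants concrete, here is one possible encoding of each. This is a sketch under assumptions: StubCode and its methods are placeholder names, the process is 32-bit, the delegate target is in ecx, and Target is the first instance field, so it sits at offset 4, right after the method table pointer. Verify the offset for your own layout before using anything like this.

```csharp
using System;

public static class StubCode
{
    // Jumps to the address stored in a field of the object referenced by ecx,
    // using the push + ret trick. fieldOffset must fit in a signed 8-bit displacement.
    public static byte[] BuildPushRet(byte fieldOffset)
    {
        return new byte[]
        {
            0xFF, 0x71, fieldOffset, // push dword ptr [ecx + fieldOffset]
            0xC3                     // ret (pops the pushed address into eip)
        };
    }

    // The register-based alternative: load the address and jump through eax.
    public static byte[] BuildJmpEax(byte fieldOffset)
    {
        return new byte[]
        {
            0x8B, 0x41, fieldOffset, // mov eax, dword ptr [ecx + fieldOffset]
            0xFF, 0xE0               // jmp eax
        };
    }

    public static void Main()
    {
        // Only prints the bytes; nothing is patched or executed here.
        Console.WriteLine(BitConverter.ToString(BuildPushRet(4)));
        Console.WriteLine(BitConverter.ToString(BuildJmpEax(4)));
    }
}
```

Neither variant touches ecx or edx, so the arguments passed by the delegate invocation reach the target code unchanged. Once the page holding the stub is writable, the bytes could be copied over its body with Marshal.Copy.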

Okay, that’s it! Now we can trivially create any delegate just by creating one subclass and providing machine code (by the way, you can easily generate it with this tool).

Messing with jumps and target type

I mentioned that the Stub method is not static and that this has funny implications. See this code:

We have a class for the Action<int> delegate, nothing fancy. Next, we have a helper method written in C# to execute some code easily. Finally, we get the address of the helper method, and in our machine code we simply jump to it (using the same trick with an absolute jump as before). Now, when we call function(5), we call the Stub method using ordinary .NET mechanisms (delegate invocation etc.). Next, our Stub method performs an absolute jump to our array with machine code. Finally, from that array we jump to MyWriteLine. But! Since we didn’t change any registers (apart from eip and esp obviously, but the latter is restored after each retn), we still have the two parameters passed by the .NET delegate: the target in ecx and the integer value in edx. And the target type doesn’t match! It is ActionInt instead of the actual type containing MyWriteLine! Of course .NET isn’t aware of that.
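The machine code for such an absolute jump can be built as in the sketch below. AbsoluteJump and Build are placeholder names of mine, and the address is assumed to be a 32-bit address of already compiled code, obtained for example from the method handle of the helper.

```csharp
using System;

public static class AbsoluteJump
{
    // push imm32; ret
    // Transfers control to 'address' without computing a relative offset.
    public static byte[] Build(uint address)
    {
        return new byte[]
        {
            0x68,                          // push imm32
            (byte)(address & 0xFF),        // the immediate, stored little-endian
            (byte)((address >> 8) & 0xFF),
            (byte)((address >> 16) & 0xFF),
            (byte)((address >> 24) & 0xFF),
            0xC3                           // ret (pops the address into eip)
        };
    }

    public static void Main()
    {
        // Only prints the bytes for a made-up address; nothing is executed here.
        Console.WriteLine(BitConverter.ToString(Build(0x12345678))); // prints 68-78-56-34-12-C3
    }
}
```

The push and the ret cancel each other out on the stack and no other register is written, which is exactly why ecx still holds the ActionInt instance when MyWriteLine starts running.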

Summary

Throughout our journey we managed to allocate memory anywhere, and now we can execute basically any code with just a little plumbing, so we can actually write assembler in C#. Why would we ever do that? In one of the next posts we will see how to use it to capture all exceptions, including StackOverflowException. Moreover, if you now wire up any asm -> machine code translator, you can easily create a runtime asm execution engine directly in C#.

You can find the code here. I tested this in Visual Studio 2015 15.5.0 on Windows 10 Enterprise x64 with .NET 4.5 and Release Any CPU profile.