This is the third part of Custom memory allocation series. For your convenience you can find other parts in the table of contents in Part 1 — Allocating object on a stack

Hi! We already know how to allocate objects on a stack, we also evaluated the performance of list storing objects instead of references. In this part we are going to do two things. First, we will implement basic memory allocator which will be able to allocate any object from scratch (no copying this time). Second, we will hijack new operator in order to be able to decide where objects should be allocated. Finally, we will be able to instrument CLR where it should store almost all objects. Let’s go!

Memory management

We already know that every reference object has pointer to method table. In first part we allocated object on a stack, however, we had to have already initialized object in order to copy its method table pointer. But in order to implement custom memory allocator, we would like to be able to allocate object without using additional information. We can extract this value using the following line:
typeof(T).TypeHandle.Value
We also need to know the object size. In part two we defined the object size by hand in code, however, we can extract this value from method table. It is stored 4 bytes after the beginning of the method table, so we are able to find this value using method table pointer.

Having all this details we are able to implement basic memory allocator:

class GenericMemoryAllocator
{
	// Fields needs to be static in order to be accessible from RawAllocate
	static private int[] memory;
	static private int currentOffset;
	static private object dummy;

	public GenericMemoryAllocator()
	{
		memory = new int[102400]; // Array big enough to be stored in Gen 2
		currentOffset = 0;
	}

	public unsafe T Allocate< T >()
	{
		var methodTable = typeof(T).TypeHandle.Value; // Get handle to method table
		RawAllocate(methodTable); // Allocate object and set field, also JIT method
		return (T)dummy;
	}

	public static unsafe IntPtr RawAllocate(IntPtr methodTable)
	{
		// Calculate the object size by extracting it from method table and dividing by int size. We assume that the size starts 4 bytes after the beginning of method table (works on .NET 4 and .NET 3.5)
		int objectSize = Marshal.ReadInt32(methodTable, 4) / sizeof(int); 
		// Skip sizeof(int) bytes for syncblock
		currentOffset++; 
		// Write address to method table
		memory[currentOffset] = (int)methodTable; 

		// Get handle for newly created object
		TypedReference newObjectReference = __makeref(dummy); 
		// Get handle for memory
		TypedReference memoryReference = __makeref(memory); 
		// Calculate the address of spawned object. We need to add 2 since we need to skip method table of array and size of the array
		var spawnedObjectAddress = (*(int*)(*(int*)(&memoryReference)) + (currentOffset + 2) * sizeof(int)); 
		// Modify handle for new object using address of existing memory
		*(int*)(*(int*)&newObjectReference) = spawnedObjectAddress; 

		// Move within the memory
		currentOffset += objectSize; 
		return (IntPtr) (int*)* (int*)(*(int*)&newObjectReference);
	}

}

First, in lines 4-6 we declare variables: integer array where we will store our objects, offset within that array, and reference to dummy object. Next, we create array big enough to be stored in GC generation 2. We will utilize this to verify that our allocator is working pretty well.

In line 14 we start method for allocating object of any type. We first obtain handle to method table, we next allocate object, finally we return it from this function. Remember that the dummy reference is modified to point to the newly allocated object so there is no null reference problem. Also, this reference will be modified every time we allocate something.

Finally, in RawAllocate method we allocate new object. We accept one parameter: a pointer to method table. We also return one value: pointer to allocated memory. We start with calculating object size (by reading it from the method table), in line 28 we store method table pointer for newly created object, and finally (line 37) we modify the reference so now it points to brand new object. Next, we update offset variable and return result.

You can verify this code by running this sample:

// Create allocator and allocate object
var allocator = new GenericMemoryAllocator();
var customlyAllocated = allocator.Allocate< TestClass >();

// Allocate ordinary object
var ordinary = new object();

// Observe that custom object is in generation 2
Console.WriteLine("Object created normally: {0}", GC.GetGeneration(ordinary));
Console.WriteLine("Object customly allocated by hand: {0}", GC.GetGeneration(customlyAllocated));

Ordinary object should be in generation 0 (since it is very small and brand new), however, custom object should be in generation 2.

Hacking new

So far so good. Now we are able to allocate any object using our custom logic. We don’t need to have already allocated object, we can do one from scratch and don’t bother with other things. However, we would like to be able to allocate objects using our allocator using only new operator. In other words, we would like to modify CLR in a way that it will store elements in a place we specify.

First, let’s try do achieve that by hand. We will use WinDBG to examine CLR internals and patch them on the fly. Let’s run previous application (do not try to use F5 and debugging from VS since it will greatly modify execution environment, just run the application), attach WinDBG to it and see some internal structures.

First, let’s try to find code of our function executing allocation logic:

0:006> .loadby sos clr
0:006> !name2ee * Program
Module:      6fb61000
Assembly:    mscorlib.dll
--------------------------------------
Module:      01593fbc
Assembly:    GenericUnsafeAlloc.exe
0:006> !dumpmodule -mt 01593fbc
Name:       C:\GenericUnsafeAlloc\bin\Debug\GenericUnsafeAlloc.exe
Attributes: PEFile 
Assembly:   0102e0d0
LoaderHeap:              00000000
TypeDefToMethodTableMap: 01590038
TypeRefToMethodTableMap: 0159004c
MethodDefToDescMap:      015900d4
FieldDefToDescMap:       01590104
MemberRefToDescMap:      00000000
FileReferencesMap:       01590130
AssemblyReferencesMap:   01590134
MetaData start address:  00c12334 (3260 bytes)

Types defined in this module

      MT  TypeDef Name
------------------------------------------------------------------------------
01594e08 0x02000002 GenericUnsafeAlloc.GenericMemoryAllocator
01594d24 0x02000003 GenericUnsafeAlloc.Program
01594ea0 0x02000004 GenericUnsafeAlloc.TestClass

Types referenced in this module

      MT    TypeRef Name
------------------------------------------------------------------------------
6ff9cd10 0x02000001 System.Runtime.CompilerServices.CompilationRelaxationsAttribute
6ff9d708 0x02000002 System.Runtime.CompilerServices.RuntimeCompatibilityAttribute
6ff9cdb0 0x02000003 System.Diagnostics.DebuggableAttribute
6ff9d220 0x02000005 System.Reflection.AssemblyTitleAttribute
6ff9d2a0 0x02000006 System.Reflection.AssemblyDescriptionAttribute
6ff9d260 0x02000007 System.Reflection.AssemblyConfigurationAttribute
6ff9cca4 0x02000008 System.Reflection.AssemblyCompanyAttribute
6ff9cc64 0x02000009 System.Reflection.AssemblyProductAttribute
6ff9d3d4 0x0200000a System.Reflection.AssemblyCopyrightAttribute
6ff9d414 0x0200000b System.Reflection.AssemblyTrademarkAttribute
6ff9d00c 0x0200000c System.Runtime.InteropServices.ComVisibleAttribute
6ff9cdf0 0x0200000d System.Runtime.InteropServices.GuidAttribute
6ff9d454 0x0200000e System.Reflection.AssemblyFileVersionAttribute
6ffa0470 0x0200000f System.Runtime.Versioning.TargetFrameworkAttribute
6ff9dbd4 0x02000013 System.Object
6ff9ee28 0x02000015 System.RuntimeTypeHandle
6ff9ea20 0x02000017 System.Type
6ff9be08 0x02000018 System.Runtime.InteropServices.Marshal
6ff9c00c 0x02000019 System.IntPtr
6ff9f6bc 0x0200001f System.Int32
6ff86980 0x02000020 System.Console
6ffa09c0 0x02000021 System.GC
0:006> !dumpmt -md 01594d24 
EEClass:         01591470
Module:          01593fbc
Name:            GenericUnsafeAlloc.Program
mdToken:         02000003
File:            C:\GenericUnsafeAlloc\bin\Debug\GenericUnsafeAlloc.exe
BaseSize:        0xc
ComponentSize:   0x0
Slots in VTable: 6
Number of IFaces in IFaceMap: 0
--------------------------------------
MethodDesc Table
   Entry MethodDe    JIT Name
6ff019c8 6fb661fc PreJIT System.Object.ToString()
6ff07850 6fb66204 PreJIT System.Object.Equals(System.Object)
6ff0bd80 6fb66224 PreJIT System.Object.GetHashCode()
6fe5dbe8 6fb66238 PreJIT System.Object.Finalize()
02cb003d 01594d1c   NONE GenericUnsafeAlloc.Program..ctor()
02cb0448 01594d10    JIT GenericUnsafeAlloc.Program.Main()
0:006> !U 02cb0448 
Normal JIT generated code
GenericUnsafeAlloc.Program.Main()
Begin 02cb0448, size 1c5
*** WARNING: Unable to verify checksum for C:\GenericUnsafeAlloc\bin\Debug\GenericUnsafeAlloc.exe

C:\GenericUnsafeAlloc\Program.cs @ 130:
>>> 02cb0448 55              push    ebp
02cb0449 8bec            mov     ebp,esp
02cb044b 57              push    edi
02cb044c 83ec68          sub     esp,68h
02cb044f 8d7d94          lea     edi,[ebp-6Ch]
02cb0452 b91a000000      mov     ecx,1Ah
02cb0457 33c0            xor     eax,eax
02cb0459 f3ab            rep stos dword ptr es:[edi]
02cb045b 833d6842590100  cmp     dword ptr ds:[1594268h],0
02cb0462 7405            je      02cb0469
02cb0464 e8c4c6736e      call    clr!JIT_DbgIsJustMyCode (713ecb2d)
02cb0469 33d2            xor     edx,edx
02cb046b 8955e8          mov     dword ptr [ebp-18h],edx
02cb046e 33d2            xor     edx,edx
02cb0470 8955e0          mov     dword ptr [ebp-20h],edx
02cb0473 33d2            xor     edx,edx
02cb0475 8955e4          mov     dword ptr [ebp-1Ch],edx
02cb0478 33d2            xor     edx,edx
02cb047a 8955dc          mov     dword ptr [ebp-24h],edx
02cb047d 90              nop

C:\GenericUnsafeAlloc\Program.cs @ 132:
02cb047e b9084e5901      mov     ecx,1594E08h (MT: GenericUnsafeAlloc.GenericMemoryAllocator)
02cb0483 e8402c8dfe      call    015830c8 (JitHelp: CORINFO_HELP_NEWSFAST)
02cb0488 8945d8          mov     dword ptr [ebp-28h],eax
02cb048b 8b4dd8          mov     ecx,dword ptr [ebp-28h]
02cb048e ff15484e5901    call    dword ptr ds:[1594E48h] (GenericUnsafeAlloc.GenericMemoryAllocator..ctor(), mdToken: 06000007)
02cb0494 8b45d8          mov     eax,dword ptr [ebp-28h]
02cb0497 8945e8          mov     dword ptr [ebp-18h],eax
...

As we can see, we first load SOS extension. Next, we find all Program in all modules, we dump our module, and finally we get to method descriptions by going through method tables.

We can see, that in line Program.cs@132 we allocate new object. Internally it is realized by call 015830c8 (JitHelp: CORINFO_HELP_NEWSFAST). WinDBG helps us here and says that this is JitHelp method, which is used to allocate memory.

Now we can decompile the code and see what it does:

0:006> !U 015830c8
Unmanaged code
015830c8 8b4104          mov     eax,dword ptr [ecx+4]
015830cb 648b15380e0000  mov     edx,dword ptr fs:[0E38h]
015830d2 034240          add     eax,dword ptr [edx+40h]
015830d5 3b4244          cmp     eax,dword ptr [edx+44h]
015830d8 7709            ja      015830e3
015830da 894240          mov     dword ptr [edx+40h],eax
015830dd 2b4104          sub     eax,dword ptr [ecx+4]
015830e0 8908            mov     dword ptr [eax],ecx
015830e2 c3              ret
015830e3 e96663b46f      jmp     clr!JIT_New (710c944e)

As we can see, this is small method stub. It does not have ordinary code setting stack frame, so we suspect that it is dynamically generated in runtime. You can verify it by rerunning the application, finding this code, and looking for it in modules. As far as we can see, it is not stored in any particular DLL, so we can suspect that it is written dynamically to the memory. You can try to find this code in .NET Core or Rotor sources (“open source” version of .NET Framework 2.0).

OK, so let’s see, what this function really does. Here are only my assumptions, so don’t consider them “rock solid”. However, they seem quite reasonable.
First, it loads [ecx+4] into eax. This is size of object to allocate.
Next, it retrieves variable from fs segment. This is probably metadata of recently allocated block.
In next line it adds size of the object and a value retrieved from metadata. In the next two lines it compares this value to some other value from metadata. This is probably check if object to allocate fits in memory block. If it doesn’t then we jump to function which will allocate another chunk of memory and do some magic.
If the object fits, we store size. Next, we perform arithmetic on pointers to get address to our object. Finally, we return the pointer to the object.

Now we can find our allocate function:

0:006> !dumpmt -md 01594e08
EEClass:         015918c4
Module:          01593fbc
Name:            GenericUnsafeAlloc.GenericMemoryAllocator
mdToken:         02000002
File:            C:\GenericUnsafeAlloc\bin\Debug\GenericUnsafeAlloc.exe
BaseSize:        0xc
ComponentSize:   0x0
Slots in VTable: 12
Number of IFaces in IFaceMap: 0
--------------------------------------
MethodDesc Table
   Entry MethodDe    JIT Name
6ff019c8 6fb661fc PreJIT System.Object.ToString()
6ff07850 6fb66204 PreJIT System.Object.Equals(System.Object)
6ff0bd80 6fb66224 PreJIT System.Object.GetHashCode()
6fe5dbe8 6fb66238 PreJIT System.Object.Finalize()
02cb0620 01594e00    JIT GenericUnsafeAlloc.GenericMemoryAllocator..cctor()
02cb06a8 01594df8    JIT GenericUnsafeAlloc.GenericMemoryAllocator..ctor()
02cb0059 01594db4   NONE GenericUnsafeAlloc.GenericMemoryAllocator.Allocate()
02cb0055 01594da8   NONE GenericUnsafeAlloc.GenericMemoryAllocator.GetReferenceAsPointer(System.Object)
02cb07d0 01594dc8    JIT GenericUnsafeAlloc.GenericMemoryAllocator.RawAllocate(IntPtr)
02cb0061 01594dd4   NONE GenericUnsafeAlloc.GenericMemoryAllocator.CreateObject()
02cb0065 01594de0   NONE GenericUnsafeAlloc.GenericMemoryAllocator.GetAllocMethodAddress()
02cb0069 01594dec   NONE GenericUnsafeAlloc.GenericMemoryAllocator.HijackNew()

We can see, that RawAllocate is jitted and stored in 02cb07d0. We can now patch CLR function and jump to our code:

0:006> a 015830c8
015830c8 jmp 0x02cb07d0
jmp 0x02cb07d0
015830cd 

0:006> !U 015830c8
Unmanaged code
015830c8 e903d77201      jmp     02cb07d0
015830cd 15380e0000      adc     eax,0E38h
015830d2 034240          add     eax,dword ptr [edx+40h]
015830d5 3b4244          cmp     eax,dword ptr [edx+44h]
015830d8 7709            ja      015830e3
015830da 894240          mov     dword ptr [edx+40h],eax
015830dd 2b4104          sub     eax,dword ptr [ecx+4]
015830e0 8908            mov     dword ptr [eax],ecx
015830e2 c3              ret
015830e3 e96663b46f      jmp     clr!JIT_New (710c944e)

And we are done. Now all alocation mechanism will go through our function!

Patching code from code

We already know how to patch code using WinDBG. However, we would like to do the same in C#.

Since extracting address of CLR method for allocation might be difficult, we will extract it from code. We will use a helper function:

[MethodImpl(MethodImplOptions.NoOptimization)]
private void CreateObject()
{
	new object();
}

We define a method which only creates new object. We also add attribute which will disable optimizations for this function.

Now we can find address with the following code:

private static int GetAllocMethodAddress()
{
	// Get handle to method creating object
	var methodHandle = typeof(GenericMemoryAllocator).GetMethod("CreateObject", BindingFlags.NonPublic | BindingFlags.Instance).MethodHandle;
	// JIT methods
	RuntimeHelpers.PrepareMethod(methodHandle);
	// Get address of jitted method
	IntPtr methodAddress = Marshal.ReadIntPtr(methodHandle.Value, 8);

	// Call to internal function differs between builds
#if DEBUG
	int offset = 0x22;
#else
	int offset = 0xE;
#endif

	// Read jump offset
	int address = 0;
	for(int i = 1; i < 5; ++i)
	{
		address = address + (Marshal.ReadByte(methodAddress, offset + i) << (i-1)*8);
	}

	// Calculate absolute address
	address = address + (int)methodAddress + offset + 1 + 4;

	return address;
}

We hard-code two offsets, one for DEBUG build, one for normal build.

Finally, we patch the code by hand:

public static void HijackNew()
{
	var methodHandle = typeof(GenericMemoryAllocator).GetMethod("RawAllocate").MethodHandle;
	RuntimeHelpers.PrepareMethod(methodHandle);

	var myAllocAddress = Marshal.ReadInt32(methodHandle.Value, 8);
	var defaultAllocAddress = GetAllocMethodAddress();

	int offset = myAllocAddress - defaultAllocAddress - 4 - 1; // 4 bytes for relative address and one byte for opcode
	byte[] instruction = {
		0xe9, // Long jump instruction
		(byte)(offset & 0xff),
		(byte)((offset >> 8) & 0xff),
		(byte)((offset >> 16) & 0xff),
		(byte)((offset >> 24) & 0xff)
	};

	Marshal.Copy(instruction, 0, (IntPtr)defaultAllocAddress, instruction.Length);
}

We get handle to our method, compile it with RuntimeHelpers. Next, we get address of jitted code (which is stored eight bytes after the beginning of the metadata). We get address of CLR function and calculate code for jump.

Jump instruction has five bytes: one for opcode, and four for address. We need to remember that the address is relative.

And we are done. We can now test our code with this simple program:

static void Main()
{
	// Create allocator and allocate object
	var allocator = new GenericMemoryAllocator();
	var customlyAlocated = allocator.Allocate< TestClass>();

	// Allocate ordinary object
	var ordinary = new object();

	// Hijack method and allocate object
	GenericMemoryAllocator.HijackNew();
	var hijacked = new object();

	// Observe that hijacked objects are in generation 2
	Console.WriteLine("Allocator: {0}", GC.GetGeneration(allocator));
	Console.WriteLine("Object customly allocated by hand: {0}", GC.GetGeneration(customlyAlocated));
	Console.WriteLine("Object created normally: {0}", GC.GetGeneration(ordinary));
	Console.WriteLine("Object with hijacked newobj: {0}", GC.GetGeneration(hijacked));
}

And the result is:

Generic allocator result.

You can find all code here: https://gist.github.com/afish/33046b6c90833c922d6d

Summary

We were able to patch CLR in order to implement custom allocation mechanism. Of course, this is just proof of concept — it is definitely not production-ready. In order to make it more reliable, we should patch other functions (yes, there are different functions for allocating arrays etc), we also would need to make it working on x64. However, this is just proof of concept.

Bonus chatter: I said that you should not run it from VS debugger. Why? What would you need to change in order to make it work?
Bonus chatter 2: What changes are required to run it on x64?
Bonus chatter 3: What about memory pinning? Will GC move our objects or not?