unsafe – Random IT Utensils https://blog.adamfurmanek.pl IT, operating systems, maths, and more. Mon, 29 Apr 2019 02:25:11 +0000 en-US hourly 1 https://wordpress.org/?v=6.6.2 Custom memory allocation in C# Part 12 — Hiding objects from GC https://blog.adamfurmanek.pl/2019/08/10/custom-memory-allocation-in-c-part-12/ https://blog.adamfurmanek.pl/2019/08/10/custom-memory-allocation-in-c-part-12/#comments Sat, 10 Aug 2019 08:00:55 +0000 https://blog.adamfurmanek.pl/?p=3048 Continue reading Custom memory allocation in C# Part 12 — Hiding objects from GC]]>

This is the twelfth part of the Custom memory allocation series. For your convenience you can find other parts in the table of contents in Part 1 — Allocating object on a stack

We already know how to allocate memory anywhere on the heap but if you think that it solves all problems with GC then most likely you are missing one thing. Even though we took control of allocation we still have GC algorithms running in the background. This means that GC will traverse all object graphs and try to clean them up. Pinning objects or allocating them in LOH doesn’t solve this issue. So, can we do something about it? Let’s see.

First, let’s see if we can measure “useless” GC runs. Let’s take this code:

class BigGraph
{
	public int value;
	public BigGraph left;
	public BigGraph right;

	public BigGraph(int value, int level)
	{
		this.value = value;
		if (level == 0) return;
		this.left = new BigGraph(value + 1, level - 1);
		this.right = new BigGraph(value + 2, level - 1);
	}

	public int GetSum()
	{
		return value + (right != null ? left.GetSum() + right.GetSum() : 0);
	}
}

We have a simple binary tree implementation with one integer field. Let’s allocate a chunk of memory and see how fast GC is:

class Program
{
	public static int depth = 25;
	public static int iterations = 10;

	static void Main(string[] args)
	{
		Console.WriteLine($"Allocating: {DateTime.Now}");
		var graph = new BigGraph(1, depth);
		Console.WriteLine($"Allocated: {DateTime.Now}");
		Console.WriteLine($"Sum: {graph.GetSum()} at {DateTime.Now}");

		var stopwatch = new Stopwatch();
		stopwatch.Start();
		for(int i = 0; i < iterations; ++i)
		{
			GC.Collect(2);
			GC.WaitForFullGCComplete();
			GC.WaitForPendingFinalizers();
			Console.WriteLine($"Iteration {i + 1} at {DateTime.Now}");
		}
		stopwatch.Stop();

		Console.WriteLine($"Finished at {DateTime.Now} after {stopwatch.Elapsed}");
		Console.WriteLine($"Checking sum to see if objects are available: {graph.GetSum()} at {DateTime.Now}");

		Console.ReadLine();
	}
}

Output:

Allocating: 4/28/2019 7:06:06 PM
Allocated: 4/28/2019 7:06:22 PM
Sum: -1811939326 at 4/28/2019 7:06:23 PM
Iteration 1 at 4/28/2019 7:06:27 PM
Iteration 2 at 4/28/2019 7:06:30 PM
Iteration 3 at 4/28/2019 7:06:33 PM
Iteration 4 at 4/28/2019 7:06:36 PM
Iteration 5 at 4/28/2019 7:06:39 PM
Iteration 6 at 4/28/2019 7:06:42 PM
Iteration 7 at 4/28/2019 7:06:44 PM
Iteration 8 at 4/28/2019 7:06:47 PM
Iteration 9 at 4/28/2019 7:06:50 PM
Iteration 10 at 4/28/2019 7:06:52 PM
Finished at 4/28/2019 7:06:52 PM after 00:00:28.9204623
Checking sum to see if objects are available: -1811939326 at 4/28/2019 7:06:53 PM

According to Task Manager, whole graph consumes around 1.3 GB of memory. We can see that each full GC iteration takes almost 3 seconds on average.

So what can we do to make it faster? We need to convince GC to not traverse our objects. We need to replace every reference with integer. Interior references need to be replaced as well because they may leak into local variables so GC would have a way to enter the graph.

We will use UnsafeList class to make it easier.

First, we create new class:

class BigGraphNoGcRef
{
	public static UnsafeList.UnsafeList< BigGraphNoGcRef> list = new UnsafeList.UnsafeList< BigGraphNoGcRef>((int)Math.Pow(2, Program.depth + 1), 6);

	public int value;
	public int left;
	public int right;

	public BigGraphNoGcRef(int value, int level)
	{
		this.value = value;
		if (level == 0) return;
		this.left = list.Add(new BigGraphNoGcRef(value + 1, level - 1));
		this.right = list.Add(new BigGraphNoGcRef(value + 2, level - 1));
	}

	public int GetSum()
	{
		return value + (right != 0 ? list.Get(left).GetSum() + list.Get(right).GetSum() : 0);
	}
}

Please forgive me ugliness of static fields here and there but this is just to make things easy to follow. Production-ready implementation should make it better, obviously. Also, I approximate size of the object to have 6 integers and allocate twice as long array as needed, just to make sure that I don’t hit wrong memory.

So we still have the field with value which we use for verification if our objects are still alive, but for pointers to children we use ordinary integers. Please keep in mind that I use Any CPU on Windows 7 x64, so the application runs as WoW64 and hence uses 32-bit long pointers.

When allocating children we add them to unsafe list. This creates objects on the heap which are then cleaned up by GC but whole list copies them and stores forever. You may want to use custom allocator instead, this is just an implementation detail.

Rest of the class is almost identical. We just need to extract pointers using Get method which looks like this:

public T Get(int index)
{
	unsafe
	{
		TypedReference reference = __makeref(_target);
		*(int*)*(int*)&reference = _storage[index * _elementSize];
		T result = __refvalue(reference, T);
		return result;
	}
}

So we create new reference on the fly as needed.

Now, let’s see the benchmark:

class Program
{
	public static int depth = 25;
	public static int iterations = 10;

	static void Main(string[] args)
	{
		Console.WriteLine($"Allocating: {DateTime.Now}");
		var graph = new BigGraphNoGcRef(1, depth);
		Console.WriteLine($"Allocated: {DateTime.Now}");
		Console.WriteLine($"Sum: {graph.GetSum()} at {DateTime.Now}");

		var stopwatch = new Stopwatch();
		stopwatch.Start();
		for(int i = 0; i < iterations; ++i)
		{
			GC.Collect(2);
			GC.WaitForFullGCComplete();
			GC.WaitForPendingFinalizers();
			Console.WriteLine($"Iteration {i + 1} at {DateTime.Now}");
		}
		stopwatch.Stop();

		Console.WriteLine($"Finished at {DateTime.Now} after {stopwatch.Elapsed}");
		Console.WriteLine($"Checking sum to see if objects are available: {graph.GetSum()} at {DateTime.Now}");

		Console.ReadLine();
	}
}

Yes, the only difference is in line 9 where I create different collection type. Results:

Allocating: 4/28/2019 7:05:24 PM
Allocated: 4/28/2019 7:05:30 PM
Sum: -1811939326 at 4/28/2019 7:05:32 PM
Iteration 1 at 4/28/2019 7:05:32 PM
Iteration 2 at 4/28/2019 7:05:32 PM
Iteration 3 at 4/28/2019 7:05:32 PM
Iteration 4 at 4/28/2019 7:05:32 PM
Iteration 5 at 4/28/2019 7:05:32 PM
Iteration 6 at 4/28/2019 7:05:32 PM
Iteration 7 at 4/28/2019 7:05:32 PM
Iteration 8 at 4/28/2019 7:05:32 PM
Iteration 9 at 4/28/2019 7:05:32 PM
Iteration 10 at 4/28/2019 7:05:32 PM
Finished at 4/28/2019 7:05:32 PM after 00:00:00.0013297
Checking sum to see if objects are available: -1811939326 at 4/28/2019 7:05:35 PM

So we see that allocation is much faster (but we observed that in Part 2 as well). However, Full GC takes almost nothing! There is simply no objects to traverse so GC has nothing to do. At the same time, you can see that sum of integers is exactly the same before and after running GC so our objects work correctly.

Summary

Using custom allocation and manual memory management we can effectively turn GC off. There is one more caveat: our approach works with types we control. If we wanted to use it with built-int classes (or generally classes in which we cannot replace references with integers), we would need to implement custom wrapper which would extract all reference-type fields using reflection, nullify them and store addresses in arrays of integers. That would probably require some dynamic proxies as well so we will cover this another day.

]]>
https://blog.adamfurmanek.pl/2019/08/10/custom-memory-allocation-in-c-part-12/feed/ 1