Concurrency Part 8 — Tracking mutex owner

This is the eighth part of the Concurrency series. For your convenience you can find other parts in the table of contents in Part 1 – Mutex performance in .NET

We know how to use global mutexes to synchronize processes. However, there is a big drawback — we don’t know who owns the mutex and we cannot get that information easily. There is https://docs.microsoft.com/en-us/windows/win32/debug/wait-chain-traversal API for reading that information but it is not easy to follow. Can we do better?

One common trick is to use memory mapped files to store the information. Let’s see the code:

Magic, a lot. Let’s go part by part.

First, we want to store process ID and thread ID somewhere in the lock to be able to read it easily later. In line 18 we encode those in one 64-bit long variable which we can later replace using CAS operation.

In line 21 we just calculate the time when we should stop trying to get lock. This is for timeouts.

Next, we create memory mapped file (lines 24, 25) not backed by any physical file. This is to avoid permission problems — we cannot map the same file in two processes without copy-on-write semantics. We will need separate reader for debugging.

Next, we spin in the loop. Each time we try to take lock (line 37). If current lock value is zero (line 39) then it means that our lock is available and we have just locked it. We execute the action and then return (which is jump to finally).

However, if lock wasn’t available, we now have the owner of the lock (lines 46 and 47). So we need to check if the owner is still alive. We read process and look for thread in line 52.

If it didn’t exist, we try to clear the lock (line 62). We clear it only if it is still held by the same owner. And here is big warning — process ID and thread ID can be recycled so here we may inadvertently release still used mutex!

Then in line 69 we sleep for some timeout and loop again.

Ultimately, in line 77 we try to clear the lock if it is taken by us. We may end up in this line of code if some exception appears so we cannot just blindly release the mutex.

That’s all, you can verify that it should work pretty well. Just be aware of this chance of clearing up some other mutex, you may come up with different identifier if needed.

In theory, this can be solved by using CAS for 128 bits:

However, it doesn’t work on my machine, my kernel32.dll doesn’t support this CAS operation.

And since this is playing with very dangerous primitives, I take no responsibility for any failures in your systems if it doesn’t work. I tested it in my situation and it behaved correctly, though.