Debugging – Random IT Utensils
https://blog.adamfurmanek.pl
IT, operating systems, maths, and more.

Bit Twiddling Part 5 — Fixing audio latency in mstsc.exe (RDP)
https://blog.adamfurmanek.pl/2024/05/30/bit-twiddling-part-5/
Thu, 30 May 2024 10:34:31 +0000

This is the fifth part of the Bit Twiddling series. For your convenience you can find other parts in the table of contents in Part 1 — Modifying Android application on a binary level

Today we’re going to fix the audio latency in mstsc.exe. People ask about this a lot on the Internet, and there is no definitive solution. I’ll show how to hack mstsc.exe to fix the latency. First, I’ll explain why existing solutions do not work and what needs to be done. At the end of this post I provide an automated PowerShell script that does the magic.

What is the issue

First, a disclaimer. I’ll describe what I suspect is happening. I’m not an RDP expert and I haven’t seen the source code of mstsc.exe. I’m just guessing what happens based on what I see.

RDP supports multiple channels. There is a separate channel for video and another one for audio. Unfortunately, these two channels are not synchronized with each other, which means the client has to align them on a best-effort basis (or rather by sheer luck). To understand why it’s hard to synchronize two independent streams, we need to understand how different they are.

Video is something that doesn’t need to be presented “for some time”. We can simply take the last video frame and show it to the user. If frames are buffered, we just process them as fast as possible to present the last state. You can see that mstsc.exe does exactly that by connecting to a remote host, playing some video, and then suspending the mstsc.exe process with Process Explorer. When you resume it after a few seconds, you’ll see the video moving much faster until the client catches up with the latest state.

When it comes to audio, things are much different. You can’t just jump to the latest audio because you’d lose the content and it would be unintelligible. Each audio piece has its desired length. When you get delayed, you could skip packets (and lose some audio), play the audio faster (which would make it less intelligible for some time), or just play it as it comes. However, to decide what to do, you would need to know whether the piece you want to play is delayed or not. You can’t tell that without timestamps or time markers, and I believe RDP doesn’t send those.

As long as you’re getting audio packets “on time”, there is no issue. You just play them. The first part is whether you get them “in time”. This depends on your network quality and on the server that sends you the sound. From my experience, Windows Server is better when it comes to the speed of sending updates. I can see my mouse moving faster and audio delayed less often when I connect to Windows Server than to Windows 10 (or another client edition). Therefore, use Windows Server where possible. Just keep in mind that you’ll need CALs to send microphone and camera to the server, which is a bummer (the client edition lets you do that for free). Getting the packets sent as fast as possible is problem number one.

However, there is another part to that. If your client gets delayed for whatever reason (CPU spike, overload, or preemption), your sound will effectively slow down. You will just play it “later” even though you received it “on time”. Unfortunately, RDP cannot detect whether this happened because there are no timestamps in the stream. As far as I can tell, this is the true reason why your sound is often delayed. You can get “no latency audio” over the Internet and I had it many times. However, the longer you run the client, the higher the chance that you’ll go out of sync. This is the problem number two.

Therefore, we need to “resync” the client. Let’s see how to do it.

Why existing solutions don’t work and what fix we need

First, let me explain why existing solutions won’t work. Many articles on the Internet tell you to change the audio quality to High in Group Policy and enforce the quality in your .rdp file by setting audioqualitymode:i:2. This addresses problem number one (although I don’t see much difference, to be honest), but it doesn’t address problem number two.

Some other articles suggest other fixes on the remote side. All these fixes have one thing in common – they don’t fix the client. If the mstsc.exe client cannot catch up when it gets delayed, then the only thing you can do is to reset the audio stream. Actually, this is how you can fix the delay easily – just restart the audio service:

net stop audiosrv & timeout 3 & net start audiosrv

I add the timeout to give some time to clear the buffer on the client. Once the buffer is empty, we restart the service and then the audio should be in sync. Try this to verify if it’s physically possible to deliver the audio “on time” in your networking conditions.

Unfortunately, restarting the audio service has several issues. First, it resets the devices on the remote end, so your audio streams may break and you’ll need to restart them. Second, this simply takes time. You probably don’t want to lose a couple of seconds of audio and microphone when you’re presenting (and unfortunately, you’ll get delayed exactly during that time).

What we need is to fix the client to catch up when there is a delay. However, how can we do that when we don’t have any timestamps? Well, the solution is simple – just drop some audio packets (like 10%) periodically to sync over time. This will decrease the audio quality to some extent and won’t fix the delay immediately, but after a few seconds we’ll get back on track. Obviously, you could implement some better heuristics and solutions. There is one problem, though – we need to do that in mstsc.exe itself. And here comes the low-level magic. Let’s implement the solution that drops the audio frames to effectively “resync” the audio.
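To get a feel for how quickly dropping packets resynchronizes playback, here is a rough Python simulation. It is only a sketch under assumed numbers: the 20 ms packet length, the one-second backlog, and the function name seconds_to_resync are made up for illustration and do not come from mstsc.exe.

```python
def seconds_to_resync(backlog_ms, drop_every_n, packet_ms=20):
    """Simulate a client that is `backlog_ms` behind and drops every n-th
    packet instead of playing it. Returns the wall-clock seconds needed
    until the backlog is gone. All numbers are illustrative assumptions."""
    backlog = backlog_ms     # milliseconds of audio waiting in the buffer
    elapsed = 0              # wall-clock milliseconds spent playing
    count = 0
    while backlog > 0:
        count += 1
        if count % drop_every_n == 0:
            # Dropped packet: its audio is discarded, so the backlog
            # shrinks without spending any wall-clock time.
            backlog -= packet_ms
        else:
            # Played packet: it takes packet_ms to play while the server
            # delivers another packet_ms of audio, so the backlog stays.
            elapsed += packet_ms
    return elapsed / 1000


print(seconds_to_resync(1000, 10))
```

With these assumed numbers, dropping one packet in ten removes a one-second lag in about nine seconds, which matches the intuition that the fix is gradual rather than immediate.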

How to identify

We need to figure out how the sound is played. Let’s take ApiMonitor to trace the application. Luckily enough, it seems that the waveOutPrepareHeader is used, as we can see in the screenshot:

Let’s break there with WinDbg:

bu 0x00007ffb897d3019

kb

 # RetAddr               : Args to Child                                                           : Call Site														
00 00007ffb`897d1064     : 00000252`6d68beb8 00000252`6d68beb8 00000000`000014ac 00000252`6d68be00 : mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+0x6d														
01 00007ffb`897e7ee8     : 00000000`00000000 00000073`bfc7fcb9 00000252`6d68be00 00000252`7e69df80 : mstscax!CRdpWinAudioWaveoutPlayback::vcwaveWritePCM+0xec														
02 00007ffb`898e73bf     : 00000000`00000001 00000000`00000003 00000073`bfc7d055 00000073`00001000 : mstscax!CRdpWinAudioWaveoutPlayback::RenderThreadProc+0x2c8														
03 00007ffc`02e57344     : 00000000`000000ac 00000252`6d68be00 00000000`00000000 00000000`00000000 : mstscax!CRdpWinAudioWaveoutPlayback::STATIC_ThreadProc+0xdf														
04 00007ffc`046826b1     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : KERNEL32!BaseThreadInitThunk+0x14														
05 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21

We can see a method named mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite. Let’s see it:

u mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+0x6d		
									
mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite:											
00007ffb`897d2fac 48895c2410      mov     qword ptr [rsp+10h],rbx											
00007ffb`897d2fb1 55              push    rbp											
00007ffb`897d2fb2 56              push    rsi											
00007ffb`897d2fb3 57              push    rdi											
00007ffb`897d2fb4 4883ec40        sub     rsp,40h											
00007ffb`897d2fb8 488bf2          mov     rsi,rdx											
00007ffb`897d2fbb 488bd9          mov     rbx,rcx											
00007ffb`897d2fbe 488b0543287500  mov     rax,qword ptr [mstscax!WPP_GLOBAL_Control (00007ffb`89f25808)]											
00007ffb`897d2fc5 488d2d3c287500  lea     rbp,[mstscax!WPP_GLOBAL_Control (00007ffb`89f25808)]											
00007ffb`897d2fcc 483bc5          cmp     rax,rbp											
00007ffb`897d2fcf 740a            je      mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+0x2f (00007ffb`897d2fdb)											
00007ffb`897d2fd1 f6401c01        test    byte ptr [rax+1Ch],1											
00007ffb`897d2fd5 0f85e8000000    jne     mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+0x117 (00007ffb`897d30c3)											
00007ffb`897d2fdb 83bbc000000000  cmp     dword ptr [rbx+0C0h],0											
00007ffb`897d2fe2 7418            je      mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+0x50 (00007ffb`897d2ffc)											
00007ffb`897d2fe4 488b8bb8000000  mov     rcx,qword ptr [rbx+0B8h]											
00007ffb`897d2feb 4885c9          test    rcx,rcx											
00007ffb`897d2fee 740c            je      mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+0x50 (00007ffb`897d2ffc)											
00007ffb`897d2ff0 48ff15a9295c00  call    qword ptr [mstscax!_imp_EnterCriticalSection (00007ffb`89d959a0)]											
00007ffb`897d2ff7 0f1f440000      nop     dword ptr [rax+rax]											
00007ffb`897d2ffc 488b4b70        mov     rcx,qword ptr [rbx+70h]											
00007ffb`897d3000 4885c9          test    rcx,rcx											
00007ffb`897d3003 0f84f2000000    je      mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+0x14f (00007ffb`897d30fb)											
00007ffb`897d3009 41b830000000    mov     r8d,30h											
00007ffb`897d300f 488bd6          mov     rdx,rsi											
00007ffb`897d3012 48ff15e7a87900  call    qword ptr [mstscax!_imp_waveOutPrepareHeader (00007ffb`89f6d900)]											
00007ffb`897d3019 0f1f440000      nop     dword ptr [rax+rax]

Okay, we can see that this method passes the audio packet to the API. When we look at WAVEHDR structure, we can see that it has the following fields:

typedef struct wavehdr_tag {
  LPSTR              lpData;
  DWORD              dwBufferLength;
  DWORD              dwBytesRecorded;
  DWORD_PTR          dwUser;
  DWORD              dwFlags;
  DWORD              dwLoops;
  struct wavehdr_tag  *lpNext;
  DWORD_PTR          reserved;
} WAVEHDR, *LPWAVEHDR;

This is exactly what we see in ApiMonitor. Seems like the dwBufferLength is what we might want to change. When we shorten this buffer, we’ll effectively make the audio last shorter. We can do that for some of the packets to not break the quality much, and then all should be good.
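The offsets used later (dwBufferLength and dwUser) follow directly from the field sizes. The small Python sketch below just adds them up. It assumes pointer-sized fields are 8 bytes on x64 and that no padding is needed, which holds here because every field lands on its natural alignment; the helper name wavehdr_offsets is hypothetical.

```python
def wavehdr_offsets(pointer_size=8):
    """Return ({field: offset}, total_size) for WAVEHDR, assuming the
    fields are laid out back to back in declaration order."""
    fields = [
        ("lpData", pointer_size),       # LPSTR
        ("dwBufferLength", 4),          # DWORD
        ("dwBytesRecorded", 4),         # DWORD
        ("dwUser", pointer_size),       # DWORD_PTR
        ("dwFlags", 4),                 # DWORD
        ("dwLoops", 4),                 # DWORD
        ("lpNext", pointer_size),       # struct wavehdr_tag *
        ("reserved", pointer_size),     # DWORD_PTR
    ]
    offsets = {}
    position = 0
    for name, size in fields:
        offsets[name] = position
        position += size
    return offsets, position


offsets, size = wavehdr_offsets(8)
print(offsets["dwBufferLength"], offsets["dwUser"], hex(size))
```

On x64 this gives offset 8 for dwBufferLength, offset 16 for dwUser, and a total size of 0x30, which agrees with the mov r8d,30h that precedes the call in the disassembly. With pointer_size=4 the same fields sit at offsets 4 and 12.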

We can verify that this works with this breakpoint:

bp 00007ffb`897d3012 "r @$t0 = poi(rdx + 8); r @$t1 = @$t0 / 2; ed rdx+8 @$t1; g"

Unfortunately, this makes the client terribly slow. We need to patch the code in place. Effectively, we need to inject a shellcode.

First, we need to allocate some memory with VirtualAllocEx via .dvalloc.

.dvalloc 1024

The debugger allocates the memory. In my case the address is 25fb8960000.

The address of the WinAPI function is in the memory, so we need to remember to extract the pointer from the address:

00007ffb`897d3012 48ff15e7a87900  call    qword ptr [mstscax!_imp_waveOutPrepareHeader (00007ffb`89f6d900)]

Now we need to do two things. First, we need to patch the call site to call our shellcode instead of [mstscax!_imp_waveOutPrepareHeader (00007ffb`89f6d900)]. Second, we need to construct a shellcode that fixes the audio packet for some of the packets, and then calls [mstscax!_imp_waveOutPrepareHeader (00007ffb`89f6d900)] correctly.

To do the first thing, we can do the absolute jump trick. We put the address in the rax register, push it on the stack, and then return. This is the code:

mov rax, 0x25fb8960000	;Move the address to the register
push rax		;Push the address on the stack
ret			;Return. This takes the address from the stack and jumps
nop			;Nops are just to not break the following instructions when you disassemble with u
nop
nop
nop

We can compile the code with Online assembler and we should get a shell code. We can then put it in place with this line:

e	mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+5d		0x48 0xB8 0x00 0x00 0x96 0xb8 0x5f 0x02 0x00 0x00 0x50 0xC3 0x90 0x90 0x90 0x90
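If you don’t want to use an online assembler for this small stub, the bytes can be produced by hand. The following Python sketch encodes mov rax, imm64 / push rax / ret and pads the result with NOPs. The function name absolute_jump_patch and the pad_to parameter are my own naming for illustration.

```python
import struct


def absolute_jump_patch(target, pad_to=16):
    """Encode `mov rax, target; push rax; ret` and pad with NOPs so the
    patch covers exactly `pad_to` bytes of the original code."""
    code = b"\x48\xB8" + struct.pack("<Q", target)  # mov rax, imm64
    code += b"\x50"                                  # push rax
    code += b"\xC3"                                  # ret
    if len(code) > pad_to:
        raise ValueError("patch does not fit in the requested size")
    return code + b"\x90" * (pad_to - len(code))     # nop padding


patch = absolute_jump_patch(0x25FB8960000)
print(" ".join("0x%02X" % b for b in patch))
```

For the address 0x25fb8960000 this yields the same 16 bytes as the e command above. Sixteen bytes is exactly the size of the three instructions being overwritten (6 + 3 + 7), so execution can resume cleanly right after the original call.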

Unfortunately, this patch is long: it takes 16 bytes and overwrites the two instructions before the call as well as the call itself. We need to restore those instructions in the shellcode. So our shellcode starts with the instructions that we overwrote:

mov r8d, 0x30	;Preserved code
mov rdx,rsi	;Preserved code

Next, we need to preserve our working registers:

push rbx
push rdx

Okay, now we can do the logic. We want to modify the buffer length. However, we don’t want to do it for all of the packets. We need some source of randomness, like time, random values, or something else. Fortunately, the WAVEHDR structure has the field dwUser which may be “random enough” for our needs. Let’s take that value modulo some constant, and then change the packet length only for some cases.

First, let’s preserve the buffer length for the sake of what we do later:

mov rax, [rdx + 8]	;Load buffer length (that's the second field of the structure)
push rax		;Store buffer length on the stack

Now, let’s load dwUser and divide it by some constant like 23:

mov rax, [rdx + 16]	;Load dwUser which is the fourth field
mov rbx, 23		;Move the constant to the register
xor rdx, rdx		;Clear the upper half of the dividend (rdx:rax)
div rbx			;Divide

Now, we can restore the buffer length to rax:

pop rax	;Restore buffer length

At this point we have rax with the buffer length, and rdx with the remainder. We can now compare the remainder and skip the code modifying the packet length if needed:

cmp rdx, 20	;Compare the remainder with 20
jbe 0x17	;Skip the modification when the remainder is <= 20 (encoded as 0x76 0x1A in the final bytes)

We can see that we avoid the buffer length modification if the remainder is at most 20. The remainder of a division by 23 takes one of 23 values (0 to 22), and 21 of them (0 to 20) skip the modification. Effectively, we have a 21/23 ≈ 91.3% chance that we won’t modify the packet. This means that we modify something like 9% of the packets, assuming dwUser has a good distribution. The code is written in this way so you can tune the odds of changing the packet.
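When tuning the two constants, it helps to compute the odds instead of guessing. This Python sketch counts the remainders for which the jbe is not taken. It assumes that dwUser modulo the constant is uniformly distributed, which I have not verified; the function name modification_odds is hypothetical.

```python
from fractions import Fraction


def modification_odds(threshold=20, modulus=23):
    """Fraction of remainders in 0..modulus-1 that are above `threshold`,
    that is, the cases where the buffer length gets modified."""
    modified = sum(1 for remainder in range(modulus) if remainder > threshold)
    return Fraction(modified, modulus)


odds = modification_odds()
print(odds, float(odds))
```

With the defaults only the remainders 21 and 22 trigger the modification, so the odds are 2/23, roughly 8.7%. Lowering the threshold or the modulus makes the resync faster at the cost of more audible glitches.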

Now, let’s modify the packet. We want to divide the buffer length by 2; however, we want to keep it generic to be able to experiment with other values:

mov rbx, 1	;Numerator of the scaling factor (1/2)
xor rdx, rdx	;Clear rdx before the multiplication
mul rbx		;rdx:rax = rax * 1
mov rbx, 2	;Denominator of the scaling factor (1/2)
xor rdx, rdx	;Clear the upper half of the dividend
div rbx		;rax = rax / 2

You can play with other values, obviously. From my experiments, halving the value works the best.

Now it’s rather easy. rax has the new buffer length or the original one if we decided not to modify it. Let’s restore other registers:

pop rdx
pop rbx

Let’s update the buffer length:

mov [rdx + 8], rax

Now, we need to prepare the jump addresses. First, we want to put the original return address of the method mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite which is mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+0x6d:

mov rax, 0x00007ffb897d3019
push rax

Now, we can jump to the WinAPI method:

push rbx
mov rbx, 0x00007ffb89f6d900
mov rax, [rbx]
pop rbx
push rax
ret

That’s it. The final shellcode looks like this:

mov r8d, 0x30
mov rdx,rsi
push rbx
push rdx
mov rax, [rdx + 8]
push rax
mov rax, [rdx + 16]
mov rbx, 23
xor rdx, rdx
div rbx
pop rax
cmp rdx, 20
jbe 0x17
mov rbx, 1
xor rdx, rdx
mul rbx
mov rbx, 2
xor rdx, rdx
div rbx
pop rdx
pop rbx
mov [rdx + 8], rax
mov rax, 0x00007ffb897d3019
push rax
push rbx
mov rbx, 0x00007ffb89f6d900
mov rax, [rbx]
pop rbx
push rax
ret

We can implant it with this:

e	25f`b8960000		0x41 0xB8 0x30 0x00 0x00 0x00 0x48 0x89 0xF2 0x53 0x52 0x48 0x8B 0x42 0x08 0x50 0x48 0x8B 0x42 0x10 0x48 0xC7 0xc3 0x17 0x00 0x00 0x00 0x48 0x31 0xD2 0x48 0xF7 0xF3 0x58 0x48 0x83 0xFA 0x14 0x76 0x1A 0x48 0xC7 0xC3 0x01 0x00 0x00 0x00 0x48 0x31 0xD2 0x48 0xF7 0xE3 0x48 0xC7 0xC3 0x02 0x00 0x00 0x00 0x48 0x31 0xD2 0x48 0xF7 0xF3 0x5A 0x5B 0x48 0x89 0x42 0x08 0x48 0xB8 0x19 0x30 0x7D 0x89 0xFB 0x7F 0x00 0x00 0x50 0x53 0x48 0xBB 0x00 0xD9 0xF6 0x89 0xFB 0x7F 0x00 0x00 0x48 0x8B 0x03 0x5B 0x50 0xC3
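To double-check the bytes against the assembly listing, the payload can also be assembled programmatically. The Python sketch below follows the listing instruction by instruction and takes the two addresses and the two tuning constants as parameters. The names build_payload and import_slot are mine; import_slot is assumed to be the address of the memory cell that holds the WinAPI pointer (the value shown in parentheses in the disassembly of the call).

```python
import struct


def build_payload(return_address, import_slot, threshold=0x14, modulus=0x17):
    """Assemble the x64 shellcode from the listing above as raw bytes."""
    if not 0 <= threshold < 0x80:
        raise ValueError("threshold must fit in a signed 8-bit immediate")
    if not 0 < modulus < 0x80000000:
        raise ValueError("modulus must fit in a positive 32-bit immediate")
    scale_block = bytes.fromhex(
        "48C7C301000000"   # mov rbx, 1
        "4831D2"           # xor rdx, rdx
        "48F7E3"           # mul rbx
        "48C7C302000000"   # mov rbx, 2
        "4831D2"           # xor rdx, rdx
        "48F7F3"           # div rbx
    )
    payload = bytes.fromhex(
        "41B830000000"     # mov r8d, 0x30 (preserved code)
        "4889F2"           # mov rdx, rsi  (preserved code)
        "53"               # push rbx
        "52"               # push rdx
        "488B4208"         # mov rax, [rdx + 8]
        "50"               # push rax
        "488B4210"         # mov rax, [rdx + 16]
    )
    payload += b"\x48\xC7\xC3" + struct.pack("<I", modulus)     # mov rbx, modulus
    payload += bytes.fromhex("4831D2" "48F7F3" "58")            # xor; div rbx; pop rax
    payload += b"\x48\x83\xFA" + bytes([threshold])             # cmp rdx, threshold
    payload += b"\x76" + bytes([len(scale_block)])              # jbe over scale_block
    payload += scale_block
    payload += bytes.fromhex("5A" "5B" "48894208")              # pop rdx; pop rbx; mov [rdx+8], rax
    payload += b"\x48\xB8" + struct.pack("<Q", return_address)  # mov rax, return address
    payload += bytes.fromhex("50" "53")                         # push rax; push rbx
    payload += b"\x48\xBB" + struct.pack("<Q", import_slot)     # mov rbx, import slot
    payload += bytes.fromhex("488B03" "5B" "50" "C3")           # mov rax, [rbx]; pop rbx; push rax; ret
    return payload


payload = build_payload(0x00007FFB897D3019, 0x00007FFB89F6D900)
print(" ".join("0x%02X" % b for b in payload))
```

The jump distance is derived from the length of the skipped block (26 bytes, 0x1A), so changing the scaling instructions does not require recalculating it by hand.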

Automated fix

We can now fix the code automatically. We need to do the following:

  • Start mstsc.exe
  • Find its process ID
  • Attach the debugger and find all the addresses: free memory, mstsc.exe method, WinAPI method
  • Construct the shellcode
  • Attach the debugger and patch the code
  • Detach the debugger

We can do all of that with PowerShell. Here is the code:

Function Run-Mstsc($rdpPath, $cdbPath, $numerator, $denominator){
	$id = get-random
	$code = @"
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Threading;
	
namespace MstscPatcher
{
	public class Env$id {
		public static void Start() {
			Process[] processes = Process.GetProcessesByName("mstsc");
			RunProcess("mstsc.exe", "$rdpPath");
			Thread.Sleep(3000);
			Process[] processes2 = Process.GetProcessesByName("mstsc");
			var idToPatch = processes2.Select(p => p.Id).OrderBy(i => i).Except(processes.Select(p => p.Id).OrderBy(i => i)).First();
			Patch(idToPatch);
		}
		
		public static void Patch(int id){
			Console.WriteLine(id);
			var addresses = RunCbd(id, @"
.sympath srv*C:\tmp*http://msdl.microsoft.com/download/symbols
.reload
!address
u mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+100
.dvalloc 1024
qd
			");
			
			string freeMemoryAddress = addresses.Where(o => o.Contains("Allocated 2000 bytes starting at")).First().Split(' ').Last().Trim();
			Console.WriteLine("Free memory: " + freeMemoryAddress);
			
			var patchAddress = addresses.SkipWhile(o => !o.StartsWith("mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite:"))
				.SkipWhile(o => !o.Contains("r8d,30h"))
				.First().Split(' ').First().Trim();
			Console.WriteLine("Patch address: " + patchAddress);
			var returnAddress = (Convert.ToUInt64(patchAddress.Replace(((char)96).ToString(),""), 16) + 0x10).ToString("X").Replace("0x", "");
			Console.WriteLine("Return address: " + returnAddress);
			
			var winApiAddress = addresses.SkipWhile(o => !o.Contains("[mstscax!_imp_waveOutPrepareHeader"))
				.First().Split('(')[1].Split(')')[0].Trim();
			Console.WriteLine("WinAPI address: " + winApiAddress);
			
			Func<string, IEnumerable<string>> splitInPairs = address => address.Where((c, i) => i % 2 == 0).Zip(address.Where((c, i) => i % 2 == 1), (first, second) => first.ToString() + second.ToString());			
			Func<string, string> translateToBytes = address => string.Join(" ", splitInPairs(address.Replace(((char)96).ToString(), "").PadLeft(16, '0')).Reverse().Select(p => "0x" + p));
						
			var finalScript = @"
.sympath srv*C:\tmp*http://msdl.microsoft.com/download/symbols
.reload
e	" + patchAddress + @"	0x48 0xB8 " + translateToBytes(freeMemoryAddress) + @" 0x50 0xC3 0x90 0x90 0x90 0x90	
e	" + freeMemoryAddress + @"	0x41 0xB8 0x30 0x00 0x00 0x00 0x48 0x89 0xF2 0x53 0x52 0x48 0x8B 0x42 0x08 0x50 0x48 0x8B 0x42 0x10 0x48 0xC7 0xc3 $denominator 0x00 0x00 0x00 0x48 0x31 0xD2 0x48 0xF7 0xF3 0x58 0x48 0x83 0xFA $numerator 0x76 0x1A 0x48 0xC7 0xC3 0x01 0x00 0x00 0x00 0x48 0x31 0xD2 0x48 0xF7 0xE3 0x48 0xC7 0xC3 0x02 0x00 0x00 0x00 0x48 0x31 0xD2 0x48 0xF7 0xF3 0x5A 0x5B 0x48 0x89 0x42 0x08 0x48 0xB8 " + translateToBytes(returnAddress) + @" 0x50 0x53 0x48 0xBB " + translateToBytes(winApiAddress) + @" 0x48 0x8B 0x03 0x5B 0x50 0xC3	
qd
			";
			Console.WriteLine(finalScript);
			RunCbd(id, finalScript);
		}
		
		public static string[] RunCbd(int id, string script) {
			Console.WriteLine(script);
			File.WriteAllText("mstsc.txt", script);
			string output = "";
			Process process = RunProcess("$cdbPath", "-p " + id + " -cf mstsc.txt", text => output += text + "\n");
			process.WaitForExit();
			File.Delete("mstsc.txt");
			
			return output.Split('\n');
		}
		
		public static Process RunProcess(string fileName, string arguments, Action<string> outputReader = null){
			ProcessStartInfo startInfo = new ProcessStartInfo
			{
				FileName = fileName,
				Arguments = arguments,
				UseShellExecute = outputReader == null,
				RedirectStandardOutput = outputReader != null,
				RedirectStandardError = outputReader != null
			};
			
			if(outputReader != null){
				var process = new Process{
					StartInfo = startInfo
				};
				process.OutputDataReceived += (sender, args) => outputReader(args.Data);
				process.ErrorDataReceived += (sender, args) => outputReader(args.Data);

				process.Start();
				process.BeginOutputReadLine();
				process.BeginErrorReadLine();
				return process;
			}else {
				return Process.Start(startInfo);
			}
		}
	}
}
"@.Replace('$id', $id)
	$assemblies = ("System.Core","System.Xml.Linq","System.Data","System.Xml", "System.Data.DataSetExtensions", "Microsoft.CSharp")
	Add-Type -referencedAssemblies $assemblies -TypeDefinition $code -Language CSharp
	iex "[MstscPatcher.Env$id]::Start()"
}


Run-Mstsc ("my_rdp_settings.rdp".Replace("\", "\\")) ("cdb.exe".Replace("\", "\\")) "0x14" "0x17"

We compile some C# code on the fly. First, we find existing mstsc.exe instances (line 16), then run the new instance (line 17), wait a bit for mstsc.exe to spawn a child process, and then find the id of the new process (lines 19-20). We can then patch that process.

First, we look for addresses. We do all the manual steps we did above to find the memory address, and two function addresses. The script is in lines 27-32. Notice that I load symbols as we need them and CDB may not have them configured on the box.

We can now parse the output. We first extract the allocated memory in lines 35-36.

Next, we look for the call site. We dump the whole method, and then find the first occurrence of mov r8d,30h. That’s where our patch starts. This is in lines 38-41.

Next, we calculate the return address which is 16 bytes further. This is in lines 42-43.

Finally, I calculate the WinAPI method address. I extract the location of the pointer for the method (lines 45-47).
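The parsing steps above are simple string operations, so they are easy to prototype outside of C#. Here is a Python sketch that does the same on two lines copied from the disassembly earlier in this post. The function names are hypothetical, and the marker strings assume the same instruction text as in my dump.

```python
def parse_patch_site(disassembly_lines, marker="r8d,30h", patch_length=0x10):
    """Return (patch_address, return_address) for the first line that
    contains `marker`. The return address is patch_length bytes further."""
    for line in disassembly_lines:
        if marker in line:
            address_text = line.split()[0]             # e.g. 00007ffb`897d3009
            patch_address = int(address_text.replace("`", ""), 16)
            return patch_address, patch_address + patch_length
    raise ValueError("marker instruction not found")


def parse_import_slot(disassembly_lines, marker="[mstscax!_imp_waveOutPrepareHeader"):
    """Return the address in parentheses on the first line with `marker`."""
    for line in disassembly_lines:
        if marker in line:
            inside = line.split("(")[1].split(")")[0]  # text between ( and )
            return int(inside.replace("`", ""), 16)
    raise ValueError("import reference not found")


dump = [
    "00007ffb`897d3009 41b830000000    mov     r8d,30h",
    "00007ffb`897d300f 488bd6          mov     rdx,rsi",
    "00007ffb`897d3012 48ff15e7a87900  call    qword ptr [mstscax!_imp_waveOutPrepareHeader (00007ffb`89f6d900)]",
]
patch, ret = parse_patch_site(dump)
print(hex(patch), hex(ret), hex(parse_import_slot(dump)))
```

For this dump the patch starts at 0x7ffb897d3009 and the return address is 0x7ffb897d3019, the same address used for the breakpoint at the beginning of the investigation.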

Next, we need to construct the shellcode. This is exactly what we did above. We just need to format addresses properly (this is in helper methods in lines 49-50), and then build the script (lines 52-58). We can run it and that’s it. The last thing is customization of the values. You can see in line 108 that I made two parameters to change numerator and denominator for the odds of modifying the packet. This way you can easily change how many packets are broken. The more you break, the faster you resynchronize; however, the worse the sound is.
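The address formatting is the only fiddly part: the debugger prints addresses big-endian with a backtick in the middle, while the e command needs little-endian bytes. A Python equivalent of the helper might look like the sketch below. It mirrors what the C# lambdas do, but the function name and the width_bytes parameter are my own.

```python
def translate_to_bytes(address, width_bytes=8):
    """Turn a debugger-style address such as 00007ffb`897d3019 into a
    space-separated list of little-endian bytes for the `e` command."""
    digits = address.replace("`", "").rjust(width_bytes * 2, "0")
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    return " ".join("0x" + pair for pair in reversed(pairs))


print(translate_to_bytes("00007ffb`897d3019"))
print(translate_to_bytes("25fb8960000"))
```

The second call pads the shorter address to 16 hex digits first, which is why the result ends with two zero bytes. For the x86 variant the width would be 4 bytes instead of 8.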

That’s it. I verified that on Windows 10 22H2 x64, Windows 11 22H2 x64, Windows Server 2016 1607 x64, and Windows Server 2019 1809 x64, and it worked well. Your mileage may vary, however, the approach should work anywhere. Generally, to make this script work somewhere else, you just need to adjust how we find the call site, the return address, and the address of the WinAPI function. Assuming that the WinAPI is still called via the pointer stored in memory, then you won’t need to touch the machine code payload.

Below is the script for x86 (worked on Windows 10 10240 x86). The main differences are in how we access the data structure, as the pointer is on the stack (and not in a register).

Function Run-Mstsc($rdpPath, $cdbPath, $numerator, $denominator){
	$id = get-random
	$code = @"
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Threading;
	
namespace MstscPatcher
{
	public class Env$id {
		public static void Start() {
			Process[] processes = Process.GetProcessesByName("mstsc");
			RunProcess("mstsc.exe", "$rdpPath");
			Thread.Sleep(3000);
			Process[] processes2 = Process.GetProcessesByName("mstsc");
			var idToPatch = processes2.Select(p => p.Id).OrderBy(i => i).Except(processes.Select(p => p.Id).OrderBy(i => i)).First();
			Patch(idToPatch);
		}
		
		public static void Patch(int id){
			Console.WriteLine(id);
			var addresses = RunCbd(id, @"
.sympath srv*C:\tmp*http://msdl.microsoft.com/download/symbols
.reload
!address
u mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+100
.dvalloc 1024
qd
			");
			
			string freeMemoryAddress = addresses.Where(o => o.Contains("Allocated 2000 bytes starting at")).First().Split(' ').Last().Trim();
			Console.WriteLine("Free memory: " + freeMemoryAddress);
			
			var patchAddress = addresses.SkipWhile(o => !o.StartsWith("mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite:"))
				.SkipWhile(o => !(o.Contains("dword ptr [ebx+3Ch]") && o.Contains("push")))
				.First().Split(' ').First().Trim();
			Console.WriteLine("Patch address: " + patchAddress);
			var returnAddress = (Convert.ToUInt64(patchAddress.Replace(((char)96).ToString(),""), 16) + 0x9).ToString("X").Replace("0x", "");
			Console.WriteLine("Return address: " + returnAddress);
			
			var winApiAddress = addresses.SkipWhile(o => !o.Contains("[mstscax!_imp__waveOutPrepareHeader"))
				.First().Split('(')[1].Split(')')[0].Trim();
			Console.WriteLine("WinAPI address: " + winApiAddress);
			
			Func<string, IEnumerable<string>> splitInPairs = address => address.Where((c, i) => i % 2 == 0).Zip(address.Where((c, i) => i % 2 == 1), (first, second) => first.ToString() + second.ToString());			
			Func<string, string> translateToBytes = address => string.Join(" ", splitInPairs(address.Replace(((char)96).ToString(), "").PadLeft(8, '0')).Reverse().Select(p => "0x" + p));
						
			var finalScript = @"
.sympath srv*C:\tmp*http://msdl.microsoft.com/download/symbols
.reload
e	" + patchAddress + @"	0xB8 " + translateToBytes(freeMemoryAddress) + @" 0x50 0xC3 0x90 0x90	
e	" + freeMemoryAddress + @"	0xFF 0x73 0x3C 0x53 0x52 0x8B 0x53 0x3C 0x52 0x8B 0x42 0x04 0x50 0x8B 0x42 0x0C 0xBB $denominator 0x00 0x00 0x00 0x31 0xD2 0xF7 0xF3 0x58 0x83 0xFA $numerator 0x76 0x12 0xBB 0x01 0x00 0x00 0x00 0x31 0xD2 0xF7 0xE3 0xBB 0x02 0x00 0x00 0x00 0x31 0xD2 0xF7 0xF3 0x5A 0x89 0x42 0x04 0x5A 0x5B 0xB8 " + translateToBytes(returnAddress) + @" 0x50 0x53 0xBB " + translateToBytes(winApiAddress) + @" 0x8B 0x03 0x5B 0x50 0xC3	
qd
			";
			Console.WriteLine(finalScript);
			RunCbd(id, finalScript);
		}
		
		public static string[] RunCbd(int id, string script) {
			Console.WriteLine(script);
			File.WriteAllText("mstsc.txt", script);
			string output = "";
			Process process = RunProcess("$cdbPath", "-p " + id + " -cf mstsc.txt", text => output += text + "\n");
			process.WaitForExit();
			File.Delete("mstsc.txt");
			
			return output.Split('\n');
		}
		
		public static Process RunProcess(string fileName, string arguments, Action<string> outputReader = null){
			ProcessStartInfo startInfo = new ProcessStartInfo
			{
				FileName = fileName,
				Arguments = arguments,
				UseShellExecute = outputReader == null,
				RedirectStandardOutput = outputReader != null,
				RedirectStandardError = outputReader != null
			};
			
			if(outputReader != null){
				var process = new Process{
					StartInfo = startInfo
				};
				process.OutputDataReceived += (sender, args) => outputReader(args.Data);
				process.ErrorDataReceived += (sender, args) => outputReader(args.Data);

				process.Start();
				process.BeginOutputReadLine();
				process.BeginErrorReadLine();
				return process;
			}else {
				return Process.Start(startInfo);
			}
		}
	}
}
"@.Replace('$id', $id)
	$assemblies = ("System.Core","System.Xml.Linq","System.Data","System.Xml", "System.Data.DataSetExtensions", "Microsoft.CSharp")
	Add-Type -referencedAssemblies $assemblies -TypeDefinition $code -Language CSharp
	iex "[MstscPatcher.Env$id]::Start()"
}

Run-Mstsc ("my_rdp_settings.rdp".Replace("\", "\\")) ("cdb.exe".Replace("\", "\\")) "0x14" "0x17"

Some keywords to make this easier to find on the Internet

Here are some keywords to make this article easier to find on the Internet.

audio latency in mstsc.exe
audio latency in rdp
audio delay in mstsc.exe
audio delay in rdp
laggy sound in rdp
sound desynchronized in rdp
sound latency in rdp
slow audio
how to fix audio in rdp

Enjoy!

Custom memory allocation in C# Part 18 — Hijacking methods on .NET 5 with modifying machine code
https://blog.adamfurmanek.pl/2021/12/11/custom-memory-allocation-in-c-part-18/
Sat, 11 Dec 2021 09:00:40 +0000

This is the eighteenth part of the Custom memory allocation series. For your convenience you can find other parts in the table of contents in Part 1 — Allocating object on a stack

Today we are going to see a rewritten way of hijacking a method with machine code. It works on Windows and Linux, for both Debug and Release, using .NET 5.

using System;
using System.Linq;
using System.Numerics;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Threading;

namespace MethodHijackerNetCore
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine($"Calling StaticString method before hacking:\t{TestClass.StaticString()}");
            HijackMethod(typeof(TestClass), nameof(TestClass.StaticString), typeof(Program), nameof(StaticStringHijacked));
            Console.WriteLine($"Calling StaticString method after hacking:\t{TestClass.StaticString()}");

            Console.WriteLine();

            var instance = new TestClass();
            Console.WriteLine($"Calling InstanceString method before hacking:\t{instance.InstanceString()}");
            HijackMethod(typeof(TestClass), nameof(TestClass.InstanceString), typeof(Program), nameof(InstanceStringHijacked));
            Console.WriteLine($"Calling InstanceString method after hacking:\t{instance.InstanceString()}");

            Console.WriteLine();

            Vector2 v = new Vector2(9.856331f, -2.2437377f);
            for (int i = 1; i <= 35 ; i++)
            {
                MultiTieredClass.Test(v, i);
                Thread.Sleep(100);
            }
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        public static string StaticStringHijacked()
        {
            return "Static string hijacked";
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        public string InstanceStringHijacked()
        {
            return "Instance string hijacked";
        }

        public static void HijackMethod(Type sourceType, string sourceMethod, Type targetType, string targetMethod)
        {
            var source = sourceType.GetMethod(sourceMethod);
            var target = targetType.GetMethod(targetMethod);

            RuntimeHelpers.PrepareMethod(source.MethodHandle);
            RuntimeHelpers.PrepareMethod(target.MethodHandle);


            var offset = 2 * IntPtr.Size;
            IntPtr sourceAddress = Marshal.ReadIntPtr(source.MethodHandle.Value, offset);
            IntPtr targetAddress = Marshal.ReadIntPtr(target.MethodHandle.Value, offset);

            var is32Bit = IntPtr.Size == 4;
            byte[] instruction;

            if (is32Bit)
            {
                instruction = new byte[] {
                    0x68, // push <value>
                }
                 .Concat(BitConverter.GetBytes((int)targetAddress))
                 .Concat(new byte[] {
                    0xC3 //ret
                 }).ToArray();
            }
            else
            {
                instruction = new byte[] {
                    0x48, 0xB8 // mov rax <value>
                }
                .Concat(BitConverter.GetBytes((long)targetAddress))
                .Concat(new byte[] {
                    0x50, // push rax
                    0xC3  // ret
                }).ToArray();
            }

            // Overwrite the bytes at the source address with the jump to the target
            Marshal.Copy(instruction, 0, sourceAddress, instruction.Length);
        }
    }

    class TestClass
    {
        [MethodImpl(MethodImplOptions.NoInlining)]
        public static string StaticString()
        {
            return "Static string";
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        public string InstanceString()
        {
            return "Instance string";
        }
    }

    class MultiTieredClass
    {
        [MethodImpl(MethodImplOptions.NoInlining)]
        public static void Test(Vector2 v, int i)
        {
            v = Vector2.Normalize(v);
            Console.WriteLine($"Vector iteration {i:0000}:\t{v}\t{TestClass.StaticString()}");
        }
    }
}

Tested with .NET 5.0.102 on Windows and .NET 5.0.401 on WSL2 Ubuntu 20.04. This is the output:

Calling StaticString method before hacking:     Static string
Calling StaticString method after hacking:      Static string hijacked

Calling InstanceString method before hacking:   Instance string
Calling InstanceString method after hacking:    Instance string hijacked

Vector iteration 0001:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0002:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0003:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0004:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0005:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0006:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0007:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0008:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0009:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0010:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0011:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0012:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0013:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0014:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0015:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0016:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0017:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0018:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0019:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0020:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0021:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0022:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0023:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0024:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0025:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0026:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0027:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0028:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0029:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0030:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0031:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0032:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0033:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0034:  <0.97505456, -0.22196563>       Static string
Vector iteration 0035:  <0.97505456, -0.22196563>       Static string
Examine MethodDescriptor: 7FFA3ACF5218

So we clearly see that the hack works and that, once multitiered compilation kicks in, the code no longer calls the method but inlines it.
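The patch that HijackMethod writes is a small trampoline: it loads the target address, pushes it on the stack, and returns into it. The sketch below only builds the same byte sequences and prints them, without patching anything. The class name and both addresses are made up for illustration, and the printed byte order assumes a little-endian machine.

```csharp
using System;
using System.Linq;

static class TrampolineDemo
{
    // x86-64: mov rax, <imm64>; push rax; ret
    public static byte[] BuildX64(long target) =>
        new byte[] { 0x48, 0xB8 }
            .Concat(BitConverter.GetBytes(target))
            .Concat(new byte[] { 0x50, 0xC3 })
            .ToArray();

    // x86-32: push <imm32>; ret
    public static byte[] BuildX86(int target) =>
        new byte[] { 0x68 }
            .Concat(BitConverter.GetBytes(target))
            .Concat(new byte[] { 0xC3 })
            .ToArray();

    static void Main()
    {
        // Hypothetical addresses, used only to make the byte layout visible
        Console.WriteLine(BitConverter.ToString(BuildX64(0x1122334455667788)));
        // prints 48-B8-88-77-66-55-44-33-22-11-50-C3
        Console.WriteLine(BitConverter.ToString(BuildX86(0x11223344)));
        // prints 68-44-33-22-11-C3
    }
}
```

The 64-bit variant takes 12 bytes and the 32-bit one takes 6, so that many bytes at the source address get overwritten.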

]]>
https://blog.adamfurmanek.pl/2021/12/11/custom-memory-allocation-in-c-part-18/feed/ 0
Custom memory allocation in C# Part 17 — Hijacking methods on .NET 5 with modifying metadata curious thing https://blog.adamfurmanek.pl/2021/12/04/custom-memory-allocation-in-c-part-17/ https://blog.adamfurmanek.pl/2021/12/04/custom-memory-allocation-in-c-part-17/#respond Sat, 04 Dec 2021 09:00:48 +0000 https://blog.adamfurmanek.pl/?p=4270 Continue reading Custom memory allocation in C# Part 17 — Hijacking methods on .NET 5 with modifying metadata curious thing]]>

This is the seventeenth part of the Custom memory allocation series. For your convenience you can find other parts in the table of contents in Part 1 — Allocating object on a stack

I was porting my method hijacking samples to .NET 5 and found an interesting behavior. Let’s take this code:

using System;
using System.Linq;
using System.Numerics;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Threading;

namespace OverridingSealedMethodNetCore
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine($"Calling StaticString method before hacking:\t{TestClass.StaticString()}");
            HijackMethod(typeof(TestClass), nameof(TestClass.StaticString), typeof(Program), nameof(StaticStringHijacked));
            Console.WriteLine($"Calling StaticString method after hacking:\t{TestClass.StaticString()}");

            Console.WriteLine();

            var instance = new TestClass();
            Console.WriteLine($"Calling InstanceString method before hacking:\t{instance.InstanceString()}");
            HijackMethod(typeof(TestClass), nameof(TestClass.InstanceString), typeof(Program), nameof(InstanceStringHijacked));
            Console.WriteLine($"Calling InstanceString method after hacking:\t{instance.InstanceString()}");

            Console.WriteLine();

            Vector2 v = new Vector2(9.856331f, -2.2437377f);
            for (int i = 1; i <= 35; i++)
            {
                MultiTieredClass.Test(v, i);
                Thread.Sleep(100);
            }
        }

        public static void HijackMethod(Type sourceType, string sourceMethod, Type targetType, string targetMethod)
        {
            // Get methods using reflection
            var source = sourceType.GetMethod(sourceMethod);
            var target = targetType.GetMethod(targetMethod);

            // Prepare methods to get machine code (not needed in this example, though)
            RuntimeHelpers.PrepareMethod(source.MethodHandle);
            RuntimeHelpers.PrepareMethod(target.MethodHandle);

            var sourceMethodDescriptorAddress = source.MethodHandle.Value;
            var targetMethodMachineCodeAddress = target.MethodHandle.GetFunctionPointer();

            // Pointer is two pointers from the beginning of the method descriptor
            Marshal.WriteIntPtr(sourceMethodDescriptorAddress, 2 * IntPtr.Size, targetMethodMachineCodeAddress);
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        public static string StaticStringHijacked()
        {
            return "Static string hijacked";
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        public string InstanceStringHijacked()
        {
            return "Instance string hijacked";
        }
    }

    class TestClass
    {
        [MethodImpl(MethodImplOptions.NoInlining)]
        public static string StaticString()
        {
            return "Static string";
        }

        [MethodImpl(MethodImplOptions.NoInlining)]
        public string InstanceString()
        {
            return "Instance string";
        }
    }

    class MultiTieredClass
    {
        [MethodImpl(MethodImplOptions.NoInlining)]
        public static void Test(Vector2 v, int i)
        {
            v = Vector2.Normalize(v);
            Console.WriteLine($"Vector iteration {i:0000}:\t{v}\t{TestClass.StaticString()}");
        }
    }
}

If you follow my blog, there is nothing new here. We try to hijack a method by modifying its runtime metadata. The MultiTiered part is there only to show the recompilation of the code. I’m running this on Windows 10 x64 in Release x64 mode and I’m getting this output:

Calling StaticString method before hacking:     Static string
Calling StaticString method after hacking:      Static string

Calling InstanceString method before hacking:   Instance string
Calling InstanceString method after hacking:    Instance string

Vector iteration 0001:  <0.9750545, -0.22196561>        Static string
Vector iteration 0002:  <0.9750545, -0.22196561>        Static string
Vector iteration 0003:  <0.9750545, -0.22196561>        Static string
Vector iteration 0004:  <0.9750545, -0.22196561>        Static string
Vector iteration 0005:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0006:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0007:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0008:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0009:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0010:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0011:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0012:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0013:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0014:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0015:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0016:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0017:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0018:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0019:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0020:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0021:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0022:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0023:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0024:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0025:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0026:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0027:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0028:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0029:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0030:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0031:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0032:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0033:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0034:  <0.9750545, -0.22196561>        Static string hijacked
Vector iteration 0035:  <0.97505456, -0.22196563>       Static string

And this is nice. Notice that the first two lines of the output show that even though we hacked the method, we’re still not getting the new behavior. That’s the first “why”.
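To dig into this, it may help to print what the slot holds next to the entry point that the runtime reports for the same method. This is a minimal sketch, not a diagnosis: the class and method names are hypothetical, and the offset of two pointers is an assumption copied from the listing above about an internal layout that may differ between runtime versions.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

class SlotInspector
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static string Sample() => "sample";

    static void Main()
    {
        var method = typeof(SlotInspector).GetMethod(nameof(Sample));
        RuntimeHelpers.PrepareMethod(method.MethodHandle);

        // Address of the method descriptor
        IntPtr descriptor = method.MethodHandle.Value;
        // Assumption taken from the listing above: the slot sits two pointers into the descriptor
        IntPtr slot = Marshal.ReadIntPtr(descriptor, 2 * IntPtr.Size);
        // Entry point that the runtime hands out for this method
        IntPtr entryPoint = method.MethodHandle.GetFunctionPointer();

        Console.WriteLine($"Method descriptor: {descriptor.ToInt64():X}");
        Console.WriteLine($"Slot value:        {slot.ToInt64():X}");
        Console.WriteLine($"Function pointer:  {entryPoint.ToInt64():X}");
    }
}
```

If the two printed addresses differ, callers may be going through something other than the slot we overwrite, which would be one possible explanation worth checking in a debugger.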

Next, we start calling the method that demonstrates multitiered compilation, and the first 4 iterations still print the original string. However, in the fifth one we see that the hijacked method was called instead of the original one. That’s the second “why”. This lasts until iteration 35, when multitiered compilation kicks in and recompiles things.

I don’t know why it works this way, but I presume it is related to the new code cache that was implemented around .NET Core 2.1 to support multitiered compilation. I may be wrong, though.
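One way to probe whether tiering is involved is to take the calling method out of tiered compilation and compare the output. The sketch below marks the caller with MethodImplOptions.AggressiveOptimization, which asks the JIT for fully optimized code up front so the method is not re-jitted later. The namespace and class names are hypothetical, and the hijack itself is left out; I have not verified what the output looks like with the hijack in place.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

namespace TieringProbe
{
    class Callee
    {
        [MethodImpl(MethodImplOptions.NoInlining)]
        public static string StaticString() => "Static string";
    }

    class Program
    {
        // AggressiveOptimization makes this caller bypass tiered compilation
        [MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.AggressiveOptimization)]
        static void Test(int i)
        {
            Console.WriteLine($"Iteration {i:0000}:\t{Callee.StaticString()}");
        }

        static void Main()
        {
            // A hijack such as HijackMethod from the listing above would go here
            for (int i = 1; i <= 35; i++)
            {
                Test(i);
                Thread.Sleep(100);
            }
        }
    }
}
```

If the switch at iterations 5 and 35 disappears with this attribute, that would point at tiering. If it stays, the cause is probably elsewhere.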

]]>
https://blog.adamfurmanek.pl/2021/12/04/custom-memory-allocation-in-c-part-17/feed/ 0
Custom memory allocation in C# Part 16 — Hijacking new on Linux with .NET 5 https://blog.adamfurmanek.pl/2021/09/25/custom-memory-allocation-in-c-part-16/ https://blog.adamfurmanek.pl/2021/09/25/custom-memory-allocation-in-c-part-16/#comments Sat, 25 Sep 2021 08:00:20 +0000 https://blog.adamfurmanek.pl/?p=4019 Continue reading Custom memory allocation in C# Part 16 — Hijacking new on Linux with .NET 5]]>

This is the sixteenth part of the Custom memory allocation series. For your convenience you can find other parts in the table of contents in Part 1 — Allocating object on a stack

I was recently asked if it’s possible to hijack the new operator on Linux. We’ve already seen that we can do it in both .NET Framework and .NET Core, on both x86_32 and x86_64. Now it’s time to do it in .NET 5 on Linux.

Docker configuration

I’m going to use Ubuntu 20.04 on WSL2 running on Windows 10:

afish@ubuntu:/home/af$ uname -r
4.19.128-microsoft-standard

I’m going to use Docker to install .NET and everything around it. I run this to create a container based on the .NET 5 SDK image:

docker run --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --security-opt apparmor=unconfined --rm -i -v /home/afish/makeref:/makeref mcr.microsoft.com/dotnet/sdk:5.0 bash

You can see I’m mapping the directory /home/afish/makeref and enabling some security flags to be able to debug the application and modify page protection.

lldb and SOS

The first thing I want to do is install lldb and SOS:

apt-get update

Get:1 http://security.debian.org/debian-security buster/updates InRelease [65.4 kB]
Get:2 http://deb.debian.org/debian buster InRelease [121 kB]
Get:3 http://deb.debian.org/debian buster-updates InRelease [51.9 kB]
Get:4 http://security.debian.org/debian-security buster/updates/main amd64 Packages [258 kB]
Get:5 http://deb.debian.org/debian buster/main amd64 Packages [7907 kB]
Get:6 http://deb.debian.org/debian buster-updates/main amd64 Packages [7860 B]
Fetched 8412 kB in 2s (3467 kB/s)
Reading package lists...

apt-get install -y lldb

Reading package lists...
Building dependency tree...
Reading state information...
The following additional packages will be installed:
  binfmt-support bzip2 file libbsd0 libc-dev-bin libc6-dev libedit2 libffi-dev
  libgpm2 liblldb-7 libllvm7 libmagic-mgc libmagic1 libncurses-dev libncurses6
  libpipeline1 libpython-stdlib libpython2-stdlib libpython2.7
  libpython2.7-minimal libpython2.7-stdlib libreadline7 libsqlite3-0
  libtinfo-dev linux-libc-dev lldb-7 llvm-7 llvm-7-dev llvm-7-runtime lsb-base
  manpages manpages-dev mime-support python python-lldb-7 python-minimal
  python-six python2 python2-minimal python2.7 python2.7-minimal
  readline-common xz-utils
Suggested packages:
  bzip2-doc glibc-doc gpm ncurses-doc llvm-7-doc man-browser python-doc
  python-tk python2-doc python2.7-doc binutils readline-doc
The following NEW packages will be installed:
  binfmt-support bzip2 file libbsd0 libc-dev-bin libc6-dev libedit2 libffi-dev
  libgpm2 liblldb-7 libllvm7 libmagic-mgc libmagic1 libncurses-dev libncurses6
  libpipeline1 libpython-stdlib libpython2-stdlib libpython2.7
  libpython2.7-minimal libpython2.7-stdlib libreadline7 libsqlite3-0
  libtinfo-dev linux-libc-dev lldb lldb-7 llvm-7 llvm-7-dev llvm-7-runtime
  lsb-base manpages manpages-dev mime-support python python-lldb-7
  python-minimal python-six python2 python2-minimal python2.7
  python2.7-minimal readline-common xz-utils
0 upgraded, 44 newly installed, 0 to remove and 1 not upgraded.
Need to get 71.3 MB of archives.
After this operation, 369 MB of additional disk space will be used.
Get:1 http://deb.debian.org/debian buster/main amd64 libpython2.7-minimal amd64 2.7.16-2+deb10u1 [395 kB]
Get:2 http://deb.debian.org/debian buster/main amd64 python2.7-minimal amd64 2.7.16-2+deb10u1 [1369 kB]
Get:3 http://deb.debian.org/debian buster/main amd64 python2-minimal amd64 2.7.16-1 [41.4 kB]
Get:4 http://deb.debian.org/debian buster/main amd64 python-minimal amd64 2.7.16-1 [21.0 kB]
Get:5 http://deb.debian.org/debian buster/main amd64 mime-support all 3.62 [37.2 kB]
Get:6 http://deb.debian.org/debian buster/main amd64 readline-common all 7.0-5 [70.6 kB]
Get:7 http://deb.debian.org/debian buster/main amd64 libreadline7 amd64 7.0-5 [151 kB]
Get:8 http://deb.debian.org/debian buster/main amd64 libsqlite3-0 amd64 3.27.2-3+deb10u1 [641 kB]
Get:9 http://deb.debian.org/debian buster/main amd64 libpython2.7-stdlib amd64 2.7.16-2+deb10u1 [1912 kB]
Get:10 http://deb.debian.org/debian buster/main amd64 python2.7 amd64 2.7.16-2+deb10u1 [305 kB]
Get:11 http://deb.debian.org/debian buster/main amd64 libpython2-stdlib amd64 2.7.16-1 [20.8 kB]
Get:12 http://deb.debian.org/debian buster/main amd64 libpython-stdlib amd64 2.7.16-1 [20.8 kB]
Get:13 http://deb.debian.org/debian buster/main amd64 python2 amd64 2.7.16-1 [41.6 kB]
Get:14 http://deb.debian.org/debian buster/main amd64 python amd64 2.7.16-1 [22.8 kB]
Get:15 http://deb.debian.org/debian buster/main amd64 bzip2 amd64 1.0.6-9.2~deb10u1 [48.4 kB]
Get:16 http://deb.debian.org/debian buster/main amd64 libmagic-mgc amd64 1:5.35-4+deb10u1 [242 kB]
Get:17 http://deb.debian.org/debian buster/main amd64 libmagic1 amd64 1:5.35-4+deb10u1 [117 kB]
Get:18 http://deb.debian.org/debian buster/main amd64 file amd64 1:5.35-4+deb10u1 [66.4 kB]
Get:19 http://deb.debian.org/debian buster/main amd64 manpages all 4.16-2 [1295 kB]
Get:20 http://deb.debian.org/debian buster/main amd64 xz-utils amd64 5.2.4-1 [183 kB]
Get:21 http://deb.debian.org/debian buster/main amd64 libpipeline1 amd64 1.5.1-2 [31.2 kB]
Get:22 http://deb.debian.org/debian buster/main amd64 lsb-base all 10.2019051400 [28.4 kB]
Get:23 http://deb.debian.org/debian buster/main amd64 binfmt-support amd64 2.2.0-2 [70.0 kB]
Get:24 http://deb.debian.org/debian buster/main amd64 libbsd0 amd64 0.9.1-2 [99.5 kB]
Get:25 http://deb.debian.org/debian buster/main amd64 libc-dev-bin amd64 2.28-10 [275 kB]
Get:26 http://deb.debian.org/debian buster/main amd64 linux-libc-dev amd64 4.19.160-2 [1416 kB]
Get:27 http://deb.debian.org/debian buster/main amd64 libc6-dev amd64 2.28-10 [2691 kB]
Get:28 http://deb.debian.org/debian buster/main amd64 libedit2 amd64 3.1-20181209-1 [94.0 kB]
Get:29 http://deb.debian.org/debian buster/main amd64 libffi-dev amd64 3.2.1-9 [156 kB]
Get:30 http://deb.debian.org/debian buster/main amd64 libgpm2 amd64 1.20.7-5 [35.1 kB]
Get:31 http://deb.debian.org/debian buster/main amd64 libllvm7 amd64 1:7.0.1-8+deb10u2 [13.1 MB]
Get:32 http://deb.debian.org/debian buster/main amd64 libncurses6 amd64 6.1+20181013-2+deb10u2 [102 kB]
Get:33 http://deb.debian.org/debian buster/main amd64 libpython2.7 amd64 2.7.16-2+deb10u1 [1036 kB]
Get:34 http://deb.debian.org/debian buster/main amd64 liblldb-7 amd64 1:7.0.1-8+deb10u2 [7938 kB]
Get:35 http://deb.debian.org/debian buster/main amd64 libncurses-dev amd64 6.1+20181013-2+deb10u2 [333 kB]
Get:36 http://deb.debian.org/debian buster/main amd64 libtinfo-dev amd64 6.1+20181013-2+deb10u2 [940 B]
Get:37 http://deb.debian.org/debian buster/main amd64 llvm-7-runtime amd64 1:7.0.1-8+deb10u2 [190 kB]
Get:38 http://deb.debian.org/debian buster/main amd64 llvm-7 amd64 1:7.0.1-8+deb10u2 [4554 kB]
Get:39 http://deb.debian.org/debian buster/main amd64 llvm-7-dev amd64 1:7.0.1-8+deb10u2 [21.3 MB]
Get:40 http://deb.debian.org/debian buster/main amd64 python-six all 1.12.0-1 [15.7 kB]
Get:41 http://deb.debian.org/debian buster/main amd64 python-lldb-7 amd64 1:7.0.1-8+deb10u2 [122 kB]
Get:42 http://deb.debian.org/debian buster/main amd64 lldb-7 amd64 1:7.0.1-8+deb10u2 [8459 kB]
Get:43 http://deb.debian.org/debian buster/main amd64 lldb amd64 1:7.0-47 [7176 B]
Get:44 http://deb.debian.org/debian buster/main amd64 manpages-dev all 4.16-2 [2232 kB]
debconf: delaying package configuration, since apt-utils is not installed
Fetched 71.3 MB in 2s (29.9 MB/s)
Selecting previously unselected package libpython2.7-minimal:amd64.
(Reading database ... 9877 files and directories currently installed.)
Preparing to unpack .../00-libpython2.7-minimal_2.7.16-2+deb10u1_amd64.deb ...
Unpacking libpython2.7-minimal:amd64 (2.7.16-2+deb10u1) ...
Selecting previously unselected package python2.7-minimal.
Preparing to unpack .../01-python2.7-minimal_2.7.16-2+deb10u1_amd64.deb ...
Unpacking python2.7-minimal (2.7.16-2+deb10u1) ...
Selecting previously unselected package python2-minimal.
Preparing to unpack .../02-python2-minimal_2.7.16-1_amd64.deb ...
Unpacking python2-minimal (2.7.16-1) ...
Selecting previously unselected package python-minimal.
Preparing to unpack .../03-python-minimal_2.7.16-1_amd64.deb ...
Unpacking python-minimal (2.7.16-1) ...
Selecting previously unselected package mime-support.
Preparing to unpack .../04-mime-support_3.62_all.deb ...
Unpacking mime-support (3.62) ...
Selecting previously unselected package readline-common.
Preparing to unpack .../05-readline-common_7.0-5_all.deb ...
Unpacking readline-common (7.0-5) ...
Selecting previously unselected package libreadline7:amd64.
Preparing to unpack .../06-libreadline7_7.0-5_amd64.deb ...
Unpacking libreadline7:amd64 (7.0-5) ...
Selecting previously unselected package libsqlite3-0:amd64.
Preparing to unpack .../07-libsqlite3-0_3.27.2-3+deb10u1_amd64.deb ...
Unpacking libsqlite3-0:amd64 (3.27.2-3+deb10u1) ...
Selecting previously unselected package libpython2.7-stdlib:amd64.
Preparing to unpack .../08-libpython2.7-stdlib_2.7.16-2+deb10u1_amd64.deb ...
Unpacking libpython2.7-stdlib:amd64 (2.7.16-2+deb10u1) ...
Selecting previously unselected package python2.7.
Preparing to unpack .../09-python2.7_2.7.16-2+deb10u1_amd64.deb ...
Unpacking python2.7 (2.7.16-2+deb10u1) ...
Selecting previously unselected package libpython2-stdlib:amd64.
Preparing to unpack .../10-libpython2-stdlib_2.7.16-1_amd64.deb ...
Unpacking libpython2-stdlib:amd64 (2.7.16-1) ...
Selecting previously unselected package libpython-stdlib:amd64.
Preparing to unpack .../11-libpython-stdlib_2.7.16-1_amd64.deb ...
Unpacking libpython-stdlib:amd64 (2.7.16-1) ...
Setting up libpython2.7-minimal:amd64 (2.7.16-2+deb10u1) ...
Setting up python2.7-minimal (2.7.16-2+deb10u1) ...
Linking and byte-compiling packages for runtime python2.7...
Setting up python2-minimal (2.7.16-1) ...
Selecting previously unselected package python2.
(Reading database ... 10694 files and directories currently installed.)
Preparing to unpack .../python2_2.7.16-1_amd64.deb ...
Unpacking python2 (2.7.16-1) ...
Setting up python-minimal (2.7.16-1) ...
Selecting previously unselected package python.
(Reading database ... 10727 files and directories currently installed.)
Preparing to unpack .../00-python_2.7.16-1_amd64.deb ...
Unpacking python (2.7.16-1) ...
Selecting previously unselected package bzip2.
Preparing to unpack .../01-bzip2_1.0.6-9.2~deb10u1_amd64.deb ...
Unpacking bzip2 (1.0.6-9.2~deb10u1) ...
Selecting previously unselected package libmagic-mgc.
Preparing to unpack .../02-libmagic-mgc_1%3a5.35-4+deb10u1_amd64.deb ...
Unpacking libmagic-mgc (1:5.35-4+deb10u1) ...
Selecting previously unselected package libmagic1:amd64.
Preparing to unpack .../03-libmagic1_1%3a5.35-4+deb10u1_amd64.deb ...
Unpacking libmagic1:amd64 (1:5.35-4+deb10u1) ...
Selecting previously unselected package file.
Preparing to unpack .../04-file_1%3a5.35-4+deb10u1_amd64.deb ...
Unpacking file (1:5.35-4+deb10u1) ...
Selecting previously unselected package manpages.
Preparing to unpack .../05-manpages_4.16-2_all.deb ...
Unpacking manpages (4.16-2) ...
Selecting previously unselected package xz-utils.
Preparing to unpack .../06-xz-utils_5.2.4-1_amd64.deb ...
Unpacking xz-utils (5.2.4-1) ...
Selecting previously unselected package libpipeline1:amd64.
Preparing to unpack .../07-libpipeline1_1.5.1-2_amd64.deb ...
Unpacking libpipeline1:amd64 (1.5.1-2) ...
Selecting previously unselected package lsb-base.
Preparing to unpack .../08-lsb-base_10.2019051400_all.deb ...
Unpacking lsb-base (10.2019051400) ...
Selecting previously unselected package binfmt-support.
Preparing to unpack .../09-binfmt-support_2.2.0-2_amd64.deb ...
Unpacking binfmt-support (2.2.0-2) ...
Selecting previously unselected package libbsd0:amd64.
Preparing to unpack .../10-libbsd0_0.9.1-2_amd64.deb ...
Unpacking libbsd0:amd64 (0.9.1-2) ...
Selecting previously unselected package libc-dev-bin.
Preparing to unpack .../11-libc-dev-bin_2.28-10_amd64.deb ...
Unpacking libc-dev-bin (2.28-10) ...
Selecting previously unselected package linux-libc-dev:amd64.
Preparing to unpack .../12-linux-libc-dev_4.19.160-2_amd64.deb ...
Unpacking linux-libc-dev:amd64 (4.19.160-2) ...
Selecting previously unselected package libc6-dev:amd64.
Preparing to unpack .../13-libc6-dev_2.28-10_amd64.deb ...
Unpacking libc6-dev:amd64 (2.28-10) ...
Selecting previously unselected package libedit2:amd64.
Preparing to unpack .../14-libedit2_3.1-20181209-1_amd64.deb ...
Unpacking libedit2:amd64 (3.1-20181209-1) ...
Selecting previously unselected package libffi-dev:amd64.
Preparing to unpack .../15-libffi-dev_3.2.1-9_amd64.deb ...
Unpacking libffi-dev:amd64 (3.2.1-9) ...
Selecting previously unselected package libgpm2:amd64.
Preparing to unpack .../16-libgpm2_1.20.7-5_amd64.deb ...
Unpacking libgpm2:amd64 (1.20.7-5) ...
Selecting previously unselected package libllvm7:amd64.
Preparing to unpack .../17-libllvm7_1%3a7.0.1-8+deb10u2_amd64.deb ...
Unpacking libllvm7:amd64 (1:7.0.1-8+deb10u2) ...
Selecting previously unselected package libncurses6:amd64.
Preparing to unpack .../18-libncurses6_6.1+20181013-2+deb10u2_amd64.deb ...
Unpacking libncurses6:amd64 (6.1+20181013-2+deb10u2) ...
Selecting previously unselected package libpython2.7:amd64.
Preparing to unpack .../19-libpython2.7_2.7.16-2+deb10u1_amd64.deb ...
Unpacking libpython2.7:amd64 (2.7.16-2+deb10u1) ...
Selecting previously unselected package liblldb-7.
Preparing to unpack .../20-liblldb-7_1%3a7.0.1-8+deb10u2_amd64.deb ...
Unpacking liblldb-7 (1:7.0.1-8+deb10u2) ...
Selecting previously unselected package libncurses-dev:amd64.
Preparing to unpack .../21-libncurses-dev_6.1+20181013-2+deb10u2_amd64.deb ...
Unpacking libncurses-dev:amd64 (6.1+20181013-2+deb10u2) ...
Selecting previously unselected package libtinfo-dev:amd64.
Preparing to unpack .../22-libtinfo-dev_6.1+20181013-2+deb10u2_amd64.deb ...
Unpacking libtinfo-dev:amd64 (6.1+20181013-2+deb10u2) ...
Selecting previously unselected package llvm-7-runtime.
Preparing to unpack .../23-llvm-7-runtime_1%3a7.0.1-8+deb10u2_amd64.deb ...
Unpacking llvm-7-runtime (1:7.0.1-8+deb10u2) ...
Selecting previously unselected package llvm-7.
Preparing to unpack .../24-llvm-7_1%3a7.0.1-8+deb10u2_amd64.deb ...
Unpacking llvm-7 (1:7.0.1-8+deb10u2) ...
Selecting previously unselected package llvm-7-dev.
Preparing to unpack .../25-llvm-7-dev_1%3a7.0.1-8+deb10u2_amd64.deb ...
Unpacking llvm-7-dev (1:7.0.1-8+deb10u2) ...
Selecting previously unselected package python-six.
Preparing to unpack .../26-python-six_1.12.0-1_all.deb ...
Unpacking python-six (1.12.0-1) ...
Selecting previously unselected package python-lldb-7.
Preparing to unpack .../27-python-lldb-7_1%3a7.0.1-8+deb10u2_amd64.deb ...
Unpacking python-lldb-7 (1:7.0.1-8+deb10u2) ...
Selecting previously unselected package lldb-7.
Preparing to unpack .../28-lldb-7_1%3a7.0.1-8+deb10u2_amd64.deb ...
Unpacking lldb-7 (1:7.0.1-8+deb10u2) ...
Selecting previously unselected package lldb.
Preparing to unpack .../29-lldb_1%3a7.0-47_amd64.deb ...
Unpacking lldb (1:7.0-47) ...
Selecting previously unselected package manpages-dev.
Preparing to unpack .../30-manpages-dev_4.16-2_all.deb ...
Unpacking manpages-dev (4.16-2) ...
Setting up libpipeline1:amd64 (1.5.1-2) ...
Setting up lsb-base (10.2019051400) ...
Setting up libgpm2:amd64 (1.20.7-5) ...
Setting up mime-support (3.62) ...
Setting up libmagic-mgc (1:5.35-4+deb10u1) ...
Setting up manpages (4.16-2) ...
Setting up libsqlite3-0:amd64 (3.27.2-3+deb10u1) ...
Setting up libmagic1:amd64 (1:5.35-4+deb10u1) ...
Setting up linux-libc-dev:amd64 (4.19.160-2) ...
Setting up file (1:5.35-4+deb10u1) ...
Setting up bzip2 (1.0.6-9.2~deb10u1) ...
Setting up libffi-dev:amd64 (3.2.1-9) ...
Setting up libncurses6:amd64 (6.1+20181013-2+deb10u2) ...
Setting up xz-utils (5.2.4-1) ...
update-alternatives: using /usr/bin/xz to provide /usr/bin/lzma (lzma) in auto mode
update-alternatives: warning: skip creation of /usr/share/man/man1/lzma.1.gz because associated file /usr/share/man/man1/xz.1.gz (of link group lzma) doesn't exist
update-alternatives: warning: skip creation of /usr/share/man/man1/unlzma.1.gz because associated file /usr/share/man/man1/unxz.1.gz (of link group lzma) doesn't exist
update-alternatives: warning: skip creation of /usr/share/man/man1/lzcat.1.gz because associated file /usr/share/man/man1/xzcat.1.gz (of link group lzma) doesn't exist
update-alternatives: warning: skip creation of /usr/share/man/man1/lzmore.1.gz because associated file /usr/share/man/man1/xzmore.1.gz (of link group lzma) doesn't exist
update-alternatives: warning: skip creation of /usr/share/man/man1/lzless.1.gz because associated file /usr/share/man/man1/xzless.1.gz (of link group lzma) doesn't exist
update-alternatives: warning: skip creation of /usr/share/man/man1/lzdiff.1.gz because associated file /usr/share/man/man1/xzdiff.1.gz (of link group lzma) doesn't exist
update-alternatives: warning: skip creation of /usr/share/man/man1/lzcmp.1.gz because associated file /usr/share/man/man1/xzcmp.1.gz (of link group lzma) doesn't exist
update-alternatives: warning: skip creation of /usr/share/man/man1/lzgrep.1.gz because associated file /usr/share/man/man1/xzgrep.1.gz (of link group lzma) doesn't exist
update-alternatives: warning: skip creation of /usr/share/man/man1/lzegrep.1.gz because associated file /usr/share/man/man1/xzegrep.1.gz (of link group lzma) doesn't exist
update-alternatives: warning: skip creation of /usr/share/man/man1/lzfgrep.1.gz because associated file /usr/share/man/man1/xzfgrep.1.gz (of link group lzma) doesn't exist
Setting up binfmt-support (2.2.0-2) ...
invoke-rc.d: could not determine current runlevel
invoke-rc.d: policy-rc.d denied execution of start.
Setting up libc-dev-bin (2.28-10) ...
Setting up libbsd0:amd64 (0.9.1-2) ...
Setting up readline-common (7.0-5) ...
Setting up libreadline7:amd64 (7.0-5) ...
Setting up manpages-dev (4.16-2) ...
Setting up libedit2:amd64 (3.1-20181209-1) ...
Setting up libpython2.7-stdlib:amd64 (2.7.16-2+deb10u1) ...
Setting up libllvm7:amd64 (1:7.0.1-8+deb10u2) ...
Setting up libc6-dev:amd64 (2.28-10) ...
Setting up libpython2.7:amd64 (2.7.16-2+deb10u1) ...
Setting up libncurses-dev:amd64 (6.1+20181013-2+deb10u2) ...
Setting up llvm-7-runtime (1:7.0.1-8+deb10u2) ...
Setting up python2.7 (2.7.16-2+deb10u1) ...
Setting up llvm-7 (1:7.0.1-8+deb10u2) ...
Setting up libpython2-stdlib:amd64 (2.7.16-1) ...
Setting up python2 (2.7.16-1) ...
Setting up libpython-stdlib:amd64 (2.7.16-1) ...
Setting up python (2.7.16-1) ...
Setting up libtinfo-dev:amd64 (6.1+20181013-2+deb10u2) ...
Setting up liblldb-7 (1:7.0.1-8+deb10u2) ...
Setting up llvm-7-dev (1:7.0.1-8+deb10u2) ...
Setting up python-six (1.12.0-1) ...
Setting up python-lldb-7 (1:7.0.1-8+deb10u2) ...
Setting up lldb-7 (1:7.0.1-8+deb10u2) ...
Setting up lldb (1:7.0-47) ...
Processing triggers for libc-bin (2.28-10) ...

dotnet tool install --global dotnet-sos

Tools directory '/root/.dotnet/tools' is not currently on the PATH environment variable.
If you are using bash, you can add it to your profile by running the following command:

cat << \EOF >> ~/.bash_profile
# Add .NET Core SDK tools
export PATH="$PATH:/root/.dotnet/tools"
EOF

You can add it to the current session by running the following command:

export PATH="$PATH:/root/.dotnet/tools"

You can invoke the tool using the following command: dotnet-sos
Tool 'dotnet-sos' (version '5.0.160202') was successfully installed.

/root/.dotnet/tools/dotnet-sos install

Installing SOS to /root/.dotnet/sos from /root/.dotnet/tools/.store/dotnet-sos/5.0.160202/dotnet-sos/5.0.160202/tools/netcoreapp2.1/any/linux-x64
Creating installation directory...
Copying files...
Creating new /root/.lldbinit file - LLDB will load SOS automatically at startup
SOS install succeeded

Project configuration

Okay. Now we can create a new project with dotnet new console -lang C# makeref, enter the directory, and modify the csproj file:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net5.0</TargetFramework>
    <AllowUnsafeBlocks>true</AllowUnsafeBlocks>
  </PropertyGroup>

</Project>

Notice that I added AllowUnsafeBlocks to enable unsafe code. As we already know, this is not strictly needed, but it keeps the code simple.

Application

Finally, the application code:

using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Reflection;


namespace HijackingNewOperatorNetCore
{
    class Program
    {
        static void Main(string[] args)
        {
            var allocator = new GenericMemoryAllocator();

            // Allocate object through allocator
            var customlyAlocated = allocator.Allocate<TestClass>();
            // Allocate ordinary object
            var ordinary = new object();

            // Hijack method and allocate object
            HijackNew();
            System.Diagnostics.Debugger.Break();
            var hijacked = new object();

            // Observe that hijacked objects are in generation 2
            Console.WriteLine($"Object customly allocated by hand: {GC.GetGeneration(customlyAlocated)}");
            Console.WriteLine($"Object created normally: {GC.GetGeneration(ordinary)}");
            Console.WriteLine($"Object with hijacked newobj: {GC.GetGeneration(hijacked)}");
        }

        public static void HijackNew()
        {
            var methodHandle = typeof(GenericMemoryAllocator).GetMethod(nameof(GenericMemoryAllocator.RawAllocate)).MethodHandle;
            RuntimeHelpers.PrepareMethod(methodHandle);

            var myAllocAddress = Marshal.ReadIntPtr(methodHandle.Value, 8);
            var defaultAllocAddress = GenericMemoryAllocator.GetAllocMethodAddress();


            int offset = (int)((long)myAllocAddress - defaultAllocAddress - 4 - 1); // 4 bytes for relative address and one byte for opcode
            byte[] instruction = {
                0xE9, // Long jump instruction
                (byte)(offset & 0xFF),
                (byte)((offset >> 8) & 0xFF),
                (byte)((offset >> 16) & 0xFF),
                (byte)((offset >> 24) & 0xFF)
            };

            GenericMemoryAllocator.UnlockPage((IntPtr)defaultAllocAddress);
            Marshal.Copy(instruction, 0, (IntPtr)defaultAllocAddress, instruction.Length);
        }
    }

    class TestClass
    {
        public int a, b, c, d;
    }
}

namespace HijackingNewOperatorNetCore
{
    class GenericMemoryAllocator
    {
        public T Allocate<T>()
        {
            var methodTable = typeof(T).TypeHandle.Value; // Get handle to the method table
            RawAllocate(methodTable); // Allocate the object and set the field, also JIT-compile the method
            return (T)Dummy;
        }

        // Method needs to be static in order to maintain the calling convention
        public static unsafe IntPtr RawAllocate(IntPtr methodTable)
        {
            // Calculate the object size by extracting it from method table and dividing by int size.
            // We assume that the size starts 4 bytes after the beginning of method table (works from .NET 3.5 to .NET Core 3.1)
            int objectSize = Marshal.ReadInt32(methodTable, 4) / sizeof(int);
            // Skip sizeof(int) bytes for syncblock
            _currentOffset++;
            // Write the address to method table
            Memory[_currentOffset] = methodTable;

            // Get the handle for the newly created object
            TypedReference newObjectReference = __makeref(Dummy);
            // Get the handle for the memory
            TypedReference memoryReference = __makeref(Memory);
            // Calculate the address of  the spawned object. We need to add 2 since we need to skip the method table of the array and the array size
            var spawnedObjectAddress = *(IntPtr*)*(IntPtr*)&memoryReference + (_currentOffset + 2) * sizeof(IntPtr);

            // Modify the handle for the new object using the address of the existing memory
            *(IntPtr*)*(IntPtr*)&newObjectReference = spawnedObjectAddress;

            // Move within the memory
            _currentOffset += objectSize;

            return *(IntPtr*)*(IntPtr*)&newObjectReference;
        }

        // Fields needs to be static in order to be accessible from RawAllocate
        private static bool Is64 = IntPtr.Size == sizeof(long);
        // Array big enough to be stored in Generation 2
        private static IntPtr[] Memory = new IntPtr[102400];
        private static int _currentOffset;
        private static object Dummy = new object();

        // This method is used to find the address of the CLR allocation function
        [MethodImpl(MethodImplOptions.NoOptimization)]
        private void CreateObject()
        {
            new object();
        }

        public static long GetAllocMethodAddress()
        {
            // Get the handle to the method creating the object
            var methodHandle = typeof(GenericMemoryAllocator).GetMethod(nameof(CreateObject), BindingFlags.NonPublic | BindingFlags.Instance).MethodHandle;

            // JIT-compile methods
            RuntimeHelpers.PrepareMethod(methodHandle);

            // Get the address of the jitted method
            IntPtr methodAddress = Marshal.ReadIntPtr(methodHandle.Value, 16);

            // Call to internal function differs between architectures, builds etc
            int offset = 51;

            // Read the jump offset
            int jumpOffset = 0;
            for (int i = 1; i < 5; ++i)
            {
                jumpOffset = jumpOffset + (Marshal.ReadByte(methodAddress, offset + i) << (i - 1) * 8);
            }
            // Calculate the absolute address
            long absoluteAddress = (long)methodAddress + offset + jumpOffset + 1 + 4; // 1 byte for the call opcode, 4 bytes for relative address

            return absoluteAddress;
        }

        // Method to unlock the page for executing
        [DllImport("libc", SetLastError = true)]
        static extern int mprotect(IntPtr lpAddress, uint dwSize, uint flags);

        // Unlocks the page for executing
        public static void UnlockPage(IntPtr address)
        {
              long newAddress = ((long)address) & (long)(~0 << 12);
              IntPtr na = (IntPtr)newAddress;
              long length = ((long)address) + 6 - newAddress;
              // 1 for read, 2 for write, 4 for execute
              mprotect(na, (uint)length, 1 | 2 | 4);
        }
    }
}

This should look familiar to you. There are three main differences from the Windows solution.

First, the CreateObject method in line 107 is now compiled to different machine code. It looks like this (and let’s see lldb in action at the same time):

dotnet --version

5.0.101

dotnet build

Microsoft (R) Build Engine version 16.8.0+126527ff1 for .NET
Copyright (C) Microsoft Corporation. All rights reserved.

  Determining projects to restore...
  All projects are up-to-date for restore.
/makeref/makeref/Program.cs(56,26): warning CS0649: Field 'TestClass.c' is never assigned to, and will always have its default value 0 [/makeref/makeref/makeref.csproj]
/makeref/makeref/Program.cs(56,23): warning CS0649: Field 'TestClass.b' is never assigned to, and will always have its default value 0 [/makeref/makeref/makeref.csproj]
/makeref/makeref/Program.cs(56,20): warning CS0649: Field 'TestClass.a' is never assigned to, and will always have its default value 0 [/makeref/makeref/makeref.csproj]
/makeref/makeref/Program.cs(56,29): warning CS0649: Field 'TestClass.d' is never assigned to, and will always have its default value 0 [/makeref/makeref/makeref.csproj]
  makeref -> /makeref/makeref/bin/Debug/net5.0/makeref.dll

Build succeeded.

/makeref/makeref/Program.cs(56,26): warning CS0649: Field 'TestClass.c' is never assigned to, and will always have its default value 0 [/makeref/makeref/makeref.csproj]
/makeref/makeref/Program.cs(56,23): warning CS0649: Field 'TestClass.b' is never assigned to, and will always have its default value 0 [/makeref/makeref/makeref.csproj]
/makeref/makeref/Program.cs(56,20): warning CS0649: Field 'TestClass.a' is never assigned to, and will always have its default value 0 [/makeref/makeref/makeref.csproj]
/makeref/makeref/Program.cs(56,29): warning CS0649: Field 'TestClass.d' is never assigned to, and will always have its default value 0 [/makeref/makeref/makeref.csproj]
    4 Warning(s)
    0 Error(s)

Time Elapsed 00:00:04.77

lldb bin/Debug/net5.0/makeref

(lldb) target create "bin/Debug/net5.0/makeref"
Current executable set to 'bin/Debug/net5.0/makeref' (x86_64).

r

(lldb) r
Process 1181 launched: '/makeref/makeref/bin/Debug/net5.0/makeref' (x86_64)
Process 1181 stopped
* thread #1, name = 'makeref', stop reason = signal SIGTRAP
    frame #0: 0x00007ffff73ee1ed libcoreclr.so`___lldb_unnamed_symbol15306$$libcoreclr.so + 1
libcoreclr.so`___lldb_unnamed_symbol15306$$libcoreclr.so:
->  0x7ffff73ee1ed <+1>: retq
    0x7ffff73ee1ee <+2>: nop

libcoreclr.so`___lldb_unnamed_symbol15307$$libcoreclr.so:
    0x7ffff73ee1f0 <+0>: pushq  %rbp
    0x7ffff73ee1f1 <+1>: movq   0xd8(%rdi), %r12

sos Name2EE makeref.dll HijackingNewOperatorNetCore.GenericMemoryAllocator.CreateObject

(lldb) sos Name2EE makeref.dll HijackingNewOperatorNetCore.GenericMemoryAllocator.CreateObject
Module:      00007fff7de42788
Assembly:    makeref.dll
Token:       0000000006000007
MethodDesc:  00007fff7deb6ea0
Name:        HijackingNewOperatorNetCore.GenericMemoryAllocator.CreateObject()
JITTED Code Address: 00007fff7dda9030

sos u 00007fff7dda9030

(lldb) sos u 00007fff7dda9030
Normal JIT generated code
HijackingNewOperatorNetCore.GenericMemoryAllocator.CreateObject()
ilAddr is 00007FFFF3D4F463 pImport is 00000000014B3BF0
Begin 00007FFF7DDA9030, size 4d

/makeref/makeref/Program.cs @ 113:
>>> 00007fff7dda9030 55                   push    rbp
00007fff7dda9031 4883ec10             sub     rsp, 0x10
00007fff7dda9035 488d6c2410           lea     rbp, [rsp + 0x10]
00007fff7dda903a 33c0                 xor     eax, eax
00007fff7dda903c 488945f0             mov     qword ptr [rbp - 0x10], rax
00007fff7dda9040 48897df8             mov     qword ptr [rbp - 0x8], rdi
00007fff7dda9044 48b8082ce47dff7f0000 movabs  rax, 0x7fff7de42c08
00007fff7dda904e 833800               cmp     dword ptr [rax], 0x0
00007fff7dda9051 7405                 je      0x7fff7dda9058
00007fff7dda9053 e828213879           call    0x7ffff712b180 (JitHelp: CORINFO_HELP_DBG_IS_JUST_MY_CODE)
00007fff7dda9058 90                   nop

/makeref/makeref/Program.cs @ 114:
00007fff7dda9059 48bf000cd67dff7f0000 movabs  rdi, 0x7fff7dd60c00
00007fff7dda9063 e8c8953779           call    0x7ffff7122630 (HijackingNewOperatorNetCore.GenericMemoryAllocator.RawAllocate(IntPtr), mdToken: 0000000006000006)
00007fff7dda9068 488945f0             mov     qword ptr [rbp - 0x10], rax
00007fff7dda906c 488b7df0             mov     rdi, qword ptr [rbp - 0x10]
00007fff7dda9070 e81370feff           call    0x7fff7dd90088 (System.Object..ctor(), mdToken: 000000000600045E)
00007fff7dda9075 90                   nop

/makeref/makeref/Program.cs @ 115:
00007fff7dda9076 90                   nop
00007fff7dda9077 488d6500             lea     rsp, [rbp]
00007fff7dda907b 5d                   pop     rbp
00007fff7dda907c c3                   ret

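If you count the bytes from the beginning of the method (00007fff7dda9030) to the call instruction at 00007fff7dda9063, you’ll find out that the offset is now 51 (0x33).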
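The arithmetic done by GetAllocMethodAddress can be checked by hand against the disassembly above. The sketch below is only an illustration: the RelativeCall class and its Target method are names made up for this example, and it assumes a 5-byte E8 call with a signed 32-bit displacement, which is what sos u shows.

```csharp
using System;

// Hypothetical helper, not part of the original listing.
public static class RelativeCall
{
    // instructionAddress: address of the E8 opcode
    // instruction: the 5 bytes of the instruction (opcode + little-endian rel32)
    public static long Target(long instructionAddress, byte[] instruction)
    {
        if (instruction.Length != 5 || instruction[0] != 0xE8)
            throw new ArgumentException("Expected a 5-byte E8 rel32 call");

        // rel32 is a signed little-endian 32-bit displacement
        int rel32 = instruction[1]
                  | (instruction[2] << 8)
                  | (instruction[3] << 16)
                  | (instruction[4] << 24);

        // The displacement is relative to the address of the next instruction
        return instructionAddress + 5 + rel32;
    }
}
```

Calling RelativeCall.Target(0x7fff7dda9063, new byte[] { 0xE8, 0xC8, 0x95, 0x37, 0x79 }) gives 0x7ffff7122630, which is the call target printed by sos. The same helper handles the negative displacement of the constructor call at 00007fff7dda9070.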

The second difference is the method descriptor. The function address used to be 8 bytes from the beginning; now it’s 16 (line 121). You can see the jitted code address 00007fff7dda9030 at offset 16 in the dump:

memory read -count 64 00007fff7deb6ea0

(lldb) memory read -count 64 00007fff7deb6ea0
0x7fff7deb6ea0: 07 00 08 03 08 00 28 00 28 5b da 7d ff 7f 00 00  ......(.([.}....
0x7fff7deb6eb0: 30 90 da 7d ff 7f 00 00 08 00 0b 03 09 00 a8 00  0..}............
0x7fff7deb6ec0: 30 5b da 7d ff 7f 00 00 80 8a da 7d ff 7f 00 00  0[.}.......}....
0x7fff7deb6ed0: 09 00 0e 03 0a 00 8a 00 00 00 00 00 00 00 00 00  ................
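To double-check the offset, the first two rows of the dump can be decoded as little-endian pointers. This is just a sketch over the bytes copied from the output above; the MethodDescDump class is a made-up name, and nothing here reads live process memory.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper holding bytes copied from the lldb dump.
public static class MethodDescDump
{
    // First 32 bytes of the MethodDesc of CreateObject at 0x7fff7deb6ea0
    public static readonly byte[] Bytes =
    {
        0x07, 0x00, 0x08, 0x03, 0x08, 0x00, 0x28, 0x00, 0x28, 0x5b, 0xda, 0x7d, 0xff, 0x7f, 0x00, 0x00,
        0x30, 0x90, 0xda, 0x7d, 0xff, 0x7f, 0x00, 0x00, 0x08, 0x00, 0x0b, 0x03, 0x09, 0x00, 0xa8, 0x00,
    };

    // Reads a 64-bit little-endian pointer at the given byte offset
    public static long PointerAt(int offset) =>
        BinaryPrimitives.ReadInt64LittleEndian(Bytes.AsSpan(offset));
}
```

MethodDescDump.PointerAt(16) yields 0x7fff7dda9030, the JITTED Code Address reported by Name2EE, while PointerAt(8) yields a different address (0x7fff7dda5b28).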

Finally, we cannot use VirtualProtectEx anymore as we’re on Linux. We need to go with mprotect:

[DllImport("libc", SetLastError = true)]
static extern int mprotect(IntPtr lpAddress, uint dwSize, uint flags);

public static void UnlockPage(IntPtr address)
{
	  long newAddress = ((long)address) & (long)(~0 << 12);
	  IntPtr na = (IntPtr)newAddress;
	  long length = ((long)address) + 6 - newAddress;
	  // 1 for read, 2 for write, 4 for execute
	  mprotect(na, (uint)length, 1 | 2 | 4);
}

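mprotect requires the address to be aligned to a page boundary (the page size is 4096 bytes on my machine), so I clear the lowest 12 bits (line 6 in the listing above). Next, I calculate the length of the region, measured from the page start to just past the patched bytes. mprotect changes the protection of every page that overlaps the range, so the exact length matters only when the patch straddles a page boundary. Finally, I enable all permissions for the page in line 10.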
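The page arithmetic from UnlockPage can be tried in isolation. The sketch below assumes 4096-byte pages, as on the machine in this post (Environment.SystemPageSize would be the way to query the real value), and the PageMath class is a name invented for the example.

```csharp
using System;

// Hypothetical helper mirroring the arithmetic in UnlockPage.
public static class PageMath
{
    // Assumption: 4 KiB pages
    public const long PageSize = 4096;

    // Clears the lowest 12 bits, giving the start of the page containing the address
    public static long PageStart(long address) => address & ~(PageSize - 1);

    // Number of bytes from the page start to the end of the patched region
    public static long LengthToCover(long address, int patchSize) =>
        address + patchSize - PageStart(address);
}
```

For the allocator address seen earlier, PageMath.PageStart(0x7ffff7122630) is 0x7ffff7122000 and PageMath.LengthToCover(0x7ffff7122630, 6) is 0x636. For an address like 0x1FFE the length exceeds 4096, which is the case where the following page has to be unlocked as well.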

And just for the sake of completeness, final output:

dotnet run

Object customly allocated by hand: 2
Object created normally: 0
Object with hijacked newobj: 2

Final notes

As you can see, there is no magic in this approach. It’s just a bunch of bytes which we can modify in the same way as long as we’re on the same architecture. However, keep in mind the following:

  • I do not recommend using this in production code. I do use things like these in real applications, but this is always risky and requires a good understanding of all the internals.
  • This is just one of multiple allocation methods provided by .NET. If you want it to be “production ready” then you need to update all of them.
  • Since you override the method globally, you can’t easily control when it’s called. In other words, .NET will use your logic as well, so you need to take care of all memory management (or do some fancy juggling to call .NET methods when you actually need to allocate some new memory).
  • Keep in mind that .NET scans the heap and requires it to be parseable. Be careful with what you allocate and how. Also, make sure your objects are pinned or that you have good concurrency management (since the GC can kick in at any time and move objects around).

Have fun!

ILP Part 58 – Kantoriiroodo
https://blog.adamfurmanek.pl/2021/03/13/ilp-part-58/
Sat, 13 Mar 2021 09:00:04 +0000

This is the fifty-eighth part of the ILP series. For your convenience you can find other parts in the table of contents in Part 1 – Boolean algebra

Today we are going to solve the Kantoriiroodo riddle. Refer to the link to see how it works. Basically, we have a board of any size (like a chess board) divided into multiple “parcels”, just like building parcels. Each parcel may (but doesn’t have to) have a cost. We want to build a closed road (basically a loop) which goes through multiple fields and satisfies the following:

  • It enters and exits each parcel exactly once
  • It must not cross itself
  • It can go horizontally or vertically (so we cannot go diagonally, only four basic directions)
  • Two neighboring fields may both be empty in the solution if and only if they belong to the same parcel
  • If a parcel has a cost then the cost indicates how many fields of the parcel need to be occupied by the road (no more, no less)

Let’s solve it with ILP. For each field we will have up to four binary variables indicating whether the road leaves the field going up, down, left, or right. The full solution looks like this:

var owners = new char[][] {
	new char[] { '1', '1', '2', '2', '3', '3', '4', '4', '4', '5' },
	new char[] { '1', '1', '2', '6', '6', '3', '3', '4', '7', '7' },
	new char[] { '8', '9', '9', '6', 'A', 'B', 'B', 'C', '7', '7' },
	new char[] { '8', 'D', 'D', '6', 'A', 'E', 'C', 'C', 'F', 'F' },
	new char[] { '8', 'G', 'D', '6', '6', 'E', 'H', 'H', 'I', 'F' },
	new char[] { 'J', 'G', 'D', 'D', 'K', 'L', 'L', 'H', 'I', 'M' },
	new char[] { 'J', 'J', 'N', 'N', 'K', 'O', 'L', 'H', 'H', 'M' },
	new char[] { 'P', 'P', 'N', 'Q', 'Q', 'O', 'L', 'R', 'R', 'M' },
	new char[] { 'P', 'P', 'S', 'T', 'T', 'L', 'L', 'U', 'V', 'V' },
	new char[] { 'W', 'S', 'S', 'S', 'T', 'T', 'U', 'U', 'V', 'V' }
};

var mustTake = new Dictionary<char, int>() {
	{ '1', 4 },
	{ '4', 3 },
	{ '6', 2 },
	{ '7', 4 },
	{ 'D', 3 },
	{ 'F', 2 },
	{ 'J', 2 },
	{ 'L', 4 },
	{ 'M', 2 },
	{ 'N', 1 },
	{ 'P', 2 },
	{ 'S', 3 },
	{ 'U', 3 },
	{ 'V', 1 }
};

var height = owners.Length;
var width = owners[0].Length;

var solver = new CplexMilpSolver(23);

dynamic fields = Enumerable.Range(0, height).Select(i => Enumerable.Range(0, width).Select(j => new System.Dynamic.ExpandoObject()).ToArray()).ToArray();

for(int i = 0;i<height;++i) {
	for(int j = 0;j<width;++j) {
		IVariable right = j < width - 1 ? solver.CreateAnonymous(Domain.BinaryInteger) : null;
		IVariable bottom = i < height - 1 ? solver.CreateAnonymous(Domain.BinaryInteger) : null;
		IVariable top = i > 0 ? solver.CreateAnonymous(Domain.BinaryInteger) : null;
		IVariable left = j > 0 ? solver.CreateAnonymous(Domain.BinaryInteger) : null;

		fields[i][j].Right = right;
		fields[i][j].Bottom = bottom;
		fields[i][j].Top = top;
		fields[i][j].Left = left;

		if(j > 0) {
			solver.Set(ConstraintType.Equal, left, fields[i][j-1].Right);
		}

		if(i > 0) {
			solver.Set(ConstraintType.Equal, top, fields[i-1][j].Bottom);
		}

		fields[i][j].All = new IVariable[] { right, bottom, left, top }.Where(x => x != null).ToArray();
	}
}

Console.WriteLine("Make sure field is used at most once");

for(int i = 0;i<height;++i) {
	for(int j = 0;j<width;++j) {
		var sum = solver.Operation(OperationType.Addition, fields[i][j].All);
		var isEmpty = solver.Operation(OperationType.IsEqual, sum, solver.FromConstant(0));
		var isInOut = solver.Operation(OperationType.IsEqual, sum, solver.FromConstant(2));
		var or = solver.Operation(OperationType.Disjunction, isEmpty, isInOut);
		solver.Set(ConstraintType.Equal, or, solver.FromConstant(1));
	}
}

Console.WriteLine("Make sure the road is a loop - most likely not needed");

Console.WriteLine("Make sure neighbouring fields for different owners are not both empty");

for(int i = 0;i<height;++i) {
	for(int j = 0;j<width;++j) {
		if(i < height - 1) {
			if(owners[i][j] != owners[i+1][j]) {
				IVariable[] a = fields[i][j].All;
				IVariable[] b = fields[i+1][j].All;
				solver.Set(ConstraintType.GreaterOrEqual, solver.Operation(OperationType.Addition, a.Concat(b).ToArray()), solver.FromConstant(1));
			}
		}

		if(j < width - 1) {
			if(owners[i][j] != owners[i][j+1]) {
				IVariable[] a = fields[i][j].All;
				IVariable[] b = fields[i][j+1].All;
				solver.Set(ConstraintType.GreaterOrEqual, solver.Operation(OperationType.Addition, a.Concat(b).ToArray()), solver.FromConstant(1));
			}
		}
	}
}

Console.WriteLine("Make sure we use exact number of fields in owners");

foreach(var key in mustTake.Keys) {
	var fieldsToUse = mustTake[key];

	var usedFields = new List<IVariable>();

	for(int i = 0;i<height;++i) {
		for(int j = 0;j<width;++j) {
			if(owners[i][j] == key) {
				usedFields.Add(solver.Operation(OperationType.Disjunction, fields[i][j].All));
			}
		}
	}

	solver.Set(ConstraintType.Equal, solver.Operation(OperationType.Addition, usedFields.ToArray()), solver.FromConstant(fieldsToUse));
}

Console.WriteLine("Make sure we enter and exit the owner exactly once");

foreach(var key in owners.SelectMany(r => r).Distinct()) {
	var outerEdges = new List<IVariable>();

	for(int i = 0;i<height;++i) {
		for(int j = 0;j<width;++j) {
			if(owners[i][j] == key) {
				if(i > 0 && owners[i-1][j] != key) {
					outerEdges.Add(fields[i][j].Top);
				}

				if(i < height - 1 && owners[i+1][j] != key) {
					outerEdges.Add(fields[i][j].Bottom);
				}

				if(j > 0 && owners[i][j-1] != key) {
					outerEdges.Add(fields[i][j].Left);
				}

				if(j < width - 1 && owners[i][j+1] != key) {
					outerEdges.Add(fields[i][j].Right);
				}
			}
		}
	}

	solver.Set(ConstraintType.Equal, solver.Operation(OperationType.Addition, outerEdges.ToArray()), solver.FromConstant(2));
}

var goal = solver.FromConstant(0);

solver.AddGoal("Goal", goal);
solver.Solve();

for(int i = 0;i<height;++i) {
	for(int j = 0;j<width;++j) {
		var top = i > 0 ? solver.GetValue(fields[i][j].Top) > 0 : false;
		var right = j < width -1 ? solver.GetValue(fields[i][j].Right) > 0 : false;
		var left = j > 0 ? solver.GetValue(fields[i][j].Left) > 0 : false;
		var bottom = i < height - 1 ? solver.GetValue(fields[i][j].Bottom) > 0 : false;

		if(top && bottom) Console.Write("│");
		if(top && left) Console.Write("┘");
		if(top && right) Console.Write("└");
		if(left && right) Console.Write("-");
		if(bottom && left) Console.Write("┐");
		if(bottom && right) Console.Write("┌");
		if(top == left && left == right && right == bottom && bottom == false) Console.Write("o");
	}

	Console.WriteLine();
}

In lines 1-12 we specify the parcels. The example comes from the linked blog (the second example, the one without a given solution). Each character denotes the parcel owner.

In lines 14-29 we specify how many fields the road needs to take in each parcel with a cost. Notice that some parcels do not have a cost.

First, in lines 38-60 we initialize the variables for each direction. Not all variables exist, as some directions would lead off the board. Each variable is a binary integer denoting whether the given direction is used by the road or not. We also make sure that if we go up from some field then we go down from the field above at the same time (and the same for the other directions).

Lines 62-72 specify that we use each field at most once, so the road doesn’t cross itself. For that we must have either zero directions used in the field (meaning there is no road in the field) or exactly two directions.

In lines 76-96 we make sure neighboring fields of different parcels are not both empty. We iterate through the fields and check if they have the same owner. If they don’t, we add the directions of both fields and make sure the sum is not zero (so there will be a road in at least one of the fields).

In lines 98-114 we take the cost into account. For each parcel with a cost we find its fields. For each field we calculate a binary flag indicating whether there is a road or not (by using a disjunction). Finally, we add all these flags and make sure the correct number of fields is used.

In lines 118-144 we make sure we enter and exit each parcel exactly once. For each field of the parcel we find the directions leading out of the parcel, add all of them, and make sure the sum is two (one entrance and one exit).

Finally, we add a dummy goal and that’s all.

You may notice we didn’t specify the requirement that the road is a single loop. This doesn’t break the solution here, but in general the requirement should be added as well. It can’t be expressed as a local requirement for each pair of neighboring fields. It has to be considered globally and hence is slightly more sophisticated.
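Since the loop requirement is not part of the model, one cheap safeguard is to verify the printed result after solving. The sketch below is not an ILP constraint, just a post-check written for illustration: LoopCheck and IsSingleLoop are made-up names, and it assumes the output alphabet shown in the result (box-drawing corners, │, - and o for an empty field).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical post-check for a printed solution.
public static class LoopCheck
{
    // Exits of each road character as (row delta, column delta)
    static readonly Dictionary<char, (int, int)[]> Exits = new Dictionary<char, (int, int)[]>
    {
        ['┌'] = new[] { (1, 0), (0, 1) },
        ['┐'] = new[] { (1, 0), (0, -1) },
        ['└'] = new[] { (-1, 0), (0, 1) },
        ['┘'] = new[] { (-1, 0), (0, -1) },
        ['│'] = new[] { (-1, 0), (1, 0) },
        ['-'] = new[] { (0, -1), (0, 1) },
    };

    public static bool IsSingleLoop(string[] rows)
    {
        int total = rows.Sum(row => row.Count(ch => Exits.ContainsKey(ch)));
        int startR = Array.FindIndex(rows, row => row.Any(ch => Exits.ContainsKey(ch)));
        if (startR < 0) return false; // no road at all
        int startC = rows[startR].IndexOf(rows[startR].First(ch => Exits.ContainsKey(ch)));

        int r = startR, c = startC, prevR = -1, prevC = -1, visited = 0;
        do
        {
            bool moved = false;
            foreach (var (dr, dc) in Exits[rows[r][c]])
            {
                int nr = r + dr, nc = c + dc;
                if (nr == prevR && nc == prevC) continue; // do not walk back
                // The neighbour must exist and must have an exit pointing back at us
                if (nr < 0 || nr >= rows.Length || nc < 0 || nc >= rows[nr].Length) return false;
                if (!Exits.TryGetValue(rows[nr][nc], out var back) || !back.Contains((-dr, -dc))) return false;
                prevR = r; prevC = c; r = nr; c = nc;
                moved = true;
                break;
            }
            if (!moved) return false;
            visited++;
        } while ((r != startR || c != startC) && visited < total);

        // A single loop returns to the start after visiting every road field
        return r == startR && c == startC && visited == total;
    }
}
```

A board with two separate small loops, such as the rows "┌┐┌┐" and "└┘└┘", satisfies the local constraints on each field but is rejected by this check, which is exactly the case the global requirement is meant to exclude.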

Result:

Tried aggregator 4 times.
MIP Presolve eliminated 1264 rows and 611 columns.
MIP Presolve modified 2799 coefficients.
Aggregator did 672 substitutions.
Reduced MIP has 410 rows, 233 columns, and 1706 nonzeros.
Reduced MIP has 233 binaries, 0 generals, 0 SOSs, and 0 indicators.
Presolve time = 0.02 sec. (11.91 ticks)
Found incumbent of value 0.000000 after 0.02 sec. (14.03 ticks)

Root node processing (before b&c):
  Real time             =    0.02 sec. (14.10 ticks)
Parallel b&c, 4 threads:
  Real time             =    0.00 sec. (0.00 ticks)
  Sync time (average)   =    0.00 sec.
  Wait time (average)   =    0.00 sec.
                          ------------
Total (root+branch&cut) =    0.02 sec. (14.10 ticks)
┌┐┌--┐o┌-┐
│││oo└┐│┌┘
│└┘o┌-┘│└┐
│oo┌┘┌-┘o│
└┐┌┘o│oo┌┘
o│└-┐└┐┌┘o
┌┘oo└┐│└-┐
│o┌┐o└┘o┌┘
│o│└┐oo┌┘o
└-┘o└--┘oo

And the image:

Solution

Custom memory allocation in C# Part 15 — Allocating object on a stack without unsafe
https://blog.adamfurmanek.pl/2021/03/06/custom-memory-allocation-in-c-part-15/
Sat, 06 Mar 2021 09:00:37 +0000

This is the fifteenth part of the Custom memory allocation series. For your convenience you can find other parts in the table of contents in Part 1 — Allocating object on a stack

Last time we saw how to do unsafe operations without the unsafe keyword. This time we’ll allocate some reference type on a stack in a similar manner.

Let’s take this code:

using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

namespace Makeref_Safe_OnStack
{
    public class Program
	{
		public static void Main(string[] args)
		{
			Structure structure = new Structure();
			structure.syncBlock = 0xBADF00D;
			structure.methodHandle = 0xBADF00D;
			structure.field = 0xBADF00D;

			var method = typeof(Program).GetMethod("GetStackAddress", System.Reflection.BindingFlags.Static | System.Reflection.BindingFlags.NonPublic);
			RuntimeHelpers.PrepareMethod(method.MethodHandle);
			var codeAddress = method.MethodHandle.GetFunctionPointer();

			Console.WriteLine(codeAddress.ToString("X"));

			Marshal.Copy(addressGetter, 0, codeAddress, addressGetter.Length);

			var structureAddress = GetStackAddress() + 132;
			Console.WriteLine(structureAddress.ToString("X"));

			structure.syncBlock = 0;
			structure.methodHandle = (int)typeof(Klass).TypeHandle.Value;
			structure.field = 123;

			Holder<Klass> holder = new Holder<Klass>();
			GCHandle holderHandle = GCHandle.Alloc(holder);
			var holderAddress = Marshal.ReadIntPtr(GCHandle.ToIntPtr(holderHandle));
			Marshal.WriteIntPtr(holderAddress, 4, structureAddress + 4); // Skip first integer for sync block, assumes x86

			Console.WriteLine(holder.reference.GetType());
			Console.WriteLine(holder.reference.Field);

			structure.field = 456;
			Console.WriteLine(holder.reference.Field);
		}

		static byte[] addressGetter = new byte[] 
		{
			0x89, 0xE0, // mov eax, esp
			0xC3        // ret
		};

		static IntPtr GetStackAddress()
		{
			Console.WriteLine("Some dummy code to be replaced");
			return IntPtr.Zero;
		}
	}

	class Holder<T>
	{
		public T reference;
	}

	class Klass
	{
		public int Field;
	}

	struct Structure
	{
		public int syncBlock;
		public int methodHandle;
		public int field;
	}
}

In line 43 we have the machine code for reading the esp register. In line 22 we overwrite the GetStackAddress method with that machine code.

We allocate a structure on the stack which we later treat as an instance of a reference type. Line 34 makes the holder’s reference point at it.

Finally, you can see how we read the type in line 36 and then modify the instance field through the structure. This confirms the object really lives on the stack.
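The reason structureAddress + 4 works as an object reference is the field order of the structure. The sketch below only prints the offsets of a struct with the same three int fields. StructureLayout and LayoutDemo are names made up for this example, and the explicit StructLayout attribute is added here just to make the assumption about sequential layout visible.

```csharp
using System;
using System.Runtime.InteropServices;

// Same three int fields as Structure in the listing above (hypothetical copy).
[StructLayout(LayoutKind.Sequential)]
public struct StructureLayout
{
    public int syncBlock;
    public int methodHandle;
    public int field;
}

public static class LayoutDemo
{
    // Byte offset of a field from the beginning of the structure
    public static int OffsetOf(string fieldName) =>
        (int)Marshal.OffsetOf<StructureLayout>(fieldName);
}
```

The offsets are 0 for syncBlock, 4 for methodHandle and 8 for field. On x86 an object reference points at the method table pointer with the sync block right before it, so the reference has to point at offset 4, and the int at offset 8 is then read as Klass.Field.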

Output:

438F1F0
88BE8E0
Makeref_Safe_OnStack.Klass
123
456

Custom memory allocation in C# Part 14 — Unsafe code without unsafe keyword
https://blog.adamfurmanek.pl/2021/02/27/custom-memory-allocation-in-c-part-14/
Sat, 27 Feb 2021 09:00:32 +0000

This is the fourteenth part of the Custom memory allocation series. For your convenience you can find other parts in the table of contents in Part 1 — Allocating object on a stack

This whole series is about unsafe operations and manual memory management. However, all the things I’ve shown can be done with no unsafe keyword at all and no TypedReference instances. Let’s start with getting an instance address:

using System;
using System.Runtime.InteropServices;

namespace Makeref_Safe
{
	class Program
	{
		static void Main(string[] args)
		{
			int[] array = new int[20];

			array[1] = (int)typeof(Klass).TypeHandle.Value;
			array[2] = 123;

			GCHandle arrayHandle = GCHandle.Alloc(array);
			var arrayAddress = Marshal.ReadIntPtr(GCHandle.ToIntPtr(arrayHandle));

			Holder<Klass> holder = new Holder<Klass>();
			GCHandle holderHandle = GCHandle.Alloc(holder);
			var holderAddress = Marshal.ReadIntPtr(GCHandle.ToIntPtr(holderHandle));
			Marshal.WriteIntPtr(holderAddress, 4, arrayAddress + 3 * 4); // Skip Method Handle, array size, first integer for sync block, assumes x86

			Console.WriteLine(holder.reference.GetType());
			Console.WriteLine(holder.reference.Field);
			array[2] = 456;
			Console.WriteLine(holder.reference.Field);
		}
	}

	class Holder<T>
	{
		public T reference;
	}

	class Klass
	{
		public int Field;
	}
}

You can see the output:

Klass
123
456

So you can see that I’m able to get the object address and use it to modify anything in place. Whereas with TypedReference we could modify any reference directly, this time we need to wrap it with some holder to get another level of indirection. However, the concept is exactly the same.

What about implementing the UnsafeList in the safe way?

using System.Runtime.InteropServices;

namespace UnsafeList
{
    public class SafeList<T> where T : class
    {
        private readonly int _elementSize;
        private T _target;
        private int[] _storage;
        private int _currentIndex = -1;
        private Holder<T> holder = new Holder<T>();
        GCHandle storageHandle;
        GCHandle holderHandle;

        public SafeList(int size, int elementSize)
        {
            _elementSize = elementSize;
            _storage = new int[size * _elementSize];
            _target = default(T);
            storageHandle = GCHandle.Alloc(_storage);
            holderHandle = GCHandle.Alloc(holder);
        }

        public int Add(T item)
        {
            _currentIndex++;

            GCHandle handle = GCHandle.Alloc(item);
            var itemAddress = Marshal.ReadIntPtr(GCHandle.ToIntPtr(handle)) - 4; // Assumes x86
            handle.Free();

            for (int i = 1; i < _elementSize; ++i)
            {
                _storage[_currentIndex*_elementSize + i] = Marshal.ReadInt32(itemAddress + i * 4);
            }


            return _currentIndex;
        }

        public T GetInternal(int index)
        {
            var storageAddress = Marshal.ReadIntPtr(GCHandle.ToIntPtr(storageHandle));
            var holderAddress = Marshal.ReadIntPtr(GCHandle.ToIntPtr(holderHandle));

            Marshal.WriteIntPtr(holderAddress, 4, storageAddress + 2 * 4 + index * _elementSize * 4 + 4); // Skip 2*4 for Method Handle and array size, index * _elementSize * 4 for elements, 4 for sync block

            return holder.reference;
        }

        public T Get(int index)
        {
            return GetInternal(index);
        }

        public void Free()
        {
            storageHandle.Free();
            holderHandle.Free();
        }
    }

    class Holder<T>
    {
        public T reference;
    }
}
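The offset passed to Marshal.WriteIntPtr in GetInternal is plain arithmetic over the x86 array layout. The sketch below isolates it. FakeObjectLayout and ReferenceOffset are names invented for this example, and the 4-byte pointer size is the same x86 assumption the listings make.

```csharp
using System;

// Hypothetical helper reproducing the offset arithmetic from GetInternal.
public static class FakeObjectLayout
{
    const int PointerSize = 4; // assumption: 32-bit (x86) process
    const int IntSize = 4;

    // Byte offset, counted from the address of the int[] storage, of the
    // reference stored in the holder for the element at the given index.
    public static int ReferenceOffset(int index, int elementSize)
    {
        int arrayHeader = 2 * PointerSize;                     // array method table pointer + array length
        int precedingElements = index * elementSize * IntSize; // chunks of earlier elements
        int syncBlock = PointerSize;                           // first slot of the chunk plays the sync block
        return arrayHeader + precedingElements + syncBlock;
    }
}
```

For index 0 the result is 12 regardless of the element size, which is the 3 * 4 used in the first listing. For index 2 with three ints per element it is 36.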

Exactly the same principles. However, this time we can benchmark it in Dotnetfiddle:

------------------------
Array
Insertion time: 8
Sum: -1610778624
Calculation time: 4
Array
Insertion time: 20
Sum: -1610778624
Calculation time: 4
Array
Insertion time: 32
Sum: -1610778624
Calculation time: 3
------------------------
List
Insertion time: 7
Sum: -1610778624
Calculation time: 3
List
Insertion time: 16
Sum: -1610778624
Calculation time: 3
List
Insertion time: 42
Sum: -1610778624
Calculation time: 3
------------------------
SafeList
Insertion time: 55
Sum: -1610778624
Calculation time: 12
SafeList
Insertion time: 54
Sum: -1610778624
Calculation time: 10
SafeList
Insertion time: 68
Sum: -1610778624
Calculation time: 10

We know from previous parts that UnsafeList was faster with a huge number of elements. Here we have only 100k elements, and we can see that the UnsafeList implemented in a “safe way” (SafeList) is much slower. My output for 20 million elements:

------------------------
Array
Insertion time: 3412
Sum: -1023623168
Calculation time: 109
Array
Insertion time: 3437
Sum: -1023623168
Calculation time: 114
Array
Insertion time: 3441
Sum: -1023623168
Calculation time: 108
------------------------
List
Insertion time: 3310
Sum: -1023623168
Calculation time: 148
List
Insertion time: 3563
Sum: -1023623168
Calculation time: 118
List
Insertion time: 3413
Sum: -1023623168
Calculation time: 148
------------------------
UnsafeList
Insertion time: 1184
Sum: -1023623168
Calculation time: 174
UnsafeList
Insertion time: 1304
Sum: -1023623168
Calculation time: 210
UnsafeList
Insertion time: 1278
Sum: -1023623168
Calculation time: 177
------------------------
SafeList
Insertion time: 6714
Sum: -1023623168
Calculation time: 573
SafeList
Insertion time: 7168
Sum: -1023623168
Calculation time: 584
SafeList
Insertion time: 6746
Sum: -1023623168
Calculation time: 593

So you can see that the UnsafeList is faster at insertion (~1.2 seconds versus ~3.4 for Array and List). SafeList, on the other hand, is much slower (almost 7 seconds), but it needs no unsafe keyword.

.NET Inside Out Part 26 – Multiple identity inheritance in C#
https://blog.adamfurmanek.pl/2021/02/20/net-inside-out-part-26/
Sat, 20 Feb 2021 09:00:23 +0000

This is the twenty-sixth part of the .NET Inside Out series. For your convenience you can find other parts in the table of contents in Part 1 – Virtual and non-virtual calls in C#

We know C# has multiple signature (interface) and implementation inheritance. The latter doesn’t support full polymorphic invocations, though, but we already fixed it. We also know how to emulate state inheritance in Java and that can be almost directly translated to C#. Today we’ll see how to hack multiple identity inheritance in C#.

Word of warning: this is super hacky and requires a lot of attention (just like most things on this blog, though).

Let’s start with the following classes:

class Base1
{
	public int field;

	public void PrintInt()
	{
		Console.WriteLine(field);
	}
}

class Base2
{
	public float field;

	public void PrintFloat()
	{
		Console.WriteLine(field);
	}
}

class Base3
{
	public short field1;
	public short field2;

	public void PrintFields()
	{
		Console.WriteLine(field1);
		Console.WriteLine(field2);
	}
}

class Base4
{
	public string field;

	public void PrintString()
	{
		Console.WriteLine(field);
	}
}

They have the same physical layout in .NET Framework 4.5 on the x86 architecture. Each instance has a sync block (4 bytes), a method handle (4 bytes), and fields occupying 4 bytes (either one field such as an integer, a float, or a string reference, or two fields taking 2 bytes each). We’d like to create a class which can behave like any of these, just like with inheritance.

The idea is simple: we’ll create a fake class with matching size and holders for fields of each base class. We’ll dynamically change type as needed (morph the instance) and save/restore fields.

First, we need to have holders for base instances:

public class CurrentState
{
	public Dictionary<Type, object> holders;
	public Type lastRealType;

	public CurrentState(params object[] inheritedTypes)
	{
		holders = inheritedTypes.ToDictionary(t => t.GetType(), t => t);
	}
}

Now we need to have an interface to access the state in whichever type we are now:

interface MultipleBase
{
	CurrentState CurrentState();
}

Now we need to create subclasses with the state:

class FakeChild1 : Base1, MultipleBase
{
	// Types don't matter, size does
	public CurrentState currentState;

	public CurrentState CurrentState() => currentState;
}

class FakeChild2 : Base2, MultipleBase
{
	// Types don't matter, size does
	public CurrentState currentState;

	public CurrentState CurrentState() => currentState;
}

class FakeChild3 : Base3, MultipleBase
{
	// Types don't matter, size does
	public CurrentState currentState;

	public CurrentState CurrentState() => currentState;
}

class FakeChild4 : Base4, MultipleBase
{
	// Types don't matter, size does
	public CurrentState currentState;

	public CurrentState CurrentState() => currentState;
}

Notice how each subclass inherits the fields from the base class and also adds one more field for the state. Also, we inherit so we can use the subclass as a base class as needed.

Now it’s the time for the morphing logic:

static class MultipleBaseExtensions
{
	public static RealType Morph<FakeType, RealType>(this MultipleBase self) where FakeType : class where RealType : class
	{
		object holder;
		var currentState = self.CurrentState();
		var lastRealType = currentState.lastRealType;

		if (lastRealType != null)
		{
			holder = currentState.holders[lastRealType];

			foreach (var field in lastRealType.GetFields(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance))
			{
				field.SetValue(holder, field.GetValue(self));
			}
		}

		ChangeType(typeof(FakeType), self);
		holder = currentState.holders[typeof(RealType)];
		foreach (var field in typeof(RealType).GetFields(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance))
		{
			field.SetValue(self, field.GetValue(holder));
		}

		currentState.lastRealType = typeof(RealType);
		return (RealType)self;
	}

	private static void ChangeType(Type t, MultipleBase self)
	{
		unsafe
		{
			TypedReference selfReference = __makeref(self);
			*(IntPtr*)*(IntPtr*)*(IntPtr*)&selfReference = t.TypeHandle.Value;
		}
	}
}

We extract the current type (set by the last morph) and save all its fields to the corresponding holder. Then we change the type (morph) and restore the fields of the new type from its holder instance. We change the type by overwriting the method handle in place.
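The save and restore step can be tried in isolation, without touching the method handle at all. The sketch below is only an illustration under assumptions, not the code from this post: FieldCopier, Sample and SampleChild are hypothetical names, and it uses nothing beyond standard reflection calls.

```csharp
using System;
using System.Reflection;

// Hypothetical helper: copies every instance field declared for `type`
// from one object to another, the same way Morph saves and restores state.
public static class FieldCopier
{
    public static void CopyFields(Type type, object source, object target)
    {
        const BindingFlags flags =
            BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance;

        foreach (FieldInfo field in type.GetFields(flags))
        {
            field.SetValue(target, field.GetValue(source));
        }
    }
}

// Hypothetical stand-ins for a base class and its fake subclass.
public class Sample
{
    public short field1;
    public short field2;
}

public class SampleChild : Sample
{
    public object extra; // plays the role of currentState and is never copied
}
```

Calling CopyFields(typeof(Sample), child, holder) saves the base fields to the holder, and swapping the last two arguments restores them. The extra field is left alone because only the fields visible through the base type are enumerated.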

It’s crucial here to understand that we assume that all instances have the same size and that the currentState field is always in the same place. We need to have the same size in each of the fake subclasses to support proper heap scanning. Otherwise GC will crash. We need to have currentState field in the same place otherwise we won’t find it after morphing.
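Since everything depends on the size assumption, it may be worth checking it before morphing. The following is a rough sketch under assumptions: LayoutCheck and the payload classes are hypothetical, and the estimate ignores padding and field reordering done by the runtime, so it can only catch obvious mismatches.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical sanity check: estimates how many bytes the instance fields
// of a class occupy, so that candidate base classes can be compared.
public static class LayoutCheck
{
    private const BindingFlags InstanceFields =
        BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance;

    private static int SizeOfField(Type fieldType)
    {
        // A reference (string, object, ...) occupies one pointer.
        if (!fieldType.IsValueType) return IntPtr.Size;

        switch (Type.GetTypeCode(fieldType))
        {
            case TypeCode.Boolean:
            case TypeCode.Byte:
            case TypeCode.SByte:
                return 1;
            case TypeCode.Char:
            case TypeCode.Int16:
            case TypeCode.UInt16:
                return 2;
            case TypeCode.Int32:
            case TypeCode.UInt32:
            case TypeCode.Single:
                return 4;
            case TypeCode.Int64:
            case TypeCode.UInt64:
            case TypeCode.Double:
                return 8;
            default:
                throw new NotSupportedException(fieldType.FullName);
        }
    }

    public static int FieldBytes(Type type)
    {
        return type.GetFields(InstanceFields).Sum(f => SizeOfField(f.FieldType));
    }

    public static bool SameFieldBytes(params Type[] types)
    {
        return types.Select(t => FieldBytes(t)).Distinct().Count() <= 1;
    }
}

// Hypothetical payload classes used only to exercise the check.
public class IntPayload { public int value; }
public class ShortPairPayload { public short first; public short second; }
public class LongPayload { public long value; }
```

With these definitions an int payload and a pair of shorts are reported as compatible, while a long payload is not. A class holding a string reference matches the 4-byte classes only when pointers are 4 bytes wide, which is why the post assumes x86.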

Now the demo:

MultipleBase child = new FakeChild1
{
	currentState = new CurrentState(new Base1(), new Base2(), new Base3(), new Base4())
};

So we start with an instance of any fake subclass and create holders as needed. Next, we morph and modify fields:

Console.WriteLine("Base1");
Base1 base1 = child.Morph<FakeChild1, Base1>();
base1.field = 123;
base1.PrintInt();
Console.WriteLine();

Console.WriteLine("Base2");
Base2 base2 = child.Morph<FakeChild2, Base2>();
base2.field = 456.0f;
base2.PrintFloat();
Console.WriteLine();

Console.WriteLine("Base3");
Base3 base3 = child.Morph<FakeChild3, Base3>();
base3.field1 = 789;
base3.field2 = 987;
base3.PrintFields();
Console.WriteLine();

Console.WriteLine("Base4");
Base4 base4 = child.Morph<FakeChild4, Base4>();
base4.field = "Abrakadabra";
base4.PrintString();
Console.WriteLine();

Console.WriteLine("Base3 again");
base3 = child.Morph<FakeChild3, Base3>();
base3.PrintFields();
Console.WriteLine();

Console.WriteLine("Base2 again");
base2 = child.Morph<FakeChild2, Base2>();
base2.PrintFloat();
Console.WriteLine();

Console.WriteLine("Base1 again");
base1 = child.Morph<FakeChild1, Base1>();
base1.PrintInt();

Output:

Base1
123

Base2
456

Base3
789
987

Base4
Abrakadabra

Base3 again
789
987

Base2 again
456

Base1 again
123

So you can see that we can morph the instance and change fields, and then we can morph back and restore fields. Obviously, a multithreaded scenario would be pretty tricky here. However, we sort of hacked the instance to support multiple base classes in a fairly generic way.

Async Wandering Part 12 — Fibers with generics
https://blog.adamfurmanek.pl/2021/01/30/async-wandering-part-12/
Sat, 30 Jan 2021 09:00:11 +0000

This is the twelfth part of the Async Wandering series. For your convenience you can find other parts in the table of contents in Part 1 – Why creating Form from WinForms in unit tests breaks async?

Today we are going to color our functions in a different way.

Last time we saw how to return values from each function using monads. However, all we need is just an ability to run the code in some context if we expect that code may be asynchronous. If we know it’s going to be synchronous then there is no reason to go through any monads. Instead of reverse-colouring functions, we may use generics to propagate the context and call it as needed.

public abstract class Builder
{
	public abstract Monad Build();
}

public class IdBuilder : Builder
{
	public override Monad Build()
	{
		return new Id();
	}
}

public class AsyncBuilder : Builder
{
	public override Monad Build()
	{
		return new Async();
	}
}

public interface Monad
{
	U Map<T, U>(T value, Func<T, U> lambda);
	void Complete(object t);
}

public class Id : Monad
{
	private object t;

	public U Map<T, U>(T value, Func<T, U> lambda)
	{
		this.t = value;
		lock (this)
		{
			while (t == null)
			{
				Monitor.Wait(this);
			}
		}

		return lambda((T)this.t);
	}

	public void Complete(object t)
	{
		lock (this)
		{
			this.t = t;
			Monitor.PulseAll(this);
		}
	}
}

public class Async : Monad
{
	private object t;
	private int current;

	public U Map<T, U>(T value, Func<T, U> lambda)
	{
		this.t = value;
		if (t == null)
		{
			this.current = HKTMonadFiberAsync.current;
			byte b;
			HKTMonadFiberAsync.readyToGo.TryRemove(this.current, out b);
			HKTMonadFiberAsync.helper.Switch(0);
		}

		return lambda((T)this.t);
	}

	public void Complete(object t)
	{
		this.t = t;
		HKTMonadFiberAsync.readyToGo.TryAdd(this.current, 0);
	}
}

Super similar to the code from the last part. However, this time we don’t hold the value in the monad, we pass it as a parameter and run it through the context.

How do we use it? This way:

private static void RunInternal()
{
	WhereAmI("Before nesting");

	RunInternalNested<AsyncBuilder>();

	WhereAmI("After nesting");
}

private static void RunInternalNested<T>() where T: Builder, new()
{
	WhereAmI("Before creating delay");

	Delay<T>(2000);

	WhereAmI("After sleeping");

	var data = Data<T, string>("Some string");
	
	WhereAmI($"After creating data {data}");
}

private static void Delay<T>(int timeout) where T : Builder, new()
{
	var context = new T().Build();
	var timer = new Timer(_ => context.Complete(new object()), null, timeout, Timeout.Infinite);
	GC.KeepAlive(timer);
	context.Map((object)null, _ => timeout);
}

private static U Data<T, U>(U d) where T: Builder, new()
{
	var context = new T().Build();
	return context.Map(d, _ => d);
}

Notice that the call to Delay passes the generic parameter indicating the context. We can also wrap any value through the context, just like Task.FromResult, if needed. And the output is as expected:

Thread 1 Time 8/12/2020 5:16:21 PM: Start - HKTMonadFiberAsync
Thread 1 Time 8/12/2020 5:16:21 PM: Before nesting
Thread 1 Time 8/12/2020 5:16:21 PM: Before creating delay
Thread 1 Time 8/12/2020 5:16:21 PM: Side job
Thread 1 Time 8/12/2020 5:16:23 PM: After sleeping
Thread 1 Time 8/12/2020 5:16:23 PM: After creating data Some string
Thread 1 Time 8/12/2020 5:16:23 PM: After nesting
Thread 1 Time 8/12/2020 5:16:23 PM: End - HKTMonadFiberAsync

See that the side job was executed while we were sleeping. But if we change the call in RunInternal to RunInternalNested<IdBuilder>();, we get this:

Thread 1 Time 8/12/2020 5:17:10 PM: Start - HKTMonadFiberAsync
Thread 1 Time 8/12/2020 5:17:10 PM: Before nesting
Thread 1 Time 8/12/2020 5:17:10 PM: Before creating delay
Thread 1 Time 8/12/2020 5:17:12 PM: After sleeping
Thread 1 Time 8/12/2020 5:17:12 PM: After creating data Some string
Thread 1 Time 8/12/2020 5:17:12 PM: After nesting
Thread 1 Time 8/12/2020 5:17:12 PM: Side job
Thread 1 Time 8/12/2020 5:17:12 PM: End - HKTMonadFiberAsync

So the side job is executed after the main one finishes, which is synchronous execution.
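The propagation itself does not need fibers to be demonstrated. Below is a stripped-down sketch, assuming hypothetical names (IRunner, RunnerBuilder, PlainBuilder, LoggingBuilder, Pipeline) that are not part of this series: the caller picks the context once through a type argument, and every nested call forwards the same type argument and builds its own context from it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical context: decides how a value is run through a lambda.
public interface IRunner
{
    U Map<T, U>(T value, Func<T, U> lambda);
}

public abstract class RunnerBuilder
{
    public abstract IRunner Build();
}

// Runs the lambda directly.
public sealed class PlainRunner : IRunner
{
    public U Map<T, U>(T value, Func<T, U> lambda)
    {
        return lambda(value);
    }
}

// Records every call before running the lambda.
public sealed class LoggingRunner : IRunner
{
    public static readonly List<string> Log = new List<string>();

    public U Map<T, U>(T value, Func<T, U> lambda)
    {
        Log.Add("map " + value);
        return lambda(value);
    }
}

public sealed class PlainBuilder : RunnerBuilder
{
    public override IRunner Build() { return new PlainRunner(); }
}

public sealed class LoggingBuilder : RunnerBuilder
{
    public override IRunner Build() { return new LoggingRunner(); }
}

public static class Pipeline
{
    // The outer function only forwards the type argument.
    public static int Outer<TBuilder>(int x) where TBuilder : RunnerBuilder, new()
    {
        return Inner<TBuilder>(x) + 1;
    }

    // The inner function creates the context chosen by the caller.
    public static int Inner<TBuilder>(int x) where TBuilder : RunnerBuilder, new()
    {
        IRunner context = new TBuilder().Build();
        return context.Map(x, v => v * 2);
    }
}
```

Pipeline.Outer<PlainBuilder>(5) and Pipeline.Outer<LoggingBuilder>(5) return the same value, but only the second one goes through the logging context. The function bodies are identical in both cases, which is the point of passing the context as a type argument.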

This way we have no colors, no static state, just a generic parameter which could be optimized by the compiler. We can go even further and get rid of boxing:

public abstract class Builder
{
	public abstract Monad<T> Build<T>();
}

public class IdBuilder : Builder
{
	public override Monad<T> Build<T>()
	{
		return new Id<T>();
	}
}

public class AsyncBuilder : Builder
{
	public override Monad<T> Build<T>()
	{
		return new Async<T>();
	}
}

public interface Monad<T>
{
	U Map<U>(T value, Func<T, U> lambda);
	void Complete(T t);
}

public class Id<T> : Monad<T>
{
	private T t;

	public U Map<U>(T value, Func<T, U> lambda)
	{
		this.t = value;
		lock (this)
		{
			while (t == null)
			{
				Monitor.Wait(this);
			}
		}

		return lambda(this.t);
	}

	public void Complete(T t)
	{
		lock (this)
		{
			this.t = t;
			Monitor.PulseAll(this);
		}
	}
}

public class Async<T> : Monad<T>
{
	private T t;
	private int current;

	public U Map<U>(T value, Func<T, U> lambda)
	{
		this.t = value;
		if (t == null)
		{
			this.current = HKTMonadFiberAsync.current;
			byte b;
			HKTMonadFiberAsync.readyToGo.TryRemove(this.current, out b);
			HKTMonadFiberAsync.helper.Switch(0);
		}

		return lambda(this.t);
	}

	public void Complete(T t)
	{
		this.t = t;
		HKTMonadFiberAsync.readyToGo.TryAdd(this.current, 0);
	}
}

Bonus points for getting sort of Higher Kinded Type in C# without doing the Brand transformation.

Types and Programming Languages Part 3 — Finally during termination
https://blog.adamfurmanek.pl/2021/01/23/types-and-programming-languages-part-3/
Sat, 23 Jan 2021 09:00:13 +0000

This is the third part of the Types and Programming Languages series. For your convenience you can find other parts in the table of contents in Part 1 — Do not return in finally

Let’s take the following code:

try{
	throw new Exception("Exception 1");
}finally{
	// cleanup
}

Let’s say there is no catch block anywhere on this thread. What’s going to happen?

That depends on the platform. For instance, the C# documentation for finally says:

Within a handled exception, the associated finally block is guaranteed to be run. However, if the exception is unhandled, execution of the finally block is dependent on how the exception unwind operation is triggered. That, in turn, is dependent on how your computer is set up.

So the finally block may not be executed. The JVM, by contrast, guarantees that finally is executed.

But things are even more interesting because they may depend on the exception type. For instance, .NET has the HandleProcessCorruptedStateExceptions attribute:

Corrupted process state exceptions are exceptions that indicate that the state of a process has been corrupted. We do not recommend executing your application in this state.

By default, the common language runtime (CLR) does not deliver these exceptions to managed code, and the try/catch blocks (and other exception-handling clauses) are not invoked for them. If you are absolutely sure that you want to maintain your handling of these exceptions, you must apply the HandleProcessCorruptedStateExceptionsAttribute attribute to the method whose exception-handling clauses you want to execute. The CLR delivers the corrupted process state exception to applicable exception clauses only in methods that have both the HandleProcessCorruptedStateExceptionsAttribute and SecurityCriticalAttribute attributes.

So your application may survive but not all finally blocks may get executed.

A similar question arises when, instead of throwing an exception, you exit your application by calling exit(). Is the finally block going to run?

Why would we care? Because we typically release resources in the finally block. If these resources are local to the process then it’s not a big deal, but once you start using interprocess things (like system-wide mutexes) it’s important to release them, otherwise the other user may not know whether the protected state is corrupted or not.

Not to mention that an unhandled exception may (.NET) or may not (JVM) take the whole application down.

Takeaway? Always put a global try-catch handler on the thread.
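As a minimal sketch of that takeaway in C#, assuming a hypothetical GuardedThread helper that is not part of any library: the thread body is wrapped in a catch-all handler, so an exception thrown inside it counts as handled and the finally blocks below it are run.

```csharp
using System;
using System.Threading;

// Hypothetical helper: starts a thread whose body is wrapped in a
// top-level try-catch, so no exception leaves the thread unhandled.
public static class GuardedThread
{
    public static Thread Start(Action body, Action<Exception> onError)
    {
        var thread = new Thread(() =>
        {
            try
            {
                body();
            }
            catch (Exception e)
            {
                // Last line of defence: report the failure instead of
                // letting the exception escape the thread.
                onError(e);
            }
        });

        thread.Start();
        return thread;
    }
}
```

The handler should only log or report. It does not help with corrupted state exceptions or with a process that exits without unwinding, so cleanup of interprocess resources still needs separate care.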
