Random IT Utensils https://blog.adamfurmanek.pl IT, operating systems, maths, and more. Sat, 09 Nov 2024 21:41:56 +0000 en-US hourly 1 https://wordpress.org/?v=6.6.2 Async Wandering Part 15 — How async in C# tricks you and how to order async continuations https://blog.adamfurmanek.pl/2024/11/09/async-wandering-part-15/ https://blog.adamfurmanek.pl/2024/11/09/async-wandering-part-15/#respond Sat, 09 Nov 2024 21:10:33 +0000 https://blog.adamfurmanek.pl/?p=5101 Continue reading Async Wandering Part 15 — How async in C# tricks you and how to order async continuations]]>

This is the fifteenth part of the Async Wandering series. For your convenience you can find other parts in the table of contents in Part 1 – Why creating Form from WinForms in unit tests breaks async?

You probably heard that async is all about not blocking the operating system level thread. This is the fundamental principle of asynchronous programming. If you block the operating system level thread, you lose all the benefits of the asynchronous code.

You also need to keep in mind how to write your C# code. You probably heard that you should keep async all the way up. This is rather easy to keep because the compiler takes care of that. What’s slightly harder to remember is to keep ConfigureAwait(false) all the way down. If you don’t do it this way, the compiler won’t help you and you may run into some nasty deadlocking issues, especially if you use some weird SynchronizationContext.

Last but not least, you probably know that the asynchronous code is only useful if your code is IO-bound. You probably heard that many times. However, what might be very surprising is that C# actually does a lot to make your application work even if your code is CPU-bound and you still use async. This is very misleading and may let you believe that you know async, whereas you only know async in C#. Let’s see why.

There is plenty of no threads!

One of the best articles about async is C# is titled There Is No Thread. Stephen Cleary shows that it’s all about continuations and juggling the lambdas to run your code when some IO-bound operation finishes. I even used this title in my Internals of Async talk in which I explain all the internals of synchronization contexts, continuations, and the machinery behind the scenes.

However, it’s only a figure of speech. At the very end of the day, we need to have some thread to run the code. Depending on your synchronization context, there may be some dedicated thread to run the continuations (like in desktop or Blazor applications), or we can use threads from the thread pool. If you think carefully about the asynchronous code, you should notice that this is the place where C# either bites you hard (and causes many deadlocks) or saves your application even if you are doing something very wrong. How? Because C# uses many threads.

By default, C# uses the thread pool to run continuations. The thread pool runs some not-so-trivial algorithm to spawn new threads when there is plenty of work to be done. This is not part of the asynchronous programming paradigm per se. This is just the implementation detail of C#’s asynchronous code which heavily impacts how your applications scale. Other languages don’t do it in the same way and what works well in C# may fail badly somewhere else. For instance, Python’s asyncio uses just one thread even though Python supports multithreading. While this is just an implementation detail, it have tremendous performance implications. Let’s see why.

One thread can kill you

Let’s take a typical message processing flow. We take a message from the service bus, refresh the lease periodically, and process the message in the background. Let’s say that our flow is IO-bound and we use async to benefit from non-blocking thread instead of spawning multiple threads. Let’s simulate the system. You can find the whole code in this gist.
We start with a message that will store when we received it, when we refreshed the lease for the last time, the identifier of the message, and the final status (if we lost the message or finished successfully):

public class Message
{
	public DateTime LastRefreshTime { get; set; }
	public DateTime ReceiveTime { get; set; }
	public bool WasLost { get; set; }
	public bool WasFinished { get; set; }
	public int Id { get; set; }
}

Now, we want to configure timings in our application. We can specify how long it takes to receive the message from the bus, how many operations we need to perform on each message, and how long they all take:

public class Settings
{
	public int MessagesCount { get; set; }
	public TimeSpan ReceivingDuration { get; set; }
	public TimeSpan ProcessingCPUDuration { get; set; }
	public TimeSpan ProcessingIODuration { get; set; }
	public int ProcessingLoops { get; set; }
	public TimeSpan RefreshDelay { get; set; }
	public TimeSpan RefreshDuration { get; set; }
	public TimeSpan MessageLeaseTimeout { get; set; }
}

Finally, we have some statistics showing how we did:

public class Stats
{
	public int GeneratedMessages { get; set; }
	public int ProcessedSuccessfully { get; set; }
	public int Lost { get; set; }
}

Let’s now see the scaffodling code:

public static async Task Simulate()
{
	Initialize();
	StartMonitoringThread();
	await RunLoop();
}

private static void Initialize()
{
	CurrentStats = new Stats();
	CurrentSettings = new Settings
	{
		ReceivingDuration = TimeSpan.FromMilliseconds(250),
		RefreshDelay = TimeSpan.FromSeconds(2),
		RefreshDuration = TimeSpan.FromMilliseconds(250),
		MessageLeaseTimeout = TimeSpan.FromSeconds(5),
		ProcessingCPUDuration = TimeSpan.FromMilliseconds(100),
		ProcessingIODuration = TimeSpan.FromMilliseconds(1000),
		ProcessingLoops = 20,
		MessagesCount = 100
	};
}

private static void StartMonitoringThread()
{
	Thread monitoringThread = new Thread(() =>
	{
		while (true)
		{
			Thread.Sleep(TimeSpan.FromSeconds(3));
			Log($"Received messages {CurrentStats.GeneratedMessages}, " +
				$"success {CurrentStats.ProcessedSuccessfully}, " +
				$"failed {CurrentStats.Lost}, " +
				$"still running {CurrentStats.GeneratedMessages - CurrentStats.ProcessedSuccessfully - CurrentStats.Lost}");
		}
	});
	monitoringThread.IsBackground = true;
	monitoringThread.Start();
}

We have the Simulate method that runs the magic. It starts by initializing the timings and setting up some monitoring thread to print statistics every three seconds.

When it comes to the timings: we will run 20 loops for each message. In each loop’s iteration, we will do some CPU-bound operation (taking 100 milliseconds), and then some IO-bound operation (taking 1000 milliseconds). We can see that the CPU operation is 10 times shorter than the IO-bound one.

Finally, we have the heart of our system:

private static async Task RunLoop()
{
	while (true)
	{
		var message = await ReceiveMessage();
		if (message == null) continue;

		KeepLease(message);
		ProcessMessage(message);
	}
}

We receive the message, keep refreshing the lease, and process the message. Some error-handling code is omitted for brevity.

Receiving the message is rather straightforward – we check if we have more messages in the queue, then take one, otherwise we simply return:

public static async Task<Message?> ReceiveMessage()
{
	await Task.Yield();
	await Task.Delay(CurrentSettings.ReceivingDuration);

	if (CurrentSettings.MessagesCount-- > 0)
	{
		CurrentStats.GeneratedMessages++;
		Message message = new Message
		{
			LastRefreshTime = DateTime.Now,
			WasLost = false,
			Id = CurrentStats.GeneratedMessages,
			ReceiveTime = DateTime.Now
		};
		Log($"New message received with id {message.Id}");
		return message;
	}
	else
	{
		return null;
	}
}

Keeping a lease is also clear – we wait for some time, then refresh the lease and check if we made it on time:

public static async Task KeepLease(Message message)
{
	await Task.Yield();

	while (message.WasFinished == false) // This is unsafe according to memory model
	{
		await Task.Delay(CurrentSettings.RefreshDelay);

		if (DateTime.Now > message.LastRefreshTime + CurrentSettings.MessageLeaseTimeout)
		{
			message.WasLost = true;
			CurrentStats.Lost++;
			Log($"Lost lease for message {message.Id}");
			return;
		}
		else
		{
			await Task.Delay(CurrentSettings.RefreshDuration);
			Log($"Refreshed lease for message {message.Id}");
			message.LastRefreshTime = DateTime.Now;
		}
	}

	CurrentStats.ProcessedSuccessfully++;
}

Finally, the heart of our message processing. We simply run a loop and do the work:

public static async Task ProcessMessage(Message message)
{
	await Task.Yield();

	for (int part = 0; part < CurrentSettings.ProcessingLoops && message.WasLost == false; ++part)
	{
		Thread.Sleep(CurrentSettings.ProcessingCPUDuration); // CPU-bound part

		await Task.Delay(CurrentSettings.ProcessingIODuration); // IO-bound part
	}

	message.WasFinished = true;
	if (!message.WasLost)
	{
		Log($"Finished message with id {message.Id} in {DateTime.Now - message.ReceiveTime}");
	}
}

Notice that we block the thread for the CPU-bound operation and use await for the IO-bound one.

We also have this logging method that prints the timestamp, thread ID, and the message:

public static void Log(string message)
{
	Console.WriteLine($"{DateTime.Now}\t{Thread.CurrentThread.ManagedThreadId}\t{message}");
}

Let’s run the code, let it go for a while, and then see what happens:

11/9/2024 12:05:06 PM   4       New message received with id 1
11/9/2024 12:05:06 PM   6       New message received with id 2
11/9/2024 12:05:06 PM   6       New message received with id 3
11/9/2024 12:05:07 PM   8       New message received with id 4
11/9/2024 12:05:07 PM   8       New message received with id 5
11/9/2024 12:05:07 PM   4       New message received with id 6
11/9/2024 12:05:08 PM   4       New message received with id 7
11/9/2024 12:05:08 PM   4       New message received with id 8
11/9/2024 12:05:08 PM   11      New message received with id 9
11/9/2024 12:05:08 PM   8       Refreshed lease for message 1
11/9/2024 12:05:08 PM   12      New message received with id 10
11/9/2024 12:05:08 PM   11      Refreshed lease for message 2
11/9/2024 12:05:09 PM   12      New message received with id 11
11/9/2024 12:05:09 PM   7       Received messages 11, success 0, failed 0, still running 11
11/9/2024 12:05:09 PM   11      Refreshed lease for message 3
11/9/2024 12:05:09 PM   11      New message received with id 12
11/9/2024 12:05:09 PM   12      Refreshed lease for message 4
11/9/2024 12:05:09 PM   6       New message received with id 13
11/9/2024 12:05:09 PM   8       Refreshed lease for message 5
11/9/2024 12:05:09 PM   12      New message received with id 14
11/9/2024 12:05:10 PM   12      Refreshed lease for message 6
11/9/2024 12:05:10 PM   12      New message received with id 15
11/9/2024 12:05:10 PM   11      Refreshed lease for message 7
11/9/2024 12:05:10 PM   11      New message received with id 16
11/9/2024 12:05:10 PM   6       Refreshed lease for message 8
11/9/2024 12:05:10 PM   12      New message received with id 17
11/9/2024 12:05:10 PM   11      Refreshed lease for message 9
11/9/2024 12:05:10 PM   8       New message received with id 18
11/9/2024 12:05:10 PM   11      Refreshed lease for message 1
11/9/2024 12:05:11 PM   4       Refreshed lease for message 10
11/9/2024 12:05:11 PM   13      New message received with id 19
11/9/2024 12:05:11 PM   12      Refreshed lease for message 2
11/9/2024 12:05:11 PM   13      Refreshed lease for message 11
11/9/2024 12:05:11 PM   11      New message received with id 20
11/9/2024 12:05:11 PM   6       Refreshed lease for message 3
11/9/2024 12:05:11 PM   4       Refreshed lease for message 12
11/9/2024 12:05:11 PM   13      New message received with id 21
11/9/2024 12:05:11 PM   13      Refreshed lease for message 4
11/9/2024 12:05:11 PM   4       Refreshed lease for message 13
11/9/2024 12:05:11 PM   4       New message received with id 22
11/9/2024 12:05:12 PM   4       Refreshed lease for message 5
11/9/2024 12:05:12 PM   6       Refreshed lease for message 14
11/9/2024 12:05:12 PM   13      New message received with id 23
11/9/2024 12:05:12 PM   7       Received messages 23, success 0, failed 0, still running 23
11/9/2024 12:05:12 PM   4       Refreshed lease for message 6
11/9/2024 12:05:12 PM   4       Refreshed lease for message 15
...
11/9/2024 12:05:50 PM   15      Finished message with id 84 in 00:00:22.3550821
11/9/2024 12:05:50 PM   15      Refreshed lease for message 83
11/9/2024 12:05:50 PM   4       Refreshed lease for message 100
11/9/2024 12:05:50 PM   8       Finished message with id 85 in 00:00:22.3554494
11/9/2024 12:05:50 PM   15      Refreshed lease for message 84
11/9/2024 12:05:50 PM   20      Refreshed lease for message 92
11/9/2024 12:05:50 PM   20      Finished message with id 86 in 00:00:22.3882717
11/9/2024 12:05:50 PM   8       Refreshed lease for message 93
11/9/2024 12:05:51 PM   4       Refreshed lease for message 85
11/9/2024 12:05:51 PM   20      Finished message with id 87 in 00:00:22.3452990
11/9/2024 12:05:51 PM   4       Refreshed lease for message 94
11/9/2024 12:05:51 PM   20      Refreshed lease for message 86
11/9/2024 12:05:51 PM   7       Received messages 100, success 86, failed 0, still running 14
11/9/2024 12:05:51 PM   14      Finished message with id 88 in 00:00:22.3968974
11/9/2024 12:05:51 PM   13      Refreshed lease for message 87
11/9/2024 12:05:51 PM   4       Refreshed lease for message 95
11/9/2024 12:05:51 PM   4       Refreshed lease for message 88
11/9/2024 12:05:51 PM   14      Finished message with id 89 in 00:00:22.3782384
11/9/2024 12:05:51 PM   4       Refreshed lease for message 96
11/9/2024 12:05:51 PM   15      Finished message with id 90 in 00:00:22.3557212
11/9/2024 12:05:51 PM   15      Refreshed lease for message 89
11/9/2024 12:05:52 PM   13      Refreshed lease for message 97
11/9/2024 12:05:52 PM   20      Refreshed lease for message 90
11/9/2024 12:05:52 PM   15      Finished message with id 91 in 00:00:22.4805351
11/9/2024 12:05:52 PM   15      Refreshed lease for message 98
11/9/2024 12:05:52 PM   20      Refreshed lease for message 91
11/9/2024 12:05:52 PM   15      Refreshed lease for message 99
11/9/2024 12:05:52 PM   15      Finished message with id 92 in 00:00:22.3979587
11/9/2024 12:05:52 PM   4       Refreshed lease for message 100
11/9/2024 12:05:52 PM   20      Finished message with id 93 in 00:00:22.3374987
11/9/2024 12:05:53 PM   4       Refreshed lease for message 92
11/9/2024 12:05:53 PM   4       Finished message with id 94 in 00:00:22.3451488
11/9/2024 12:05:53 PM   4       Refreshed lease for message 93
11/9/2024 12:05:53 PM   13      Refreshed lease for message 94
11/9/2024 12:05:53 PM   13      Finished message with id 95 in 00:00:22.3784563
11/9/2024 12:05:53 PM   13      Finished message with id 96 in 00:00:22.3800325
11/9/2024 12:05:53 PM   13      Refreshed lease for message 95
11/9/2024 12:05:54 PM   4       Finished message with id 97 in 00:00:22.3312738
11/9/2024 12:05:54 PM   20      Refreshed lease for message 96
11/9/2024 12:05:54 PM   7       Received messages 100, success 96, failed 0, still running 4
11/9/2024 12:05:54 PM   13      Finished message with id 98 in 00:00:22.3502617
11/9/2024 12:05:54 PM   4       Refreshed lease for message 97
11/9/2024 12:05:54 PM   13      Finished message with id 99 in 00:00:22.3527442
11/9/2024 12:05:54 PM   4       Refreshed lease for message 98
11/9/2024 12:05:54 PM   4       Finished message with id 100 in 00:00:22.3675039
11/9/2024 12:05:54 PM   13      Refreshed lease for message 99
11/9/2024 12:05:55 PM   13      Refreshed lease for message 100
11/9/2024 12:05:57 PM   7       Received messages 100, success 100, failed 0, still running 0

We can see that all messaged were processed successfully in around 50 seconds. Processing a message was taking around 22 seconds which makes perfect sense since we had 20 iterations taking around 1100 milliseconds each. No failures, all was good.

Let’s now increase the CPU-bound operation time to 1 second (to match the IO-bound part). This is what happens:

11/9/2024 12:06:57 PM   8       New message received with id 1
11/9/2024 12:06:57 PM   11      New message received with id 2
11/9/2024 12:06:58 PM   11      New message received with id 3
11/9/2024 12:06:58 PM   4       New message received with id 4
11/9/2024 12:06:58 PM   13      New message received with id 5
11/9/2024 12:06:58 PM   13      New message received with id 6
11/9/2024 12:06:59 PM   14      New message received with id 7
11/9/2024 12:06:59 PM   11      New message received with id 8
11/9/2024 12:06:59 PM   4       New message received with id 9
11/9/2024 12:06:59 PM   15      Refreshed lease for message 1
11/9/2024 12:06:59 PM   4       New message received with id 10
11/9/2024 12:07:00 PM   12      Refreshed lease for message 2
11/9/2024 12:07:00 PM   12      New message received with id 11
11/9/2024 12:07:00 PM   7       Received messages 11, success 0, failed 0, still running 11
11/9/2024 12:07:00 PM   6       Refreshed lease for message 3
11/9/2024 12:07:00 PM   16      New message received with id 12
11/9/2024 12:07:00 PM   14      Refreshed lease for message 4
11/9/2024 12:07:00 PM   14      New message received with id 13
11/9/2024 12:07:00 PM   15      Refreshed lease for message 5
11/9/2024 12:07:00 PM   4       New message received with id 14
11/9/2024 12:07:01 PM   13      Refreshed lease for message 6
11/9/2024 12:07:01 PM   12      New message received with id 15
11/9/2024 12:07:01 PM   6       Refreshed lease for message 7
11/9/2024 12:07:01 PM   16      New message received with id 16
11/9/2024 12:07:01 PM   14      Refreshed lease for message 8
11/9/2024 12:07:01 PM   4       Refreshed lease for message 9
11/9/2024 12:07:02 PM   13      Refreshed lease for message 1
11/9/2024 12:07:02 PM   12      Refreshed lease for message 10
11/9/2024 12:07:02 PM   12      New message received with id 17
11/9/2024 12:07:02 PM   6       Refreshed lease for message 2
11/9/2024 12:07:02 PM   16      Refreshed lease for message 11
11/9/2024 12:07:02 PM   14      Refreshed lease for message 3
11/9/2024 12:07:02 PM   14      Refreshed lease for message 12
11/9/2024 12:07:02 PM   4       Refreshed lease for message 4
11/9/2024 12:07:03 PM   13      Refreshed lease for message 13
11/9/2024 12:07:03 PM   13      Refreshed lease for message 5
11/9/2024 12:07:03 PM   12      Refreshed lease for message 14
11/9/2024 12:07:03 PM   12      New message received with id 18
11/9/2024 12:07:03 PM   7       Received messages 18, success 0, failed 0, still running 18
11/9/2024 12:07:03 PM   16      Refreshed lease for message 6
11/9/2024 12:07:03 PM   16      Refreshed lease for message 15
...
11/9/2024 12:08:38 PM   21      Finished message with id 88 in 00:00:42.4181405
11/9/2024 12:08:38 PM   42      Finished message with id 90 in 00:00:41.9138898
11/9/2024 12:08:38 PM   8       Refreshed lease for message 91
11/9/2024 12:08:38 PM   28      Refreshed lease for message 85
11/9/2024 12:08:38 PM   28      Refreshed lease for message 88
11/9/2024 12:08:39 PM   24      Finished message with id 85 in 00:00:43.9296927
11/9/2024 12:08:39 PM   16      Finished message with id 93 in 00:00:40.6588087
11/9/2024 12:08:39 PM   7       Received messages 100, success 88, failed 0, still running 12
11/9/2024 12:08:39 PM   21      Refreshed lease for message 93
11/9/2024 12:08:39 PM   23      Refreshed lease for message 94
11/9/2024 12:08:39 PM   16      Refreshed lease for message 92
11/9/2024 12:08:40 PM   23      Refreshed lease for message 97
11/9/2024 12:08:40 PM   16      Refreshed lease for message 99
11/9/2024 12:08:40 PM   21      Refreshed lease for message 96
11/9/2024 12:08:40 PM   42      Refreshed lease for message 98
11/9/2024 12:08:40 PM   16      Refreshed lease for message 90
11/9/2024 12:08:40 PM   42      Refreshed lease for message 95
11/9/2024 12:08:40 PM   42      Refreshed lease for message 100
11/9/2024 12:08:40 PM   24      Finished message with id 94 in 00:00:41.3801588
11/9/2024 12:08:40 PM   8       Finished message with id 92 in 00:00:41.9121719
11/9/2024 12:08:40 PM   8       Finished message with id 95 in 00:00:41.1802881
11/9/2024 12:08:40 PM   24      Finished message with id 96 in 00:00:40.9180345
11/9/2024 12:08:40 PM   24      Refreshed lease for message 91
11/9/2024 12:08:40 PM   52      Refreshed lease for message 85
11/9/2024 12:08:41 PM   24      Finished message with id 98 in 00:00:41.3325727
11/9/2024 12:08:41 PM   21      Finished message with id 91 in 00:00:43.1595798
11/9/2024 12:08:42 PM   24      Refreshed lease for message 94
11/9/2024 12:08:42 PM   42      Refreshed lease for message 92
11/9/2024 12:08:42 PM   24      Refreshed lease for message 99
11/9/2024 12:08:42 PM   8       Refreshed lease for message 97
11/9/2024 12:08:42 PM   42      Refreshed lease for message 96
11/9/2024 12:08:42 PM   24      Finished message with id 97 in 00:00:42.6127819
11/9/2024 12:08:42 PM   42      Refreshed lease for message 98
11/9/2024 12:08:42 PM   42      Finished message with id 99 in 00:00:40.4341505
11/9/2024 12:08:42 PM   7       Received messages 100, success 95, failed 0, still running 5
11/9/2024 12:08:42 PM   21      Refreshed lease for message 95
11/9/2024 12:08:42 PM   42      Refreshed lease for message 100
11/9/2024 12:08:43 PM   42      Refreshed lease for message 91
11/9/2024 12:08:43 PM   8       Finished message with id 100 in 00:00:41.1363357
11/9/2024 12:08:44 PM   21      Refreshed lease for message 97
11/9/2024 12:08:44 PM   52      Refreshed lease for message 99
11/9/2024 12:08:44 PM   52      Refreshed lease for message 100
11/9/2024 12:08:45 PM   7       Received messages 100, success 100, failed 0, still running 0

This time it took nearly 2 minutes to process all the messages. Each message is now taking around 40 seconds. Still, all worked.

Let’s now talk about threads. You can see that the examples use multiple threads to handle the messages. In the second execution, there were around 60 active messages at one time, so this created many threads (we can see that at least 50 threads were created based on the log above). Our application scales well and we can’t complain. Seems like async is doing a really good job!

However, what would happen if we moved this code to some other asynchronous platform? For instance, to Python’s asyncio that uses only single thread? We can emulate that in C# by running the code above in a WinForms context that forces continuations to go through one thread. Let’s change the CPU-bound operation duration to 100 milliseconds (to the original value) and let’s run this from the WinForms app now:

11/9/2024 12:19:20 PM   1       New message received with id 1
11/9/2024 12:19:20 PM   1       New message received with id 2
11/9/2024 12:19:20 PM   1       New message received with id 3
11/9/2024 12:19:21 PM   1       New message received with id 4
11/9/2024 12:19:21 PM   1       New message received with id 5
11/9/2024 12:19:22 PM   1       New message received with id 6
11/9/2024 12:19:22 PM   1       New message received with id 7
11/9/2024 12:19:22 PM   1       Refreshed lease for message 1
11/9/2024 12:19:22 PM   1       New message received with id 8
11/9/2024 12:19:22 PM   11      Received messages 8, success 0, failed 0, still running 8
11/9/2024 12:19:23 PM   1       Refreshed lease for message 2
11/9/2024 12:19:23 PM   1       New message received with id 9
11/9/2024 12:19:23 PM   1       Refreshed lease for message 3
11/9/2024 12:19:23 PM   1       New message received with id 10
11/9/2024 12:19:23 PM   1       Refreshed lease for message 4
11/9/2024 12:19:24 PM   1       Refreshed lease for message 5
11/9/2024 12:19:24 PM   1       New message received with id 11
11/9/2024 12:19:24 PM   1       Refreshed lease for message 6
11/9/2024 12:19:24 PM   1       New message received with id 12
11/9/2024 12:19:25 PM   1       Refreshed lease for message 7
11/9/2024 12:19:25 PM   1       Refreshed lease for message 1
11/9/2024 12:19:25 PM   1       Refreshed lease for message 8
11/9/2024 12:19:25 PM   1       New message received with id 13
11/9/2024 12:19:25 PM   11      Received messages 13, success 0, failed 0, still running 13
11/9/2024 12:19:26 PM   1       Refreshed lease for message 2
11/9/2024 12:19:26 PM   1       Refreshed lease for message 9
...
11/9/2024 12:21:15 PM   1       Refreshed lease for message 58
11/9/2024 12:21:15 PM   1       Refreshed lease for message 62
11/9/2024 12:21:15 PM   1       Finished message with id 46 in 00:00:42.5373955
11/9/2024 12:21:16 PM   1       Refreshed lease for message 49
11/9/2024 12:21:16 PM   1       Refreshed lease for message 51
11/9/2024 12:21:16 PM   1       Refreshed lease for message 53
11/9/2024 12:21:17 PM   1       New message received with id 65
11/9/2024 12:21:17 PM   1       Refreshed lease for message 55
11/9/2024 12:21:17 PM   1       Refreshed lease for message 46
11/9/2024 12:21:17 PM   1       Refreshed lease for message 57
11/9/2024 12:21:17 PM   1       Refreshed lease for message 59
11/9/2024 12:21:17 PM   11      Received messages 65, success 43, failed 3, still running 19
11/9/2024 12:21:17 PM   1       Refreshed lease for message 61
11/9/2024 12:21:17 PM   1       Refreshed lease for message 48
...
11/9/2024 12:22:53 PM   11      Received messages 100, success 90, failed 3, still running 7
11/9/2024 12:22:53 PM   1       Refreshed lease for message 97
11/9/2024 12:22:53 PM   1       Refreshed lease for message 99
11/9/2024 12:22:54 PM   1       Refreshed lease for message 94
11/9/2024 12:22:54 PM   1       Refreshed lease for message 96
11/9/2024 12:22:54 PM   1       Finished message with id 95 in 00:00:32.1189187
11/9/2024 12:22:54 PM   1       Refreshed lease for message 98
11/9/2024 12:22:54 PM   1       Refreshed lease for message 100
11/9/2024 12:22:55 PM   1       Refreshed lease for message 95
11/9/2024 12:22:56 PM   1       Finished message with id 96 in 00:00:31.0536654
11/9/2024 12:22:56 PM   1       Refreshed lease for message 99
11/9/2024 12:22:56 PM   1       Refreshed lease for message 97
11/9/2024 12:22:56 PM   11      Received messages 100, success 92, failed 3, still running 5
11/9/2024 12:22:56 PM   1       Refreshed lease for message 96
11/9/2024 12:22:57 PM   1       Refreshed lease for message 98
11/9/2024 12:22:57 PM   1       Refreshed lease for message 100
11/9/2024 12:22:57 PM   1       Finished message with id 97 in 00:00:30.1211740
11/9/2024 12:22:58 PM   1       Refreshed lease for message 99
11/9/2024 12:22:58 PM   1       Refreshed lease for message 97
11/9/2024 12:22:58 PM   1       Finished message with id 98 in 00:00:29.1238261
11/9/2024 12:22:59 PM   1       Refreshed lease for message 98
11/9/2024 12:22:59 PM   1       Refreshed lease for message 100
11/9/2024 12:22:59 PM   11      Received messages 100, success 95, failed 3, still running 2
11/9/2024 12:22:59 PM   1       Finished message with id 99 in 00:00:28.1643467
11/9/2024 12:23:00 PM   1       Refreshed lease for message 99
11/9/2024 12:23:00 PM   1       Finished message with id 100 in 00:00:27.2119750
11/9/2024 12:23:01 PM   1       Refreshed lease for message 100
11/9/2024 12:23:02 PM   11      Received messages 100, success 97, failed 3, still running 0

It wasn’t that bad and we can see that we indeed ran on a single thread. First, notice that now it took nearly 4 minutes to complete. That’s understandable as we now run things on a single thread. Also, notice that each message was taking around 30-40 seconds to complete. That is much longer than before. This is because messages compete for the CPU time and we don’t have any parallelism. It’s also worth noting that we lost 3 messages. That’s not that bad. The system overscaled just a bit and couldn’t deal with the load but the stabilized and finished processing.

Let’s now increase the CPU-bound duration to 1 second and try again:

11/9/2024 12:24:26 PM   1       New message received with id 1
11/9/2024 12:24:27 PM   1       New message received with id 2
11/9/2024 12:24:29 PM   12      Received messages 2, success 0, failed 0, still running 2
11/9/2024 12:24:29 PM   1       New message received with id 3
11/9/2024 12:24:29 PM   1       Refreshed lease for message 1
11/9/2024 12:24:32 PM   12      Received messages 3, success 0, failed 0, still running 3
11/9/2024 12:24:32 PM   1       Refreshed lease for message 2
11/9/2024 12:24:33 PM   1       New message received with id 4
11/9/2024 12:24:34 PM   1       Lost lease for message 3
11/9/2024 12:24:34 PM   1       Refreshed lease for message 1
11/9/2024 12:24:35 PM   12      Received messages 4, success 0, failed 1, still running 3
11/9/2024 12:24:38 PM   12      Received messages 4, success 0, failed 1, still running 3
11/9/2024 12:24:39 PM   1       Refreshed lease for message 2
11/9/2024 12:24:39 PM   1       New message received with id 5
11/9/2024 12:24:39 PM   1       Lost lease for message 4
11/9/2024 12:24:39 PM   1       Refreshed lease for message 1
11/9/2024 12:24:41 PM   12      Received messages 5, success 0, failed 2, still running 3
11/9/2024 12:24:43 PM   1       New message received with id 6
11/9/2024 12:24:44 PM   12      Received messages 6, success 0, failed 2, still running 4
11/9/2024 12:24:46 PM   1       Refreshed lease for message 5
...
11/9/2024 12:31:48 PM   1       Lost lease for message 98
11/9/2024 12:31:51 PM   12      Received messages 99, success 7, failed 88, still running 4
11/9/2024 12:31:52 PM   1       Refreshed lease for message 97
11/9/2024 12:31:52 PM   1       Refreshed lease for message 92
11/9/2024 12:31:52 PM   1       Refreshed lease for message 86
11/9/2024 12:31:54 PM   12      Received messages 99, success 7, failed 88, still running 4
11/9/2024 12:31:54 PM   1       New message received with id 100
11/9/2024 12:31:54 PM   1       Lost lease for message 99
11/9/2024 12:31:54 PM   1       Finished message with id 86 in 00:01:04.5795457
11/9/2024 12:31:57 PM   12      Received messages 100, success 7, failed 89, still running 4
11/9/2024 12:31:57 PM   1       Refreshed lease for message 86
11/9/2024 12:31:57 PM   1       Refreshed lease for message 97
11/9/2024 12:31:58 PM   1       Refreshed lease for message 92
11/9/2024 12:31:59 PM   1       Lost lease for message 100
11/9/2024 12:32:00 PM   12      Received messages 100, success 8, failed 90, still running 2
11/9/2024 12:32:01 PM   1       Refreshed lease for message 97
11/9/2024 12:32:02 PM   1       Refreshed lease for message 92
11/9/2024 12:32:03 PM   12      Received messages 100, success 8, failed 90, still running 2
11/9/2024 12:32:04 PM   1       Refreshed lease for message 97
11/9/2024 12:32:05 PM   1       Refreshed lease for message 92
11/9/2024 12:32:06 PM   12      Received messages 100, success 8, failed 90, still running 2
11/9/2024 12:32:08 PM   1       Refreshed lease for message 97
11/9/2024 12:32:09 PM   12      Received messages 100, success 8, failed 90, still running 2
11/9/2024 12:32:09 PM   1       Refreshed lease for message 92
11/9/2024 12:32:11 PM   1       Refreshed lease for message 97
11/9/2024 12:32:12 PM   12      Received messages 100, success 8, failed 90, still running 2
11/9/2024 12:32:12 PM   1       Finished message with id 92 in 00:00:55.4921671
11/9/2024 12:32:13 PM   1       Refreshed lease for message 92
11/9/2024 12:32:14 PM   1       Refreshed lease for message 97
11/9/2024 12:32:15 PM   12      Received messages 100, success 9, failed 90, still running 1
11/9/2024 12:32:17 PM   1       Refreshed lease for message 97
11/9/2024 12:32:18 PM   12      Received messages 100, success 9, failed 90, still running 1
11/9/2024 12:32:19 PM   1       Refreshed lease for message 97
11/9/2024 12:32:21 PM   12      Received messages 100, success 9, failed 90, still running 1
11/9/2024 12:32:21 PM   1       Refreshed lease for message 97
11/9/2024 12:32:24 PM   1       Refreshed lease for message 97
11/9/2024 12:32:24 PM   12      Received messages 100, success 9, failed 90, still running 1
11/9/2024 12:32:27 PM   1       Refreshed lease for message 97
11/9/2024 12:32:27 PM   12      Received messages 100, success 9, failed 90, still running 1
11/9/2024 12:32:28 PM   1       Finished message with id 97 in 00:00:50.4062632
11/9/2024 12:32:29 PM   1       Refreshed lease for message 97
11/9/2024 12:32:30 PM   12      Received messages 100, success 10, failed 90, still running 0

And here things start to collapse. It took us 8 minutes to process all messages, each of them was taking around 1 minute, and we failed to process 90 out of one hundred. We lost 90% of all of the messages. Our system became unreliable just because we increased the CPU-bound part of the message processing. But why did it break the application exactly? What happened?

You don’t control the priority of continuations

Our application runs three distinct operations in total:

  • Take the message from the queue
  • Refresh the lease of the message
  • Do some processing of the message

Every single time we await the task, we release the thread and let it do something else. Once the IO-bound operation finishes, we schedule it to run on the same thread. However, the order of continuations doesn’t reflect the importance of what we should do.

In order to keep the system stable, we need to refresh the leases. Therefore, if there is any continuation that wants to refresh the lease (the continuation in KeepLease method), it should run before everything else.

Once we don’t have any continuations for refreshing the leases, we should run continuations for message processing. Obviously, if some KeepLease continuation gets scheduled, it should preempt other continuations.

Finally, when we have no continuations for refreshing the leases or processing the messages, we should run the continuation for getting new message from the queue. In other words, we receive a new message only when we have some idle CPU time that we can use to process something more.

Unfortunately, the async in C# doesn’t let you easily prioritize the continuations. However, this is not a problem most of the times because C# uses multiple threads! Once a continuation is free to run, the thread pool will grow to run the continuation earlier if possible. This is not part of the async programming paradigm and you can’t take it for granted. However, when we run things on a single thread, then continuations have no priorities and message processing continuations may stop lease refreshing continuations from running. Even worse, we may run continuation that receives new message from the bus even though we are already overloaded.

Depending on the nature of your platform (be it C# with different synchronization context, Python with single-threaded asyncio, or JavaScript with one and only one thread), you may get different results. Your application may scale well or may fail badly.

Let’s fix it

We can fix this issue in many ways. Conceptually, we need three different queues: the first one represents the lease refreshments, the second is for message processing, and the third is for getting new message from the bus. We would then have one processor that would check each of the queues in order and execute the operations accordingly. Unfortunately, rewriting the application from async paradigm to a consumer with multiple queues is not straightforward.

Instead, we can reorder the continuations. The trick is to introduce a priority for each continuation. We do the following:

  1. We store a desired priority of continuations running on a thread
  2. When a continuation wants to run, it checks if the desired priority is equal to the priority of the continuation
  3. If it’s equal, then the continuation resets the desired priority to some invalid value and continues
  4. Otherwise, the continuation bumps the priority if possible and lets other continuations run

The protocol is based on the following idea: some continuation sets the desired priority to be at least the priority of the continuation and then lets other continuations to run. If there are other continuations of lower priority, they will simply release the CPU and reschedule themselves. If there are continuations of some higher priority, they will bump the desired priority. And if there are no continuations, then th original continuation will finally get the CPU, run the code, and reset the priority, so other continuations can run the same dance over and over. Here is the code:

public static async Task Prioritize(int priority)
{
	var cookie = Guid.NewGuid();

	while (true)
	{
		if (CurrentSettings.DesiredPriority == priority && CurrentSettings.Cookie == cookie)
		{
			CurrentSettings.DesiredPriority = -1;
			CurrentSettings.Cookie = Guid.Empty;
			return;
		}
		else
		{
			if (CurrentSettings.DesiredPriority < priority)
			{
				CurrentSettings.DesiredPriority = priority;
				CurrentSettings.Cookie = cookie;
				await Task.Yield();
				continue;
			}
			else
			{
				await Task.Yield();
				continue;
			}
		}
	}
}

We need to run this method every single time when we run await. For instance, KeepLease becomes this:

public static async Task KeepLease(Message message)
{
	await Task.Yield();
	await Prioritize(3);

	while (message.WasFinished == false) // This is unsafe according to memory model
	{
		await Task.Delay(CurrentSettings.RefreshDelay);
		await Prioritize(3);

		if (DateTime.Now > message.LastRefreshTime + CurrentSettings.MessageLeaseTimeout)
		{
			message.WasLost = true;
			CurrentStats.Lost++;
			Log($"Lost lease for message {message.Id}");
			return;
		}
		else
		{
			await Task.Delay(CurrentSettings.RefreshDuration);
			await Prioritize(3);
			Log($"Refreshed lease for message {message.Id}");
			message.LastRefreshTime = DateTime.Now;
		}
	}

	CurrentStats.ProcessedSuccessfully++;
}

You can find the full snippet here. Let’s see it in action:

11/9/2024 12:35:42 PM   1       New message received with id 1
11/9/2024 12:35:43 PM   1       New message received with id 2
11/9/2024 12:35:45 PM   12      Received messages 2, success 0, failed 0, still running 2
11/9/2024 12:35:45 PM   1       Refreshed lease for message 1
11/9/2024 12:35:46 PM   1       Refreshed lease for message 2
11/9/2024 12:35:48 PM   12      Received messages 2, success 0, failed 0, still running 2
11/9/2024 12:35:48 PM   1       Refreshed lease for message 1
11/9/2024 12:35:49 PM   1       Refreshed lease for message 2
11/9/2024 12:35:51 PM   12      Received messages 2, success 0, failed 0, still running 2
11/9/2024 12:35:51 PM   1       Refreshed lease for message 1
11/9/2024 12:35:52 PM   1       Refreshed lease for message 2
11/9/2024 12:35:53 PM   1       New message received with id 3
11/9/2024 12:35:54 PM   12      Received messages 3, success 0, failed 0, still running 3
11/9/2024 12:35:55 PM   1       Refreshed lease for message 1
11/9/2024 12:35:56 PM   1       Refreshed lease for message 3
11/9/2024 12:35:57 PM   12      Received messages 3, success 0, failed 0, still running 3
11/9/2024 12:35:58 PM   1       Refreshed lease for message 1
11/9/2024 12:35:58 PM   1       Refreshed lease for message 2
11/9/2024 12:35:59 PM   1       Refreshed lease for message 3
11/9/2024 12:36:00 PM   12      Received messages 3, success 0, failed 0, still running 3
11/9/2024 12:36:01 PM   1       Refreshed lease for message 1
11/9/2024 12:36:02 PM   1       Refreshed lease for message 3
11/9/2024 12:36:03 PM   12      Received messages 3, success 0, failed 0, still running 3
...
11/9/2024 1:09:00 PM    1       Refreshed lease for message 100
11/9/2024 1:09:00 PM    1       Refreshed lease for message 98
11/9/2024 1:09:02 PM    12      Received messages 100, success 97, failed 0, still running 3
11/9/2024 1:09:03 PM    1       Refreshed lease for message 99
11/9/2024 1:09:03 PM    1       Refreshed lease for message 100
11/9/2024 1:09:04 PM    1       Refreshed lease for message 98
11/9/2024 1:09:05 PM    1       Finished message with id 98 in 00:00:46.5274062
11/9/2024 1:09:05 PM    12      Received messages 100, success 97, failed 0, still running 3
11/9/2024 1:09:06 PM    1       Refreshed lease for message 100
11/9/2024 1:09:06 PM    1       Refreshed lease for message 99
11/9/2024 1:09:07 PM    1       Refreshed lease for message 98
11/9/2024 1:09:08 PM    12      Received messages 100, success 98, failed 0, still running 2
11/9/2024 1:09:09 PM    1       Refreshed lease for message 100
11/9/2024 1:09:09 PM    1       Refreshed lease for message 99
11/9/2024 1:09:11 PM    12      Received messages 100, success 98, failed 0, still running 2
11/9/2024 1:09:12 PM    1       Refreshed lease for message 99
11/9/2024 1:09:12 PM    1       Refreshed lease for message 100
11/9/2024 1:09:14 PM    12      Received messages 100, success 98, failed 0, still running 2
11/9/2024 1:09:15 PM    1       Refreshed lease for message 99
11/9/2024 1:09:15 PM    1       Refreshed lease for message 100
11/9/2024 1:09:17 PM    1       Finished message with id 99 in 00:00:51.5793525
11/9/2024 1:09:17 PM    1       Refreshed lease for message 99
11/9/2024 1:09:17 PM    1       Refreshed lease for message 100
11/9/2024 1:09:17 PM    12      Received messages 100, success 99, failed 0, still running 1
11/9/2024 1:09:19 PM    1       Refreshed lease for message 100
11/9/2024 1:09:20 PM    12      Received messages 100, success 99, failed 0, still running 1
11/9/2024 1:09:22 PM    1       Refreshed lease for message 100
11/9/2024 1:09:23 PM    12      Received messages 100, success 99, failed 0, still running 1
11/9/2024 1:09:25 PM    1       Refreshed lease for message 100
11/9/2024 1:09:26 PM    12      Received messages 100, success 99, failed 0, still running 1
11/9/2024 1:09:27 PM    1       Refreshed lease for message 100
11/9/2024 1:09:29 PM    12      Received messages 100, success 99, failed 0, still running 1
11/9/2024 1:09:29 PM    1       Refreshed lease for message 100
11/9/2024 1:09:30 PM    1       Finished message with id 100 in 00:00:54.5227973
11/9/2024 1:09:32 PM    1       Refreshed lease for message 100
11/9/2024 1:09:32 PM    12      Received messages 100, success 100, failed 0, still running 0

We can see that the code runs much slower and it takes 35 minutes to complete. However, all messages are processed successfully and the code scales automatically. We don’t need to manually control the thread pool size, but the application simply processes fewer or more messages depending on the actual CPU-bound processing time.

Summary

async programming is very hard. We were told many times that it’s as simple as putting async here and await there. C# did a lot to get rid of deadlocks as nearly every platform now uses no synchronization context (as compared with old ASP.NET which had its own context or all the desktop apps that were running with a single thread). Also, C# uses a thread pool and can fix many programmer’s mistakes that can limit the scalability.

However, asynchronous programming can be implemented in many other ways. You can’t assume that it will use many threads or that the coroutine will be triggered immediately. Many quirks can decrease the performance. Python’s asyncio is a great example of how asynchronous programming can work much differently, especially if you take the Python’s performance into consideration. What’s IO-bound in C#, can easily become CPU-bound in Python because Python is way slower.

]]>
https://blog.adamfurmanek.pl/2024/11/09/async-wandering-part-15/feed/ 0
RPA Part 1 — Sharing Word/Excel between objects in Blue Prism https://blog.adamfurmanek.pl/2024/09/09/rpa-part-1/ https://blog.adamfurmanek.pl/2024/09/09/rpa-part-1/#respond Mon, 09 Sep 2024 16:22:45 +0000 https://blog.adamfurmanek.pl/?p=5086 Continue reading RPA Part 1 — Sharing Word/Excel between objects in Blue Prism]]> It’s quite common for Blue Prism developers to use MS Word VBO or similar to interact with Word instances in workflows. However, a problem arises when we want to extend the object with some custom functionality that utilizes Word COM object. The issue is that MS Word VBO doesn’t expose the Word instance and we can only use the handle. Due to that, Blue Prism developers often extend this object and add actions directly in it. This breaks many principles and is very nasty. Let’s see why and how to fix that.

Why extending MS Word VBO is not the best idea

Nothing stops us from extending the object. However, this is not ideal because:

  • IT administrators become scared of updating the object since it is hard to migrate the newly added actions
  • We stop understanding what’s in the object. Apart from new actions, someone may have actually modified existing ones and broken the object. We can’t trust it anymore
  • This breaks Open Closed principle – objects should be open for extension but closed for modification
  • MS Word VBO is written in VB.NET. If we want to use C#, we can’t do that as we can’t mix languages inside an object
  • It’s hard to share functionality between projects. Ideally, we’d like to be able to share our actions between projects and separate them from actions that we don’t want to share. This is much harder if we put these actions directly on the object
  • Actions are much further from the business context. If we need a specific action for one of our processes, then it’s probably better to keep the action with other actions in the process-specific object

There are probably many more reasons to avoid that. Generally, apart from technical limitations (like being unable to use C#), this is just a matter of best practices and good design.

Why it’s hard to put actions outside of the object

First and foremost, Word instance is a COM object. Without going into details, it’s an object that is globally visible in the operating system and exposes many APIs. You can call this object from Blue Prism, Visual Basic for Applications, PowerShell, Visual Basic Script, C#, Java, and much more.

However, actions in Blue Prism objects cannot return any type they want. They can deal with numbers, texts, collections, and some other things. However, they can’t return an arbitrary object type. This means that the action cannot return a Word instance. To work that around, MS Word Object returns a handle which is an integer (a number). So when we call MS Word.Create Instance, then the method returns a number that identifies the word instance held behind the scenes. If we then call a method like MS Word.Open, the MS Word Object must find the actual Word instance based on the number.

Technically, it works in the following way. When creating an instance, this is the code that is executed:

Dim word as Object = CreateObject("Word.Application")

' Create a GUID with which we can kill the instance later
' if we have to play hardball to get rid of it.
word.Caption = System.Guid.NewGuid().ToString().ToUpper()

handle = GetHandle(word)

We create an instance of the Word object, and then we get the handle for it. This is GetHandle:

' Gets the handle for a given instance
'
' If the instance is not yet held, then it is added to the 
' 	map and a handle is assigned to it. It is also set as the
' 	'current' instance, accessed with a handle of zero in the
' 	below methods.
'
' Either way, the handle which identifies the instance is returned
'
' @param Instance The instance for which a handle is required
'
' @return The handle of the instance
Protected Function GetHandle(Instance As Object) As Integer

	If Instance Is Nothing Then
		Throw New ArgumentNullException("Tried to add an empty instance")
	End If

	' Check if we already have this instance - if so, return it.
	If InstanceMap.ContainsKey(Instance) Then
		CurrentInstance = Instance
		Return InstanceMap(Instance)
	End If

	Dim key as Integer
	For key = 1 to Integer.MaxValue
		If Not HandleMap.ContainsKey(key)
			HandleMap.Add(key, Instance)
			InstanceMap.Add(Instance, key)
			Me.CurrentInstance = Instance
			Return key
		End If
	Next key

	Return 0

End Function

We first check if the Instance is empty. It is not, as we just created it. We then check if the instance is already in the InstanceMap collection that holds all the Word instances we already created. At this point, it is not there. In that case, we simply iterate over all numbers from 1 going up, and then we find the first unused number. We add the instance to the HandleMap with key equal to 1, and then we return 1 as the handle that the user will use later on.

Let’s now say that you call Open. This is what it does behind the scenes:

' Just ensure that the handle references a valid instance
HandleMissing = (GetInstance(handle) is Nothing)

This calls GetInstance which looks like this:

' Gets the instance corresponding to the given handle, setting
' 	the instance as the 'current' instance for future calls
'
' A value of 0 will provide the 'current' instance, which
' 	is set each time an instance is added or accessed.
'
' This will return Nothing if the given handle does not
' correspond to a registered instance, or if the current
' instance was closed and the reference has not been updated.
'
' @param Handle The handle representing the instance required,
' 		or zero to get the 'current' instance.
Protected Function GetInstance(Handle As Integer) As Object

	Dim Instance As Object = Nothing
	
	If Handle = 0 Then
		If CurrentInstance Is Nothing Then
			' Special case - getting the current instance when the
			' instance is not set, try and get a current open instance.
			' If none there, create a new one and assign a handle as if
			' CreateInstance() had been called
		'	Try
		'		Instance = GetObject(,"Word.Application")
		'	Catch ex as Exception ' Not running
		'		Instance = Nothing
		'	End Try
		'	If Instance Is Nothing Then
				Create_Instance(Handle)
				' Instance = CreateObject("Word.Application")
				' Force the instance into the maps.
				' GetHandle(Instance)
				' CurrentInstance should now be set.
				' If it's not, we have far bigger problems
		'	End If
		End If
		Return CurrentInstance
	End If
	
	If Not HandleMap.ContainsKey(Handle)
            Throw New ArgumentException("A valid attachment to the application cannot been detected. Please check that Blue Prism is attached to an instance of the application.")
	End If

	Instance = HandleMap(Handle)
	If Not Instance Is Nothing Then
		CurrentInstance = Instance
	End If
	Return Instance

End Function

We just check if the handle is in the HandleMap and then store it in the CurrentInstance. So this is how we get the Word instance back from the integer passed by the user.

To use the Word instance in some other object, we would need to access the Word instance. However, we can’t just return it from the action as Blue Prism doesn’t support returning any type. We could also try to get access to HandleMap and then extract the instance. This is doable but far from straightforward. However, we can use some clever tricks to get access.

One of possible workarounds – truly global variable in Blue Prism

The idea is to create a global dictionary for sharing any data between any objects in Blue Prism process. The solution will work like this:

  1. We create a code block that creates a globally-accessible dictionary storing string-object pairs
  2. We add one action to MS Word VBO to put the Word COM instance in the dictionary based on the handle
  3. In some other object (like our custom MS Word Object), we extract the com instance from the globally-accessible dictionary and use it accordingly

Let’s see that in action.

First, let’s create an object CustomWordObject.

We need these two external references added:

System.Windows.Forms.dll

and

C:\Program Files (x86)\Microsoft Visual Studio\Shared\Visual Studio Tools for Office\PIA\Office15\Microsoft.Office.Interop.Word.dll

You may need to adjust the path to the Word dll based on your Word version.

Now, let’s create an action named CreateGlobalDictionary with the following content:

<process name="__selection__CustomWordObject" type="object" runmode="Exclusive"><stage stageid="58774a62-94b0-4409-b265-8a5bd53ef49b" name="Start" type="Start"><subsheetid>16965f9d-eb5e-44bc-bb24-fb2732756063</subsheetid><loginhibit /><display x="15" y="-105" /><onsuccess>1e2e0509-5b3b-4b0b-8d20-dcc5b5e91743</onsuccess></stage><stage stageid="ced18beb-abce-4eac-92d4-e627fa99a391" name="End" type="End"><subsheetid>16965f9d-eb5e-44bc-bb24-fb2732756063</subsheetid><loginhibit /><display x="15" y="90" /></stage><stage stageid="1e2e0509-5b3b-4b0b-8d20-dcc5b5e91743" name="Create global dictionary" type="Code"><subsheetid>16965f9d-eb5e-44bc-bb24-fb2732756063</subsheetid><loginhibit /><display x="15" y="-15" w="90" h="30" /><onsuccess>ced18beb-abce-4eac-92d4-e627fa99a391</onsuccess><code><![CDATA[try{
	var name = new System.Reflection.AssemblyName("GlobalAssemblyForSharingData");
	var assemblyBuilder = System.Reflection.Emit.AssemblyBuilder.DefineDynamicAssembly(name, System.Reflection.Emit.AssemblyBuilderAccess.Run);
	var moduleBuilder = assemblyBuilder.DefineDynamicModule(name.Name ?? "GlobalAssemblyForSharingDataModule");
	var typeBuilder = moduleBuilder.DefineType("GlobalType", System.Reflection.TypeAttributes.Public);
	var fieldBuilder = typeBuilder.DefineField("GlobalDictionary", typeof(System.Collections.Generic.Dictionary<string, object>), System.Reflection.FieldAttributes.Public | System.Reflection.FieldAttributes.Static);
	Type t = typeBuilder.CreateType();
	System.Reflection.FieldInfo fieldInfo = t.GetField("GlobalDictionary");
	fieldInfo.SetValue(null, new System.Collections.Generic.Dictionary<string, object>());
}catch(Exception e){
	System.Windows.Forms.MessageBox.Show(e.ToString());
}]]></code></stage></process>

Specifically, there is a code block that does the following:

try{
	var name = new System.Reflection.AssemblyName("GlobalAssemblyForSharingData");
	var assemblyBuilder = System.Reflection.Emit.AssemblyBuilder.DefineDynamicAssembly(name, System.Reflection.Emit.AssemblyBuilderAccess.Run);
	var moduleBuilder = assemblyBuilder.DefineDynamicModule(name.Name ?? "GlobalAssemblyForSharingDataModule");
	var typeBuilder = moduleBuilder.DefineType("GlobalType", System.Reflection.TypeAttributes.Public);
	var fieldBuilder = typeBuilder.DefineField("GlobalDictionary", typeof(System.Collections.Generic.Dictionary<string, object>), System.Reflection.FieldAttributes.Public | System.Reflection.FieldAttributes.Static);
	Type t = typeBuilder.CreateType();
	System.Reflection.FieldInfo fieldInfo = t.GetField("GlobalDictionary");
	fieldInfo.SetValue(null, new System.Collections.Generic.Dictionary<string, object>());
}catch(Exception e){
	System.Windows.Forms.MessageBox.Show(e.ToString());
}

We use some .NET magic to dynamically create an assembly named GlobalAssemblyForSharingData, add a type named GlobalType to it, add a global field named GlobalDictionary, and initialize it accordingly.

Next, let’s add the action to MS Word VBO named ExportWord with the following content:

<process name="__selection__MS Word" type="object" runmode="Background"><stage stageid="f44e379f-73b9-4e3a-8f3a-304011753dbc" name="Start" type="Start"><subsheetid>5600ecd3-0257-4793-8c9b-58fba32fc417</subsheetid><loginhibit /><display x="15" y="-105" /><inputs><input type="number" name="handle" stage="handle" /></inputs><onsuccess>064c4864-b486-4ff5-9e44-0d9b0e11c003</onsuccess></stage><stage stageid="2ef37518-b895-403e-93ee-76b9743c2cf0" name="End" type="End"><subsheetid>5600ecd3-0257-4793-8c9b-58fba32fc417</subsheetid><loginhibit /><display x="15" y="90" /><outputs><output type="text" name="word" stage="word" /></outputs></stage><stage stageid="064c4864-b486-4ff5-9e44-0d9b0e11c003" name="Export Word" type="Code"><subsheetid>5600ecd3-0257-4793-8c9b-58fba32fc417</subsheetid><loginhibit /><display x="15" y="-45" /><inputs><input type="number" name="handle" expr="[handle]" /></inputs><outputs><output type="text" name="word" stage="word" /></outputs><onsuccess>2ef37518-b895-403e-93ee-76b9743c2cf0</onsuccess><code><![CDATA[Dim identifier = Guid.NewGuid().ToString()
Dim globalDictionary As System.Collections.Generic.Dictionary(Of String, Object) = Nothing


For Each assembly In System.AppDomain.CurrentDomain.GetAssemblies()
	Try
		For Each type In assembly.GetTypes()
			If type.Name = "GlobalType" Then
				globalDictionary = CType(type.GetField("GlobalDictionary").GetValue(Nothing), System.Collections.Generic.Dictionary(Of String, Object))
			End If
		Next

	Catch e As Exception
	End Try
		
Next

globalDictionary(identifier) = GetInstance(handle)

word = identifier]]></code></stage><stage stageid="e0455cc2-086d-4c6a-a222-6d18dbe4d221" name="handle" type="Data"><subsheetid>5600ecd3-0257-4793-8c9b-58fba32fc417</subsheetid><display x="90" y="-105" /><datatype>number</datatype><initialvalue /><private /><alwaysinit /></stage><stage stageid="dfd065c2-23f1-4821-a3a9-109a62bbac87" name="word" type="Data"><subsheetid>5600ecd3-0257-4793-8c9b-58fba32fc417</subsheetid><display x="90" y="-45" /><datatype>text</datatype><initialvalue /><private /><alwaysinit /></stage></process>

Let’s see the code in detail:

Dim identifier = Guid.NewGuid().ToString()
Dim globalDictionary As System.Collections.Generic.Dictionary(Of String, Object) = Nothing


For Each assembly In System.AppDomain.CurrentDomain.GetAssemblies()
	Try
		For Each type In assembly.GetTypes()
			If type.Name = "GlobalType" Then
				globalDictionary = CType(type.GetField("GlobalDictionary").GetValue(Nothing), System.Collections.Generic.Dictionary(Of String, Object))
			End If
		Next

	Catch e As Exception
	End Try
		
Next

globalDictionary(identifier) = GetInstance(handle)

word = identifier

We list all the assemblies, then we find all the types, then we find the type with the global dictionary. Next, we get the dictionary and put the COM object in it. Finally, we return the identifier.

With the regular actions on the MS Word Object, we need to identify the Word instance by using an integer handle. For our custom actions, we will use a string identifier that serves the same purpose. Just like the regular actions accept handle and other parameters, we’ll accept the same with only a different identifier type.

Lastly, we create a new action in our custom object. Let’s say that we would like to show the instance of the Word application. This is the action:

<process name="__selection__CustomWordObject" type="object" runmode="Exclusive"><stage stageid="026cdc08-34e3-4474-a338-8b3b43d93076" name="Start" type="Start"><subsheetid>c76f8dc4-c63e-4820-b619-475abe9c3adc</subsheetid><loginhibit /><display x="15" y="-105" /><inputs><input type="text" name="identifier" stage="identifier" /></inputs><onsuccess>659b998f-463b-401a-b72f-e7f809dc5151</onsuccess></stage><stage stageid="5a5fb9ef-562a-457d-945f-432e8ac1610c" name="End" type="End"><subsheetid>c76f8dc4-c63e-4820-b619-475abe9c3adc</subsheetid><loginhibit /><display x="15" y="90" /></stage><stage stageid="25067651-11ab-463e-9610-9e3ae43a732b" name="identifier" type="Data"><subsheetid>c76f8dc4-c63e-4820-b619-475abe9c3adc</subsheetid><display x="90" y="-105" /><datatype>text</datatype><initialvalue /><private /><alwaysinit /></stage><stage stageid="659b998f-463b-401a-b72f-e7f809dc5151" name="Show Word" type="Code"><subsheetid>c76f8dc4-c63e-4820-b619-475abe9c3adc</subsheetid><loginhibit /><display x="15" y="-45" /><inputs><input type="text" name="identifier" expr="[identifier]" /></inputs><onsuccess>5a5fb9ef-562a-457d-945f-432e8ac1610c</onsuccess><code><![CDATA[System.Collections.Generic.Dictionary<string, object> globalDictionary = null;
		foreach(var assembly in System.AppDomain.CurrentDomain.GetAssemblies()){
			try{
				foreach(var type in assembly.GetTypes()){
					if(type.Name == "GlobalType"){
						globalDictionary = (System.Collections.Generic.Dictionary<string, object>)type.GetField("GlobalDictionary").GetValue(null);
					}
				}
			}catch(Exception e){
			}
		}
((Microsoft.Office.Interop.Word.ApplicationClass)globalDictionary[identifier]).Visible = true;]]></code></stage></process>

And here is the code specifically:

System.Collections.Generic.Dictionary<string, object> globalDictionary = null;
		foreach(var assembly in System.AppDomain.CurrentDomain.GetAssemblies()){
			try{
				foreach(var type in assembly.GetTypes()){
					if(type.Name == "GlobalType"){
						globalDictionary = (System.Collections.Generic.Dictionary<string, object>)type.GetField("GlobalDictionary").GetValue(null);
					ac	}
				}
			}catch(Exception e){
			}
		}
((Microsoft.Office.Interop.Word.ApplicationClass)globalDictionary[identifier]).Visible = true;

You can see that it’s the same code for obtaining the dictionary as before. We list all the types, we then find the dictionary and extract the COM instance based on the identifier obtained when exporting the instance. We then cast the object to the known interface and then simply change the property to show the Word application.

This is how we would use it in the workflow:

<process name="__selection__Test - MS Word"><stage stageid="991ba627-4d4e-4035-b5bb-a91943c289b0" name="Start" type="Start"><display x="15" y="-150" /><onsuccess>24a17572-5a8c-4ad1-8119-36c634d3cc75</onsuccess></stage><stage stageid="480cfc9d-c4c1-4556-958f-17c471c330ef" name="End" type="End"><display x="15" y="240" /></stage><stage stageid="24a17572-5a8c-4ad1-8119-36c634d3cc75" name="MS Word::Create Instance" type="Action"><loginhibit onsuccess="true" /><display x="15" y="-60" w="120" h="30" /><outputs><output type="number" name="handle" friendlyname="handle" stage="handle" /></outputs><onsuccess>54e185fe-aa0e-4796-b4d9-ab85cf14aee2</onsuccess><resource object="MS Word" action="Create Instance" /></stage><stage stageid="25f56258-8ced-4ea2-8ade-b4bb515e1cd1" name="handle" type="Data"><display x="210" y="-60" /><datatype>number</datatype><initialvalue /><private /><alwaysinit /></stage><stage stageid="54e185fe-aa0e-4796-b4d9-ab85cf14aee2" name="MS Word::Open" type="Action"><loginhibit onsuccess="true" /><display x="15" y="0" w="90" h="30" /><inputs><input type="number" name="handle" friendlyname="handle" expr="" /><input type="text" name="File Name" friendlyname="File Name" expr="&quot;C:\Users\user\Documents\Word.docx&quot;" /></inputs><outputs><output type="text" name="Document Name" friendlyname="Document Name" stage="" /></outputs><onsuccess>8cfb4920-496f-4db5-8098-ee503aec2b89</onsuccess><resource object="MS Word" action="Open" /></stage><stage stageid="8cfb4920-496f-4db5-8098-ee503aec2b89" name="CustomWordObject:CreateGlobalDictionary" type="Action"><loginhibit onsuccess="true" /><display x="15" y="60" w="180" h="30" /><onsuccess>6c70b7da-9c20-461e-97e3-99b3617d2109</onsuccess><resource object="CustomWordObject" action="CreateGlobalDictionary" /></stage><stage stageid="6c70b7da-9c20-461e-97e3-99b3617d2109" name="MS Word::ExportWord" type="Action"><loginhibit onsuccess="true" /><display x="15" y="120" w="90" h="30" /><inputs><input type="number" name="handle" friendlyname="handle" expr="" /></inputs><outputs><output type="text" name="word" friendlyname="word" stage="word" /></outputs><onsuccess>16f8e48f-b508-4bad-b79c-7e38fa5f7cf8</onsuccess><resource object="MS Word" action="ExportWord" /></stage><stage stageid="61ba0748-5329-421d-9180-a260d53e2aee" name="word" type="Data"><display x="210" y="120" /><datatype>text</datatype><initialvalue /><private /><alwaysinit /></stage><stage stageid="16f8e48f-b508-4bad-b79c-7e38fa5f7cf8" name="CustomWordObject::ShowWord" type="Action"><loginhibit onsuccess="true" /><display x="15" y="180" w="150" h="30" /><inputs><input type="text" name="identifier" friendlyname="identifier" expr="[word]" /></inputs><onsuccess>480cfc9d-c4c1-4556-958f-17c471c330ef</onsuccess><resource object="CustomWordObject" action="ShowWord" /></stage></process>

You can see that we first open the Word instance and open the file using the regular Word object. We then call CreateGlobalDictionary and ExportWord to export the instance. Finally, we call ShowWord from our custom object. This way, you can add any action in your business object and not touch the Blue Prism’s stock object anymore.

Further improvements

It’s worth noting that this solution provides a truly global dictionary for sharing any data between any objects. Nothing stops us from sharing other things. Just don’t abuse the mechanism.

It’s also worth noticing that there is a big code duplication. We could put it in a NuGet package to reuse the code easily.

]]>
https://blog.adamfurmanek.pl/2024/09/09/rpa-part-1/feed/ 0
Pitless Pit Part 1 — Furmanek Test for consciousness https://blog.adamfurmanek.pl/2024/09/08/pitless-pit-part-1/ https://blog.adamfurmanek.pl/2024/09/08/pitless-pit-part-1/#respond Sun, 08 Sep 2024 20:45:11 +0000 https://blog.adamfurmanek.pl/?p=5081 Continue reading Pitless Pit Part 1 — Furmanek Test for consciousness]]> Today I’m going to propose a framework for determining whether a being is conscious. I call it the Furmanek Test (just like the famous Turing Test). The test is based on identifying paradoxes among problems that can be proved unsolvable.

General Idea

According to Wikipedia, “a paradox is a logically self-contradictory statement or a statement that runs contrary to one’s expectation. It is a statement that, despite apparently valid reasoning from true or apparently true premises, leads to a seemingly self-contradictory or a logically unacceptable conclusion.” The crucial part I consider here is that the statement is contrary to one’s expectations.

I propose that we use “having expectations” as a clue whether something is conscious or not. There are many reasons why it’s useful:

  • A problem may have an unexpected result that we don’t consider a paradox. This might be because we didn’t have expectations (in that case the solution is “surprising” rather than “unexpected”) or our expectations were not reasonable (we made mistakes or we didn’t have enough knowledge of the domain to formulate viable expectations)
  • Expectations can be explained and justified to some extent. We can explain why we expect something to happen by providing examples or analogies even though we may not be able to define the reasons for having expectations formally. Just like with feelings, we can describe them even though we may not be able to define them scientifically
  • Paradoxes are not trivial. It takes some maturity and knowledge to really understand paradoxes and why they are “unexpected”
  • It’s typically easy to solve the problem (or show that there is no solution). The challenge is not in solving the problem. It’s rather in explaining why the solution is strange/surprising/unexpected and how to actually solve the paradox (which typically takes formalizing a domain model or changing assumptions)

That being said, we can use paradoxes to analyze the reasoning. We can’t easily formalize what constitutes a paradox, but we can often tell whether something is a paradox or not.

Furmanek Test for consciousness

The test goes as follows:

  1. We define a set of simple problems. Some of them are unsolvable, some of them can be easily shown that there are no solutions, and some of them are paradoxes
  2. We ask the being to identify which problems are paradoxes
  3. We assess the answers. If the being identifies all paradoxes and doesn’t have any false positives, then we conclude it’s conscious

Simple as that. Let’s now discuss some important remarks:

  • We should have many problems (for instance a hundred), 10% of which are paradoxes. This is to make it very unlikely that the being identifies the paradoxes by pure chance
  • Obviously, the problems should be novel ones so that the being doesn’t know them from the Internet (which is especially important if we’re talking about an AI-based being that is trained on the publicly available materials)
  • Paradoxes should be widely accepted as such. Not many question that the Barber Paradox is a paradox while quite a few suggest that Wild Card Poker paradox is not a paradox at all

Obviously, just like with an IQ test, we have a general idea how to perform the test, but we need to make sure that the actual implementation of the test is of a high quality.

Discussion

There are many reasons why this test may not work well. I consider only some of the arguments here.

Consciousness versus human-like consciousness

The first thing to consider is whether this test identifies consciousness in general or some human-specific consciousness. It may be quite reasonable to assume that while we perceive some problems as paradoxes, other beings may not agree. The concept of paradox is based on “expectations”. If some other beings have different expectations, then the paradox isn’t a paradox anymore.

This can be obviously generalized to cultural-dependent consciousness. We may have different expectations based on our culture, language, senses, education, age, or even if we write left-to-right. The actual test should be as resilient to these aspects as possible.

Levels of consciousness

Second, it’s questionable whether humans’ consciousness is a subset of mammals’ consciousness is a subset of animals’ consciousness is a subset of Earth-beings’ consciousness. This test may be very narrow in terms of that it actually determines the human-like consciousness instead of the general one.

Formulation of the problems

We need to formulate the problems in a way that they don’t suggest assumptions or rely on some specific reasoning. For instance, with the barber paradox, we explicitly ask if “the barber shaves himself”. We don’t ask something based on maths (like “is there a set of people that shave themselves”) or require specific formal formulation. At the same time, the problems must be defined using some language that shouldn’t impact the answer. Generally, we need to solve very similar issues as with the IQ tests.

Asking to identify paradoxes or not

In the test formulation, I specify that we ask the being to identify the paradoxes. However, we may not be that specific. We may ask the being to analyze the problems and share some thoughts. If they do realize there are paradoxes, then we may differentiate between the “levels of consciousness”.

]]>
https://blog.adamfurmanek.pl/2024/09/08/pitless-pit-part-1/feed/ 0
Bit Twiddling Part 6 — Stop RDP from detaching GUI when the client disconnects https://blog.adamfurmanek.pl/2024/06/05/bit-twiddling-part-6/ https://blog.adamfurmanek.pl/2024/06/05/bit-twiddling-part-6/#respond Wed, 05 Jun 2024 11:06:31 +0000 https://blog.adamfurmanek.pl/?p=5057 Continue reading Bit Twiddling Part 6 — Stop RDP from detaching GUI when the client disconnects]]>

This is the sixth part of the Bit Twiddling series. For your convenience you can find other parts in the table of contents in Par 1 — Modifying Android application on a binary level

Today we’re going to solve the problem when the remote server disconnects GUI when we lock the workstation or disconnect.

Let’s start by explaining the problem a little bit more. Imagine that you connect like this:

Local ---> Remote

If you now lock the Local workstation (with WIN+L) or disconnect, then the GUI on Remote will break. The user on Remote will not be logged out, the applications will continue to work, but the UI will not be there. This effectively breaks things like VNC or apps that take screenshots or click on the screen.

Similarly, the same happens when you connection with a jump host:

Local ---> Windows Server 2019 ---> Remote

If you now disconnect Local or lock Local workstation, then the GUI on Remote will break. However, it’s different if you use Windows Server 2016:

Local ---> Windows Server 2016 ---> Remote

If you now disconnect Local or lock the workstation, then the GUI on Remote will still be there. This suggests that something has changed in Windows Server 2019. This is not surprising as this is the time when Microsoft implemented RemoteFX heavily into their RDP to support cameras, USB redirection, improve security and much more.

At this point we have one solution – just use the Windows Server 2016 jump host and it fixes the issue. However, this way you won’t use the camera (it doesn’t work in 2016 typically) and you need to keep another machine along the way (this could be a virtual machine on your Remote, though). Let’s see if we can do better.

Fixing the workstation lock

The first thing to notice is that if you suspend the mstsc.exe on Local (with Task Manager or debugger) and then lock the workstation, then Remote doesn’t lose the UI. This suggests that disconnecting the GUI is an explicit action of the RDP implementation. However, we can easily fix that.

mstsc.exe registers for session notifications with WTSRegisterSessionNotification. We can see that with ApiMonitor:

We can simply hook this method to not register for any notifications. We can use that with the following WinDBG script:

.sympath srv*C:\tmp*http://msdl.microsoft.com/download/symbols
.reload
bu	WTSAPI32!WTSRegisterSessionNotification	""e @rip 0x48 0xc7 0xc0 0x01 0x00 0x00 0x00 0xC3; qd""
g

The shell code we put here is simply:

move rax, 1
ret

You can run mstsc.exe, attach the debugger, add the breakpoint, resume the application, and then connect to the server. The debugger will detach shortly after you open the connection and you can check that now you can lock the workstation and the GUI will still be there.

Fixing the disconnection

Fixing the case on disconnect is slightly harder. Once again, when we pause the mstsc.exe, then the remote session still works even if we keep mstsc.exe paused for a long time. This suggests that the session is up as long as the connection is alive. Even though we paused the mstsc.exe application, the operating system keeps the connection up on the TCP level. We can exhibit that.

The idea is to run a proxy on the Remote. We’ll then connect like this:

Local ---> Proxy ---> Remote

The proxy must just route the connection to the RDP server (on the port 3389),but do not lose the connection to Remote when the connection from Local disappears. This way we can simply kill the mstsc.exe on Local, and then the connection will still be alive and the GUI will persist. However, if you close mstsc.exe gracefully, then the client will terminate the GUI on the remote server. Therefore, just kill the client. You can also put your machine to sleep and the UI should still work.

]]>
https://blog.adamfurmanek.pl/2024/06/05/bit-twiddling-part-6/feed/ 0
Bit Twiddling Part 5 — Fixing audio latency in mstsc.exe (RDP) https://blog.adamfurmanek.pl/2024/05/30/bit-twiddling-part-5/ https://blog.adamfurmanek.pl/2024/05/30/bit-twiddling-part-5/#respond Thu, 30 May 2024 10:34:31 +0000 https://blog.adamfurmanek.pl/?p=5036 Continue reading Bit Twiddling Part 5 — Fixing audio latency in mstsc.exe (RDP)]]>

This is the fifth part of the Bit Twiddling series. For your convenience you can find other parts in the table of contents in Par 1 — Modifying Android application on a binary level

Today we’re going to fix the audio latency in mstsc.exe. Something that people really ask about on the Internet and there is no definite solution. I’ll show how to hack mstsc.exe to fix the latency. First, I’ll explain why existing solutions do not work and what needs to be done, and at the end of this post I provide an automated PowerShell script that does the magic.

What is the issue

First, a disclaimer. I’ll describe what I suspect is happening. I’m not an expert of the RDP and I didn’t see the source code of mstsc.exe. I’m just guessing what happens based on what I see.

RDP supports multiple channels. There is a separate channel for video and another one for audio. Unfortunately, these two channels are not synchronized which means that they are synchronized on the client on a best effort basis (or rather sheer luck). To understand why it’s hard to synchronize two independent streams, we need to understand how much different they are.

Video is something that doesn’t need to be presented “for some time”. We can simply take the last video frame and show it to the user. If frames are buffered, we just process them as fast as possible to present the last state. You can see that mstsc.exe does exactly that by connecting to a remote host, playing some video, and then suspending the mstsc.exe process with ProcessExplorer. When you resume it after few seconds, you’ll see the video moving much faster until the client catches up with the latest state.

When it comes to audio, things are much different. You can’t just jump to the latest audio because you’d loose the content as it would be unintelligible. Each audio piece has its desired length. When you get delayed, you could just skip the packets (and lose some audio), play it faster (which would make it less intelligible for some time), or just play it as it goes. However, to decide what to do, you would need to understand whether the piece you want to play is delayed or not. You can’t tell that without timestamps or time markers, and I believe RDP doesn’t send those.

As long as you’re getting audio packets “on time”, there is no issue. You just play them. The first part is if you get them “in time”. This depends on your network quality and on the server that sends you the sound. From my experience, Windows Server is better when it comes to speed of sending updates. I can see my mouse moving faster and audio delayed less often when I connect to the Windows Server than Windows 10 (or other client edition). Therefore, use Windows Server where possible. Just keep in mind that you’ll need CAL licenses to send microphone and camera to the server which is a bummer (client edition lets you do that for free). Making the packets to be sent as fast as possible is the problem number one.

However, there is another part to that. If your client gets delayed for whatever reason (CPU spike, overload, or preemption), your sound will effectively slow down. You will just play it “later” even though you received it “on time”. Unfortunately, RDP cannot detect whether this happened because there are no timestamps in the stream. As far as I can tell, this is the true reason why your sound is often delayed. You can get “no latency audio” over the Internet and I had it many times. However, the longer you run the client, the higher the chance that you’ll go out of sync. This is the problem number two.

Therefore, we need to “resync” the client. Let’s see how to do it.

Why existing solutions don’t work and what fix we need

First, let me explain why existing solutions won’t work. Many articles on the Internet tell you to change the audio quality to High in Group Policy and enforce the quality in your rdp.file by setting audioqualitymode:i:2. This fixes the problem number one (although I don’t see much difference to be honest), but it doesn’t address the problem number two.

Some other articles suggest other fixes on the remote side. All these fixes have one thing in common – they don’t fix the client. If the mstsc.exe client cannot catch up when it gets delayed, then the only thing you can do is to reset the audio stream. Actually, this is how you can fix the delay easily – just restart the audio service:

net stop audiosrv & timeout 3 & net start audiosrv

I add the timeout to give some time to clear the buffer on the client. Once the buffer is empty, we restart the service and then the audio should be in sync. Try this to verify if it’s physically possible to deliver the audio “on time” in your networking conditions.

Unfortunately, restarting the audio service have many issues. First, it resets the devices on the remote end, so your audio streams may break and you’ll need to restart them. Second, this simply takes time. You probably don’t want to lose a couple of seconds of audio and microphone when you’re presenting (and unfortunately, you’ll get delayed exactly during that time).

What we need is to fix the client to catch up when there is a delay. However, how can we do that when we don’t have any timestamps? Well, the solution is simple – just drop some audio packages (like 10%) periodically to sync over time. This will decrease the audio quality to some extent and won’t fix the delay immediately, but after few seconds we’ll get back on track. Obviously, you could implement some better heuristics and solutions. There is one problem, though – we need to do that in the mstsc.exe itself. And here comes the low level magic. Let’s implement the solution that drops the audio frames to effectively “resync” the audio.

How to identify

We need to figure out how the sound is played. Let’s take ApiMonitor to trace the application. Luckily enough, it seems that the waveOutPrepareHeader is used, as we can see in the screenshot:

Let’s break there win WinDBG:

bu 0x00007ffb897d3019

kb

 # RetAddr               : Args to Child                                                           : Call Site														
00 00007ffb`897d1064     : 00000252`6d68beb8 00000252`6d68beb8 00000000`000014ac 00000252`6d68be00 : mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+0x6d														
01 00007ffb`897e7ee8     : 00000000`00000000 00000073`bfc7fcb9 00000252`6d68be00 00000252`7e69df80 : mstscax!CRdpWinAudioWaveoutPlayback::vcwaveWritePCM+0xec														
02 00007ffb`898e73bf     : 00000000`00000001 00000000`00000003 00000073`bfc7d055 00000073`00001000 : mstscax!CRdpWinAudioWaveoutPlayback::RenderThreadProc+0x2c8														
03 00007ffc`02e57344     : 00000000`000000ac 00000252`6d68be00 00000000`00000000 00000000`00000000 : mstscax!CRdpWinAudioWaveoutPlayback::STATIC_ThreadProc+0xdf														
04 00007ffc`046826b1     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : KERNEL32!BaseThreadInitThunk+0x14														
05 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21

We can see a method named mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite. Let’s see it:

u mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+0x6d		
									
mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite:											
00007ffb`897d2fac 48895c2410      mov     qword ptr [rsp+10h],rbx											
00007ffb`897d2fb1 55              push    rbp											
00007ffb`897d2fb2 56              push    rsi											
00007ffb`897d2fb3 57              push    rdi											
00007ffb`897d2fb4 4883ec40        sub     rsp,40h											
00007ffb`897d2fb8 488bf2          mov     rsi,rdx											
00007ffb`897d2fbb 488bd9          mov     rbx,rcx											
00007ffb`897d2fbe 488b0543287500  mov     rax,qword ptr [mstscax!WPP_GLOBAL_Control (00007ffb`89f25808)]											
00007ffb`897d2fc5 488d2d3c287500  lea     rbp,[mstscax!WPP_GLOBAL_Control (00007ffb`89f25808)]											
00007ffb`897d2fcc 483bc5          cmp     rax,rbp											
00007ffb`897d2fcf 740a            je      mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+0x2f (00007ffb`897d2fdb)											
00007ffb`897d2fd1 f6401c01        test    byte ptr [rax+1Ch],1											
00007ffb`897d2fd5 0f85e8000000    jne     mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+0x117 (00007ffb`897d30c3)											
00007ffb`897d2fdb 83bbc000000000  cmp     dword ptr [rbx+0C0h],0											
00007ffb`897d2fe2 7418            je      mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+0x50 (00007ffb`897d2ffc)											
00007ffb`897d2fe4 488b8bb8000000  mov     rcx,qword ptr [rbx+0B8h]											
00007ffb`897d2feb 4885c9          test    rcx,rcx											
00007ffb`897d2fee 740c            je      mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+0x50 (00007ffb`897d2ffc)											
00007ffb`897d2ff0 48ff15a9295c00  call    qword ptr [mstscax!_imp_EnterCriticalSection (00007ffb`89d959a0)]											
00007ffb`897d2ff7 0f1f440000      nop     dword ptr [rax+rax]											
00007ffb`897d2ffc 488b4b70        mov     rcx,qword ptr [rbx+70h]											
00007ffb`897d3000 4885c9          test    rcx,rcx											
00007ffb`897d3003 0f84f2000000    je      mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+0x14f (00007ffb`897d30fb)											
00007ffb`897d3009 41b830000000    mov     r8d,30h											
00007ffb`897d300f 488bd6          mov     rdx,rsi											
00007ffb`897d3012 48ff15e7a87900  call    qword ptr [mstscax!_imp_waveOutPrepareHeader (00007ffb`89f6d900)]											
00007ffb`897d3019 0f1f440000      nop     dword ptr [rax+rax]

Okay, we can see that this method passes the audio packet to the API. When we look at WAVEHDR structure, we can see that it has the following fields:

typedef struct wavehdr_tag {
  LPSTR              lpData;
  DWORD              dwBufferLength;
  DWORD              dwBytesRecorded;
  DWORD_PTR          dwUser;
  DWORD              dwFlags;
  DWORD              dwLoops;
  struct wavehdr_tag  *lpNext;
  DWORD_PTR          reserved;
} WAVEHDR, *LPWAVEHDR;

This is exactly what we see in ApiMonitor. Seems like the dwBufferLength is what we might want to change. When we shorten this buffer, we’ll effectively make the audio last shorter. We can do that for some of the packets to not break the quality much, and then all should be good.

We can verify that this works with this breakpoint:

bp 00007ffb`897d3012 "r @$t0 = poi(rdx + 8); r @$t1 = @$t0 / 2; ed rdx+8 @$t1; g"

Unfortunately, this makes the client terribly slow. We need to patch the code in place. Effectively, we need to inject a shellcode.

First, we need to allocate some meory with VirtualAllocEx via .dvalloc.

.dvalloc 1024

The debugger allocates the memory. In my case the address is 25fb8960000.

The address of the WinAPI function is in the memory, so we need to remember to extract the pointer from the address:

00007ffb`897d3012 48ff15e7a87900  call    qword ptr [mstscax!_imp_waveOutPrepareHeader (00007ffb`89f6d900)]

Now we need to do two things: first, we need to patch the call site to call our shellcode instead of [mstscax!_imp_waveOutPrepareHeader (00007ffb89f6d900)]. Second, we need to construct a shell code that fixes the audio packet for some of the packets, and then calls [mstscax!_imp_waveOutPrepareHeader (00007ffb89f6d900)] correctly.

To do the first thing, we can do the absolute jump trick. We put the address in the rax register, push it on the stack, and then return. This is the code:

mov rax, 0x25fb8960000	;Move the address to the register
push rax		;Push the address on the stack
ret			;Return. This takes the address from the stuck and jumps
nop			;Nops are just to not break the following instructions when you disassemble with u
nop
nop
nop

We can compile the code with Online assembler and we should get a shell code. We can then put it in place with this line:

e	mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+5d		0x48 0xB8 0x00 0x00 0x96 0xb8 0x5f 0x02 0x00 0x00 0x50 0xC3 0x90 0x90 0x90 0x90

Unfortunately, this patch is long. We need to break few lines and then restore them in the shellcode. So our shell code starts with the lines that we broke:

mov r8d, 0x30	;Preserved code
mov rdx,rsi	;Preserved code

Next, we need to preserve our working registers:

push rbx
push rdx

Okay, now we can do the logic. We want to modify the buffer length. However, we don’t want to do it for all of the packets. We need some source of randomness, like time, random values, or something else. Fortunately, the WAVEHDR structure has the field dwUser which may be “random enough” for our needs. Let’s take that value modulo some constant, and then change the packet length only for some cases.

First, let’s preserve the buffer length for the sake of what we do later:

mov rax, [rdx + 8]	;Load buffer length (that's the second field of the structure)
push rax		;Store buffer length on the stack

Now, let’s load dwUser and divide it by some constant like 23:

mov rax, [rdx + 16]	;Load dwUser which is the fourth field
mov rbx, 23		;Move the constant to the register
xor rdx, rdx		;Clear upper divisor part
div rbx			;Divide

Now, we can restore the buffer length to rax:

pop rax	;Restore buffer length

At this point we have rax with the buffer length, and rdx with the remainder. We can now compare the reminder and skip the code modifying the pucket length if needed:

cmp rdx, 20	;Compare with 20	
jbe 0x17	;Skip the branch

We can see that we avoid the buffer length modification if the remainder is at most 20. Effectively, we have 20/22 = 0.909% chance that we won’t modify the package. This means that we modify something like 9% of the packages, assuming the dwUser has a good distribution. The code is written in this way so you can tune the odds of changing the packet.

Now, let’s modify the package. We want to divide the buffer length by 2, however, we want to keep it generic to be able to experiment with other values:

mov rbx, 1	;Move 1 to rbx to multiply by 1/2
xor rdx, rdx	;Clear remainder
mul rbx		;Multiply
mov rbx, 2	;Store 2 to rbx to multiply by 1/2
xor rdx, rdx	;Clear remainder
div rbx		;Divide

You can play with other values, obviously. From my experiments, halving the value works the best.

Now it’s rather easy. rax has the new buffer length or the original one if we decided not to modify it. Let’s restore other registers:

pop rdx
pop rbx

Let’s update the buffer length:

mov [rdx + 8], rax

Now, we need to prepare the jump addresses. First, we want to put the original return address of the method mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite which is mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+0x6d:

mov rax, 0x00007ffb897d3019
push rax

Now, we can jump to the WinAPI method:

push rbx
mov rbx, 0x00007ffbe6c6a860
mov rax, [rbx]
pop rbx
push rax
ret

That’s it. The final shellcode looks like this:

mov r8d, 0x30
mov rdx,rsi
push rbx
push rdx
mov rax, [rdx + 8]
push rax
mov rax, [rdx + 16]
mov rbx, 23
xor rdx, rdx
div rbx
pop rax
cmp rdx, 20
jbe 0x17
mov rbx, 1
xor rdx, rdx
mul rbx
mov rbx, 2
xor rdx, rdx
div rbx
pop rdx
pop rbx
mov [rdx + 8], rax
mov rax, 0x00007ffb897d3019
push rax
push rbx
mov rbx, 0x00007ffbe6c6a860
mov rax, [rbx]
pop rbx
push rax
ret

We can implant it with this:

e	25f`b8960000		0x41 0xB8 0x30 0x00 0x00 0x00 0x48 0x89 0xF2 0x53 0x52 0x48 0x8B 0x42 0x08 0x50 0x48 0x8B 0x42 0x10 0x48 0xC7 0xc3 0x17 0x00 0x00 0x00 0x48 0x31 0xD2 0x48 0xF7 0xF3 0x58 0x48 0x83 0xFA 0x14 0x76 0x1A 0x48 0xC7 0xC3 0x01 0x00 0x00 0x00 0x48 0x31 0xD2 0x48 0xF7 0xE3 0x48 0xC7 0xC3 0x02 0x00 0x00 0x00 0x48 0x31 0xD2 0x48 0xF7 0xF3 0x5A 0x5B 0x48 0x89 0x42 0x08 0x48 0xB8 0x19 0x30 0x7D 0x89 0xFB 0x7F 0x00 0x00 0x50 0x48 0xBB 0x60 0xA8 0xC6 0xE6 0xFB 0x7F 0x00 0x00 0x50 0xC3

Automated fix

We can now fix the code automatically. We need to do the following:

  • Start mstsc.exe
  • Find it’s process ID
  • Attach the debugger and find all the addresses: free memory, mstsc.exe method, WinAPI method
  • Construct the shellcode
  • Attach the debugger and patch the code
  • Detach the debugger

We can do all of that with PowerShell. Here is the code:

Function Run-Mstsc($rdpPath, $cdbPath, $numerator, $denominator){
	$id = get-random
	$code = @"
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Threading;
	
namespace MstscPatcher
{
	public class Env$id {
		public static void Start() {
			Process[] processes = Process.GetProcessesByName("mstsc");
			RunProcess("mstsc.exe", "$rdpPath");
			Thread.Sleep(3000);
			Process[] processes2 = Process.GetProcessesByName("mstsc");
			var idToPatch = processes2.Select(p => p.Id).OrderBy(i => i).Except(processes.Select(p => p.Id).OrderBy(i => i)).First();
			Patch(idToPatch);
		}
		
		public static void Patch(int id){
			Console.WriteLine(id);
			var addresses = RunCbd(id, @"
.sympath srv*C:\tmp*http://msdl.microsoft.com/download/symbols
.reload
!address
u mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+100
.dvalloc 1024
qd
			");
			
			string freeMemoryAddress = addresses.Where(o => o.Contains("Allocated 2000 bytes starting at")).First().Split(' ').Last().Trim();
			Console.WriteLine("Free memory: " + freeMemoryAddress);
			
			var patchAddress = addresses.SkipWhile(o => !o.StartsWith("mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite:"))
				.SkipWhile(o => !o.Contains("r8d,30h"))
				.First().Split(' ').First().Trim();
			Console.WriteLine("Patch address: " + patchAddress);
			var returnAddress = (Convert.ToUInt64(patchAddress.Replace(((char)96).ToString(),""), 16) + 0x10).ToString("X").Replace("0x", "");
			Console.WriteLine("Return address: " + returnAddress);
			
			var winApiAddress = addresses.SkipWhile(o => !o.Contains("[mstscax!_imp_waveOutPrepareHeader"))
				.First().Split('(')[1].Split(')')[0].Trim();
			Console.WriteLine("WinAPI address: " + winApiAddress);
			
			Func<string, IEnumerable<string>> splitInPairs = address => address.Where((c, i) => i % 2 == 0).Zip(address.Where((c, i) => i % 2 == 1), (first, second) => first.ToString() + second.ToString());			
			Func<string, string> translateToBytes = address => string.Join(" ", splitInPairs(address.Replace(((char)96).ToString(), "").PadLeft(16, '0')).Reverse().Select(p => "0x" + p));
						
			var finalScript = @"
.sympath srv*C:\tmp*http://msdl.microsoft.com/download/symbols
.reload
e	" + patchAddress + @"	0x48 0xB8 " + translateToBytes(freeMemoryAddress) + @" 0x50 0xC3 0x90 0x90 0x90 0x90	
e	" + freeMemoryAddress + @"	0x41 0xB8 0x30 0x00 0x00 0x00 0x48 0x89 0xF2 0x53 0x52 0x48 0x8B 0x42 0x08 0x50 0x48 0x8B 0x42 0x10 0x48 0xC7 0xc3 $denominator 0x00 0x00 0x00 0x48 0x31 0xD2 0x48 0xF7 0xF3 0x58 0x48 0x83 0xFA $numerator 0x76 0x1A 0x48 0xC7 0xC3 0x01 0x00 0x00 0x00 0x48 0x31 0xD2 0x48 0xF7 0xE3 0x48 0xC7 0xC3 0x02 0x00 0x00 0x00 0x48 0x31 0xD2 0x48 0xF7 0xF3 0x5A 0x5B 0x48 0x89 0x42 0x08 0x48 0xB8 " + translateToBytes(returnAddress) + @" 0x50 0x53 0x48 0xBB " + translateToBytes(winApiAddress) + @" 0x48 0x8B 0x03 0x5B 0x50 0xC3	
qd
			";
			Console.WriteLine(finalScript);
			RunCbd(id, finalScript);
		}
		
		public static string[] RunCbd(int id, string script) {
			Console.WriteLine(script);
			File.WriteAllText("mstsc.txt", script);
			string output = "";
			Process process = RunProcess("$cdbPath", "-p " + id + " -cf mstsc.txt", text => output += text + "\n");
			process.WaitForExit();
			File.Delete("mstsc.txt");
			
			return output.Split('\n');
		}
		
		public static Process RunProcess(string fileName, string arguments, Action<string> outputReader = null){
			ProcessStartInfo startInfo = new ProcessStartInfo
			{
				FileName = fileName,
				Arguments = arguments,
				UseShellExecute = outputReader == null,
				RedirectStandardOutput = outputReader != null,
				RedirectStandardError = outputReader != null
			};
			
			if(outputReader != null){
				var process = new Process{
					StartInfo = startInfo
				};
				process.OutputDataReceived += (sender, args) => outputReader(args.Data);
				process.ErrorDataReceived += (sender, args) => outputReader(args.Data);

				process.Start();
				process.BeginOutputReadLine();
				process.BeginErrorReadLine();
				return process;
			}else {
				return Process.Start(startInfo);
			}
		}
	}
}
"@.Replace('$id', $id)
	$assemblies = ("System.Core","System.Xml.Linq","System.Data","System.Xml", "System.Data.DataSetExtensions", "Microsoft.CSharp")
	Add-Type -referencedAssemblies $assemblies -TypeDefinition $code -Language CSharp
	iex "[MstscPatcher.Env$id]::Start()"
}


Run-Mstsc "my_rdp_settings.rdp".Replace("\", "\\") "cdb.exe".Replace("\", "\\") "0x14" "0x17"

We compile some C# code on the fly. First, we find existing mstsc.exe instances (line 16), then run the new instance (line 17), wait a bit for the mstsc.exe to spawn a child process, and then find the id (lines 19-20). We can then patch the existing id.

First, we look for addresses. We do all the manual steps we did above to find the memory address, and two function addresses. The script is in lines 27-32. Notice that I load symbols as we need them and CDB may not have them configured on the box.

We can now parse the output. We first extract the allocated memory in lines 35-36.

Next, we look for the call site. We dump the whole method, and then find the first occurrence of mov 8d,30h. That’s our call site. This is in lines 38-41.

Next, we calculate the return address which is 16 bytes further. This is in lines 42-43.

Finally, I calculate the WinAPI method address. I extract the location of the pointer for the method (lines 45-47).

Next, we need to construct the shell code. This is exactly what we did above. We just need to format addresses properly (this is in helper methods in lines 49-50), and then build the script (lines 52-558). We can run it and that’s it. The last thing is customization of the values. You can see in line 108 that I made two parameters to change numerator and denominator for the odds of modifying the package. This way you can easily change how many packets are broken. The more you break, the faster you resynchronize, however, the worse the sound is.

That’s it. I verified that on Windows 10 22H2 x64, Windows 11 22H2 x64, Windows Server 2016 1607 x64, and Windows Server 2019 1809 x64, and it worked well. Your mileage may vary, however, the approach should work anywhere. Generally, to make this script work somewhere else, you just need to adjust how we find the call site, the return address, and the address of the WinAPI function. Assuming that the WinAPI is still called via the pointer stored in memory, then you won’t need to touch the machine code payload.

Below is the script for x86 bit (worked on Windows 10 10240 x86). Main differences are in how we access the data structure as the pointer is on the stack (and not in the register).

Function Run-Mstsc($rdpPath, $cdbPath, $numerator, $denominator){
	$id = get-random
	$code = @"
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Threading;
	
namespace MstscPatcher
{
	public class Env$id {
		public static void Start() {
			Process[] processes = Process.GetProcessesByName("mstsc");
			RunProcess("mstsc.exe", "$rdpPath");
			Thread.Sleep(3000);
			Process[] processes2 = Process.GetProcessesByName("mstsc");
			var idToPatch = processes2.Select(p => p.Id).OrderBy(i => i).Except(processes.Select(p => p.Id).OrderBy(i => i)).First();
			Patch(idToPatch);
		}
		
		public static void Patch(int id){
			Console.WriteLine(id);
			var addresses = RunCbd(id, @"
.sympath srv*C:\tmp*http://msdl.microsoft.com/download/symbols
.reload
!address
u mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite+100
.dvalloc 1024
qd
			");
			
			string freeMemoryAddress = addresses.Where(o => o.Contains("Allocated 2000 bytes starting at")).First().Split(' ').Last().Trim();
			Console.WriteLine("Free memory: " + freeMemoryAddress);
			
			var patchAddress = addresses.SkipWhile(o => !o.StartsWith("mstscax!CRdpWinAudioWaveoutPlayback::vcwaveOutWrite:"))
				.SkipWhile(o => !(o.Contains("dword ptr [ebx+3Ch]") && o.Contains("push")))
				.First().Split(' ').First().Trim();
			Console.WriteLine("Patch address: " + patchAddress);
			var returnAddress = (Convert.ToUInt64(patchAddress.Replace(((char)96).ToString(),""), 16) + 0x9).ToString("X").Replace("0x", "");
			Console.WriteLine("Return address: " + returnAddress);
			
			var winApiAddress = addresses.SkipWhile(o => !o.Contains("[mstscax!_imp__waveOutPrepareHeader"))
				.First().Split('(')[1].Split(')')[0].Trim();
			Console.WriteLine("WinAPI address: " + winApiAddress);
			
			Func<string, IEnumerable<string>> splitInPairs = address => address.Where((c, i) => i % 2 == 0).Zip(address.Where((c, i) => i % 2 == 1), (first, second) => first.ToString() + second.ToString());			
			Func<string, string> translateToBytes = address => string.Join(" ", splitInPairs(address.Replace(((char)96).ToString(), "").PadLeft(8, '0')).Reverse().Select(p => "0x" + p));
						
			var finalScript = @"
.sympath srv*C:\tmp*http://msdl.microsoft.com/download/symbols
.reload
e	" + patchAddress + @"	0xB8 " + translateToBytes(freeMemoryAddress) + @" 0x50 0xC3 0x90 0x90	
e	" + freeMemoryAddress + @"	0xFF 0x73 0x3C 0x53 0x52 0x8B 0x53 0x3C 0x52 0x8B 0x42 0x04 0x50 0x8B 0x42 0x0C 0xBB $denominator 0x00 0x00 0x00 0x31 0xD2 0xF7 0xF3 0x58 0x83 0xFA $numerator 0x76 0x12 0xBB 0x01 0x00 0x00 0x00 0x31 0xD2 0xF7 0xE3 0xBB 0x02 0x00 0x00 0x00 0x31 0xD2 0xF7 0xF3 0x5A 0x89 0x42 0x04 0x5A 0x5B 0xB8 " + translateToBytes(returnAddress) + @" 0x50 0x53 0xBB " + translateToBytes(winApiAddress) + @" 0x8B 0x03 0x5B 0x50 0xC3	
qd
			";
			Console.WriteLine(finalScript);
			RunCbd(id, finalScript);
		}
		
		public static string[] RunCbd(int id, string script) {
			Console.WriteLine(script);
			File.WriteAllText("mstsc.txt", script);
			string output = "";
			Process process = RunProcess("$cdbPath", "-p " + id + " -cf mstsc.txt", text => output += text + "\n");
			process.WaitForExit();
			File.Delete("mstsc.txt");
			
			return output.Split('\n');
		}
		
		public static Process RunProcess(string fileName, string arguments, Action<string> outputReader = null){
			ProcessStartInfo startInfo = new ProcessStartInfo
			{
				FileName = fileName,
				Arguments = arguments,
				UseShellExecute = outputReader == null,
				RedirectStandardOutput = outputReader != null,
				RedirectStandardError = outputReader != null
			};
			
			if(outputReader != null){
				var process = new Process{
					StartInfo = startInfo
				};
				process.OutputDataReceived += (sender, args) => outputReader(args.Data);
				process.ErrorDataReceived += (sender, args) => outputReader(args.Data);

				process.Start();
				process.BeginOutputReadLine();
				process.BeginErrorReadLine();
				return process;
			}else {
				return Process.Start(startInfo);
			}
		}
	}
}
"@.Replace('$id', $id)
	$assemblies = ("System.Core","System.Xml.Linq","System.Data","System.Xml", "System.Data.DataSetExtensions", "Microsoft.CSharp")
	Add-Type -referencedAssemblies $assemblies -TypeDefinition $code -Language CSharp
	iex "[MstscPatcher.Env$id]::Start()"
}

Run-Mstsc "my_rdp_settings.rdp".Replace("\", "\\") "cdb.exe".Replace("\", "\\") "0x14" "0x17"

Some keyword to make this easier to find on the Internet

Her some keywords to make this article easier to find on the Internet.

audio latency in mstsc.exe
audio latency in rdp
audio delay in mstsc.exe
audio delay in rdp
laggy sound in rdp
sound desynchronized in rdp
sound latency in rdp
slow audio
how to fix audio in rdp

Enjoy!

]]>
https://blog.adamfurmanek.pl/2024/05/30/bit-twiddling-part-5/feed/ 0
Serializing collections with Jackson in Scala and renaming the nested element https://blog.adamfurmanek.pl/2024/05/20/serializing-collections-with-jackson-in-scala-and-renaming-the-nested-element/ https://blog.adamfurmanek.pl/2024/05/20/serializing-collections-with-jackson-in-scala-and-renaming-the-nested-element/#respond Mon, 20 May 2024 07:55:06 +0000 https://blog.adamfurmanek.pl/?p=5030 Continue reading Serializing collections with Jackson in Scala and renaming the nested element]]> This is a workaround for XML collection duplicated element names that I was fixing recently.

We use case classes in Scala and the Jackson library for XML serialization. We have a class that has a list of some nested class. When serializing it, we would like to name the nested element properly.

We start with these classes:

case class Person(val Name: String)
 
@JsonPropertyOrder(Array("People"))
case class Base
(
	val People: List[Person]
)

When we try to serialize it, we’ll get something like this:

< Base>
    < People>
        < People>
            < Name>Some name here</Name>
        </People>
    </People>
</Base>

You can see that we have Base -> People -> People instead of Base -> People -> Person. We can try to fix it with regular annotations:

case class Base
(
	@JacksonXmlElementWrapper(localName = "People")
	@JacksonXmlProperty(localName = "Person")
	val People: List[Person]
)

It now serializes correctly. However, when you try to deserialize it, you get the following exception:

Exception in thread "main" com.fasterxml.jackson.databind.JsonMappingException: Could not find creator property with name 'Person'

This fails because we use JacksonXmlProperty to rename the property and this gets handled incorrectly.

The issue is with how Scala implements case classes. Fields are always stored as private fields with getters (and setters if you use var), and the annotations are propagated to the constructor parameters.

My way of fixing it was the following:

  • Do not propagate annotations to the constructor
  • Create another constructor that has a different signature that the default one
  • Mark the constructor as @JsonCreator

This way, Jackson creates the object with default values using the new constructor, and then sets the fields with reflection. So, this is how the code should look like:

case class Base
(
	@(JacksonXmlElementWrapper @field @getter)(localName = "People")
	@(JacksonXmlProperty @field @getter)(localName = "Person")
	val People: List[Person]
) {
   @JsonCreator
   def this(
	v1: List[Person]
	ignored: String
   ) = {
    this(v1)
   }
}

You could go with some different constructor if needed (like a parameterless one and pass nulls explicitly).

This works with Jackson 2.13.0.

]]>
https://blog.adamfurmanek.pl/2024/05/20/serializing-collections-with-jackson-in-scala-and-renaming-the-nested-element/feed/ 0
Bit Twiddling Part 4 — Disabling CTRL+ALT+HOME in mstsc.exe (Windows RDP client) https://blog.adamfurmanek.pl/2024/05/20/bit-twiddling-part-4/ https://blog.adamfurmanek.pl/2024/05/20/bit-twiddling-part-4/#respond Mon, 20 May 2024 07:40:05 +0000 https://blog.adamfurmanek.pl/?p=5025 Continue reading Bit Twiddling Part 4 — Disabling CTRL+ALT+HOME in mstsc.exe (Windows RDP client)]]>

This is the fourth part of the Bit Twiddling series. For your convenience you can find other parts in the table of contents in Par 1 — Modifying Android application on a binary level

Today we’re going to disable CTRL+ALT+HOME shortcut in mstsc. We want to disable the shortcut so it doesn’t “unfocus” the RDP session but still works inside the connection. I needed that for easier work in nested RDPs and I couldn’t find any decent RDP client for Windows that wouldn’t have this shortcut. The only one I found was Thincast but it lacked some other features as well.

Let’s begin.

Finding the entry point

The hardest part is as always finding “the place”. I took API Monitor which is like strace for Windows. I started the mstsc.exe, connected to some machine, pressed CTRL+ALT+HOME and observed what happened. After some digging here and there I found that when I press CTRL+ALT+HOME, the WM_USER+19 message is sent inside the application:

This is clearly a hint what’s going on. We can see that the key combination is captured by mstscax.dll which is the ActiveX part of the RDP. We also found the call stack and the address 0x000007ffec2d82fe2.

Analyzing the code

Now we can use WinDBG or whatever other debugger to figure out what’s going on. I attached to the running mstsc.exe, added a breakpoint with bu 0x00007ffec2d82fe2 and then pressed CTLR+ALT+HOME. As expected, the breakpoint was hit and I could observe the call stack:

0:003> kb
 # RetAddr               : Args to Child                                                           : Call Site
00 00007ffe`c2d27655     : 00000000`00000000 00000000`00000000 00000000`00000024 00000000`00008000 : mstscax!CTSCoreEventSource::InternalFireAsyncNotification+0xca
01 00007ffe`c2d262a5     : 00000000`00000100 00000000`00000000 00000000`00000024 00000135`bc82ea28 : mstscax!CTSInput::IHPostMessageToMainWindow+0x1c5
02 00007ffe`c2d261c8     : 00000000`00000001 00000000`01470001 00000000`003f0b28 00000000`00000113 : mstscax!CTSInput::IHInputCaptureWndProc+0x85
03 00007fff`5a04ef75     : 00000000`00000001 00000033`4faff2d0 00000000`003f0b28 00000000`80000022 : mstscax!CTSInput::IHStaticInputCaptureWndProc+0x58
04 00007fff`5a04e69d     : 00000000`00000000 00007ffe`c2d26170 00000033`4f5d3800 00007fff`3f646aae : USER32!UserCallWinProcCheckWow+0x515
05 00007fff`3f64ab32     : 00007ffe`c2d26170 00000000`00000000 00000000`ffffffff 00007fff`3f646a35 : USER32!DispatchMessageWorker+0x49d
06 00007fff`3f643997     : 00000000`00000000 00000000`00000000 00000135`bc830b00 00007fff`00000000 : apimonitor_drv_x64+0xab32
07 00000135`bc8fb89f     : 00000135`bc25da6a 00000000`00000001 00000000`00000001 00000000`00000001 : apimonitor_drv_x64+0x3997
08 00000135`bc25da6a     : 00000000`00000001 00000000`00000001 00000000`00000001 00000135`bd33e0e0 : 0x00000135`bc8fb89f
09 00000000`00000001     : 00000000`00000001 00000000`00000001 00000135`bd33e0e0 00007ffe`c2cd761c : 0x00000135`bc25da6a
0a 00000000`00000001     : 00000000`00000001 00000135`bd33e0e0 00007ffe`c2cd761c 00000033`4faff480 : 0x1
0b 00000000`00000001     : 00000135`bd33e0e0 00007ffe`c2cd761c 00000033`4faff480 00000135`bba40000 : 0x1
0c 00000135`bd33e0e0     : 00007ffe`c2cd761c 00000033`4faff480 00000135`bba40000 00000000`00000083 : 0x1
0d 00007ffe`c2cd761c     : 00000033`4faff480 00000135`bba40000 00000000`00000083 00000000`7ffef000 : 0x00000135`bd33e0e0
0e 00007ffe`c2cd7444     : 00000000`00000000 4e478f48`d4f63727 00000000`000051c0 0000aa52`0fad086d : mstscax!PAL_System_CondWait+0x1cc
0f 00007ffe`c2cf0f75     : 00000000`00000400 00000000`00000000 00000033`4faff8c0 00000135`bd35a720 : mstscax!CTSThreadInternal::ThreadSignalWait+0x34
10 00007ffe`c2cf1f8d     : 00000000`00000000 00000000`00000000 00000135`bd35a720 00000000`00000400 : mstscax!CTSThread::internalMsgPump+0x6d
11 00007ffe`c2d8693c     : 00000000`00000000 00007ffe`c2cee22d 00000135`bd344640 00007ffe`c2fdf3f0 : mstscax!CTSThread::internalThreadMsgLoop+0x14d
12 00007ffe`c3128550     : 00007ffe`c3475808 00000033`4faff8c0 00000000`00000000 00000135`ba29b8c0 : mstscax!CTSThread::ThreadMsgLoop+0x1c
13 00007ffe`c2fdeca8     : 00000135`bd35a720 00000135`ba29b8c0 00000135`bd35a720 00000135`ba29b748 : mstscax!CSND::SND_Main+0x148
14 00007ffe`c2fe73c2     : 00000135`bd35d100 00000135`bd35a720 00000033`4f67e3f0 00000000`00000000 : mstscax!CTSThread::TSStaticThreadEntry+0x258
15 00007fff`5a1f7344     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : mstscax!PAL_System_Win32_ThreadProcWrapper+0x32
16 00007fff`5b7a26b1     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : KERNEL32!BaseThreadInitThunk+0x14
17 00000000`00000000     : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x21

Perfect! We can see the method names and they are really useful. We can immediately see that some IHStaticInputCaptureWndProc method captures the regular message from the OS, then it calls the window procedure IHInputCaptureWndProc, and then the method posts the message to the main mstsc.exe window in IHPostMessageToMainWindow.

This shows us the way. We can also ask the WinDBG to get the call stack with addresses to find out that mstscax!CTSInput::IHInputCaptureWndProc+0x85 is mstscax+0x562a5. We can now dump the code:

0:004> u mstscax+0x562a5-0x85 mstscax+0x562a5
mstscax+0x56220:
00007ffe`c2d26220 4053            push    rbx
00007ffe`c2d26222 55              push    rbp
00007ffe`c2d26223 56              push    rsi
00007ffe`c2d26224 57              push    rdi
00007ffe`c2d26225 4154            push    r12
00007ffe`c2d26227 4155            push    r13
00007ffe`c2d26229 4156            push    r14
00007ffe`c2d2622b 4157            push    r15
00007ffe`c2d2622d 4881ece8000000  sub     rsp,0E8h
00007ffe`c2d26234 488b05cdc37500  mov     rax,qword ptr [mstscax!DllUnregisterServer+0x6f1308 (00007ffe`c3482608)]
00007ffe`c2d2623b 4833c4          xor     rax,rsp
00007ffe`c2d2623e 48898424d0000000 mov     qword ptr [rsp+0D0h],rax
00007ffe`c2d26246 33ed            xor     ebp,ebp
00007ffe`c2d26248 0f57c0          xorps   xmm0,xmm0
00007ffe`c2d2624b 448bf5          mov     r14d,ebp
00007ffe`c2d2624e 896c245c        mov     dword ptr [rsp+5Ch],ebp
00007ffe`c2d26252 f30f7f442470    movdqu  xmmword ptr [rsp+70h],xmm0
00007ffe`c2d26258 498bf1          mov     rsi,r9
00007ffe`c2d2625b 418bd8          mov     ebx,r8d
00007ffe`c2d2625e 4c8be2          mov     r12,rdx
00007ffe`c2d26261 488bf9          mov     rdi,rcx
00007ffe`c2d26264 488b059df57400  mov     rax,qword ptr [mstscax!DllUnregisterServer+0x6e4508 (00007ffe`c3475808)]
00007ffe`c2d2626b 4c8d2d96f57400  lea     r13,[mstscax!DllUnregisterServer+0x6e4508 (00007ffe`c3475808)]
00007ffe`c2d26272 4c8bbc2450010000 mov     r15,qword ptr [rsp+150h]
00007ffe`c2d2627a 493bc5          cmp     rax,r13
00007ffe`c2d2627d 740a            je      mstscax+0x56289 (00007ffe`c2d26289)
00007ffe`c2d2627f f6401c01        test    byte ptr [rax+1Ch],1
00007ffe`c2d26283 0f85c6010000    jne     mstscax+0x5644f (00007ffe`c2d2644f)
00007ffe`c2d26289 39af88030000    cmp     dword ptr [rdi+388h],ebp
00007ffe`c2d2628f 0f85f1020000    jne     mstscax+0x56586 (00007ffe`c2d26586)
00007ffe`c2d26295 4d8bcf          mov     r9,r15
00007ffe`c2d26298 4c8bc6          mov     r8,rsi
00007ffe`c2d2629b 8bd3            mov     edx,ebx
00007ffe`c2d2629d 488bcf          mov     rcx,rdi
00007ffe`c2d262a0 e8eb110000      call    mstscax+0x57490 (00007ffe`c2d27490)
00007ffe`c2d262a5 85c0            test    eax,eax

Great. We can see that the line 00007ffe`c2d262a0 calls some internal method and then tests if the output of the call is equal to 0 (this is the meaning of test eax,eax). We can now comment out this line and see what happens (mind the empty line):

a 00007ffe`c2d262a0
xor eax,eax
nop
nop
nop

g

We come back to mstsc.exe and we can see that CTRL+ALT+HOME doesn’t unfocus the window anymore! We can also check that the combination is handled properly inside the connection so it still unfocuses nested RDPs.

Automation

Automating this is pretty straightforward with cdb. We could modify the dll in place, but that would affect all the RDP connections we ever make. If we want to disable CTRL+ALT+HOME for some of them only, then this is what we can do:

First, create the file mstsc.txt with the following code:

a mstscax+0x562a0
xor eax, eax
nop
nop
nop

qd

Next, create batch file no_home.bat

cdb.exe -p %1 -cf mstsc.txt

Finally, run it like this:

no_home.bat PROCESS_ID_OF_MSTSC_YOU_WANT_TO_HACK

Changing shortcut to something else

How about we change shortcut instead of getting rid of it entirely? Let’s disassemble the method a bit more:

0:001> u mstscax+0x57655-0x1c5 mstscax+0x57655
mstscax!CTSInput::IHPostMessageToMainWindow:
00007ffe`c2d27490 48895c2408      mov     qword ptr [rsp+8],rbx
00007ffe`c2d27495 48896c2410      mov     qword ptr [rsp+10h],rbp
00007ffe`c2d2749a 4889742418      mov     qword ptr [rsp+18h],rsi
00007ffe`c2d2749f 57              push    rdi
00007ffe`c2d274a0 4156            push    r14
00007ffe`c2d274a2 4157            push    r15
00007ffe`c2d274a4 4883ec30        sub     rsp,30h
00007ffe`c2d274a8 33db            xor     ebx,ebx
00007ffe`c2d274aa 4d8bf1          mov     r14,r9
00007ffe`c2d274ad 498be8          mov     rbp,r8
00007ffe`c2d274b0 8bf2            mov     esi,edx
00007ffe`c2d274b2 488bf9          mov     rdi,rcx
00007ffe`c2d274b5 399958030000    cmp     dword ptr [rcx+358h],ebx
00007ffe`c2d274bb 0f8582000000    jne     mstscax!CTSInput::IHPostMessageToMainWindow+0xb3 (00007ffe`c2d27543)
00007ffe`c2d274c1 81fa00010000    cmp     edx,100h
00007ffe`c2d274c7 0f8489000000    je      mstscax!CTSInput::IHPostMessageToMainWindow+0xc6 (00007ffe`c2d27556)
00007ffe`c2d274cd b901000000      mov     ecx,1
00007ffe`c2d274d2 8d82fcfeffff    lea     eax,[rdx-104h]
00007ffe`c2d274d8 3bc1            cmp     eax,ecx
00007ffe`c2d274da 0f47cb          cmova   ecx,ebx
00007ffe`c2d274dd 399f24030000    cmp     dword ptr [rdi+324h],ebx
00007ffe`c2d274e3 0f84c3040000    je      mstscax!CTSInput::IHPostMessageToMainWindow+0x51c (00007ffe`c2d279ac)
00007ffe`c2d274e9 448bc3          mov     r8d,ebx
00007ffe`c2d274ec 81fe04010000    cmp     esi,104h
00007ffe`c2d274f2 0f84c7040000    je      mstscax!CTSInput::IHPostMessageToMainWindow+0x52f (00007ffe`c2d279bf)
00007ffe`c2d274f8 8bd3            mov     edx,ebx
00007ffe`c2d274fa 85c9            test    ecx,ecx
00007ffe`c2d274fc 0f85d1040000    jne     mstscax!CTSInput::IHPostMessageToMainWindow+0x543 (00007ffe`c2d279d3)
00007ffe`c2d27502 8bc3            mov     eax,ebx
00007ffe`c2d27504 85d2            test    edx,edx
00007ffe`c2d27506 0f8558010000    jne     mstscax!CTSInput::IHPostMessageToMainWindow+0x1d4 (00007ffe`c2d27664)
00007ffe`c2d2750c 85c0            test    eax,eax
00007ffe`c2d2750e 0f8550010000    jne     mstscax!CTSInput::IHPostMessageToMainWindow+0x1d4 (00007ffe`c2d27664)
00007ffe`c2d27514 8bc3            mov     eax,ebx
00007ffe`c2d27516 4585c0          test    r8d,r8d
00007ffe`c2d27519 0f85c8040000    jne     mstscax!CTSInput::IHPostMessageToMainWindow+0x557 (00007ffe`c2d279e7)
00007ffe`c2d2751f 85c0            test    eax,eax
00007ffe`c2d27521 0f85c0040000    jne     mstscax!CTSInput::IHPostMessageToMainWindow+0x557 (00007ffe`c2d279e7)
00007ffe`c2d27527 488b6c2458      mov     rbp,qword ptr [rsp+58h]
00007ffe`c2d2752c 8bc3            mov     eax,ebx
00007ffe`c2d2752e 488b5c2450      mov     rbx,qword ptr [rsp+50h]
00007ffe`c2d27533 488b742460      mov     rsi,qword ptr [rsp+60h]
00007ffe`c2d27538 4883c430        add     rsp,30h
00007ffe`c2d2753c 415f            pop     r15
00007ffe`c2d2753e 415e            pop     r14
00007ffe`c2d27540 5f              pop     rdi
00007ffe`c2d27541 c3              ret
00007ffe`c2d27542 cc              int     3
00007ffe`c2d27543 81fe13010000    cmp     esi,113h
00007ffe`c2d27549 0f851f010000    jne     mstscax!CTSInput::IHPostMessageToMainWindow+0x1de (00007ffe`c2d2766e)
00007ffe`c2d2754f bb01000000      mov     ebx,1
00007ffe`c2d27554 ebd1            jmp     mstscax!CTSInput::IHPostMessageToMainWindow+0x97 (00007ffe`c2d27527)
00007ffe`c2d27556 8b8128030000    mov     eax,dword ptr [rcx+328h]
00007ffe`c2d2755c be00800000      mov     esi,8000h
00007ffe`c2d27561 483be8          cmp     rbp,rax
00007ffe`c2d27564 0f8470010000    je      mstscax!CTSInput::IHPostMessageToMainWindow+0x24a (00007ffe`c2d276da)
00007ffe`c2d2756a 8b8744030000    mov     eax,dword ptr [rdi+344h]
00007ffe`c2d27570 483be8          cmp     rbp,rax
00007ffe`c2d27573 0f84dc010000    je      mstscax!CTSInput::IHPostMessageToMainWindow+0x2c5 (00007ffe`c2d27755)
00007ffe`c2d27579 8b8748030000    mov     eax,dword ptr [rdi+348h]
00007ffe`c2d2757f 483be8          cmp     rbp,rax
00007ffe`c2d27582 0f8440020000    je      mstscax!CTSInput::IHPostMessageToMainWindow+0x338 (00007ffe`c2d277c8)
00007ffe`c2d27588 8b874c030000    mov     eax,dword ptr [rdi+34Ch]
00007ffe`c2d2758e 483be8          cmp     rbp,rax
00007ffe`c2d27591 0f84a4020000    je      mstscax!CTSInput::IHPostMessageToMainWindow+0x3ab (00007ffe`c2d2783b)
00007ffe`c2d27597 8b8750030000    mov     eax,dword ptr [rdi+350h]
00007ffe`c2d2759d 483be8          cmp     rbp,rax
00007ffe`c2d275a0 0f8408030000    je      mstscax!CTSInput::IHPostMessageToMainWindow+0x41e (00007ffe`c2d278ae)
00007ffe`c2d275a6 4883fd24        cmp     rbp,24h
00007ffe`c2d275aa 0f8471030000    je      mstscax!CTSInput::IHPostMessageToMainWindow+0x491 (00007ffe`c2d27921)
00007ffe`c2d275b0 4883fd2d        cmp     rbp,2Dh
00007ffe`c2d275b4 0f856dffffff    jne     mstscax!CTSInput::IHPostMessageToMainWindow+0x97 (00007ffe`c2d27527)
00007ffe`c2d275ba 8d4de5          lea     ecx,[rbp-1Bh]
00007ffe`c2d275bd 48ff15acec5b00  call    qword ptr [mstscax!_imp_GetKeyState (00007ffe`c32e6270)]
00007ffe`c2d275c4 0f1f440000      nop     dword ptr [rax+rax]
00007ffe`c2d275c9 6685c6          test    si,ax
00007ffe`c2d275cc 0f8455ffffff    je      mstscax!CTSInput::IHPostMessageToMainWindow+0x97 (00007ffe`c2d27527)
00007ffe`c2d275d2 8d4de4          lea     ecx,[rbp-1Ch]
00007ffe`c2d275d5 48ff1594ec5b00  call    qword ptr [mstscax!_imp_GetKeyState (00007ffe`c32e6270)]
00007ffe`c2d275dc 0f1f440000      nop     dword ptr [rax+rax]
00007ffe`c2d275e1 6685c6          test    si,ax
00007ffe`c2d275e4 0f843dffffff    je      mstscax!CTSInput::IHPostMessageToMainWindow+0x97 (00007ffe`c2d27527)
00007ffe`c2d275ea e8c581fcff      call    mstscax!CClientUtilsWin32::UT_IsRunningInAppContainer (00007ffe`c2cef7b4)
00007ffe`c2d275ef 85c0            test    eax,eax
00007ffe`c2d275f1 0f8430ffffff    je      mstscax!CTSInput::IHPostMessageToMainWindow+0x97 (00007ffe`c2d27527)
00007ffe`c2d275f7 488b050ae27400  mov     rax,qword ptr [mstscax!WPP_GLOBAL_Control (00007ffe`c3475808)]
00007ffe`c2d275fe 4c8d3d03e27400  lea     r15,[mstscax!WPP_GLOBAL_Control (00007ffe`c3475808)]
00007ffe`c2d27605 493bc7          cmp     rax,r15
00007ffe`c2d27608 7430            je      mstscax!CTSInput::IHPostMessageToMainWindow+0x1aa (00007ffe`c2d2763a)
00007ffe`c2d2760a f6401c01        test    byte ptr [rax+1Ch],1
00007ffe`c2d2760e 742a            je      mstscax!CTSInput::IHPostMessageToMainWindow+0x1aa (00007ffe`c2d2763a)
00007ffe`c2d27610 80781904        cmp     byte ptr [rax+19h],4
00007ffe`c2d27614 7224            jb      mstscax!CTSInput::IHPostMessageToMainWindow+0x1aa (00007ffe`c2d2763a)
00007ffe`c2d27616 e82de60000      call    mstscax!RdpWppGetCurrentThreadActivityIdPrefix (00007ffe`c2d35c48)
00007ffe`c2d2761b 488b0de6e17400  mov     rcx,qword ptr [mstscax!WPP_GLOBAL_Control (00007ffe`c3475808)]
00007ffe`c2d27622 4c8d0557ca5c00  lea     r8,[mstscax!WPP_f5f71bb7bac236b27f26969128cc1e12_Traceguids (00007ffe`c32f4080)]
00007ffe`c2d27629 448bc8          mov     r9d,eax
00007ffe`c2d2762c bafd000000      mov     edx,0FDh
00007ffe`c2d27631 488b4910        mov     rcx,qword ptr [rcx+10h]
00007ffe`c2d27635 e85ecdfcff      call    mstscax!WPP_SF_D (00007ffe`c2cf4398)
00007ffe`c2d2763a 488b8f60050000  mov     rcx,qword ptr [rdi+560h]
00007ffe`c2d27641 4533c0          xor     r8d,r8d
00007ffe`c2d27644 488b01          mov     rax,qword ptr [rcx]
00007ffe`c2d27647 418d5010        lea     edx,[r8+10h]
00007ffe`c2d2764b 488b4048        mov     rax,qword ptr [rax+48h]
00007ffe`c2d2764f ff1523f75b00    call    qword ptr [mstscax!_guard_dispatch_icall_fptr (00007ffe`c32e6d78)]
00007ffe`c2d27655 c7871404000001000000 mov dword ptr [rdi+414h],1

The most interesting line is:

00007ffe`c2d275a6 4883fd24        cmp     rbp,24h

0x24 is the code key of HOME. We can replace it with something else, like insert which is 2D:

e mstscax+0x575A9 0x2D

And there you go. You can now press CTRL+ALT+HOME to unfocus nested RDP and CTRL+ALT+INSERT to unfocus the outer one. This gives you two levels of unfocusing.

]]>
https://blog.adamfurmanek.pl/2024/05/20/bit-twiddling-part-4/feed/ 0
Availability Anywhere Part 27 — Tools for remote work from laptops and XR/VR/AR https://blog.adamfurmanek.pl/2024/05/09/availability-anywhere-part-27/ https://blog.adamfurmanek.pl/2024/05/09/availability-anywhere-part-27/#respond Thu, 09 May 2024 09:54:10 +0000 https://blog.adamfurmanek.pl/?p=5018 Continue reading Availability Anywhere Part 27 — Tools for remote work from laptops and XR/VR/AR]]>

This is the twentieth seventh part of the Availability Anywhere series. For your convenience you can find other parts in the table of contents in Part 1 – Connecting to SSH tunnel automatically in Windows

Last time we discussed how to work remotely. Let’s now look at that from a different perspective and see various tools that we can use and how they work.

Sessions, input, and output

Before seeing the tools let’s understand some basics of how computers work. This will be important to understand what tools we need and how they should work. I’ll be talking mostly from the perspective of Windows OS, but the same applies to other systems.

Whenever you try to use a computer, you need to have a session. Session is an object that controls your human interfaces, most importantly input and output. Depending on the version of Windows, you may be able to create many sessions.

The main session (the one you have when you “sit at a computer”) is called CONSOLE. You can see it with qwinsta command:

>qwinsta
 SESSIONNAME       USERNAME                 ID  STATE   TYPE        DEVICE
 services                                    0  Disc
 console                                     1  Conn

CONSOLE is what you create when you use your computer “normally”. This session handles your keyboard, mouse, and (most importantly) your monitors.

You can create more sessions. Whenever you use mstsc.exe, you connect to a computer and create a session:

SESSIONNAME       USERNAME                 ID  STATE   TYPE        DEVICE
 rdp-tcp#112       user                     2  Active

Obviously, when you create a new session, it must have some way of getting input (mouse and keyboard) and producing output (monitors).

It’s important to understand that one physical monitor can be in one session only (at least I have no idea how to use it in many sessions and I don’t think that would make sense). After all, you just have “one picture” on the monitor and there is no point in showing many sessions in it. Sure, you could fake it to some extent maybe to do picture-in-picture, but I really don’t see a decent use case for that.

While all of that sounds super clear, good understanding is crucial for picking the right tools.

What do we really need

Let’s now think what we really need to do the remote work.

Regarding sessions:

  • Do we want to connect to an existing session? In other words, do we want to connect to the state that the “person sitting next to the computer” sees?
  • Do we want to create a new session? This will start things from scratch just like we “restarted” the computer

However, the more important question we need to consider is how do we want to interact with the session?

  • Do we want to control it at the destination (“remote”)?
  • Do we want to control it from both places (“remote” and “local”)? This is “shared” control.
  • Or maybe only “local” should control it? This is “takeover” control.
  • Do you want the keyboard shortcuts pressed on “local” to be sent to “remote”?

If you read this carefully, you may realize some interesting facts that may change the way you look for the remote work software:

  • If you want to create a session and control it from “local” then you need RDP-like (mstsc.exe-like) application
  • If you want to connect to the existing session and control it from “local” then you need VNC-like software
  • If you want to connect to the existing session and control it from “remote” then you basically need a screen share

While all these things may seem rather trivial, they have crucial impact on what we’re actually trying to achieve. Let’s see the typical scenarios:

  • You want to connect to a remote server (like VDI, Azure VM, etc.) – in that case you want to create a new session and you want to control it from “local” which is RDP-like software
  • You want to help someone else with their computer – in that case you connect to the existing session and you want to let “remote” control it which is a screen share
  • You want to play a game on your VR goggles from your PVCR – in that case you connect to the existing session and you want to control it from “local” or from both places which is a VNC-like software
  • You want to use VR goggles to work from your laptop – in that case you connect to the existing session and you want to let “remote” control it which is a screen share

Notice that a regular “monitor” can be thought of a remote connection to the existing session that is controlled from “remote” so it’s basically a screen share. However, you could have a TV that you use to connect to the existing session and control it from “local” with your remote control (no pun intended) which is a VNC-like software.

Multiple monitors

Now we can ask the question of the monitors. This is crucial as here are many ways we can configure our machines.
Let’s assume that our “local” has 3 screens and our “remote” has 2 physical screens. So the geometry in “remote” may look like (1920, 1080, 0, 0) and (1920, 1080, 1920, 0) (which is (width, height, left, top)). Similarly, geometry in “local” may be (1920, 1080, 0, 0) + (1920, 1080, -1920, 0) + (1920, 1080, 1920, 1080).

Let’s consider two scenarios.

Creating new session in remote

If we create a new session in remote then we need to provide monitors for it. Effectively, it doesn’t matter what monitors are there in “remote” because we won’t use them. The only important thing that matters here is that we have 3 monitors in “local”. To understand what’s going on, we now need to decide on how many virtual monitors we want to create in “remote” and how we want to map them in “local”. There are typically the following options:

  • 1:1 – You create one monitor in “remote” and you map it to one monitor in “local” – this is how most RDP clients work
  • 1:n – You create one monitor in “remote” and you show it on multiple monitors in “local” – this is how xfreerdp workd for me. Sometimes this is called “span”
  • n:n – You create as many monitors in “remote” as you have in “local” – this is how mstsc.exe works when you select “use all displays”. Sometimes this is called “multimon”
  • m:m – You create fewer monitors in “remote” then you have in “local” but each monitor is mapped to one monitor. This is how mstsc.exe works when you modify the .rdp file directly
  • m:n – You create some monitors in “remote” and you map them to some configuration in “local”. I don’t know any software that would do this. Notice that this doesn’t mean that you use entire “local” monitor. You could come up with any geometry that you need.

Whenever I talk about “multiple monitors” I mean the n:n or m:m scenario from the above.

Connecting to he existing session in remote

In this scenario the “remote” has 2 monitors. You now want to map them somehow in your machine. You have the following options:

  • 1:1 – You show only one “remote” monitor in one “local”
  • m:1 – You show all “remote” monitors in one “local” (typically one window) – this is how most VNC apps work
  • m:m – You show “remote” monitors separately in “local”. This rarely works on Windows
  • m:n – You configure custom geometry. I’m not aware of any software that does that

Let’s now see the software for various scenarios.

Software

Applications that create new session (RDP-like) (and therefore “local” must control the session):

I’m not aware of an RDP-like application that would work in VR goggles.

VNC-like applications:

  • TightVNC, TigerVNC, UltraVNC and similar – they typically support 1:1 and “shared” control. You can often change the geometry on the server (I didn’t see any client that would let me change the geometry, though).
  • vSpatial – this supports n:n, and “shared” or “takeover”
  • noVNC – this is a web VNC client. It supports 1:1 and “shared”. It’s just an HTML file with some JS, so you can easily adjust it to your needs
  • Immersed – this supports n:n and “shared”
  • Virtual Desktop, Multirooms – they support 1:1 and “shared”

Screen share applications:

  • vdo.ninja – this shares one screen and works in browser, so you can share as many screens as you need
  • Google Meets, Teams, Amazon Chime, Zoom, Webex and many others – they share one screen and work in browser or in dedicated apps. You can open up many bridges to share many screens

And now the most important part (kind of TL;DR):

  • You want to have the “true” RDP experience with multiple monitors – mstsc.exe or Thincast Remote Desktop Client
  • You want to interact with some remote session from your computer – any VNC client that works for your case (and your mileage may vary significantly) – TightVNC, UltraVNC, TigerVNC, noVNC, many more. Also, vSpatial
  • You just want to see the remote session only – any screen share or VNC. Specifically, you don’t need Immersed or other “built-in” VR apps that stream the screen. You can just open up a browser and use vdo.ninja!

Being that said. If you just want to use your goggles as a “monitor” for your laptop, then you don’t need any CPU/GPU-heavy applications for that. Just share the screen in browser and it works.

And if you want to offload that even more, then use RDP + Screen share:

  • Take your road runner and connect over RDP to a remote machine. This way you can create a session there with monitors (and you can use IndirectDisplayDriver or BetterDisplay to create virtual monitors)
  • Use your VR goggles to connect to see the screen shared from the remote machine (with screen share solutions like browser, VNC, vSpatial or whatever else)

This way your road runner does nearly nothing. It just connects to the remote machine. Similarly, your VR goggles don’t work hard. They just run a browser or something light.

]]>
https://blog.adamfurmanek.pl/2024/05/09/availability-anywhere-part-27/feed/ 0
Availability Anywhere Part 26 — Working remotely like a pro https://blog.adamfurmanek.pl/2024/04/24/availability-anywhere-part-26/ https://blog.adamfurmanek.pl/2024/04/24/availability-anywhere-part-26/#respond Wed, 24 Apr 2024 19:59:55 +0000 https://blog.adamfurmanek.pl/?p=5004 Continue reading Availability Anywhere Part 26 — Working remotely like a pro]]>

This is the twentieth sixth part of the Availability Anywhere series. For your convenience you can find other parts in the table of contents in Part 1 – Connecting to SSH tunnel automatically in Windows

Over the years I improved my remote work setup significantly. This blog post summarizes what various approaches I took and how I configured my setup over the years. One note before moving on – I assume you don’t have any “hardware” but the computer (so no programmable boards, weird mobile devices, confidential hardware etc.).

Level 1

First approach is what most of us consider “remote work”. You just take your corporate laptop with you and you work remotely. You probably have a corporate VPN that lets you connect to the company’s resources from wherever you are.

Pros:

  • You can do everything the same way you did while in the office

Cons:

  • You can’t work from devices that are not onboarded with your corporate infrastructure – mobile phones, tablets, private laptops, VR goggles, etc.
  • You need to take the corporate laptop with you anywhere you go
  • You work from a device that is probably hot and loud working crazy on your office stuff (we all know the bloatware they install on the machines…)
  • You can rarely pick your hardware, so your laptop is probably big and bulky (and yes, Macbook Air is big and bulky)
  • You may not be able to connect to the VPN in some places due to weird networking conditions

While this is enough to actually do the work, I don’t like this approach. First, I always want to be able to be online (that doesn’t mean that I’m online all the time) no matter where I am and what I have on me. Specifically, I want to be able to access the infrastructure and do all my stuff using my mobile phone, private laptop, etc. I mean, I want to be able to do coding or video processing from my mobile phone while on a train (yes, I did that).

Second, I don’t like taking my corporate machine with me. This is especially troublesome when I need to travel for conferences for which I need to take more hardware with me (like 2 laptops and many mobile phones). Taking another laptop on the plane is just cumbersome. This is sometimes even dangerous depending on where you go and when they want to control your corporate machines. Not to mention, that your corporate policy may prohibit you from taking the corporate laptop to some places.

Third, I don’t like when I can’t pick my hardware. My laptop must be light and quiet, especially when I sit in a deckchair or lie down.

Level 1.5

Sometimes you cannot connect to the VPN while working remotely. This is due to weird networking, policy, or simply geolocation block.

You can fix that by getting a networking device that will connect to some other VPN and expose a tunnel for you. Typical solution is to get a router with built-in VPN. You can see this tutorial to learn how to do that.

However, I don’t recommend taking a router for that. Just get yourself a decent Windows x86 tablet, for instance Dell Venue 8 Pro and install SoftEther VPN. First, such a tablet is a “full Windows machine” so you can do whatever is needed to get connected (think about captive portals or weird passwords). Second, since this is a tablet, you can take it with you wherever you need.

Level 2

Let’s not bring the corporate machine with us at all. Get yourself a corporate virtual machine (typically called virtual desktop or VDI) and do all the work from it. Next, to work remotely, you just need to connect to the VDI from your device (which we’ll call road runner) using RDP + SSH. To do that, you may need to onboard your road runner with your corporate VPN. However, there are many solutions to work that around, so from now on we assume your road runner is not onboarded with the corporate bloatware.

Pros:

  • You can do almost everything the same way you did while in the office
  • You can work from your laptop
  • You can take your hardware anywhere you wish

Cons:

  • You may still be unable to work from mobile phones, tablets, VR goggles, etc.
  • You may need to run video conferencing software on your road runner (which may not be possible)
  • Your road runner may still get hot and loud when taking crazy video call with greenscreen effects
  • You may not be able to do some corporate tasks – think about GPU-heavy things like video processing

This is much better now since we can connect from our road runner, so we can get a light laptop like Acer Swift 5 which is 1kg/2.2lb (your Macbook Air is at least 25% heavier).

When it comes to video conferences, they may be tricky. First, your webcam may not work over RDP. Second, RDP introduces noticeable audio delay. To fix that, you may need to run the conference software locally (I recommend doing that in browser). While it solves the problem, it also makes your road runner loud and hot.

Last but not least, if your company cannot prepare a VDI for you, you can create it on your own. Just use Sysinternal’s disk2vhd. Keep in mind it may cause some troubles with Active Domain and IdP in general, but it works.

Level 3

Let’s finally address the problem that we can’t work from any device. Let’s add a jump host. Get yourself a decent Windows machine and put it somewhere on the Internet. I generally recommend Hetzner (but whatever you choose, just get yourself a dedicated host). And BTW, choose Intel CPU instead of AMD (your virtual machines will thank you for that). Next, connect to that machine with plain RDP. This way, you can connect to it using any road runner – be it laptop, mobile phone, tablet, or whatever else. And since it can be a mobile phone, just get yourself a decent foldable. Yes, that really makes a difference when you’re working remotely.

Pros:

  • You can do almost everything the same way you did while in the office
  • You can work from nearly any device
  • You can take your hardware anywhere you wish

Cons:

  • You may need to run video conferencing software on your road runner (which may not be possible)
  • Your road runner may still get hot and loud when taking crazy video call with greenscreen effects
  • You may not be able to do some corporate tasks – think about GPU-heavy things like video processing
  • You may have hard time connecting to the corporate VPN

When dealing with corporate VPN, use things like SSLH or FileProxy for avoiding VPN without split tunneling (also known as TCP over File System). There are many more tricks and improvements you can apply. I cover them in my talk.

Level 4

Let’s now deal with more devices and the GPU. I use two solutions for that: vSpatial and NoMachine.

First, vSpatial lets you connect to your jump host with browser and application for VR goggles. This way, you can work from basically any device you can think of. Another bonus is that vSpatial lets you forward audio and camera, so you can take your meetings from the jump host. However, as of today, vSpatial supports only 20 FPS for your webcam, so people will notice that something is off. We’ll deal with that in a second.

Second, NoMachine lets you connect to the jump host over VNC-like protocol which lets you use GPU easily. You can now do basically anything you can think of.

Pros:

  • You can do everything the same way you did while in the office (assuming your VDI can do the things your corporate laptop could)
  • You can work from any device
  • You can take your hardware anywhere you wish

Cons:

  • You may need to run video conferencing software on your road runner (which may not be possible)
  • Your road runner may still get hot and loud when taking crazy video call with greenscreen effects
  • You may have hard time connecting to the corporate VPN

Nothing stops you from having more jump hosts. For instance, you can host a dedicated Mac mini in Cyberlynk and connect to it.

Also, if you need more screens with your road runner, you can try portable monitor. You can also emulate screens with IddSampleDriver or its Rust version, or Better Display. This way you can literally move between machines with not a single window changing its position.

Level 5

Let’s finally deal with our road runners becoming loud and hot. We basically want to offload whole video processing to the jump host so that our device doesn’t need to encode/decode any video conferencing stuff.

To do that, use VDO.Ninja. This lets you “forward” your audio and camera to any jump host. Just use OBS or Splitcam on the jumphost to receive the video, apply any greenscreen effects you need, and forward it to the conference call.

Pros:

  • You can do everything the same way you did while in the office (assuming your VDI can do the things your corporate laptop could)
  • You can work from any device
  • You can take your hardware anywhere you wish
  • Your hardware stays cool and quiet no matter what you do

Cons:

  • You may have hard time connecting to the corporate VPN

This is how my final setup looks like. I don’t do anything on my road runner. Nothing. I just connect to the jump host with RDP + vSpatial + NoMachine and I forward my audio + video with VDO.Ninja. No matter if I code, process videos, record stuff, attend conference call (or many of them), remove my webcam background, or anything else – my road runner has constant and predictable load. I can also switch to any other road runner easily.

Double-thin-client architecture

Yet another trick you can apply is what I call “double-thin-client” architecture. Instead of booting up your road runner natively, you can use native boot and get yourself a portable SSD drive like Samsung T7.

This way, you can install your operating system once, and carry it with you wherever you go. You just plug the drive to the USB, boot the operating system from the drive, and you have your environment up and running. You can then replace your road runner with another machine easily. I actually did that many times. My Windows installation survived like 4 different laptops already. Best thing is, you can simply clone the drive to another one when you get a new SSD, so you can upgrade your hardware in place with no failures.

You can use similar approach for your jump host. Do not install things locally. Just boot from VHD and this way you can easily take backup of your dedicated host.

Other improvements

There are many more things you can consider for improving your remote work. Some ideas include Bluetooth devices (obviously) and VNC server installed on a road runner, so you can walk around your home and control the device from your mobile phone. This can be useful when you’re already dialed into a call and you don’t want to rejoin from another device. Simply take your Bluetooth headset with you to another room and watch the screen of your road runner from a mobile phone.

Another idea is automatically capturing meeting notes. Since you take calls on your jump host, you can run some notetaker AI application that will automatically capture whatever you need and send it forward.

You can use Firefox with Multi-Account Containers and Tree Style Tab. They really make your work way faster. Too bad it’s just for Firefox.

You may also want to use VirtuaWin to have many desktops. I find it much better to use than “tasks” in Windows, mostly thanks to keyboard shortcuts. And if you need to run applications that need to actively interact with your desktop (like Puppeteer or UIPath), just use Desktops.

Tricks for mobile phones

You can also work on your mobile phone. First, you can get a silicone keyboard to type easily. This is really great and you can actually work from your mobile phone. I once travelled for two days with my mobile phone only and I was still able to do coding and debugging.

You can dial in over GSM. This may simplify the way you take online meetings.

Next, you can run many clones of one application with Multiple Accounts. This way, you can separate your corporate stuff from your personal one on your road runner.

You can virtualize mobile phone inside a mobile phone. Just consider applications like VMOS or Virtual Master – Android Clone. You can try Limbo x86 PC Emulator.

You can turn your mobile phone into a mobile workstation with Samsung DeX.

Short note on Apple devices

I sometimes get asked why I don’t use Apple devices. If you watch my conference recording, you can see that I always use Windows and Android. This is surprising to some people because they consider me a “power user” and yet it seems like I don’t use Apple devices. Let me give you some reasons.

First, I do use them. I really like my Mac with silicon chip as it’s really great CPU + GPU in one box. However, I don’t use it as a road runner for these reasons:

  • It’s heavy. It’s over 25% heavier than my Windows road runner. While it’s not a big deal as it’s still relatively light, I just don’t want to carry additional kilograms
  • It is loud and hot. I don’t like that it gets so hot when I typical stuff. This is especially irritating when working in bed
  • It doesn’t support booting from VHD which I need for my public speaking stuff
  • Most importantly, Macs do not support many screens easily. Right now, I have 5 displays on my desk (not counting the monitor one). When I travel, I often plug additional screens to my laptop. I just don’t accept that Macbook Air doesn’t handle 3 external screens

When it comes to iPhone, I have these reasons to prefer Android:

  • iPhone doesn’t record calls. This is a major obstacle for which there is no easy workaround
  • iPhone can’t run many instances of an app. I want to run them independently because I need to log in with multiple accounts. Some applications support that natively, but I need a generic solution
  • Foldable phone is really a game changer
  • I have some “sophisticated” applications running on my mobile phone, for instance web server and Node.js applications that automate things around text messages. I simply can’t have that on iPhone
  • Many things are not available for iPhone, like some of my VPNs, port forwarding, etc. Android just works better

That’s it. I’ve been using iPhone for some time and it just doesn’t cut. For Macbook, I may use it as a road runner one day when it improves the hardware specification (sic!).

]]>
https://blog.adamfurmanek.pl/2024/04/24/availability-anywhere-part-26/feed/ 0
Types and Programming Languages Part 20 – Measuring how hard your work is https://blog.adamfurmanek.pl/2024/03/22/types-and-programming-languages-part-20/ https://blog.adamfurmanek.pl/2024/03/22/types-and-programming-languages-part-20/#respond Fri, 22 Mar 2024 20:53:02 +0000 https://blog.adamfurmanek.pl/?p=4991 Continue reading Types and Programming Languages Part 20 – Measuring how hard your work is]]>

This is the twentieth part of the Types and Programming Languages series. For your convenience you can find other parts in the table of contents in Part 1 — Do not return in finally

Sometimes you may ask yourself how hard you work. You could count the hours of work but we all know that some things are harder and some others are easier. One hour isn’t the same effort in different activities. Similarly, it’s easier to work when you’re not supervised and there is no time pressure. With deadlines ahead of you, the same amount of work now becomes harder and more stressful. Another factor is when you can do your work. Maybe you can work 24/7 and do it when you just feel like it, or maybe you need to stick to a strict schedule.

Being that said, there are many factors that affect “how hard” the work is. I was considering that recently and this is the formula I think works in my case. Let’s call this “Work Complexity Model”:

    \begin{gather*} C - \text{Cost of single context switch between activities} \\ A - \text{Set of activities} \\ S - \text{The size of particular increment} \\ I - \text{Number of increments in a given timeframe}\\ T - \text{Timeframe length} \\ E - \text{Final effort} \\ E = C^{card(A)} \sum_{a \in A} \frac{S \cdot I ^2} {T} \end{gather*}

Let’s say that at your work you need to do coding, doc writing, and mentoring. Therefore, your set of activities would have these three elements. You then need to asses the cost of context switch which is your personal coefficient. It doesn’t matter per se and do not compare it with others. You can use it to compare your effort in different months when you move between projects or tasks.

Next, you need to decide on the period, for instance a single quarter.

Next, for each activity you need to measure how many increments you have. If you need to deliver your work at the end of the sprint, then you would have six increments (two increments each month). If you need to deliver something every day, then you would have ninety increments.

You then need to measure the size of the increment. Obviously, this is very subjective and it’s up to you to define how hard a particular piece of work is. Technically, this should be the amount of energy (physical and mental) you spent on the task. Since it’s hard to measure, you can just count the number of hours dedicated to the task and then multiply that by how intense and frustrating the work was.

Finally, you need to include the length of the timeframe to do the work. If you can work asynchronously 24/7, then your timeframe would be the whole period. If you can do your work during work days 9-5, then it’s just these working hours.

Let’s say that you can do coding 24/7, same for doc writing, but the mentoring you can do only on Mondays 9-5. You need to deliver your coding artifacts every other week, your doc writing twice a week, and your mentoring every Monday. Therefore, this would be your complexity over 3 months.

    \begin{gather*} E = C^3 \cdot \left(\frac{S_1 \cdot 6^2} {2160} + \frac{S_2 \cdot 24^2} {2160} + \frac{S_3 \cdot 12^2} {48} \right) \end{gather*}

See that the formula has the following features:

  • It shows that context switching has some cost that scales non-linearly with the number of activities
  • Number of increments affects the result much more than the size of the increments. That’s because supervision tends to slow us much more
  • The length of the timeframe is also included in the formula

It’s up to you what your values for C and S are. The goal of this formula is not to give you some absolute scale. It’s much more to compare your different projects to have some numbers showing you how hard it was, as we know that our memory misleads us often.

]]>
https://blog.adamfurmanek.pl/2024/03/22/types-and-programming-languages-part-20/feed/ 0