Availability Anywhere Part 13 — Optimizing FileProxy

This is the thirteenth part of the Availability Anywhere series. For your convenience, you can find the other parts in the table of contents in Part 1 – Connecting to SSH tunnel automatically in Windows.

Last time we implemented TCP over File System (FileProxy). Today we’re going to check its performance and optimize it a bit.

Let’s start with benchmarks. We are going to use the following application:

This application has two modes. In the client mode it generates some traffic to the server. In the server mode it reads the incoming traffic and echoes it back.

The client sends packets of sizes between 1 byte and 100 000 bytes. It sends every packet one thousand times and then calculates how long it took on average to send one packet.
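
As a rough illustration of the benchmark described above, here is a minimal sketch in Python. It is not the application used in this post. The names `serve_echo`, `benchmark` and `recv_exact` are made up for this example, and it assumes that "sending one packet" means a full round trip: the client sends the payload and waits until the same number of bytes is echoed back.

```python
import socket
import threading
import time


def recv_exact(sock, count):
    """Read exactly `count` bytes or raise if the peer closes early."""
    buf = bytearray(count)
    view = memoryview(buf)
    received = 0
    while received < count:
        n = sock.recv_into(view[received:])
        if n == 0:
            raise ConnectionError("peer closed the connection")
        received += n
    return bytes(buf)


def serve_echo(listener):
    """Server mode: accept one client and echo everything back."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(65536)
            if not data:
                break
            conn.sendall(data)


def benchmark(host, port, sizes, repeats=1000):
    """Client mode: return {size: average round trip in milliseconds}."""
    results = {}
    with socket.create_connection((host, port)) as sock:
        # Assumption: disable Nagle's algorithm so small packets go out at once.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for size in sizes:
            payload = b"x" * size
            start = time.perf_counter()
            for _ in range(repeats):
                sock.sendall(payload)
                recv_exact(sock, size)
            elapsed = time.perf_counter() - start
            results[size] = elapsed * 1000 / repeats
    return results
```

To try it locally, create a listener with `socket.create_server(("127.0.0.1", 0))`, run `serve_echo(listener)` in a background thread, and call `benchmark` with the port taken from `listener.getsockname()[1]`. Pointing the client at the proxy's port instead of the server's port would measure the proxied path.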

We run this application twice: first with a direct connection between the client and the server, then with the TCP over File System proxy in between. Here are the results in milliseconds:

So we can see that the direct connection has an overhead of 40-700 microseconds. With FileProxy, it is between 3 and 5 milliseconds, so the total slowdown is up to 70 times. However, it gets even slower when run between the host and the guest VM:

This time it is up to 5000 times slower! The average overhead is around 80-100 milliseconds, which makes interactive applications noticeably slower.

Let’s see what we do wrong. First, we don’t handle access to the file well, so the two processes throw exceptions when the file is locked by the other side. Second, we keep trying to read the file in a loop instead of getting notified when the file changes. Third, we resize arrays instead of sending the buffer directly. Here is the updated code:

We can see that the application is now 3-4 times faster, which is a much better result.
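
To illustrate just the last of these fixes (passing the buffer directly instead of resizing arrays), here is a minimal sketch in Python. It is not the FileProxy code. The function name `pump` is hypothetical, and the sketch assumes the destination is a buffered file-like object whose `write` accepts any bytes-like object and writes all of it.

```python
import io
import socket


def pump(sock, sink, bufsize=64 * 1024):
    """Copy bytes from `sock` to `sink` until the peer closes; return the total."""
    buf = bytearray(bufsize)      # allocated once and reused for every read
    view = memoryview(buf)
    total = 0
    while True:
        n = sock.recv_into(view)  # fills the start of the buffer in place
        if n == 0:                # zero bytes means the peer closed the socket
            return total
        sink.write(view[:n])      # a slice of the view: no copy, no resize
        total += n
```

The point is that only the filled part of the buffer (`view[:n]`) is handed over, so no new array of the exact size has to be allocated for each chunk.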

What causes the slowdown between the host and the guest? The file system is not shared efficiently. We could try using a RAM disk based on ImDisk:

We can see that it’s actually slower. I don’t know if there is a way to make this much faster. Obviously, we could optimize the code a little bit more, but I think the drive is the biggest factor here.