This is the ninth part of the Chatterbox series. For your convenience you can find other parts in the table of contents in Part 1 – Origins.
Until now we focused on how to integrate text communication. We can share some media the same way, namely by providing links to images or videos. We can even embed them in a webview to simplify previewing. However, to integrate interactive calls we need to do a little more.
How do we call from Facebook to Skype? Or from WhatsApp to Google Hangouts? We could provide a server re-encoding all media streams, but that would probably require a lot of coding and protocol reverse engineering. Adding mobile calls (over a GSM provider, not over the Internet) would increase the complexity even more. However, we can actually do this much more easily, by capturing screen + voice + webcam.
First, we need to be able to “dial in” to the call from a mobile phone. To do that we can use an existing service, e.g. Chime or Zoom. So we create a bridge BridgeA, we dial in from our mobile phone and we’re done.
Next, we need to have some server which would route things between networks. We open two browsers: BrowserA which dials into BridgeA, and BrowserB which calls someone on Facebook, Hangouts, Meet, whatever else.
Now, we need to have two fake audio lines. On Windows you can use Virtual Audio Cable and VB-Cable, which are free and can do the trick.
We also need to be able to route an application to a selected audio line. Sometimes it can be selected in the application, sometimes it cannot. For the latter case you can use Audio Router.
Now, we configure BrowserA to emit output to Line1 and get input from Line2. Similarly, we configure BrowserB to emit output to Line2 and get input from Line1. Effectively, anything going out of BrowserA will go into BrowserB using Line1. In the same way, everything going out of BrowserB will enter BrowserA using Line2.
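The cross-wiring above is easy to get wrong, so it can help to write it down as data and check it before touching the audio settings. Below is a minimal Python sketch of that idea. It does not configure any audio device; the ROUTING table and the helper names listeners and has_feedback are made up for illustration, and only BrowserA, BrowserB, Line1 and Line2 come from the setup described above.

```python
# Hypothetical routing table: which virtual line each application
# emits to ("output") and records from ("input").
ROUTING = {
    "BrowserA": {"output": "Line1", "input": "Line2"},
    "BrowserB": {"output": "Line2", "input": "Line1"},
}


def listeners(routing, speaker):
    """Return the applications that record from the line `speaker` emits to."""
    out_line = routing[speaker]["output"]
    return sorted(
        app
        for app, cfg in routing.items()
        if app != speaker and cfg["input"] == out_line
    )


def has_feedback(routing):
    """Return True if any application records from the line it emits to."""
    return any(cfg["input"] == cfg["output"] for cfg in routing.values())


if __name__ == "__main__":
    for app in ROUTING:
        print(app, "is heard by", listeners(ROUTING, app))
    print("feedback loop:", has_feedback(ROUTING))
```

If has_feedback returns True, some application would hear its own output, which usually means an input and an output were pointed at the same line by mistake.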
That’s it. Now you can route voice between two bridges. Obviously, you can mix audio in any way to create multi-way bridges if needed. Nothing can stop you now.
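For multi-way bridges, one common way to mix is "mix-minus": every bridge receives the sum of all the other bridges, but not its own audio, so nobody hears an echo of themselves. The Python sketch below shows the arithmetic on plain lists of 16-bit samples. It is an illustration under assumptions, not what the virtual cable tools do internally; the function names and the sample format are my own choices.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def clip(sample):
    """Clamp a mixed sample to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, sample))


def mix_minus(frames):
    """Give every source the sum of all other sources.

    `frames` maps a source name to a list of int16 samples; all lists
    must have the same length. Returns a dict with the same keys.
    """
    if not frames:
        return {}
    lengths = {len(samples) for samples in frames.values()}
    if len(lengths) != 1:
        raise ValueError("all sources must provide the same number of samples")
    length = lengths.pop()
    # Sum everything once, then subtract each source's own contribution.
    total = [sum(samples[i] for samples in frames.values()) for i in range(length)]
    return {
        name: [clip(total[i] - samples[i]) for i in range(length)]
        for name, samples in frames.items()
    }


if __name__ == "__main__":
    mixed = mix_minus({"BridgeA": [100, 200], "BridgeB": [10, 20], "BridgeC": [1, 2]})
    for name, samples in mixed.items():
        print(name, samples)
```

Clipping is the simplest way to keep the sum in range; a real mixer would more likely scale the inputs down to avoid distortion.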
Okay, what about video? For that we can use BridgeA the same way, just dial in using a computer or mobile phone with video. Next, you need to configure a virtual webcam source. You can use a Chrome extension or OBS Virtual Cam. Just capture the screen from BrowserA into the virtual webcam used by BrowserB, and the other way round. Depending on the capture quality and configuration you can mimic multiple screens etc.
What about the delay? In my experience it is close to 1 second. This is mediocre, but I believe people are used to latency by now so it should be fine; at least it’s okay for my purposes.