Chatterbox Part 1 — Origins

This is the first part of the Chatterbox series. For your convenience you can find other parts using the links below (or by guessing the address):
Part 1 — Origins
Part 2 — Desktop interfaces
Part 3 — Security and mobile devices
Part 4 — Other channels
Part 5 — Self healing
Part 6 — Protocols
Part 7 — File writing
Part 8 — Integrations
Part 9 — Calling Facebook using GSM
Part 10 — Poor man’s voice-based paging system
Part 11 — Scraping memory dump in Chrome with Chrome Debugging Protocol
Part 12 — Scraping page’s model with JavaScript or extensions
Part 13 — Capturing model with Fiddler
Part 14 — SMS application for android the hacky way
Part 15 — Make Messenger call you

One of the nicest aspects of IT is that one problem can be solved in multiple ways. Some solutions are clearly “wrong”, some are clever and tricky, and some are the de facto or de jure standards. This generalizes to patterns, designs, and ultimately to whole systems. On the other hand, multiple standards lead to incompatibilities and issues, just like in this famous XKCD.

This is a big issue in the instant messaging world. There are many solutions on the market and they are very rarely compatible. There were multiple “standards” which seemed like they could solve the issue once and for all (XMPP for instance) but they failed miserably (abandoned by Facebook or Google) and now it looks like things are siloing again. Maybe the trend will reverse in a couple of years, but currently there are more and more communication platforms, with both text and voice solutions, which are independent by design and fight for market share.

I’m using dozens of protocols, literally. These include more popular ones like Facebook or Hangouts, some typical “group chats” like Slack or IRC, and some local solutions as well, like Gadu Gadu in Poland. In some networks I have a couple of accounts (for instance, for the multiple SIM cards I have). Also, I’m using multiple devices (phones, tablets, desktops) and generally don’t like typing on mobile ones. Configuring all these devices is super painful, especially since they all run different operating systems.

I was using IM+ for some time and it was cool. It supported many of my networks (back in 2012 I didn’t have so many of them) and it had one killer feature: if I didn’t get a message (because I was out), it forwarded the message to my email and I could reply using a regular email client. That was really convenient, as I didn’t need to configure separate apps for each network. I just had to configure an email client to stay in touch.

Unfortunately, a couple of years back the email feature stopped working. After using it for a couple of years I had realized how useful it was, so I decided to reimplement it on my own in a very narrow scope. How hard could it be after all?

This is how Chatterbox started.

So there were some “plans” around it:

  • Keep it “simple” – I didn’t want to host complex “infrastructure” (like databases, queues etc)
  • Just support one protocol — XMPP — and use other networks via gateways
  • Send email for each message and monitor some inbox so I can respond
  • It’s just a “secondary” application, not something replacing all my communicators (especially Miranda NG, which I was using at that time)
  • Don’t spend much time on implementation: just get it up and running in hours and probably never develop it further
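
To make the third point of the plan a bit more concrete, here is a minimal sketch of the email round trip: one email per incoming chat message, and a reply email mapped back to the right conversation. This is not the actual Chatterbox code. The function names, the subject format, and the addresses are all made up for illustration, and the sketch assumes plain-text, single-part replies.

```python
import re
from email.message import EmailMessage

# Hypothetical subject convention: "[network] contact", e.g. "[xmpp] alice@example.org".
SUBJECT_RE = re.compile(r"\[(?P<network>[^\]]+)\]\s+(?P<contact>\S+)")


def chat_to_email(network, contact, text, owner, bridge):
    """Wrap one incoming chat message in an email addressed to the owner."""
    msg = EmailMessage()
    msg["From"] = bridge      # mailbox monitored by the bridge (assumed)
    msg["To"] = owner
    msg["Subject"] = f"[{network}] {contact}"
    msg.set_content(text)
    return msg


def email_to_chat(reply):
    """Turn a reply email back into (network, contact, text).

    Assumes a plain-text, single-part reply whose subject still carries
    the "[network] contact" tag (for example "Re: [xmpp] alice@example.org").
    """
    match = SUBJECT_RE.search(reply["Subject"] or "")
    if match is None:
        raise ValueError("reply does not reference a known conversation")
    lines = []
    for line in reply.get_content().splitlines():
        if line.startswith(">"):   # stop at the quoted original message
            break
        lines.append(line)
    return match["network"], match["contact"], "\n".join(lines).strip()
```

Sending the first message over SMTP and polling the inbox for the second are left out on purpose. Keeping the conversation key in the subject is just one simple option that needs no database, which fits the “keep it simple” goal.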

This didn’t seem to be a hard thing to do, and it took me about two evenings to implement all of that. Now, over 3 years later, Chatterbox is way bigger than it was supposed to be:

  • It is my main IM and “web monitoring” tool (for forums, books, promotions etc)
  • Supports web, desktop, and mobile UI
  • Can mail me, text me, call me, read messages out loud, transform speech to text
  • I can mail it, text it, use mobile, desktop, or web interfaces to send messages
  • I can use it on an airplane with messaging-only Wi-Fi (and still can contact any network, not just Facebook or WhatsApp)
  • It can share media between networks so I can “just send” an image to any network, no matter if it supports attachments or not (like IRC)
  • Recognizes whether I’m around or unavailable and sends notifications accordingly
  • Can schedule messages for later so I can compose now and have them delivered at a specific time
  • Supports recalling messages so I can stop them from getting delivered after hitting enter
  • Notifies about dead letters and failures so I’m paged when there is an issue
  • Encrypts messages so it’s safe in transit over public channels
  • Supports 30+ networks (yes, I’m using that many!)
  • Runs on a single box with relatively low resource usage but can also scale on other machines if needed
  • Heals itself and can survive pretty significant outages (as long as the machine doesn’t die, obviously)
  • Is free and doesn’t use paid components
  • And most important: it has worked for years now and I know I can trust it
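
Scheduling and recalling messages from the list above may sound fancy, but the core idea can be tiny. Below is a rough in-memory sketch, not the real implementation: the `Outbox` class and its method names are invented here, timestamps are plain numbers, and nothing is persisted.

```python
import heapq
import itertools


class Outbox:
    """Hypothetical in-memory outbox with delayed delivery and recall."""

    def __init__(self):
        self._heap = []                  # entries: (deliver_at, message_id)
        self._pending = {}               # message_id -> text, until sent or recalled
        self._ids = itertools.count(1)

    def schedule(self, text, deliver_at):
        """Queue a message for delivery at the given time; return its id."""
        message_id = next(self._ids)
        self._pending[message_id] = text
        heapq.heappush(self._heap, (deliver_at, message_id))
        return message_id

    def recall(self, message_id):
        """Cancel a message; True only if it had not been sent or recalled yet."""
        return self._pending.pop(message_id, None) is not None

    def due(self, now):
        """Pop and return the texts whose delivery time has passed."""
        ready = []
        while self._heap and self._heap[0][0] <= now:
            _, message_id = heapq.heappop(self._heap)
            text = self._pending.pop(message_id, None)
            if text is not None:         # recalled messages are skipped
                ready.append(text)
        return ready
```

With this shape, “recall after hitting enter” can be treated as scheduling every message a few seconds into the future and calling `recall` if I change my mind in the meantime.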

It has been a long and beautiful journey which let me learn a lot about operating systems, distributed applications, and “enterprise” approaches. I forked plenty of libraries, fixed bugs in components I never wanted to touch, reimplemented OS primitives, and learned multiple tricks.

Over the next parts I’ll describe how I implemented a couple of things which may be super simple once you know them but surprising if you have never tried doing them. It won’t be deeply technical, more a bunch of notes showing how things can go wrong and what mistakes I made on the way.