Message Overlays

Adapted from: Discreet Log #8: Notes on the Cwtch Chat API

We envision Cwtch as a platform for providing an authenticated transport layer to higher-level applications. Developers are free to make their own choices about what application layer protocols to use, whether they want bespoke binary message formats or just want to throw an HTTP library on top and call it a day. Cwtch can generate new keypairs for you (which become onion addresses; no need for any DNS registrations!) and you can REST assured that any data your application receives from the (anonymous communication) network has been authenticated already.

For our current stack, messages are wrapped in a minimal JSON frame that adds some contextual information about the message type. And because serialised JSON objects are just dictionaries, we can easily add more metadata later on as needed.

Chat overlays, lists, and bulletins

The original Cwtch alpha demoed "overlays": different ways of interpreting the same data channel, depending on the structure of the atomic data itself. W

e included simple checklists and BBS/classified ads as overlays that could be viewed and shared with any Cwtch contact, be it a single peer or a group. The wire format looked like this:

{o:1,d:"hey there!"}
{o:3,d:"garage sale",p:"[parent message signature]"}

Overlay field o determined if it was a chat (1), list (2), or bulletin (3) message. The data field d is overloaded, and lists/bulletins need additional information about what group/post they belong to. (We use message signatures in place of IDs to avoid things like message ordering problems and maliciously crafted IDs. This is also how the Cwtch protocol communicates to the front end which message is being acked.)

Data structure

Implementing tree-structured data on top of a sequential message store comes with obvious performance disadvantages. For example, consider the message view, which loads most-recent-messages first and only goes back far enough to fetch enough messages to fill the current viewport, in comparison with a (somewhat pathological) forum where almost every message is a child of the very first message in the history, which could have been gigs and gigs of data-ago. If the UI only displays top-level posts until the user expands them, we have to parse the entire history before we get enough info to display anything at all.

Another problem is that multiplexing all these overlays into one data store creates "holes" in the data that confuse lazy-loaded listviews and scrollbars. The message count may indicate there is a ton more information to display if the user simply scrolls, but when it actually gets fetched and parsed we might realize that none of it is relevant to the current overlay.

None of these problems are insurmountable, but they demonstrate a flaw in our initial assumptions about the nature of collaborative message flows and how we should be handling that data.

Overlay Types

As stated above, overlays are specified in a very simple JSON format with the following structure:

type ChatMessage struct {
    O int    `json:"o"`
    D string `json:"d"`

Where O stands for Overlay with the current supported overlays documented below:

1: data is a chat string
2: data is a list state/delta
3: data is a bulletin state/delta
100: contact suggestion; data is a peer onion address
101: contact suggestion; data is a group invite string

Chat Messages (Overlay 1)

The most simple over is a chat message which simply contains raw, unprocessed chat message information.

Invitations (Overlays 100 and 101)

Instead of receiving the invite as an incoming contact request at the profile level, new inline invites are shared with a particular contact/group, where they can be viewed and/or accepted later, even if they were initially rejected (potentially by accident).

The wire format for these are equally simple:


This represents a departure from our original "overlays" thinking to a more action-oriented representation. The chat "overlay" can communicate that someone did something, even if it's paraphrased down to "added an item to a list," and the lists and bulletins and other beautifully chaotic data can have their state precomputed and stored separately.