Communications metadata is known to be exploited by various adversaries to undermine the security of systems, to track victims and to conduct large scale social network analysis to feed mass surveillance. Metadata resistant tools are in their infancy and research into the construction and user experience of such tools is lacking.
Cwtch was originally conceived as an extension of the metadata resistant protocol Ricochet to support asynchronous, multi-peer group communications through the use of discardable, untrusted, anonymous infrastructure.
Since then, Cwtch has evolved into a protocol in its own right, this section will outline the various known risks that Cwtch attempts to mitigate and will nbe heavily referenced throughout the rest of the document when discussing the various sub-components of the Cwtch Architecture.
It is important to identify and understand that metadata is ubiquitous in communication protocols, it is indeed necessary for such protocols to function efficiently and at scale. However, information that is useful to facilitating peers and servers is also highly relevant to adversaries wishing to exploit such information.
For our problem definition, we will assume that the content of a communication is encrypted in such a way that an adversary is practically unable to break (see tapir and cwtch for details on the encryption that we use, a and as such we will focus to the context to the communication metadata.
We seek to protect the following communication contexts:
- Who is involved in a communication? It may be possible to identify people or simply device or network identifiers. E.g., “this communication involves Alice, a journalist, and Bob a government employee.”.
- Where are the participants of the conversation? E.g., “during this communication Alice was in France and Bob was in Canada.”
- When did a conversation take place? The timing and length of communication can reveal a large amount about the nature of a call, e.g., “Bob a government employee, talked to Alice on the phone for an hour yesterday evening. This is the first time they have communicated.” *How was the conversation mediated? Whether a conversation took place over an encrypted or unencrypted email can provide useful intelligence. E.g., “Alice sent an encrypted email to Bob yesterday, whereas they usually only send plaintext emails to each other.”
- What is the conversation about? Even if the content of the communication is encrypted it is sometimes possible to derive a probable context of a conversation without knowing exactly what is said, e.g. “a person called a pizza store at dinner time” or “someone called a known suicide hotline number at 3am.”
Beyond individual conversations, we also seek to defend against context correlation attacks, whereby multiple conversations are analyzed to derive higher level information:
- Relationships: Discovering social relationships between a pair of entities by analyzing the frequency and length of their communications over a period of time. E.g. Carol and Eve call each other every single day for multiple hours at a time.
- Cliques: Discovering social relationships between a group of entities that all interact with each other. E.g. Alice, Bob and Eve all communicate with each other.
- Loosely Connected Cliques and Bridge Individuals: Discovering groups that communicate to each other through intermediaries by analyzing communication chains (e.g. everytime Alice talks to Bob she talks to Carol almost immediately after; Bob and Carol never communicate.)
- Pattern of Life: Discovering which communications are cyclical and predictable. E.g. Alice calls Eve every Monday evening for around an hour.
Cwtch provides no global display name registry, and as such people using Cwtch are more vulnerable to attacks based around misrepresentation i.e. people pretending to be other people:
A basic flow of one of these attacks is as follows, although other flows also exist:
- Alice has a friend named Bob and another called Eve
- Eve finds out Alice has a friend named Bob
- Eve creates thousands of new accounts to find one that has a similar picture / public key to Bob (won't be identical but might fool someone for a few minutes)
- Eve calls this new account "Eve New Account" and adds Alice as a friend.
- Eve then changes her name on "Eve New Account" to "Bob"
- Alice sends messages intended for "Bob" to Eve's fake Bob account
Because misrepresentation attacks are inherently about trust and verification the only absolute way of preventing them is for users to absolutely validate the public key. This is obviously not-ideal and in many cases simply won't-happen.
As such we aim to provide some user-experience hints in the ui to guide people in making choices around whether to trust accounts and/or to distinguish accounts that may be attempting to represent themselves as other users.
Cwtch does not consider attacks that require physical access (or equivalent) to the users machine as practically defendable. However, in the interests of good security engineering, throughout this document we will still refer to attacks or conditions that require such privilege and point out where any mitigations we have put in place will fail.