Overview

Welcome to the Cwtch Secure Development Handbook. The purpose of this handbook is to provide a guide to the various components of the Cwtch ecosystem, to document the known risks and mitigations, and to enable discussion about improvements and updates to Cwtch secure development processes.

History

In recent years, public awareness of the need and benefits of end-to-end encrypted solutions has increased with applications like Signal, Whatsapp and Wire now providing users with secure communications.

However, these tools require various levels of metadata exposure to function, and much of this metadata can be used to gain details about how and why a person is using a tool to communicate. [rottermanner2015privacy].

One tool that does seek to reduce metadata is Ricochet first released in 2014. Ricochet uses Tor onion services to provide secure end-to-end encrypted communication, and to protect the metadata of communications.

There are no centralized servers that assist in routing Ricochet conversations. No one other than the parties involved in a conversation can know that such a conversation is taking place.

Ricochet isn't without limitations; there is no multi-device support, nor is there a mechanism for supporting group communication or for a user to send messages while a contact is offline.

This makes adoption of Ricochet a difficult proposition; with even those in environments that would be served best by metadata resistance unaware that it exists [ermoshina2017can] [renaud2014doesn].

Risk Model

Communications metadata is known to be exploited by various adversaries to undermine the security of systems, to track victims and to conduct large scale social network analysis to feed mass surveillance. Metadata resistant tools are in their infancy and research into the construction and user experience of such tools is lacking.

Cwtch was originally conceived as an extension of the metadata resistant protocol Ricochet to support asynchronous, multi-peer group communications through the use of discardable, untrusted, anonymous infrastructure.

Since then, Cwtch has evolved into a protocol in its own right, this section will outline the various known risks that Cwtch attempts to mitigate and will nbe heavily referenced throughout the rest of the document when discussing the various sub-components of the Cwtch Architecture.

Threat Model

It is important to identify and understand that metadata is ubiquitous in communication protocols, it is indeed necessary for such protocols to function efficiently and at scale. However, information that is useful to facilitating peers and servers is also highly relevant to adversaries wishing to exploit such information.

For our problem definition, we will assume that the content of a communication is encrypted in such a way that an adversary is practically unable to break (see tapir and cwtch for details on the encryption that we use, a and as such we will focus to the context to the communication metadata.

We seek to protect the following communication contexts:

  • Who is involved in a communication? It may be possible to identify people or simply device or network identifiers. E.g., “this communication involves Alice, a journalist, and Bob a government employee.”.
  • Where are the participants of the conversation? E.g., “during this communication Alice was in France and Bob was in Canada.”
  • When did a conversation take place? The timing and length of communication can reveal a large amount about the nature of a call, e.g., “Bob a government employee, talked to Alice on the phone for an hour yesterday evening. This is the first time they have communicated.” *How was the conversation mediated? Whether a conversation took place over an encrypted or unencrypted email can provide useful intelligence. E.g., “Alice sent an encrypted email to Bob yesterday, whereas they usually only send plaintext emails to each other.”
  • What is the conversation about? Even if the content of the communication is encrypted it is sometimes possible to derive a probable context of a conversation without knowing exactly what is said, e.g. “a person called a pizza store at dinner time” or “someone called a known suicide hotline number at 3am.”

Beyond individual conversations, we also seek to defend against context correlation attacks, whereby multiple conversations are analyzed to derive higher level information:

  • Relationships: Discovering social relationships between a pair of entities by analyzing the frequency and length of their communications over a period of time. E.g. Carol and Eve call each other every single day for multiple hours at a time.
  • Cliques: Discovering social relationships between a group of entities that all interact with each other. E.g. Alice, Bob and Eve all communicate with each other.
  • Loosely Connected Cliques and Bridge Individuals: Discovering groups that communicate to each other through intermediaries by analyzing communication chains (e.g. everytime Alice talks to Bob she talks to Carol almost immediately after; Bob and Carol never communicate.)
  • Pattern of Life: Discovering which communications are cyclical and predictable. E.g. Alice calls Eve every Monday evening for around an hour.

Active Attacks

Misrepresentation Attacks

Cwtch provides no global display name registry, and as such people using Cwtch are more vulnerable to attacks based around misrepresentation i.e. people pretending to be other people:

A basic flow of one of these attacks is as follows, although other flows also exist:

  • Alice has a friend named Bob and another called Eve
  • Eve finds out Alice has a friend named Bob
  • Eve creates thousands of new accounts to find one that has a similar picture / public key to Bob (won't be identical but might fool someone for a few minutes)
  • Eve calls this new account "Eve New Account" and adds Alice as a friend.
  • Eve then changes her name on "Eve New Account" to "Bob"
  • Alice sends messages intended for "Bob" to Eve's fake Bob account

Because misrepresentation attacks are inherently about trust and verification the only absolute way of preventing them is for users to absolutely validate the public key. This is obviously not-ideal and in many cases simply won't-happen.

As such we aim to provide some user-experience hints in the ui to guide people in making choices around whether to trust accounts and/or to distinguish accounts that may be attempting to represent themselves as other users.

A note on Physical Attacks

Cwtch does not consider attacks that require physical access (or equivalent) to the users machine as practically defendable. However, in the interests of good security engineering, throughout this document we will still refer to attacks or conditions that require such privilege and point out where any mitigations we have put in place will fail.

Open Research Questions

The only way to build secure software is to attack it from every angle. There are a number of limitations and problems with Cwtch, both as designed and as implemented in our prototype. We are working on many of these, but would like to invite researchers to work with us on new solutions, as well as find new problems.

Here are the problems we know about:

  • The User Experience of Metadata Resistance Tools: Environments that offer metadata resistance are plagued with issues that impact usability, e.g. higher latencies than seen with centralized, metadata-driven systems, or dropped connections resulting from unstable anonymization networks. Additional research is needed to understand how users experience these kinds of failures, and how apps should handle and/or communicate them to users.

  • Scalability: Heavily utilized Cwtch servers increase message latency, and the resources a client requires to process messages. While Cwtch servers are designed to be cheap and easy to set up, and Cwtch peers are encouraged to move around, there is a clear balance to be found between increasing the anonymity set of a given Cwtch server (to prevent targeted disruptions) and the decentralization of Cwtch groups.

  • The (Online) First Contact Problem: Cwtch requires that any two peers are online at the same time before a key exchange/group setup is possible. One potential way to overcome this is through encoding an additional public key and a Cwtch server address into a Cwtch peer identifier. This would allow peers to send encrypted messages to an offline Cwtch peer via a known server, with the same guarantees as a Cwtch group message. This approach is not without issues, as by encoding metadata into the Cwtch identifier we allow adversaries to mount partially targeted attacks (in particular denial-of-service attacks against the Cwtch server with the aim of disrupting new connections). However, the benefit of first contact without an online key exchange is likely worth the potential DoS risk in many threat models.

  • Reliability: In Cwtch, servers have full control over the number of messages they store and for how long. This has an unfortunate impact on the reliability of group messages: if groups choose an unreliable server, they might find their messages have been dropped. While we provide a mechanism for detecting dropped/missing messages, we do not currently provide a way to recover from such failures. There are many possible strategies from asking peers to resend messages to moving to a different server, each one with benefits and drawbacks. A full evaluation of these approaches should be conducted to derive a practical solution.

  • Discoverability of Servers: Much of the strength of Cwtch rests on the assumption that peers and groups can change groups at any time, and that servers are untrusted and discardable. However, in this paper we have not introduced any mechanism for finding new servers to use to host groups. We believe that such an advertising mechanism could be built ver Cwtch itself.

Connectivity

Cwtch makes use of Tor Onion Services (v3) for all inter-node communication.

We provide the openprivacy/connectivity package for managing the Tor daemon and setting up and tearing down onion services through Tor.

Known Risks

Private Key Exposure to the Tor Process

Status: Partially Mitigated (Requires Physical Access or Privilege Escalation to exploit)

We must pass the private key of any onion service we wish to set up to the connectivity library, through the Listen interface (and thus to the Tor process). This is one of the most critical areas that is outside of our control. Any binding to a rouge tor process or binary will result in compromise of the Onion private key.

Mitigations

Connectivity attempt to bind to the system-provided Tor process as the default, only when it has been provided with an authentication token.

Otherwise connectivity always attempts to deploy its own Tor process using a known good binary packaged with the system (outside of the scope of the connectivity package)

In the long term we hope an integrated library will become available and allow direct management through an in-process interface to prevent the private key from leaving the process boundary (or other alternative paths that allow us to maintain full control over the private key in-memory.)

Tor Process Management

Status: Partially Mitigated (Requires Physical Access or Privilege Escalation to exploit)

Many issues can arise from the management of a separate process, including the need to restart, exit and otherwise ensure appropriate management.

The ACN interface provides Restart, Close and GetBootstrapStatus interfaces to allow applications to manage the underlying Tor process. In addition the SetStatusCallback method can be used to allow an application to be notified when the status of the Tor process changes.

However, if sufficiently-privileged users wish they can interfere with this mechanism, and as such the Tor process is a more brittle component interaction than others.

Testing Status

Current connectivity has limited unit testing capabilities and none of these are run during pull requests or merges. There is no integration testing.

It is worth noting that connectivity is used by both Tapir and Cwtch in their integration tests (and so despite the lack of package level testing, it is exposed to system-wide test conditions)

Tapir

Designed to replace the old protobuf-based ricochet channels, Tapir provides a framework for building anonymous applications.

It is divided into a number of layers:

  • Identity - An ed25519 keypair, required for established a Tor v3 onion service and used to maintain a consistent cryptographic identity for a peer.
  • Connections - The raw networking protocol that connects two peers. Connections are so far only defined over Tor v3 Onion Services (see: connectivity)
  • Applications - The various logic that enables a particular information flow over a connection. Examples include shared cryptographic transcripts , authentication, spam guards and token based services. Applications provide Capabilities which can be referenced by other applications to determine if a given peer has the ability to use a given hosted application.
  • Application Stacks - A mechanism for connecting more than one application together, e.g. authentication depends on a shared cryptographic transcript , and the main cwtch peer app is based on the authentication application.

Primitives

Identity

An ed25519 keypair, required for established a Tor v3 onion service and used to maintain a consistent cryptographic identity for a peer.

  • InitializeIdentity - from a known, persistent keypair: \(i,I\)
  • InitializeEphemeralIdentity - from a random keypair: \(i_e, I_e\)

Applications

Transcript App

Dependencies: None

Initializes a Merlin-based cryptographic transcript that can be used as the basis of higher level commitment-based protocols

Transcript app will panic if an app ever tries to overwrite an existing transcript with a new one (enforcing the rule that a session is based on one, and only one, transcript.)

Authentication App

  • Dependencies: Transcript App
  • Capabilities Granted: AuthenticationCapability
  • Capabilities Required: None

Engages in an ephemeral triple-diffie-hellman handshake to derive a unique, authenticated session key.

transcript.Commit()

The merlin transcript derived challenge is based on all the messages sent in the auth flow (and any that were sent prior to the Auth App)

// Derive a challenge from the transcript of the public parameters of this authentication protocol
transcript := ea.Transcript()
transcript.NewProtocol("auth-app")
transcript.AddToTranscript("outbound-hostname", []byte(outboundHostname))
transcript.AddToTranscript("inbound-hostname", []byte(inboundHostname))
transcript.AddToTranscript("outbound-challenge", outboundAuthMessage)
transcript.AddToTranscript("inbound-challenge", inboundAuthMessage)
challengeBytes := transcript.CommitToTranscript("3dh-auth-challenge")

Asymmetry

The client connection is guaranteed to possess the long term identity of the server connection through the properties of the underlying tor v3 onion connection.

As such if the server attempts to send a different long term identity to the client we can detect it and terminate the authentication protocol early.

Token App

Dependencies: Transcript App

  • Capabilities Granted: HasTokensCapability
  • Capabilities Required: None (implicitly guarded)

Allows the client to obtain signed, blinded tokens for use in another application.

While this application has no explicit requirement for any given capability, we expect it to be protected via a preceeding app in an ApplicationChain e.g.

powTokenApp := new(applications.ApplicationChain).
        ChainApplication(new(applications.ProofOfWorkApplication), applications.SuccessfulProofOfWorkCapability).
        ChainApplication(tokenApplication, applications.HasTokensCapability)

Notes

  • No direct testing (tested via integration tests and unit tests)

Ephemeral Connections

Occasionally it is desirable to have a peer conenct to another / a service without using their long term identity (e.g. in the case of connecting to a Cwtch Server).

In this case we want to enable a convenient way to allow connecting with an ephemeral identity.

It turns out that doing this securely requires maintaining a completely separate set of connections and applications in order to avoid side channel around avoid duplicate connections (i.e. if we did mix them up then a service might be able to exploit the fact that clients avid duplicate connections by attempting to connect to known-online peers and observing if they reject the connection because they already have an outbound ephemeral connection open.)

Because of this, we don't provide an explicit Ephemeral Connect api and instead recommend that peers maintain one long term service and multiple ephemeral services.

Known Risks

Impersonation of Peers

Status: Mitigated

By default, tor v3 onion services only provide one-way authentication, that is the client can verify a metadata resistant connection to the server by the server obtained no information about the client.

Tapir provides a peer-to-peer interface over this client-server structure through the Authentication application.

The Authentication application implements a 3-way, ephemeral diffie-hellman handshake to generate a shared session key. Once generated this session key is used to encrypt all traffic between the two peers for the duration of the session.

The session key is used to encrypt a challenge derived from the shared cryptographic transcript (based on merlin)

Only if all the above checks pass is the connection maintained open - otherwise the peer that detects a failure closes the connection.

Double Connections

Status: Mitigated

Because of the one-way authentication provided by Tor onion services there is a window between connection instantiation and the finalization of authentication when two valid connections can occur between the same two peers.

While these vestigial connections are not harmful, they do have the potential to confuse users and interfaces. To avoid ambiguity Tapir attempt to detect and close duplicate connections through a number of rules:

  1. If a connection open is attempted to a hostname that already has an open connection the connection attempt is aborted.
  2. After authentication the lookup happens again, and if another connection is found the newest connection is closed.

There is a small chance both peers will close their initiated connections if they also happen to start the connection attempt at exactly the sametime . This should be exceedingly rare in practice, and is further mitigated by an exponential backoff of connection retries by the ui

Finally, the Tapir interfaces WaitForCapabilityOrClose and GetConnection are aware of the potential for duplicate connections and have logic that allows the handling of such instances (such as returning an error when they are found allowing a handling application to retry the request if a connection with a given capability isn't returned)

Testing Status

Tapir features a number of well-defined integration tests which exercise not only the ideal case of two well-formed peers authenticating and messaging each other, but also a malicious peer attempting to bypass authentication.

In addition, unit tests are defined for a number of the specified applications (including Authentication) and many of the cryptographic primitives.

All tests are also run with the -race flag which will cause them to fail if race conditions are detected. Both integration tests and unit tests are run automatically for every pull request and main branch merge.

Packet Format

All tapir packets are fixed length (8192 bytes) with the first 2 bytes indicated the actual length of the message, len bytes of data, and the rest zero padded:

| len (2 bytes) | data (len bytes) | paddding (8190-len bytes)|

Once encrypted, the entire 8192 byte data packet is encrypted using libsodium secretbox using the standard structure ( note in this case the actual usable size of the data packet is 8190-14 to accommodate the nonce included by secret box)

For information on how th secret key is derived see the authentication protocol

Authentication Protocol

Each peer, given an open connection \(C\):

\[ \ I = \mathrm{InitializeIdentity()} \\ I_e = \mathrm{InitializeEphemeralIdentity()} \\ \\ I,I_e \rightarrow C \\ P,P_e \leftarrow C \\ \\ k = \mathrm{KDF}({P_e}^{i} + {P}^{i_e} + {P_e}^{i_e}) \\ c = \mathrm{E}(k, transcript.Commit()) \\ c \rightarrow C \\ c_p \leftarrow C \\ \mathrm{D}(k, c_p) \stackrel{?}{=} transcript.LatestCommit() \\ \]

The above represents a sketch protocol, in reality there are a few implementation details worth pointing out:

Once derived from the key derivation function \(\mathrm{KDF}\) the key \(k\) is set on the connection, meaning the authentication app doesn't do the encryption or decryption explicitly.

Also the concatenation of parts of the 3DH exchange is strictly ordered:

  • DH of the Long term identity of the outbound connection by the ephemeral key of the inbound connection.
  • DH of the Long term identity of the inbound connection by the ephemeral key of the outbound connection.
  • DH of the two ephemeral identities of the inbound and outbound connections.

This strict ordering ensures both sides of the connection derive the same session key.

Cryptographic Properties

During an online-session, all messages encrypted with the session key can be authenticated by the peers as having come from their peer (or at least, someone with possession of their peers secret key as it related to their onion address).

Once the session has ended, a transcript containing the long term and ephemeral public keys, a derived session key and all encrypted messages in the session cannot be proven to be authentic i.e. this protocol provides message & participant repudiation (offline deniable) in addition to message unlinkability (offline deniable) in the case where someone is satisfied that a single message in the transcript must have originated from a peer, there is no way of linking any other message to the session.

Intuition for the above: the only cryptographic material related to the transcript is the derived session key - if the session key is made public it can be used to forge new messages in the transcript - and as such, any standalone transcript is subject to forgery and thus cannot be used to cryptographically tie a peer to a conversation.

Cwtch Library

Known Risks

Thread Safety

Status: Partially Mitigated (Work in Progress)

The Cwtch library evolved from a prototype that had weak checks around concurrency and the addition of singleton behavior around saving profiles to files and protocol engines resulted in race conditions.

The inclusion of the Event Bus made handling such cases easier, and the majority of the code is now tested via unit tests and integration test running the -race flag. The last portion of the code that requires work in this regard are around the AppBridge and Server which are used by the UI to maintain separation.

Private information transiting the IPC boundary

Status: Unmitigated (Requires privileged user to exploit)

Information used to derive the encryption key used to save all sensitive data to the file system cross the boundary between the UI front-end and the App backend.

Intercepting this information requires a privileged position on the local machine. There are currently no plans to mitigate this issue.

Testing Status

Cwtch features one well-defined integration test which exercise the ideal case of three well-formed peers authenticating and messaging each other through an untrusted server.

In addition, unit tests are defined for a number of Cwtch modules, however many of them have become outdated with the introduction of Tapir.

Most tests are run with the -race flag which will cause them to fail if race conditions are detected.

Both integration tests and unit tests are run automatically for every pull request and main branch merge.

Resolved or Outdated Risks

Dependency on Outdated Protobuf Implementation

Status: Mitigated

The group features of Cwtch are enabled by an untrusted infrastructure protcol that was originally implemented using the older ricochet-based channels. The go code that was generated from these channels no longer works given the newest version of the protobufs framework.

We have removed protobufs entirely from the project by porting this functionality over the Tapir.

PoW Spam Prevention as a Metadata Vector

Status: Outdated: Cwtch now uses Token Based Services to separate challenges like PoW from resolving the tokens.

Processing capabilities are not constant, and so a malicious server could perform some correlations/fiddle with difficulty per connection in an attempt to identify connections over time.

Needs some statistical experimentation to quantify, but given the existing research detecting timeskews over Tor I wouldn't be surprised if this could be derived.

As for mitigation: Adding a random time skew might be an option,some defense against the server adjusting difficulty too often would also mitigate some of the more extreme vectors.

Additionally, Token Based Services and Peer-based Groups are both potential options for eliminating this attack vector entirely.

Groups

For the most part the Cwtch risk model for groups is split into two distinct profiles:

  • Groups made up of mutually trusted participants where peers are assumed honest.
  • Groups consisting of strangers where peers are assumed to be potentially malicious.

Most of the mitigations described in this section relate to the latter case, but naturally also impact the former. Even if assumed honest peers later turn malicious there are mechanisms that can detect such malice and prevent it from happening in the future.

Risk Overview: Key Derivation

In the ideal case we would use a protocol like OTR, the limitations preventing us from doing so right now are:

  • Offline messages are not guaranteed to reach all peers, and as such any metadata relating to key material might get lost. We need a key derivation process which is robust to missing messages or incomplete broadcast.

Risk: Malicious Peer Leaks Group Key and/or Conversation

Status: Partially Mitigated (but impossible to mitigate fully)

Whether dealing with trusted smaller groups or partially-public larger groups there is always the possibility that a malicious actor will leak group messages.

We plan to make it easy for peers to fork groups to mitigate the same key being used to encrypt lots of sensitive information and provide some level of forward secrecy for past group conversations.

Risk: Active Attacks by Group Members

Status: Partially Mitigated

Group members, who have access to the key material of the group, can conspire with a server or other group members to break transcript consistency.

While we cannot directly prevent censorship given this kind of active collusion, we have a number of mechanisms in place that should reveal the presence of censorship to honest members of the group.

Mitigations:

  • Because each message is signed by the peers public key, it should not be possible (within the cryptographic assumptions of the underlying cryptography) for one group member to imitate another.
  • Each message contains a unique identifier derived from the contents and the previous message hash - making it impossible for collaborators to include messages from non-colluding members without revealing an implicit message chain (which if they were attempting to censor other messages would reveal such censorship)

Finally: We are actively working on adding non-repudiation to Cwtch servers such that they themselves are restricted in what they can censor efficiently.

Cwtch UI

The UI is built on therecipe/qt which links in Qt libraries.

Known Risks

Deanonymization through Content Injection

Status: Mitigated in several places

Like most UI frameworks, QML provides a HTML rendering engine with the potential to make requests through remote resource loading. Any kind of malicious content injection is therefore elevated to a critical deanonymization risk.

To mitigate such a risk we do the following:

  • Maintain our own UI library that explicitly relies on PlainText fields to handle all content (and thus styled safely)
  • Mediate all Cwtch api networking calls through Tor
  • Force QML to use a deliberately broken network resolver that is incapable of resolving remote content
  • Frequently test the UI for potential content injection vulnerabilities.

While none of these mitigations should be assumed robust by themselves, the combination of them should be sufficient to prevent such attacks.

Denial of Service through Spamming

Status: Partially Mitigated

There is currently no limitation on the number of messages that can be sent to a Cwtch server or by a Cwtch peer. Each message requires process and is added to the UI if valid.

We have put in work to ensure that an influx of messages does not degrade the app experience, however it will result in an increase in network badwidth which may be intolerable or undesired for many people - especially those on metered connections (e.g. cellphone data plans)

In order to be suitable to deploy groups at a wide scale, the app require a way to prevent Cwtch from fetching information over such connections, and this should likely be turned on by default.

Testing Status

The UI is currently only subject to manual testing.

Profile Encryption & Storage

Profiles are stored on locally on disk and encrypted using a key derived from user-known password (via pbkdf2).

Note that once encrypted and stored on disk, the only way to recover a profile is by rederiving the password - as such it isn't possible to provide a full list of profiles a user might have access to until they enter a password.

Unencrypted Profiles and the Default Password

To handle profiles that are "unencrypted" (i.e don't require a password to open) we currently create a profile with a default, hardcoded password.

This isn't ideal, we would much rather wish to rely on OS-provided key material such that the profile is bound to a specific device, but such features are currently patchwork - we also note by creating an unencrypted profile, people who use Cwtch are explicitly opting into the risk that someone with access to the file system may be able to decrypt their profile.

Related Code

https://git.openprivacy.ca/cwtch.im/cwtch/src/commit/3529e21b0e0763ca6ec20ddfd13e22c8070c4915/storage/v1/file_enc.go

Cwtch Server

The goal of the Cwtch protocol is to enable group communication through Untrusted Infrastructure.

Unlike in relay-based schemes where the groups assign a leader, set of leaders, or a trusted third party server to ensure that every member of the group can send and receive messages in a timely manner (even if members are offline) - untrusted infrastructure has a goal of realizing those properties without the assumption of trust.

The original Cwtch paper defined a set of properties that Cwtch Servers were expected to provide:

  • Cwtch Server may be used by multiple groups or just one.
  • A Cwtch Server, without collaboration of a group member, should never learn the identity of participants within a group.
  • A Cwtch Server should never learn the content of any communication.
  • A Cwtch Server should never be able to distinguish messages as belonging to a particular group.

We note here that these properties are a superset of the design aims of Private Information Retrieval structures.

Malcious Servers

We expect the presence of malicious entities within the Cwtch ecosystem.

We also prioritize decentralization and permissionless entry into the ecosystem and as such we do not base any security claims on the following:

  • Any non-collusion assumptions between a set of Cwtch servers
  • Any third-party defined verification process

Peers themselves are encouraged to set up and run Cwtch servers where they can guarantee more efficient properties by relaxing trust and security assumptions - however, by default, we design the protocol to be secure without these assumptions - sacrificing efficiency where necessary.

Detectable Faults

  • If a Cwtch server fails to relay a specific message to a subset of group members then there will be a detectable gap in the message tree of certain peers that can be discovered through peer-to-peer gossip.
  • A Cwtch server cannot modify any message without the key material known to the group (any attempt to do so for a subset of group memebers will result in identical behavior to failing to relay a message).
  • While a server can duplicate messages, these will have no impact on the group message tree (because of encryption, nonces and message identities) - the source of the duplication is not knowable to a peer.

Efficiency

As of writing, only 1 protocol is known for achieving the desired properties, naive PIR or "the server sends everything, and the peers sift through it".

This has an obvious impact on bandwidth efficiency, especially for peers using mobile devices, as such we are actively developing new protocols in which the privacy and efficiency guarantees can be traded-off in different ways.

The most developed idea is to bucket the messages on the server into discrete time windows and allow peers to fetch smaller batches, coupled with the underlying tor connection this technique should provide sufficient privacy - although the technique still needs formal verification.

Protecting the Server from Malicious Peers

The main risk to servers come in the form of spam generated by peers. In the prototype of Cwtch a spamguard mechanism was put in place that required peers to conduct some arbitrary proof of work given a server-specified parameter.

This is not a robust solution in the presence of a determined adversary with a significant amount of resources, and thus one of the main external risks to the Cwtch system becomes censorship-via-resource exhaustion.

We have outlined a potential solution to this in token based services but note that this also requires further development.

Development

The main process to counter malicious actors in development of Cwtch is the openness of the process.

To enhance this openness, automated builds, testing and packaging are defined as part of the repositories - improving te robustness of the code base at every stage.

While individual tests aren't perfect, and all processes have gaps, we should be committed to make it as easy as possible to contribute to Cwtch while also building pipelines and processes that catch errors (unintential or malicious) as soon as possible.

Risk: Developer Directly Pushes Malicious Code

Status: Mitigated

Master is currently locked and 3 Open Privacy staff members have permission to override it, and the responsibility of monitoring changes.

Further every new pull request and merge triggered automated builds & tests which trigger emails and audit logs.

The code is also open source and inspectable by anyone.

Risk: Code Regressions

Status: Partially Mitgated (See individual project entries in this handbook for more information)

Our automated pipelines have the ability to catch regressions when that behaviour is detectable.

The greatest challenge is in defining how such regressions are detected for the ui - where behaviour isn't as strictly defined as it is for the individual libraries.

Deployment

Risk: Binaries are replaced on the website with malicious ones

Status: Unmitigated

While this process is now mostly automated, should this automation ever be compromised then there is nothing in our current process that would detect this.

We need:

  • Reproducible Builds - it is unlikely that we will be able to do this overnight, several parts of our build process (Qt builds, the recipe etc.) may introduce non-determinism. Nevertheless, we should seek to identify where this non-determinism is.
  • Signed Releases - Open Privacy does not yet maintain a public record of staff public keys. This is likely a necessity for signing released builds and creating an audit chain backed by the organization. This process must be manual by definition.

References

  • Nik Unger et al. “SoK: secure messaging”. In:Security and Privacy (SP ), 2015 IEEE Sympo-sium on. IEEE. 2015, pp. 232–249 link