Let's Gossip

Message Broadcasting in Distributed Systems

Sharing information across the hosts of a service is a common requirement in distributed system design. This post covers some of the common architectures that can be used to support message broadcasting.

Let's first discuss some of the use cases where such a component is needed.

  • Rate Limiting : Each client is assigned a quota per time unit. For effective rate limiting, each host must communicate each client's usage metrics to all other hosts. Hosts use this information to update their local quota for the client.
  • Membership Management : Each host might need to know about all the other hosts that are part of the service. Membership of hosts can be validated via heartbeats. We need efficient ways to ensure all hosts have up-to-date membership information.
  • Consensus : Multiple hosts might need to make a decision together. To achieve this, they need to exchange messages with each other.
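To make the rate limiting use case concrete, here is a minimal sketch of how a host might merge usage counts reported by its peers to enforce a global quota locally. All names and the max-merge scheme are illustrative assumptions, not a prescribed design.

```python
# Hypothetical sketch: each host tracks per-client usage as reported by
# every host (including itself) and enforces the quota on the global sum.
from collections import defaultdict

class RateLimiter:
    def __init__(self, quota_per_window):
        self.quota = quota_per_window
        self.usage = defaultdict(dict)  # client -> {host: count}

    def record_local(self, client, host, count):
        self.usage[client][host] = count

    def merge_peer_report(self, client, peer_host, count):
        # take the max so re-delivered reports are idempotent
        current = self.usage[client].get(peer_host, 0)
        self.usage[client][peer_host] = max(current, count)

    def allow(self, client):
        return sum(self.usage[client].values()) < self.quota

limiter = RateLimiter(quota_per_window=100)
limiter.record_local("client-a", "host-1", 40)
limiter.merge_peer_report("client-a", "host-2", 70)
print(limiter.allow("client-a"))  # 40 + 70 >= 100 -> False
```

The broadcasting architectures below are different ways of delivering those peer reports.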

Now let's discuss some of the architectures which can be used to solve this problem.

Full Mesh Pattern

In the Full Mesh pattern, each host in the fleet communicates/broadcasts information to every other host in the fleet.

Full Mesh

To achieve this, we need a service registry where hosts register when they join the fleet. The registry can be implemented with existing technologies like ZooKeeper and used by the hosts to discover the rest of the fleet. Alternatively, if supported, we could query the load balancer for the list of hosts registered with it.

This approach is straightforward to implement and is a viable option to start with. But as the fleet grows, it becomes a scalability bottleneck: with n hosts, there are n × (n − 1) directed connections, so the connection count grows quadratically. Something similar happened during the Kinesis outage of November 2020. One alternative to avoid such failures is a cell-based architecture, but that's a discussion for another day.
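The quadratic growth is easy to see by enumerating the links. A quick sketch, with purely illustrative host names:

```python
# Full-mesh sketch: every host maintains a connection to every other host.
# With n hosts there are n * (n - 1) directed links.
def full_mesh_connections(hosts):
    return [(src, dst) for src in hosts for dst in hosts if src != dst]

hosts = [f"host-{i}" for i in range(4)]
links = full_mesh_connections(hosts)
print(len(links))  # 4 * 3 = 12 directed links
```

Doubling the fleet quadruples the number of links, which is exactly the scaling problem described above.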

Gossip Protocol

The Gossip Protocol spreads messages to all hosts in a way similar to how an epidemic spreads. Each host randomly selects a few of its peers and shares updates with them at a configured frequency. This way, every host eventually receives the information.

Gossip Protocol

Cassandra uses this protocol to discover the location and state information of other nodes in the cluster. WeaveNet is a framework which internally uses the gossip protocol to transmit membership information to all hosts. This article explains how it can be integrated with ECS and used to broadcast information to all hosts.
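The epidemic-style spread can be simulated in a few lines. This is a toy model, not Cassandra's or WeaveNet's actual implementation; the fanout and round limit are arbitrary assumptions:

```python
# Gossip sketch: each round, every host that already has the message
# forwards it to `fanout` randomly chosen peers, until everyone has it.
import random

def gossip(hosts, origin, fanout=2, max_rounds=20, rng=None):
    rng = rng or random.Random(0)   # seeded for reproducibility
    infected = {origin}
    rounds = 0
    while len(infected) < len(hosts) and rounds < max_rounds:
        for host in list(infected):
            peers = rng.sample(
                [h for h in hosts if h != host],
                k=min(fanout, len(hosts) - 1),
            )
            infected.update(peers)
        rounds += 1
    return infected, rounds

hosts = [f"host-{i}" for i in range(16)]
infected, rounds = gossip(hosts, "host-0")
print(len(infected) == len(hosts))  # True: everyone received the message
```

Note how the set of informed hosts roughly triples each round with a fanout of 2, which is why gossip converges in O(log n) rounds with high probability.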

Distributed Caching

Distributed Cache

Caches are a commonly used component of distributed systems. For this use case too, a distributed cache like Redis can be used to store the data which needs to be shared among the hosts. All hosts can update and read the information from this cache cluster. Since most distributed systems already have a cache cluster, it can easily be extended to this use case. AWS ElastiCache for Redis can be used to implement such a pattern.
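As a rough sketch of the pattern, a plain dict stands in for the cache cluster below; with a real Redis client the equivalent operations would be `INCRBY` and `GET`. The key naming is an illustrative assumption:

```python
# Shared-cache sketch: hosts write usage counters to one shared store
# instead of talking to each other. A dict stands in for Redis here.
class SharedCache:
    def __init__(self):
        self.store = {}

    def incrby(self, key, amount):
        # mirrors Redis INCRBY: atomic increment, creating the key at 0
        self.store[key] = self.store.get(key, 0) + amount
        return self.store[key]

    def get(self, key):
        return self.store.get(key, 0)

cache = SharedCache()
# two different hosts report usage for the same client
cache.incrby("usage:client-a", 40)   # from host-1
cache.incrby("usage:client-a", 25)   # from host-2
print(cache.get("usage:client-a"))   # 65
```

The trade-off is that the cache cluster becomes a shared dependency: every host now reads and writes through it, so its availability and latency directly affect the whole fleet.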

Coordination Service

Coordination Service for Leader Selection

Another alternative is to assign a leader to the cluster. The leader is responsible for fetching the data from all hosts and relaying the updates back to them. This leads us to another issue: how to select a leader? Leader election is a well-known problem and can be solved with consensus algorithms like Paxos and Raft, but implementing such algorithms with proper failure handling is very complicated. Alternatively, Apache ZooKeeper can be used to elect the leader.
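The ZooKeeper-style scheme can be sketched as: each host creates an ephemeral sequential node, and the host holding the lowest sequence number is the leader; when it dies, its node disappears and leadership fails over. The simulation below is an in-memory illustration of that idea, not real ZooKeeper client code:

```python
# In-memory sketch of ZooKeeper-style leader election via ephemeral
# sequential nodes: lowest sequence number wins.
import itertools

class ElectionSim:
    def __init__(self):
        self.counter = itertools.count()
        self.nodes = {}  # host -> sequence number

    def join(self, host):
        self.nodes[host] = next(self.counter)

    def leave(self, host):
        # the ephemeral node vanishes when the host's session ends
        self.nodes.pop(host, None)

    def leader(self):
        return min(self.nodes, key=self.nodes.get) if self.nodes else None

sim = ElectionSim()
for h in ("host-1", "host-2", "host-3"):
    sim.join(h)
print(sim.leader())  # host-1: lowest sequence number
sim.leave("host-1")
print(sim.leader())  # host-2: leadership fails over
```

With an actual ZooKeeper ensemble, a client library recipe (for example kazoo's Election) packages this pattern, including session handling and watches.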

Random Leader Selection

Random Leader Selection

Leader election algorithms ensure that exactly one leader is elected in the cluster at a time. But some use cases, like the rate limiting problem, can still work when more than one leader is selected. Since the coordination service solution explained above sacrifices availability for consistency, an alternative where a random node is selected as leader can be used when availability is of higher importance. In such cases, there might be more than one leader in the cluster at a time.
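One way to realize this without any coordination, shown as an assumed sketch rather than a standard recipe, is for every host to independently hash the membership list: hosts with the same view agree on the same leader, while hosts with a stale view may briefly pick a different one, which this approach tolerates.

```python
# Coordination-free leader pick: every host computes the same deterministic
# "random" leader from its membership view and an epoch (e.g. time window).
import hashlib

def pick_leader(hosts, epoch):
    def score(host):
        return hashlib.sha256(f"{host}:{epoch}".encode()).hexdigest()
    return min(hosts, key=score)

hosts = ["host-1", "host-2", "host-3"]
leader = pick_leader(hosts, epoch=42)
print(leader in hosts)                        # True
print(pick_leader(hosts, epoch=42) == leader) # deterministic for one view
```

Changing the epoch rotates the leader over time, and disagreeing views during membership churn simply yield multiple temporary leaders, which the rate limiting use case can absorb.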