Relevant Files

Background Concepts


Gossip is a general strategy to disseminate information in a network that works as follows. Periodically, each node contacts another node at random and exchanges information with that node, propagating new updates. Several variants of gossip can be implemented: information can flow in one direction or in both directions at each gossip round; a node may send updates (push gossip) or ask for updates (pull gossip); etc. If every nodes propagates information in background in this way, eventually all nodes receive all updates. One advantage of gossip is that it provides a simple way to distribute the load of the information dissemination task among all processes.

Consistent Hashing

Consistent hashing is a technique to assign items to nodes that avoid the need for keeping a directory that stores explicitly the location of every item.

The technique works by mapping all servers and all files in a common address space (usually a large address space, for instance, using 128 bit identifiers). The mapping is provided by a hash function. By hashing the name of the server (or the IP address or the server) one gets the the identifier of the server in the address space. By hashing the name of the file we get the identifier of the file in the address space. This will distribute the servers and the files uniformly at random in the address space, as depicted in the figure below.

The each file is assigned to the first server with an identifier larger than the identifier of the file, known as the sucessor (by moving in the address space in a clockwise manner). In this example "fileZ", with identifier 6512 is assigned to "server D", with identifier 7644.
One advantage of consistent hashing is that if new servers are added or removed, only a fraction of the files need to be relocated. For instance, in this example, if "server A" is removed from the system, only "fileX" needs to be relocated.