Version vector
From Wikipedia, the free encyclopedia
A version vector is a mechanism for tracking changes to data in a distributed system, where multiple agents might update the data at different times. The version vector allows the participants to determine if one update preceded another (happened-before), followed it, or if the two updates happened concurrently (and therefore might conflict with each other). In this way, version vectors enable causality tracking among data replicas and are a basic mechanism for optimistic replication. In mathematical terms, the version vector generates a preorder that tracks the events that precede, and may therefore influence, later updates.
Version vectors maintain state identical to that in a vector clock, but the update rules differ slightly; in this example, replicas can either experience local updates (e.g., the user editing a file on the local node), or can synchronize with another replica:
- Initially all vector counters are zero.
- Each time a replica experiences a local update event, it increments its own counter in the vector by one.
- Each time two replicas a and b synchronize, they both set the elements in their copy of the vector to the maximum of the element across both counters:
. After synchronization, the two replicas have identical version vectors.
Pairs of replicas, a, b, can be compared by inspecting their version vectors and determined to be either: identical (), concurrent (
), or ordered (
or
). The ordered relation is defined as: Vector
if and only if every element of
is less than or equal to its corresponding element in
, and at least one of the elements is strictly less than. If neither
or
, but the vectors are not identical, then the two vectors must be concurrent.
Version vectors[1] or variants are used to track updates in many distributed file systems, such as Coda (file system) and Ficus, and are the main data structure behind optimistic replication.[2]