A distributed, transactional,
fault-tolerant object store

Semantics

  • Full transactions. A transaction is atomic, isolated, and durable, no matter how many objects it involves.
  • Strongly serializable transactions only. Strong serializability requires that the database's behaviour is indistinguishable from that of a database which performs one transaction at a time and starts each transaction from the state left by the last committed transaction. (For database theory fans: the precedence graph of committed transactions not only has no cycles, but as each node (transaction) is added to the graph, the new node must satisfy all necessary dependencies without requiring any outgoing edges. Dependencies between transactions are based on versioned objects (MVCC), and committed transactions from the same client depend on each other in the order in which they commit (commit is a synchronous operation).)
  • Ability to tolerate up to (and including) F failures, where F is a configuration parameter. The only constraint is that the minimum cluster size is 2*F + 1. Provided no more than F nodes of the cluster are unreachable, the cluster will continue working as normal and there is no impact on client operations. Failure is defined as a TCP connection dropping. Each node judges the failure of remote nodes locally, from its own connections: there is no need to establish any concept of cluster membership.
  • CP-system. When more than F nodes become unreachable, a transaction may block, unable to complete, or may be rejected. When unreachable nodes return, there will be no consistency issues: the cluster will be fully operational as soon as no more than F nodes are unreachable. Divergence is not possible.
  • A retry transaction will be restarted once any of the objects it read is modified by another transaction. This happens regardless of whether the new value differs from the previous value (i.e. a transaction which rewrites the same value of an object will still trigger retry transactions waiting on that object).
  • A client can only access an existing object if there is a path of readable references leading to that object from a root object on which the client has a read capability. To access the object, the client must follow such a path.
  • An object created by a client will remain accessible by that same client connection (and with read-write capability) until the client connection closes. For the object to become accessible to any other client connection, there must be a readable path from a root object that the client can navigate to the object.
  • If a client does not have a read capability on an object, the client will not be sent the value or references of that object by the server. If a client does not have a write capability on an object then any attempt to write to the object will result in the server aborting the transaction.
  • A client can use and grant to others only the capabilities it has itself received for any given object. If a client has only received a read capability then it may read the object and its references, and it may create new references to the object which contain either the read capability or the none capability. If a client has only received a write capability then it may write to the object and set the object's references, and it may create new references to the object which contain either the write capability or the none capability. If a client has received both read and write capabilities (either at once from the same object reference (pointer), or from two separate references), then it may both read and write the object and its references. It may also create new references to the object which contain either the read-write capability, or the write capability, or the read capability, or the none capability.
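The failure-tolerance rule above comes down to simple arithmetic, sketched here in Python. The function names are hypothetical and chosen only for illustration; they are not part of any client or server API. The sketch encodes only what the list states: the minimum cluster size is 2*F + 1, and the cluster keeps working as long as no more than F nodes are unreachable.

```python
def min_cluster_size(f: int) -> int:
    """Smallest cluster that can tolerate f failures (hypothetical helper)."""
    if f < 0:
        raise ValueError("F must be non-negative")
    return 2 * f + 1


def works_as_normal(f: int, unreachable: int) -> bool:
    """True while no more than f nodes are unreachable.

    When this is False, a transaction may block or be rejected,
    but the data cannot diverge.
    """
    return unreachable <= f


if __name__ == "__main__":
    for f in range(4):
        print(f"F={f}: minimum cluster size {min_cluster_size(f)}")
```

Note that in a cluster of exactly 2*F + 1 nodes, losing F nodes still leaves F + 1 reachable, which is a majority of the cluster.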
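The reachability rule above (an object is accessible only via a path of readable references from a readable root) can be pictured as a graph traversal. The following is a minimal sketch under assumed data structures: the dictionaries, the string capability names and the function `readable_objects` are all hypothetical, and it ignores objects the client created itself on the current connection.

```python
from collections import deque

READ, WRITE = "read", "write"


def readable_objects(roots, refs):
    """Return the ids of objects a client can reach by reading.

    roots: dict mapping root object id -> set of capabilities the client
           holds on that root (assumed representation).
    refs:  dict mapping object id -> list of (target id, capabilities)
           pairs, one per reference stored in that object.
    """
    seen = set()
    queue = deque(obj for obj, caps in roots.items() if READ in caps)
    while queue:
        obj = queue.popleft()
        if obj in seen:
            continue
        seen.add(obj)
        # Only references that carry the read capability extend the path.
        for target, caps in refs.get(obj, []):
            if READ in caps and target not in seen:
                queue.append(target)
    return seen


if __name__ == "__main__":
    roots = {"root": {READ}}
    refs = {
        "root": [("a", {READ}), ("b", {WRITE})],
        "a": [("d", {READ, WRITE})],
        "b": [("c", {READ})],
    }
    print(sorted(readable_objects(roots, refs)))
```

In this example "b" is referenced only with a write capability, so the client never receives its references and "c" stays out of reach.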
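The capability rules in the last item amount to this: a client may place on a new reference any subset of the capabilities it has received for that object, whether they arrived on one reference or on several. A minimal sketch of that rule follows; the `Capability` type and both functions are hypothetical names for illustration, not the actual client API.

```python
from enum import Flag


class Capability(Flag):
    """Illustrative capability set; the names mirror the prose above."""
    NONE = 0
    READ = 1
    WRITE = 2
    READ_WRITE = 3  # READ | WRITE


def combine(received):
    """Union of the capabilities received for one object across references."""
    total = Capability.NONE
    for cap in received:
        total |= cap
    return total


def may_grant(held: Capability, wanted: Capability) -> bool:
    """A new reference may carry only capabilities the client already holds."""
    return (wanted & held) == wanted


if __name__ == "__main__":
    # Read and write capabilities received from two separate references.
    held = combine([Capability.READ, Capability.WRITE])
    for wanted in (Capability.NONE, Capability.READ,
                   Capability.WRITE, Capability.READ_WRITE):
        print(wanted, may_grant(held, wanted))
```

The none capability is always grantable under this rule, which matches each case listed above.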