Paper

Highly Available Transactions: Virtues and Limitations
Peter Bailis, Aaron Davidson, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, and Ion Stoica. Proceedings of the VLDB Endowment, 2013.

Background

The aim of a highly available (HA) system is to mitigate effects of network failures and facilitate low latency. The 2000s saw a shift from relational databases, to distributed weakly consistent key-value stores, which was prompted by confusion surrounding the CAP theorem. People mistook the CAP impossibility result, a system cannot be available and consistency in the event of network partitions, as evidence transactional ACID properties cannot be provided with high availability. Thus, foregoing transactional semantics in order to achieve HA. The 2010s, however, has seen a resurgence of transactional, “NewSQL”, systems; indicating the value of transactional semantics.

Main Idea

Some transactional semantics, e.g. Serializability, are known to be unavailable in the event of network partitions. There are, however, due performance limitations of Serializability, a number of weak isolation semantics described in the literature. Additionally, distributed databases are commonly replicated systems, thus, replica consistency semantics must also be considered. This paper identifies which transactional semantics, replica consistency models, and combinations of them, can be provided with high availability in a distributed database, referred to as Highly Available Transactions (HATs).

Theory

A distributed database can be fully replicated, each replica has a copy of all objects, or partially replicated, each replica has a copy of a subset objects. The distributed and parallel computing literature focuses on single-object, single-operation, availability. Here, highly available means if a client can contact any replica of an object then a response is eventually guaranteed, that is, there is no synchronous communication, hence, no stalls if the network is partitioned. Additionally, systems can be sticky available, if clients always contacts the replica, or, the same logical copy then a response is eventually guaranteed.

Databases focus on transactions, groups of multiple-operations over multiple-objects, additionally, they can commit or abort. If a transaction can contact at least 1 replica for each object needed to satisfy the transaction it has replica availability. An internal abort occurs if a transaction aborts due to its own decisions, e.g., violating an integrity constraint. An external abort is system induced, due to some system implementation. Therefore, if a transaction is provided replica availability for all objects and eventually commits (after numerous retries), or internally aborts, the system provides transactional availability.

Key Results

  • HAT-compliant transactional semantics; Read Uncommitted, Read Committed, Monotonic Atomic View, Item-cut Isolation (ANSI Repeatable Read), and Predicate-cut Isolation.
  • HAT-compliant weak replica consistency models; Monotonic Reads, Monotonic Writes, Writes-follow-Reads. Sticky HAT-compliant weak replica consistency models; causal, PRAM, and Read-your-writes.
  • Weak replica consistency models are achievable with ACID properties, e.g., Causal and Predicate-cut Isolation, can provide causally consistent snapshot reads.
  • HAT-compliant systems have lower latency than a distributed serializable system.
  • HATs cannot prevent concurrent updates (lost updates and write skew) or recency guarantees for reads, hence, Serializability, Snapshot Isolation, and Repeatable Read are not HAT-compliant.
  • Some traditional concurrency control protocols do not provide high availability, even if the semantics are HAT-compliant.
  • High availability should be a core design decision in future concurrency control designs.
  • Future systems require HATs with some non-HATs.

Evaluations

The paper provides empirical evidence for the following claims:

  1. Network partitions are common. The paper cites evidence that network partitions are problematic in practice, e.g., in 2011 Microsoft reported 13300 network failures that had end user impact; repair time varied from 5 minutes to 1 week.
  2. Network latency is non-negligible. Performed a network latency measurement study on Amazon EC2, collecting 1 week of ping-times to quantify, intra-datacenter, inter-datacenter (availability zone), and inter-plantery round trip (RTT) latency, results ranged from, (0.50-0.56ms), (1-3.5ms), and (22.5-360ms) respectively.
  3. Weak isolation is prevalent. Surveyed 18 DBMSs, only 3 provided serializability by default, whilst 8 did not provide it at all. Hence, if weak isolation is acceptable then HAT-compliant weak isolation systems would also be acceptable and useful.
  4. HAT-Compliant systems have low latency. Implemented a geo-replicated HAT prototype and was found to be 1-3x lower latency compared to a distributed serializable system.
  5. Are HAT semantics useful. Analyzed TPC-C benchmark finding 4 out of 5 transactions can be executed with HATs.