Replication

分类

Two basic replication approaches

  • Synchronous replication
  • Asynchronous replication

根据是否会产生分歧分为

  • single copy systems
    • 副本间保持一致性,只有一个副本是活跃的
    • 常见算法
      • 1n messages (asynchronous primary/backup)
      • 2n messages (synchronous primary/backup)
      • 4n messages (2-phase commit, Multi-Paxos)
      • 6n messages (3-phase commit, Paxos with repeated leader election)
  • multi-master systems

一致性算法比较图

Primary/backup replication

  • asynchronous primary/backup replication
    • MySQL, MongoDB
  • synchronous primary/backup replication

Two phase commit (2PC)

MySQL Cluster provides synchronous replication using 2PC

Partition tolerant consensus algorithms

  • Paxox
  • Raft

网络分区,Majority decisions ( (N/2 + 1)-of-N)

Roles

leader follower

Epochs

During each epoch only one node is the designated leader

epoch

Leader changes via duels

所有节点开始都是follower,leader维护心跳,follower通过心跳检测leader的状态和是否发生分区。

当节点检测到leader出问题后,它切换到“candidata”状态,开始一轮选举,竞选leader。

为了成为leader,节点必须接受大多数的投票。简单的投票规则是先来先得。同时增加一个随机的等待时间来解决同时请求选举的问题。

Numbered proposals within an epoch

During each epoch, the leader proposes one value at a time to be voted upon. Within each epoch, each proposal is numbered with a unique strictly increasing number. The followers (voters / acceptors) accept the first proposal they receive for a particular proposal number.

Paxos Raft ZAB

Paxos 最有名的强一致性分区容忍算法,在Google的多个系统上使用。主要问题是Paxos只解决一轮一致性决策,实际的实现中往往需要多轮Paxos算法。

ZAB Zookeeper Atomic Broadcast protocol. 用在 Apache Zookeepr 里。

Raft 设计目标之一是易懂。Etcd项目使用该算法。

 Replication methods with strong consistency

Primary/Backup

  • Single, static master
  • Replicated log, slaves are not involved in executing operations
  • No bounds on replication delay
  • Not partition tolerant
  • Manual/ad-hoc failover, not fault tolerant, "hot backup"

2PC

  • Unanimous vote: commit or abort
  • Static master
  • 2PC cannot survive simultaneous failure of the coordinator and a node -during a commit
  • Not partition tolerant, tail latency sensitive

Paxos

  • Majority vote
  • Dynamic master
  • Robust to n/2-1 simultaneous failures as part of protocol
  • Less sensitive to tail latency

Weak consistency model protocols

Eventual consistency with probabilistic guarantees

Quorum

A quorum is the minimum number of votes that a distributed transaction has to obtain in order to be allowed to perform an operation in a distributed system. A quorum-based technique is implemented to enforce consistent operation in a distributed system.

Quorum-based voting in commit protocols

Quorum-based voting for replica control

State machine replication

https://en.wikipedia.org/wiki/State_machine_replication

Amazon's Dynamo

set(key, value) get(key) Dynamo prioritizes availability over consistency;t does not guarantee single-copy consistency.Instead, replicas may diverge from each other when values are written; when a key is read, there is a read reconciliation phase that attempts to reconcile differences between replicas before returning the value back to the client.

[ Client ]
    |
( Mapping keys to nodes )
    |
    V
[ Node A ]
    |     \
( Synchronous replication task: minimum durability )
    |        \
[ Node B]  [ Node C ]
    A
    |
( Conflict detection; asynchronous replication task:
  ensuring that partitioned / recovered nodes recover )
    |
    V
[ Node D]

Consistent hashing Partial quorums Conflict detection and read repair

results matching ""

    No results matching ""