Replication

Primary/backup replication

asynchronous primary/backup replication
- MySQL, MongoDB
synchronous primary/backup replication

Two phase commit (2PC)

MySQL Cluster provides synchronous replication using 2PC

Partition tolerant consensus algorithms

Paxox
Raft

网络分区，Majority decisions （ (N/2 + 1)-of-N）

Roles

leader follower

Epochs

During each epoch only one node is the designated leader

Leader changes via duels

所有节点开始都是follower，leader维护心跳，follower通过心跳检测leader的状态和是否发生分区。

当节点检测到leader出问题后，它切换到“candidata”状态，开始一轮选举，竞选leader。

为了成为leader，节点必须接受大多数的投票。简单的投票规则是先来先得。同时增加一个随机的等待时间来解决同时请求选举的问题。

Numbered proposals within an epoch

During each epoch, the leader proposes one value at a time to be voted upon. Within each epoch, each proposal is numbered with a unique strictly increasing number. The followers (voters / acceptors) accept the first proposal they receive for a particular proposal number.

Paxos Raft ZAB

Paxos 最有名的强一致性分区容忍算法，在Google的多个系统上使用。主要问题是Paxos只解决一轮一致性决策，实际的实现中往往需要多轮Paxos算法。

ZAB Zookeeper Atomic Broadcast protocol. 用在 Apache Zookeepr 里。

Raft 设计目标之一是易懂。Etcd项目使用该算法。

　Replication methods with strong consistency

Primary/Backup

Single, static master
Replicated log, slaves are not involved in executing operations
No bounds on replication delay
Not partition tolerant
Manual/ad-hoc failover, not fault tolerant, "hot backup"

2PC

Unanimous vote: commit or abort
Static master
2PC cannot survive simultaneous failure of the coordinator and a node -during a commit
Not partition tolerant, tail latency sensitive

Paxos

Majority vote
Dynamic master
Robust to n/2-1 simultaneous failures as part of protocol
Less sensitive to tail latency

Weak consistency model protocols

Eventual consistency with probabilistic guarantees

Quorum

A quorum is the minimum number of votes that a distributed transaction has to obtain in order to be allowed to perform an operation in a distributed system. A quorum-based technique is implemented to enforce consistent operation in a distributed system.

Quorum-based voting in commit protocols

Quorum-based voting for replica control

State machine replication

https://en.wikipedia.org/wiki/State_machine_replication

Amazon's Dynamo

set(key, value) get(key) Dynamo prioritizes availability over consistency;t does not guarantee single-copy consistency.Instead, replicas may diverge from each other when values are written; when a key is read, there is a read reconciliation phase that attempts to reconcile differences between replicas before returning the value back to the client.

[ Client ]
    |
( Mapping keys to nodes )
    |
    V
[ Node A ]
    |     \
( Synchronous replication task: minimum durability )
    |        \
[ Node B]  [ Node C ]
    A
    |
( Conflict detection; asynchronous replication task:
  ensuring that partitioned / recovered nodes recover )
    |
    V
[ Node D]

Consistent hashing Partial quorums Conflict detection and read repair

Replication

Replication

分类

Primary/backup replication

Two phase commit (2PC)

Partition tolerant consensus algorithms

Roles

Epochs

Leader changes via duels

Numbered proposals within an epoch

Paxos Raft ZAB

Replication methods with strong consistency

Quorum

State machine replication

Amazon's Dynamo

results matching ""

No results matching ""