Seminar 9: Raft consensus algorithm in python
Goal: To study and practice Raft consensus algorithm in python for distributed applications
Background: Consensus involves multiple servers agreeing on values. It is a fundamental problem in fault-tolerant, distributed systems. But once the servers agree on a value, that agreement is final. Consensus-based replication has become key to building resilient, strongly-consistent distributed systems. Examples of applications of consensus include: Whether to commit a transaction to a database, Agreeing on the identity of a leader, State machine replication and Atomic broadcasts
A consensus protocol tolerating failures must have the following features :
- Validity: A value must have been suggested by another valid process if a process decides (reads or writes) it.
- Agreement : Every correct process must agree on the same value
- Termination : Every correct process must terminate after a finite number of steps.
- Integrity : If all correct processes decide on the same value, then any process has the said value.
Raft The raft algorithm was developed primarily as an alternative to the Paxos consensus algorithm. Raft is a leader-based log replication protocol in contrast to the leaderless Paxos protocol. Simply said, a Raft implementation elects a leader once, and that node is then in charge of making all decisions regarding the database's state.
Raft is built around the concept of a replicated log:
- When the leader receives a request, it first stores an entry for it in its durable local log.
- This local log is then replicated to all of the followers, or replicas.
- Once the majority of replicas confirm they have persisted with the log, the leader applies the entry and instructs the replicas to do the same.
- In the event of leader failure, a replica with the most up-to-date log becomes the leader.
Term of time in Raft
In Raft, time is divided into terms
- Election and Normal operation under a single leader
- A term is depicted by a logical clock and just increases forward
- The term starts by an election to decide who becomes a leader
- Raft guarantees that for any term there is at most one leader
Election operation in Raft
The basic consensus algorithm requires only two types of RPCs. RequestVote RPCs are initiated by candidates during elections, and AppendEntries RPCs are initiated by leaders to replicate log entries and to provide a form of heartbeat.
- All processes begin as followers.
- The election term is included in the leader's periodic heartbeats, which followers anticipate receiving.
- A timeout is triggered and the leader is assumed to be dead if the follower does not detect a heartbeat after a predetermined amount of time.
- By extending the existing election term and switching to the candidate state, the follower initiates a new election.
- After voting for itself, it requests votes from every other processes in the system, stamping the request with the election term in effect.
- After winning the election, the candidate assumes leadership of the group and begins communicating with the other processes.
- When another process declares itself to be the winner of the election with a term larger than or equal to the candidate's term, it accepts the new leader and goes back to the follower state.
- If a length of time passes without a winner—which is extremely unlikely—then the candidate will eventually clock out and begin a fresh election process.
Log replication (Normal operation) in Raft
- Client sends command to leader
- Leader appends command to its log
- Log entry = index, term, command
- Leader sends AppendEntries to followers
- Each AppendEntries contains index, term of entry preceding new ones
- Follower must contain matching entry, otherwise it rejects request
- Once new entry committed (stored on majority of servers/nodes/replicas):
- Leader passes command to its state machine, returns results to client
- Leader notifies followers of committed entries in subsequent AppendEntries
- Followers pass committed commands to their state machines
Exercise: We will get acquainted with the Raft algorithm by using the gRPC in python. It provides raft protocol for leader election and log replication and this implementation returns key-value storage.(Code)
- But first we need to install the dependency TODO in the requirements.txt file in your terminal.
pip install -r requirements.txt
- Updates on each server terminal's state [Follower, Candidate, Leader] will be printed.
- To test the Log replication functionality you will need to run client.py and interact with nodes using the following commands: [connect, getleader, suspend, quit, getval, setval]
Task: Based on exercise, test log replication using the raft consensus algorithm with other key-value storage commands that can:
- INCR <key>: that increments the value associated with a given key by 1
- EXPIRE <key> <seconds>: sets a time-to-live (TTL) value for the given key, after which the key will be automatically deleted
Deliverables: please contain the source python file, and also the screenshot of results in terminals. Then put all the file in a single zip.
Links: