Seminar 8: Raft Consensus Algorithm in Python
Goal: To study and practice the Raft consensus algorithm in Python for distributed applications.
Background: Consensus is used in a wide range of applications in distributed systems. For instance, some nodes/processes may fail or behave maliciously, so the remaining processes must reach consensus to ensure reliability and fault tolerance in the system. The consensus problem requires agreement among a number of processes (or agents) on a single data value. Several consensus algorithms have been designed to achieve this distributed decision, e.g., Raft and Paxos. At the fundamental level, any problem that requires consensus can be solved with state machine replication. A consensus algorithm must guarantee the following properties:
- Termination - Every non-faulty process eventually decides on a value
- Agreement - The final decision of every non-faulty process is the same everywhere
- Integrity - Every correct process decides at most one value, and the decided value must have been proposed by some process.
State machine replication:
- The main idea is that a single process (the leader) broadcasts the operations that change its state to the other processes, the followers (replicas)
- Total order broadcast: every node delivers the same messages in the same order
- The followers execute the same sequence of operations as the leader, so the state of each follower matches the leader's
- A replica is a state machine: it starts in a fixed initial state and goes through the same sequence of state transitions in the same order → all replicas end up in the same state.
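The idea can be illustrated with a toy sketch in plain Python (this is not PySyncObj code; the Replica class and the operation tuples are made up for illustration): two replicas that start in the same initial state and apply the same operation log in the same order end up in identical states.

```python
# Toy sketch of state machine replication (not PySyncObj).
class Replica:
    def __init__(self):
        self.state = {}  # fixed initial state

    def apply(self, op):
        kind, key, value = op
        if kind == "set":
            self.state[key] = value
        elif kind == "pop":
            self.state.pop(key, None)

# The leader's log, delivered to every replica in total order.
log = [("set", "x", 1), ("set", "y", 2), ("pop", "x", None)]

a, b = Replica(), Replica()
for op in log:  # same sequence of transitions, same order
    a.apply(op)
    b.apply(op)
```

After the loop, `a.state` and `b.state` are identical, which is exactly why replaying the leader's log suffices for replication.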
Raft
- Raft is based on state machine replication and guarantees strong consistency (linearizability)
- In Raft, time is divided into terms
- Each term consists of an election followed by normal operation under a single leader
- A term is identified by a logical clock value that only increases, never goes backward
- Raft guarantees that for any term there is at most one leader
- Leader election
- Every process starts as follower.
- A follower expects to receive a periodic heartbeat from the leader containing the election term.
- If the follower does not receive a heartbeat within a certain period of time, a timeout fires and the leader is presumed dead
- The follower starts a new election by incrementing the current election term and transitioning to the candidate state
- It then votes for itself and sends a request to all processes in the system to vote for it, stamping the request with the current election term
- The candidate wins the election: the candidate becomes the leader and starts sending out heartbeats to the other processes
- Another process wins the election: if another process claims to be the leader with a term greater than or equal to the candidate's term, the candidate accepts the new leader and returns to the follower state
- A period of time goes by with no winner (a split vote): the candidate eventually times out and starts a new election with a higher term; randomized election timeouts make repeated split votes unlikely
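The follower → candidate → leader transitions above can be sketched as a small state machine. This is an illustrative sketch only; the class and method names are hypothetical and are not PySyncObj's API.

```python
# Illustrative sketch of Raft leader-election state transitions.
FOLLOWER, CANDIDATE, LEADER = "follower", "candidate", "leader"

class Node:
    def __init__(self, node_id, cluster_size):
        self.id = node_id
        self.cluster_size = cluster_size
        self.state = FOLLOWER   # every process starts as a follower
        self.term = 0
        self.voted_for = None

    def on_heartbeat_timeout(self):
        # No heartbeat arrived in time: presume the leader dead and
        # start a new election with an incremented term.
        self.term += 1
        self.state = CANDIDATE
        self.voted_for = self.id  # vote for itself

    def on_votes_counted(self, votes):
        # A majority of votes wins the election for this term.
        if self.state == CANDIDATE and votes > self.cluster_size // 2:
            self.state = LEADER

    def on_leader_message(self, leader_term):
        # Another process claims leadership with a term >= ours:
        # accept the new leader and return to the follower state.
        if leader_term >= self.term:
            self.term = leader_term
            self.state = FOLLOWER
```

For example, in a 3-node cluster a node that times out becomes a candidate in term 1, wins with 2 votes, and steps down again if it later hears from a leader with term >= 1.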
- Normal operation
- Client sends command to leader
- Leader appends command to its log
- Log entry = index, term, command
- Leader sends AppendEntries to followers
- Each AppendEntries request contains the index and term of the entry immediately preceding the new ones
- The follower must hold a matching entry at that position, otherwise it rejects the request
- Once a new entry is committed (stored on a majority of servers/nodes/replicas):
- Leader passes command to its state machine, returns results to client
- Leader notifies followers of committed entries in subsequent AppendEntries
- Followers pass committed commands to their state machines
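The AppendEntries consistency check described above can be sketched as follows. This is a simplified, 0-indexed sketch for illustration; real Raft logs use 1-based indices and the RPC carries additional fields.

```python
# Simplified sketch of the AppendEntries consistency check.
def append_entries(log, prev_index, prev_term, new_entries):
    """log: list of (term, command). Returns True if the request is accepted."""
    if prev_index >= 0:
        # The follower must already hold a matching entry at prev_index.
        if prev_index >= len(log) or log[prev_index][0] != prev_term:
            return False  # no matching entry: reject the request
    # Drop any conflicting suffix, then append the new entries.
    del log[prev_index + 1:]
    log.extend(new_entries)
    return True

follower_log = [(1, "set x 1"), (1, "set y 2")]
ok = append_entries(follower_log, 1, 1, [(2, "pop x")])     # preceding entry matches
bad = append_entries(follower_log, 5, 2, [(2, "set z 3")])  # missing entry: rejected
```

The rejected request is how the leader discovers a follower's log has fallen behind: it retries with an earlier prev_index until the logs match.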
Exercise: We will get acquainted with the Raft algorithm by using the Python library PySyncObj. PySyncObj is a Python library for building fault-tolerant distributed systems: it replicates your application data between multiple servers and implements the Raft protocol for leader election and log replication.
- The code for the above implementation is given here for you. (code)
$ pip install pysyncobj
$ python kvstorage.py selfHost:port partner1Host:port partner2Host:port ...
- The normal (non-replicated) class that stores the key-value pairs:
class KVStorage(object):
    def __init__(self):
        self.__data = {}

    def set(self, key, value):
        self.__data[key] = value

    def pop(self, key):
        self.__data.pop(key, None)

    def get(self, key):
        return self.__data.get(key, None)
- Transform the normal class into a replicated class
- Inherit it from SyncObj
- Initialize SyncObj with a self address and a list of partner addresses
- Mark all methods that modify your class fields with the @replicated decorator to achieve replication across the different servers.
from pysyncobj import SyncObj, SyncObjConf, replicated

class KVStorage(SyncObj):
    def __init__(self, selfAddress, partnerAddrs):
        cfg = SyncObjConf(dynamicMembershipChange=True)
        super(KVStorage, self).__init__(selfAddress, partnerAddrs, cfg)
        self.__data = {}

    @replicated
    def set(self, key, value):
        self.__data[key] = value

    @replicated
    def pop(self, key):
        self.__data.pop(key, None)

    def get(self, key):
        return self.__data.get(key, None)
Task: Based on kvstorage.py, extend the Raft implementation so that a log index is incremented whenever you set new key-value pairs or pop key-value pairs.
Deliverables: please include the source Python file and screenshots of the results in your terminals. Then put all the files in a single zip archive.
Link:
- https://people.cs.rutgers.edu/~pxk/417/notes/content/consensus.html
- https://en.wikipedia.org/wiki/Consensus_
- https://www.geeksforgeeks.org/distributed-consensus-in-distributed-systems/
- https://www.youtube.com/watch?v=YbZ3zDzDnrw
- https://github.com/bakwc/PySyncObj
NOTE: Please watch the video or ask us through Slack if something is not clear to you. We usually don't change the points after the deadline.