Seminar 5: Logical clocks using Lamport and Vector clock algorithms in Python
Goal: To study and practice the concept of logical clocks (Lamport and Vector)
Background: For many purposes in distributed systems, it is sufficient that all machines agree on the same time. While physical clock timestamps works similarly to the wall clocks, hence they are not efficient to determine the correct execution behavior of distributed applications. Logical clocks on the other hand ensures ordering of events. It captures the chronological and casual relationships in distributed system and allows global ordering of events from different processes/applications by relying on monotonic counters. In this seminar, we will explore two logical clocks: Lamport clocks and Vector clocks.
The Happened-before relationship
- Happens-before relationship a-->b, ”a happens before b”
- All processes/nodes agree on that first event a occurs, then afterward, event b occurs
- Can be observed in three situations
- If a and b are events in the same process, and a occurs before b, then a-->b is true
- If a is the event of a message being sent by one process, and b is the event of the message being received by another process, then a-->b is also true
- Transitive: if a -->b and b -->c, then a -->c
Lamport logical clocks:
- All processes agree on time C(e) for every event e
- C must always go forward (increase), never backward (decrease)
- If a -> b , then C(a) < C(b)
- For all distinctive events a and b, C(a) ≠ C(b)
- Implementation (Each process Pi maintains local counter Ci)
- Pi executes Ci= Ci + 1 before executing an event (e.g. sending a message over the network)
- When sending message m to Pj, Pi sets m’s timestamp ts(m)=Ci(after having executed step 1)
- Upon receiving message m, Pj adjusts its own local counter as Cj <- max{Cj, ts(m)}, after which it executes step 1 (Cj= Cj + 1) and delivers the message to application.
- Issue: As shown in figure - Even though C(h) < C(c), we can not say whether h happened before c or not. In other words, we cannot establish the causal relationship between these two events. The solution to this problem is to make use of Vector Clock.
Vector clocks:
- A vector clock uses a vector of integer variables, e.g., depicting each process/node in the system, for determining the partial ordering of events as nodes/processes execute operations and exchange messages. Vector clocks can be used to detect causality violations. Just as in Lamport timestamps, inter-process messages contain the state of the sending process's logical clock.
- Capture causality: if VC(a) < VC(b), then event a is known to causally precede event b
- Implementation (Each process Pi maintains a vector clock VCi)
- Pi executes VCi[i] = VCi[i] + 1 before executing an event
- When Pi sends message m to Pi, it sets m’s timestamp ts(m) equal to VCi (after having executed step 1)
- When Pj receives m, it executes VCj[k] = max{VCj[k], ts(m)[k]} for each k, after which it executes step 1 and delivers the message to the application.
- As shown in Figure, VC(h)|||VC(c), VC(f)|||VC(c). (|| Concurrent)
Exercise-1: We will get acquainted with the vector clock algorithm in python. For easy and clear interpretation, we have created a graphical interface by tkinter to display the algorithm. The tkinter package (“Tk interface”) is the standard Python interface to the Tcl/Tk GUI toolkit
- The code for implementing the interface design for vector clock is here. (Code)
$ python vector_clock.py
- Insights
- Define on_click function
x, y = event.x, event.y #No lines have been drawn if not self.lines_drawn: return # Out of range if x<40 or x>self.canvas.winfo_width()-50: return # Event ("L", x, process_number) if self.adding_mode.get()=="L": process_number=self.find_closest(y) self.create_rhombus(x, self.heights[process_number]) self.events.append(("L", x, process_number)) # Event ("M", from_x, to_x, from, to) if self.adding_mode.get()=="M": process_number=self.find_closest(y) self.current_circle=self.create_circle(x, self.heights[process_number]) self.previous_process=process_number self.previous_x=x
- Define on _release function
x, y = event.x, event.y if x<40 or x>self.canvas.winfo_width()-50: return if self.adding_mode.get()=="M": process_number=self.find_closest(y) if process_number==self.previous_process or x<self.previous_x: self.canvas.delete(self.current_circle) return self.create_circle(x, self.heights[process_number]) self.canvas.create_line(self.previous_x, self.heights[self.previous_process], x, self.heights[process_number], arrow=LAST, width=2) self.events.append(("M", self.previous_x, x, self.previous_process, process_number)) self.previous_x=0 self.previous_process=-1 self.current_circle=-1
- Define the calculate_timestamps function
# Calculates the timestamps and returns it somehow # Output #(x, y, [1,2,3]) def calculate_timestamps(self): # Event ("L", x, process_number) # Event ("M", from_x, to_x, from, to) # -> # ("L", x, process_number) # ("S", x, process_number, msg_num) # ("R", x, process_number, msg_num) counter=0 split_events=[] for event in self.events: if event[0]=="M": split_events.append(("S", event[1], event[3], counter)) split_events.append(("R", event[2], event[4], counter)) counter+=1 else: split_events.append(event) sorted_events = sorted(split_events, key = lambda x: x[1]) messages={} current_vectors=np.zeros((self.N_PROCESSES, self.N_PROCESSES), dtype=np.uint8) # Output #(x, process_number, [1,2,3]) timestamps=[] for event in sorted_events: event_type, x, process_number=event[0], event[1], event[2] if event_type=="L": current_vectors[process_number, process_number]+=1 elif event_type=="S": current_vectors[process_number, process_number]+=1 messages[event[3]]=np.array(current_vectors[process_number]) else: current_vectors[process_number, process_number]+=1 received_message=messages[event[3]] current_vectors[process_number]=np.maximum(current_vectors[process_number], received_message) timestamps.append((x, process_number, list(current_vectors[process_number]))) return timestamps
Task 1: Based on "vector_clock.py" in exercise 1, please change the vector clock algorithm to lamport clock algorithm.
- Implementation should achieve a result similar to this screenshot.
Exercise 2: We will get acquainted with the Lamport clock algorithm with gRPC with threads in python. Since logical clocks are simply counters in each thread with resources allocated for different processes, we assume four different processes, A, B, C and D where the resource is granted to process A first before being released.
- Implementing the server code looks like this
- Implementing the client code looks like this
Task 2: From previous seminars, we already understand how to implement some functions with multi-threads by using gRPC. For this task, integrate the lamport or vector clock algorithm into a unique function showing how resources are being allocated to different processes. (You can decide on the number of processes you want).
Deliverables: For each task, please contain the source python file, and also the screenshot of results in terminals. Then put all the file in a single zip.
Relative Links:
- Lamport Timestamp
- Vector clock
- Understanding Lamport & Vector Timestamps with Python’s multiprocessing library
- Implementing an Interface in Python
NOTE: Please watch the video or write via Slack if something is not clear. Please double check the submitted zip files to be sure of complete submission.