Seminar 10: Distributed Commitment
Goal: Expand the scope of your project by establishing distributed transactions through commitment protocols.
Introduction
In Session 10 of our Distributed Systems course, we turn to distributed commitment, a fundamental mechanism for ensuring the integrity and reliability of our distributed system. As we progress with the execution of orders in our online bookstore scenario, certain services, such as the payment system, notification service, and database, must commit to the execution of each order. This means that a transaction must either succeed entirely, with its changes committed, or fail without any side effects. When a transaction updates data on multiple nodes, either all nodes commit their changes or all of them abort. If any node fails or crashes, the entire transaction must be aborted to maintain consistency across the system.
You'll devise a distributed commitment mechanism to ensure the execution of orders across various services. Your task is to make the executor service coordinate and ensure commitment for each order among multiple components, such as the payment service, notification system, and database module. Each of them must commit to its operations for each order, ensuring data consistency and integrity. Through your design and implementation, you should also navigate the complexities of distributed transactions to achieve a robust order execution workflow. Here are the concrete tasks:
Task
Extend your system with the following functionality and new services:
- New Services: Create one new dummy gRPC service, for instance the payment system. This service does not need any custom logic apart from the distributed commitment protocol described below. You may implement some dummy logic for its service operations (payment execution). This service doesn’t need to be replicated; a single instance of it is enough.
- Commitment: Add functionality to the executor service, the database module, and your new payment system to establish a commitment protocol of your choice, e.g., 2PC, 3PC, or another. The executor service should act as the coordinator and the other two services should be the participants. Implement your commitment protocol and study its trade-offs, especially the number of phases, the number of messages exchanged, and the probability of blocking in specific phases.
- Execution: The goal is to make the participant services commit to their operations for each order. Find a way to encapsulate these operations within the distributed commitment protocol: only after the final commit message is received from the coordinator should these services execute their operations, i.e., the payment system should execute the payment (dummy operation) and the database should update the data (the same operation as in the last practice session).
- Bonus Points: How do we deal with failing participants? Think of a solution for recovering from participants that fail in specific phases of your commitment protocol. Devise and test a simple recovery mechanism in one of the services.
- Bonus Points: What about failure of the coordinator? Analyse the system and work out the consequences of a coordinator failing during the execution of the commitment protocol. Think of a solution for this issue. No implementation is needed, but the points will only be awarded for a good analysis, justification, and solution.
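To make the Commitment and Execution tasks more concrete, here is a minimal in-process sketch of two-phase commit (2PC), one of the protocols mentioned above. It is not a reference solution: the names `Participant`, `Vote`, and `run_two_phase_commit` are hypothetical, plain method calls stand in for the gRPC calls your services would make, and a `crashed` flag simulates an unreachable participant.

```python
from dataclasses import dataclass, field
from enum import Enum


class Vote(Enum):
    COMMIT = "commit"
    ABORT = "abort"


@dataclass
class Participant:
    """Hypothetical stand-in for a participant service (payment, database)."""
    name: str
    can_commit: bool = True   # result of the participant's own (dummy) checks
    crashed: bool = False     # simulates a participant that cannot be reached
    staged: dict = field(default_factory=dict)   # order_id -> pending operation
    applied: list = field(default_factory=list)  # order_ids already executed

    def prepare(self, order_id, operation):
        """Phase 1: stage the operation and vote; nothing is executed yet."""
        if self.crashed:
            raise ConnectionError(f"{self.name} is unreachable")
        if not self.can_commit:
            return Vote.ABORT
        self.staged[order_id] = operation
        return Vote.COMMIT

    def commit(self, order_id):
        """Phase 2: execute the staged operation after the commit decision."""
        operation = self.staged.pop(order_id)
        operation()
        self.applied.append(order_id)

    def abort(self, order_id):
        """Phase 2: drop the staged operation, leaving no side effects."""
        self.staged.pop(order_id, None)


def run_two_phase_commit(order_id, work):
    """Coordinator logic. `work` is a list of (participant, operation) pairs.

    Returns True if the order was committed everywhere, False if aborted.
    """
    # Phase 1: collect votes; an unreachable participant counts as an abort vote.
    votes = []
    for participant, operation in work:
        try:
            votes.append(participant.prepare(order_id, operation))
        except ConnectionError:
            votes.append(Vote.ABORT)

    # Phase 2: commit only if every participant voted to commit.
    if all(vote is Vote.COMMIT for vote in votes):
        for participant, _ in work:
            participant.commit(order_id)
        return True
    for participant, _ in work:
        # Local cleanup only; a real system would retry until reachable.
        participant.abort(order_id)
    return False


if __name__ == "__main__":
    ledger, stock = [], {"book-1": 5}
    payment, database = Participant("payment"), Participant("database")
    committed = run_two_phase_commit("order-1", [
        (payment, lambda: ledger.append("order-1 paid")),
        (database, lambda: stock.update({"book-1": stock["book-1"] - 1})),
    ])
    print(committed, ledger, stock)
```

In this sketch the coordinator contacts each participant twice on the commit path, so with n participants it makes 2n calls (4n messages if replies are counted). The sketch deliberately leaves out the parts the bonus tasks ask about: it keeps no durable log, so a participant that crashes after voting to commit cannot recover its staged operation, and a participant that has voted to commit would be blocked if the coordinator failed before sending its decision.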