TLA Homework
We model the payment system for a ride sharing company. We want customers to be able to depend on us, so even if the credit card transaction fails, we will not cancel the ride. We will deal with it later, but we will limit the amount of outstanding debt. For simplicity, we prevent the user from hailing another ride until the situation is sorted out and the outstanding debt has been cleared.
1. Starting Point
We begin with the following code. See that you understand what we are trying to model here. We first want to make sure the system is safe.
------------------------------ MODULE payment ------------------------------ EXTENDS Integers, TLC (* --algorithm payment variables card_is_valid = TRUE, account_locked = FALSE, account_investigating = FALSE define LimitOutstandingDebt == TRUE (* Replace with own condition. *) may_use_service == TRUE (* Replace with own condition. *) end define (* As time passes, our credit card may expire: *) process time = "time" begin Time: card_is_valid := FALSE; end process (* The user can either use our service or update an invalid card with a new one issued from the bank. *) process user = "user" begin User: while TRUE do either (* If the user's account is clear (you have to define that), they may use our service. *) when may_use_service; TakeRide: (* If the payment fails, we will flag this for investigation. *) if ~card_is_valid then account_investigating := TRUE; end if; or (* If the user has an invalid card, they can get a new card from the bank and add a new payment method. We will then be able to clear any outstanding transactions and unlock their account. *) when ~card_is_valid; GetNewCard: card_is_valid := TRUE; RenewPaymentMethod: account_locked := FALSE; end either; end while end process (* The back office takes a look at failed or suspicious transactions. We only model failures due to expired cards, so the investigation will always conclude that the client was at fault and lock their account. *) process backoffice = "backoffice" begin CheckTransaction: while TRUE do await account_investigating; account_locked := TRUE; account_investigating := FALSE; end while end process end algorithm *) =============================================================================
Take some time to understand the model. We have a global variables that model the state of the card and two variables that model the customer's account status. While both variables are simply global variables here, it is important to keep in mind that whether the card is valid is not something stored in our system. We only find this out when we attempt a bank transfer!
Your first task is to write the condition may_use_service
such that the system satisfies the following requirement: if a user hails a ride with an invalid card, they will not be able to take another ride before having updated their card with a valid one. NB! This check is implemented by our system, so we can't rely on "card_is_valid" for this!
2. Formalizing the safety condition
The above condition states that if we have a failed ride, we will not be taking a ride until we have updated the card.
- This is fairly easy to express using the binary temporal operator until, but TLA does not support binary temporal operators. You may try to google how those are expressed in TLA, but this is a really difficult path to follow.
- The recommended way to do this in TLA is to specify a simple state machine that behaves correct in just this regard and then prove that our real system is an instance of that state machine. This is fairly easy, but outside the scope of this course.
- There is a cheap trick we can use to model that state machine within our code by introducing ghost variables. These are just normal TLA variables, but are not there to model the system. They are just bookkeeping information to allow us to state properties more easily.
You may try the more advanced approaches yourself. But here, let's create a new (ghost) variable that tracks the outstanding debt.
- Add the integer variable
ghost_debt
to the global variables of the spec. - Add code to increase it whenever payment fails and reset it when we update the card.
- Now you can express the safety condition
LimitOutstandingDebt
just as a simple invariant for this variable. - Add the safety condition as an invariant to your model.
- Verify that this condition holds!
- If it holds, you should get a deadlock. You may for now disable deadlock checking to get the safety condition verified.
- If the invariant is violated, fix your
may_use_service
condition to make sure the invariant is not violated. (As this ghost variable is not really part of the system, you should not use that variable in this condition.)
3. Understanding the deadlock!
If everything is correct, you will get a deadlock when you run the TLA checker! (If you disabled deadlock checking in the previous step, don't forget to turn it back on.) This is now the key part of the exercise. What is this deadlock?
- Given that you know a deadlock will happen, try just looking at the spec and figure out why this may occur.
- If you do not see it immediately, look at just the final state in the trace and convince yourself that the poor user is indeed stuck.
- Then figure out how it came to be that the user arrived in that stuck state.
You could finally fix the problem, but the more important point here is to see what goes wrong, so that we can anticipate such scenarios already during system design and analysis. There is a somewhat sensible solution that models what one would realistically have to do.
When the user adds a payment method, we need to make sure pending investigations have concluded and then we can try to use the new payment method to deal with outstanding payments and unlock the account.
RenewPaymentMethod: await ~account_investigating; account_locked := FALSE;
Nevertheless, for these kinds of abstract models, the value is mainly in identifying such problematic use cases ahead of time. The model is so distant from a realistic implementation that there is hardly any point in "fixing" it; however, I can assure you that the problematic use case is very much real. I was personally locked out of my account at a prominent Estonian ride-sharing company for this very reason.
4. Submission
You will submit a PDF document with just the following:
- (1 points) Your definition of
may_use_service
. - (1 points) Your definition of the invariant
LimitOutstandingDebt
. - (2 points) Copy-paste of your algorithm code into the PDF.
- The most important part is a description of the deadlock trace:
- (4 points) A bulleted list of actions that takes us to the stuck state, i.e., the user does this, then the card expires, and so on.
- (2 points) An intuitive explanation of why the user is stuck in that state; that is, what prevents the user from doing anything. (This is just one-two sentences.)
- (2 points) One short paragraph where you reflect on the usefulness/uselessness of TLA based on this exercise. It suffices if you briefly address the following questions:
- Was the deadlock above obvious to you? (This question is not about the TLA model, but rather whether you think you could anticipate this kind of problematic use case without TLA.)
- Do you believe detecting such issues ahead of time is worth it? (Does the time spent modelling and playing around with these toy specifications pay off in terms of discovered corner cases that one should address ahead of time?)
- How high is the technical debt of not using TLA? (Surely, at least three months...)
The paragraph is graded based on honest and good arguments. I am not expecting an essay here; it's better you reflect a bit on your own and just write one-two sentence on the theme of each question.
In order to do the key part of the exercise, you need to have the model and conditions right. You may ask on piazza to make sure you have the correct conditions, so that you analyze the correct trace. We will solve this part together during the labs! There, I will help you with the rest as well. You may help any other students with this first part, even showing them the correct spec. But you may NOT share your explanations of the trace with other students!