Information
- Lecturer : Kalmer Apinis (kalmer.apinis@ut.ee)
- Seminars : Wednesdays, 12:15-14:45, Δ-1022, Zoom (log into courses to see link)
- Homework : here, submit to moodle
- Discussion : Fleep forum (log into courses to see link)
- Final exam:
- TBD
- open book e-exam (including the internet) + face-to-face Q&A if needed
- example
- As an alternative to the exam you may attempt to prove something bigger on your own!
- Final grade
- max. 70 points for autograded HW (less if incorrect or submited late)
- max. 30 points for autograded exam+Q&A
- 0..50 → F, 51..60 → E, 61..70 → D, 71..80 → C, 81..90 → B, 91..100 → A
Book recommendation
- Programming Language Foundations by Benjamin C. Pierce et al
- Logical Foundations by Benjamin C. Pierce et al
- The Formal Semantics of Programming Languages: An Introduction, by Glynn Winskel
- Types and Programming Languages, by Benjamin C. Pierce
- Practical Foundations for Programming Languages, by Robert Harper
Preparation for the course
- We will be using Coq together with CoqIDE.
- Install Coq version >=8.13.
- In Windows and OS X I suggest downloading the latest Coq Platform from https://coq.inria.fr/download
- In Linux you can probably use a package manager.
Foundations of Programming Languages (course developed at UPenn)
In this course, our goal is to investigate
- formal techniques for reasoning about the properties of specific programs (e.g., the fact that a sorting function or a compiler obeys some formal specification); and
- the use of type systems for establishing well-behavedness guarantees for all programs in a given programming language (e.g., the fact that well-typed Java programs cannot be subverted at runtime).
All formalities are done within the Coq proof assistant.
What are proof assistants?
A proof assistant such as Coq is a system where you can write propositions (such as "A ⋀ True → A") and prove them interactively. Proving something interactively means that the system keeps track of your goals as you apply logic rules and proof tactics. In the background, the system checks that the rules you have chosen are applicable and prohibits you to use rules that make no sense. This means that Coq proofs are machine-checked.
In this course, we use the Coq system, but there are other similar tools. Coq can be used to formalize mathematics in general, but it can also be used for proving the correctness of programs --- this fact makes Coq interesting for computer scientists.
Coq is not a toy or just a program that teaches students about logic. It is a mature tool that has 30 years of research behind it, which is actively used for formalizing complicated proofs. Wikipedia lists its main achievements as
- Four color theorem: formal proof using Coq was completed in September 2004.
- Disjoint-set data structure: correctness proof in Coq was published in 2007.
- Feit–Thompson theorem: formal proof using Coq was completed in September 2012.
- CompCert: an optimizing compiler for a subset of the C programming language which is fully programmed and proved in Coq.
For the first part of the course, we learn how to use Coq for interactive theorem proving.
Program Verification
In the second part of the course, we introduce two broad topics of critical importance in building reliable software (and hardware): techniques for proving specific properties of particular programs and for proving general properties of whole programming languages. For both of these, the first thing we need is a way of representing programs as mathematical objects, so we can talk about them precisely, plus ways of describing their behavior in terms of mathematical functions or relations. Our main tools for these tasks are abstract syntax and operational semantics, a method of specifying programming languages by writing abstract interpreters. In the beginning, we work with operational semantics in the so-called "big-step" style, which leads to simple and readable definitions when it is applicable. Later on, we switch to a lower-level "small-step" style, which helps make some useful distinctions (e.g., between different sorts of nonterminating program behaviors) and which is applicable to a broader range of language features, including concurrency. The first programming language we consider in detail is Imp, a tiny toy language capturing the core features of conventional imperative programming: variables, assignment, conditionals, and loops. We study two different ways of reasoning about the properties of Imp programs. First, we consider what it means to say that two Imp programs are equivalent in the intuitive sense that they exhibit the same behavior when started in any initial memory state. This notion of equivalence then becomes a criterion for judging the correctness of metaprograms -- programs that manipulate other programs, such as compilers and optimizers. We build a simple optimizer for Imp and prove that it is correct. Second, we develop a methodology for proving that a given Imp program satisfies some formal specifications of its behavior. We introduce the notion of Hoare triples -- Imp programs annotated with pre-, and post-conditions describing what they expect to be true about the memory in which they are started and what they promise to make true about the memory in which they terminate -- and the reasoning principles of Hoare Logic, a domain-specific logic specialized for convenient compositional reasoning about imperative programs, with concepts like "loop invariant" built-in. This part of the course is intended to give readers a taste of the key ideas and mathematical tools used in a wide variety of real-world software and hardware verification tasks.
Type Systems
Our third major topic is type systems -- powerful tools for establishing properties of all programs in a given language. Type systems are the best established and most popular example of a highly successful class of formal verification techniques known as lightweight formal methods. These are reasoning techniques of modest power -- modest enough that automatic checkers can be built into compilers, linkers, or program analyzers and thus be applied even by programmers unfamiliar with the underlying theories. Other examples of lightweight formal methods include hardware and software model checkers, contract checkers, and run-time monitoring techniques. This also completes a full circle with the beginning of the book: the language whose properties we study in this part, the simply-typed lambda-calculus, is essentially a simplified model of the core of Coq itself!
Forms of learning
Besides lectures, you will learn by reading Logical Foundations and Programming Language Foundations and solving exercises in Coq. There will be a Fleep chat or some other online forum for cases where you get stuck with details.
The homework deadlines will have quite strict deadlines as extra motivation to not fall behind too much. This will allow you to help each other.