MTAT.05.113 Bayesian Networks
MTAT.05.113 Bayesi võrgud
- All seminars are held on Wednesdays 12-14 in Liivi 402.
- The seminar starts at the beginning of the spring term (February 11th) by an introductory lecture.
- All other presentations are given by students who cover the topics from the course book.
- The seminar ends with a project, where students must demonstrate their expertise in graphical modelling.
- Lectures and seminar sessions are held in English or in Estonian depending on circumstances.
- Final reports should be written preferably in English.
Requirements
There are no formal prerequisites to the seminar. However, basic knowledge in probability and graph theory is advisable. Also, one needs reasonable English skills to complete the course report.
If the formal requirements of
the
ÕIS do not permit registration then write me an email
or talk with me. After that we decide whether to enrol you
or not. Secondly, active participation in the course is
absolute necessity. You must participate in most seminars
or otherwise you do not pass the seminar. Namely,
student gets grade F if he or she misses 4 or more
seminars.
In reasonable circumstances, it is possible
to compensate missed seminars by extra work. Details are
determined by individual agreements with the lecturer.
The seminar grade is determined by the final research report and the corresponding discussion with the student. The written report gives a base level grade and the discussion can either increase or decrease the base level grade by one grade point. As usual grades range from A to F.
What and Why
The course aims to give a comprehensive overview of Bayesian models starting from the basic computational rules for propagating uncertainty and ending with network inference methods. Additionally, the course considers the elements of decision theory and connections with Bayesian networks. As a concrete example, note that attack trees (certain type of Bayesian networks) together with expected gains can be used to assess whether plausible attacks are beneficial. Generally, Bayesian networks are often used to formalise and harmonise human expertise on well-specified domains such as medicine and insurance. However, another interesting application is automatic inference and visualisation of complex relations between different processes or states. For instance, various graphical models are used to characterise gene regulation in biology.
Topics and Dates
- Interpretations of Probabilities (11 February,
Slides)
Frequentism. Objective and subjective Bayesianism. Dutch book arguments. Kolmogorov's interpretation agnostic probability calculi.
- F. V. Jensen and T. D. Nielsen. Bayesian Networks and Decision Graphs, p. 1-16
- E. T. Jaynes. Probability Theory: The Logic of Science, p. 3-21, 651-655
- S. Arnborg and G. Sjödin. On the foundations of Bayesianism
- Construction of Bayesian Networks (25 February,
Slides)
Graph theoretical formalisation. Markov blankets and d-separation. Chain-rule and evidence propagation. Formalisation of prior knowledge: divorcing and noisy functional dependencies. Compaction techniques: templates and dynamic Bayesian networks.
- F. V. Jensen and T. D. Nielsen. Bayesian Networks and Decision Graphs, p. 23-108
- Inference and Belief Propagation (11 March, 18 March,
Slides
Slides
Slides)
Graph theoretical formalisation. Potentials. Domain graphs. Triangulation. Join and junction trees. Evidence propagation. Triangulation of dynamic Bayesian networks. Time-space-precision trade-offs: recursive conditioning, stochastic simulation, Gibbs sampling, loopy belief propagation.
- F. V. Jensen and T. D. Nielsen. Bayesian Networks and Decision Graphs, p. 108-166
- M. I. Jordan. Learning in Graphical Models, p. 9-50, 51-104, 175-204
- Analysis of Bayesian Networks (25 March
Slides
Slides)
Saturated junction trees. Configurations of maximal probability. Data conflicts. Sensitivity analysis: sensitivity to evidence and sensitivity to parameters.
- F. V. Jensen and T. D. Nielsen. Bayesian Networks and Decision Graphs, p. 167-193
- Parameter Estimation (1 April
Slides)
Maximum likelihood estimation. Regularisation via Bayesian estimation. Latent variables and expectation-maximisation algorithm. Adaptation and tuning.
- F. V. Jensen and T. D. Nielsen. Bayesian Networks and Decision Graphs, p. 195-228
- M. I. Jordan. Learning in Graphical Models, p. 301-354, 521-544, 575-598
- Network Structure Estimation (8 April
Slides
Slides
Slides)
Combinatorial explosion problem. Constraint-based learning. Score based learning. Bayesian Information Criterion. Minimum Description Length principle. Greedy and stochastic search. Local structure and independence tests.
- F. V. Jensen and T. D. Nielsen. Bayesian Networks and Decision Graphs, p. 229-264
- M. I. Jordan. Learning in Graphical Models, p. 421-460, 555-574
- Bayesian Networks as Classifiers (15 April
Slides)
Naive Bayes classifier. Tree augmented naive Bayes classifiers. Common algorithms for constructing decision trees. Pruning.
- F. V. Jensen and T. D. Nielsen. Bayesian Networks and Decision Graphs, p. 265-276
- N. Friedman, D. Geiger, M. Goldszmidth. Bayesian Network Classifiers
- J. R. Quinlan. Induction of Decision Trees
- Elements of Decision Theory (22 April, 29 April
Slides
Slides)
)
Mathematical formalisation of utility. Rational behaviour. Decision trees. Combinatorial explosion problem. Tree compaction. Influence diagrams. Repetitive decisions. Asymmetrical decision problems. Unconstrained influence diagrams and corresponding strategies. Decision problems with unbounded time horizons.
- F. V. Jensen and T. D. Nielsen. Bayesian Networks and Decision Graphs, p. 279-342
- Optimal Strategies and Ways to Find Them (6 May, 13 May
Slides
Slides)
Strategies and expected utilities. Iterative utility maximisation over decision trees. Variable elimination and strong junction trees. Iterative simplification of influence diagrams. Arc reversal. Optimal strategies for unconstrained influence diagrams. Troubleshooting strategies. Solutions to decision problems with unbounded time horizon: solutions for limited and unlimited memory.
- F. V. Jensen and T. D. Nielsen. Bayesian Networks and Decision Graphs, p. 343-405
- Some Properties of Decision Problems(20 May
Slides
Slides
Slides)
Costs and gains of between being informed. Relevant and irrelevant observations in a decision problem. Sensitivity analysis.
- F. V. Jensen and T. D. Nielsen. Bayesian Networks and Decision Graphs, p. 407-428
Course Books
- Finn V. Jensen. Bayesian Networks and Decision Graphs. Springer
- David Edwards Introduction to Graphical Modelling. Springer
- M. I. Jordan Learning in Graphical Models. MIT Press. 1998.
Suggested software
- Bayes Net Toolbox for MATLAB
- Bayesian Network tools in Java (abandonware)
- Windows only binary installation (fails under Linux and Mac due to a different windowing model in SWT)
- Original Java source code
- Java source code with bugfixes (to appear)
- Commercial programs
Assignment Topics
- Analysis of Deal or No Deal Game
- Reconstruction of Gene Regulation Networks
- Bayesian Networks with Continius Distributions
- Application of Bayesian Networks in Natural Language Processing.
(Tutor: Mark Fishel)
List of potential topics- Ontologies as Bayesian Networks. Document or word classification using evidence propagation. Parameter estimation using annotted texts.
- Language rules as Bayesian Networks. Parameter estimation and discovery of data or rule conflicts using annotated text.
- Other similar topics that use Bayesian networks and produce both theoretical and practical outcome.
- Connection between attack trees and descision theory.
(Tutor: Aivo Jürgenson and Margus Niitsoo)
Main topic to be covered- Representation of attack trees as decision graphs.
- Derivation of optimal attack strategy.
- Sensitivity analysis. When is the attack tree instable?
- Any other topic that is connected to Bayesian Networks or decision theory
Approximate Requirements for Course Reports
The main aim of the course report is to measure how well students can apply theoretical knowledge obtained in the course for solving practical tasks. As a second factor, I will also take into account coherence and readability of the report. I will use mostly standard criteria for evaluation. In particular, the report should start with a reasonable problem statement and followed by the main ideas how did you try to solve it with Bayesian networks or influence diagrams. Generally, I would prefer to get a solution or a working prototype as an end result. However, as some problems are more difficult and ambitious than others, I also satisfied with a failure or sub-optimal solution provided that you provide full analysis why the approach you chose did not work or why the problem statement is impossible to solve. In any case, you have to complete the project form beginning to the start and document it according to scientific standards. Most importantly, the results should be in principle re-obtainable for the person who reads this report and the reasons why you chose a certain approach or trade-off should be documented. If for some reason the input data is not public you should still reference its origin. Finally, keep in mind that it is team project and a team gets the same grade unless there are really strong reasons to grade differently.
Assignmets
- Kaur Alasoo, Aleksandr Tkatsenko, Jüri Reimand. Gene Regulation
- Mihkel Pajusalu, Roland Pihlakas. Bayesian Networks with Continius Distributions
- Margus Niitsoo and Aivo Jürgenson. Attack Trees and Their Connection to Influence Diagrams
- Liina Kamm and Konstantin Tretjakov. How to play Texas Holdem without Loosing Too Much
- Krista Liin and Siim Orasmaa. Lausepiiride tuvastamine mittekirjakeelses tekstis
- Raivo Kolde ja Taavi Tammkivi. Text-based Analysis of Usernames as a Mean to Detect Spammers
- Meelis Kull and Laur Tooming. Gene Regulation?