- All lectures will be held in the classroom. I will make the recordings available on UTTV
- All lecture and practice session materials are available on GitHub
- Some of them may be in an incomplete state. I will let you know which materials are stable
What is this course about
The course gives an overview of the most common machine learning methods and explains the related theoretical concepts from statistics and numerical methods. In particular, we discuss how machine learning should be used so that the obtained results are meaningful and interpretable. For that, we need to discuss some essential aspects of Monte-Carlo integration and its connection to various validation methods. To understand why some methods perform so well in training but poorly in practice afterwards, we cover the bias-variance problem and show how statistical learning theory can be used in this context. The remainder of the course is dedicated to various machine learning methods. The course is organised into three main blocks.
- Alternative probabilistic view on standard methods
- Performance evaluation measures
- Rudiments of Statistical Learning Theory
- Linear models and polynomial interpolation
- Linear classification methods
- Model-Based Reasoning in Machine Learning
- Basics of probabilistic modelling
- Maximum likelihood and maximum a posteriori estimates
- Model-based clustering techniques
- Expectation-maximisation and data augmentation algorithms
- Factor analysis: PCA, LDA and ICA
- Techniques for filtering and smoothing
- Standard sequence models: Markov Chains and Hidden Markov Models
- Particle filters as a Monte-Carlo approximation to integration problems
- Standard grid models: Markov Random Fields and Conditional Random Fields
- Ensemble methods as Monte-Carlo integration over the posterior
How to pass this course
To pass the course, a student has to solve a sufficient number of homework exercises. Additional points can be gained by doing a small course project. There will be one lecture and one exercise session each week. Homework must be submitted every second week.
Exercise sessions are held in pairs. Each pair corresponds to a particular topic. Homework is given out in each exercise session. The exercises in a session are worth more points in total than the maximal score you can receive for that session:
- Nominal score for each exercise session is 5 points
- Maximal score for each exercise session is 7.5 points
The homework will be collected after each exercise session pair. After the exercise session pair has ended, you have exactly one week to submit the exercises. The formal deadline is the start of the next exercise session on a new topic.
Presence in the lectures and homework: If you are not present in the lectures corresponding to a particular topic, then you need to score at least two points in the corresponding homework.
How to format homework submissions: The main solution to the tasks is a re-runnable Jupyter notebook (note, singular!) that contains code, graphs, and text describing what you did and how to interpret the results. The notebook must contain the name of the author. Organise your notebook into sections by task origin notebook (e.g. for HW1, the first section would be 01_convergence). You can pack local Python modules as .py files, provided that the code runs when everything is unpacked into a single directory. Remove all tasks, code and text that are not part of your solution (i.e. tasks you didn't solve, long introductory texts). At the start of the notebook, add a list of all the tasks you solved or attempted to solve.
Use the following checklist to verify that your submission meets the criteria.
Checklist for homework submission:
- Output is 1 Jupyter notebook + accompanying Python files (ZIP them if needed!)
- The submission notebook contains your name at the start
- The random seed is fixed to guarantee that computations are repeatable! Set the seed only once per file. An all-zeros seed is OK, but you can pick your favourite
- Restart & Run-All should run the notebook properly (important!)
- At the beginning of the notebook, there is a list of tasks which you (partly) completed
- When a task is incomplete, state what remains undone at the task.
- Unnecessary text and code are removed (e.g. sample practical text and code at the start), what remains are homework solutions with accompanying text. Keep the question text for completeness.
- Results and plots have accompanying text on how you interpret them
- Notebook is organized into sections by task origin notebook (e.g. first section is 01_convergence.ipynb)
- Python code alone is never a solution. You need to describe what you do and interpret the results. Do not lie! If you get bizarre or unexpected results, then admit it. A bug in the code can be a minor problem, while failing to recognise that something is wrong is a major issue.
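The seed-related checklist item can be satisfied with a single cell near the top of the notebook. The following is only a minimal sketch: the helper name `fix_seeds` and the constant `SEED` are illustrative and not part of the course materials, and it assumes your solution uses Python's `random` module and NumPy. Other libraries have their own seeding calls.

```python
import random

import numpy as np

SEED = 0  # an all-zeros seed is fine; any fixed value works


def fix_seeds(seed: int = SEED) -> np.random.Generator:
    """Seed the random sources used in the notebook (hypothetical helper)."""
    random.seed(seed)     # Python's built-in generator
    np.random.seed(seed)  # NumPy's legacy global generator
    # An explicit generator for new code that takes an rng argument
    return np.random.default_rng(seed)


# Call this once, at the start of the notebook, and not again later.
rng = fix_seeds()
```

With the seed set in one place, Restart & Run-All should produce the same numbers and plots on every run, provided the cells are executed in order.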
Try to follow the format of the example here: Sample submission
Homework points will be normalized by the total number of homeworks. Additionally, it is possible to get 20 points by doing an additional course project. This involves a more complex task and you have to write a 6-10 page project report.
- normalised_homework_score = (sum_of_homeworks + sum_of_bonuses) / nominal_score_of_homeworks * 100
- final_score = normalised_homework_score + project_score
After that, the standard grading scale is used to compute the grade, e.g. 91 and up gives A.
To get the grade, the student has to pass the exam. The exam gives no points and does not otherwise contribute to the grade. The result of the exam is just a pass-fail decision. As usual, you can retake the exam. However, this does not change your grade. If your final score is below 50, you have failed the course.
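The scoring formulas above can be checked with a few lines of Python. This is a sketch under assumptions: the function names and the example numbers are hypothetical, bonuses are treated as a separate list of points, and `nominal_total` stands for `nominal_score_of_homeworks`, whose exact value depends on how many homeworks are given.

```python
def final_score(homework_points, bonus_points, nominal_total, project_score=0.0):
    """Combine homework, bonus and project points as in the formulas above."""
    # normalised_homework_score =
    #     (sum_of_homeworks + sum_of_bonuses) / nominal_score_of_homeworks * 100
    normalised = (sum(homework_points) + sum(bonus_points)) / nominal_total * 100
    # final_score = normalised_homework_score + project_score
    return normalised + project_score


def is_failing(score):
    """A final score below 50 means the course is failed."""
    return score < 50


# Hypothetical student: four homeworks with an assumed nominal total of 40 points.
score = final_score([8, 10, 9, 7], [1, 0, 2, 0], nominal_total=40)
print(score, is_failing(score))
```

Note that the exam is a separate pass-fail requirement and is deliberately not part of this calculation.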
How plagiarism and cheating are handled
Since your grade depends solely on the homework, we are very strict in these matters.
- If we have a suspicion, then we inform you and allow you to clarify issues.
- If you have cheated or copied your coursemate's work, then you will get no further points.
- As a result, you may fail the course or get a low grade.
- If you feel that you have been unfairly treated, then you can contact the pro dean.