### Summary

- All lecture and practice session materials are available on GitHub.
- Some of them may be incomplete. I will let you know which materials are stable.

## How to pass this course

**Lectures**

To pass the course, a student has to participate in most lectures and exercise sessions, solve a sufficient number of home exercises, and do a small course project. There will be one lecture and one exercise session each week. A student will not pass the course if he or she is absent from more than 4 lectures.

**Exercise sessions**

Exercise sessions are held in pairs. Each pair corresponds to a particular topic. Homework is given out in each exercise session. The total number of points offered in the exercises is larger than the maximal number of points you can get:

- Nominal score for each exercise session is **5 points**
- Maximal score for each exercise session is **7.5 points**

Homework is collected after each exercise session pair. After the exercise session pair has ended, you have exactly one week to submit the exercises. The formal deadline is the start of the next exercise session on a new topic.

The solution to the homework is a re-runnable Jupyter notebook that contains the code, graphs, and text describing what you did and how to interpret the results. Each notebook must contain the name of the author and the subtopic in the title. You can pack local Python modules as .py files, provided that the code runs when everything is unpacked into a single directory.

**Grading**

Homework points will be normalized by the total number of homeworks. Additionally, it is possible to get **35 points** by doing an additional course project. The project involves a more complex task, and you have to write a 6-10 page project report.

raw.score = sum.of.homeworks + sum.of.bonuses + project.score

normalized.score = raw.score / nominal.score.of.homeworks * 100

After that, the standard grading scale is used to compute the grade, e.g., 91 and up gives an A.
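The two formulas above can be sketched in Python as follows. This is only an illustration: the function name, the student's points, and the nominal homework total of 140 are assumptions (140 is chosen so that 70 raw points map to a score of 50), not official course values.

```python
def normalized_score(sum_of_homeworks, sum_of_bonuses, project_score,
                     nominal_score_of_homeworks):
    """Apply the two grading formulas above and return the normalized score."""
    raw_score = sum_of_homeworks + sum_of_bonuses + project_score
    return raw_score / nominal_score_of_homeworks * 100


# Assumed nominal homework total, used for illustration only.
NOMINAL_TOTAL = 140

# Hypothetical student: 95 homework points, 8 bonus points, 30 project points.
score = normalized_score(95, 8, 30, NOMINAL_TOTAL)
print(round(score, 1))  # 95.0
```

Because the raw score is divided by the nominal homework total only, bonus and project points can push the normalized score above what the homework alone would give.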

**Exam**

In order to get the grade, the student has to pass the exam. The exam gives no points and does not otherwise contribute to the grade. The result of the exam is just a pass-fail decision. As usual, you can retake the exam. However, retaking it does not change your grade. **If your final score is below 50, i.e., you gathered fewer than 70 points from home exercises and the course project, then you have failed the course.**

## How plagiarism and cheating are handled

Since your grade depends solely on the homework, **we are very strict in these matters**. If we have a suspicion, we inform you and allow you to clarify the issue. If you have cheated or copied your coursemate's work, **then you will get no further points for home exercises and the project work**. As a result, you may fail the course. If you feel that you have been treated unfairly, you can contact the pro-dean.

Ignore this page. It is under construction.

### What is this course about

The course gives an overview of the most common machine learning methods and explains related theoretical concepts from statistics and numerical methods. In particular, we discuss how machine learning should be used so that the obtained results are meaningful and interpretable. For that, we need to discuss some essential aspects of Monte Carlo integration and its connection to various validation methods. In order to understand why some methods perform so well in training but poorly in practice afterwards, we cover the bias-variance problem and show how statistical learning theory can be used in this context. The remainder of the course is dedicated to various machine learning methods. The course is organised into four main blocks.

- Essentials of Machine Learning
    - Decision trees and association rules
    - Linear models and polynomial interpolation
    - Performance evaluation measures
    - Machine learning as an optimisation task
    - Linear classification methods
    - Numeric optimization methods
    - Neural networks and discrete optimisation

- Model-Based Reasoning in Machine Learning
    - Basics of probabilistic modelling
    - Maximum likelihood and maximum a posteriori estimates
    - Model-based clustering techniques
    - Expectation-maximisation and data augmentation algorithm
    - Factor analysis: PCA, LDA and ICA

- Instance-Based Machine Learning Techniques
    - Statistical learning theory
    - Nearest neighbourhood methods
    - Support Vector Machines
    - Other kernel methods

- Ensemble Methods
    - Basics of Ensemble methods
    - Particle filters

The first block introduces the main methodology together with two basic machine learning tasks: classification and prediction. In this block, machine learning is formulated as a minimisation task. That is, any learning algorithm must find a configuration that minimises a certain objective function. The following figure captures the most important mathematical concepts used in the first block and a list of university courses that define or study these concepts. Subjects with a red background are essential, orange subjects are good to know in order to fully appreciate the beauty of mathematical proofs, light green marks subjects that are needed if you want to derive new results, and dark green marks truly advanced treatments or applications.
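To make the idea of learning as minimisation concrete, here is a minimal sketch that fits a one-parameter model `y = w * x` by gradient descent on the mean squared error. The data, the function names, the learning rate, and the number of steps are all made up for illustration and are not taken from the course materials.

```python
def mean_squared_error(w, xs, ys):
    """Objective function: average squared residual of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_slope(xs, ys, learning_rate=0.01, steps=500):
    """Minimise the mean squared error over w with plain gradient descent."""
    w = 0.0
    for _ in range(steps):
        # Derivative of the objective with respect to w.
        gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * gradient
    return w


# Toy data generated by y = 2 * x, so the minimiser should be close to 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = fit_slope(xs, ys)
print(round(w, 4), mean_squared_error(w, xs, ys))
```

Here the "configuration" is the single number `w`, and the learning algorithm is nothing more than a procedure that drives the objective down.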

The second block mostly discusses how to choose a good objective function so that the optimisation procedure yields a reasonable output. It turns out that treating everything as a randomised process is a good way to find an objective function to minimise. Moreover, this lets us naturally embed our background knowledge into the machine learning method.
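As a small illustration of how a probabilistic model yields an objective function, the sketch below assumes Gaussian noise with a known standard deviation. Under that assumption, the negative log-likelihood is a squared-error objective whose minimiser is the sample mean, and adding a Gaussian prior (background knowledge) gives a maximum a posteriori estimate that is pulled towards the prior mean. The data, prior, and function names are hypothetical.

```python
import math


def negative_log_likelihood(mu, data, sigma=1.0):
    """Negative log-likelihood of data under a Gaussian N(mu, sigma^2)."""
    n = len(data)
    return (n * math.log(sigma * math.sqrt(2 * math.pi))
            + sum((x - mu) ** 2 for x in data) / (2 * sigma ** 2))


def mle_mean(data):
    """Maximum likelihood estimate of mu: the sample mean."""
    return sum(data) / len(data)


def map_mean(data, prior_mean, prior_std, sigma=1.0):
    """Maximum a posteriori estimate of mu under a Gaussian prior on mu."""
    data_precision = len(data) / sigma ** 2
    prior_precision = 1 / prior_std ** 2
    weighted_sum = sum(data) / sigma ** 2 + prior_mean * prior_precision
    return weighted_sum / (data_precision + prior_precision)


data = [2.1, 1.9, 2.4, 2.0]  # made-up measurements
mu_mle = mle_mean(data)
mu_map = map_mean(data, prior_mean=0.0, prior_std=1.0)
print(round(mu_mle, 3), round(mu_map, 3))  # 2.1 1.68
```

With only four data points, the prior centred at zero shifts the estimate noticeably. With more data, the two estimates move closer together.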

The third block revisits classification and prediction tasks from a different perspective. Namely, there is an inevitable trade-off between plasticity and stability in machine learning tasks. If we are willing to learn complex models, then we need many data points to adequately fix the model parameters. As a result, the training error could be much smaller than the error of the algorithm on unseen data. This issue is further studied by Statistical Learning Theory, which has led to the development of Support Vector Machines and other kernel methods. In layman's terms, these methods do not try to fit parameters to the data. Instead, they keep a few very informative data points and use them to make future predictions.
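The gap between training error and error on unseen data can be shown with a very plastic instance-based method. The sketch below uses a 1-nearest-neighbour rule rather than a Support Vector Machine, purely because it fits in a few lines. The data set, the two deliberately mislabelled training points, and all names are invented for illustration.

```python
def true_label(x):
    """Hypothetical ground truth: class 1 when x is at least 5."""
    return 1 if x >= 5 else 0


def predict_1nn(train, x):
    """Return the label of the stored training point closest to x."""
    _, nearest_label = min(train, key=lambda point: abs(point[0] - x))
    return nearest_label


def error_rate(train, data):
    """Fraction of points in data that the 1-NN rule misclassifies."""
    mistakes = sum(predict_1nn(train, x) != label for x, label in data)
    return mistakes / len(data)


# Training set with label noise at x = 2 and x = 7.
noisy_points = {2, 7}
train = [(float(x), 1 - true_label(x) if x in noisy_points else true_label(x))
         for x in range(10)]

# Unseen points with correct labels, shifted slightly from the training points.
test = [(x + 0.4, true_label(x + 0.4)) for x in range(10)]

print(error_rate(train, train))  # 0.0
print(error_rate(train, test))   # 0.2
```

The rule memorises the noisy labels perfectly, so the training error is zero, yet every unseen point that falls next to a mislabelled neighbour is misclassified.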

The last block discusses another important issue. Namely, finding a single classifier or predictor often leads to sub-optimal results. A particular model might be correct only in a limited domain or only for a certain sub-population in the data. Hence, better results can be obtained by training a collection of predictors and then outputting a consolidated prediction.
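A consolidated prediction can be as simple as a majority vote. In the sketch below, three hand-written predictors stand in for trained models. Each of them is wrong on a different part of the input range, and the vote corrects all of these mistakes. The predictors, the data, and the ground truth are hypothetical and chosen only to make the effect visible.

```python
def true_label(x):
    """Hypothetical ground truth: class 1 when x is at least 5."""
    return 1 if x >= 5 else 0


# Three imperfect predictors, each wrong on a different sub-population.
def predictor_a(x):
    return 1 if x >= 3 else 0          # wrong for x = 3, 4


def predictor_b(x):
    return 1 if x >= 7 else 0          # wrong for x = 5, 6


def predictor_c(x):
    flipped = x in (0, 9)              # wrong for x = 0, 9
    return 1 - true_label(x) if flipped else true_label(x)


def majority_vote(predictors, x):
    """Output class 1 when more than half of the predictors vote for it."""
    votes = [predict(x) for predict in predictors]
    return 1 if sum(votes) * 2 > len(votes) else 0


def accuracy(predict, xs):
    return sum(predict(x) == true_label(x) for x in xs) / len(xs)


xs = list(range(10))
ensemble = [predictor_a, predictor_b, predictor_c]

for predict in ensemble:
    print(predict.__name__, accuracy(predict, xs))            # 0.8 each
print("ensemble", accuracy(lambda x: majority_vote(ensemble, x), xs))  # 1.0
```

The vote only helps here because the predictors make their mistakes in different places. If all three failed on the same inputs, the ensemble would fail there too.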