### What is this course about

The course gives an overview of the most common machine learning methods and explains the related theoretical concepts from statistics and numerical methods. In particular, we discuss how machine learning should be used so that the obtained results are meaningful and interpretable. For that, we need to discuss some essential aspects of Monte-Carlo integration and its connection to various validation methods. To understand why some methods perform well in training but poorly in practice afterwards, we cover the bias-variance problem and show how statistical learning theory can be used in this context. The remainder of the course is dedicated to various machine learning methods. The course is organised into four main blocks.

- Essentials of Machine Learning
    - Decision trees and association rules
    - Linear models and polynomial interpolation
    - Performance evaluation measures
    - Machine learning as an optimisation task
    - Linear classification methods
    - Numeric optimization methods
    - Neural networks and discrete optimisation

- Model-Based Reasoning in Machine Learning
    - Basics of probabilistic modelling
    - Maximum likelihood and maximum a posteriori estimates
    - Model-based clustering techniques
    - Expectation-maximisation and data augmentation algorithm
    - Factor analysis: PCA, LDA and ICA

- Instance-Based Machine Learning Techniques
    - Statistical learning theory
    - Nearest neighbourhood methods
    - Support Vector Machines
    - Other kernel methods

- Ensemble Methods
    - Basics of Ensemble methods
    - Particle filters

The first block introduces the main methodology together with two basic machine learning tasks: classification and prediction. In this block, machine learning is formulated as a minimisation task. That is, any learning algorithm must find a configuration that minimises a certain objective function. The following figure captures the most important mathematical concepts used in the first block and lists the university courses that define or study these concepts. Subjects with a red background are essential. Orange subjects are good to know in order to fully appreciate the beauty of the mathematical proofs. Light green marks subjects that are needed if you want to derive new results, and dark green marks truly advanced treatments or applications.
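As a rough illustration of learning as minimisation, the sketch below fits a line `y = a*x + b` by minimising the mean squared error with plain gradient descent. The objective function, the toy data, the learning rate and the function names are all assumptions made for this example. The course itself uses GNU R and may formulate the task differently.

```python
def mean_squared_error(a, b, xs, ys):
    """Objective function: average squared gap between a*x + b and y."""
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Find (a, b) that approximately minimise the mean squared error."""
    a, b = 0.0, 0.0  # initial configuration
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the objective with respect to a and b.
        grad_a = sum(2 * (a * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (a * x + b - y) for x, y in zip(xs, ys)) / n
        # Move the configuration a small step against the gradient.
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


# Toy data invented for illustration: points on the line y = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
a, b = fit_line(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}, error = {mean_squared_error(a, b, xs, ys):.2e}")
```

Here the "configuration" is the pair `(a, b)`, and a different choice of objective function would lead to a different learning algorithm.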

The second block mostly discusses how to choose a good objective function so that the optimisation procedure yields a reasonable output. It turns out that treating everything as a randomised process helps to find a good objective function to minimise. Moreover, we can naturally embed our background knowledge into the machine learning method.

The third block revisits classification and prediction tasks from a different perspective. Namely, there is an inevitable trade-off between plasticity and stability in machine learning tasks. If we are willing to learn complex models, then we need many data points to adequately fix the model parameters. As a result, the training error could be much smaller than the error of the algorithm on unseen data. This issue is further studied by Statistical Learning Theory, which has led to the development of Support Vector Machines and other kernel methods. In layman's terms, these methods do not try to fit parameters on the data. Instead, they keep a few very informative data points and use them to make future predictions.
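The gap between training error and error on unseen data can be seen in a small sketch. It uses a 1-nearest-neighbour classifier as a simple stand-in for an instance-based method. It is not a Support Vector Machine, and unlike an SVM it keeps every training point. The data set, including one deliberately mislabelled point, and all function names are invented for illustration.

```python
def predict_1nn(train, x):
    """Return the label of the stored training point closest to x."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]


def error_rate(train, data):
    """Fraction of points in data that the 1-NN rule labels incorrectly."""
    mistakes = sum(predict_1nn(train, x) != label for x, label in data)
    return mistakes / len(data)


# Assumed true rule: x < 5 is class 'a', otherwise class 'b'.
# The training point (3.0, 'b') is deliberately noisy.
train = [(1.0, 'a'), (2.0, 'a'), (3.0, 'b'), (6.0, 'b'), (7.0, 'b'), (8.0, 'b')]
unseen = [(1.5, 'a'), (2.9, 'a'), (3.2, 'a'), (5.5, 'b')]

print("training error:", error_rate(train, train))
print("error on unseen data:", error_rate(train, unseen))
```

The classifier memorises the noisy point, so it makes no mistakes on the training set while mislabelling the unseen points that fall next to it.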

The last block discusses another important issue. Namely, finding a single classifier or predictor often leads to sub-optimal results. A particular model might be correct only in a limited domain or only for a certain sub-population in the data. Hence, better results can be obtained by training a collection of predictors and then outputting a consolidated prediction.
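One simple way to consolidate predictions is a majority vote, sketched below. Majority voting is only one of several possible consolidation schemes and is chosen here as an assumption. The three threshold rules are hypothetical stand-ins for separately trained predictors.

```python
from collections import Counter


def majority_vote(predictors, x):
    """Ask every predictor for a label and return the most common answer."""
    votes = [predict(x) for predict in predictors]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical predictors: each one is a crude rule with its own threshold.
predictors = [
    lambda x: x > 2,
    lambda x: x > 4,
    lambda x: x > 6,
]

for x in (1, 3, 5, 7):
    print(x, [p(x) for p in predictors], "->", majority_vote(predictors, x))
```

With an odd number of two-class predictors a tie cannot occur. Other settings would need an explicit tie-breaking rule.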

### How to pass this course

To pass the course, a student has to participate in most lectures and exercise sessions, solve a sufficient number of homework exercises and do a small course project. There will be one lecture and one exercise session each week. A student who is absent from 7 or more lectures will not pass the course.

Each exercise session is dedicated to a guided solution of a pre-made set of exercises. One exercise is given out as homework in each exercise session. A student must send the solution to the teaching assistant within a week (that is, within 168 hours). After that, the solutions are published and no new solutions are accepted. The homework exercises are practical: to solve an exercise, you need to write a small program in GNU R. We chose this computational environment because it is free and contains many methods for data analysis. If you do not like the language, you can use MATLAB, Scilab, C++ or Python instead. However, in our opinion this would be much harder.

Each homework gives 10 points. Additionally, it is possible to get 50 points by doing an additional course project. This involves a more complex task and you have to write a 6-10 page project report. There are no other ways to get points. The grade for the course is computed by summing all points and then applying the formula:

final score = min(gathered points, 140)/1.4.

After that, the standard grading scale is used to compute the grade, e.g. 91 and up gives A.

In order to get the grade, the student has to pass the exam. The exam gives no points and does not otherwise contribute to the grade. The result of the exam is just a pass-fail decision.
As usual, you can retake the exam. However, this does not change your grade. **If your final score is below 50, i.e., you gathered fewer than 70 points from home exercises and the course project, then you have failed the course**.
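The scoring rule can be checked with a short sketch. The function and constant names are hypothetical, and the point totals in the loop are made-up examples. The sketch covers only the points-based part of the rules, not the exam or the attendance requirement.

```python
MAX_COUNTED_POINTS = 140  # points above this cap do not raise the final score
MIN_POINTS_TO_PASS = 70   # fewer points than this means a final score below 50


def final_score(gathered_points):
    """Final score = min(gathered points, 140) / 1.4."""
    return min(gathered_points, MAX_COUNTED_POINTS) / 1.4


def has_enough_points(gathered_points):
    """Check the points threshold directly to avoid rounding issues."""
    return gathered_points >= MIN_POINTS_TO_PASS


for points in (60, 70, 100, 150):  # hypothetical point totals
    print(f"{points} points -> final score {final_score(points):.1f}, "
          f"enough points: {has_enough_points(points)}")
```

The cap means that any total of 140 points or more maps to the maximum final score of 100.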

### How plagiarism and cheating are handled

Since your grade depends solely on the homework, **we are very strict in these matters**. If we have a suspicion, we inform you and give you a chance to clarify the issue. If you have cheated or copied your coursemate's work, then you **will get no further points for home exercises and the project work**. As a result, you may fail the course. If you feel that you have been treated unfairly, you can contact the pro-dean.