III. Performance evaluation measures
Given by Sven Laur
Brief summary: Principles of experiment design. Machine learning as minimisation of future costs. Overview of standard loss functions. Stochastic estimation of future costs by random sampling (Monte-Carlo integration). Theoretical limitations. Standard validation methods: holdout, randomised holdout, cross-validation, leave-one-out, bootstrapping. Advantages and drawbacks of standard validation methods.
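The Monte-Carlo idea mentioned above can be illustrated with a minimal sketch: the expected future cost E[L(y, f(x))] of a fixed predictor is approximated by averaging the loss over fresh random samples. The data distribution and predictor below are illustrative toy choices, not part of the lecture material.

```python
import random

random.seed(0)

def squared_loss(y_true, y_pred):
    """Standard squared-error loss."""
    return (y_true - y_pred) ** 2

def sample_point():
    """Draw (x, y) from a toy distribution: y = 2x + Gaussian noise
    with standard deviation 0.5 (an assumed example model)."""
    x = random.uniform(-1, 1)
    y = 2 * x + random.gauss(0, 0.5)
    return x, y

def monte_carlo_risk(predictor, loss, n_samples=100_000):
    """Estimate the expected future cost E[L(y, f(x))] by averaging
    the loss over fresh random samples (Monte-Carlo integration)."""
    total = 0.0
    for _ in range(n_samples):
        x, y = sample_point()
        total += loss(y, predictor(x))
    return total / n_samples

# For the ideal predictor f(x) = 2x the true risk equals the noise
# variance 0.25, so the estimate should land close to that value.
risk = monte_carlo_risk(lambda x: 2 * x, squared_loss)
print(risk)
```

In practice the data distribution is unknown, which is exactly why the validation methods listed above (holdout, cross-validation, bootstrapping) are needed: they replace fresh samples from the distribution with reuses of the finite training set.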
Slides: PDF
Videos:
Literature:
- Davison and Hinkley: Bootstrap Methods and Their Application
- Molinaro, Simon and Pfeiffer: Prediction Error Estimation: A Comparison of Resampling Methods
- Arlot and Celisse: A survey of cross-validation procedures for model selection
- Efron: Estimating the Error Rate of a Prediction Rule: Improvement on Cross-Validation
- Efron and Tibshirani: Improvements on Cross-Validation: The .632+ Bootstrap Method
- Wolfgang Härdle: Applied Nonparametric Regression: Choosing the smoothing parameter (Chapter 5)
- Yang: Can the Strengths of AIC and BIC Be Shared?
- van Erven, Grünwald and de Rooij: Catching Up Faster by Switching Sooner: A Prequential Solution to the AIC-BIC Dilemma
Complementary exercises:
- Generate data from a simple linear or polynomial regression model, apply various validation methods, and report the results:
  - Did the training method choose the correct model?
  - Are there any differences when the correct model is not feasible?
  - Estimate the bias and variance of the training method
  - Did the validation method correctly estimate the expected loss?
- Try various classification and linear regression methods together with various validation methods and report the results
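A starting point for the first exercise can be sketched as follows. The data-generating model, noise level, and helper names are illustrative assumptions; the sketch fits a simple linear regression by ordinary least squares and estimates its expected squared loss with k-fold cross-validation.

```python
import random

random.seed(1)

def make_data(n):
    """Toy data from the linear model y = 1 + 2x + Gaussian noise
    (standard deviation 0.3); the model is an assumed example."""
    xs = [random.uniform(-1, 1) for _ in range(n)]
    ys = [1 + 2 * x + random.gauss(0, 0.3) for x in xs]
    return xs, ys

def fit_linear(xs, ys):
    """Ordinary least squares for y = a + b*x (closed form)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return lambda x, a=a, b=b: a + b * x

def cv_mse(xs, ys, k=5):
    """k-fold cross-validation estimate of the expected squared loss:
    each fold is held out once while the model is fit on the rest."""
    idx = list(range(len(xs)))
    random.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    total, count = 0.0, 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        f = fit_linear([xs[i] for i in train], [ys[i] for i in train])
        for i in fold:
            total += (ys[i] - f(xs[i])) ** 2
            count += 1
    return total / count

xs, ys = make_data(200)
mse = cv_mse(xs, ys)
print(mse)  # should be close to the noise variance 0.09
```

Swapping `cv_mse` for a holdout or leave-one-out variant, and repeating the whole experiment over many generated datasets, gives the bias and variance comparisons asked for in the exercise.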
Free implementations: