XV. Basics of ensemble methods
Given by Meelis Kull
Brief summary: Bayesian view on model selection. Ensembles as a Monte Carlo integration technique. Committee voting as Bayesian model averaging. Bagging as bootstrapping combined with averaging. Sequential error-correction methods and the idea of data point weighting. The AdaBoost algorithm and its reformulation as a standard minimisation problem with a peculiar cost function. Non-robustness of the AdaBoost algorithm and alternatives. Mixtures of experts and their relation to lazy learning.
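To make the data point weighting idea concrete, here is a minimal from-scratch sketch of discrete AdaBoost with decision stumps (an illustrative toy implementation in Python, not a reference implementation from the course or the listed R packages):

```python
import numpy as np

def best_stump(X, y, w):
    """Weighted-error-minimising single-feature threshold classifier."""
    best = (np.inf, None, None)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = sign * np.where(X[:, j] <= t, 1, -1)
                err = np.sum(w[pred != y])
                if err < best[0]:
                    best = (err, (j, t, sign), pred)
    return best[1], best[2]

def stump_predict(X, stump):
    j, t, sign = stump
    return sign * np.where(X[:, j] <= t, 1, -1)

def adaboost(X, y, n_rounds=20):
    """Discrete AdaBoost; labels y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)              # start with uniform data point weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump, pred = best_stump(X, y, w)
        err = max(np.sum(w[pred != y]), 1e-10)
        if err >= 0.5:                   # weak learner no better than chance
            break
        alpha = 0.5 * np.log((1 - err) / err)   # weight of this weak learner
        w *= np.exp(-alpha * y * pred)   # upweight misclassified points
        w /= w.sum()                     # renormalise to a distribution
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def predict(X, stumps, alphas):
    """Weighted committee vote of the weak learners."""
    score = sum(a * stump_predict(X, s) for s, a in zip(stumps, alphas))
    return np.sign(score)
```

The exponential reweighting step is exactly where the peculiar cost function appears: the update `w *= exp(-alpha * y * pred)` is the gradient-style correction that makes the ensemble minimise exponential loss, and it is also the source of AdaBoost's non-robustness, since mislabelled points receive ever-larger weights.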
Slides: PDF
Video: UTTV(2016)
Literature:
- Bishop: Pattern Recognition and Machine Learning pages 653 - 674
- Hastie, Tibshirani & Friedman: The Elements of Statistical Learning pages 337 - 387
Complementary Exercises:
- Bishop: Pattern Recognition and Machine Learning pages 674-677
- Study the robustness and precision of bagging and boosting on the Spambase dataset with simple tree-based classifiers.
- Study the behaviour of Bayesian Model Averaging for linear models. Interpret the results.
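As a starting point for the bagging exercise, the bootstrap-plus-averaging scheme can be sketched as follows (a toy Python illustration with a stump base learner; for the actual exercise the R packages below, or an off-the-shelf tree learner, would be used):

```python
import numpy as np

def fit_stump(X, y):
    """Base learner: best single-feature threshold (unweighted 0/1 error)."""
    best_err, best = len(y) + 1, None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = sign * np.where(X[:, j] <= t, 1, -1)
                err = np.sum(pred != y)
                if err < best_err:
                    best_err, best = err, (j, t, sign)
    return best

def bagging_fit(X, y, n_models=25, seed=0):
    """Bagging: fit each base model on a bootstrap resample of the data."""
    rng = np.random.default_rng(seed)
    n = len(y)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)   # draw n points with replacement
        models.append(fit_stump(X[idx], y[idx]))
    return models

def bagging_predict(X, models):
    """Committee vote: average the individual ±1 predictions, take the sign."""
    votes = np.zeros(len(X))
    for j, t, sign in models:
        votes += sign * np.where(X[:, j] <= t, 1, -1)
    return np.sign(votes)
```

Averaging over bootstrap resamples mainly reduces variance, which is why bagging tends to be more robust than boosting on noisy data, one of the contrasts the exercise asks to observe.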
Free implementations:
- BMS package for Bayesian Model Averaging in R: bms, topmodels.bms, image
- BMA package for Bayesian Model Averaging in R: bicreg, bic.glm, bic.surv and imageplot.bma
- Ipred package in R: bagging, inbagg and bootest
- Ada package in R: ada and predict.ada
- Gbm package in R: gbm and gbm.perf
- Mboost package in R