## Summary

The seminar loosely follows the Coursera course *Neural Networks for Machine Learning* by Geoffrey Hinton. Most of the topics correspond to specific lectures in the course.

Each seminar participant is expected to present at least one topic. Since all of you have different backgrounds, it is advisable to form groups of 2-3 people and work through 2-3 seminar topics together.

## Administrative details

- Date and location: Tuesdays 16:15-18:00 @ Liivi 2-512
- Contact: aleksandr.tkatsenko@ut.ee
- To pass the course you need to:
  - present at least one topic, and
  - participate in the seminars (more than 3 missed seminars results in a Fail).
- Since we have more participants than lectures, you can alternatively complete all 4 programming assignments at https://class.coursera.org/neuralnets-2012-001/quiz and send me working solutions by the 10th of December.
- If you miss a seminar, you need to solve the corresponding lecture quiz at https://class.coursera.org/neuralnets-2012-001/quiz and send me the correct answers within two weeks.

## List of participants

- Ilya Verenich
- Kaspar Märtens
- Alexander Tkachenko
- Taivo Käsper
- Fanny-Dhelia Pajuste
- Andres Viikmaa
- Ardi Tampuu
- Mari-Liis Kruup
- Tambet Matiisen
- Ilya Kuzovkin
- Anti Alman
- Pihel Saatmann
- Karl-Oskar Masing
- Hans-Peeter Tulmin

## Presentation schedule

### Lecture 0: What is deep learning and what are neural networks

**Presenter:** Alexander Tkachenko
**Date:** 9th of September @ 16:15 Liivi 2-512
**Slides:** PDF
**Video:** UTTV
**Additional Material:**
- Deep Networks for NLP: Slides

### Lecture 1: The Perceptron learning procedure

**Presenter:** Alexander Tkachenko
**Date:** 16th of September
**Slides:** PDF
**Video:** UTTV
**Summary:**
- Types of neural network architectures
- Perceptrons: The first generation of neural networks
- A geometrical view of perceptrons
- Why the learning works
- What perceptrons can't do
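The perceptron learning procedure from this lecture fits in a few lines. A minimal sketch, learning the logical AND function (the task, learning rate, and epoch count are illustrative, not from the lecture):

```python
def predict(w, b, x):
    """Binary threshold unit: fire if the weighted input exceeds zero."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def train(samples, epochs=20, lr=1.0):
    """Perceptron learning procedure: on a false negative add the input
    vector to the weights, on a false positive subtract it."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(w, b, x)
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error
    return w, b

# AND is linearly separable, so the procedure is guaranteed to converge.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train(data)
```

XOR, by contrast, is not linearly separable, which is exactly the "What perceptrons can't do" point above: the same procedure would cycle forever on it.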

### Lecture 2: The backpropagation learning procedure

**Presenter:** Anti Alman
**Date:** 23rd of September
**Slides:** PDF
**Summary:**
- Learning the weights of a linear neuron
- The error surface for a linear neuron
- Learning the weights of a logistic output neuron
- The backpropagation algorithm
- Learning representations by back-propagating errors (paper on the backpropagation algorithm)
- Using the derivatives computed by backpropagation

**Additional material:**
- UFLDL Tutorial: Backpropagation Algorithm
- Neural Networks and Deep Learning, by Michael Nielsen. Chapter 2: How the backpropagation algorithm works
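A compact way to see the algorithm is a hand-written backward pass checked against finite differences. A sketch for a tiny 2-2-1 network with logistic units and squared error (the network size and weight layout are illustrative):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, x):
    """Tiny 2-2-1 network; w is a flat list of 9 weights:
    4 hidden weights, 2 hidden biases, 2 output weights, 1 output bias."""
    h1 = sigmoid(w[0] * x[0] + w[1] * x[1] + w[4])
    h2 = sigmoid(w[2] * x[0] + w[3] * x[1] + w[5])
    y = sigmoid(w[6] * h1 + w[7] * h2 + w[8])
    return h1, h2, y

def backprop(w, x, target):
    """Return dE/dw for squared error E = (y - target)^2 / 2."""
    h1, h2, y = forward(w, x)
    dy = (y - target) * y * (1 - y)        # delta at the output unit
    dh1 = dy * w[6] * h1 * (1 - h1)        # deltas propagated backwards
    dh2 = dy * w[7] * h2 * (1 - h2)        # through the output weights
    return [dh1 * x[0], dh1 * x[1], dh2 * x[0], dh2 * x[1],
            dh1, dh2, dy * h1, dy * h2, dy]

def numeric_grad(w, x, target, eps=1e-6):
    """Central finite differences: a standard sanity check for backprop."""
    grads = []
    for i in range(len(w)):
        wp, wm = list(w), list(w)
        wp[i] += eps
        wm[i] -= eps
        ep = (forward(wp, x)[2] - target) ** 2 / 2
        em = (forward(wm, x)[2] - target) ** 2 / 2
        grads.append((ep - em) / (2 * eps))
    return grads
```

The analytic and numeric gradients should agree to several decimal places; this check is the usual first debugging step when implementing backpropagation by hand.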

### Lecture 3: Learning feature vectors for words

**Presenter:** Ilya Verenich
**Date:** 30th of September
**Slides:** PDF
**Summary:**
- Learning to predict the next word
- A brief diversion into cognitive science
- Another diversion: The softmax output function
- Neuro-probabilistic language models
- Ways to deal with the large number of possible outputs

**Additional material:**
- A Neural Probabilistic Language Model, Bengio et al.: Paper
- UFLDL Tutorial: Softmax Regression
- word2vec: an efficient implementation for learning vector representations of words, which can then be used as input for other algorithms, e.g. word prediction. Both the learning tool and pre-trained vector sets are available. Slides: https://docs.google.com/file/d/0B7XkCwpI5KDYRWRnd1RzWXQ2TWc/edit
- Paper on generating text character-by-character with recurrent neural networks: http://www.icml-2011.org/papers/524_icmlpaper.pdf
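The softmax output function from this lecture turns a vector of scores into a probability distribution. A minimal sketch (the scores are illustrative), including the usual max-subtraction trick that avoids overflow in `exp`:

```python
import math

def softmax(scores):
    """exp(s_i) / sum_j exp(s_j), computed stably by first subtracting
    the maximum score (which leaves the result unchanged)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
```

The outputs are positive, sum to one, and preserve the ordering of the scores, which is what makes softmax a natural output layer for next-word prediction over a vocabulary.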

### Lecture 4: Object recognition with neural nets

**Presenter:** Fanny-Dhelia Pajuste and Mari-Liis Kruup
**Date:** 7th of October
**Slides:** PDF
**Summary:**
- Why object recognition is difficult
- Achieving viewpoint invariance
- Convolutional nets for digit recognition
- Convolutional nets for object recognition

**Additional material:**
- UFLDL Tutorial: Working with Large Images

### Lecture 5: Optimization: How to make the learning go faster

**Presenter:** Hans-Peeter Tulmin
**Date:** 14th of October
**Slides:** PDF
**Summary:**
- Overview of mini-batch gradient descent
- A bag of tricks for mini-batch gradient descent
- The momentum method
- Adaptive learning rates for each connection
- Rmsprop: Divide the gradient by a running average of its recent magnitude
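The momentum and rmsprop update rules from this lecture can be sketched side by side on a toy 1-D quadratic f(w) = w² (gradient 2w); the hyperparameters below are illustrative defaults, not values from the lecture:

```python
def momentum_steps(w, steps=100, lr=0.1, mu=0.9):
    """Momentum: the velocity accumulates a decaying sum of past
    gradients, smoothing oscillations along steep directions."""
    v = 0.0
    for _ in range(steps):
        g = 2 * w            # gradient of f(w) = w^2
        v = mu * v - lr * g
        w += v
    return w

def rmsprop_steps(w, steps=100, lr=0.1, decay=0.9, eps=1e-8):
    """Rmsprop: divide the gradient by a running average of its
    recent magnitude, so the effective step size adapts per weight."""
    ms = 0.0
    for _ in range(steps):
        g = 2 * w
        ms = decay * ms + (1 - decay) * g * g
        w -= lr * g / (ms ** 0.5 + eps)
    return w
```

Both rules drive w toward the minimum at 0; in a real network `w`, `v`, and `ms` are per-parameter arrays, and rmsprop's division keeps the update magnitude roughly `lr` regardless of how large or small the raw gradient is.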

### Lecture 6: Recurrent neural networks

**Presenter:** Karl-Oskar Masing and Kaspar Märtens
**Date:** 21st of October
**Slides:** PDF
**Summary:**
- Modeling sequences: A brief overview
- Training RNNs with back propagation
- A toy example of training an RNN
- Why it is difficult to train an RNN
- Long Short-Term Memory

### Lecture 7: More recurrent neural networks

**Presenter:** Karl-Oskar Masing and Kaspar Märtens
**Date:** 28th of October
**Slides:** PDF
**Summary:**
- A brief overview of Hessian-Free optimization
- Lecture 8 slides (PDF) for the overview of Hessian-Free optimization
- Modeling character strings with multiplicative connections
- Learning to predict the next character using HF
- Echo State Networks

### Lecture 8: Ways to make neural networks generalize better

**Presenter:** Pihel Saatmann
**Date:** 4th of November
**Slides:** PDF
**Summary:**
- Overview of ways to improve generalization
- Limiting the size of the weights
- Using noise as a regularizer
- Introduction to the full Bayesian approach
- The Bayesian interpretation of weight decay
- MacKay's quick and dirty method of setting weight costs

### Lecture 9: Combining multiple neural networks to improve generalization

**Presenter:** Andres Viikmaa
**Date:** 11th of November
**Slides:** PDF
**Summary:**
- Why it helps to combine models
- Mixtures of Experts
- The idea of full Bayesian learning
- Making full Bayesian learning practical
- Dropout
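Dropout, the last item in this lecture, is simple enough to sketch directly. The version below is "inverted" dropout, a common variant that scales surviving activations at training time instead of halving the weights at test time as in the lecture; the function name and parameters are illustrative:

```python
import random

def dropout_forward(activations, p_drop=0.5, rng=None, train=True):
    """Inverted dropout: zero each unit with probability p_drop during
    training and scale survivors by 1/(1 - p_drop), so the expected
    activation is unchanged and test time needs no rescaling."""
    if not train:
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```

Each training case thus samples a different thinned network, which is why dropout can be read as training an exponentially large ensemble of networks with shared weights.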

### Lecture 10: Hopfield nets and Boltzmann machines

**Presenter:** Tambet Matiisen
**Date:** 18th of November
**Slides:** PDF
**Summary:**
- Hopfield Nets
- Dealing with spurious minima
- Hopfield nets with hidden units
- Using stochastic units to improve search
- How a Boltzmann machine models data
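A Hopfield net as covered in this lecture can be sketched with Hebbian one-shot storage and asynchronous updates (binary ±1 units, no biases; the pattern below is illustrative):

```python
def store(patterns, n):
    """Hebbian one-shot storage: w_ij = sum over patterns of s_i * s_j."""
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def energy(w, state):
    """Hopfield energy E = -1/2 * sum_{i != j} w_ij s_i s_j."""
    n = len(state)
    return -0.5 * sum(w[i][j] * state[i] * state[j]
                      for i in range(n) for j in range(n) if i != j)

def recall(w, state, sweeps=5):
    """Asynchronous updates: each flip can only lower the energy,
    so the state settles into a (possibly spurious) minimum."""
    state = list(state)
    n = len(state)
    for _ in range(sweeps):
        for i in range(n):
            field = sum(w[i][j] * state[j] for j in range(n) if j != i)
            state[i] = 1 if field >= 0 else -1
    return state
```

Starting from a corrupted pattern, the updates roll downhill in energy to the stored memory; the spurious minima mentioned in the summary are extra energy minima that were never stored.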

### Lecture 11: Restricted Boltzmann machines (RBMs)

**Presenter:** Alexander Tkachenko
**Date:** 25th of November
**Slides:** PDF
**Summary:**
- Boltzmann machine learning
- More efficient ways to get the statistics
- Restricted Boltzmann Machines
- An example of RBM learning
- RBMs for collaborative filtering


### Lecture 12: Stacking RBMs to make Deep Belief Nets

**Presenter:** Ardi Tampuu
**Date:** 2nd of December
**Slides:** PDF
**Summary:**
- The ups and downs of backpropagation
- Belief Nets
- Learning sigmoid belief nets
- The wake-sleep algorithm

### Lecture 13: Deep neural nets with generative pre-training

**Presenter:** Fanny-Dhelia Pajuste and Mari-Liis Kruup
**Date:** 9th of December
**Slides:** PDF
**Summary:**
- Learning layers of features by stacking RBMs
- Discriminative learning for DBNs
- What happens during discriminative fine-tuning?
- Modeling real-valued data with an RBM
- RBMs are infinite sigmoid belief nets

### Lecture 14: Modeling hierarchical structure with neural nets

**Presenter:** Ilya Kuzovkin
**Date:** 16th of December
**Slides:** PDF
**Summary:**
- From PCA to autoencoders
- Deep autoencoders
- Deep autoencoders for document retrieval
- Semantic Hashing
- Learning binary codes for image retrieval
- Shallow autoencoders for pre-training

## Additional Materials

- Hinton's Home page: http://www.cs.toronto.edu/~hinton/

### Online Books:

- DEEP LEARNING: Methods and Applications, Li Deng and Dong Yu, Microsoft research, January 2014: http://research.microsoft.com/apps/pubs/?id=209355
- DEEP LEARNING, Yoshua Bengio et al, Draft August 2014: http://www.iro.umontreal.ca/~bengioy/dlbook/
- Learning deep architectures for AI, by Bengio, Yoshua: http://www.iro.umontreal.ca/~lisa/publications2/index.php/publications/show/239
- A Brief Introduction to Neural Networks, David Kriesel, 2007: http://www.dkriesel.com/en/science/neural_networks
- Neural Networks and Deep Learning, By Michael Nielsen, Sep 2014: http://neuralnetworksanddeeplearning.com/

### Tutorials:

- Tutorial on Unsupervised Feature Learning and Deep Learning, by Andrew Ng et al.: http://deeplearning.stanford.edu/wiki/index.php/UFLDL_Tutorial
- Deep Learning for Natural Language Processing, by Richard Socher and Christopher Manning, NAACL HLT2013 tutorial: http://nlp.stanford.edu/courses/NAACL2013/
- Yann LeCun & Marc'Aurelio Ranzato's ICML2013 tutorial (computer vision perspective): http://techtalks.tv/talks/deep-learning/58122/
- Li Deng's talk at Johns Hopkins University CSLP (speech recognition perspective): http://vimeo.com/75244336

### Courses:

- Deep Learning at CMU by Bhiksha Raj, 2014 http://deeplearning.cs.cmu.edu/
- Neural networks class by Hugo Larochelle, Université de Sherbrooke: https://www.youtube.com/playlist?list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH
- Deep Learning and Neural Networks, Kevin Duh, Nara Institute of Science and Technology, January 2014: http://cl.naist.jp/~kevinduh/a/deep2014/

### Papers:

- Deep Learning of Representations: Looking Forward, Yoshua Bengio, May 2013: http://arxiv.org/abs/1305.0445