Special Course in Machine Learning: Reinforcement Learning
Reinforcement learning has recently received a lot of attention. In 2013 a company named DeepMind published a paper describing a system that learned to play Atari video games, and the company was acquired by Google soon afterwards. In 2015 their improved system was featured on the cover of Nature. The idea of combining reinforcement learning with deep learning sparked an entirely new research field called "deep reinforcement learning", which had its own workshop at NIPS 2015. In this course we are going to delve into reinforcement learning theory to understand how DeepMind produced something that some consider the first step towards general artificial intelligence.
Reinforcement learning is an area of machine learning inspired by behaviorist psychology. In reinforcement learning we are concerned with an agent acting in an environment. Some of the agent's actions may result in rewards, and the goal of the agent is to maximize the total reward it receives. An important aspect is that rewards are usually rare and time-delayed, so the agent often cannot tell immediately which of its actions led to a reward. Many tasks in economics, engineering and computer science can be modeled as reinforcement learning problems. For example:
- managing an investment portfolio (the investment agent gets a reward when it makes a profit),
- controlling a humanoid robot (the robot gets a reward when it stays upright),
- running an internet search engine (the search engine gets a reward when a user clicks on one of the search results).
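The agent-environment loop described above can be sketched in a few lines of Python. This is only an illustrative toy, not material from the course: the `Corridor` environment, the two policies and the discount factor `gamma` are all assumptions made up for this example. The reward is given only when the goal is reached, which shows what "rare and time-delayed" means in practice.

```python
import random


class Corridor:
    """Hypothetical toy environment: the agent walks along a corridor of
    `length` cells and is rewarded only when it reaches the far end."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action is -1 (step left) or +1 (step right); walls clip the move.
        self.position = max(0, min(self.length, self.position + action))
        done = self.position == self.length
        reward = 1.0 if done else 0.0  # rare, delayed reward
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Let the agent act until the goal is reached or max_steps pass.
    Returns the list of rewards received at each step."""
    state = env.reset()
    rewards = []
    for _ in range(max_steps):
        action = policy(state)
        state, reward, done = env.step(action)
        rewards.append(reward)
        if done:
            break
    return rewards


def discounted_return(rewards, gamma=0.9):
    """Sum of rewards where a reward t steps ahead is weighted by gamma**t
    (gamma is an assumed discount factor, not given in the text above)."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def random_policy(state):
    return random.choice([-1, 1])


def always_right(state):
    return 1


if __name__ == "__main__":
    for name, policy in [("random", random_policy), ("always right", always_right)]:
        rewards = run_episode(Corridor(), policy)
        print(name, "steps:", len(rewards), "return:", discounted_return(rewards))
```

A policy that always walks right reaches the goal in the fewest steps and therefore gets the highest discounted return; a random policy usually takes longer, or runs out of steps without ever seeing a reward.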
Reinforcement learning also plays a major role in shaping human behavior. Our brain embodies a complex mix of supervised, unsupervised and reinforcement learning modules. Among them, the reinforcement learning module is the one most often associated with goal-directed behavior.
In this course you will learn about:
- Markov Decision Processes,
- Model-based vs Model-free Learning,
- Dynamic Programming,
- Value Function Approximation,
- Policy Gradient Methods,
- Exploration and Exploitation.
Prerequisites:
- Linear algebra (matrices)
- Calculus (taking derivatives)
- Probability theory (conditional probabilities)
- Machine learning
- Programming (Python, NumPy)
The course consists of video lectures, tests, homework assignments and a project. Each week you have to watch one lecture from David Silver's course. We suggest watching it in a group of 2-3 people and pausing the video frequently to discuss.
The class is held in J. Liivi 2, room 511, on Mondays from 12:00 to 14:00. For each class, one of the students (or teams) will prepare test questions on that week's video lecture. Another student will present a homework assignment. Altogether there will be 9 tests and 9 homework assignments. NB! The test must be sent to firstname.lastname@example.org for review by the Friday before the class! Each homework assignment must be sent to email@example.com by the Thursday after the class in which it was presented.
The last 6 weeks are reserved for a project. The idea of the project is to implement a somewhat more complicated reinforcement learning system than those covered in the homework assignments. Alternatively, we might organize a competition between agents trained by the students.
To pass the course one has to:
- create one test and score more than 60% in all other tests,
- submit all 9 homework assignments and present one of them,
- present a project.
Resources:
- David Silver's reinforcement learning course - we will mostly follow this course.
- Berkeley's deep reinforcement learning course - we will use some homeworks from here.
- Nando de Freitas' machine learning course - the last two lectures are about reinforcement learning, and the rest may also be helpful for general machine learning background.
- Richard Sutton's introduction to reinforcement learning
- David Silver's lecture about deep reinforcement learning
- Jaan Tallinn frames existential risk as a reinforcement learning problem
Books:
- "Reinforcement Learning: An Introduction" by Sutton & Barto - the classic.
- "Algorithms for Reinforcement Learning" by Szepesvari
- "Artificial Intelligence: Foundations of Computational Agents" by Poole & Mackworth - very accessible chapter on reinforcement learning.
All announcements during the semester will be made through the firstname.lastname@example.org list. To subscribe to the list, send an e-mail to email@example.com and put
SUBSCRIBE deeplearning <your name> in the e-mail body.