Deep Reinforcement Learning
NB! This is a follow-up course for those who took the same course in the 2018/2019 fall semester.
Reinforcement learning is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. What distinguishes reinforcement learning from other machine learning disciplines (for example, from online learning) is that rewards are delayed: the agent learns only much later whether a particular action was good or not. The main challenges in reinforcement learning are (temporal) credit assignment and exploration. Deep reinforcement learning refers to the use of neural networks in reinforcement learning models.
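To make "delayed reward" and "cumulative reward" concrete, here is a minimal sketch in Python. The `Corridor` environment, its `reset`/`step` interface and the `always_right` policy are hypothetical toy names invented for this illustration; they are not taken from the Berkeley course materials.

```python
class Corridor:
    """Hypothetical toy environment: the agent starts in cell 0 and must
    reach the last cell. The only reward (+1) is given on arrival, so every
    earlier step yields 0 and the feedback is delayed."""

    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # action: 0 = move left, 1 = move right (cell 0 is a wall on the left)
        move = 1 if action == 1 else -1
        self.pos = max(0, self.pos + move)
        done = self.pos == self.length - 1
        reward = 1.0 if done else 0.0
        return self.pos, reward, done


def run_episode(env, policy, max_steps=100):
    """Roll out one episode and return the list of per-step rewards."""
    state = env.reset()
    rewards = []
    for _ in range(max_steps):
        action = policy(state)
        state, reward, done = env.step(action)
        rewards.append(reward)
        if done:
            break
    return rewards


def always_right(state):
    """Trivial hand-written policy used only for this illustration."""
    return 1


rewards = run_episode(Corridor(length=5), always_right)
episode_return = sum(rewards)  # cumulative (undiscounted) reward
print(rewards, episode_return)
```

Only the final step carries a non-zero reward, yet all earlier "move right" actions were needed to obtain it. Deciding how much credit each of those earlier actions deserves is the temporal credit assignment problem mentioned above.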
Our course is based on the course of the same name from UC Berkeley:
http://rail.eecs.berkeley.edu/deeprlcourse/
Public discussion board for people taking the course from outside:
https://www.reddit.com/r/berkeleydeeprlcourse/
The course is given by Sergey Levine, a well-known expert in applying deep learning to robotics problems.
Prerequisites
We expect you to be comfortable with
- linear algebra,
- (matrix) calculus,
- probability theory,
- machine learning,
- Python and Tensorflow/Pytorch.
This is going to be a tough course, so do not take the prerequisites lightly. We provide some self-study materials in the Links section.
Organization
The course is split over two semesters: in the fall semester we will cover lectures 1-13, and in the spring semester the rest.
We will use the flipped classroom approach, where we watch the lectures at home and discuss them in class. Each class begins with a discussion of the lecture material and ends with a short test prepared by the course instructors.
Homework is an important part of the course. There will be a separate "lecture" where students present and discuss their homework solutions. On the next day EVERYBODY is expected to submit their homework, which will be graded by the course instructors.
In the spring semester you can choose between doing more advanced homeworks or taking on a project. A project has three deliverables: a presentation in class, a blog post on Medium or a paper on arXiv, and code on GitHub. Team size is 1-2 people; we expect more from teams of 2.
Grading
- fall 2018: 50% homeworks, 50% test results
- spring 2019: 50% project / homeworks, 50% test results
You need to collect at least 60% of the points to pass the course.
We allow 5 cumulative days for late submission of homeworks. For example, if you submit the first homework 3 days late and the second homework 2 days late, then you must submit the rest of the homeworks on time, or they will not count.
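The late-day arithmetic can be sketched as follows. This is only an illustration, not an official grading script: the function name `apply_late_days` is hypothetical, and the sketch assumes that a homework which would exceed the remaining budget simply does not count and uses up no late days, a detail the rule above does not specify.

```python
def apply_late_days(days_late_per_homework, budget=5):
    """Return (counted, remaining) for a sequence of homework delays.

    counted[i] is True if homework i still counts under the late-day budget.
    Assumption: a homework that would exceed the remaining budget does not
    count and consumes no late days.
    """
    remaining = budget
    counted = []
    for days_late in days_late_per_homework:
        if days_late <= remaining:
            remaining -= days_late
            counted.append(True)
        else:
            counted.append(False)
    return counted, remaining


# First homework 3 days late, second 2 days late, third on time, fourth 1 day late.
counted, remaining = apply_late_days([3, 2, 0, 1])
print(counted, remaining)
```

In this example the first two homeworks use up the whole budget, the third counts because it is on time, and the fourth does not count.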
Contacts
- Tambet Matiisen - main instructor
- Roman Ring - assistant instructor
- Raul Vicente - group leader