Special Course in Machine Learning: Reinforcement Learning with Neural Networks - Courses

Deep Reinforcement Learning

Reinforcement learning is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. What distinguishes reinforcement learning from other machine learning disciplines (for example, from online learning) is that rewards are delayed: the agent often learns only much later whether a particular action was good or not. The main challenges in reinforcement learning are (temporal) credit assignment and exploration. Deep reinforcement learning refers to the use of neural networks in reinforcement learning models.
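To make "cumulative reward" and delayed rewards concrete, here is a minimal sketch that computes the discounted sum of future rewards at each time step of one episode. The function name `rewards_to_go` and the discount factor `gamma` are illustrative assumptions, not taken from the course materials.

```python
def rewards_to_go(rewards, gamma=0.99):
    """Return the discounted sum of future rewards for every time step.

    `gamma` is an assumed discount factor; the course text only speaks of
    "some notion of cumulative reward".
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Delayed reward: only the last action of the episode is rewarded.
episode_rewards = [0.0, 0.0, 0.0, 1.0]
print(rewards_to_go(episode_rewards, gamma=0.5))  # [0.125, 0.25, 0.5, 1.0]
```

Earlier actions receive a smaller share of the final reward. This is one simple way of spreading credit back over time, not the only one.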

Our course is based on the course of the same name from UC Berkeley:
http://rail.eecs.berkeley.edu/deeprlcourse/

Public discussion board for people taking the course from outside:
https://www.reddit.com/r/berkeleydeeprlcourse/

The course is given by Sergey Levine, a well-known expert in applying deep learning to robotics problems.

Prerequisites

We expect you to be comfortable with

linear algebra,
(matrix) calculus,
probability theory,
machine learning,
Python and TensorFlow/PyTorch.

This is going to be a tough course, so do not take the prerequisites lightly. We provide some self-study materials in the Links section.

Organization

The course is split over two semesters: in the fall semester we will cover lectures 1-13, and in the spring semester the rest.

We will use the flipped classroom approach: we watch the lectures at home and discuss them in class. Each class begins with a short test on the lecture material (prepared by the course instructors), which is meant to seed the discussion.

Homework is an important part of the course. There will be a separate "lecture" where students present and discuss their homework solutions. On the next day EVERYBODY is expected to submit their homework, which will be graded by the course instructors.

In the spring semester there will be a project done in teams of 1-3 students. A project has three deliverables: a presentation in class, a blog post and code. The optimal team size is 2; we expect more thorough projects from teams of 3.

Grading

fall 2018: 50% homeworks, 50% test results
spring 2019: 50% project, 40% test results, 10% homeworks

You need to collect at least 60% of the points to pass the course.
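As a rough illustration of how the weights combine, the sketch below computes a weighted semester score. It assumes that each component is scored on a 0-100 scale and that the 60% threshold applies to the weighted total of a semester; the page does not spell out either detail, and the names `WEIGHTS`, `semester_score` and `passes` are hypothetical.

```python
# Component weights in percent, as listed under Grading.
WEIGHTS = {
    "fall 2018": {"homeworks": 50, "tests": 50},
    "spring 2019": {"project": 50, "tests": 40, "homeworks": 10},
}
PASS_THRESHOLD = 60.0  # percent of points needed to pass


def semester_score(semester, scores):
    """Weighted total in percent; `scores` maps component name to 0-100."""
    weights = WEIGHTS[semester]
    return sum(weights[name] * scores[name] for name in weights) / 100


def passes(semester, scores):
    """Assumption: the threshold is applied to the weighted semester total."""
    return semester_score(semester, scores) >= PASS_THRESHOLD


fall = {"homeworks": 70, "tests": 55}
print(semester_score("fall 2018", fall), passes("fall 2018", fall))  # 62.5 True
```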

We allow 5 cumulative days for late submission of homework. For example, if you submit the first homework 3 days late and the second homework 2 days late, then you must submit the rest of the homework on time, or it will not count.
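The late-day rule can be sketched as a shared budget that is used up across submissions. One detail is an assumption here: a submission that would exceed the remaining budget is treated as not counting and as not consuming any days. The function name `counted_homeworks` is hypothetical.

```python
LATE_DAY_BUDGET = 5  # cumulative late days allowed across all homework


def counted_homeworks(days_late):
    """Return, per homework in submission order, whether it still counts.

    Assumption: a homework that would exceed the remaining budget does not
    count and does not use up any late days.
    """
    remaining = LATE_DAY_BUDGET
    counted = []
    for late in days_late:
        if late <= remaining:
            remaining -= late
            counted.append(True)
        else:
            counted.append(False)
    return counted


# First homework 3 days late, second 2 days late: the budget is now spent,
# so the on-time third homework counts and the late fourth does not.
print(counted_homeworks([3, 2, 0, 1]))  # [True, True, True, False]
```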

Contacts

Tambet Matiisen - main instructor
Roman Ring - assistant instructor
Raul Vicente - group leader

Special Course in Machine Learning: Reinforcement Learning with Neural Networks, fall 2018/19
