Special Course in Machine Learning: Reinforcement Learning (MTAT.03.317), 2015/16 spring

Special Course in Machine Learning: Reinforcement Learning

Reinforcement learning has recently received a lot of attention. In 2013 a company named DeepMind published a paper describing a system that learned to play Atari video games, and the company was acquired by Google soon afterwards. In 2015 their improved system was featured on the cover of Nature. The idea of combining reinforcement learning with deep learning sparked an entirely new research field called "deep reinforcement learning", which had its own workshop at NIPS 2015. In this course we are going to delve into reinforcement learning theory to understand how DeepMind produced a system that some consider the first step towards general artificial intelligence.

Introduction

Reinforcement learning is an area of machine learning inspired by behaviorist psychology. In reinforcement learning we are concerned with an agent acting in an environment. Some of the agent's actions may result in rewards, and the goal of the agent is to maximize the total reward it receives. An important aspect is that rewards are usually rare and delayed in time. Many tasks in economics, engineering and computer science can be modeled as reinforcement learning problems. For example:

  • managing an investment portfolio (the investment agent gets reward when it makes profit),
  • controlling a humanoid robot (the robot gets reward when it stays upright),
  • running an internet search engine (the search engine gets reward when a user clicks on one of the search results).
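
The agent-environment loop described above can be sketched in a few lines of Python. This is only an illustrative toy, not course material: the `CorridorEnv` class, its `reset`/`step` interface and the two policies are hypothetical names chosen for this example. The environment pays a reward of 1 only when the agent reaches the far end of a corridor, so the reward is both rare and delayed.

```python
import random


class CorridorEnv:
    """Hypothetical toy environment: a corridor of `length` cells.

    The agent starts in cell 0 and is rewarded only on reaching the last
    cell, which makes the reward rare and delayed.
    """

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # action: 0 = move left, 1 = move right; walls clamp the position.
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        done = self.position == self.length - 1
        reward = 1.0 if done else 0.0
        return self.position, reward, done


def run_episode(env, policy, max_steps=100):
    """Run one episode; return (total reward, number of steps taken)."""
    state = env.reset()
    total_reward = 0.0
    for t in range(1, max_steps + 1):
        action = policy(state)                   # the agent acts
        state, reward, done = env.step(action)   # the environment responds
        total_reward += reward
        if done:
            return total_reward, t
    return total_reward, max_steps


def random_policy(state):
    # Ignores the state and picks a direction uniformly at random.
    return random.choice([0, 1])


def always_right(state):
    # Hand-written policy that happens to be optimal in this corridor.
    return 1


if __name__ == "__main__":
    random.seed(0)
    env = CorridorEnv(length=5)
    print("always right:", run_episode(env, always_right))
    print("random:", run_episode(env, random_policy))
```

Neither policy here learns anything. A reinforcement learning algorithm would adjust the policy using the rewards it observes, so that the total reward per episode grows over time.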

Reinforcement learning also plays a major role in shaping human behavior. Our brain embodies a complex mix of supervised, unsupervised and reinforcement learning modules. Among them, the reinforcement learning module is the one most often associated with goal-directed behavior.

Syllabus

In this course you will learn about:

  • Markov Decision Processes,
  • Model-based vs Model-free Learning,
  • Dynamic Programming,
  • Value Function Approximation,
  • Policy Gradient Methods,
  • Exploration and Exploitation.

Prerequisites

  • Linear algebra (matrices)
  • Calculus (taking derivatives)
  • Probability theory (conditional probabilities)
  • Machine learning
  • Programming (Python, Numpy)

Organization

The course consists of video lectures, tests, homeworks and a project. Each week you have to watch one lecture from David Silver's course. We suggest watching it in a group of 2-3 people and pausing the video frequently to discuss.

The class is held in J. Liivi 2, room 511, on Mondays at 12-14. For each class one of the students (or a team) will produce test questions for the video lecture of that week. Another student will present a homework. Altogether there will be 9 tests and 9 homeworks. NB! The test must be sent to tambet.matiisen@ut.ee for review by the Friday before the class! Each homework must be sent to tambet.matiisen@ut.ee by the Thursday after the class in which it was presented.

The last 6 weeks are reserved for a project. The idea of the project is to implement a somewhat more complicated reinforcement learning system than those covered in the homeworks. Alternatively, we might organize a competition between agents trained by the students.

To pass the course one has to:

  • create one test and score more than 60% in all other tests,
  • submit all 9 homeworks and present one homework,
  • present a project.

Materials

  • David Silver's reinforcement learning course - we will mostly follow this course.
  • Berkeley's deep reinforcement learning course - we will use some homeworks from here.
  • Nando de Freitas' machine learning course - the last two lectures are about reinforcement learning, but the course may also be helpful for general machine learning background.

Additional videos:

  • Richard Sutton's introduction to reinforcement learning
  • David Silver's lecture about deep reinforcement learning
  • Jaan Tallinn frames existential risk as a reinforcement learning problem

Online books:

  • "Reinforcement Learning: An Introduction" by Sutton & Barto - the classic.
  • "Algorithms for Reinforcement Learning" by Szepesvari
  • "Artificial Intelligence: Foundations of Computational Agents" by Poole & Mackworth - very accessible chapter on reinforcement learning.

Contacts

All announcements during the semester will be made through the deeplearning@lists.ut.ee mailing list. To subscribe to the list, send an e-mail to sympa@lists.ut.ee with SUBSCRIBE deeplearning <your name> in the e-mail body.

Tambet Matiisen, tambet.matiisen@ut.ee, room 018
Raul Vicente, raulvicente@gmail.com, room 016

The economic copyright of the study materials belongs to the University of Tartu. The study materials may be used for the purposes and under the conditions of free use of a work provided for in the Copyright Act. When using the study materials, the user is obliged to credit their author.
Use of the study materials for other purposes is permitted only with the prior written consent of the University of Tartu.