Multi-agent communication
Show that multiple agents can learn to communicate in a simple gridworld task. The project is based on the article "Cerebral coherence between communicators marks the emergence of meaning"; see also the movies in its supplemental material.
Supervisor: Jaan Aru
Difficulty: medium
Torcs racing game
Teach a car to drive in a racing game. You need to use the Torcs Championship Server and the Python client. Installation and use of the championship server are documented in the manual.
Supervisor: Tambet Matiisen
Difficulty: easy
Teach computer to solve equations
Given three numbers A, B and C, choose an operation so that A (op) B = C. For example, given A=2, B=3 and C=5, the correct operation is +. The state is the three integers, and the action is one of +, -, * or /. The reward is 1 when the chosen operation is correct and 0 otherwise.
Supervisor: Tambet Matiisen
Difficulty: too easy?
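The state, action and reward of the equation task above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function names (apply_op, reward, sample_state) and the number range are hypothetical, and division is assumed to count only when it is exact, so that C stays an integer.

```python
import operator
import random

# The four actions named in the task description.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
ACTIONS = ["+", "-", "*", "/"]


def apply_op(op, a, b):
    """Apply op to a and b; division is only defined when it is exact (assumption)."""
    if op == "/":
        if b != 0 and a % b == 0:
            return a // b
        return None
    return OPS[op](a, b)


def reward(state, action):
    """Return 1 if A (action) B equals C, otherwise 0."""
    a, b, c = state
    return 1 if apply_op(action, a, b) == c else 0


def sample_state(rng, low=1, high=9):
    """Draw a state (A, B, C) that is solvable by at least one action."""
    a, b = rng.randint(low, high), rng.randint(low, high)
    op = rng.choice(ACTIONS)
    if op == "/":
        a = a * b  # make A divisible by B so that C is an integer
    return a, b, apply_op(op, a, b)


if __name__ == "__main__":
    rng = random.Random(0)
    state = sample_state(rng)
    print(state, [act for act in ACTIONS if reward(state, act) == 1])
```

Note that some states have more than one correct action (for A=2, B=2, C=4 both + and * earn the reward), which a learning agent does not need to distinguish.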
Experiment with MazeBase
MazeBase is an environment for creating very simple 2D games and training neural network models to perform tasks within them. It was developed by the Facebook AI Research group specifically to give control over the difficulty of reinforcement learning problems. The system is described in a paper, and the code is on GitHub.
Supervisor: Tambet Matiisen
Difficulty: medium
Experiment with WebNav
WebNav is a task recently proposed by NLP researchers. In this challenging task, an agent navigates through a website, represented as a graph with web pages as nodes and hyperlinks as directed edges, to find a web page in which a query appears. The example dataset is based on English Wikipedia. Alternatively, you could implement the game "5 Clicks to Jesus" on the same dataset and train an agent to play it.
Supervisor: Tambet Matiisen
Difficulty: hard
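To make the WebNav setup above concrete, here is a minimal sketch of a website as a directed graph, together with a breadth-first search that finds the fewest clicks to a page containing the query. The toy pages and the function name shortest_clicks are hypothetical and do not reflect the actual WebNav dataset format. The search is only a non-learning baseline that sees the whole graph; the agent in the task has to learn which link to follow.

```python
from collections import deque

# Hypothetical toy site: page name -> (page text, outgoing hyperlinks).
PAGES = {
    "Home": ("welcome to the toy site", ["Science", "Art"]),
    "Science": ("physics and biology", ["Physics"]),
    "Art": ("painting and music", ["Home"]),
    "Physics": ("quantum mechanics explained", []),
}


def shortest_clicks(pages, start, query):
    """Breadth-first search over hyperlinks.

    Returns (page, clicks) for the nearest page whose text contains the
    query, or None if no reachable page contains it.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, clicks = queue.popleft()
        text, links = pages[page]
        if query in text:
            return page, clicks
        for nxt in links:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, clicks + 1))
    return None


if __name__ == "__main__":
    print(shortest_clicks(PAGES, "Home", "quantum"))
```

The click count from such a search could serve as a reference when judging how efficiently a trained agent navigates.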
Replicate DeepMind's Atari results
DeepMind's Atari paper sparked interest in deep reinforcement learning, and an improved version was later featured on the cover of Nature. Your job is to replicate their results. You need to use the Arcade Learning Environment (especially its Python API), a toolkit for convolutional neural networks (say Keras or Neon) and a lot of GPU power.
Supervisor: Tambet Matiisen
Difficulty: hard
Presentation about AlphaGo
Give an in-depth presentation on DeepMind's AlphaGo engine: how it works, which algorithms it uses and how they are combined.
Supervisor: Ilya Kuzovkin
Difficulty: medium