Special Course in Machine Learning: AI-Safety

Multi-agent communication

Show that multiple agents can learn to communicate in a simple gridworld task. The project is based on the article "Cerebral coherence between communicators marks the emergence of meaning"; see also the movies in its supplemental material.

Supervisor: Jaan Aru
Difficulty: medium

Torcs racing game

Teach a car to drive in a racing game. You need to use the Torcs Championship Server and its Python client. Installation and use of the championship server are documented in the manual.

Supervisor: Tambet Matiisen
Difficulty: easy

Teach computer to solve equations

Given three numbers A, B and C, choose an operation so that A (op) B = C. For example, given A=2, B=3 and C=5, the correct operation is +. The state consists of three integers, and the action is one of +, -, * or /. The reward is 1 when the answer is correct, otherwise 0.
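The task description above can be written down as a tiny environment. The following is only a sketch under stated assumptions: the function names (`is_correct`, `sample_state`, `reward`), the operand range 1..9, and the choice to generate division examples so that all three numbers stay integers are illustrative and not part of the project description.

```python
import random

# The four actions named in the task description.
ACTIONS = ["+", "-", "*", "/"]


def is_correct(a, b, c, action):
    """Return True if `a (action) b = c` holds."""
    if action == "+":
        return a + b == c
    if action == "-":
        return a - b == c
    if action == "*":
        return a * b == c
    if action == "/":
        # Checked via multiplication to avoid floating-point comparison.
        return b != 0 and a == b * c
    raise ValueError(f"unknown action: {action!r}")


def sample_state(rng, low=1, high=9):
    """Sample a state (A, B, C) of three integers that has a valid answer.

    Assumption: operands are drawn from [low, high]; for division, A is
    built as B * C so that the state stays integer-valued.
    """
    action = rng.choice(ACTIONS)
    b = rng.randint(low, high)
    if action == "/":
        c = rng.randint(low, high)
        a = b * c
    else:
        a = rng.randint(low, high)
        c = {"+": a + b, "-": a - b, "*": a * b}[action]
    return (a, b, c)


def reward(state, action):
    """Reward is 1 when the chosen operation is correct, otherwise 0."""
    a, b, c = state
    return 1 if is_correct(a, b, c, action) else 0


if __name__ == "__main__":
    rng = random.Random(0)
    state = sample_state(rng)
    for action in ACTIONS:
        print(state, action, reward(state, action))
```

Note that some states have more than one correct action (for example A=2, B=2, C=4 is solved by both + and *), so in this sketch the reward checks the equation itself rather than comparing against a single stored label.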

Supervisor: Tambet Matiisen
Difficulty: too easy?

Experiment with MazeBase

MazeBase is an environment for creating very simple 2D games and training neural network models to perform tasks within them. It was developed by the Facebook AI Research group specifically to give control over the difficulty of reinforcement learning problems. The system is described in a paper, and the code is on GitHub.

Supervisor: Tambet Matiisen
Difficulty: medium

Experiment with WebNav

WebNav is a task recently proposed by NLP researchers. In this challenging task, an agent navigates through a web site, represented as a graph with web pages as nodes and hyperlinks as directed edges, to find a web page in which a query appears. The example dataset is based on English Wikipedia. Alternatively, you could implement and learn the game "5 Clicks to Jesus" on the same dataset.
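To make the graph formulation concrete, here is a minimal sketch of the task on a hand-made toy site. Everything in it is hypothetical: the page names, the `LINKS` and `TEXT` dictionaries, and the `find_page` helper are illustrative and do not reflect the actual WebNav dataset format. The search is a plain breadth-first search with full knowledge of the graph, so it is not a learned agent; it could at most serve as a reference for the shortest number of clicks.

```python
from collections import deque

# Hypothetical toy web site: pages are nodes, hyperlinks are directed edges.
LINKS = {
    "Home": ["Animals", "Plants"],
    "Animals": ["Cat", "Dog"],
    "Plants": ["Oak"],
    "Cat": ["Animals"],
    "Dog": [],
    "Oak": ["Home"],
}

# Hypothetical page contents in which a query may appear.
TEXT = {
    "Home": "welcome to the toy encyclopedia",
    "Animals": "a list of animals",
    "Plants": "a list of plants",
    "Cat": "the cat purrs",
    "Dog": "the dog barks",
    "Oak": "the oak is a tree",
}


def find_page(start, query, max_clicks):
    """Breadth-first search for a page whose text contains `query`.

    Returns the list of pages visited from `start` to the target,
    or None if no such page is reachable within `max_clicks` links.
    """
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        page = path[-1]
        if query in TEXT[page]:
            return path
        if len(path) - 1 >= max_clicks:
            continue  # click budget used up on this path
        for nxt in LINKS[page]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    print(find_page("Home", "barks", max_clicks=5))
```

The `max_clicks` budget mirrors the "5 Clicks" variant of the game; a learning agent would see only the current page and its outgoing links rather than the whole graph.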

Supervisor: Tambet Matiisen
Difficulty: hard

Replicate DeepMind's Atari results

DeepMind's Atari paper sparked the interest in deep reinforcement learning, and an improved version was later featured on the cover of Nature. Your job is to replicate their results. You need to use the Arcade Learning Environment (especially its Python API), a toolkit for convolutional neural networks (say Keras or Neon) and a lot of GPU power.

Supervisor: Tambet Matiisen
Difficulty: hard

Presentation about AlphaGo

Give an in-depth presentation about DeepMind's AlphaGo engine: how it works, which algorithms it uses and how they are combined.

Supervisor: Ilya Kuzovkin
Difficulty: medium

Special Course in Machine Learning: AI-Safety 2023/24 fall