Homeworks
Submit both your code and your report.
1. Homework 1: Imitation Learning
Solutions for this task can no longer be submitted.
2. Homework 2: Policy Gradient, tasks 1-5
Solutions for this task can no longer be submitted.
3. Homework 2: Policy Gradient, tasks 6-8
Solutions for this task can no longer be submitted.
4. Homework 3: Q-Learning, Actor-Critic
Solutions for this task can no longer be submitted.
5. Homework 4: Model-Based RL
Solutions for this task can no longer be submitted.
Homework presentations
Fill in your choice here.
Homework | Person |
Homework 1 (20.09.2018) | |
Behavioral Cloning (easy) | Markus Kängsepp |
DAgger (medium) | Markus Loide |
Homework 2 tasks 1-5 (11.10.2018) | |
State-dependent baseline (mathy) | Anton Potapchuk |
Implement neural network (easy) | Andre Tättar |
Implement policy gradient (medium) | Novin Shahroudi |
CartPole (easy) | Sebastian Värv |
InvertedPendulum (easy) | Kristjan Veskimäe |
Homework 2 tasks 6-8 (25.10.2018) | |
Neural network baseline (medium) | Maksym Semikin |
LunarLander (medium) | Daniel Majoral |
HalfCheetah (medium) | Kristjan Veskimäe |
Bonus: implement parallelization (hard?) | Aqeel Labash? |
Bonus: implement generalized advantage estimation (easy) | Novin Shahroudi |
Bonus: implement multi-step policy gradient (easy) | |
Homework 3 (15.11.2018) | |
Basic Q-learning (medium) | Hannes Liik |
Double Q-learning (easy) | Laura Ruusmann |
Hyperparameter search (easy) | |
Bonus: Actor-Critic (easy?) | Oriol Corcoll |
Homework 4 (13.12.2018) | |
Implement dynamics model (medium) | |
Implement action selection (medium) | |
Implement model-based reinforcement learning (easy?) | Hasan Sait Arslan |
Hyperparameter search (easy) | |
Bonus: use CEM for action selection | |
Bonus: use multi-step loss | |
The people listed above will present their homework solutions. EVERYBODY is expected to submit their homework on the day after the presentation (bonus exercises excluded).