Natural Language Processing - Courses - Institute of Computer Science

Project

Every student must complete a small project that contributes 30% to the final grade. The projects are individual, although several students can work on the same topic: you may collaborate on a topic, but in the end everyone submits their own work, which must not be identical to the work submitted by any other student.

There are two options for choosing the project topic:

A. Choose the topic that you like

Because there are only four weeks for working on the project, the task cannot be too large. Also, make sure that suitable training data is freely available. If you choose this option, you can consult either me (Kairit, by email or during office hours) or Maksym (by email, on Piazza, or after the practice session) to make sure that your chosen task is suitable for the project.

For a type A project, you need to do the following:

Choose your topic.
Find a few (1-3) reference papers as related work.
Implement your model and run experiments.
Write a short report about your work.
Submit the code and the report as your project.

B. Choose a model implemented in AllenNLP (https://allennlp.org/models)

For each model, the following resources are available: the paper describing the system and the code. Annotated training data is not available for every model, so you can only choose from a subset of the models. The following table lists the possible options.

Model	Data
Machine comprehension	SQuAD (available from AllenNLP)
Textual entailment	SNLI (available from AllenNLP)
Coreference resolution	Other datasets might be available, for instance https://preschool-lab.github.io/PreCo/
Named entity recognition	Other datasets might be available (there is also one in Estonian)
Dependency parsing	UD datasets are available: https://universaldependencies.org/
Event2Mind	Event2Mind corpus (available from AllenNLP)

For a type B project, you need to do the following:

Choose the model you like.
Read the accompanying paper to understand what’s going on in the model.
Re-train the model as it is on the original dataset or train it with another available dataset.
Think about how you can change the model, implement the changes, and rerun the experiments.
Write a short report about your work.
Submit the code and the report as your project.
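
If you pick the dependency parsing option and train on another UD treebank, it can help to inspect the data before training. The sketch below reads the CoNLL-U format in which UD treebanks are distributed (ten tab-separated columns per token, `#` comment lines, a blank line between sentences). It is only an illustration and not part of AllenNLP: the function name `read_conllu` and the two-token sample are assumptions made up for this example.

```python
from typing import Dict, Iterable, Iterator, List

# The ten columns defined by the CoNLL-U format used for UD treebanks.
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]


def read_conllu(lines: Iterable[str]) -> Iterator[List[Dict[str, str]]]:
    """Yield one sentence at a time as a list of token dictionaries.

    Hypothetical helper for illustration; not an AllenNLP function.
    """
    sentence: List[Dict[str, str]] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Sentence-level comments such as "# text = ..." are skipped.
            continue
        columns = line.split("\t")
        if len(columns) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 columns, got {len(columns)}: {line!r}")
        token = dict(zip(CONLLU_FIELDS, columns))
        # Keep only ordinary tokens: skip multiword ranges ("1-2")
        # and empty nodes ("1.1"), whose IDs are not plain integers.
        if not token["id"].isdigit():
            continue
        sentence.append(token)
    if sentence:
        # The file may end without a trailing blank line.
        yield sentence


# Hand-written sample for illustration only; not taken from a real treebank.
SAMPLE = (
    "# text = Dogs bark\n"
    "1\tDogs\tdog\tNOUN\t_\t_\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\t_\n"
    "\n"
)

sentences = list(read_conllu(SAMPLE.splitlines()))
for token in sentences[0]:
    print(token["form"], token["head"], token["deprel"])
```

For a real treebank you would pass an open file object instead of `SAMPLE.splitlines()`. Counting sentences and dependency labels this way is a quick check that the download is complete before you start training.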

Suggested project timeline

Mandatory deadlines are in red.

Track A - custom project	Track B - AllenNLP model	Date
Confirm the topic	Confirm the topic	10.05.2019
Read the paper(s), implement your model	Read the paper, retrain the AllenNLP model	17.05.2019
Implement your model, experiment	Implement changes, experiment	24.05.2019
Implement your model, experiment	Implement changes, experiment	31.05.2019
Submit project code and report	Submit project code and report	07.06.2019

Confirming project topic

Confirm your topic either by telling Maksym in the practice session or by emailing Kairit and/or Maksym. We will add the confirmed topics to the following table:

Getting help

Both Kairit and Maksym can help you with choosing the project topic and while you work on the project.

Kairit: by email or during office hours (room 416; if you want to come but cannot make it at these times, please let me know and we'll figure something out):
- 08.05.2019 3:30-5pm
- 16.05.2019 12-2pm
- 23.05.2019 12-2pm
- 30.05.2019 12-2pm
- 05.06.2019 12-2pm
Maksym: by email, via Piazza, during practice sessions

Natural Language Processing, spring 2018/19