LTAT.01.001 Natural language processing
This course aims to provide a broad overview of the field of natural language processing. The course first covers basic text processing methods and various ways of representing textual data. We will look at the main task formulations used in natural language processing: text classification, sequence tagging, and sequence-to-sequence prediction. The course discusses various natural language processing tasks, from lower-level tasks that model linguistic structure, such as morphology and syntax, to higher-level semantic tasks that deal with the meaning of the text, such as sentiment analysis and question answering.
The NLP field is currently dominated by artificial neural networks. Thus, in this course we will look at various deep neural models that are commonly used for NLP: feedforward neural networks for learning word representations (embeddings) and for text classification, recurrent networks for modeling sequential data, and attention-based transformers for training powerful large language models.
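As a rough illustration of the three task formulations named above, the following Python sketch shows only what the inputs and outputs look like in each case. The sentences, labels, and tag names are made up for illustration and are not taken from the course materials.

```python
# Toy illustration of the three task formulations; all data is hypothetical.

# Text classification: a single label for the whole text.
classification_example = ("The movie was great", "positive")

# Sequence tagging: exactly one label per input token.
tokens = ["The", "cat", "sleeps"]
tags = ["DET", "NOUN", "VERB"]
tagging_example = list(zip(tokens, tags))

# Sequence-to-sequence prediction: the output is itself a sequence,
# and its length may differ from the input length (e.g. translation).
seq2seq_example = (["Aitäh"], ["Thank", "you"])

text, label = classification_example
print(f"classification: {text!r} -> {label!r}")
print(f"tagging: {tagging_example}")
source, target = seq2seq_example
print(f"seq2seq: {source} -> {target}")
```

The point of the sketch is the difference in output structure: one label per text, one label per token, or a free-length output sequence.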
Course info
- Lectures and seminars: Wednesdays at 16:15, Delta building (Narva mnt 18), room 1019
- Practicums: Fridays at 16:15, Delta building (Narva mnt 18), room 1019
- Lecturer: Kairit Sirts (kairit.sirts@ut.ee)
- TA: Emil Kalbaliyev (emil.kalbaliyev@ut.ee)
- TA: Aleksei Dorkin (aleksei.dorkin@ut.ee)
The lectures will be recorded, and the recordings will be made available via Moodle. Practicums are held in the classroom, and you can also join remotely via Zoom.
Join the course Slack using your university email.
Assessment
Type | Points | Comment |
---|---|---|
Practical homeworks | 40 points | 3 practical individual homeworks |
Project | 35 points | Done individually or in a group |
Seminar presentation | 10 points | Based on an article on a given topic, in groups |
Theory test | 20 points | At the end of the semester |
Total | 105 points | |
How to pass the course?
In order to pass the course, you have to obtain at least 51 points in total from any combination of course activities (homeworks, project, seminar presentation, theory test).
None of the course activities are compulsory. However, most course activities can only be done at the scheduled times and cannot be made up later. Thus, we advise you to think carefully before deciding to skip any of the activities.
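The pass rule in this section is simple arithmetic over the assessment table, and a minimal Python sketch of it might look as follows. The dictionary keys, the function name `passes`, and the example student are hypothetical; only the point values and the 51-point threshold come from this page.

```python
# Maximum points per activity, taken from the assessment table.
MAX_POINTS = {
    "homeworks": 40,
    "project": 35,
    "seminar_presentation": 10,
    "theory_test": 20,
}
PASS_THRESHOLD = 51  # minimum total needed to pass


def passes(earned):
    """Return True if the total of the earned points reaches the threshold."""
    for activity, points in earned.items():
        if activity not in MAX_POINTS:
            raise KeyError(f"unknown activity: {activity}")
        if not 0 <= points <= MAX_POINTS[activity]:
            raise ValueError(f"points out of range for {activity}: {points}")
    return sum(earned.values()) >= PASS_THRESHOLD


# Hypothetical student who skips the seminar presentation and the theory test.
example = {"homeworks": 30, "project": 25}
print(sum(MAX_POINTS.values()))  # 105
print(passes(example))           # True, since 30 + 25 = 55 >= 51
```

Because no activity is compulsory, skipped activities simply contribute zero points to the total.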
Plagiarism
As expected, plagiarism is not allowed. The homeworks and the theory test are strictly individual work. Individual assignments can be discussed in groups, but your solution must be your own.
Using generative AI (ChatGPT)
The use of generative AI is allowed in accordance with the university guidelines. Please consult the guidelines carefully to understand what is appropriate use and what constitutes plagiarism. In the end, you as a student are solely responsible for the content of your work.
Prerequisites
This course assumes knowledge from various areas. In the Study Information System, the required prerequisite course is Machine Learning (MTAT.03.227), and the recommended prerequisite courses are Language Technology (MTAT.06.045) and Artificial Intelligence I (MTAT.06.008). In practice, we also assume basic knowledge of higher mathematics (calculus, linear algebra, probability) and computer programming (Python). If you lack some of the required knowledge, it is your responsibility to acquire it at the level necessary for progressing in this course. We can help you find suitable materials for obtaining the necessary background.