Place: Delta, 2048
Time: Friday 12.15-13.45
Coordinators: Mark Fišel, Agnes Luhtaru
The seminar theme is multimodal systems that involve speech or text. We will cover visual language models, general architectures, audiovisual ASR, and related topics.
We will discuss one paper per week. Each paper is divided into subtopics, such as the pre-trained model, the data used, and the experiments. Before each seminar:
- Skim the assigned paper (see Topics) and select a subtopic of interest here.
- Dig deeper into the selected subtopic (read the relevant parts of the paper more closely, go through related work, or try out the open-source models).
- Submit a half-page overview of your findings by 9.00 on Friday morning, before the seminar. The overview can be in free format (bullet points, thoughts, references with notes, etc.), but it should help you remember what you learned and explain it to others.
- Participate actively during the seminar and be ready to explain what you learned.
The first seminar (February 10) is introductory and explains the course topic and organization.
Requirements / Grading
- Submit an overview for at least eight seminars.
- Participate actively in at least eight seminars (excluding the first introductory seminar).