Institute of Computer Science
NLP seminar: multimodal text and speech processing (MTAT.06.046), 2022/23 spring

Topics

The topic for each seminar is listed below:

  1. February 10: Intro
    • Short seminar, no paper
  2. February 17: Learning Transferable Visual Models From Natural Language Supervision (CLIP)
    • https://arxiv.org/abs/2103.00020
    • Submit your overview here!
  3. February 24: Independence Day
    • No seminar
  4. March 3: ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision
    • https://arxiv.org/abs/2102.03334
    • Submit your overview here!
  5. March 10: Flamingo: a Visual Language Model for Few-Shot Learning
    • https://openreview.net/forum?id=EbMuimAbPbs
    • Submit your overview here!
  6. March 17: A Generalist Agent
    • https://openreview.net/forum?id=1ikK0kHjvj
    • Submit your overview here!
  7. March 24: Perceiver IO: A General Architecture for Structured Inputs & Outputs
    • https://arxiv.org/abs/2107.14795
    • Submit your overview here!
  8. March 31: data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language
    • https://arxiv.org/abs/2202.03555
    • Submit your overview here!
  9. April 7: Good Friday
    • No seminar
  10. April 14: PaLM-E: An Embodied Multimodal Language Model
    • https://palm-e.github.io/
    • Submit your overview here!
  11. April 21: Sparks of Artificial General Intelligence: Early experiments with GPT-4
    • https://arxiv.org/abs/2303.12712
    • Submit your overview here!
  12. April 28: Alpaca and co.
    • Multiple papers and blogs, links in the subtopics sheet.
    • Submit your overview here!
  13. May 5: SpeechT5
    • https://arxiv.org/abs/2110.07205
    • Submit your overview here!
  14. May 12: VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text
    • https://arxiv.org/abs/2104.11178
    • Submit your overview here!
  15. May 19: ImageBind: One Embedding Space To Bind Them All
    • https://arxiv.org/abs/2305.05665
    • Submit your overview here!
  16. May 26: ChatGPT and other modalities (Visual ChatGPT, HuggingGPT, etc.)
    • Multiple papers and demos, links in the subtopics sheet.
    • Submit your overview here!
Faculty of Science and Technology, University of Tartu
Contact the course organizers with organizational and course-content questions.
The proprietary copyrights of educational materials belong to the University of Tartu. The use of educational materials is permitted for the purposes and under the conditions provided for in the copyright law for the free use of a work. When using educational materials, the user is obligated to give credit to the author of the educational materials.
The use of educational materials for other purposes is allowed only with the prior written consent of the University of Tartu.