
Data Mining Research Seminar (MTAT.03.277), autumn 2015/16

Deep Learning for Natural Language Processing

You may have seen this dialogue in the news recently:

Human: what is the purpose of life?
Machine: to serve the greater good.
Human: what is the purpose of living?
Machine: to live forever.
Human: what is the purpose of existence?
Machine: to find out what happens when we get to the planet earth.
Human: where are you now?
Machine: i’m in the middle of nowhere.
Human: what is the purpose of dying?
Machine: to have a life.
Human: what is the purpose of being intelligent?
Machine: to find out what it is.
Human: what is the purpose of emotions?
Machine: i don’t know.

This is not a fragment from "The Hitchhiker's Guide to the Galaxy", but the output of an actual neural conversational agent created at Google. It learned to give decent responses in conversation and even to perform simple common-sense reasoning, all from 62M sentences in a movie subtitle dataset. Isn't that amazing? We think it is. By the end of this course we should all understand how they did it and whether similar techniques can be used in other domains.

We will learn about:

  • word vectors and distributed semantics,
  • neural networks and backpropagation,
  • recurrent neural networks for language modeling.

While the emphasis of the course is on natural language processing, it can also be considered a general course in artificial neural networks, especially recurrent ones, which are a hot topic in deep learning right now.

Prerequisites

  • Experience with Python and NumPy (or willingness to learn). There is a great tutorial, in case you need a refresher.
  • Basic calculus and linear algebra. You should be comfortable taking derivatives and multiplying matrices. If you need a refresher, see this.
  • Basic probability and statistics. You should know what conditional probability, the Gaussian distribution, the mean and the standard deviation are. For a refresher see this.
  • Machine learning basics. You should know what a loss function is and have an idea of how gradient descent works. For a refresher see this.
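
As a quick self-check for the last two prerequisites, the sketch below fits a linear model by plain gradient descent on a mean squared error loss, using only NumPy. It is an illustration and not part of the course materials: the function names, the synthetic data, the learning rate and the number of steps are all assumptions chosen for this example. If the loss, its gradient and the update loop read naturally to you, you are probably well prepared.

```python
import numpy as np


def mse_loss(w, X, y):
    """Mean squared error of the linear model X @ w against targets y."""
    residual = X @ w - y
    return float(np.mean(residual ** 2))


def mse_gradient(w, X, y):
    """Gradient of mse_loss with respect to the weight vector w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)


def gradient_descent(X, y, learning_rate=0.1, steps=500):
    """Start from zero weights and repeatedly step against the gradient."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w = w - learning_rate * mse_gradient(w, X, y)
    return w


# Synthetic, noise-free data (an assumption made for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w = np.array([2.0, -3.0])
y = X @ true_w

w = gradient_descent(X, y)
print("estimated weights:", w)
print("final loss:", mse_loss(w, X, y))
```

With noise-free targets the estimated weights should end up very close to `true_w`; try a much larger learning rate to see the updates diverge.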

Organization

In this seminar we are going to follow the Stanford University course Deep Learning for Natural Language Processing, given by Richard Socher (CEO of MetaMind, coauthor of GloVe and of zero-shot learning techniques). We are going to work through the homework assignments, and there will be tests created by fellow students.

Seminars

Watching the course lectures will be left as homework. Seminar time will be mostly used for discussion. There will be two types of seminars: tests and homework presentations.

Tests

For some seminars one student has to create a test for the others and later grade the results. In such a seminar we spend the first 30 minutes discussing the key points of the forthcoming test, the next 30 minutes on the test itself, and the final 30 minutes discussing the right answers.

NB! The tests must be sent to tambet.matiisen@ut.ee 3 days BEFORE the seminar, for review and feedback!

Homework

In the other seminars students present the homework assignments. During the seminar the others can check whether they agree with the solution and whether they got the same answers. Everybody has to do all the homework assignments, but you have to present only one (or two, if you didn't create a test).

NB! Each homework must be sent to tambet.matiisen@ut.ee no later than 3 days AFTER the seminar where it was presented!

Grading

To pass the seminar, you have to do the following:

  1. create one test or present one homework;
  2. collect at least 60% of points from tests;
  3. submit all homework assignments from the course.

The seminar is worth 3 ECTS.

In addition, you have the option to do a project based on the things you have learned. This can be replicating a paper, applying a similar technique to your own dataset, or something else. It will be recorded as a separate course, "MTAT.03.275 Special Assignment in Data Mining", and is worth an additional 3 ECTS. To claim those points you have to:

  1. produce a report,
  2. present your results in a seminar.

Administrative details

Date and location: Mon 12:15 @ Liivi 2-512

All announcements during the semester will be made through the deeplearning@lists.ut.ee list. To subscribe to the list, send an e-mail to sympa@lists.ut.ee with SUBSCRIBE deeplearning <your name> in the e-mail body.

Contacts:

  • Tambet Matiisen, tambet.matiisen@ut.ee
  • Sven Laur, swen@ut.ee

Institute of Computer Science, Faculty of Science and Technology, University of Tartu
The economic copyright in the study materials belongs to the University of Tartu. The study materials may be used for the purposes and under the conditions of free use of a work as laid down in the Copyright Act. When using the study materials, the user is obliged to credit their author.
Use of the study materials for other purposes is permitted only with the prior written consent of the University of Tartu.