
Neural Networks (LTAT.02.001), 2017/18 autumn


DEADLINES

Checkpoint 1: 08.11.2017 - fill in this project form (it should stay at about one page) and upload it under Practices (there is a special task "Project Checkpoint 1").

Checkpoint 2: Sun 03.12.2017 - Create an empty version of your final report. Fill in the Introduction and Background sections. If you want, you can also start filling in Methods and Results. Upload it under Practices (there is a special task "Second checkpoint").

PRESENTATION: Tue 19.12.2017 at 14.15 or Fri 12.01.2018 at 12.15 in Liivi 2-122 - To present in December you need to have results already; methods and expected results alone are not enough. If you have no results yet (due to lack of GPU time, for example), you need to postpone the presentation to January. The presentation can also be given by a single person - not everyone needs to be on stage.

You will have 5-7min for the presentation and a few minutes for questions. You should limit your presentation to 5 slides. No posters need to be produced.

If you want and your project allows you to, you can do a demo.

REPORT DEADLINE: Fri 12.01.2018 - Reports need to be submitted by the January presentation day. Those who present in December can keep working on their project and add new results to the report until the January presentation date. The final report should be 5-10 pages (depending on the number and size of images), including the references.

Report format (advisory) - Please format your report like a scientific paper rather than a thesis: no title page, no table of contents, but with an abstract, references, author contributions, etc.

Report should be submitted here.


Projects

The list of projects will grow significantly as the course progresses. Each team can also suggest its own project.

The difficulty labels EASY, MEDIUM and HARD do not mean that you cannot get the same number of points for an EASY project as for a HARD one. For an EASY project we expect a much more thorough study of the problem, because the tools already exist and you do not need to invent them. You are therefore expected to produce a really nice report, baselines to compare against, etc.
For a HARD problem it might be impossible to get decent results in this timeframe, so the final results are not as important as the systematic approach you have applied and the effort you put into it.

Points

Your mark will be decided based on the final report (3 x 25%: content, complexity, form) and the final presentation (25%). The intermediate milestones are, however, a wonderful way to lose points: they are demanded of you, and if you fail to deliver or your work is not thorough enough, you will be penalized.

By default all members of a group get the same amount of points. If you have complaints about a team member inform your supervisor immediately. Also, before starting, choose well who you work with (do not take freeloaders).


LIST OF PROJECTS


You can find the list of projects and the list of people working on them here:
https://docs.google.com/spreadsheets/d/1UNO1TJ-S1BEm9_3LRei7hVgV5EI1epfCJlTlr7DKZbI/edit?usp=sharing


AI RC car racing

RCSnail OÜ operates an RC (remote control) car racing track in Tartu SPARK Democentre. The cars have a camera attached; the camera view is shown on a monitor and the car can be controlled with a driving wheel. This setup is also ideal for creating a self-driving car. The input to the neural network would be a camera image (or a sequence of camera images) and the outputs are driving commands: wheel turn, acceleration and brake values. The network is trained using behavioral cloning, i.e. trained to imitate human driving commands.
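As a toy illustration of behavioral cloning, the sketch below trains a linear model (standing in for the real convolutional network) to imitate synthetic "human" driving commands by minimizing the mean squared error between predicted and recorded commands. All shapes and data here are made up for the sketch.

```python
import numpy as np

# Toy behavioral-cloning sketch: a linear model stands in for the CNN,
# and random "frames" with synthetic human commands stand in for real data.
rng = np.random.default_rng(0)
n_frames, n_pixels = 500, 64                 # hypothetical 8x8 frames, flattened
frames = rng.normal(size=(n_frames, n_pixels))
true_policy = rng.normal(size=(n_pixels, 3))
commands = frames @ true_policy              # [steering, acceleration, brake]

# Train by minimizing mean squared error between predicted and human commands.
W = np.zeros((n_pixels, 3))
for _ in range(200):
    pred = frames @ W
    grad = frames.T @ (pred - commands) / n_frames
    W -= 0.1 * grad

final_loss = np.mean((frames @ W - commands) ** 2)
print(f"imitation loss: {final_loss:.4f}")
```

In the real project the linear map is replaced by a convolutional network and the synthetic commands by logged human driving.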

PDF
Initial dataset

Difficulty: medium
Type: Application of existing solutions, engineering
Supervisors: Tambet Matiisen, Rainer Paat

Predict app rating from screenshot

Visuals are an important part of highly rated mobile apps. The idea of this project is to evaluate the visuals of mobile apps automatically: screenshots from the Apple App Store and Google Play Store are used to predict the rating of the app.

PDF

Difficulty: medium
Type: application
Supervisors: Tambet Matiisen, Kristjan Sägi, Triin Kask

Predict book sales based on book cover

A similar idea to "Predict app rating from screenshot", except that the image is a book cover and the target is the sell-through magnitude class. Other variables, such as book width and height, number of pages, price, category, title and publisher, could perhaps be added as additional input parameters. The idea is to extract the features and find consensus covers for a specific category, title or price.

Difficulty: hard
Type: application
Supervisors: ???

Testing the optimal compression hypothesis in the training of neural networks

The goal of this project is to test a provocative hypothesis presented in the popular paper "Opening the black box of deep learning via information": https://arxiv.org/pdf/1703.00810.pdf

The authors of the paper provide some numerical evidence that training a neural network by gradient descent proceeds by first changing the weights to fit the labels or targets, and then compressing the representations of the input in the hidden layers as much as possible without increasing the error. If true, this result is important because representations of the input can be compressed by methods far faster than gradient descent, which would considerably shorten the time needed to train neural networks.

In the project, we will test this hypothesis on cases in which we know how much the input can be compressed by the hidden layers. In particular, we plan to train a neural network to classify a set of cat and dog images with the particularity that each image contains one pixel carrying perfect information about the image's label. If the optimal-compression hypothesis is true, a network trained by gradient descent should compress the representation of the input so that it contains only the informative pixel and forgets the rest of the image. We have serious doubts that this will be the case; we expect that the network will keep redundant information from the rest of the image (that is, from parts of the image other than the perfectly informative pixel).

In this project the students will help to test a research question. The steps include training a neural network to perform binary classification of images, modifying the training data to include (or not) an informative pixel, and applying mutual information to measure the degree of compression achieved by the hidden layers.
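The informative-pixel setup and the mutual-information measurement can be prototyped in a few lines. The sketch below uses made-up image shapes and a simple discrete (plug-in) mutual-information estimator; it is an illustration of the idea, not the estimator used in the paper.

```python
import numpy as np

# Each toy "image" gets one pixel that carries perfect label information;
# we then measure the mutual information (in bits) between pixels and labels.
rng = np.random.default_rng(0)
n = 1000
labels = rng.integers(0, 2, n)
images = rng.random((n, 8, 8))
images[:, 0, 0] = labels                      # the perfectly informative pixel

def mutual_information(x, y):
    """Plug-in estimate of discrete mutual information in bits."""
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            px, py = np.mean(x == xv), np.mean(y == yv)
            if pxy > 0:
                mi += pxy * np.log2(pxy / (px * py))
    return mi

informative = images[:, 0, 0].astype(int)
random_pixel = (images[:, 0, 1] > 0.5).astype(int)
print(mutual_information(informative, labels))   # close to 1 bit
print(mutual_information(random_pixel, labels))  # close to 0 bits
```

The same estimator (after discretizing activations) can be applied to hidden-layer representations to quantify how much of the image the network actually keeps.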

Difficulty: medium
Type: Research
Supervisors: Raul Vicente

Reconstructing and visualizing a human body made from word embeddings

The goal of this project is to use word embeddings to represent the words of different parts of the body in a distributed manner, and visualize the relative position of those vectors.

This project involves training word2vec on a large corpus of text, selecting the vectors of words associated with different body parts, and visualizing their locations in 3D using dimensionality-reduction techniques such as t-SNE.

If successful, the project should produce a homunculus (a sort of distorted human body) in which the distance between two body parts is given by the similarity of their semantic roles in the text. Different text corpora will be used and compared.
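A minimal sketch of the pipeline, with random toy vectors standing in for trained word2vec embeddings and PCA (via SVD) standing in for t-SNE; in the real project the embeddings would be trained (e.g. with gensim's word2vec) on actual corpora.

```python
import numpy as np

# Toy sketch: random vectors stand in for trained word2vec embeddings of
# body-part words; PCA (via SVD) stands in for t-SNE as the 2D projection.
rng = np.random.default_rng(0)
body_parts = ["head", "eye", "ear", "hand", "finger", "knee", "foot", "toe"]
embeddings = {w: rng.normal(size=50) for w in body_parts}  # 50-dim "vectors"

X = np.stack([embeddings[w] for w in body_parts])
X = X - X.mean(axis=0)                       # center before PCA
U, S, Vt = np.linalg.svd(X, full_matrices=False)
coords_2d = X @ Vt[:2].T                     # project onto top two components

for word, (x, y) in zip(body_parts, coords_2d):
    print(f"{word:>7}: ({x:+.2f}, {y:+.2f})")
```

With real embeddings, plotting these coordinates (or a 3D t-SNE projection) gives the distorted body map described above.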

Difficulty: easy
Type: Application of existing work
Supervisors: Raul Vicente

Join a Kaggle competition

  • Cdiscount’s Image Classification Challenge (1st place $20000, deadline December 14, 2017)
  • Spooky Author Identification (best 6 get $2000, deadline December 15, 2017)

Difficulty: medium/hard
Type: Application
Supervisors: Tambet Matiisen

Predict football scores

There is a nice database of football scores in Europe from 2008-2016.
https://www.kaggle.com/hugomathien/soccer
Your goal is to predict either the winner/draw, the exact score, or the goals scored, with above-random accuracy, using a neural network. The first complicated step is to figure out which features are relevant and how to feed them in as inputs. Once you have achieved above-random accuracy, you could compare with another algorithm such as Random Forest. You could also compare against betting odds to see how much money you would lose if you trusted your system (try it on current games).

It is very likely that you will get very poor accuracy - don't be under the illusion that this is easy.
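One hedged sketch of the feature-engineering step: one-hot encode team identities and append numeric recent-form features. The team names and fields below are illustrative only, not the actual schema of the Kaggle database.

```python
import numpy as np

# Hypothetical encoding of one match as a fixed-length input vector:
# one-hot home team + one-hot away team + average goals over last 5 games.
teams = ["Barcelona", "Real Madrid", "Valencia", "Sevilla"]
team_index = {t: i for i, t in enumerate(teams)}

def encode_match(home, away, home_goals_last5, away_goals_last5):
    home_vec = np.zeros(len(teams)); home_vec[team_index[home]] = 1.0
    away_vec = np.zeros(len(teams)); away_vec[team_index[away]] = 1.0
    form = np.array([home_goals_last5 / 5.0, away_goals_last5 / 5.0])
    return np.concatenate([home_vec, away_vec, form])

x = encode_match("Barcelona", "Sevilla", home_goals_last5=12, away_goals_last5=6)
print(x.shape)   # one input row for the network
```

Richer features (player ratings, head-to-head history, odds) slot into the same pattern of one fixed-length vector per match.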

Difficulty: medium/hard
Type: Application
Supervisors: Ardi Tampuu

Predict stock market movement

There are many datasets available about stock market movements. Find some and use neural networks to predict the movements. Minimize your losses (or make a profit).

An interesting alternative approach is to predict stock movements based on news in Google Finance.

Difficulty: hard
Type: Application
Supervisors: ???

Replicate a neural network paper

“What I cannot create, I do not understand” - R. Feynman
The best way to understand a piece of technology is to try to recreate it on your own. This is a great way to learn about NNs and, once a network architecture or method is implemented, an important contribution to the scientific and industrial communities.

To complete this project, you need to take a paper about a neural network architecture or an NN-relevant technique (such as batch normalization, Xavier initialization, etc.) which does not yet have an open-source Keras implementation. Implement it and provide proof that it works as expected.

Difficulty: Hard
Type: Research
Supervisor: ??

Simulator of Newtonian dynamics using neural networks

TBD

Difficulty: hard?
Supervisors: Raul Vicente

Characterizing and visualizing the gradient descent on complex energy landscapes

TBD

Difficulty: hard?
Supervisors: Raul Vicente

Spelling correction using neural networks

It has been shown that neural networks can generate grammatically correct text. In this project we are going to use neural networks for automatic spelling correction. We obtain supervision by synthetically introducing common typing errors into text and training the network to revert them. Recurrent neural networks with character-level embeddings will be used.
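The synthetic-supervision step might look like the sketch below, which corrupts clean text with three common error types (dropped, doubled and transposed characters). The exact error model is an assumption for illustration, not taken from any paper; (corrupted, clean) pairs produced this way would then train the correction network.

```python
import random

# Corrupt clean text with common typing errors; each alphabetic character
# is independently dropped, doubled, or swapped with its neighbor.
def add_typos(text, rate=0.1, rng=None):
    rng = rng or random.Random(0)
    chars, out, i = list(text), [], 0
    while i < len(chars):
        c = chars[i]
        if c.isalpha() and rng.random() < rate:
            op = rng.choice(["drop", "double", "swap"])
            if op == "drop":
                pass                              # delete the character
            elif op == "double":
                out.extend([c, c])                # duplicate it
            elif op == "swap" and i + 1 < len(chars):
                out.extend([chars[i + 1], c])     # transpose with the next one
                i += 1
            else:
                out.append(c)
        else:
            out.append(c)
        i += 1
    return "".join(out)

clean = "neural networks can correct spelling"
noisy = add_typos(clean, rate=0.2)
print(noisy)
```

The character-level model is then trained to map `noisy` back to `clean`.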

Difficulty: medium
Type: Research
Supervisors: Tambet Matiisen, Mark Fishel

ICLR 2018 Reproducibility Challenge

You should select a paper from the 2018 ICLR submissions, and aim to replicate the experiments described in the paper. The goal is to assess if the experiments are reproducible, and to determine if the conclusions of the paper are supported by your findings. Your results can be either positive (i.e. confirm reproducibility), or negative (i.e. explain what you were unable to reproduce, and potentially explain why).

Homepage

Difficulty: hard
Type: Research
Supervisor: Tambet Matiisen

Jaan Aru bot

The best way to navigate the never-ending flow of new AI papers is human curators. In the computational neuroscience group, Jaan Aru has been our main filter, delivering the daily dose of new arXiv preprints. What happens if we lose him? Can we replace him with a bot? Your job is to train a network to predict whether Jaan Aru would send an arXiv paper to the computational neuroscience list.

Example

Difficulty: medium
Type: Fun / Application
Supervisors: Tambet Matiisen, Jaan Aru

Learning to play Go 2048

In October 2017 (last week) Google DeepMind published an article describing improvements to their AlphaGo model, which has beaten the world's best Go players in recent years. Their new AlphaGo Zero is more computationally efficient and, very importantly, learns without using any human knowledge about the game (no supervised pre-training on real-world games). In this project you have to read the AlphaGo Zero article, reproduce the algorithm and apply it to a simpler game: 2048 (it was a hit a few years ago). Basically, you need to end up with a 2048 bot that is very good at the game.

Article

Blog post

Notice that the main goal is to deeply understand the AlphaGo system, not to create a good 2048 solver.
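Whatever the focus, you will need a 2048 environment for self-play. Its core mechanic, sliding one row and merging equal neighbors once per move, can be sketched as:

```python
# Slide a 2048 row to the left: push tiles together, merge each equal pair
# once (as in the original game), and return the new row plus points gained.
def slide_row_left(row):
    tiles = [t for t in row if t != 0]       # push tiles together
    merged, score, i = [], 0, 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(tiles[i] * 2)      # merge a pair once
            score += tiles[i] * 2
            i += 2
        else:
            merged.append(tiles[i])
            i += 1
    return merged + [0] * (len(row) - len(merged)), score

print(slide_row_left([2, 2, 4, 0]))   # → ([4, 4, 0, 0], 4)
print(slide_row_left([2, 2, 2, 2]))   # → ([4, 4, 0, 0], 8)
```

The other three move directions are rotations of this, and random 2/4 tile spawning completes the environment the self-play loop runs against.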

Difficulty: hard
Type: Application
Supervisors: Ardi Tampuu

Generating text in simple English

Text generation is an interesting task in AI (to pass the Turing test, for example, or to generate captions), and RNNs have been applied to it with reasonable success. The problem with the usual text databases used to train text generators is that the number of different words can reach 1,000,000 and more. This makes training harder, both because the number of parameters grows and because many words have very few training examples. Another approach is to look not at individual words but at individual characters (reducing the vocabulary size to the number of different characters), but character-based models have a hard time generating consistent text, because the beginning and end of a sentence can be 100 timesteps (characters) apart and the model simply forgets.

The idea here is to use text from Simple English Wikipedia to train a text-generating network. Simple English Wikipedia is a collection of wiki articles that try to use only simple English words and avoid scientific (and other) jargon. This should bring the vocabulary down from ~1M words to far fewer (names of people and places will still inflate the vocabulary, but they can be filtered out). Hopefully, speaking simple English is also simpler for RNNs.

You need to: download Simple Wiki articles, clean them of HTML tags and other artifacts, choose a vocabulary size and decide what to do with names, then train a model and test it.
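The vocabulary step might look like the sketch below: keep the k most frequent words and map everything else (names included) to a single `<unk>` token. The tiny example text is of course made up.

```python
from collections import Counter

# Keep the `size` most frequent words; everything else becomes <unk>.
def build_vocab(tokens, size):
    counts = Counter(tokens)
    vocab = {w for w, _ in counts.most_common(size)}
    encoded = ["<unk>" if t not in vocab else t for t in tokens]
    return encoded, vocab

text = "the cat sat on the mat and the cat saw Alice".lower().split()
encoded, vocab = build_vocab(text, size=4)
print(encoded)
```

On the real Simple Wiki dump, `size` would be chosen so that `<unk>` covers only the rare tail (and names can be mapped to a separate placeholder before counting).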

Difficulty: medium
Type: Application
Supervisors: Ardi Tampuu

Generating text based on: (movie descriptions/poetry/any other text database)

(read background above)
Choose a text dataset that could potentially generate interesting and special type of text. Train a model, profit.

Difficulty: easy -medium
Type: Application
Supervisors: Ardi Tampuu

Generating text based on video

Download lots of videos and their plot descriptions. Build a network that generates a plot description based on the video.

Difficulty: impossible
Type: Application
Supervisors: Tambet Matiisen

Fashion-MNIST

MNIST is a well-known dataset of handwritten digits from the 1990s. It is the first dataset new ideas are tried on, because it is small and experimentation proceeds fast. But over time researchers got bored of predicting handwritten digits. Now they finally have an alternative: the Fashion-MNIST dataset! It includes pictures of different garments that must be classified into 10 classes: T-shirt/top, trouser, pullover, dress, coat, sandal, shirt, sneaker, bag or ankle boot. It serves as a drop-in replacement for the original MNIST dataset, i.e. it has the same 28x28 resolution and the same training- and test-set sizes. Your task is to fit a better model than the baselines in the Fashion-MNIST paper. You are expected to experiment with novel approaches such as batch normalization, residual connections, etc.
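As an example of one of the techniques named above, here is a numpy sketch of the batch-normalization forward pass (training mode only, no running statistics): activations are normalized per feature over the batch, then rescaled with learnable gamma and beta. The activation shapes are made up.

```python
import numpy as np

# Batch normalization, forward pass: normalize each feature over the batch,
# then apply the learnable scale (gamma) and shift (beta).
def batch_norm_forward(x, gamma, beta, eps=1e-5):
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
acts = rng.normal(loc=3.0, scale=5.0, size=(128, 10))   # skewed activations
out = batch_norm_forward(acts, gamma=np.ones(10), beta=np.zeros(10))
print(out.mean(), out.std())   # near 0 and near 1
```

In Keras the same operation is a one-line `BatchNormalization()` layer, which also tracks running statistics for test time.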

Dataset
Paper

Difficulty: easy
Type: Application
Supervisors: Tambet Matiisen

Classify chest X-ray images

The NIH Clinical Center recently released over 100,000 anonymized chest x-ray images and their corresponding data to the scientific community. The release will allow researchers across the country and around the world to freely access the datasets and increase their ability to teach computers how to detect and diagnose disease. Ultimately, this artificial intelligence mechanism can lead to clinicians making better diagnostic decisions for patients.

The chest X-ray dataset comprises 112,120 frontal-view X-ray images of 30,805 unique patients, with fourteen disease labels text-mined from the associated radiological reports using natural language processing (each image can have multiple labels). The fourteen common thoracic pathologies are Atelectasis, Consolidation, Infiltration, Pneumothorax, Edema, Emphysema, Fibrosis, Effusion, Pneumonia, Pleural_thickening, Cardiomegaly, Nodule, Mass and Hernia. Your task is to replicate the results from the original paper and possibly propose improvements.
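Because each image can carry several of the fourteen labels at once, the output layer typically uses one sigmoid per disease with binary cross-entropy, rather than a softmax over mutually exclusive classes. A numpy sketch of that loss (four labels and the example diseases are illustrative):

```python
import numpy as np

# Multi-label loss: an independent sigmoid per disease, averaged binary
# cross-entropy over all labels.
def multilabel_bce(logits, targets):
    probs = 1.0 / (1.0 + np.exp(-logits))
    eps = 1e-12
    return -np.mean(targets * np.log(probs + eps)
                    + (1 - targets) * np.log(1 - probs + eps))

targets = np.array([[1, 0, 1, 0]], dtype=float)   # e.g. Effusion + Nodule
good = multilabel_bce(np.array([[5.0, -5.0, 5.0, -5.0]]), targets)
bad = multilabel_bce(np.array([[-5.0, 5.0, -5.0, 5.0]]), targets)
print(good, bad)   # confidently-correct predictions give a far smaller loss
```

Evaluation in the original paper is likewise per-label (AUC per pathology), which this loss pairs with naturally.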

Press release
Dataset
Paper

Difficulty: ???
Type: Research
Supervisors: Tambet Matiisen

Dogs vs Cats

In this project, you'll write an algorithm to classify whether images contain either a dog or a cat. This is easy for humans, dogs, and cats. Your computer will find it a bit more difficult.

  1. Deep Blue beat Kasparov at chess in 1997.
  2. Watson beat the brightest trivia minds at Jeopardy in 2011.
  3. Can you tell Fido from Mittens in 2017?

You should do something extra besides simple classification, but I'm not sure yet what it should be. Try different transfer learning approaches?
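One way to read "transfer learning" here: freeze a pretrained feature extractor and train only a small head on top. The sketch below fakes the frozen extractor with a fixed random projection and trains a logistic-regression head; everything about the data is synthetic, and a real solution would use a pretrained CNN (e.g. from Keras applications) instead.

```python
import numpy as np

# Transfer-learning sketch: a fixed random projection stands in for a frozen
# pretrained CNN; only the logistic-regression head is trained.
rng = np.random.default_rng(0)
n, n_pixels, n_features = 400, 16, 32
images = rng.normal(size=(n, n_pixels))
labels = (images[:, 0] > 0).astype(float)             # toy "dog vs cat" labels

W_frozen = rng.normal(size=(n_pixels, n_features)) / np.sqrt(n_pixels)
features = np.tanh(images @ W_frozen)                 # frozen "CNN" features

w, b = np.zeros(n_features), 0.0
for _ in range(500):                                  # train only the head
    p = 1.0 / (1.0 + np.exp(-(features @ w + b)))
    w -= 0.5 * features.T @ (p - labels) / n
    b -= 0.5 * np.mean(p - labels)

preds = 1.0 / (1.0 + np.exp(-(features @ w + b))) > 0.5
accuracy = np.mean(preds == labels.astype(bool))
print(f"train accuracy: {accuracy:.2f}")
```

Fine-tuning some of the frozen layers, or comparing several pretrained backbones, would be natural "something extra" experiments.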

Homepage

Difficulty: easy
Type: Application
Supervisors: Tambet Matiisen
