## MTAT.03.251 Graph Mining

## MTAT.03.251 Andmekaeve graafides

**Seminars**: On Tuesdays at 16:15-17:45 J. Liivi 2 - 403**Office hours**: Mondays Liivi 2-326. Please arrange exact details of the meeting by email.**Questions:**swen@math.ut.ee- The first seminar starts on
**14:15 on 8th February** - The course is based on the book
*Eric D. Kolaczyk.*Statistical Analysis of Network Data. - Lectures and seminar sessions are held in English or in Estonian depending on circumstances.
- Final reports are methodical web materials written in English.

## What and Why

The course aims to give a overview of basic graph mining techniques starting form the graph construction and ending with prediction of graph structures. Graph mining is relatively new research area. Although graphs are natural way to represent connections and associations between objects, they are also complex discrete objects compared to simple arrays and matrices. As a result, standard statistical techniques and machine learning algorithms have to be modified to run on top of graphs. In this course, we view basic ways how to overcome this by using graph specific methods and various ways to flatten the complex structure by feature extraction. Another reason why graph mining is relatively new is purely pragmatical. Although large network structures have been present for a long time, only recent changes in technology have made it possible to collect such kind of data. For example, only recent network-oriented applications such as instant messengers, social network websites and geo-tagging, have opened an opportunity to observe and study large-scale social interactions. Also, lack of efficient algorithms and lack of raw computing power have been hindering factors. As such the topic is highly suitable for students from computer science and statistics. In particular, note that elements of graph mining are heavily used in system biology and bioinformatics and several major projects done in cooperation between department of computer science and competence center STACC are closely related to study of social networks.

## Requirements

There are no formal prerequisites to the seminar. However, basic knowledge in programming, basic math and graph theory is advisable. Also, one needs reasonable English skills to complete the course report. If the formal requirements of the ÕIS do not permit registration then write me an email or talk with me. After that we decide whether to enroll you or not.

## To pass the course

Differently from the last years seminar, the course is organized as a reading group. That is, on average each student makes two presentations and some illustrative demos. The demos will be included into web tutorials. The course grade will be decided on the quality of presentations and web tutorials. Tutorials must contain explanation of theoretical concepts, their graphical illustrations, code snippets in **GNU R** that explain the concepts furthermore.

You must actively participate in most seminars or otherwise you do not pass the seminar. Namely, student gets grade **F** if he or she misses 3 or more seminars. In reasonable circumstances, it is possible to compensate missed seminars by extra work. Details are determined by individual agreements with the lecturer.
You must do a project work. The project work will be graded in the scale from A to F and it will form the basic grade for the course.

## Assignments

### Presentations

Topic | Team | Presentation date |
---|---|---|

Graph construction and visualisation | Egon Elbre, Elena Nikolaeva & Roland Pihlakas | 22th of February |

Local and global characteristic | Vijayachitra Modhukur & Shahab Mokarizadeh | 8th of March |

Sampling and Estimation in Network Graphs | Vijayachitra Modhukur & Shahab Mokarizadeh | 15th of March |

Small-World Models and Network Growth Models | Vijayachitra Modhukur & Shahab Mokarizadeh | 22th of March |

Basic Methods for Link Prediction | Egon Elbre, Elena Nikolaeva & Roland Pihlakas | 29th of March |

SVM-s and Kernel methods for Graphs | Vijayachitra Modhukur & Shahab Mokarizadeh | 5th of April |

Modeling Dynamic Processes in Graphs | Egon Elbre, Elena Nikolaeva & Roland Pihlakas | 19th of April |

Gravity Models | Egon Elbre, Elena Nikolaeva & Roland Pihlakas | 26th of April |

Tree reconstruction algorithms | Egon Elbre, Elena Nikolaeva & Roland Pihlakas | 3rd of May |

## Web Tutorials

Topic | Student | Status |
---|---|---|

Graph construction and visualisation | Elena | |

Local and Global characteristics | Chitra | |

Small-World Models and Network Growth | Sahab | |

Sampling and Estimation In Network Graphs | Sahab | |

SVM and Kernel methods for Graphs | Chitra | |

Basic Methods for Link Prediction | Roland | |

Modeling Dynamic Processes in Graphs | Egon | |

Markov Fields | Egon | |

Gravity Models | Elena | |

Tree Reconstruction Algorithms | Roland |