Institute of Computer Science

Data Mining (MTAT.03.183), 2015/16 spring

HW11 (25.04) - ML, Clustering, projects...

1. Load the dataset from here (or csv version). The data points belong to two classes: positives and negatives. Notice that the two classes are not linearly separable in the original 2D feature space. However, by applying the "kernel trick" we can map the original feature space into a higher-dimensional one where the classes become linearly separable. Try to come up with new feature(s) based on X and Y such that the given points become linearly separable. (Hint)
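To illustrate the idea of adding a feature, here is a minimal sketch on synthetic data. The homework dataset is not reproduced here, so the ring-shaped data produced by `make_ring_data` and the feature `radial_feature` (squared distance from the origin) are assumptions chosen purely for illustration; the real data may call for a different feature.

```python
import math
import random


def radial_feature(x, y):
    """Hypothetical new feature: squared distance from the origin."""
    return x * x + y * y


def make_ring_data(n=100, seed=0):
    """Synthetic stand-in for the homework data (an assumption):
    positives lie inside radius 1, negatives in a ring of radius 2..3."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        angle = rng.uniform(0, 2 * math.pi)
        r_pos = rng.uniform(0.0, 1.0)
        r_neg = rng.uniform(2.0, 3.0)
        points.append((r_pos * math.cos(angle), r_pos * math.sin(angle), +1))
        points.append((r_neg * math.cos(angle), r_neg * math.sin(angle), -1))
    return points


data = make_ring_data()
pos_z = [radial_feature(x, y) for x, y, label in data if label == +1]
neg_z = [radial_feature(x, y) for x, y, label in data if label == -1]

# No straight line in (X, Y) separates a disc from the ring around it,
# but a single threshold on the new feature does.
separable = max(pos_z) < min(neg_z)
print("separable by a threshold on the new feature:", separable)
```

In the extended space (X, Y, Z) with Z = X² + Y², the threshold on Z corresponds to a plane, which is a linear separator.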

2. Load the dataset from this Excel file (or csv version). Your task is to simulate hierarchical clustering using:

  1. Single link (min distance) clustering
  2. Complete link (max distance) clustering

Use common sense; there is no need to calculate ALL distances. Draw by hand to save time...
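A hand simulation can be checked against a small script. The sketch below is a naive agglomerative procedure, not an optimized one, and the four points in `points` are hypothetical (they are not the homework data). The only difference between the two variants is the linkage function: `min` for single link and `max` for complete link.

```python
import math
from itertools import combinations


def agglomerate(points, linkage):
    """Naive agglomerative clustering.

    linkage is min (single link) or max (complete link).
    Returns the merges as (cluster_a, cluster_b, distance) tuples,
    where clusters are tuples of point indices.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []

    def cluster_dist(a, b):
        # Distance between clusters = min or max over all cross pairs.
        return linkage(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        # Pick the closest pair of clusters under the chosen linkage.
        a, b = min(combinations(clusters, 2), key=lambda pair: cluster_dist(*pair))
        merges.append((a, b, cluster_dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [tuple(sorted(a + b))]
    return merges


# Hypothetical example points on a line (not the homework data).
points = [(0.0, 0.0), (1.0, 0.0), (2.25, 0.0), (3.75, 0.0)]

single = agglomerate(points, min)
complete = agglomerate(points, max)

for name, merges in (("single", single), ("complete", complete)):
    print(name)
    for a, b, d in merges:
        print("  merge", a, "+", b, "at distance", d)
```

On these points the two linkages already disagree: single link attaches point 2 to the cluster {0, 1}, while complete link pairs points 2 and 3 first.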

3. Use the same data, and use the first 4 points as the initial cluster centers for K-means (K = 4). Simulate K-means using Euclidean distance. Again, use common sense and approximate distances where needed. When in serious doubt, you can rely on more precise calculations.
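The K-means iterations can be sketched as follows, for checking a hand simulation. The function name `kmeans_first_k` and the eight example points are hypothetical (not the homework data). Keeping the old center when a cluster becomes empty is one possible convention, assumed here for simplicity.

```python
import math


def kmeans_first_k(points, k, max_iter=100):
    """K-means with the first k points as initial centers (Euclidean distance).

    Returns (centers, assignment), where assignment[i] is the index of the
    center nearest to points[i] once the assignment stops changing.
    """
    centers = [tuple(p) for p in points[:k]]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged: nothing moved
        assignment = new_assignment
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # assumption: an empty cluster keeps its old center
                centers[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centers, assignment


# Hypothetical example points (not the homework data).
points = [(0, 0), (0, 2), (10, 0), (10, 2), (1, 0), (1, 2), (11, 0), (11, 2)]
centers, assignment = kmeans_first_k(points, k=4)
print(centers)
print(assignment)
```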

4. Form a team of one to four people. Select a project topic. Identify a data set and define the scope of the project. Add your project description as a single slide in this file - https://docs.google.com/presentation/d/1ARpew6odg24QB4cnJ3wmr6qeAkNcajfVLxJn6mYu5Lk/edit

5. Apply descriptive statistics techniques to describe your selected data set.
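As a starting point, basic descriptive statistics for one numeric column can be computed with Python's standard `statistics` module. This is a minimal sketch: the helper `describe` and the sample `values` are hypothetical, and a real project data set would typically also need per-column summaries, missing-value counts, and plots.

```python
import statistics


def describe(values):
    """Summarize one numeric column with common descriptive statistics."""
    # quantiles(n=4) returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "count": len(values),
        "min": min(values),
        "q1": q1,
        "median": statistics.median(values),
        "q3": q3,
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


# Hypothetical column values (not from any course data set).
values = [2, 4, 4, 5, 7, 9, 10, 12]
summary = describe(values)
for name, value in summary.items():
    print(f"{name:>6}: {value}")
```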

If you do not have a team yet for tasks 4 and 5, use the practice session as an opportunity to attract team members by presenting your data in an interesting way.

  • Institute of Computer Science
  • Faculty of Science and Technology
  • University of Tartu
The proprietary copyrights of educational materials belong to the University of Tartu. The use of educational materials is permitted for the purposes and under the conditions provided for in the copyright law for the free use of a work. When using educational materials, the user is obligated to give credit to the author of the educational materials.
The use of educational materials for other purposes is allowed only with the prior written consent of the University of Tartu.