Institute of Computer Science
Data Mining (MTAT.03.183)

Data Mining 2014/15 spring

HW 9 (due April 19th) Clustering continued...

1. Use the attached data set D1.txt. Visualise this data using some density representation, e.g. a 2-D density-based heatmap or a 3-D density plot ("mountains").
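One possible starting point for task 1 is a plain 2-D histogram: count how many points fall into each cell of a regular grid and colour the cells by count. The sketch below assumes that D1.txt holds one point per line as two whitespace-separated numbers, which the assignment does not state; the names load_points and density_grid and the default of 20 bins are illustrative choices only.

```python
def load_points(path):
    """Read (x, y) pairs from a text file.

    Assumption: one point per line, two whitespace-separated numbers.
    """
    points = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 2:
                points.append((float(fields[0]), float(fields[1])))
    return points


def density_grid(points, bins=20):
    """Count the points in each cell of a bins x bins grid.

    The grid covers the bounding box of the data; grid[row][col] is the
    count for the row-th bin along y and the col-th bin along x.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Fall back to a unit step when all points share a coordinate.
    x_step = (x_max - x_min) / bins or 1.0
    y_step = (y_max - y_min) / bins or 1.0
    grid = [[0] * bins for _ in range(bins)]
    for x, y in points:
        # Points on the upper edge are placed in the last bin.
        col = min(int((x - x_min) / x_step), bins - 1)
        row = min(int((y - y_min) / y_step), bins - 1)
        grid[row][col] += 1
    return grid
```

If matplotlib is available, the grid can be shown as a heatmap with plt.imshow(grid, origin="lower"). A kernel density estimate would give a smoother picture than raw counts.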

2. Overlay the following 12 "cluster centers" on a 2-D scatterplot of the data:

3	4
3	5
3	6
4	4
4	5
4	6
5	4
5	5
5	6
6	4
6	5
6	6

This would look something like the example plot (image not included here). Make your own version.
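The 12 points listed above form a regular 4 x 3 grid, so they can be generated rather than typed in. The sketch below builds them and overlays them on the data; it assumes matplotlib is installed, and the name overlay_centers, the colours and the marker sizes are arbitrary illustrative choices.

```python
from itertools import product

# The 12 starting centers from the list above: x in 3..6, y in 4..6.
CENTERS = list(product(range(3, 7), range(4, 7)))


def overlay_centers(points, centers=CENTERS):
    """Scatter the data in grey and draw the centers on top as red crosses."""
    import matplotlib.pyplot as plt  # assumed to be installed

    fig, ax = plt.subplots()
    ax.scatter([p[0] for p in points], [p[1] for p in points],
               s=5, color="grey")
    ax.scatter([c[0] for c in centers], [c[1] for c in centers],
               s=80, color="red", marker="x")
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    return fig
```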

Now run K-means clustering with the 12 points above as the starting centers, and plot the final cluster centers in the same way. Plot both the initial and the final state (how each center has moved). (Optional: plot the full trajectory of the K-means centers through the intermediate steps of the algorithm.)
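To plot how each center moves, it helps to keep the center positions after every iteration. The following is a minimal sketch of standard K-means (Lloyd's algorithm) that returns this history; the function name, the stopping tolerance and the rule that a center with no assigned points stays in place are assumptions made for the illustration, not requirements of the task.

```python
import math


def kmeans(points, centers, max_iter=100, tol=1e-9):
    """Run K-means from the given starting centers.

    Returns a list of snapshots: history[0] holds the starting centers
    and history[-1] the final ones.
    """
    history = [[tuple(c) for c in centers]]
    for _ in range(max_iter):
        current = history[-1]
        # Assignment step: group the points by their nearest center.
        groups = [[] for _ in current]
        for p in points:
            nearest = min(range(len(current)),
                          key=lambda i: math.dist(p, current[i]))
            groups[nearest].append(p)
        # Update step: move each center to the mean of its group.
        # Assumption: a center that attracts no points stays where it is.
        updated = []
        for center, group in zip(current, groups):
            if group:
                updated.append(tuple(sum(coord) / len(group)
                                     for coord in zip(*group)))
            else:
                updated.append(center)
        history.append(updated)
        # Stop once no center has moved by more than tol.
        if all(math.dist(a, b) <= tol for a, b in zip(current, updated)):
            break
    return history
```

Passing the 12 grid points as centers gives the setup asked for here. The trajectory of center k is then [step[k] for step in history], which can be drawn as a line from the first to the last position.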

3. and 4. Implement a SOM algorithm yourself in a compact, simple style (Python, C, Java, whatever): iteratively fetch samples in random order, find the closest cluster center (the "winner"), move the winner s% closer to the current sample, and move the immediate neighbours of the winner t% closer. Experiment with s and t, and lower these rates as the algorithm proceeds. Goal: visualise the "trajectory" of the SOM cluster centers as they move during the process, as in task 2.
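The update rule described above can be sketched as follows. The task does not say which centers count as "immediate neighbours", so this sketch assumes the centers sit on a rectangular grid and that the neighbours are the up to four adjacent grid cells. It also takes s and t as fractions between 0 and 1 rather than percentages, and multiplies both by a decay factor after each pass over the data. All names and default values are illustrative.

```python
import random


def train_som(points, init, s=0.3, t=0.1, decay=0.9, epochs=10, seed=0):
    """Train a small SOM and record the node positions after every epoch.

    init maps a grid cell (row, col) to its starting position (x, y).
    Returns a list of snapshots, each a dict of cell -> (x, y).
    """
    rng = random.Random(seed)
    nodes = {cell: list(pos) for cell, pos in init.items()}
    history = [{cell: tuple(pos) for cell, pos in nodes.items()}]
    samples = list(points)

    def pull(pos, target, fraction):
        # Move pos the given fraction of the way towards target.
        pos[0] += fraction * (target[0] - pos[0])
        pos[1] += fraction * (target[1] - pos[1])

    for _ in range(epochs):
        rng.shuffle(samples)  # fetch samples in random order
        for x, y in samples:
            # Winner: the node closest to the current sample.
            winner = min(nodes, key=lambda cell: (nodes[cell][0] - x) ** 2
                                                 + (nodes[cell][1] - y) ** 2)
            pull(nodes[winner], (x, y), s)
            # Assumed neighbourhood: the adjacent cells in the grid.
            r, c = winner
            for cell in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if cell in nodes:
                    pull(nodes[cell], (x, y), t)
        history.append({cell: tuple(pos) for cell, pos in nodes.items()})
        # Lower both rates as training proceeds.
        s *= decay
        t *= decay
    return history
```

For the 12 starting points of task 2, one natural choice is to use each point's own coordinates as its grid cell, so that grid neighbours start out as spatial neighbours. Snapshots could also be taken after every sample instead of every epoch for a finer trajectory.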

5. Now consider density-based clustering with DBSCAN. Experiment with DBSCAN or some other density-based clustering algorithm on this data. Use different parameters and try to visualise the outcomes. Discuss the findings and briefly compare the strengths and weaknesses of DBSCAN versus the K-means and SOM algorithms.
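For experimenting with the two DBSCAN parameters, a library implementation is the usual choice, but the algorithm is short enough to write out. The sketch below is a plain, quadratic-time version intended only to make the roles of eps (neighbourhood radius) and min_pts (minimum number of points, including the point itself, for a core point) concrete. The names are illustrative, and the label -1 for noise is a convention assumed here.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        # Point i is a core point: start a new cluster and grow it.
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point too
    return labels
```

Running this for several (eps, min_pts) pairs and colouring the scatterplot by label shows how the number of clusters and the amount of noise change with the parameters.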

6. (Bonus 2p) Look at the "Complex Heatmaps" tools for R (Bioconductor package). Briefly describe the main functionality. Generate some "complex" data yourself and demonstrate how to use the "complex" features of this package. Present your answer as a brief tutorial with an example.

The proprietary copyrights of educational materials belong to the University of Tartu. The use of educational materials is permitted for the purposes and under the conditions provided for in the copyright law for the free use of a work. When using educational materials, the user is obligated to give credit to the author of the educational materials.
The use of educational materials for other purposes is allowed only with the prior written consent of the University of Tartu.