Institute of Computer Science
Data Mining (MTAT.03.183)

Data Mining 2012/13 fall

HW 05 (18.10)

1. Use the data from last week's task 1 and simulate the K-means algorithm. Use the initial centers (2,6), (2,8), and (5,8), and explain each step. Then use the same data to simulate K-medoids, starting from the points D, E, and H as cluster centers. The data is:

Point	X	Y
A	2	4
B	7	3
C	3	5
D	5	3
E	7	4
F	6	8
G	6	5
H	8	4
I	2	5
J	3	7
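
To check a hand simulation of task 1, the two procedures can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the version from the lecture: it uses Euclidean distance, breaks distance ties in favour of the lowest-numbered center, leaves the center of an empty cluster where it is, and for K-medoids uses the simple alternating variant (assign, then re-pick the medoid inside each cluster) rather than the PAM swap procedure. The function names `assign`, `kmeans`, and `kmedoids` are made up for this illustration.

```python
from math import dist

# Points from the homework table.
POINTS = {
    "A": (2, 4), "B": (7, 3), "C": (3, 5), "D": (5, 3), "E": (7, 4),
    "F": (6, 8), "G": (6, 5), "H": (8, 4), "I": (2, 5), "J": (3, 7),
}


def assign(points, centers):
    """Map each label to the index of its nearest center (ties go to the lowest index)."""
    return {
        label: min(range(len(centers)), key=lambda k: dist(p, centers[k]))
        for label, p in points.items()
    }


def kmeans(points, centers, max_iter=100):
    """Alternate the assignment step and the mean-update step until the centers stop moving."""
    centers = [tuple(map(float, c)) for c in centers]
    history = []  # one (labels, centers) pair per iteration, for step-by-step explanation
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for k, old in enumerate(centers):
            members = [points[name] for name, c in labels.items() if c == k]
            if members:
                new_centers.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(old)  # assumption: an empty cluster keeps its center
        history.append((labels, new_centers))
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers, history


def kmedoids(points, medoids, max_iter=100):
    """Alternating K-medoids: assign to the nearest medoid, then re-pick each medoid."""
    medoids = list(medoids)
    for _ in range(max_iter):
        labels = assign(points, [points[m] for m in medoids])
        new_medoids = []
        for k, old in enumerate(medoids):
            members = [name for name, c in labels.items() if c == k]
            # New medoid: the member with the smallest total distance to its cluster.
            new_medoids.append(min(
                members or [old],
                key=lambda m: sum(dist(points[m], points[o]) for o in members),
            ))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return labels, medoids


km_labels, km_centers, km_history = kmeans(POINTS, [(2, 6), (2, 8), (5, 8)])
for step, (labels, centers) in enumerate(km_history, start=1):
    print(step, labels, centers)

kmed_labels, kmed_medoids = kmedoids(POINTS, ["D", "E", "H"])
print(kmed_labels, kmed_medoids)
```

Printing `km_history` gives one line per iteration, which maps directly onto the step-by-step explanation the task asks for. Note that a different tie-breaking rule or a different distance (e.g. Manhattan) can change the intermediate steps.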

2. Install and run MLDemos and try out clustering with k-means. Identify situations in which k-means clearly does not produce the "true" clustering you would expect. Take screenshots and discuss why this happens.

3. Once you have identified why such unpleasant situations arise, can you propose a remedy? Propose some heuristics for overcoming such issues.

4. I have downloaded a small dataset from http://www.imf.org/external/pubs/ft/weo/2012/02/weodata/index.aspx: the attached file DM_2010_IMF.xls, which has 5 attributes per country. Cluster it and describe the issues you encounter while doing so.
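
One issue that often comes up with country-level statistics is that the attributes live on very different scales, so a Euclidean distance ends up dominated by the column with the largest values. Whether this applies to DM_2010_IMF.xls is an assumption, since its attributes are not listed here. Under that assumption, a minimal sketch of z-score standardisation could look as follows; the function name `zscore_columns` and the toy numbers are invented for illustration and are not taken from the IMF file.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Rescale every column to mean 0 and population standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    # A constant column has standard deviation 0; divide by 1 instead.
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


# Hypothetical toy rows (NOT from the IMF file): two columns on very different scales.
toy = [[1200.0, 2.1], [45000.0, 3.4], [300.0, 7.9], [15000.0, 1.0]]
scaled = zscore_columns(toy)
for row in scaled:
    print(row)
```

After this step each attribute contributes on a comparable scale to the distance. Missing values, if the file has any, would need separate handling before clustering.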

5. In the lecture we described the SOM clustering method and its principle. Implement a modification of the SOM algorithm that has only a 1-dimensional "grid", i.e. a chain of n grid elements (e.g. n = 30 or 100). Take each new data point, assign it to the most similar grid point, then update that grid point and a range of nearby grid points. Outline the exact algorithm in pseudocode.

Comment from TA: See e.g. http://en.wikipedia.org/wiki/Pseudocode if you don't know what pseudocode is.
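
The loop described in task 5 (find the most similar grid point, then update it and its neighbours on the chain) might be sketched as below. This is one possible reading rather than the lecture's exact algorithm: the linearly decaying learning rate and radius, the triangular neighbourhood weight, the initialisation from random data points, and all names and default values (`train_som_1d`, `lr0`, `radius0`, `epochs`, `seed`) are assumptions made for this illustration.

```python
import math
import random


def best_matching_unit(units, x):
    """Index of the grid element whose weight vector is closest to x."""
    return min(range(len(units)), key=lambda j: math.dist(units[j], x))


def train_som_1d(data, n_units=30, epochs=50, lr0=0.5, radius0=None, seed=0):
    """Train a 1-D SOM and return its n_units weight vectors in grid order."""
    rng = random.Random(seed)
    if radius0 is None:
        radius0 = n_units // 2
    # Assumption: initialise each grid element from a randomly chosen data point.
    units = [list(rng.choice(data)) for _ in range(n_units)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # one shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays to 0
            radius = round(radius0 * (1.0 - frac))   # neighbourhood shrinks to 0
            bmu = best_matching_unit(units, x)
            lo, hi = max(0, bmu - radius), min(n_units - 1, bmu + radius)
            for j in range(lo, hi + 1):
                # Triangular weight: 1 at the winner, smaller further along the chain.
                h = 1.0 - abs(j - bmu) / (radius + 1)
                units[j] = [w + lr * h * (xi - w) for w, xi in zip(units[j], x)]
            step += 1
    return units


# Tiny made-up data set with two well-separated groups.
demo = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (9.0, 10.0)]
som = train_som_1d(demo, n_units=10, epochs=20, seed=1)
print([best_matching_unit(som, p) for p in demo])
```

After training, the index returned by `best_matching_unit` can serve as the cluster label of a point, and neighbouring indices correspond to similar weight vectors along the chain.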

6. (Bonus 2p) Implement your 1-D SOM algorithm yourself and apply it, for example to the country statistics data set from task 4.

The proprietary copyrights of educational materials belong to the University of Tartu. The use of educational materials is permitted for the purposes and under the conditions provided for in the copyright law for the free use of a work. When using educational materials, the user is obligated to give credit to the author of the educational materials.
The use of educational materials for other purposes is allowed only with the prior written consent of the University of Tartu.