Data Mining (MTAT.03.183), autumn 2012/13

HW 05 (18.10)

1. Use the data from last week's task 1 and simulate the K-means algorithm with initial centers (2,6), (2,8), and (5,8). Explain each step. Then use the same data to simulate K-medoids, starting from points D, E, and H as the initial cluster centers. The data:

Point	X	Y
A	2	4
B	7	3
C	3	5
D	5	3
E	7	4
F	6	8
G	6	5
H	8	4
I	2	5
J	3	7
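
To check a hand simulation of the K-means part, a minimal Python sketch of the usual assign-and-update loop on the table above may help. The function name `kmeans`, the tie-breaking rule (the lowest-numbered center wins), and the handling of an empty cluster (its center stays where it is) are assumptions, not part of the assignment. They matter here: in the first step, J is equally far from (2,6) and (2,8), so a different convention gives different intermediate steps.

```python
# Data points from the table above.
points = {
    "A": (2, 4), "B": (7, 3), "C": (3, 5), "D": (5, 3), "E": (7, 4),
    "F": (6, 8), "G": (6, 5), "H": (8, 4), "I": (2, 5), "J": (3, 7),
}


def sq_dist(p, q):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans(points, centers, max_iter=100):
    """Alternate assignment and update steps until assignments stop changing."""
    centers = [(float(x), float(y)) for x, y in centers]
    assignment = {}
    for step in range(1, max_iter + 1):
        # Assignment step: min() returns the first minimum, so ties go to
        # the lowest-numbered center (an assumption of this sketch).
        new_assignment = {
            name: min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
            for name, p in points.items()
        }
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [points[n] for n, a in assignment.items() if a == k]
            if members:  # an empty cluster keeps its old center (assumption)
                centers[k] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
        print("step", step, "centers", centers)
    return centers, assignment


final_centers, final_assignment = kmeans(points, [(2, 6), (2, 8), (5, 8)])
print(final_assignment)
```

The printed centers after each step can be compared line by line with the hand calculation.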

2. Install and run MLDemos and try out clustering with k-means. Identify situations where k-means clearly does not produce the "true" clustering you would expect. Take screenshots and discuss why this happens.

3. Once you have identified why such situations arise, can you propose a remedy? Suggest some heuristics for overcoming these issues.

4. I have downloaded a small dataset from http://www.imf.org/external/pubs/ft/weo/2012/02/weodata/index.aspx: the attached file DM_2010_IMF.xls, which has 5 attributes per country. Cluster it and describe the issues you encounter while doing so.
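
One issue that often comes up with multi-attribute data is that attributes measured on very different scales dominate distance computations. Whether this applies to the IMF file is for you to check. As a sketch under that assumption, column-wise z-score standardization could look like the following. The function name `standardize` and the numbers are hypothetical and are not taken from DM_2010_IMF.xls.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Return column-wise z-scores for a list of equal-length numeric rows."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    # A constant column has zero spread; divide by 1.0 instead of 0.
    stds = [pstdev(c) or 1.0 for c in columns]
    return [
        [(v - m) / s for v, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical values on two very different scales (not from the IMF file).
rows = [[1.0, 20000.0], [2.0, 500.0], [3.0, 41000.0]]
z = standardize(rows)
print(z)
```

After this step every column has mean 0 and unit standard deviation, so no single attribute dominates a Euclidean distance just because of its units.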

5. During the lecture we described the SOM clustering method and its principle. Implement a modification of the SOM algorithm that has only a 1-dimensional "grid", e.g. one with 30, 100, or n grid elements. Take each new data point and assign it to the most similar grid point, then update that point and a nearby range of other points. Outline the exact algorithm in pseudocode.

Comment from TA: See e.g. http://en.wikipedia.org/wiki/Pseudocode if you don't know what pseudocode is.
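
For orientation, the assign-and-update idea described in task 5 can be sketched in Python roughly as follows. This is one possible reading, not the reference solution: the names `train_som_1d` and `best_matching_unit`, the Gaussian neighborhood weighting, the linear decay of learning rate and radius, and initialization from random data points are all assumptions of this sketch.

```python
import math
import random


def best_matching_unit(nodes, x):
    """Index of the grid node whose weight vector is closest to x."""
    return min(
        range(len(nodes)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(nodes[j], x)),
    )


def train_som_1d(data, n_nodes=30, epochs=50, lr0=0.5, seed=0):
    """Train a SOM whose nodes form a 1-D chain indexed 0..n_nodes-1."""
    rng = random.Random(seed)
    # Initialize each node with a randomly chosen data point (assumption).
    nodes = [[float(v) for v in rng.choice(data)] for _ in range(n_nodes)]
    radius0 = n_nodes / 2.0
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs          # linear decay (assumption)
        lr = lr0 * decay
        radius = max(1.0, radius0 * decay)
        order = list(data)
        rng.shuffle(order)
        for x in order:
            bmu = best_matching_unit(nodes, x)
            for j, w in enumerate(nodes):
                grid_dist = abs(j - bmu)      # distance along the 1-D grid
                if grid_dist > radius:
                    continue                  # outside the neighborhood
                # Gaussian weighting: nearer grid neighbors move more.
                h = math.exp(-grid_dist ** 2 / (2.0 * radius ** 2))
                for i in range(len(w)):
                    w[i] += lr * h * (x[i] - w[i])
    return nodes


# Example run on the ten points from task 1.
data = [(2, 4), (7, 3), (3, 5), (5, 3), (7, 4),
        (6, 8), (6, 5), (8, 4), (2, 5), (3, 7)]
nodes = train_som_1d(data, n_nodes=10, epochs=20)
clusters = [best_matching_unit(nodes, x) for x in data]
print(clusters)
```

Each data point ends up labeled with the index of its closest grid node; neighboring indices tend to hold similar points because nearby nodes are updated together.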

6. (Bonus, 2p) Implement your 1-D SOM algorithm yourself and apply it, for example to the country statistics data set above.
