Data Mining (MTAT.03.183), 2012/13 fall

HW 06 (25.10) Clustering III and Seriation

1. Simulate DBSCAN by hand (on paper) on the data from last week's task 1. Choose the parameters so that you get about 3 clusters.
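
To check a paper simulation, a small reference implementation can be useful. The following is a minimal sketch, not a prescribed solution: the points, `eps` and `min_pts` are made-up illustration values (not the data from last week's task), `dbscan` is a hypothetical helper name, and a point's neighbourhood is assumed to include the point itself.

```python
from math import dist  # Euclidean distance, Python 3.8+

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE        # may still become a border point later
            continue
        cluster += 1                 # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:   # j is also a core point: keep expanding
                queue.extend(nb)
    return labels

# Hypothetical toy data: three tight groups of four points and one outlier.
points = [(1, 1), (1, 2), (2, 1), (2, 2),
          (8, 8), (8, 9), (9, 8), (9, 9),
          (1, 8), (1, 9), (2, 8), (2, 9),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, -1]
```

With these values every grouped point is a core point and the outlier is labelled as noise. Changing `eps` or `min_pts` shows how the number of clusters reacts, which is the same tuning the task asks for on paper.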

2. Outline an algorithm for density-based clustering that can handle clusters of varying densities.

3. Look at some example binary matrices from here (tarball here). Do any of them follow the Pareto principle? If not, generate similar data and demonstrate how you would have discovered such an example.

Comment from TA: It is enough to consider the 80-20 rule here (the "narrow" Pareto principle).
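
One possible reading of the 80-20 rule for a binary matrix is that the densest 20% of rows (or columns) hold about 80% of all 1s. The sketch below checks that share. This interpretation, the helper name `top_share` and the example matrix are assumptions made for illustration; they are not taken from the course data.

```python
def top_share(matrix, fraction=0.2, by_rows=True):
    """Share of all 1s held by the densest `fraction` of rows (or columns)."""
    lines = matrix if by_rows else list(zip(*matrix))
    counts = sorted((sum(line) for line in lines), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    k = max(1, round(fraction * len(counts)))  # number of lines in the top group
    return sum(counts[:k]) / total

# Hypothetical 10x10 matrix: two dense rows, four rows with a single 1, four empty rows.
example = ([[1] * 8 + [0] * 2 for _ in range(2)]
           + [[1] + [0] * 9 for _ in range(4)]
           + [[0] * 10 for _ in range(4)])

print(top_share(example))                 # 0.8: the top 20% of rows hold 80% of the 1s
print(top_share(example, by_rows=False))  # 0.4: the columns are much less skewed
```

A matrix would count as following the narrow Pareto principle under this reading when the returned share is close to 0.8 for rows, columns, or both.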

4. Implement a goodness measure for the example data above that counts how well each value is surrounded by its own "kind". For both 0s and 1s, take all 8 neighbours into account. Calculate the scores for all of the matrices above.

Comment from TA: "By its own kind" means how 1s are surrounded by 1s and 0s surrounded by 0s.
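
The measure can be sketched as follows. This is one possible formulation, with several assumptions: the score is normalised to the fraction of (cell, neighbour) pairs that hold the same value, border cells simply have fewer than 8 neighbours (no wrap-around), and the name `neighbour_score` and both test matrices are made up for illustration.

```python
def neighbour_score(matrix):
    """Fraction of (cell, neighbour) pairs in which both cells hold the same value."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    same = total = 0
    for r in range(n_rows):
        for c in range(n_cols):
            # Visit the up-to-8 neighbours of cell (r, c).
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        total += 1
                        same += matrix[r][c] == matrix[rr][cc]
    return same / total if total else 1.0

# Hypothetical examples: a block-structured matrix and a checkerboard.
blocks = [[1, 1, 0, 0],
          [1, 1, 0, 0],
          [0, 0, 1, 1],
          [0, 0, 1, 1]]
checker = [[1, 0, 1, 0],
           [0, 1, 0, 1],
           [1, 0, 1, 0],
           [0, 1, 0, 1]]

print(neighbour_score(blocks), neighbour_score(checker))
```

The block matrix scores higher than the checkerboard, because in the checkerboard only the diagonal neighbours agree. A raw count instead of a fraction would work equally well for comparing orderings of the same matrix.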

5. Implement some sort of data reordering for the matrices above and try to maximise your score from task 4. For which datasets do you find "optimal results", "good results", or really "bad results"?

Comment from TA: You can decide for yourself where to put the borders between optimal vs good and good vs bad. When maximising, it is enough to find an "approximately maximal" result; you don't have to look at all permutations.
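
One simple way to get an approximate result without trying all permutations is a greedy nearest-neighbour ordering: place similar rows next to each other, then do the same for columns. The sketch below assumes Hamming distance as the similarity measure and a non-empty matrix; the names `hamming`, `greedy_order` and `reorder` are hypothetical. It does not optimise the task 4 score directly, so the result should be re-scored to see whether it actually improved.

```python
def hamming(a, b):
    """Number of positions in which two equal-length 0/1 vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def greedy_order(vectors):
    """Start at vector 0, then repeatedly append the closest unused vector."""
    if not vectors:
        return []
    order = [0]
    remaining = set(range(1, len(vectors)))
    while remaining:
        last = vectors[order[-1]]
        # Ties are broken by the smaller index to keep the result deterministic.
        nxt = min(remaining, key=lambda i: (hamming(last, vectors[i]), i))
        order.append(nxt)
        remaining.remove(nxt)
    return order

def reorder(matrix):
    """Reorder rows, then columns; return the new matrix and both permutations."""
    row_order = greedy_order(matrix)
    rows = [matrix[i] for i in row_order]
    col_order = greedy_order(list(zip(*rows)))
    return [[row[j] for j in col_order] for row in rows], row_order, col_order

# Hypothetical input: a checkerboard, which is a block matrix with shuffled rows and columns.
scrambled = [[1, 0, 1, 0],
             [0, 1, 0, 1],
             [1, 0, 1, 0],
             [0, 1, 0, 1]]
result, row_order, col_order = reorder(scrambled)
for row in result:
    print(row)
# [1, 1, 0, 0]
# [1, 1, 0, 0]
# [0, 0, 1, 1]
# [0, 0, 1, 1]
```

The greedy chain depends on the starting row and can get stuck in poor orderings on noisy data, so trying several starting points and keeping the best-scoring result is a reasonable refinement.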

6. (Bonus, 2p) Write a project proposal (one A4 page) for density-based clustering (e.g. developing new ideas, making test data, ...) or for matrix reordering/biclustering tasks. Motivate the problem and ask a relevant question that could be answered with a project.

Comment from TA: Try to be specific: state which type of data is going to be used (numeric, non-numeric, etc.) and how the methods apply to this type of data.

  • Institute of Computer Science
  • Faculty of Science and Technology
  • University of Tartu