Data Mining (MTAT.03.183), 2014/15 spring

HW 6 (due March 29th) Descriptive analysis and visualisation

Exploratory data mining: look at the data from different viewpoints. If the data set is too big, first take a random sample so that your code runs efficiently. Only once the code works on the sample should you apply it to the larger data.
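One way to draw such a sample without loading everything into memory is reservoir sampling. The sketch below is only an illustration under assumptions: the function name `reservoir_sample` and the toy rows are hypothetical and not part of the assignment.

```python
import random


def reservoir_sample(rows, k, seed=0):
    """Return up to k rows drawn uniformly at random from an iterable.

    The iterable is read once, so it can be a stream that does not fit in memory.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Keep the new row with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = row
    return sample


# Toy stand-in for the real data: 10,000 hypothetical rows.
rows = ({"id": i} for i in range(10_000))
subset = reservoir_sample(rows, 500)
print(len(subset))  # 500
```

Because the function accepts any iterable, `rows` could also be a `csv.DictReader` over the downloaded file.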

1. Study the National Child Measurement Programme - England, 2013-14 data set, available from: http://www.hscic.gov.uk/catalogue/PUB16070/nat-chil-meas-prog-eng-2013-2014-guid.zip

Focus on child height, weight, BMI and age. Calculate the average height, weight and BMI for every age in the data. Plot age on the x-axis against height, weight, or BMI on the y-axis, and overlay the average value for each age as a line.
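A minimal sketch of the per-age averaging, using only the standard library. The field names (`age`, `height`, `weight`) and the four toy records are assumptions for illustration; the real column names and units in the data set may differ. BMI is computed with the usual definition, weight in kilograms divided by height in metres squared.

```python
from collections import defaultdict
from statistics import mean


def bmi(height_cm, weight_kg):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / (height_cm / 100) ** 2


def average_by_age(records, field):
    """Map each age to the mean of `field` over the records with that age."""
    groups = defaultdict(list)
    for r in records:
        groups[r["age"]].append(r[field])
    return {age: mean(values) for age, values in sorted(groups.items())}


# Hypothetical toy records, not taken from the real data set.
records = [
    {"age": 5, "height": 110.0, "weight": 19.0},
    {"age": 5, "height": 114.0, "weight": 21.0},
    {"age": 10, "height": 138.0, "weight": 32.0},
    {"age": 10, "height": 142.0, "weight": 36.0},
]
for r in records:
    r["bmi"] = bmi(r["height"], r["weight"])

avg_height = average_by_age(records, "height")
avg_weight = average_by_age(records, "weight")
avg_bmi = average_by_age(records, "bmi")
print(avg_height)  # {5: 112.0, 10: 140.0}
```

For the plot, the individual `(age, value)` pairs would form the scatter and the sorted items of `avg_height` would give the points of the overlaid line.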

2. Normalise the height, weight and BMI by age: subtract from every value the average for its age group. Make the same plots again. Identify which children would be classified as overweight or underweight. (Optional: calculate the age group averages separately for boys and girls. How different are they?)
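The subtraction described in this task can be sketched as below. The grouping key is passed in as a function so the same code covers both grouping by age and the optional grouping by age and sex. All names (`center_by_group`, the `sex` field, the toy records) are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def center_by_group(records, field, key):
    """Subtract from each value of `field` the mean of its group.

    `key` maps a record to its group, e.g. its age or an (age, sex) pair.
    """
    groups = defaultdict(list)
    for r in records:
        groups[key(r)].append(r[field])
    averages = {g: mean(values) for g, values in groups.items()}
    return [r[field] - averages[key(r)] for r in records]


# Hypothetical toy records, not taken from the real data set.
records = [
    {"age": 5, "sex": "F", "height": 110.0},
    {"age": 5, "sex": "M", "height": 114.0},
    {"age": 10, "sex": "F", "height": 138.0},
    {"age": 10, "sex": "M", "height": 142.0},
]

by_age = center_by_group(records, "height", key=lambda r: r["age"])
by_age_sex = center_by_group(records, "height", key=lambda r: (r["age"], r["sex"]))
print(by_age)  # [-2.0, 2.0, -2.0, 2.0]
```

After centring, every age group has an average of zero, so the overlaid average line in the repeated plots becomes a horizontal line at 0.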

3. Try to plot the same data in different views, e.g. (height*weight vs height/weight) and (log(height*weight) vs log(height/weight)), or height*weight vs BMI. Can you interpret those graphs?
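The derived coordinates for these views can be computed as in the sketch below; the function name `views` and the sample values are illustrative assumptions. One identity that may help with interpretation: log(height*weight) = log(height) + log(weight) and log(height/weight) = log(height) - log(weight), so the log view is a rotated and rescaled version of the plot of log(height) against log(weight).

```python
import math


def views(height, weight):
    """Return the derived coordinates used in the alternative plots.

    Both inputs must be positive, otherwise the logarithms are undefined.
    """
    product = height * weight
    ratio = height / weight
    return {
        "product": product,
        "ratio": ratio,
        "log_product": math.log(product),
        "log_ratio": math.log(ratio),
    }


# Hypothetical child: 120 cm tall, 24 kg.
v = views(120.0, 24.0)
print(v["product"], v["ratio"])  # 2880.0 5.0

# The log coordinates are the sum and the difference of the individual logs.
assert math.isclose(v["log_product"], math.log(120.0) + math.log(24.0))
assert math.isclose(v["log_ratio"], math.log(120.0) - math.log(24.0))
```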

4. How would the graphs from task 3 change if you applied them to the age-normalised height and weight? (Instead of centring the average height at 0, normalise the height to 150 cm and the weight to the respective average weight.)
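Values centred at 0 can be negative, which makes products hard to read and logarithms undefined, so it helps to move each age group's average to a positive reference instead. The sketch below takes one possible reading of the task, an additive shift so that every age group's average lands on the reference value; a multiplicative rescaling (value / group average * reference) would be another reading. The name `recenter_by_age`, the toy records and the reference weight of 40 kg are assumptions, not values given in the assignment.

```python
from collections import defaultdict
from statistics import mean


def recenter_by_age(records, field, reference):
    """Shift each value so that its age group's mean equals `reference`."""
    groups = defaultdict(list)
    for r in records:
        groups[r["age"]].append(r[field])
    averages = {age: mean(values) for age, values in groups.items()}
    return [r[field] - averages[r["age"]] + reference for r in records]


# Hypothetical toy records, not taken from the real data set.
records = [
    {"age": 5, "height": 110.0, "weight": 19.0},
    {"age": 5, "height": 114.0, "weight": 21.0},
    {"age": 10, "height": 138.0, "weight": 32.0},
    {"age": 10, "height": 142.0, "weight": 36.0},
]

heights = recenter_by_age(records, "height", reference=150.0)
# 40.0 kg is a placeholder; the task asks for the "respective average weight".
weights = recenter_by_age(records, "weight", reference=40.0)
print(heights)  # [148.0, 152.0, 148.0, 152.0]
```

The recentred values stay positive for realistic data, so the product, ratio and log views from task 3 can be applied to them directly.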

5. Watch the video presentation by Tamara Munzner, Keynote on Visualization Principles, and read the slides: http://www.cs.ubc.ca/~tmm/talks/vizbi11/vizbi11.pdf Summarize the key take-home messages from her presentation.

6. (Bonus, 2p) Test some of the ideas from Tamara Munzner's presentation and attempt to visualise the iris data set from last week "better".
