HW5. Heap, Union-find (8th of Oct, 23.59)
EX1. Read: Kamp (CACM 2010 53:7) "You're doing it all wrong.". What is the main criticism in the article? What is your opinion about the article? Discuss how exactly should the Heap use implicit indexing (i => 2*i, 2*i+1; and parent==i/2 ) be changed according to Kamp's article.
EX2. Consider an interval (x, y). In this exercise we will interpret these intervals as geometrical points in 2D space, such that x would be on the x-axis and y on the y-axis (interval (1,2) would be represented as a point p = (1,2) on 2D plot). First of all we will fix one interval (5, 10) and use it as a reference interval. The rest of the points in this 2D space need to be coloured according to the type of relationship they have with the reference interval. For example interval (1, 3) would be non-overlapping from the left to the reference interval, interval (11, 14) non-overlapping from the right etc. How many types of relationships are there in this 2D space? Visualise these types by colouring points (making coloured regions) in the space accordingly. Include reference interval and also one example interval from each region.
EX3. Simulate inserting values 6, 4, 2, 9, 12, 20, 12, 13, 7, 5, 11 into a min-priority Binomial heap. Draw the final state. Then delete the min value 4 times in a row. Draw the final state.
EX4. Implement Union-Find with and without path compression. Simulate the efficiency of both algorithms by generating one million singletons and then performing merges between two random elements until only one set is left. Count the number of unsuccessful mergers (both items i and j were already in the same set) and the number of edge lookups needed to achieve the final situation where only one single set is left (for both UF with and without path compression). Repeat the same simulation for 10 times and discuss how stable are those two measured numbers are.
singletone: single isolated node with no edges
EX5. Use 4-D data points given below. Define a random hyperplane direction (hint: random 4-D unit vector). Find such a hyperplane that would split the data points at “median” - half on one side, half on another side. Label the below points accordingly and calculate the distance from the hyperplane. Provide formulae in your report.
X Y Z D 73.64 10.36 97.37 53.80 34.17 94.43 33.35 73.01 32.08 97.13 25.58 43.36 51.45 89.81 38.43 48.67 15.24 95.85 97.98 81.92 44.86 12.32 29.15 81.43 34.50 28.76 48.76 66.48 67.45 22.75 53.81 40.16 7.510 90.28 81.90 45.47 77.12 26.87 22.04 81.32 71.35 66.32 86.06 35.05 23.59 38.70 26.60 47.57 17.07 44.94 41.63 72.22 12.07 61.15 20.94 37.69 48.47 37.50 25.41 82.65 92.62 66.00 54.00 65.94 79.05 60.91 1.420 39.06 1.540 83.79 92.31 63.50 20.09 65.39 75.14 69.09
EX6. (bonus 2p) Oct-tree is a generalization of quad-tree from 2D into 3D. Now generalize it further into 4D and show how the data from the previous exercise would be distributed. Note that to identify a particular (largest bounding) cell a binary vector - 0/1 along each dimension is sufficient. Assign to each point the label of a bounding box with such 4-bit numbering. If cell has more than one value, split it further. For the first data point the outermost bounding box is 1101 and for the second point it is 0010. If some cell contains multiple values and needs to be split further, indicate that particular next-level split as well. How would you search for a new query point listing all points within a radius 10 from it?