HW6. Succinct trees (12th of Oct, 23.59)
EX1. In the lectures, we tried to establish what is the information-theoretic bound in the smallest number of bits needed to represent binary trees. Slides claimed that there are up to 22n distinct trees of n nodes, but we also showed that Catalan numbers count the number of different trees. Compare the growth rate of Catalan numbers vs 2^{2n}. Plot the numbers (use logarithmic axes to fit larger values on plots). Are they in Theta() relationship (exactly the same growth rate)?
EX2. A binomial tree (not a binary heap!) of degree 4 (or any degree) has a very specific structure. If you would need an implicit structure to represent it, there is exactly one such tree. Draw a binomial tree of degree 4 and number the nodes in the following ways (for this one, it might be a good idea to start numbering from 0 / 0000):
- Post-order
- Pre-order
- Level-order (label nodes level-by-level)
Hint: Post-order is on lecture slides (Heaps, Page 16, Figure 20.4)
Represent the tree as arrays, where the array index represents the number you gave to the node. Now take post-order node nr 11 (binary 1011, check heaps slides, same as before) as a reference node. Try to give indexing formulas to access the reference points parents and children for each representation (e.g. for normal heaps and its array structure the accessing was parent=i//2, child=i*2 (+1)). You can also try to use the number of combinations in your formulas e.g. 4 choose 2 (C(4,2)) or look at bit representations of your numbers (e.g. 1111 = 15) to see if you see some patterns.
For the following exercises, make sure to also shortly describe, what is the procedure to create the succinct representation and how to find the parent and children nodes.
EX3. Represent the same tree in a Level-order unary degree sequence (check slides). Show how you perform operations of finding parent, i-th child. In order to do that, you might want to describe rank and select operations first.
EX4.
Represent the same tree, this time using Parenthesis representation. Show how to perform operations of finding parent, first child, next sibling.
Represent the same tree, this time using Depth-first unary degree sequence (DFUDS) representation. Show how to find parent, first child, next sibling.
Argue, which presentation feels the most natural for you.
EX5. Describe two further operations - subtree size and the lowest common ancestor of two nodes using one of the above representations (Parenthesis, DFUDS). Why do you need such operations?
Briefly describe in your own words, why do we care about succinct data structures at all? What is the potential trade-off in using less memory? i.e. what do we gain and what do we lose?
EX6. (bonus 2p) Consider the van Emde Boas tree (lecture about Heaps) implementation for 64-bit integers. How large would be only the data structure in bits (only structure, without data) if you insert 1 million values distributed uniformly randomly across 64-bit integers? In comparison, count the size of the similarly organized tree data structure if implemented explicitly as a tree with pointers. What if you would use a succinct instead?