HW12. Suffix trees/arrays, BWT transform, Feedback (December 6, 23.59)
EX1(2p) Construct a Trie from a Collection of Patterns. You are give a collection of patterns {bananna, pan, and, nab antenna, bandana, ananas, nana}. Then implement TrieMatching with the text panamabananas$. You can use the following pseudocode:
PrefixTrieMatching(Text, Trie) symbol ← first letter of Text v ← root of Trie while forever: if v is a leaf in Trie: return the pattern spelled by the path from the root to 𝑣 else if there is an edge (v, w) in Trie labeled by symbol: symbol ← next letter of Text v ← w else: output “no matches found” return
EX2(2p). You are given word: panamabananas$ . What are all the suffixes for the given word?
Build a Suffix Trie and a Tree for the same word. Demonstrate searching for patterns "ana" and "bananas" using these data structures. Make sure to describe the algorithms you used to perform these tasks.
What is a Suffix Trie/Tree be used for? What is the task of full-text indexing? What are the differences between using Tree instead of Trie?
EX3(1p). Compare the memory complexity between Suffix Trie and Suffix Tree for the above task.
EX4(1p). Take the same word and instead build a Suffix Array. Describe the algorithm.
How could you use this suffix array to perform some operations? Where should we use Suffix Arrays instead of Suffix trees?
EX5(1p). Perform Burrows-Wheeler Transform on the following string: panamabananas$
Make sure to draw the matrix.
Demonstrate how you can reconstruct the original string from BWT.
Where and why is BWT used?
EX6(1p) Show on pen and paper how to search for the pattern ana in the panamabananas$ using Burrows-Wheeler Transform.
EX7. (Bonus 1p): Provide feedback for the course and homework topics so far
- which topics were most useful?
- what needs to be covered better in the course?
- are there some topics that would need more practical implementation assignments?
- are there some topics that got too much attention (e.g. too boring or otherwise already well known)