HW12. Suffix trees/arrays, Peer review, Feedback, TSP competition (4th December, 23.59)
Just state that you've completed peer reviews for EX1 and EX2. No need for anything extra.
EX1 (1p) . Peer review the first article assigned to you.
EX2 (1p) . Peer review the second article assigned to you.
EX3. (2p) You are given word: sinsinati.
- What are all the suffixes for the given word? Build a Suffix Trie and a Tree for the same word. Demonstrate searching for pattern "sina" using these data structures. Make sure to describe the algorithms you used to perform these tasks. What is a Suffix Trie/Tree be used for? What is the task of full-text indexing? What are the differences between using Tree instead of Trie?
- Take the same word and instead build a Suffix Array. Describe the algorithm. How could you use this suffix array to perform some operations? Where should we use Suffix Arrays instead of Suffix trees?
EX4. (1p) Substring matching. Given two strings S1 and S2, and a positive integer k, find the number of substrings of S1 of length at least k that occur in S2. Develop and analyze an algorithm to solve this problem in O(|S1| + |S2| + sort(Σ)) time, where Σ is the alphabet.
EX5. (Bonus 2p + 2p)
- Perform Burrows-Wheeler Transform on the following string: CGGTCGCT$. Make sure to draw the matrix. Demonstrate how you can reconstruct the original string from BWT. Where and why is BWT used?
This video might help you get started:
- Given the string T = acracca$ construct a suffix tree using the following algorithm:
Simple-Suffix-Tree-Algorithm(T) Create the root node, with empty string for i ← 1 to n do Traverse current tree from the root Match symbols in the edge label one-by-one with symbols in the current suffix, Ti if a mismatch occurs then Split the edge at the position of mismatch to create a new node, if need be Insert suffix Ti into the suffix tree at the position of mismatch end if end for
EX6. (Bonus Upto 3p): Your task is to optimize 1024-city case as well as possible. Data is here: 1024-city-case.
An additional rule is that when you are exporting the image using web tool, the label in "Path legal:False", must be True! This confirms that every city has been visited exactly once.
Report the best solutions (state-of-the-art). In the post describe your approach, report the calculation time, obtained path length and the visualization of the resulting path. The ones who can beat past best solutions will get points (even if someone will outcompete your solution next). To get points to make sure to post new best solutions. NB! For the solution to beat previous solutions, it should have either a shorter resulting path or if lengths are comparable the new solution should be considerably faster.
Also, very good descriptions of the approach and convergence rates - may get awarded with points upon TA decision.
EX7. (1p) Provide feedback for the course and homework topics so far
- which topics were most useful?
- what needs to be covered better in the course?
- are there some topics that would need more practical implementation assignments?
- are there some topics that got too much attention (e.g. too boring or otherwise already well known)