HW12. Suffix trees/arrays, Peer review, Feedback, TSP competition (5th December, 23.59)
Just state that you've completed peer reviews for EX1 and EX2. No need for anything extra.
EX1. Peer review the first article assigned to you.
EX2. Peer review the second article assigned to you.
EX3 (Bonus 2p). You are given word: sinsinati.
- What are all the suffixes for the given word? Build a Suffix Trie and a Tree for the same word. Demonstrate searching for pattern "sina" using these data structures. Make sure to describe the algorithms you used to perform these tasks. What is a Suffix Trie/Tree be used for? What is the task of full-text indexing? What are the differences between using Tree instead of Trie?
- Take the same word and instead build a Suffix Array. Describe the algorithm. How could you use this suffix array to perform some operations? Where should we use Suffix Arrays instead of Suffix trees?
EX4 (Bonus 2p). Substring matching. Given two strings S1 and S2, and a positive integer k, find the number of substrings of S1 of length at least k that occur in S2. Develop and analyze an algorithm to solve this problem in O(|S1| + |S2| + sort(Σ)) time, where Σ is the alphabet.
EX5. Perform Burrows-Wheeler Transform on the following string: CGGTCGCT$. Make sure to draw the matrix. Demonstrate how you can reconstruct the original string from BWT. Where and why is BWT used? This video might help you get started:
EX6. (Bonus Upto 3p): As was promised, the last task is from optimization homework, now we have small competition. Your task is to optimize 1024-city case as well as possible. Data is here: 1024-city-case.
An additional rule is that when you are exporting the image using web tool, the label in "Path legal:False", must be True! This confirms that every city has been visited exactly once.
Report the best solutions (state-of-the-art) to the Piazza in a public thread, to initialize the competition. In the post describe your approach, report the calculation time, obtained path length and the visualization of the resulting path. The ones who can beat past best solutions will get points (even if someone will outcompete your solution next). To get points to make sure to post new best solutions to corresponding Piazza thread. NB! For the solution to beat previous solutions, it should have either a shorter resulting path or if lengths are comparable the new solution should be considerably faster.
Also, very good descriptions of the approach and convergence rates - may get awarded with points upon TA decision.
EX7. Provide feedback for the course and homework topics so far
- which topics were most useful?
- what needs to be covered better in the course?
- are there some topics that would need more practical implementation assignments?
- are there some topics that got too much attention (e.g. too boring or otherwise already well known)