HW11. Text algorithms, NFA, and projects (22nd of Nov, 23.59)
EX1. Peer review the articles assigned to you!
The article peer reviews are assigned. You have to review the ones which have your pseudonym (same as in the gradesheet). You can see links when you are logged in to courses.cs.ut.ee!
Try to give constructive feedback which the person can make use of to make their final submission.
EX2: In the Boyer-Moore Algorithm for pattern matching there are two rules: (i) Bad character rule is defined as Upon mismatch, skip alignments until (a) mismatch becomes a match, or (b) pattern {$ p $} moves past mismatched character (ii) Good suffix rule: if {$ t $} = substring matched by inner loop; skip until (a) there are no mismatches between pattern {$ p $} and {$ t $} or (b) {$ p $} moves past {$ t $}.
- For a given pattern {$ p $} = ‘TCAA’ and text {$ t $} = ‘GCTAGCTC’ implement the bad character rule.
- For a given pattern {$ p $} = ‘ACTA’ and text {$ t $} = ‘GCTAGCTC’ implement the good suffix rule
- How many alignments does Boyer-Moore try during the matching?
You can use the following code snippet:
https://colab.research.google.com/drive/1yAbpG81j6aihXWVUxH2AyPkEjo-QruQp#scrollTo=XIsQJVyG1FUs
EX3: Construct an Aho-Corasick search automaton. Pass through given Text= ”fearworkamournaword” with that search automaton with the following keys= {word, amour, armour, our, mourn}. Draw the states in an automaton on paper.
EX4: Given the following
(ab|(aa|b)(ba)*(bb|a))*
Construct an NFA (Nondeterministic-Finite Automaton) using the Thompson construction algorithm.
Minimize the constructed NFA.
Provide example patterns that are accepted by this automaton and some that are not.
EX5: The course ends with the project to be completed usually by a small team of 2-3 people. Your task is to come up with one project proposal all by yourself (this is an individual task, you will have time to share) - taking ideas from project ideas file, from previous homework, or your imagination + google. E.g. you may attempt to extend some of the past exercises. Note - this is a proposal that you may use to attract other students to join, and this is at the same time a proposal that you do not need to start executing necessarily. So. it's more a planning exercise, not an execution exercise.
Your project proposal should have:
- Title
- 2-5 sentences of a short description
- Briefly described the motivation and the main challenge of this project
- Division of tasks, the estimated number of work hours per task, and deadline (aim at our poster session!)
- Allocation of the 2-3 people to tasks and hours.
- Recommended: create a GANTT chart to cover the previous two points.
- Description of the envisioned end results that would go to the project report/poster.
We will discuss these in the practice session.
EX6. (Bonus 2p): This task is from optimization homework, now we have small competition. Your task is to optimize the 1024-city case as well as possible. Data is here: 1024-city-case.
An additional rule is that when you are exporting the image using web tool, the label in "Path legal: False", must be True! This confirms that every city has been visited exactly once.
Report the best solutions (state-of-the-art) to Slack to initialize the competition. In the post describe your approach, report the calculation time, obtained path length, and the visualization of the resulting path. The ones who can beat past best solutions will get points (even if someone will outcompete your solution next). To get points to make sure to post new best solutions to slack. NB! For the solution to beat previous solutions, it should have either a shorter resulting path or if lengths are comparable the new solution should be considerably faster (less calculation time).
Also, very good descriptions of the approach and convergence rates - may get awarded with points upon TA decision.