HW11. DE, Edit distance, DTW (25th of Nov, 23.59)
EX1 (1p): Function optimisation. In the file there are four functions with visualizations and a call to the optimization code. Your task is to create your own minimization function using any of the preferred methods - Simulated Annealing, Genetic Algorithm, Differential Evolution, etc.
https://colab.research.google.com/drive/1y6HS2TrwDcZXwZrQYwsOGZhXNE3MJkGj#scrollTo=q-3i1aQwtK7Y
Ideally, use the same optimizer for all four example cases.
Report the results - make a small green dot for where you found the optimal solution; and provide a small table comparing results of python minimzer. Add that to report.
In case of randomly generated functions, compare, how good your result is compared to scipy.optimize
EX2 (1p): Differential evolution. There is a random data generator based on a polynomial function. Also, there are added outliers/noise. See code:
https://colab.research.google.com/drive/1-6GoEiLWMoaOTw3w8ESVxQ_vAiEog08o
Your task is to create a differential evolution algorithm that tries to approximate the 5th degree polynomial to the data points (including outliers).
Use both the mean square error (dark green) and median square error (red) in your DE algorithm as the score to optimize.
EX3 (1p): Edit distance and modifications. It is easy to misspell words. Here is an example of some common medicines and potential spelling errors.
https://colab.research.google.com/drive/1VABPg76CsxTjH99rlUTbuIsdMqesqpBq#scrollTo=zGR_UFpcOHQD
Play with the provided edit distance code. Modify the costs of operations. Find cases where a single character was changed and count over all data, what was changed to what most often?
Create an edit operation of the transposition of two characters. Can you detect from the provided data, if any misspelling simply reversed the characters, e.g. as in charatcers.
(With normal edit distance there should be at least two edit operations first).
EX4 (1p): Dynamic Time Warping is similar to edit distance. Implement an explicit DTW function for the time series generated in here:
https://colab.research.google.com/drive/1jMEYw7qzdlUqq4Jq-CryfcPw6RWs2u92#scrollTo=Spwed2VbZ4zw
Group the generated time series so that each group has the same number of “bumps”.
E.g. by finding similar ones and eliminating them from the pool (if you know clustering, you can also cluster). Or by creating a template of specific nr of bumps in the similar manner, and fetching them by the provided template.
In the report provide an example of multiple time series of similar nr of “bumps” (3-4 examples on a plot).
Secondly, provide the DTW table. If possible, import the numeric table into Excel or Sheets and provide a compact heat map representation of your DTW calculated table.
EX5 (1p): Continue from Task 4. When you have a time series of variable lengths but similar shapes, “warp” the time according to the optimal distance calculation. Visualize similar shapes but now stretched to the same “timescale” or length of time, by first the original series, and secondly stretched to match the longer series scale.
EX6. (Bonus 3P): Seam Carving ´´Seam Carving for Content-Aware Image Resizing´´ is a novel method developed by Shai Avidan and Ariel Shamir describe a novel method of resizing images. Please refer starting to the YouTube video:
https://perso.crans.org/frenoy/matlab2012/seamcarving.pdf
You are given an image, and your task is to calculate the best vertical seam to remove. A vertical seam is a connected path of pixels, one pixel in each row. We call two pixels connected if they are vertically or diagonally adjacent. The best vertical seam is the one that minimizes the total ´´energy´´ of pixels in the seam.
Hint:
- Subproblems: For each pixel(i, j), what is the lower-energy seam that starts at the top row of the image, but ends at (i, j)?
- Relation: Let dp[i, j] be the solution to subproblem(i,j).Then dp[i,j ]=min(dp[i, j - 1], dp[i-1, j-1],dp[i+1, j-1]) + energy(i, j)
- Analysis: Solving each subproblem takes O(1) time: there are three smaller subproblems to look up, and one call to energy(), which all take O(1) time. There is one subproblem for each pixel, so the running time is θ(A), where A is the number of pixels, i.e., the area of the image.
Download the zipped data and unpack it \\ https://drive.google.com/file/d/1CM7c042VHT55AIOvooWQhLMEqriiJ1Eq/view?usp=sharing
In resizeable_image.py, write a function best_seam(self) that returns a list of coordinates corresponding to the cheapest vertical seam to remove, e.g., [(5, 0), (5, 1), (4, 2), (5, 3),(6, 4)]. You should implement the dynamic program described above in a bottom-up manner. The class ResizeableImage inherits from ImageMatrix. You should use the following components of ImageMatrix in your dynamic program:
- self.energy(i,j) returns the energy of a pixel.This take sO(1) time,but the constant factor is sizeable. If you call it more than once, you might want to cache the results.
- self.widthand self.height are the width and height of the image, respectively.
Test your code using test_resizable_image.py, and submit ResizeableImage.py to the report.