HW9. Graphs continued (November 17, 23:59)
T1, Study the notebook on Colab. Describe the produced graphs and the provided methods.
https://colab.research.google.com/drive/11C4cUazro4NR2iO7tdZofFbREPBzOu4u#scrollTo=RVbeuJgXSUzm
Describe briefly, what the code does.
Create and describe the SCC component graphs and visualise them. Focus on largest connected component.
Since SCC component graph does not contain cycles, run the topological sorting on the component graph.
Choose start and target nodes for the Max-Flow from the first and last components according to the SCC component graph after toplogical sorting. Run the Max-Flow on original graph, but with your new source and target nodes. State how similar or different the flows are compared to the one provided in the original code.
T2. Here the task is to come up with “meaningful” two example graphs that you would use in the next tasks.
Make one manually, and the other through a random generation approach that you can choose based on the above generator by trying out different parameters, or write yourself.
Create (manually) a graph of size of ca 15-20 nodes, that you fully understand and can draw, including the weights (capacities). i.e. the one that helps you to validate and debug methods and code.
A second one is generated: ca 40-60 nodes, ca 4-5x more edges, and edge weights with uneven distribution (large, mid, small values).
Fix the two graphs for later use. But read first the Task 3 and 4 descriptions, to make best choices here.
Visualise your own original and the respective SCC component graph, like from Task 1. Attempt a reasonable-size SCC graph, hopefully ca 6-20 components/nodes. Modify your original graph to reach a case where you have a satisfactory original and SCC graph.
With your own example two graphs, calculate the Max Flow on your graph; Choose appropriate source and target nodes that would help to illustrate a meaningful max-flow on the graph. Visualise that.
T3. Here you will attempt to modify the original graph into the smaller SCC graph for calculating the comparable Max-Flow.
First, simply sum the original capacities for edges in between SCC components. Choose appropriate source and target nodes in original and the same for SCC graph. Run Max Flow on that SCC. Compare the result to the original Max-Flow. Identify by a counter-example why the edge capacities can not be simply a sum of all edge capacities from one SCC to another. Provide some small illustrative counterexamples with explanation why the “trivial” sum approach would not always work.
T4. Now we have the directed graphs and weights (capacities). Run a randomised walk algorithm on your graphs.
Scale (normalise) your capacities so that they can be treated as the link probabilities that add up to 1. Calculate the frequency/probability to be in each state (node) if you run the random walk long enough. Add a “re-incarnation” probability from the “dead end” node that has no outgoing edges, such that each node is equally probable. Secondly, play with a parameter that from any node you can re-appear in any other node randomly (with equal probability), Vary that probability at 0%, 5%, 10%, and 20% of times, instead of following an actual link. Identify the “most probable nodes” for such random walkers. List your top-5 nodes and identify them on your visualisation.
Assess whether the SCC components and "most frequent nodes" tend to be related. E.g. if most popular nodes are within the same components?
T5. Random walk calculated through matrix operations.
Random walk from Task 4 can be implemented in truly a random walk and coin tossing at each node, simply counting nr of times of visiting each node in your graph. But it can be also simulated by matrix representation and multiplications, as the Page Rank procedure. Simulate the same results from Task 4 (random walk) by matrix representation and matrix operations.
Report the similarities and differences of the end result.
T6. (bonus, 2p) Follow up from the task 3.
Provide a better scheme to calculate edge capacities for the SCC graph from the original graph, so that the result on the SCC graph would be more similar or same as the original Max-Flow. Hopefully you can demonstrate that you achieved a more correct Max-Flow solution on a smaller SCC graph as compared to the “true max flow” on the larger original graph. Remember, you can also keep generating the graphs randomly and running the same SCC procedures, as in the Task 1 random generator, toi understand how big are discrepancies and how well you can avoid them.