Concurrent Programming Languages
Lab 2: Concurrent algorithms
Exercise 1: Parallel summation
We want to find the sum of all numbers in an array. In a serial fashion, this can be done by iterating over the array and adding the numbers to an accumulator.
Questions
- How many operations are needed to form the sum in the serial algorithm? What is the overall time complexity of the algorithm?
- Now think about how this can be done in a parallel fashion. Try to use Foster's method if possible and useful. Is there something you need to know about the hardware?
- In the parallel algorithms, how many operations are needed? What is the overall time complexity of the algorithm?
- Does it make a difference if we have a shared memory or a distributed memory machine?
Exercise 2: Merge sort
The merge sort is a sorting algorithm based on the divide and conquer paradigm. The idea is to divide the input array into halves and repeat this until lists are of length 1 (divide step). Pairs of adjacent elements are then merged to get sorted lists until the complete list is sorted (conquer step). An illustration of this is given here
(source: https://en.wikipedia.org/wiki/Merge_sort)
Note that for the merge step, we need to run through the two lists to merge only once. So on each level of the merge, there are n elements traversed. Since the depth of the list is log n, the overall complexity is O(n log n).
In this lab, we are looking at parallel approaches to the merge sort.
A sequential algorithm could look like this:
- Data is saved in an array, l and r are the left and right limits of the segment we deal with
- if r > l,
- Find the mid-point and divide the array into two halves, mid = l + (r-l) / 2.
- Mergesort first half, mergeSort(arr, l, mid)
- Mergesort second half, mergeSor(arr, mid+1, r)
- Merge the two sorted halves, merge(arr, l, mid, r)
Note the algorithm is external, i. e. it is not in place, so we need arrays in addition to the original one.
Questions
- Assume you have shared memory and an infinite number of processors/cores, what could be parallelized? Look at the image to see the dependencies of the operations. Try to use Foster's method if possible and useful.
- Can you say anything about the time complexity of the parallel algorithm?
- If there is a limit to the number of processors/cores, what would be a sensible way be to parallelize the task? You could also use other sorting algorithms, if that is useful.
- Now consider distributed memory - how can you solve this if every processor can only access its own memory?
- Self study: There is a very nice discussion of distributed sorting as a google interview question at https://quanticdev.com/algorithms/distributed-computing/distributed-sorting/. Study this page and try to relate it our discussion as much as possible!