MergeSort in F#
source link: https://www.cogitoergofun.io/merge-sort-in-fsharp/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
MergeSort in F#
Tim Roughgarden says that MergeSort is a canonical example of a divide-and-conquer algorithm: you want to sort an array of numbers? Split it into two smaller arrays, sort each one, and then merge the partial results. This graph from Algorithms Illuminated Part 1 illustrates the process:
The high-level pseudo-code for this process is:
As usual in this series, I totally recommend you to check section 1.4 of the book for a thorough explanation of the algorithm (or watch this video). Under the adequate conditions, divide-and-conquer algorithms require less operations than algorithms that process the whole in one sweep.
Let's see the implementation of the previous pseudo-code in F#:
let rec mergeSort a =
if Array.length a <= 1 then
a
else
let halfLength = Array.length a / 2
let c = mergeSort a[.. halfLength - 1]
let d = mergeSort a[halfLength ..]
merge c d
You've got to love how naturally the pseudo-code translates into F#, also note that we aren't really using variables: once assigned a value, halfLength
, c
, and d
don't change any more. As with many divide-and-conquer algorithms, the trick is in the merge phase, when you combine the partial solutions. Tim proposes an imperative way of doing it:
We have indexes for the partial arrays C
and D
(i
and j
), and for the merged array B
(k
). For each new item in B
we take either from C
and increment i
or from D
and increment j
. The pseudo-code doesn't consider the possibility of C
or D
being exhausted so the imperative implementation in F# is a little bit more complicated:
let merge c d =
let cLength = Array.length c
let dLength = Array.length d
let bLength = cLength + dLength
let b = Array.zeroCreate bLength
let mutable i = 0
let mutable j = 0
for k in 0 .. bLength - 1 do
if i < cLength && (j >= dLength || c[i] <= d[j]) then
b[k] <- c[i]
i <- i + 1
else
b[k] <- d[j]
j <- j + 1
b
That weird i < cLength && (j >= dLength || c[i] <= d[j])
says "if there are still items in C
and either there are no more items in D
or there is an item in D
but it is bigger than the current item in C
" then take the C
item. Otherwise, that is, if C
is exhausted or D
's current item is smaller than C
's current item, take the D
item. That was imperative F# code and it works, but how about a functional alternative:
let merge c d =
let cLength = Array.length c
let dLength = Array.length d
let bLength = cLength + dLength
let b = Array.zeroCreate bLength
let rec mergeItem i j k =
if k >= bLength then
b
elif i < cLength && (j >= dLength || c[i] <= d[j]) then
b[k] <- c[i]
mergeItem (i + 1) j (k + 1)
else
b[k] <- d[j]
mergeItem i (j + 1) (k + 1)
mergeItem 0 0 0
We see here a frequent functional pattern:
- Define a local function that processes one element,
mergeItem
in our example - The function calls itself with arguments adequately updated to process the next element, in this case the indexes
i
,j
, andk
- There is a clear finishing state, in this case
k >= bLength
, i.e., thatb
has been filled with the merged values - Finally, start the process by making the initial call to that local recursive function, in this case
mergeItem 0 0 0
Note also that mergeItem
takes advantage of its closure, by this we mean that from inside the function we access values like cLength
and dLength
(and even update the b
array) in a fairly natural way. There are more sophisticated scenarios for closures, but this one is common and useful.
If you are wondering about the performance of this pattern against the familiar imperative loop, my informal benchmarks show that, at least for MergeSort in F#, actually the functional recursive code is a little faster. Amazing, isn't it?
You can find the sample source code here.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK