MergeSort in F#

Tim Roughgarden says that MergeSort is a canonical example of a divide-and-conquer algorithm: you want to sort an array of numbers? Split it into two smaller arrays, sort each one, and then merge the partial results. This graph from Algorithms Illuminated Part 1 illustrates the process:

The high-level pseudo-code for this process is:

As usual in this series, I totally recommend you to check section 1.4 of the book for a thorough explanation of the algorithm (or watch this video). Under the adequate conditions, divide-and-conquer algorithms require less operations than algorithms that process the whole in one sweep.

Let's see the implementation of the previous pseudo-code in F#:

let rec mergeSort a =
    if Array.length a <= 1 then
        a
    else
        let halfLength = Array.length a / 2
        let c = mergeSort a[.. halfLength - 1]
        let d = mergeSort a[halfLength ..]
        merge c d

You've got to love how naturally the pseudo-code translates into F#, also note that we aren't really using variables: once assigned a value, halfLength, c, and d don't change any more. As with many divide-and-conquer algorithms, the trick is in the merge phase, when you combine the partial solutions. Tim proposes an imperative way of doing it:

We have indexes for the partial arrays C and D (i and j), and for the merged array B (k). For each new item in B we take either from C and increment i or from D and increment j. The pseudo-code doesn't consider the possibility of C or D being exhausted so the imperative implementation in F# is a little bit more complicated:

let merge c d =
    let cLength = Array.length c
    let dLength = Array.length d
    let bLength = cLength + dLength

    let b = Array.zeroCreate bLength
    let mutable i = 0
    let mutable j = 0
    for k in 0 .. bLength - 1 do
        if i < cLength && (j >= dLength || c[i] <= d[j]) then
            b[k] <- c[i]
            i <- i + 1
        else
            b[k] <- d[j]
            j <- j + 1
    b

That weird i < cLength && (j >= dLength || c[i] <= d[j]) says "if there are still items in C and either there are no more items in D or there is an item in D but it is bigger than the current item in C" then take the C item. Otherwise, that is, if Cis exhausted or D's current item is smaller than C's current item, take the Ditem. That was imperative F# code and it works, but how about a functional alternative:

let merge c d =
    let cLength = Array.length c
    let dLength = Array.length d
    let bLength = cLength + dLength
    let b = Array.zeroCreate bLength

    let rec mergeItem i j k =
        if k >= bLength then
            b
        elif i < cLength && (j >= dLength || c[i] <= d[j]) then
            b[k] <- c[i]
            mergeItem (i + 1) j (k + 1)
        else
            b[k] <- d[j]
            mergeItem i (j + 1) (k + 1)

    mergeItem 0 0 0

We see here a frequent functional pattern:

Define a local function that processes one element, mergeItem in our example
The function calls itself with arguments adequately updated to process the next element, in this case the indexes i, j, and k
There is a clear finishing state, in this case k >= bLength, i.e., that b has been filled with the merged values
Finally, start the process by making the initial call to that local recursive function, in this case mergeItem 0 0 0

Note also that mergeItem takes advantage of its closure, by this we mean that from inside the function we access values like cLength and dLength (and even update the b array) in a fairly natural way. There are more sophisticated scenarios for closures, but this one is common and useful.

If you are wondering about the performance of this pattern against the familiar imperative loop, my informal benchmarks show that, at least for MergeSort in F#, actually the functional recursive code is a little faster. Amazing, isn't it?

You can find the sample source code here.

MergeSort in F#

MergeSort in F#

Recommend

GORM-GEN初识与入门,和SQL注入说拜拜

Intel 第 12 代 Core 行動處理器、non-K 桌上型處理器聯手登場

NuGet Gallery | Funtom.Linq 0.0.5

Understanding the ASP.NET Core Boilerplate in Giraffe

Go 实战 | 基于有向无环图的并发执行流的实现

Why your Typescript compiles when you thought you were safe

Thoth.Json

Progressively Adding Fable to JS Projects

Go 语言的 :=的具体使用与区别

Goland 使用教程 01：高效的处理 json 文本

About Joyk