Testing for breaking changes

I wrote last summer about using the excellent FsCheck tool to give us extra confidence when refactoring, by ensuring that external behaviour remains consistent. Well, it turns out that in the move to .NET Core, a number of unexpected breaking changes were accidentally introduced in LINQ. So, I spent some time using FsCheck to "prove" the breaking changes, but then never wrote up my findings. So, here they are!

Breaking Changes in LINQ

It turns out that whilst LINQ is essentially a set of features baked into C# whose roots are firmly in functional programming, this doesn't prevent people using it in ways that perhaps weren't expected. In particular, .NET doesn't guarantee purity of functions (this is true of both F# and C#), which means that it's perfectly possible to write code which, in the process of execution, performs some side effect, such as writing to a database table.

In F#, there's a dedicated "side-effectful" version of map called iter, which is designed explicitly for "dead end" operations over collections that don't return anything. This doesn't prevent the case above, but it does at least try to support separating pure and impure collection operations.
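As a minimal sketch of that split (the names doubled and seen are made up for illustration), List.map is used for the pure transformation and List.iter for the side effect:

```fsharp
// Pure transformation: List.map returns a new list and changes nothing else.
let doubled = [ 1 .. 5 ] |> List.map (fun n -> n * 2)

// Side effect only: List.iter returns unit, so nothing can be chained after it.
let seen = System.Collections.Generic.List<string>()
[ 1 .. 5 ] |> List.iter (fun n -> seen.Add(sprintf "saw %d" n))
```

Nothing stops you from putting a side effect inside the function passed to map, of course; iter just makes the intent visible in the code.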

Unfortunately, in the move from .NET Framework to .NET Core, one of the many optimisations in the base class library changed the number of items that LINQ internally iterated over when composing the OrderBy and FirstOrDefault methods. In other words (this is taken directly from the GitHub issue):

"We have a bit of code to reserve an account from an available pool, which looks like this:

var account =
    accounts
        .OrderBy(x => x.UsageCount)
        .FirstOrDefault(x => x.TryReserve(token));

After porting our code from .NET Framework to .NET Core, this now invokes the predicate method for every item in the list. In practise, this code now reserves ALL accounts and then returns the first."

Whoops! Imagine if this was a destructive change, e.g. delete the first order in the database that meets some condition. Sorry - now you've deleted all your orders.
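To see the effect on whatever runtime you happen to be using, here is a rough F# sketch of the scenario in the quoted issue. The Account type is a hypothetical stand-in that I've invented for illustration (its TryReserve always succeeds and takes no token); it is not the code from the issue.

```fsharp
open System
open System.Linq

// Hypothetical stand-in for the account pool: TryReserve always succeeds
// and records on the instance that it was called.
type Account(usageCount: int) =
    member val Reserved = false with get, set
    member this.UsageCount = usageCount
    member this.TryReserve() =
        this.Reserved <- true
        true

let accounts = [ for n in 1 .. 5 -> Account(n) ]

// Same shape as the query in the issue: order, then take the first match.
let account =
    accounts
        .OrderBy(Func<Account, int>(fun a -> a.UsageCount))
        .FirstOrDefault(Func<Account, bool>(fun a -> a.TryReserve()))

// How many accounts end up reserved depends on the LINQ implementation in use.
let reservedCount = accounts |> List.filter (fun a -> a.Reserved) |> List.length
printfn "Reserved %d of %d accounts" reservedCount accounts.Length
```

The returned account is the same either way; only the number of predicate calls, and therefore the number of reserved accounts, can differ between implementations.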

Property-based testing to the rescue

Obviously this is not a great place to be, but I wanted to try to create a thorough set of tests to see if (a) we could prove this issue, and (b) if there were any other methods that had been affected by similar optimisations.

At a high level, this means comparing the behaviour of the .NET Framework and .NET Core LINQ implementations. In terms of behaviour, I was interested in two things:

The result of calling both methods with the same input i.e. do they both give the same outputs?
The number of calls to any higher order functions supplied to both methods i.e. do they make the same number of calls, or has this been changed?

Testing through building blocks

We'll start with a basic helper function that we can use later on:

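/// A helper function that will track calls to any higher order function passed into another
/// function.
let trackCalls func higherOrderFunc data =
    let key = obj()
    // A ref cell rather than `let mutable`: closures cannot capture mutable locals.
    let count = ref 0
    let higherOrderFunc input =
        lock key (fun () -> count.Value <- count.Value + 1)
        higherOrderFunc input
    // Run the function first, so the count is read only after every call has happened.
    let result = func(data, higherOrderFunc)
    {| Result = result; CallCount = count.Value |}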

This function takes in some function, a higher order function that is used by the function, and some data that the function operates on. For example:

[1 .. 5].Select(fun n -> n * 2)

In this case:

[1 .. 5] is the data
Select is the function
fun n -> n * 2 is the higher order function that Select will call on every item

trackCalls silently decorates the higher order function with a counter, to monitor how many times the higher order function has been called. It then returns back out the result of the function, and the number of calls:

(Functions.trackCalls (Enumerable.Select >> Seq.toArray) (fun n -> n * 2) [ 1 .. 5 ])

//  { CallCount = 5
//    Result = [|2; 4; 6; 8; 10|] }

It's important to include the toArray call - this forces LINQ to fully evaluate the call across all data.
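Here is a small standalone sketch of why that matters. The counter and the doubler delegate are names I've introduced purely for illustration; they are not part of the helper above.

```fsharp
open System
open System.Linq

let calls = ref 0

// A doubling function that counts how often it is invoked.
let doubler =
    Func<int, int>(fun n ->
        calls.Value <- calls.Value + 1
        n * 2)

// Building the query runs nothing yet: Select is lazy.
let query = Enumerable.Select([ 1 .. 5 ], doubler)
let callsBeforeEnumeration = calls.Value // 0

// Materialising the sequence is what actually drives the calls.
let result = Seq.toArray query
let callsAfterEnumeration = calls.Value // 5
```

Without the toArray step, the call count would be read before LINQ had done any work at all.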

Creating a test function

With this helper, we can now create a generic "test" function:

let testTwoFuncs firstFunc secondFunc higherOrderFunc inputData =
    let actual = Functions.trackCalls firstFunc higherOrderFunc inputData
    let expected = Functions.trackCalls secondFunc higherOrderFunc inputData
    actual = expected

In other words, given two functions (in our case, a netcore and netfx implementation of some code), some higher order function and some input data, check that both functions return the same results and make the same number of calls to the higher order function.

Interestingly, we could rename higherOrderFunc and inputData as simply argOne and argTwo - because F# automatically genericises everything for us, this would work for any function that simply takes in two arguments in tupled form.

Now that we've done this, we can test out both implementations of a basic LINQ function - in this case Select:

testTwoFuncs
    (Enumerable.Select >> Seq.toArray) // net core implementation
    (OldEnumerable.Select >> Seq.toArray) // net fx implementation
    (fun x -> x * 2) // some arbitrary higher order function to use in Select
    [| 1 .. 10 |] // input data

This call will return true - for an input dataset of 1 to 10 with a higher order function that doubles the numbers, the NetCore and NetFx versions of Select both return the same result set and make the same number of calls.

OldEnumerable is a module I have created which is a port of a subset of the original netfx LINQ implementation.

Introducing FsCheck

Of course, now that we've done this, we can generalise this test by omitting the final two parameters (the higher order function and the input data set) and letting FsCheck test against other datasets (and against other random higher order functions!):

FsCheck.Check.Quick(
    testTwoFuncs
        (Enumerable.Select >> Seq.toArray)
        (OldEnumerable.Select >> Seq.toArray)
)

//Ok, passed 100 tests.

In other words, FsCheck has tried 100 combinations of random higher order functions (yes, FsCheck has generated functions for us!) and input data, and confirmed that the behaviour of both versions of Select is always the same.
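If you want to try the idea without the OldEnumerable port, the following is a self-contained sketch under a few assumptions: it runs as an F# script with the FsCheck NuGet package available, it uses Seq.map purely as a stand-in for the netfx implementation, and countCalls, viaLinq, viaSeq and sameBehaviour are names invented for this example.

```fsharp
#r "nuget: FsCheck"
open System
open System.Linq
open FsCheck

// Runs `run` with a counting wrapper around `f`, returning the result and the call count.
let countCalls (run: (int -> int) -> int[]) (f: int -> int) =
    let calls = ref 0
    let counted x =
        calls.Value <- calls.Value + 1
        f x
    let result = run counted
    result, calls.Value

// The LINQ implementation of the current runtime.
let viaLinq (xs: int[]) (f: int -> int) =
    Enumerable.Select(xs, Func<int, int>(f)) |> Seq.toArray

// Assumption: Seq.map stands in for OldEnumerable.Select here.
let viaSeq (xs: int[]) (f: int -> int) =
    xs |> Seq.map f |> Seq.toArray

// The property: same output and same number of calls for any function and any input.
let sameBehaviour (f: int -> int) (xs: int[]) =
    countCalls (viaLinq xs) f = countCalls (viaSeq xs) f

Check.Quick sameBehaviour
```

The explicit type annotations on sameBehaviour tell FsCheck which kinds of functions and arrays to generate.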

Summary

In this post I explained the breaking change that was introduced in .NET Core in LINQ. I then showed how we can write a simple function to decorate arbitrary functions for the purposes of call counting, before looking at how to compose this together with standard LINQ methods for the purposes of comparison.

We've not actually looked at the bug though - that'll come in my next post in this series!

Until then, have (fun -> _).

