Gamlor's Blog

December 13, 2021

Automated Tests Advice, C# Edition

This post is part of C# Advent Calendar 2021.

I do like writing automated tests, for two reasons. First, they give me a fast feedback loop: testing on the fully running app is usually time-consuming compared to running a test. Second, over time they give me some confidence that changes in the code didn’t break the application in unexpected ways.

Anyway, here are a few of my recommendations. Of course, don’t follow my advice blindly. Use your judgment on whether the advice fits your project.

Advice 1: Use Mocking Libraries Sparingly

When I was less experienced and discovered mocking libraries like Moq, FakeItEasy etc, I found them awesome and used them everywhere. However, I now avoid mocking libraries and if I use them, I use them sparingly.

The main issues with overusing mocking libraries that I’ve experienced:

Long, complicated setup code, which builds up an object graph and implements some behavior with awkward methods instead of regular code, something like:

[SetUp]
public void Setup()
{
    var paperFormatsMock = new Mock<IPaperFormat>();
    paperFormatsMock.Setup(p => p.SupportedFormatCount).Returns(42);
    // More stuff about papers formats

    var printerQueue = new Mock<IPrintQueue>();
    printerQueue.Setup(p => p.PaperJam).Returns(false);
    // More stuff about the queue

    // Build up a mocked object graph
    var printerMock = new Mock<IPrinter>();
    printerMock.Setup(p => p.PaperFormats).Returns(paperFormatsMock.Object);
    printerMock.Setup(p => p.PrintQueue).Returns(printerQueue.Object);
}

Does your mock simulate the mocked object well? You don’t know. The real object might behave very differently.
If the real implementation changes, your tests using mocks won’t notice.
Often this mock code is copied and pasted around, spreading a shadow implementation even further.
Many issues arise in the interaction between components. If I use mocks, I don’t see any realistic interaction between components.
You can’t debug those mocks easily, for example, to see why a mock is invoked with certain parameters.

Figure 1. Tested against too many mocks

What I do instead

Whenever possible: Use the real implementation instead of a mock. I rarely mock my own interfaces, because I can use my real implementation. I only mock things that are too slow for fast tests or that are part of a complex framework.
Often I write a mock as plain code, which I can reuse in many tests. For example, I worked on some app that used git operations, provided by some framework. For that, I wrote a very basic mock implementation in straight code. I can reuse it in many tests, and everybody can understand it:

interface IGitRepo
{
    String FetchFile(string revision, string path);
    IEnumerable<String> ListFiles(string revision);
    // ...
}

class MockGitRepo : IGitRepo
{
    private IDictionary<String, IDictionary<string, string>> content;

    /// <param name="content"> revision->Map of file->content</param>
    public MockGitRepo(IDictionary<string, IDictionary<string, string>> content)
    {
        this.content = content;
    }

    public string FetchFile(string revision, string path)
    {
        var revisionContent = content[revision];
        // Look up the file by its path inside the revision
        if (revisionContent.TryGetValue(path, out string? fileContent))
        {
            return fileContent;
        }
        else
        {
            throw new GitFileNotFound($"File {path} not found in revision {revision}");
        }
    }

    public IEnumerable<string> ListFiles(string revision)
    {
        return content[revision].Keys;
    }
}

[Test]
public void ReadGitIgnoreFiles()
{
    var gitRepo = new MockGitRepo(new Dictionary<string, IDictionary<string, string>>()
    {
        { "master", new Dictionary<string, string>() { {".gitignore", "Temp"} } }
    });

    // Do the test
}

Last, if writing a manual mock is too much effort, for example, due to a large API surface, then I fall back to mocking libraries. I keep the mocks in a central place for most tests, sharing the mock setup between tests. The goal is that the mock setup gets more realistic as different tests need more behavior of the mocked API:

static class PrintFrameworkMocks
{
    public static Mock<IPrintQueue> PrintQueue(bool paperJam = false)
    {
        var printerQueue = new Mock<IPrintQueue>();
        printerQueue.Setup(p => p.PaperJam).Returns(paperJam);
        return printerQueue;
    }
    // Shared by most of the tests, with convenience parameters
    public static Mock<IPrinter> Printer(IPrintQueue? queue = null)
    {
        if (queue == null)
        {
            queue = PrintQueue().Object;
        }
        var paperFormatsMock = new Mock<IPaperFormat>();
        paperFormatsMock.Setup(p => p.SupportedFormatCount).Returns(42);
        var printerMock = new Mock<IPrinter>();
        printerMock.Setup(p => p.PaperFormats).Returns(paperFormatsMock.Object);

        printerMock.Setup(p => p.PrintQueue).Returns(queue);
        // As tests grow, the mock setup will 'simulate' more and more of the API, for all tests.
        return printerMock;
    }
}

Advice 2: Pass in Time Explicitly

I wrote a whole blog post about this. In short: If your business logic uses time, pass it along explicitly. Avoid calling things like DateTime.Now in the middle of the code.

Figure 2. Be Precise about Time

When you sprinkle DateTime.Now through your code and the logic depends on it, testing gets hard, especially if you need to test edge cases. Furthermore, you might get an inconsistent and hard-to-reproduce result if two phases of the code see a different time tick.

If you pass along the time your operation should use, testing gets easy. You can fix exact times, you can replay edge cases you hit in the past, and your system uses the same time reference for one logical operation. I sometimes go so far as to allow specifying the time in a REST interface. It is very useful for quickly reproducing and testing things in the production system. Of course, specifying the time probably needs admin permissions ;).

A rough example:

[HttpGet]
public PurchaseSummary Get(string atTime)
{
    // Support an optional time query parameter to show reports of the past
    DateTime time;
    if (string.IsNullOrEmpty(atTime))
        time = DateTime.UtcNow;
    else
        time = DateTime.Parse(atTime);
    // Assume more complex code, like access checks, showing partial or full data depending on permissions etc
    return _summaries.SummarizeLastWeek(time);
}

// Testing edge cases is easy.
[Test]
public void WeekEndsAtTheEndOfYear()
{
    var endOfYear = DateTime.Parse("2020-12-31T23:00:00Z");
    /* Snip: insert test data at the end of the year */

    var endOfYearSummary = toTest.SummarizeLastWeek(endOfYear);
    /* Check that end of year calculation is correct */
}

public class Summaries
{
    public PurchaseSummary SummarizeLastWeek(DateTime at)
    {
        // Assume more complex logic. Maybe it goes to different data sources, does more calculations etc
        var purchases = _purchases.LastWeeksPurchases(at);

        var total = purchases.Sum(e => e.Price);
        var count = purchases.Count();
        var priciest = purchases.OrderByDescending(p=>p.Price).FirstOrDefault();
        return new PurchaseSummary(total, count, priciest);
    }
}

public class Database
{
    public IList<Purchase> LastWeeksPurchases(DateTime at)
    {
        // Assume more complex queries, let's say joins with applied coupons, discounts at the time and god knows what
        return Database.InTransaction(conn => conn.Query<Purchase>(@"select * from purchases
        where dateTime < @at
        and (@at + INTERVAL '-7 day') < dateTime
        order by dateTime", new {at=at}).ToList());
    }
}
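
The same idea also works without a database. The following is a minimal sketch, assuming purchases are already in memory; the Purchase record and the WeeklyReport class are made up for this illustration and are not part of the code above. The time window mirrors the SQL query: strictly after `at` minus seven days and strictly before `at`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical purchase type, only for this sketch.
public record Purchase(DateTime At, decimal Price);

public static class WeeklyReport
{
    // The reference time is a parameter. Nothing in here reads DateTime.UtcNow,
    // so every step of the calculation sees the same point in time.
    public static decimal TotalOfLastWeek(IEnumerable<Purchase> purchases, DateTime at)
    {
        var from = at.AddDays(-7);
        return purchases
            .Where(p => from < p.At && p.At < at)
            .Sum(p => p.Price);
    }
}
```

A test can then pin `at` to something like midnight on January 1st and check that a purchase from 23:00 on December 31st is counted, while a purchase made exactly at `at` is not.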

Advice 3: Test Against A Real Database

Databases are powerful and complex beasts. A typical application will end up relying on the particular behavior of the database: how it handles sorting, search (like collation), nulls, dates and times (like rounding of datetimes), and god knows what other oddities lurk deep in the database.

Therefore, I try to use the real database. Luckily, databases are fast, so it is usually possible to keep the test time reasonable even when using the real database.

Figure 3. Developer in SQL land

These things help with running tests against a real database:

Docker & containers: A few years ago it was hard to automate a test database setup that works out of the box. Many databases expected complicated setups which made automation painful. These days there are Docker images ready to go. Worst case you have to create your own image, (I look at you, Oracle >:-( ) but then you are done.
You can keep one or more databases with reasonable test data around, so your tests run against real-world-like data.
You can often use transactions as a 'cheap' DB cleanup. The test starts a transaction, and the code under test doesn’t commit it. At the end of the test, the transaction is rolled back, and the DB acts as if nothing happened. When that is not possible, there are libraries like Respawn, which reset the database by deleting the test data between tests.
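
The transaction cleanup can be sketched with the provider-independent System.Data.Common types. This is only a rough sketch: the helper name RunAndRollBack is made up for this illustration, and it assumes the code under test works on the passed-in transaction and never commits it.

```csharp
using System;
using System.Data.Common;

public static class TestTransactions
{
    // Runs the test body inside a transaction and always rolls it back,
    // whether the body passes, fails an assertion, or throws.
    public static void RunAndRollBack(DbConnection connection, Action<DbTransaction> test)
    {
        if (connection is null) throw new ArgumentNullException(nameof(connection));
        if (test is null) throw new ArgumentNullException(nameof(test));

        using var transaction = connection.BeginTransaction();
        try
        {
            test(transaction);
        }
        finally
        {
            // Undo everything the test wrote, so the next test sees a clean DB.
            transaction.Rollback();
        }
    }
}
```

In a test, I would open a connection to the test database, call TestTransactions.RunAndRollBack(connection, tx => { ... }), and do the inserts, the call to the code under test, and the assertions inside the lambda.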

Advice 4: Integration Tests First

At the beginning of my testing career, I often wrote a lot of unit tests but not many integration tests. These days I often keep the tests high level first. For example, if my system offers a REST endpoint, then I write the tests towards that REST API. And if the implementation is simple, I probably do not add lower-level tests.

Why that order?

The high-level API (like REST/other endpoint) is the thing that needs to be stable. If I heavily rework the implementation, that high-level API stays the same and my tests still work against it.
Many things are just trivial, at least at the beginning. A new REST endpoint that is another CRUD-like construct with not much logic to it. You have done this 100 times, know exactly how to write/copy it from the right places. Any low-level tests won’t give you much more insight. Once that service gets more complicated, you can still add lower-level tests.
Such high-level tests ensure the functionality is there and wired up correctly, and that it covers the basic case. After that, you can react depending on what happens. If the feature gets more complex, it is worth adding more tests. If the feature never gets more complex, you are good. Or maybe the feature didn’t have the demand we thought and you remove it again. Then you avoided writing tons of testing code for it ;)

Advice 5: Parameterized Tests Are Great

Quite often a test can be repeated with different input parameters and slightly tweaked assertions. Use that fact to easily extend the tests with more examples.

Many test frameworks have explicit support for parameterized tests. However, don’t be afraid of writing regular code if the test framework doesn’t cover your case well. You won’t get the shiny test reports, but I’d rather have the tests than the report ;).

Figure 4. Parameterize Tests

Here are a few examples:

[Test]
public void DetectServiceFromUrl()
{
    var expectedDetection = new Dictionary<String, WellKnownEmailService>()
    {
        { "https://gmail.com", WellKnownEmailService.GMail },
        { "https://mail.google.com", WellKnownEmailService.GMail },
        { "https://hotmail.com", WellKnownEmailService.Microsoft },
        { "https://outlook.com", WellKnownEmailService.Microsoft },
        { "https://live.com", WellKnownEmailService.Microsoft },
        { "https://mail.company-x.com", WellKnownEmailService.OtherService },
    };

    foreach (var expected in expectedDetection)
    {
        var result = ParseUrls(expected.Key);

        Assert.AreEqual(expected.Value, result);
    }
}


[Test]
public void TitleAndBodyIsPresent()
{
    var implementations = new List<IMailHandler>()
    {
        new GmailFormat(GmailHandler.Public),
        new GmailFormat(GmailHandler.Company),
        new OutlookFormat(),
        new MdnFormat()
    };

    foreach (var implementation in implementations)
    {
        var formatted = implementation.FormatHtmlQuirks("Title", "Body of the Email");

        // No matter the html formatting flavor,
        // The title and the body have to be present
        Assert.IsTrue(formatted.Contains("Title"));
        Assert.IsTrue(formatted.Contains("Body of the Email"));
    }
}

[Test]
public void SerializeInts()
{
    var rnd = new Random();
    var baseCases = new[]
    {
        0,
        -1,
        1,
        int.MaxValue,
        int.MinValue
    };
    var randomExamples = Enumerable.Range(0, 10).Select(i => rnd.Next());

    var testExamples = baseCases.Concat(randomExamples);
    foreach (var original in testExamples)
    {
        var serialized = MyCleverSerialization.Serialize(original);

        var deserialized = MyCleverSerialization.Deserialize(serialized);

        Assert.AreEqual(original, deserialized);
    }
}
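
For comparison, here is roughly how the URL detection example could look with the framework's own support, assuming NUnit and its TestCase attribute. The EmailServiceDetector class is a made-up stand-in for the ParseUrls method above, so that the sketch is complete on its own.

```csharp
using System;
using NUnit.Framework;

public enum WellKnownEmailService { GMail, Microsoft, OtherService }

// Hypothetical stand-in for the real detection logic.
public static class EmailServiceDetector
{
    public static WellKnownEmailService Detect(string url)
    {
        var host = new Uri(url).Host.ToLowerInvariant();
        return host switch
        {
            "gmail.com" or "mail.google.com" => WellKnownEmailService.GMail,
            "hotmail.com" or "outlook.com" or "live.com" => WellKnownEmailService.Microsoft,
            _ => WellKnownEmailService.OtherService,
        };
    }
}

[TestFixture]
public class EmailServiceDetectorTests
{
    // Each TestCase shows up as its own entry in the test report.
    [TestCase("https://gmail.com", WellKnownEmailService.GMail)]
    [TestCase("https://mail.google.com", WellKnownEmailService.GMail)]
    [TestCase("https://hotmail.com", WellKnownEmailService.Microsoft)]
    [TestCase("https://outlook.com", WellKnownEmailService.Microsoft)]
    [TestCase("https://live.com", WellKnownEmailService.Microsoft)]
    [TestCase("https://mail.company-x.com", WellKnownEmailService.OtherService)]
    public void DetectsServiceFromUrl(string url, WellKnownEmailService expected)
    {
        Assert.That(EmailServiceDetector.Detect(url), Is.EqualTo(expected));
    }
}
```

The trade-off compared to the plain loop: a failing case is reported individually instead of stopping the loop at the first failed assertion, but the arguments have to be compile-time constants.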

Summary

These are my five pieces of advice, based on approaches I use often but didn’t know or use when I was less experienced.
