
Asynchronous Injection

source link: https://blog.ploeh.dk/2019/02/11/asynchronous-injection/

How to combine asynchronous programming with Dependency Injection without leaky abstractions.

C# has decent support for asynchronous programming, but it ultimately leads to leaky abstractions. This is often conspicuous when combined with Dependency Injection (DI). This leads to frequently asked questions around the combination of DI and asynchronous programming. This article outlines the problem and suggests an alternative.

The code base supporting this article is available on GitHub.

A synchronous example #

In this article, you'll see various stages of a small sample code base that pretends to implement the server-side behaviour of an on-line restaurant reservation system (my favourite example scenario). In the first stage, the code uses DI, but no asynchronous I/O.

At the boundary of the application, a Post method receives a Reservation object:

public class ReservationsController : ControllerBase
{
    public ReservationsController(IMaîtreD maîtreD)
    {
        MaîtreD = maîtreD;
    }
 
    public IMaîtreD MaîtreD { get; }
 
    public IActionResult Post(Reservation reservation)
    {
        int? id = MaîtreD.TryAccept(reservation);
        if (id == null)
            return InternalServerError("Table unavailable");
 
        return Ok(id.Value);
    }
}

The Reservation object is just a simple bundle of properties:

public class Reservation
{
    public DateTimeOffset Date { get; set; }
    public string Email { get; set; }
    public string Name { get; set; }
    public int Quantity { get; set; }
    public bool IsAccepted { get; set; }
}

In a production code base, I'd favour a separation of DTOs and domain objects with proper encapsulation, but in order to keep the code example simple, here the two roles are combined.

The Post method simply delegates most work to an injected IMaîtreD object, and translates the return value to an HTTP response.

The code example is overly simplistic, to the point where you may wonder what is the point of DI, since it seems that the Post method doesn't perform any work itself. A slightly more realistic example includes some input validation and mapping between layers.

The IMaîtreD implementation is this:

public class MaîtreD : IMaîtreD
{
    public MaîtreD(int capacity, IReservationsRepository repository)
    {
        Capacity = capacity;
        Repository = repository;
    }
 
    public int Capacity { get; }
    public IReservationsRepository Repository { get; }
 
    public int? TryAccept(Reservation reservation)
    {
        var reservations = Repository.ReadReservations(reservation.Date);
        int reservedSeats = reservations.Sum(r => r.Quantity);
 
        if (Capacity < reservedSeats + reservation.Quantity)
            return null;
 
        reservation.IsAccepted = true;
        return Repository.Create(reservation);
    }
}

The protocol for the TryAccept method is that it returns the reservation ID if it accepts the reservation. If the restaurant has too little remaining Capacity for the requested date, it instead returns null. Regular readers of this blog will know that I'm no fan of null, but this keeps the example realistic. I'm also no fan of state mutation, but the example does that as well, by setting IsAccepted to true.

Introducing asynchrony #

The above example is entirely synchronous, but perhaps you wish to introduce some asynchrony. For example, the IReservationsRepository implies synchrony:

public interface IReservationsRepository
{
    Reservation[] ReadReservations(DateTimeOffset date);
 
    int Create(Reservation reservation);
}

In reality, though, you know that the implementation of this interface queries and writes to a relational database. Perhaps making this communication asynchronous could improve application performance. It's worth a try, at least.

How do you make something asynchronous in C#? You change the return type of the methods in question. Therefore, you have to change the IReservationsRepository interface:

public interface IReservationsRepository
{
    Task<Reservation[]> ReadReservations(DateTimeOffset date);
 
    Task<int> Create(Reservation reservation);
}

The Repository methods now return Tasks. This is the first leaky abstraction. From the Dependency Inversion Principle it follows that

"clients [...] own the abstract interfaces"

Robert C. Martin, APPP, chapter 11

The MaîtreD class is the client of the IReservationsRepository interface, which should be designed to support the needs of that class. MaîtreD doesn't need IReservationsRepository to be asynchronous.

The change to the interface has nothing to do with what MaîtreD needs, but rather with a particular implementation of the IReservationsRepository interface. Because this implementation queries and writes to a relational database, this implementation detail leaks into the interface definition. It is, therefore, a leaky abstraction.

On a more practical level, accommodating the change is easily done. Just add async and await keywords in appropriate places:

public async Task<int?> TryAccept(Reservation reservation)
{
    var reservations =
        await Repository.ReadReservations(reservation.Date);
    int reservedSeats = reservations.Sum(r => r.Quantity);
 
    if (Capacity < reservedSeats + reservation.Quantity)
        return null;
 
    reservation.IsAccepted = true;
    return await Repository.Create(reservation);
}

In order to compile, however, you also have to fix the IMaîtreD interface:

public interface IMaîtreD
{
    Task<int?> TryAccept(Reservation reservation);
}

This is the second leaky abstraction, and it's worse than the first. Perhaps you could successfully argue that it was conceptually acceptable to model IReservationsRepository as asynchronous. After all, a Repository conceptually represents a data store, and these are generally out-of-process resources that require I/O.

The IMaîtreD interface, on the other hand, is a domain object. It models how business is done, not how data should be accessed. Why should business logic be asynchronous?

It's hardly news that async and await are infectious. Once you introduce Tasks, it's async all the way!

That doesn't mean that asynchrony isn't one big leaky abstraction. It is.

You've probably already realised what this means in the context of the little example. You must also patch the Post method:

public async Task<IActionResult> Post(Reservation reservation)
{
    int? id = await MaîtreD.TryAccept(reservation);
    if (id == null)
        return InternalServerError("Table unavailable");
 
    return Ok(id.Value);
}

Pragmatically, I'd be ready to accept the argument that this isn't a big deal. After all, you just replace all return values with Tasks, and add async and await keywords where they need to go. This hardly impacts the maintainability of a code base.

In C#, I'd be inclined to just acknowledge that, hey, there's a leaky abstraction. Moving on...

On the other hand, sometimes people imply that it has to be like this. That there is no other way.

Falsifiable claims like that often get my attention. Oh, really?!

Move impure interactions to the boundary of the system #

We can pretend that Task<T> forms a functor. It's also a monad. Monads are those incredibly useful programming abstractions that have been propagating from their origin in statically typed functional programming languages to more mainstream languages like C#.

In functional programming, impure interactions happen at the boundary of the system. Taking inspiration from functional programming, you can move the impure interactions to the boundary of the system.

In the interest of keeping the example simple, I'll only move the impure operations one level out: from MaîtreD to ReservationsController. The approach can be generalised, although you may have to look into how to handle pure interactions.

Where are the impure interactions in MaîtreD? They are in the two interactions with IReservationsRepository. The ReadReservations method is non-deterministic, because the same input value can return different results, depending on the state of the database when you call it. The Create method causes a side effect to happen, because it creates a row in the database. This is one way in which the state of the database could change, which makes ReadReservations non-deterministic. Additionally, Create also violates Command Query Separation (CQS) by returning the ID of the row it creates. This, again, is non-deterministic, because the same input value will produce a new return value every time the method is called. (Incidentally, you should design Create methods so that they don't violate CQS.)
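
As a minimal sketch of a Create method that obeys CQS, assume that the caller supplies the ID (a Guid here) instead of receiving it as a return value. The ICqsReservationsRepository and InMemoryReservationsRepository names, and the id parameter, are hypothetical and not part of the sample code base:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class Reservation
{
    public DateTimeOffset Date { get; set; }
    public string Email { get; set; }
    public string Name { get; set; }
    public int Quantity { get; set; }
    public bool IsAccepted { get; set; }
}

// Hypothetical variation of IReservationsRepository: the caller chooses the
// ID, so Create is a Command that returns no data.
public interface ICqsReservationsRepository
{
    Task Create(Guid id, Reservation reservation);
}

// Hypothetical in-memory implementation, for illustration only.
public class InMemoryReservationsRepository : ICqsReservationsRepository
{
    private readonly Dictionary<Guid, Reservation> rows =
        new Dictionary<Guid, Reservation>();

    public int Count => rows.Count;

    public Task Create(Guid id, Reservation reservation)
    {
        rows[id] = reservation; // side effect only; nothing to return
        return Task.CompletedTask;
    }
}
```

With this shape, the client already knows the ID before it calls Create, and in this particular in-memory sketch repeating the call with the same ID overwrites the row rather than adding a new one.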

Move reservations to a method argument #

The first refactoring is the easiest. Move the ReadReservations method call to the application boundary. In the above state of the code, the TryAccept method unconditionally calls Repository.ReadReservations to populate the reservations variable. Instead of doing this from within TryAccept, just pass reservations as a method argument:

public async Task<int?> TryAccept(
    Reservation[] reservations,
    Reservation reservation)
{
    int reservedSeats = reservations.Sum(r => r.Quantity);
 
    if (Capacity < reservedSeats + reservation.Quantity)
        return null;
 
    reservation.IsAccepted = true;
    return await Repository.Create(reservation);
}

This no longer compiles until you also change the IMaîtreD interface:

public interface IMaîtreD
{
    Task<int?> TryAccept(Reservation[] reservations, Reservation reservation);
}

You probably think that this is a much worse leaky abstraction than returning a Task. I'd be inclined to agree, but trust me: ultimately, this will matter not at all.

When you move an impure operation outwards, it means that when you remove it from one place, you must add it to another. In this case, you'll have to query the Repository from the ReservationsController, which also means that you need to add the Repository as a dependency there:

public class ReservationsController : ControllerBase
{
    public ReservationsController(
        IMaîtreD maîtreD,
        IReservationsRepository repository)
    {
        MaîtreD = maîtreD;
        Repository = repository;
    }
 
    public IMaîtreD MaîtreD { get; }
    public IReservationsRepository Repository { get; }
 
    public async Task<IActionResult> Post(Reservation reservation)
    {
        var reservations =
            await Repository.ReadReservations(reservation.Date);
        int? id = await MaîtreD.TryAccept(reservations, reservation);
        if (id == null)
            return InternalServerError("Table unavailable");
 
        return Ok(id.Value);
    }
}

This is a refactoring in the true sense of the word. It just reorganises the code without changing the overall behaviour of the system. Now the Post method has to query the Repository before it can delegate the business decision to MaîtreD.

Separate decision from effect #

As far as I can tell, the main reason to use DI is because some impure interactions are conditional. This is also the case for the TryAccept method. Only if there's sufficient remaining capacity does it call Repository.Create. If it detects that there's too little remaining capacity, it immediately returns null and doesn't call Repository.Create.

In object-oriented code, DI is the most common way to decouple decisions from effects. Imperative code reaches a decision and calls a method on an object based on that decision. The effect of calling the method can vary because of polymorphism.

In functional programming, you typically use a functor like Maybe or Either to separate decisions from effects. You can do the same here.

The protocol of the TryAccept method already communicates the decision reached by the method. An int value is the reservation ID; this implies that the reservation was accepted. On the other hand, null indicates that the reservation was declined.

You can use the same sort of protocol, but instead of returning a Nullable<int>, you can return a Maybe<Reservation>:

public async Task<Maybe<Reservation>> TryAccept(
    Reservation[] reservations,
    Reservation reservation)
{
    int reservedSeats = reservations.Sum(r => r.Quantity);
 
    if (Capacity < reservedSeats + reservation.Quantity)
        return Maybe.Empty<Reservation>();
 
    reservation.IsAccepted = true;
    return reservation.ToMaybe();
}

This completely decouples the decision from the effect. By returning Maybe<Reservation>, the TryAccept method communicates the decision it made, while leaving further processing entirely up to the caller.

In this case, the caller is the Post method, which can now compose the result of invoking TryAccept with Repository.Create:

public async Task<IActionResult> Post(Reservation reservation)
{
    var reservations =
        await Repository.ReadReservations(reservation.Date);
    Maybe<Reservation> m =
        await MaîtreD.TryAccept(reservations, reservation);
    return await m
        .Select(async r => await Repository.Create(r))
        .Match(
            nothing: Task.FromResult(InternalServerError("Table unavailable")),
            just: async id => Ok(await id));
}

Notice that the Post method never attempts to extract 'the value' from m. Instead, it injects the desired behaviour (Repository.Create) into the monad. The result of calling Select with an asynchronous lambda expression like that is a Maybe<Task<int>>, which is an awkward combination. You can fix that later.

The Match method is the catamorphism for Maybe. It looks exactly like the Match method on the Church-encoded Maybe. It handles both the case when m is empty, and the case when m is populated. In both cases, it returns a Task<IActionResult>.
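
The Maybe<T> type itself isn't shown in this article. As a rough sketch, assuming only the members the examples call (Select, Match, Maybe.Empty, and ToMaybe), it could look something like this. The field names and the null check are assumptions, and the actual type in the code base may differ:

```csharp
using System;

public sealed class Maybe<T>
{
    private readonly bool hasItem;
    private readonly T item;

    // The empty case.
    public Maybe()
    {
        hasItem = false;
    }

    // The populated case.
    public Maybe(T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));

        hasItem = true;
        this.item = item;
    }

    // Functor: map the contained value, if there is one.
    public Maybe<TResult> Select<TResult>(Func<T, TResult> selector)
    {
        return hasItem
            ? new Maybe<TResult>(selector(item))
            : new Maybe<TResult>();
    }

    // Catamorphism: reduce both cases to a single value.
    public TResult Match<TResult>(TResult nothing, Func<T, TResult> just)
    {
        return hasItem ? just(item) : nothing;
    }
}

public static class Maybe
{
    public static Maybe<T> Empty<T>()
    {
        return new Maybe<T>();
    }

    public static Maybe<T> ToMaybe<T>(this T source)
    {
        return new Maybe<T>(source);
    }
}
```

Notice that there's no public way to read the contained value directly. A caller has to go through Select or Match, which is what forces it to handle both cases.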

Synchronous domain logic #

At this point, you have a compiler warning in your code:

Warning CS1998 This async method lacks 'await' operators and will run synchronously. Consider using the 'await' operator to await non-blocking API calls, or 'await Task.Run(...)' to do CPU-bound work on a background thread.

Indeed, the current incarnation of TryAccept is synchronous, so remove the async keyword and change the return type:

public Maybe<Reservation> TryAccept(
    Reservation[] reservations,
    Reservation reservation)
{
    int reservedSeats = reservations.Sum(r => r.Quantity);
 
    if (Capacity < reservedSeats + reservation.Quantity)
        return Maybe.Empty<Reservation>();
 
    reservation.IsAccepted = true;
    return reservation.ToMaybe();
}

This requires a minimal change to the Post method: it no longer has to await TryAccept:

public async Task<IActionResult> Post(Reservation reservation)
{
    var reservations =
        await Repository.ReadReservations(reservation.Date);
    Maybe<Reservation> m = MaîtreD.TryAccept(reservations, reservation);
    return await m
        .Select(async r => await Repository.Create(r))
        .Match(
            nothing: Task.FromResult(InternalServerError("Table unavailable")),
            just: async id => Ok(await id));
}

Apart from that, this version of Post is the same as the one above.

Notice that at this point, the domain logic (TryAccept) is no longer asynchronous. The leaky abstraction is gone.

Redundant abstraction #

The overall work is done, but there's some tidying up remaining. If you review the TryAccept method, you'll notice that it no longer uses the injected Repository. You might as well simplify the class by removing the dependency:

public class MaîtreD : IMaîtreD
{
    public MaîtreD(int capacity)
    {
        Capacity = capacity;
    }
 
    public int Capacity { get; }
 
    public Maybe<Reservation> TryAccept(
        Reservation[] reservations,
        Reservation reservation)
    {
        int reservedSeats = reservations.Sum(r => r.Quantity);
 
        if (Capacity < reservedSeats + reservation.Quantity)
            return Maybe.Empty<Reservation>();
 
        reservation.IsAccepted = true;
        return reservation.ToMaybe();
    }
}

The TryAccept method is now deterministic. The same input will always return the same output. This is not yet a pure function, because it still has a single side effect: it mutates the state of reservation by setting IsAccepted to true. You could, however, without too much trouble refactor Reservation to an immutable Value Object.

This would enable you to write the last part of the TryAccept method like this:

return reservation.Accept().ToMaybe();
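
A sketch of what such an immutable Reservation might look like follows. The constructor shape, the body of Accept, and the equality members are assumptions made for illustration, not code from the repository:

```csharp
using System;

public sealed class Reservation
{
    public Reservation(
        DateTimeOffset date,
        string email,
        string name,
        int quantity,
        bool isAccepted = false)
    {
        Date = date;
        Email = email;
        Name = name;
        Quantity = quantity;
        IsAccepted = isAccepted;
    }

    // Read-only properties: no setters, so no state mutation.
    public DateTimeOffset Date { get; }
    public string Email { get; }
    public string Name { get; }
    public int Quantity { get; }
    public bool IsAccepted { get; }

    // Returns an accepted copy; the original object is left untouched.
    public Reservation Accept()
    {
        return new Reservation(Date, Email, Name, Quantity, true);
    }

    // Value Objects compare by value rather than by reference.
    public override bool Equals(object obj)
    {
        return obj is Reservation other
            && Date == other.Date
            && Email == other.Email
            && Name == other.Name
            && Quantity == other.Quantity
            && IsAccepted == other.IsAccepted;
    }

    public override int GetHashCode()
    {
        return (Date, Email, Name, Quantity, IsAccepted).GetHashCode();
    }
}
```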

In any case, the method is close enough to pure that it's testable. The interactions between TryAccept and any client code (including unit tests) are completely controllable and observable by the client.
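
To illustrate, here's a sketch of two checks that exercise TryAccept with nothing but plain values. The MaîtreDExamples class and the trimmed-down Maybe<T> are stand-ins invented for this example; a real test suite would use a unit testing framework instead of methods that return bool:

```csharp
using System;
using System.Linq;

public class Reservation
{
    public DateTimeOffset Date { get; set; }
    public int Quantity { get; set; }
    public bool IsAccepted { get; set; }
}

// Minimal stand-in for Maybe<T>; only what this example needs.
public sealed class Maybe<T>
{
    private readonly bool hasItem;
    private readonly T item;

    public Maybe() { }

    public Maybe(T item)
    {
        hasItem = true;
        this.item = item;
    }

    public TResult Match<TResult>(TResult nothing, Func<T, TResult> just)
    {
        return hasItem ? just(item) : nothing;
    }
}

public static class Maybe
{
    public static Maybe<T> Empty<T>() => new Maybe<T>();
    public static Maybe<T> ToMaybe<T>(this T source) => new Maybe<T>(source);
}

public class MaîtreD
{
    public MaîtreD(int capacity)
    {
        Capacity = capacity;
    }

    public int Capacity { get; }

    public Maybe<Reservation> TryAccept(
        Reservation[] reservations,
        Reservation reservation)
    {
        int reservedSeats = reservations.Sum(r => r.Quantity);

        if (Capacity < reservedSeats + reservation.Quantity)
            return Maybe.Empty<Reservation>();

        reservation.IsAccepted = true;
        return reservation.ToMaybe();
    }
}

// Hypothetical checks: no Stub or Mock, only values in and a value out.
public static class MaîtreDExamples
{
    public static bool AcceptsWhenCapacityIsSufficient()
    {
        var sut = new MaîtreD(capacity: 10);
        var existing = new[] { new Reservation { Quantity = 4 } };
        var actual = sut.TryAccept(existing, new Reservation { Quantity = 6 });
        return actual.Match(false, r => r.IsAccepted);
    }

    public static bool RejectsWhenCapacityIsInsufficient()
    {
        var sut = new MaîtreD(capacity: 10);
        var existing = new[] { new Reservation { Quantity = 4 } };
        var actual = sut.TryAccept(existing, new Reservation { Quantity = 7 });
        return actual.Match(true, r => false);
    }
}
```

Both checks supply the existing reservations directly as an array, so the input that previously came from the database is now just another argument.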

This means that there's no reason to Stub it out. You might as well just use the function directly in the Post method:

public class ReservationsController : ControllerBase
{
    public ReservationsController(
        int capacity,
        IReservationsRepository repository)
    {
        Capacity = capacity;
        Repository = repository;
    }
 
    public int Capacity { get; }
    public IReservationsRepository Repository { get; }
 
    public async Task<IActionResult> Post(Reservation reservation)
    {
        var reservations =
            await Repository.ReadReservations(reservation.Date);
        Maybe<Reservation> m =
            new MaîtreD(Capacity).TryAccept(reservations, reservation);
        return await m
            .Select(async r => await Repository.Create(r))
            .Match(
                nothing: Task.FromResult(InternalServerError("Table unavailable")),
                just: async id => Ok(await id));
    }
}

Notice that ReservationsController no longer has an IMaîtreD dependency.

All this time, whenever you changed the TryAccept method signature, you also had to fix the IMaîtreD interface to make the code compile. If you worried that all of these changes were leaky abstractions, you'll be happy to learn that in the end, it doesn't even matter. No code uses that interface, so you can delete it.

Grooming #

The MaîtreD class looks fine, but the Post method could use some grooming. I'm not going to tire you with all the small refactoring steps. You can follow them in the GitHub repository if you're interested. Eventually, you could arrive at an implementation like this:

public class ReservationsController : ControllerBase
{
    public ReservationsController(
        int capacity,
        IReservationsRepository repository)
    {
        Capacity = capacity;
        Repository = repository;
        maîtreD = new MaîtreD(capacity);
    }
 
    public int Capacity { get; }
    public IReservationsRepository Repository { get; }
 
    private readonly MaîtreD maîtreD;
 
    public async Task<IActionResult> Post(Reservation reservation)
    {
        return await Repository.ReadReservations(reservation.Date)
            .Select(rs => maîtreD.TryAccept(rs, reservation))
            .SelectMany(m => m.Traverse(Repository.Create))
            .Match(InternalServerError("Table unavailable"), Ok);
    }
}

Now the Post method is just a single, composed asynchronous pipeline. Is it a coincidence that this is possible?

This is no coincidence. This top-level method executes in the 'Task monad', and a monad is, by definition, composable. You can chain operations together, and they don't all have to be asynchronous. Specifically, maîtreD.TryAccept is a synchronous piece of business logic. It's unaware that it's being injected into an asynchronous context. This type of design would be completely run of the mill in F# with its asynchronous workflows.
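
The pipeline relies on a few helper methods that aren't shown in this article: Select and SelectMany for Task<T>, Traverse for Maybe<T>, and a Match overload for Task<Maybe<T>>. What follows is a sketch of how they might be written, not the actual code from the repository. The TaskMaybeExtensions class name and the trimmed-down Maybe<T> are assumptions:

```csharp
using System;
using System.Threading.Tasks;

// Minimal stand-in for Maybe<T>; only what this example needs.
public sealed class Maybe<T>
{
    private readonly bool hasItem;
    private readonly T item;

    public Maybe() { }

    public Maybe(T item)
    {
        hasItem = true;
        this.item = item;
    }

    public TResult Match<TResult>(TResult nothing, Func<T, TResult> just)
    {
        return hasItem ? just(item) : nothing;
    }
}

public static class TaskMaybeExtensions
{
    // Functor map: apply a synchronous function to the eventual result.
    public static async Task<TResult> Select<T, TResult>(
        this Task<T> source,
        Func<T, TResult> selector)
    {
        return selector(await source);
    }

    // Monadic bind: chain another asynchronous step without nesting Tasks.
    public static async Task<TResult> SelectMany<T, TResult>(
        this Task<T> source,
        Func<T, Task<TResult>> selector)
    {
        return await selector(await source);
    }

    // Runs an asynchronous function inside a Maybe and returns
    // Task<Maybe<TResult>> instead of the awkward Maybe<Task<TResult>>.
    public static Task<Maybe<TResult>> Traverse<T, TResult>(
        this Maybe<T> source,
        Func<T, Task<TResult>> selector)
    {
        return source.Match<Task<Maybe<TResult>>>(
            nothing: Task.FromResult(new Maybe<TResult>()),
            just: async x => new Maybe<TResult>(await selector(x)));
    }

    // Match, lifted so that it can be called on a Task<Maybe<T>>.
    public static async Task<TResult> Match<T, TResult>(
        this Task<Maybe<T>> source,
        TResult nothing,
        Func<T, TResult> just)
    {
        return (await source).Match(nothing, just);
    }
}
```

Only the selector passed to Select is synchronous, and that's the slot where maîtreD.TryAccept goes. Everything asynchronous stays in the helpers and in the Repository.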

Summary #

Dependency Injection frequently involves I/O-bound operations. Those typically get hidden behind interfaces so that they can be mocked or stubbed. You may want to access those I/O-bound resources asynchronously, but with C#'s support for asynchronous programming, you'll have to make your abstractions asynchronous.

When you make the leaf nodes in your call graph asynchronous, that design change ripples through the entire code base, forcing you to be async all the way. One result of this is that the domain model must also accommodate asynchrony, although this is rarely required by the logic it implements. These concessions to asynchrony are leaky abstractions.

Pragmatically, it's hardly a big problem. You can use the async and await keywords to deal with the asynchrony, and it's unlikely to, in itself, cause a problem with maintenance.

In functional programming, monads can address asynchrony without introducing sweeping leaky abstractions. Instead of making DI asynchronous, you can inject desired behaviour into an asynchronous context.

Behaviour Injection, not Dependency Injection.

