C# Pattern Matching Explained

 2 years ago
source link: https://blog.ndepend.com/c-pattern-matching-explained/

Since version 7, C# has support for pattern matching. C# pattern matching is here to simplify complex if-else statements into more compact and readable code. Pattern matching was not introduced to enable code that could not be written otherwise. Its only purpose is to make code more concise and elegant.

I believe the only way to present pattern matching in a proper and complete way is to explain its evolution through C# versions. Otherwise it is cumbersome to illustrate what can be expressed and what cannot. The present post will be updated with future C# evolutions.

C# 7: Null, Constant, Type, Discard and Var Patterns

C#7 introduced checking against a null pattern and a constant pattern.

static void NullPattern(object o, int? ni) {
   if (o is null)        Console.WriteLine("o is null");
   if (!(ni is null))    Console.WriteLine(ni.Value);
}
static void ConstantPattern(double d, string str) {
   if (d is Math.PI)     Console.WriteLine("d is PI");
   if (str is "314159")  Console.WriteLine("str looks like PI");
}

One limitation of pattern matching, still present in the latest C# version, is that values embedded in a pattern must be constant. In practice, the values to check against are often not hardcoded but retrieved from some configuration. This limitation makes pattern matching useless in a number of situations.

The constants supported by patterns are:

  • an integer or floating-point numerical literal
  • a char or a string literal
  • a boolean value true or false
  • an enumeration value
  • the name of a declared const field or null
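
When the value to compare against is only known at runtime, a when guard can carry the comparison instead of the pattern itself. The sketch below is a minimal illustration of that workaround; the Settings class and its MaxRetries field are hypothetical stand-ins for a value read from configuration.

```csharp
using System;

// Hypothetical settings holder: the value is only known at runtime.
public static class Settings {
   public static int MaxRetries = 3;   // not a const, so it cannot appear in a pattern
}

public static class ConstantLimitationDemo {
   public static string Describe(int retries) {
      // 'case Settings.MaxRetries:' would not compile because a constant is expected.
      // A 'when' guard accepts any boolean expression instead.
      switch (retries) {
         case 0:                                    return "no retry";
         case int n when n == Settings.MaxRetries:  return "max retries reached";
         default:                                   return "some retries";
      }
   }
}
```

The guard is evaluated at runtime like any other boolean expression, so the compiler cannot use it to reason about which values are already handled.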

C#7 also introduced the type pattern, which is a great improvement, especially to introduce temporary variables in complex bool expressions:

static void TypePattern(Shape shape) {
   if (shape is Circle circle1)
      Console.WriteLine($"shape is a circle of radius {circle1.Radius}");
   // Type pattern and compound expressions
   if (shape is Rectangle rect && rect.Width == rect.Height)
      Console.WriteLine($"shape is a square of length {rect.Width}");
}

Pattern matching also works with switch statements. This is especially useful to manage control flow through types that aren’t related by an inheritance hierarchy.

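public static int Count<T>(this IEnumerable<T> seq) {
   switch (seq) {
      case Array a:                   return a.Length;
      case ICollection<T> c:          return c.Count;
      case IReadOnlyCollection<T> c:  return c.Count;
      // Matches if seq is not null; the underscore discards the matched value.
      // Enumerable.Count() is called explicitly so that this method does not call itself.
      case IEnumerable<T> _:          return Enumerable.Count(seq);
      // Reached only when seq is null
      default:                        return 0;
   }
}

Notice the last default: clause, which is reached only when seq is null because a type pattern never matches null. The underscore in case IEnumerable<T> _: is a discard: the value matches the type but is not bound to a variable.

Notice also that the order of the case clauses matters: the first clause that matches wins. The compiler is smart enough to report an error when a case clause can never be reached because an earlier clause already handles all the values it would match.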

In a switch statement, the keyword when can also be used to add conditions that refine the pattern:

public static int CountUpTo10<T>(this IEnumerable<T> seq) {
   switch (seq) {
      case Array a when a.Length <= 10:                  return a.Length;
      case ICollection<T> c when c.Count <= 10:          return c.Count;
      case IReadOnlyCollection<T> c when c.Count <= 10:  return c.Count;
      default:    throw new ArgumentException("Too large sequence");
   }
}

Notice that the keyword when cannot be used in an if statement, even with the latest C# versions. It is only accepted in the case clauses of a switch statement and in the arms of a switch expression.
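
A guard that when would express in a case clause can be written with && in an if statement. The sketch below illustrates this; the class and method names are illustrative only.

```csharp
using System;

public static class GuardInIfDemo {
   // In an if statement the extra condition follows the pattern after &&.
   // The variable introduced by the pattern is definitely assigned on the right of &&.
   public static bool IsShortArray(object o) {
      if (o is Array a && a.Length <= 10)
         return true;
      return false;
   }
}
```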

Finally, C# 7 introduced the var pattern, a special type pattern that matches even when the value is null.

static void VarPattern() {
   object o = null;
   Assert.IsFalse(o is object);
   Assert.IsTrue(o is var v);
}

It is not recommended to use the var pattern to skip a null check, because an undefined null state is dangerous. In practice, the var pattern is used in complex situations where anonymous types are involved.

C# 8: Switch Expressions and Property, Positional and Tuple Patterns

C# 8 improved pattern matching in several ways. First, there is the switch expression:

static void SwitchExpression(Shape shape) {
   string whatShape = shape switch {
      Circle r     => $"shape is a circle of radius {r.Radius}",
      Rectangle _  =>  "shape is a rectangle",
      _            =>  "shape is null or not a known type of shape"
   };
}

Arguably the code is more readable because quite a lot of characters are saved: no more case keyword, no more variable declaration, and the bodies are expressions. Also, unlike a regular switch statement, the compiler warns (but does not emit an error) about possible values that are not handled. An exception is thrown at runtime when a switch expression reaches the end without a match.
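
The runtime behavior of a non-exhaustive switch expression can be observed with a small sketch like the one below. The names are illustrative. On .NET Core 3.0 and later the exception thrown is a SwitchExpressionException, which derives from InvalidOperationException, so catching the base type is assumed to be enough here.

```csharp
using System;

public static class SwitchExpressionDemo {
   // Deliberately non-exhaustive: the compiler only warns that some int values are not handled.
   public static string Name(int i) => i switch {
      0 => "zero",
      1 => "one"
   };

   public static string SafeName(int i) {
      try {
         return Name(i);
      }
      catch (InvalidOperationException) {
         // Reached when no arm of the switch expression matches.
         return "unhandled";
      }
   }
}
```

In real code a final discard arm is usually preferable to catching the exception, since it makes the fallback explicit and silences the warning.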

The switch expression fits particularly well with expression-bodied members:

public static string WhatShape(this Shape shape) => shape switch {
   Circle r    => $"This is a circle of radius {r.Radius}",
   Rectangle _ =>  "This is a rectangle",
   _           =>  "shape is null or not a known type of shape"
};

C#8 also introduced extremely useful property patterns.

static void PropertyPattern(Shape shape) {
   if(shape is Circle { Radius: 1.0d })
      Console.WriteLine($"shape is a circle of radius 1");
   string whatShape = shape switch {
      Circle { Radius: 1.0d }                   =>  "shape is a circle of radius 1",
      Rectangle r when r.Width == r.Height      =>  "shape is a square",
      Rectangle { Width: 10d, Height: 5d}       =>  "shape is a 10 x 5 rectangle",
      Rectangle { Width: var x, Height: var y } => $"shape is a {x} x {y} rectangle",
      { }                                       =>  "shape is not null",
      _                                         =>  "null"
   };
}

Notice that:

  • The var pattern can be used to introduce variables that capture property values.
  • A property pattern requires the object reference to be not null. Hence the empty property pattern { } is a test for non-null.
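
The empty property pattern can be sketched in isolation as follows. The class and method names are illustrative and not taken from the article.

```csharp
public static class NonNullDemo {
   // '{ } s' matches any non-null string and binds it to s.
   public static int LengthOrZero(string str) {
      if (str is { } s)
         return s.Length;
      return 0;
   }

   // Without a variable, '{ }' is a plain non-null test.
   public static bool HasValue(object o) => o is { };
}
```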

If Rectangle has a Deconstruct method, the expression Rectangle { Width: var x, Height: var y } can be simplified to Rectangle(var x, var y). This is the new C# 8 positional pattern.

public class Rectangle : Shape {
   public double Height { get; set; }
   public double Width { get; set; }
   public override double Area => Height * Width;
   public void Deconstruct(out double width, out double height) {
      width = Width;
      height = Height;
   }
}
public static string WhatShapeWithPositionalPattern(this Shape shape) => shape switch {
   Circle r                 => $"This is a circle of radius {r.Radius}",
   Rectangle(var x, var y)  => $"shape is a {x} x {y} rectangle",
   _                        =>  "shape is null or not a known type of shape"
};

Finally C#8 introduced the tuple pattern to test multiple values at the same time:

public static string ColorsOfWithTuplePattern(Color c1, Color c2, Color c3) =>
   (c1, c2, c3) switch {
      (Color.Blue, Color.White, Color.Red)    => "France",
      (Color.Green, Color.White, Color.Red)   => "Italy",
      (Color.Black, Color.Red, Color.Yellow)  => "Germany",
      _ => throw new ArgumentException("Unknown flag")
   };

C# 9: Combinator, Parenthesized and Relational Patterns

C#9 introduced combinator patterns: conjunctive and, disjunctive or and negated not patterns. This is especially useful to avoid repeating a variable in a complex boolean expression.

public static bool IsLetterOrSeparator(this char c) =>
   c is (>= 'a' and <= 'z') or (>= 'A' and <= 'Z') or '.' or ',';

The code sample above also illustrates the parenthesized pattern. Here the parentheses can be removed, but they make the logic clearer.

Notice that the new negated not pattern provides a new syntax for the null check: if(obj is not null) { ... }
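
The not pattern also combines with type patterns and with the other combinators. The following sketch, with illustrative names, assumes a C# 9 or later compiler.

```csharp
public static class NotPatternDemo {
   public static string Describe(object o) {
      // 'not' binds tighter than 'and': (not null) and (not string)
      if (o is not null and not string)
         return "a non-null object that is not a string";
      // At this point o is either null or a string.
      if (o is not string s)
         return "null";
      // s is definitely assigned here because the previous if returned.
      return $"a string of length {s.Length}";
   }
}
```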

C#9 also introduced relational patterns < > <= >=.

public static double GetTaxRateWithRelationalPattern(double monthlyIncome)
   => monthlyIncome switch {
      >= 0 and < 1000 =>  0,
      < 5000          => 10,
      _               => 20
   };

The relational pattern fits especially well with the property pattern, as shown by the code sample below. Another nice C# 9 addition illustrated below is that the underscore symbol can be omitted in a type pattern for a lighter syntax:

public static string WhatShapeWithRelationalPattern(this Shape shape)
   => shape switch {
   Circle { Radius: > 1 and <= 10} => "shape is a well sized circle",
   Circle { Radius: > 10 }         => "shape is a too large circle",
   // Before C#9 underscore was required: Rectangle _ =>
   Rectangle                       => "shape is a rectangle",
   // Here underscore is still required
   _                               => "shape doesn't match any pattern"
};

C# 10: Extended Property Pattern

C# 10 introduced the extended property pattern, which is useful to nest property accesses, as illustrated by the code sample below:

public static IEnumerable<Person> PersonsWithShortNameCSharp9(this IEnumerable<object> seq)
   => seq
      .Where(x => x is Person { FirstName: { Length: <= 5 } })
      .Cast<Person>();
public static IEnumerable<Person> PersonsWithShortNameCSharp10(this IEnumerable<object> seq)
   => seq
      .Where(x => x is Person { FirstName.Length: <= 5 })
      .Cast<Person>();

C# 11: List and Slice Pattern [under development]

C# 11, to be released in November 2022, might introduce new patterns like:

  • array is [1, 2, 3], which will match an integer array of length three with 1, 2, 3 as its elements, respectively.
  • [_, >0, ..] or [.., <=0, _], which matches when length >= 2 && ([1] > 0 || length == 3 || [^2] <= 0), where the length value of 3 implies the other test.
  • [_, >0, ..] and [.., <=0, _], which matches when length >= 2 && [1] > 0 && length != 3 && [^2] <= 0, where the length value of 3 disallows the other test.

See the public discussion here.
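
List and slice patterns did ship with C# 11. Assuming a C# 11 or later compiler, the patterns listed above can be tried with a sketch like the following. The class name and the returned strings are illustrative only.

```csharp
public static class ListPatternDemo {
   public static string Describe(int[] values) => values switch {
      [1, 2, 3]      => "exactly 1, 2, 3",
      [_, > 0, ..]   => "at least two elements, the second one is positive",
      [.., <= 0, _]  => "at least two elements, the second to last one is not positive",
      []             => "empty",
      _              => "something else (or null)"
   };
}
```

The arms are tested in order, so an array such as { 0, -1, 7 } skips the second arm (its element at index 1 is not positive) and is caught by the third one.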

Pattern Matching is no magic

Pattern matching is no magic. At first glance one could think that a pattern expression is like an IQueryable LINQ expression: a language peculiarity that the compiler translates to a parameterized runtime object with special runtime processing. But it is not. Patterns are translated to traditional IL code. For example, let’s decompile these two methods:

public static bool IsLetterOrSeparator1(this char c) =>
   c is (>= 'a' and <= 'z') or (>= 'A' and <= 'Z') or '.' or ',';
public static bool IsLetterOrSeparator2(this char c) =>
   (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
   c == '.' || c == ',';

The IL code of both methods is shown as a picture in the original article (see the source link).

Don’t overuse Pattern Matching

C# pattern matching is to complex if / else statements what C# LINQ is to for / foreach loops: a nice and modern C# improvement to write more concise and readable code. The parallel with C# LINQ can go a little further: the decompiled IL code above shows that pattern matching can emit more verbose code than regular if / else statements for similar behavior. Could pattern matching code be slower than the counterpart if / else code? Some evidence can be found on the web, but no real thorough study. However the same recommendation applies: use neither LINQ nor pattern matching in performance-critical paths executed millions or billions of times at runtime.

One of the most prominent usages of pattern matching (at least in documentation) is the type pattern. Concretely, the code should behave differently depending on whether a shape is a circle or a rectangle. But one must keep in mind that polymorphism is here for that. It would be terrible design to use pattern matching to compute a shape area instead of providing an abstract Area property within the base class Shape:

public static double Area(this Shape shape) => shape switch {
   Circle c     => c.Radius * c.Radius * Math.PI,
   Rectangle r  => r.Width * r.Height
};

This would be a maintenance nightmare since each new kind of shape would need its own area formula added to this method (a clear violation of the Open/Closed Principle). Also we can expect poor performance, since the virtual method table used at runtime to handle polymorphism is certainly faster than type compatibility checking.
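
For comparison, the polymorphic design mentioned above can be sketched as follows. This is a simplified version of the Shape hierarchy used throughout the article; the Deconstruct method is left out for brevity.

```csharp
using System;

public abstract class Shape {
   // Each concrete shape provides its own formula.
   public abstract double Area { get; }
}

public sealed class Circle : Shape {
   public double Radius { get; set; }
   public override double Area => Radius * Radius * Math.PI;
}

public sealed class Rectangle : Shape {
   public double Width { get; set; }
   public double Height { get; set; }
   public override double Area => Width * Height;
}
```

Adding a new kind of shape then only requires a new class that overrides Area. No existing method has to be revisited.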

Conclusion

Again, C# pattern matching is to complex if / else statements what C# LINQ is to for / foreach loops: a nice and modern C# improvement to write more concise and readable code.

However pattern matching should be avoided in performance-critical situations where usual checks can perform better at runtime. At least do benchmark such usage.

The type pattern is not a replacement for polymorphism and should be kept only for peculiar situations.

Finally, as underlined in the C# 7 section, C# pattern matching suffers from the limitation that only constant expressions can be used to match against. Maybe the C# team will relax this in the future, but they need to find a good experience around exhaustiveness, particularly in switch expressions. With a non-compile-time-constant pattern, how could the compiler determine whether all cases are handled? Another concern is matching against values that trigger side effects, or even worse non-constant values, which would lead to undefined behavior.

My dad being an early programmer in the 70's, I was fortunate to switch from playing with Lego to programming my own micro-games when I was still a kid. Since then I have never stopped programming.

I graduated in Mathematics and Software engineering. After a decade of C++ programming and consultancy, I got interested in the brand new .NET platform in 2002. I had the chance to write the best-seller book (in French) on .NET and C#, published by O'Reilly, and also managed some academic and professional courses on the platform and C#.

Over my consulting years I built an expertise about the architecture, the evolution and the maintenance challenges of large & complex real-world applications. It seemed like the spaghetti & entangled monolithic legacy concerned every sufficiently large team. As a consequence, I got interested in static code analysis and started the project NDepend.

Today, with more than 12.000 client companies, including many of the Fortune 500 ones, NDepend offers deeper insight and full control on their application to a wide range of professional users around the world.

I live with my wife and our twin kids Léna and Paul in the beautiful island of Mauritius in the Indian Ocean.

