
Stupid RCU Tricks: Waiting for Grace Periods From NMI Handlers

 2 years ago
source link: https://paulmck.livejournal.com/66175.html

UNDER CONSTRUCTION

This blog post discusses a few alternative Rust-language memory models. I hope that this discussion is of value to the Rust community, but in the end, it is their language, so it is also their choice of memory model.

This discussion takes the Rust fearless-concurrency viewpoint, tempered by discussions I have observed and participated in while creating this blog series. Different members of that community of course have different viewpoints, and thus might reasonably advocate for different choices. Those who know me will understand that these viewpoints differ significantly from my own. However, my viewpoint is dictated by my long-standing privilege of living at the edge of what is possible in terms of performance, scalability, real-time response, energy efficiency, robustness, and much else besides. Where I live, significant levels of fear are not just wise, but also necessary for survival. To borrow an old saying from aviation, there are old pilots, and there are bold pilots, but there are no old bold pilots.

Nevertheless, I expect that my more than three decades of experience with concurrency, my work on the C/C++ memory model (memory_order_consume notwithstanding), and my role as lead maintainer of the Linux-kernel memory model (LKMM) will help provide a good starting point for the more work-a-day situation that I believe that the Rust community wishes to support.

But Doesn't Rust Already Have a Memory Model?

Kinda sorta?

Some in academia have assumed that Rust's memory model will be based on that of C/C++, for example, see here. And the Rustnomicon agrees, though the author of that text does not appear to be very happy about it:

Rust pretty blatantly just inherits the memory model for atomics from C++20. This is not due to this model being particularly excellent or easy to understand. Indeed, this model is quite complex and known to have several flaws. Rather, it is a pragmatic concession to the fact that everyone is pretty bad at modeling atomics. At very least, we can benefit from existing tooling and research around the C/C++ memory model. (You'll often see this model referred to as "C/C++11" or just "C11". C just copies the C++ memory model; and C++11 was the first version of the model but it has received some bugfixes since then.)

Trying to fully explain the model in this book is fairly hopeless. It's defined in terms of madness-inducing causality graphs that require a full book to properly understand in a practical way. If you want all the nitty-gritty details, you should check out the C++ specification. Still, we'll try to cover the basics and some of the problems Rust developers face.

The C++ memory model is fundamentally about trying to bridge the gap between the semantics we want, the optimizations compilers want, and the inconsistent chaos our hardware wants. We would like to just write programs and have them do exactly what we said but, you know, fast. Wouldn't that be great?

I would argue that compiler optimizations are the source of much more inconsistent chaos than hardware would ever dream of providing, but that argument has been going on for many years and I doubt that we will be able to settle it here.

Leaving that argument aside, entertaining though it might be, the wording of this passage of the Rustnomicon certainly invites alternatives to the C/C++ memory model. Let's see what we can do.

Where to Start?

Let's first eliminate a number of candidate memory models. Rust's presumed portability goals rule out any of the hardware memory models. Rust's growing connection to the deep embedded world rules out the memory models of dynamic languages such as Java and JavaScript. The growing number of memory models derived from the C/C++ memory model is represented by the actual C/C++ memory model, so we will not consider those variants separately.

This leaves the aforementioned C/C++ memory model and of course LKMM, the latter inspired by the Rust community's ambitions within the Linux kernel. Because this blog series focuses on the Linux kernel, this post will start with LKMM, move to the C/C++ memory model, and end with a specific recommendation.

Linux-Kernel Memory Model (LKMM)

This section will consider aspects of LKMM, starting with the most fear-inducing aspects and moving to the more work-a-day aspects.

Control dependencies are not the most fearless portion of LKMM. In fact, in my role as lead maintainer of LKMM, I need people using control dependencies not to merely feel fearful, but instead to feel downright terrified. Control dependencies are fragile and easy for control-dependency-oblivious compiler optimizations to destroy. Therefore, for the time being, control dependencies should be excluded from any Rust memory model. Perhaps the day will come when we have a good way to communicate the intent of control dependencies to compilers, but today is not that day.
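For readers who have not met the term, here is a minimal Rust sketch of what a control dependency looks like, alongside a sturdier alternative. The function names are hypothetical, and using `Ordering::Relaxed` as a stand-in for the kernel's READ_ONCE() and WRITE_ONCE() is an approximation made only for illustration.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};

/// Fragile pattern: the store to `data` is meant to be ordered after the
/// load of `ready` only by the conditional branch between them. The
/// language-level memory model does not promise that ordering, and a
/// dependency-oblivious compiler is free to transform the branch.
pub fn store_if_ready_control_dep(ready: &AtomicBool, data: &AtomicU32) -> bool {
    if ready.load(Ordering::Relaxed) {
        data.store(1, Ordering::Relaxed);
        true
    } else {
        false
    }
}

/// Sturdier alternative: the acquire load states the ordering explicitly,
/// so it no longer rests on the branch surviving optimization.
pub fn store_if_ready_acquire(ready: &AtomicBool, data: &AtomicU32) -> bool {
    if ready.load(Ordering::Acquire) {
        data.store(1, Ordering::Relaxed);
        true
    } else {
        false
    }
}
```

Both functions behave identically when run on a single thread. The difference lies entirely in what ordering a concurrent observer is entitled to rely on.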

Address and data dependencies carried by pointers are not free of risk, but they do provide dependency-oblivious compilers significantly less scope for destructive optimizations. They should thus require much less fear than do control dependencies, little though that might be saying. Nevertheless, address and data dependencies are mainly used in conjunction with RCU, and we do not yet have a known good way of expressing all RCU use cases within the confines of Rust's ownership model. Therefore, address and data dependencies should also be excluded from Rust's memory model until such time as significant RCU use cases are handled or some other clear and present use case militates for address and data dependencies. It would be increasingly reasonable to permit address and data dependencies within Rust unsafe mode should use of RCU within Rust increase, keeping in mind that things like epoch-based reclamation (EBR) are particular classes of implementations of RCU.
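As a rough illustration of the pointer-publication pattern in which address dependencies arise, here is a sketch using Rust's standard `AtomicPtr`. Rust's `std::sync::atomic::Ordering` has no consume variant, so the reader side uses `Ordering::Acquire` instead of relying on the dependency. The `Config` type and the function names are hypothetical, and the sketch sidesteps deferred reclamation, which is the hard part that RCU would handle.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Hypothetical payload type, for illustration only.
pub struct Config {
    pub threshold: u32,
}

/// Publish a freshly initialized Config. The release store orders the
/// initialization before the pointer becomes visible to readers.
/// Assumption: called at most once per slot, so nothing is overwritten.
pub fn publish(slot: &AtomicPtr<Config>, threshold: u32) {
    let boxed = Box::new(Config { threshold });
    slot.store(Box::into_raw(boxed), Ordering::Release);
}

/// Read through the published pointer. An RCU reader would rely on the
/// address dependency from the pointer load to the dereference; this
/// sketch uses an acquire load instead.
pub fn read_threshold(slot: &AtomicPtr<Config>) -> Option<u32> {
    let p = slot.load(Ordering::Acquire);
    if p.is_null() {
        None
    } else {
        // SAFETY: this sketch assumes nothing frees the Config while
        // readers may still be running.
        Some(unsafe { (*p).threshold })
    }
}

/// Unpublish and free. Assumption: the caller knows that no reader can
/// still hold the old pointer, which is what a grace period would ensure.
pub fn retire(slot: &AtomicPtr<Config>) {
    let p = slot.swap(ptr::null_mut(), Ordering::AcqRel);
    if !p.is_null() {
        // SAFETY: `p` came from Box::into_raw() in publish().
        unsafe { drop(Box::from_raw(p)) };
    }
}
```

The two `unsafe` blocks mark exactly the places where the ownership model cannot see who else might hold the pointer, which is the gap described above.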

At first glance, it seems entirely reasonable to countenance use of READ_ONCE() and WRITE_ONCE() within prospective Linux-kernel Rust code, but this post is discussing Rust in general, not just Rust in the Linux kernel. And the fact is that address, data, and control dependencies are the only things that prevent the Linux kernel's use of READ_ONCE() and WRITE_ONCE() from resulting in out-of-thin-air (OOTA) behavior. Except that these operations (along with the Linux kernel's unordered atomic read-modify-write (RMW) operations) are implemented so as to prevent the compiler from undertaking code-motion optimizations that might otherwise reorder such operations. Furthermore, all of the underlying hardware memory models of which I am aware preserve dependency orderings. Therefore, one might expect that these unordered operations might reasonably be part of a Rust memory model.

Unfortunately, one important benefit of a memory model is the tooling that analyzes concurrent code fragments, and if this tooling is to exclude OOTA behaviors, it is absolutely necessary for that tooling to understand dependencies. Except that we have already excluded such dependencies from the Rust memory model.

Therefore, the Rust memory model should restrict its support for Linux-kernel atomic operations to those that provide ordering. These would be the value-returning non-relaxed read-modify-write (RMW) atomic operations along with the _acquire() and _release() variants of non-value-returning RMW atomic operations. It might also make sense to allow combinations of unordered RMW operations with combination memory barriers, for example, atomic_inc() followed by smp_mb__after_atomic(), but it would make even more sense to wrap these in combination as a single Rust-accessible primitive. This combined Rust primitive would no longer be unordered, and could thus be included as an ordered unit in the Rust memory model. Alternatively, unordered atomics might be relegated to Rust's unsafe mode.
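To make the idea of a combined primitive concrete, here is a minimal userspace sketch built on Rust's standard atomics rather than on the kernel's own primitives. The name `inc_then_full_fence` is hypothetical, and treating `fence(Ordering::SeqCst)` as a stand-in for smp_mb__after_atomic() is a loose analogy, not a claim that the two are equivalent.

```rust
use std::sync::atomic::{fence, AtomicUsize, Ordering};

/// Hypothetical combined primitive: an unordered increment followed
/// immediately by a full fence, loosely analogous to atomic_inc()
/// followed by smp_mb__after_atomic(). Callers see only the ordered
/// unit and never the unordered increment on its own.
pub fn inc_then_full_fence(counter: &AtomicUsize) {
    counter.fetch_add(1, Ordering::Relaxed);
    fence(Ordering::SeqCst);
}
```

Because the unordered operation never escapes the wrapper, analysis tooling would need to reason only about the wrapper's ordered behavior.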

It is hard to imagine a useful Rust memory model that excludes locking.
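For reference, this is roughly what lock-based sharing looks like in today's safe Rust, using the standard library's `Mutex`. The function name `locked_count` is made up for this sketch.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Increment a shared counter from several threads, with every access
/// protected by a lock. No atomic orderings appear in the caller's code.
pub fn locked_count(threads: usize, per_thread: usize) -> usize {
    let total = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    *total.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let result = *total.lock().unwrap();
    result
}
```

Calling `locked_count(4, 1000)` returns 4000, because the lock serializes every increment.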

Thus, starting from LKMM, we arrive at a model that supports ordered atomic operations and locking, possibly including unordered atomics in unsafe mode.

C/C++ Memory Model

This section will instead start from the C/C++ memory model.

Because memory_order_relaxed accesses can yield OOTA results, they seem inconsistent with Rust's fearless-concurrency goals. These cannot actually occur in practice with any implementation that I am aware of, but fearless concurrency requires accurate tools, and this accuracy requires excluding even the theoretical possibility of OOTA results. Another reasonable approach would permit use of memory_order_relaxed only within Rust unsafe code.
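The usual illustration of the OOTA problem is the load-buffering litmus test, sketched here in Rust with `Ordering::Relaxed`, which corresponds to memory_order_relaxed. The function name is hypothetical. Each thread copies the value it reads from one variable into the other, so a result such as 42 could only come "out of thin air".

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

/// Run the load-buffering litmus test once and return (r1, r2).
/// The formal C/C++ model has difficulty ruling out r1 == r2 == 42,
/// even though, as the post notes, no known implementation produces it.
pub fn load_buffering_once() -> (u32, u32) {
    let x = AtomicU32::new(0);
    let y = AtomicU32::new(0);
    thread::scope(|s| {
        let t1 = s.spawn(|| {
            let r1 = x.load(Ordering::Relaxed);
            y.store(r1, Ordering::Relaxed);
            r1
        });
        let t2 = s.spawn(|| {
            let r2 = y.load(Ordering::Relaxed);
            x.store(r2, Ordering::Relaxed);
            r2
        });
        (t1.join().unwrap(), t2.join().unwrap())
    })
}
```

Running this in a loop is not a proof of anything. The point is that a tool meant to support fearless concurrency would have to prove the absence of the thin-air outcome rather than merely fail to observe it.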

The memory_order_consume is primarily useful in conjunction with RCU. In addition, in all implementations that I am aware of, memory_order_consume is simply promoted to memory_order_acquire. It therefore seems unnecessary to include memory_order_consume within a Rust memory model. As before, a reasonable alternative would permit its use only within Rust unsafe code.

In contrast, memory_order_acquire, memory_order_release, and memory_order_acq_rel are all easily analyzed and have been heavily used in practice for decades. Although the celebrated memory_order_seq_cst (sequentially consistent) memory ordering can be difficult to analyze, its strong ordering is absolutely required by some use cases, including a sadly large fraction of concurrent algorithms published by academics and industrial researchers. In addition, there is a decades-long tradition of proofs and tooling handling sequential consistency, difficulties aside. Use of all four of these orderings should therefore be permitted from safe Rust code.
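Two classic litmus tests show what these orderings buy, sketched here with Rust's `Ordering::Acquire`, `Ordering::Release`, and `Ordering::SeqCst`. The function names are hypothetical. In the first, the payload accesses are themselves marked release and acquire, which is stronger than strictly needed, purely so that the sketch stays within the four orderings discussed above.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::thread;

/// Message passing: once the consumer's acquire load sees `ready == true`,
/// it is guaranteed to see the payload stored before the release store.
pub fn message_passing() -> u32 {
    let payload = AtomicU32::new(0);
    let ready = AtomicBool::new(false);
    thread::scope(|s| {
        s.spawn(|| {
            payload.store(42, Ordering::Release);
            ready.store(true, Ordering::Release);
        });
        let consumer = s.spawn(|| {
            while !ready.load(Ordering::Acquire) {
                thread::yield_now();
            }
            payload.load(Ordering::Acquire)
        });
        consumer.join().unwrap()
    })
}

/// Store buffering: with sequentially consistent accesses, at least one
/// thread must observe the other's store, so (0, 0) cannot be returned.
/// Acquire and release alone would not forbid that outcome.
pub fn store_buffering_seq_cst() -> (u32, u32) {
    let x = AtomicU32::new(0);
    let y = AtomicU32::new(0);
    thread::scope(|s| {
        let t1 = s.spawn(|| {
            x.store(1, Ordering::SeqCst);
            y.load(Ordering::SeqCst)
        });
        let t2 = s.spawn(|| {
            y.store(1, Ordering::SeqCst);
            x.load(Ordering::SeqCst)
        });
        (t1.join().unwrap(), t2.join().unwrap())
    })
}
```

The second test is the shape underlying Dekker-style mutual exclusion, which is one reason sequential consistency turns up so often in published algorithms.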

With the C/C++ memory model as it was with LKMM, it is hard to imagine a useful Rust memory model that excludes locking.

Thus, starting from the C/C++ memory model, we also arrive at a model that supports ordered atomic operations and locking, possibly along with unordered atomic operations in unsafe mode.

Recommendation

The fact that starting with LKMM got us to roughly the same place as did starting with the C/C++ memory model should give us considerable confidence in that destination. Why the “roughly”? Because there are subtle differences, as can be seen by comparing the C and C++ standards to LKMM.

One reaction to this situation would be to go through these memory models one corner case at a time, and for each corner case making a choice for Rust, now and forever. Perhaps a better approach is to pick one and adapt as needed based on real situations as they arise. At this time, there are far more projects living within the C/C++ memory model than within LKMM. Therefore, despite my role as lead maintainer of LKMM, it is my painful duty to recommend the following, based on the C/C++ memory model:
  1. Locking may be used from both safe and unsafe modes.
  2. Atomic operations using memory_order_acquire, memory_order_release, memory_order_acq_rel, and memory_order_seq_cst may be used from both safe and unsafe modes.
  3. Atomic operations using memory_order_relaxed and memory_order_consume may be used only from unsafe mode.
  4. All atomic operations in Rust code should be marked, that is, Rust should avoid following the C/C++ practice of interpreting an unmarked mention of an atomic variable as a memory_order_seq_cst access to that variable. Requiring marking allows accesses to concurrently accessed shared variables to be identified at a glance, and also makes the choice of any default memory_order value much less pressing.
This provides explicit support for the operations and orderings that have a long history of heavy use and for which analysis tools have long been available. It also provides provisional support of operations and orderings that, while also having a long history of heavy use, are beyond the current state of the art for accurate and complete analysis.
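One way to picture the recommendation is as a thin wrapper over Rust's existing atomics. The `MarkedUsize` type and its method names are hypothetical and are not part of any real library. Rust's standard atomic methods already take an explicit `Ordering` argument, so item 4 matches current practice. The `unsafe` on the relaxed methods is purely the flag proposed above, since these operations are not memory-unsafe in today's Rust. Rust has no consume ordering, so only relaxed appears here.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper: ordered operations are safe, relaxed operations
/// are reachable only through `unsafe`, and every access is marked.
pub struct MarkedUsize(AtomicUsize);

impl MarkedUsize {
    pub const fn new(value: usize) -> Self {
        Self(AtomicUsize::new(value))
    }

    // Item 2: ordered operations, usable from safe code.
    pub fn load_acquire(&self) -> usize {
        self.0.load(Ordering::Acquire)
    }
    pub fn store_release(&self, value: usize) {
        self.0.store(value, Ordering::Release)
    }
    pub fn fetch_add_acq_rel(&self, value: usize) -> usize {
        self.0.fetch_add(value, Ordering::AcqRel)
    }
    pub fn load_seq_cst(&self) -> usize {
        self.0.load(Ordering::SeqCst)
    }

    // Item 3: unordered operations, flagged for extra scrutiny.

    /// # Safety
    /// The caller takes responsibility for reasoning about the lack of
    /// ordering. This is a policy marker, not a memory-safety requirement.
    pub unsafe fn load_relaxed(&self) -> usize {
        self.0.load(Ordering::Relaxed)
    }

    /// # Safety
    /// Same policy marker as `load_relaxed`.
    pub unsafe fn fetch_add_relaxed(&self, value: usize) -> usize {
        self.0.fetch_add(value, Ordering::Relaxed)
    }
}
```

A caller would write `c.load_acquire()` freely but would need `unsafe { c.load_relaxed() }`, which puts each relaxed access where a reviewer will notice it.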

Other Options

One could argue that memory_order_relaxed also has many simple use cases, with distributed counters being but one example. However, given the difficulty of verifying its full range of use cases, unsafe mode seems the safest bet at the moment. Should any of the current efforts to more straightforwardly verify memory_order_relaxed accesses bear fruit, then perhaps memory_order_relaxed might be permitted in Rust safe mode.
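A distributed counter of the kind mentioned above might be sketched as follows. The function name is hypothetical. In today's Rust, `Ordering::Relaxed` is usable from safe code, so the sketch compiles without `unsafe`. Under the recommendation in this post, the relaxed accesses would instead sit inside unsafe code. A production version would typically also pad each counter to its own cache line, which is omitted here.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Each thread increments its own counter with relaxed ordering, and the
/// per-thread counters are summed only after every thread has finished.
pub fn distributed_count(threads: usize, per_thread: usize) -> usize {
    let counters: Vec<AtomicUsize> =
        (0..threads).map(|_| AtomicUsize::new(0)).collect();
    thread::scope(|s| {
        for counter in &counters {
            s.spawn(move || {
                for _ in 0..per_thread {
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            });
        }
    });
    // The scope has joined every thread, so all increments are visible
    // here even though the loads themselves are relaxed.
    counters.iter().map(|c| c.load(Ordering::Relaxed)).sum()
}
```

The relaxed ordering suffices because the only cross-thread synchronization the sum depends on is the join at the end of the scope.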

Finally, one could argue that memory_order_consume should be excluded entirely, rather than simply being relegated to unsafe mode. However, some Rust libraries already use RCU internally, and the ability to flag RCU pointer traversals could prove helpful should Rust someday fully support address and data dependencies.

In contrast, as noted earlier, the other four memory_order enum members are heavily used and routinely analyzed. It is therefore reasonable to permit use of memory_order_acquire, memory_order_release, memory_order_acq_rel, and memory_order_seq_cst within Rust safe code.

What Does This Mean for the Linux Kernel?

As it turns out, not much.

Please keep in mind that the Linux kernel currently interoperates with the full range of memory models used by arbitrary userspace code written in arbitrary languages when running on a wide range of hardware memory models. Any adjustments required are currently handled by architecture-specific code, for example, in the system-call layer or in exception entry/exit code. Strong ordering is also provided deeper within the Linux kernel, with but one example being the context-switch code, which must provide full ordering when processes migrate from one CPU to another.

It turns out that Rust code in Linux has wrappers around any C code that it might invoke, and it is through these wrappers that Rust code will make use of the non-Rust portions of the Linux kernel. These wrappers will therefore contain calls to whichever memory-ordering primitives might be required at any particular point in Rust's memory-model evolution.

But Does Rust Really Need to Define Its Memory Model Right Now?

This is of course not my decision.

But it is only fair to point out that the longer the Rust community waits, the more the current “kinda sorta” choice of the full C/C++ memory model will become the actual long-term choice. Including those portions of the C/C++ memory model that have proven troublesome, and that are therefore likely to be subject to change.

My recommendation is therefore to adopt the untroublesome portions of the C/C++ memory model in safe mode, and the rest in unsafe mode. This approach allows people to write work-a-day concurrent algorithms in Rust and to be confident that the resulting code will still work in the future. It also allows those needing to live more dangerously to confine their code to unsafe blocks so as to attract an appropriate level of skepticism and attention.

But again, this decision rests not with me, but with the Rust community.

History

November 4, 2021: Unmarked Rust-language accesses are never atomic operations plus some wordsmithing.
