Async Interview #8: Stjepan Glavina
source link: http://smallcultfollowing.com/babysteps/blog/2020/07/09/async-interview-8-stjepan-glavina/
Jul 9, 2020
Several months ago, on May 1st, I spoke to Stjepan Glavina about his (at the time) new crate, smol. Stjepan is, or ought to be, a pretty well-known figure in the Rust universe. He is one of the primary authors of the various crossbeam crates, which provide core parallel building blocks that are both efficient and very ergonomic to use. He was one of the initial designers for the async-std runtime. And so when I read stjepang’s blog post describing a new async runtime smol that he was toying with, I knew I wanted to learn more about it. After all, what could make stjepang say:
It feels like this is finally it - it’s the big leap I was longing for the whole time! As a writer of async runtimes, I’m excited because this runtime may finally make my job obsolete and allow me to move onto whatever comes next.
If you’d like to find out, then read on!
Video
You can watch the video on YouTube. I’ve also embedded a copy here for your convenience:
What is smol?
smol is an async runtime, similar to tokio or async-std, but with a distinctly different philosophy. It aims to be much simpler and smaller. Whereas async-std offers a kind of “mirror” of the libstd API surface, but made asynchronous, smol tries to get asynchrony by wrapping and adapting synchronous components. There are two main ways to do this:
- One option is to delegate to a thread pool. As we'll see, stjepang argues that this option can be much more efficient than people realize, and that it makes sense for things like accesses to the local file system. smol offers the `blocking!` macro as well as adapters like the `reader` function, which converts `impl Read` values into `impl AsyncRead` values.
- The other option is to use the `Async<T>` wrapper to convert blocking I/O sockets into non-blocking ones. This works for any I/O type `T` that is compatible with `epoll` (or its equivalent; on Mac, smol uses `kqueue`, and on Windows, smol uses `wepoll`).
Delegation to a thread pool
One of the debates that has been going back and forth when it comes to asynchronous coding is how to accommodate things that need to block. Async I/O is traditionally based on a "cooperative" paradigm, which means that if your thread is going to do blocking I/O – or perhaps even just execute a really long loop – you ought to use an explicit operation like `spawn_blocking` that tells the scheduler what's going on.
Earlier, in the context of async-std, stjepang introduced a new async-std scheduler, inspired by Go. This scheduler would automatically determine when tasks were taking too long and try to spin up more threads to compensate. This was simpler to use, but it also had some downsides: it could be too pessimistic at times, creating spikes in the number of threads.
Therefore, in smol, stjepang returned to the approach of explicitly labeling your blocking sections, this time via the `blocking!` macro. This macro will move the "blocking code" from the cooperative thread pool to one where the OS manages the scheduling.
Explicit blocking is often just fine
In fact, you might say that the core argument of smol is that some mechanism like `blocking!` is often "good enough". Rather than reproducing or cloning the libstd API surface to make it asynchronous, it is often just fine to use the existing API but with a `blocking!` adapter wrapped around it.
In particular, when interacting with the file system or with stdin/stdout, smol's approach is based on blocking. It offers `reader` and `writer` adapters that move that processing to another thread.
The Async wrapper
But of course if you were spawning threads for all of your I/O, this would defeat the purpose of using an async runtime in the first place. Therefore, smol offers another approach, the `Async<T>` wrapper.

The idea of `Async<T>` is that you can take a blocking abstraction, like the `TcpStream` found in the standard library, and convert it to be asynchronous by creating an `Async<TcpStream>`. This works for any type that supports the `AsRawFd` trait, which gives access to the underlying file descriptor. We'll explain that in a bit.
So what can you do with an `Async<TcpStream>`? The core operations that `Async<T>` offers are the async functions `read_with` and `write_with`. They allow you to wrap blocking operations and have them run asynchronously. For example, given a `socket` of type `Async<UdpSocket>`, you might write the following to send data asynchronously:
```rust
let len = socket.write_with(|s| s.send(msg)).await?;
```
How the wrappers work: epoll
So how do these wrappers work under the hood? The idea is quite simple, and it's connected to how epoll works. The idea with a traditional Unix non-blocking socket is that it offers the same interface as a blocking one: i.e., you still invoke functions like `send`. However, if the kernel would have had to block, and the socket is in non-blocking mode, then it simply returns an error code instead. Now the user's code knows that the operation wasn't completed and it can try again later (in Rust, this is `io::ErrorKind::WouldBlock`). But how does it know when to try again? The answer is that it can invoke `epoll` to find out when the socket is ready to accept data.
The `read_with` and `write_with` methods build on this idea. Basically, they execute your underlying operation just like normal. But if that operation returns `WouldBlock`, then the function will register the underlying file descriptor (which was obtained via `AsRawFd`) with smol's runtime and yield the current task. smol's reactor will invoke epoll and, when epoll indicates that the file descriptor is ready, it will start up your task, which will run your closure again. Hopefully this time it succeeds.
If this seems familiar, it should. `Async<T>` is basically the same as the core `Future` interface, but "specialized" to the case of pollable file descriptors that return `WouldBlock` instead of `Poll::Pending`. And of course the core `Future` interface was very much built with interfaces like `epoll` in mind.
Ergonomic wrappers
The `read_with` and `write_with` wrappers are very general but not the most convenient to use. Therefore, smol offers some "convenience impls" that basically wrap existing methods for you. So, for example, given my `socket: Async<UdpSocket>`, earlier we saw that I can send data with `write_with`:

```rust
let len = socket.write_with(|s| s.send(msg)).await?;
```

but I can also invoke `socket.send` directly:

```rust
let len = socket.send(msg).await?;
```

Under the hood, this just delegates to a call to `write_with`.
Bridging the sync and async worlds
stjepang argues that basing the runtime around this idea of "bridging" the sync and async worlds not only makes for a smaller runtime, but also has the potential to help bridge the gap between the "sync" and "async" ecosystems. Basically, users today have to choose: do they base their work around the synchronous I/O interfaces, like `Read` and `Write`, or the asynchronous ones? The former are more mature and there are a lot of libraries available that build on them, but the latter seem to be the future.

smol presents another option. Rather than converting all libraries to async, you can just adapt the synchronous libraries into the async world, either through `Async<T>`, where that applies, or through the blocking adapters like `reader` or `writer`.
We walked through the example of the `inotify` crate. This is an existing library that wraps the inotify interface in the Linux kernel in idiomatic Rust. It is written in a synchronous style, however, and so you might think that if you are writing async code, you can't use it. However, its core type implements `AsRawFd`. That means that you can create an `Async<Inotify>` instance and invoke all its methods by using the `read_with` or `write_with` methods (or create ergonomic wrappers of your own).
Digging into the runtime
In the video, we spent a fair amount of time digging into the guts of how smol is implemented. For example, smol never starts threads on its own: instead, users start their own threads and invoke functions from smol that put those threads to work. We also looked at the details of its thread scheduler, and compared it to some of the recent work towards a new Rayon scheduler that is still pending. (Side note, there's a recorded deep dive on YouTube that digs into how the Rayon scheduler works, if that's your bag.) In any case, we kind of got into the weeds here, so I'll spare you the details. You can watch the video. =)
The importance of auditing and customizing
One interesting theme that we came to later is the importance of being able to audit unsafe code. stjepang mentioned that he has often heard people say that they would be happy to have a runtime that doesn’t achieve peak performance, if it makes use of less unsafe code.
In fact, I think one of the things that stjepang would really like to see is people taking smol and, rather than using it directly, adapting it to their own codebases. Basically using it as a starting point to build your own runtime for your own needs.
Towards a generic runtime interface?
It's not a short-term thing, but one of the things that I personally am very interested in is getting a better handle on what a "generic runtime interface" looks like. I'd love to see a future where async runtimes are like allocators: there is a default one that works "pretty well" that you can use a lot of the time, but it's also really easy to change that default and port your application over to more specialized allocators that work better for you.
I've often imagined this as a kind of trait that encapsulates the "core functions" a runtime would provide, kind of like the `GlobalAlloc` trait for allocators. But stjepang pointed out that smol suggests a different possibility, one where the std library offers a kind of "mini reactor". This reactor would offer functions to "register" sockets, associate them with wakers, and a function that periodically identifies things that can make progress and pushes them along. This wouldn't in and of itself be a runtime, but it would be a building block that other runtimes can use.
Anyway, as I said above, I don’t think we’re at the point where we know what a generic runtime interface should look like. I’m particularly a bit nervous about something that is overly tied to epoll, given all the interesting work going on around adapting io-uring (e.g., withoutboat’s Ringbahn) and so forth. But I think it’s an interesting thing to think about, and I definitely think smol stakes out an interesting point in this space.
Conclusion
My main takeaways from this conversation were:
- The "core code" you need for a runtime is really very little.
- Adapters like `Async<T>` and offloading work onto thread pools can be a helpful and practical way to unify the question of sync vs async.
- In particular, while I knew that `Future`s were conceptually quite close to epoll, I hadn't realized how far you could get with a generic adapter like `Async<T>`, which maps between the I/O `WouldBlock` error and the `Poll::Pending` future result.
- In thinking about the space of possible runtimes, we should be considering not only things like efficiency and ergonomics, but also the total amount of code and our ability to audit and understand it.
Comments?
There is a thread on the Rust users forum for this series.