![](/style/images/good.png)
![](/style/images/bad.png)
Github RFC: I/O Safety by sunfishcode · Pull Request #3128 · rust-lang/rfcs · Gi...
source link: https://github.com/rust-lang/rfcs/pull/3128
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Raw OS handles such as RawFd
and RawHandle
have hazards similar to raw pointers; they may be bogus or may dangle, leading to broken encapsulation boundaries and code whose behavior is impossible to bound in general.
Introduce a concept of I/O safety, and introduce a new set of types and traits, led by OwnedFd
and BorrowedFd
, to support it.
[Edited to reflect changes to the RFC].
Either way, creating an IO-safe random-access pointer to the mmap device file can mess up the memory safety of rust, right?
Either way, creating an IO-safe random-access pointer to the mmap device file can mess up the memory safety of rust, right?
A safe mmap
wrapper requires preventing writes to the file (if you hand out &T
s or &mut T
s) and truncations (which can lead to SIGBUS
s). This is difficult in general (and making it easier is out of scope here) but there are OS-specific ways to do it in some cases. Some of the ways work by making the file inaccessible to other processes, however they don't always prevent access from the same process. Any code in the process holding a bogus or dangling file descriptor can bypass encapsulation boundaries and operate on the file with the rights of the process. I/O safety prevents safe Rust from doing that.
Either way, creating an IO-safe random-access pointer to the mmap device file can mess up the memory safety of rust, right?
Ah, if you're talking about Linux's /proc/self/mem, yes, that can also break memory safety. Spawning a debugger process can too. These are independent concerns that don't cancel the usefulness of memory safety or I/O safety.
Copy link
the8472 commented 12 days ago
Any code in the process holding a bogus or dangling file descriptor can bypass encapsulation boundaries and operate on the file with the rights of the process. I/O safety prevents safe Rust from doing that.
File::open("/proc/self/fd/5")
File::open("/proc/self/fd/5")
I view that like /proc/self/mem
: we have to basically ignore its existence to be able to say anything reasonable about what Rust programs can and cannot do. Ideally we'd be able to ask the Linux kernel to just turn off these files for our process... that's a different discussion.
Copy link
the8472 commented 12 days ago
Ideally we'd be able to ask the Linux kernel to just turn off these files for our process... that's a different discussion.
That could be done by passing a flag to openat2
, at least for the fd magic symlinks, but not for mem
. But that would break legit uses of the fd
directory such as bash's process substitution feature.
It's true that one can reasonably say that anything to do with File and OpenOptions is arguably unsafe
because we can access things like the proc filesystem with it and thus edit a Rust program's runtime memory. But while marking those as unsafe
in response might be humorous, the joke would be short lived, and I am not entirely sure the laughs would last much longer if we asked the OS to block a bunch of file descriptors. Most OS allow a programmer to invoke ambient authority early and often, and accordingly so does Rust. But that does not mean we should give up on this concern about I/O safety.
I believe the correct focus for unsafe
semantics is on anything that projects into, and can be fully crystallized within, Rust logical control flow. File manipulation by necessity projects both ways: Rust extends requests to the OS, and the OS extends authority to Rust. With that, we can focus our concern on making sure the program does not violate Rust programmatic I/O safety.
That means it's not about whether the OS allows Rust to edit its own runtime memory, or even its binary code, but making sure that the authority to do so is obtained in a lawful manner obtained via File APIs. With an appropriate extension to the language, such as IoSafe
or some other means, we could make sure sure that a Rust program can, following the explicit request of the programmer, soundly invalidate each and every last one of Rust's safety guarantees (and itself as a Rust program!). That's much better than unsoundly invalidating all of Rust's safety guarantees, I think.
Several folks have expressed interest in seeing what a less minimalist change to std
might look like, both in comments above and in offline discussion with @joshtriplett.
I've now put together a prototype of such an API. The high-level summary is that it defines OwnedFd
and BorrowedFd
and related things. BorrowedFd
looks like it'll be nicer than &OwnedFd
because it can use repr(transparent)
and participate in FFI directly. This means users can avoid touching Raw*
values entirely.
The prototype repo, with an example that supports Unix and Windows, is here:
https://github.com/sunfishcode/io-experiment
Feedback welcome!
Yeah that looks great. :) Though I am a bit surprised by OptionFd
... for this to be useful in FFI, I think the "sentry value" needs to be documented. But I am also not sure if it is truly needed -- since rust-lang/rust#74699 landed, we could probably arrange for Option<OwnedFd>
to have the desired representation.
Option<OwnedFd>
instead of OptionFd
would be nice, if feasible. It looks like that would require special-casing it within rustc in places like this and this and possibly others. Does that sound like what you're thinking about?
Also, my prototype doesn't yet deal with this yet, but Windows has multiple representations for optional handles. I'm not a Windows expert, but if I understand correctly, we may still be able to have Option<Handle>
with NULL as the sentry value, along with separate types for file handles with INVALID_HANDLE_VALUE
as the sentry value.
Option instead of OptionFd would be nice, if feasible. It looks like that would require special-casing it within rustc in places like this and this and possibly others. Does that sound like what you're thinking about?
I think it would just require setting the right rustc_layout_scalar_valid_range_start
/rustc_layout_scalar_valid_range_end
like rust-lang/rust#74699 did. No rustc changes should be necessary, only library changes (and documenting that we guarantee the appropriate ABI).
EDIT: Oh, you are pointing to the FFI-compat lints. Yeah those would also need to be adjusted, indeed. (This is the codification of "documenting that we guarantee the appropriate ABI".)
Users could still do unsafe { File::from_raw_fd(5) }
right? How can using IoSafe
helps here? Isn't it the same?
@pickfire that's unsafe
.
Consider this: users can also do unsafe { &*(42 as *const i32) }
, and yet Rust has memory safety. The same works for I/O safety.
Copy link
FloLo100 commented 8 days ago
Users could still do
unsafe { File::from_raw_fd(5) }
right? How can usingIoSafe
helps here? Isn't it the same?
"Dereferencing" filedescriptors not obtained by functions implementing IoSafe
would be considered UB aka "Dont do that".
I've now removed the OptionFd
as @RalfJung suggested, and it works well using Option
.
To support this, I added niche optimizations to the Windows types, so that Option<OwnedHandle>
and Option<OwnedSocket>
have the right representation for FFI in the general case, and added an OptionFileHandle
to cover the INVALID_HANDLE_VALUE
case.
Seeing how well repr(transparent)
works out, and seeing how tricky IoSafe
is turning out to be, I'm becoming much more optimistic about this approach in general.
To support this, I added niche optimizations to the Windows types, so that
Option<OwnedHandle>
andOption<OwnedSocket>
have the right representation for FFI in the general case, and added anOptionFileHandle
to cover theINVALID_HANDLE_VALUE
case.
iirc Windows has two different invalid values it uses inconsistently for handles: NULL
and INVALID_HANDLE_VALUE
. I can't remember which one(s) are used for file, socket, and other handles we care about, this will require extra care to get right, at the minimum the types should be explicitly documented which values they have for niches since that could be a major foot-gun.
iirc Windows has two different invalid values it uses inconsistently for handles:
NULL
andINVALID_HANDLE_VALUE
. Icr which one(s) are used for file, socket, and other handles we care about, this will require extra care to get right, at the minimum the types should be explicitly documented which values they have for niches since that could be a major foot-gun.
That's right; the blog post I linked to above has more info. io-experiment now deals with this, and documents which values are valid in each type, such as here for OptionFileHandle
, the type where INVALID_HANDLE_VALUE
is disallowed.
If possible, I think putting the IoSafe
(or equivalent) traits in core
is a good idea, allowing no_std
crates to handle file descriptors or handles in an abstracted manner, or otherwise use IoSafe
without needing to depend on std
.
iirc Windows has two different invalid values it uses inconsistently for handles:
NULL
andINVALID_HANDLE_VALUE
. Icr which one(s) are used for file, socket, and other handles we care about, this will require extra care to get right, at the minimum the types should be explicitly documented which values they have for niches since that could be a major foot-gun.
"icr"?
"icr"?
I can't remember.
Feedback I've gotten so far on the new OwnedFd
and BorrowedFd
has been positive, and I've now ported a few codebases to use the new prototype, and it's working out well enough that I now think that this is the better design, so I've revised the RFC here to propose this new API instead of IoSafe
. Of course, feedback is still welcome on either approach, and I've left a mention of IoSafe
in the Alternatives section.
Outdated
These two types are conceptual replacements for `RawFd`, and represent owned
and borrowed handle values. `OwnedFd` owns a file descriptor, including closing
it when it's dropped. `BorrowedFd`'s lifetime parameter ties it to the lifetime
of something that owns a file descriptor. These types enforce all of their I/O
RalfJung 3 days ago •
Member
I would phrase it as
BorrowedFd
's lifetime parameter says for how long access to this file descriptor has been borrowed.
Just like with &'a T
, this lifetime may be an under-approximation of the actual lifetime of the underlying object, and that's okay. We often borrow things for less than the full lifetime of an object. For example, the owner might want exclusive access back and then do more things with it.
Do you consider these two formulations to be equivalent? They sound non-equivalent to me, so I wonder what lead you to your choice.
sunfishcode 3 days ago •
Author
Member
You're right. I was just thinking that it's obvious that a user could always drop a BorrowedFd
earlier than the end of the lifetime, but it's better to clarify. I've now updated it to use your proposed wording.
- A `as_filelike_view::<T>()` function returns a `View`, which contains a
temporary `ManuallyDrop` instance of T constructed from the contained
file descriptor, allowing users to "view" a raw file descriptor as a
`File`, `TcpStream`, and so on.
RalfJung 3 days ago
Member
Do you propose to add this to the standard library, or is this just mentioned as a possible use of these APIs?
sunfishcode 3 days ago •
Author
Member
I propose adding this to the standard library. Experience with it in unsafe-io and feedback in this RFC is that it's a relatively common thing to want to do with file descriptors, and it's a safe API that requires unsafe
to implement.
That said, it isn't essential, and could also be provided by an extension trait in a crate.
safety invariants automatically.
it when it's dropped. `BorrowedFd`'s lifetime parameter says for how long
access to this file descriptor has been borrowed. These types enforce all of
their I/O safety invariants automatically.
RalfJung 3 days ago
Member
Thanks!
With this change, I think I am wondering why the lifetime parameter is called 'owned
?
sunfishcode 3 days ago
Author
Member
I read 'owned
as saying "assume there's an OwnedFd
(or equivalent) that this BorrowedFd
can't outlive". I'm not attached to it, and recognize that that may be specific to how I think about lifetimes. I'd be ok with 'a
or another name here.
workingjubilee 2 days ago •
Contributor
Simply 'file
or 'fd
might be fine, because BorrowedFd's reference back to that OwnedFd is trying to represent the "inherent" lifetime of the file descriptor as a constraint.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK