4

RFC: result_ffi_guarantees by Lokathor · Pull Request #3391 · rust-lang/rfcs · G...

 1 year ago
source link: https://github.com/rust-lang/rfcs/pull/3391
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

RFC: result_ffi_guarantees #3391

Conversation

Contributor

@Lokathor Lokathor

commented

Feb 21, 2023

edited by tmandry
lebensterben, GrayJack, tmccombs, andylizi, Aloso, seritools, pravic, dtolnay, and joshlf reacted with thumbs up emoji

# Summary

[summary]: #summary

This RFC gives specific layout and ABI guarantees when wrapping "non-zero" data types from `core` in `Option` or `Result`. This allows those data types to be used directly in FFI, in place of the primitive form of the data (eg: `Result<(), NonZeroI32>` instead of `i32`).

Suggested change
This RFC gives specific layout and ABI guarantees when wrapping "non-zero" data types from `core` in `Option` or `Result`. This allows those data types to be used directly in FFI, in place of the primitive form of the data (eg: `Result<(), NonZeroI32>` instead of `i32`).
This RFC gives specific layout and ABI guarantees when wrapping "non-zero" data types from `core` in `Result`. This allows those data types to be used directly in FFI, in place of the primitive form of the data (eg: `Result<(), NonZeroI32>` instead of `i32`).

Contributor

Author

To the best of my knowledge, having these layout+ABI guarantees on Option hasn't actually been made official by any RFC or similar. The standard library has some examples of where Option wrapping some type will give a same layout, but I'm not sure of anywhere that actually specifies an ABI compatibility.

Contributor

Author

Re-reading that, it seems to assure memory layout not ABI.

I can remove the mentions of Option stuff if folks think it's absolutely necessary, but it's probably better to have things clear and all in one place. I'm very interested in doing low level and ffi work in Rust, and if even I don't know where to find this info, then we probably haven't written it down enough.

8573 reacted with thumbs up emoji

Option-like enums have their layout and ABI guaranteed by a FCP in rust-lang/rust#60300. Although, this is not currently documented in Rust reference but in UCG only.

joshtriplett

added T-lang Relevant to the language subteam, which will review and decide on the RFC. I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting.

labels

Feb 21, 2023

Contributor

Is there any particular downside to just making this guarantee for all enums similar to Option and Result in layout, like Poll for instance?

Contributor

Author

Yes that was mentioned in the Future Work portion. Basically it can probably become a semver hazard sort of thing if it's automatically done for all enums and people aren't opting in to the behavior. Keeping the RFC scoped small makes it more likely to be accepted and completed on the types people use most often.

Contributor

Ahh, I see; I completely missed that part. I guess that what I'm failing to understand is how you can actually run into that kind of hazard unless you use something like #[non_exhaustive], since any changes to the type would be breaking anyway.

The examples need a bit more diversity. They are all of the form Result<T, ()>, while the proposal also covers Result<(), E> or Result<T, ZST>.

This needs some more discussion on which exact zero-sized types should have layout optimization guarantees, because it could be a semver hazard, particularly in FFI. A type struct Error {} could have new fields added in the future. Perhaps limit only to #[repr(C)] ZSTs? Or to structs of the form struct NoParams;, where any field addition is obviously a breaking change?

I have doubts whether the layout guarantees would be useful in practice, though. The primary motivation in the RFC isn't memory usage but FFI ergonomics. With Option<&T>, for example, a raw pointer is already treated as "either nothing or a valid *mut T" in C function APIs, so the guarantee is reasonable. But with fallible APIs, the conventions are all over the place. Some functions return negative integers on error, some - positive ones, some do both of those, some even return 0, or return garbage and set ERRNO. In principle, you could cover those cases, e.g. Result<(), NonZeroI32> covers the "0 for success, nonzero for error" part. But in practice, you would expect specific error codes and not just arbitrary integers, so you'd have to do some error classification on the Rust side anyway.

This means that a separate ABI-stable struct as the return type, with functions success(self) -> T and error(self) -> Error would likely be a better choice. Once the Try-trait gets stable, you could also use all the usual ? sugar with those types.

konsumlamm and 8573 reacted with thumbs up emoji

The point about ZST is also brought up last week by @tmandry during lang team triage meeting (notes). We could use limit it to #[repr(transparent)] or #[repr(C)] ZSTs.

Tbh, I'm not particularly keen on having guarentees on repr(C) ZSTs. They are not a thing in ISO C and compiler extensions may (or may not) do different things with them which makes them potentially an FFI hazzard.

scottmcm reacted with thumbs up emoji

@ChrisDenton Rust guarantees their specific behaviour, so this seems like a non-issue. On the FFI side none of those types would exist anyway, just like Result<_, _> doesn't. Though #[repr(C)] is likely indeed a poor choice for layout optimization guarantees, since one could just as easily add private fields as to a #[repr(Rust)] type. #[repr(transparent)] indeed looks like a better option.

Contributor

Author

repr(transparent) newtypes of () would be sufficient to allow for different errors, but a lot of people would probably ask why they have to put a "useless field" in their error type to make it work right.

Why would you make it a newtype of ()? I'm thinking of the following definitions:

struct ZST;
#[repr(transparent)]
struct ZST();
#[repr(transparent)]
struct ZST {}
#[repr(transparent)]
struct ZST<T>(PhantomData<T>);

Contributor

Author

I had thought that a repr(transparent) needed to have a single non-zero sized field, but double checking using your example.. it seems that there can be one or zero non-zero sized fields.

So sure, repr(transparent) (or fieldless) seems a fine requirement on the ZST. One moment and I'll make an edit.

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns.
See this document for info about what commands tagged team members can give me.

rfcbot

added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it.

labels

Feb 28, 2023

Member

Overall seems reasonable.

@rfcbot concern needs-to-mention-non_exhaustive

It's implied by the definition of non_exhaustive that a non-exhaustive ZST can't be guaranteed here, but since we've had so many issues about it (rust-lang/rust#78586), I'd like to see it called out explicitly in here to make sure we get tests about it.

Member

it seems that there can be one or zero non-zero sized fields.

Zero fields was FCPed in rust-lang/rust#77841

CryZe

commented

Feb 28, 2023

edited

There's been some ideas of potentially modifying the calling convention to support Result more cheaply. Does this RFC interfere with that?

8573 reacted with thumbs up emoji

Contributor

Author

There's been some ideas of potentially modifying the calling convention to support Result more cheaply. Does this RFC interfere with that?

This RFC requires that the combined type be treated as the primitive type, so I guess "yes, it could interfere with that". However, since the primitive types are generally single-register (except and wide pointers and ints bigger than a machine register), this doesn't seem like a strong problem. Other forms of Result (eg: Result<(),()>) could still (in the future) be optimized differently (eg: have the function "return the value" via status flag being set or not).

Zero fields was FCPed in rust-lang/rust#77841

It seems that the reference didn't get an update.

concern needs-to-mention-non_exhaustive

Updated.

Member

one thing i'd specifically want to reserve space for is representing Result errors with unwinding (distinct from panicking) when the user opts into it, probably through an attribute on the error type or on the function. the idea is this is more efficient for functions that very rarely fail, because the generated LLVM IR don't have to construct Result values or match on them.

this may need a specific carve-out exception in the defined ABI for Result, so we should carve it out now before it becomes a breaking change.

Contributor

Author

Hm, could you say more about what you'd expect to be changed within the RFC to have that carved out? Some mentioning in Future Possibilities? I'm completely unfamiliar with how unwinding interacts with the foreign ABI modes.

* `Result<T, E>` where either `T` or `E`:

* Are a zero-sized type with alignment 1 (a "1-ZST").

* Either have no fields (eg: `()` or `struct Foo;`) or have `repr(transparent)` if there are fields.

* Do not have the `#[non_exhaustive]` attribute.

  1. I assume the nested list of requirements is a conjunction, but that should be made explicit.
  2. I am surprised to see repr(transparent) here. In the past we stated that a repr(transparent) in the standard library does not mean this type will not change in the future -- basically repr(transparent) is a private attribute that the library may use, but not clients of the library. If this is intended to deal with issues like repr(transparent) should not consider non_exhaustive types as 1-ZSTs outside their crate rust#78586, I would have expected "has no private field" (which is then trivially true for fieldless types)
8573 reacted with thumbs up emoji

Contributor

Author

  1. yeah it's a "must meet all of these", i can clarify the wording.

  2. I'd be fine changing to "must have no fields" if necessary. Since it's in FCP though maybe a T-lang member can make a clear call here before I touch that part?

The issue is that it's very easy to add a private field and accidentally break the guaranteed layout. Imho there should be some explicit way to opt into such optimizations for ZSTs which are guaranteed to remain ZST. I don't think #[repr(transparent)] is a good solution to the issue, but it's the closest approximation in current Rust, so it's at least something to consider.

Is the "private attribute" opinion documented somewhere? Afaik it isn't, and it's not uncommon to treat it as an API guarantee.

Contributor

Author

I asked Josh via Zulip and he said it would be fine to further restrict the rules here even while FCP is active.

I will make an update later today to simply require no fields at all (and no non_exhaustive). Then it will be the usual semver breaking change if a field is added. I believe that should satisfy things?

No. The thing that really worries me is that, being a guaranteed optimization, breaking the layout won't produce any errors. The optimization will just silently stop applying. This will lead to UB on the FFI side, and I don't see any reliable way to check for such breakage.

An upstream crate makes a semver breaking change by adding some inner field (possibly even zero-sized one, but perhaps not in a guaranteed way). A downstream crate which used the changed type in extern fn() -> Result<&T, ZST> now silently gets UB, and there is no way for them to avoid it other than tracking all changes in all dependencies.

Currently the solution would involve explicitly ABI-stable types, and conversions between them and native Rust types. Such conversions would either work correctly, fail to compile or fail at runtime (assuming they are properly written and verify their preconditions).

Also note that this isn't an issue for current types with guaranteed layout optimizations, since their number is very small and they are all provided by the language. There is only a finite number of non-generic NonZeroX types, and they are all guaranteed by the language. If user-defined niches become a thing, I would be just as worried.

For this reason I would prefer to have an explicit way to specify that the layout optimization must apply, rather than deduce it from implicit properties of code, which may be changed by the programmer without even considering that someone could rely on them.

nacaclanga and 8573 reacted with thumbs up emoji

Contributor

Author

If the crate you're getting a type from puts some attribute on their type, can't they just remove that attribute (presumably as a semver break) in a future version? Then isn't it just as bad for the downstream crates relying on it?

Honestly I don't think people should do this in the first place with any type that's not their own type or (). However, I don't want to disallow user ZSTs entirely because it makes for much better internal error handling compared to having () as the error type.

the point of having an attribute is that explicitly having to remove it makes people consider it a semver break since it's obvious, whereas having it completely implicit makes it much more likely people won't realize they need a semvar break to change their library type -- most people aren't going to think about Result's ABI when they change their library type.

afetisov and kpreid reacted with thumbs up emoji

An upstream crate makes a semver breaking change by adding some inner field (possibly even zero-sized one, but perhaps not in a guaranteed way). A downstream crate which used the changed type in extern fn() -> Result<&T, ZST> now silently gets UB, and there is no way for them to avoid it other than tracking all changes in all dependencies.

We do have improper_ctypes warning though.

improper_ctypes is far too noisy to be useful, so we shouldn't rely on a fairly substandard lint.

Member

one thing i'd specifically want to reserve space for is representing Result errors with unwinding (distinct from panicking) when the user opts into it,

This RFC covers the FFI case, which wouldn't use that optimisation. And if it did then the future RFC proposing that would have a lot of specifying to do, and I think that specifying belongs there rather than here.

Hm, could you say more about what you'd expect to be changed within the RFC to have that carved out? Some mentioning in Future Possibilities? I'm completely unfamiliar with how unwinding interacts with the foreign ABI modes.

I'm basically expecting that the list of requirements to have Result have a defined ABI would have an entry kinda like this:

  • Future RFCs may introduce additional requirements, e.g. some new 1-ZST error types may be marked such that they opt out of stable ABI for Result<T, MyErr> e.g. by representing the Err variant by unwinding instead of returning normally.

this way we tell users they can't just just check the current list of requirements for any arbitrary type and be guaranteed to always forevermore have stable ABI when those requirements are met, future RFCs may introduce exceptions for types using new features.

Member

basically I want to say that the following code is unsound:

pub fn f<T>(v: extern "C" fn() -> u32) -> extern "C" fn() -> Result<NonZeroU32, T> {
    assert!(size_of::<T>() == 0 && align_of::<T>() == 1);
    unsafe { transmute_copy(&v) }
}

Contributor

Author

I think I'm with josh, if there was some new way to work with results that would be some new abi, not the existing abis.

If the rfcs says "you need to meet these requirements, and also maybe we'll add more requirements later", that's not a stable guarantee at all.

Your example code is basically exactly what I want to have be allowed: Assuming some T that's trivially constructable.

I think I'm with josh, if there was some new way to work with results that would be some new abi, not the existing abis.

If the rfcs says "you need to meet these requirements, and also maybe we'll add more requirements later", that's not a stable guarantee at all.

it is a stable guarantee, because it's guaranteed to work with all types not using features defined in any future RFCs adding exceptions, e.g. it always works with Result<NonZeroU32, ()> because it would be a breaking change to add those new features to any existing 1-ZST types because of the ABI change.

Your example code is basically exactly what I want to have be allowed: Assuming some T that's trivially constructable.

but the point is that example code doesn't check for any of the new features that might be on some T passed in by arbitrary user code.

if ZST types had to be explicitly marked as ABI stable for Result, that would resolve my concern about needing a carve-out for unwind-on-error types, since we could simply make that ABI stable for Result attribute mutually exclusive with the unwind-on-error attribute.

@afetisov

@ChrisDenton Rust guarantees their specific behaviour, so this seems like a non-issue. On the FFI side none of those types would exist anyway, just like Result<_, _> doesn't.

It wouldn't matter in this case, since the RFC is only specifying the behavior of Result<T, E>, which would not otherwise have any ABI guarantee even when T and/or E are types that do have an ABI guarantee when standing alone.

...But it's an interesting question whether, in general, #[repr(C)] unit structs should be guaranteed ABI-compatible with C empty structs (which are an extension), at least on platforms whose standard compilers support that extension. I did a bit of research. rustc currently claims they are not FFI-safe, which apparently happened as a result of this issue from 2015. The topic came up again in 2018 with the AArch64 ABI and there is a still-open followup issue, though the question really shouldn't be specific to AArch64.

Admittedly, it's a topic for another thread, especially now that the RFC no longer mentions #[repr(C)]. Just thought I'd point out that the situation is more complicated than 'ZSTs don't exist in FFI'.

Contributor

Author

@scottmcm I believe your concern has been resolved if you want to review again.

Member

@rfcbot resolve needs-to-mention-non_exhaustive

joshtriplett

removed the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label

Mar 7, 2023

Suggestion: add a new layout attribute, #[repr(ZST)]. This attribute can only be placed on types which are considered to be guaranteed ZSTs according to the current language rules. The attribute doesn't change the fact whether the type is a ZST, but it serves as a public guarantee that the zero-sized layout is intended and can be relied on, and not just a temporary accident. Removing #[repr(ZST)] is a breaking change. Putting it on a non-ZST is a compile error.

More specifically, #[repr(ZST)] can be applied only to structsshould we also include unions?, such that all fields are #[repr(ZST)], including one of the primitive ZST types. The primitive ZSTs include (), [T; 0], PhantomData<T>, and function pointer types (should there be others?).

#[repr(ZST)] can be combined with #[repr(transparent)], but doesn't change the layout in this case, since a transparent type wrapping a ZST is already a ZST. However, #[repr(ZST)] is a public guarantee that a transparent type stays ZST, and thus is a stronger guarantee.

In particular, ZST-based layout optimizations, like the ones proposed in this RFC, apply only to #[repr(ZST)] types, and not to accidentally ZSTs or to #[repr(transparent)] wrappers around #[repr(ZST)] types.

#[repr(ZST)] cannot be combined with #[repr(C)] since the C standard doesn't have the concept of ZST. In particular, C++ compilers turn empty structs into types of size 1.


Is this fine? Perhaps this should be extracted into a separate RFC, but it doesn't have any current purpose without the layout optimization guarantees, so I'd add it to the current RFC.

Is this fine? Perhaps this should be extracted into a separate RFC, but it doesn't have any current purpose without the layout optimization guarantees, so I'd add it to the current RFC.

there is a current purpose:

mod some_crate {
    #[repr(ZST)]
    pub struct SomeStruct {
        f: ()
    }
}

#[repr(transparent)]
pub struct MyStruct {
    a: u8,
    b: some_crate::SomeStruct, // guaranteed to not be compile error
}
afetisov reacted with thumbs up emoji

Member

@afetisov One problem there is that ZST-ness is necessary, but not sufficient. repr(transparent) specifically needs a 1-ZST, because [u128; 0] is a ZST but you still can't have a #[repr(transparent)] struct Foo(u8, [u128; 0]);.

Also, I think that reprs are a poor way to communicate public guarantees. I think that traits communicate "yes, everyone can rely on this" better, especially since the compiler can help with it. And it could let the compiler allow #[repr(transparent)] struct Foo<M: Definitely1ZST>(u8, M); that's impossible right now. Not to mention that it's something that any semver checker would already handle.

madsmtm reacted with heart emoji

Contributor

@rfcbot fcp reviewed

This makes sense to me.

rfcbot

added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised.

and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period.

labels

Mar 14, 2023

bellThis is now entering its final comment period, as per the review above. bell

The primitive ZSTs include (), [T; 0], PhantomData<T>, and function pointer types

Slight correction: function item types. Function pointers have sizes.

rfcbot

added finished-final-comment-period The final comment period is finished for this RFC. to-announce

and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised.

labels

Mar 24, 2023

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

This will be merged soon.

jschwe and Throne3d reacted with hooray emoji

Member

This RFC was discussed in this week's lang team triage meeting. We (I) weren't sure whether the compiler already treats types like Result<NonZeroI32, ()> with the same ABI as C int32_t, vs with the same ABI as C struct { int32_t; }. (The distinction between these 2 in extern "C", unlike in extern "Rust", is why RFC 1758 repr(transparent) was needed.)

lokathor: does compiler already do this?

nikomatsakis: Pretty sure it does.

cramertj: is this insta-stable?

nikomatsakis: I think so, or else we can fcp merge the PR to the reference.

If Result<NonZeroI32, ()> already works like int32_t on all platforms then implementing this RFC is a matter of exempting the appropriate types from improper_ctypes / improper_ctypes_definitions, and testing; if it works like struct { int32_t; } there would need to be codegen changes.

As far as I could tell, the former is the case, at least in my usual suspect ABI (32-bit x86).

// cargo asm --target i686-unknown-linux-gnu

use std::num::NonZeroI32;

#[repr(C)]
pub struct Wrapper {
    pub primitive: i32,
}

pub extern "C" fn primitive() -> i32 {
    1
}

// different code than `primitive`
pub extern "C" fn wrapper() -> Wrapper {
    Wrapper { primitive: 1 }
}

// same code as `primitive`
pub extern "C" fn result_nonzero() -> Result<NonZeroI32, ()> {
    Ok(unsafe { NonZeroI32::new_unchecked(1) })
}
Lokathor reacted with thumbs up emoji

tmandry

merged commit d1df23c into

rust-lang:master

Apr 18, 2023

Contributor

The @rust-lang/lang team has decided to accept this RFC.

To track further discussion, subscribe to the tracking issue here:
rust-lang/rust#110503

jschwe reacted with thumbs up emoji

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Reviewers

joshtriplett

joshtriplett left review comments

RalfJung

RalfJung left review comments

nbdd0121

nbdd0121 left review comments

ChrisDenton

ChrisDenton left review comments

programmerjake

programmerjake left review comments

afetisov

afetisov left review comments

workingjubilee

workingjubilee left review comments
Assignees

No one assigned

Labels

disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. T-lang Relevant to the language subteam, which will review and decide on the RFC. to-announce

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

None yet

17 participants

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK