Github RFC: Pointer metadata & VTable by SimonSapin · Pull Request #2580 · r...
source link: https://github.com/rust-lang/rfcs/pull/2580
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Contributor
SimonSapin commented on Oct 28, 2018
Add generic APIs that allow manipulating the metadata of fat pointers:
- Naming the metadata’s type (as an associated type)
- Extracting metadata from a pointer
- Reconstructing a pointer from a data pointer and metadata
- Representing vtables, the metadata for trait objects, as a type with some limited API
This RFC does not propose a mechanism for defining custom dynamically-sized types, but tries to stay compatible with future proposals that do.
Contributor
Author
SimonSapin commented on Oct 28, 2018
mikeyhew commented on Oct 28, 2018 •
First of all, thanks for opening this rfc. It's the right way to fix the raw::TraitObject API, and is a big step toward custom DST.
My only criticism is I think the Vtable type should be generic over the trait object type, as mentioned in the alternatives section. Having different Metadata
types for each trait object type would help catch errors at compile time. Also, it seems like &'static
would preclude us from implementing trait object types like dyn Trait1 + Trait2
using multiple vtable pointers, and i think it would be nice to avoid making that decision in this RFC.
I like the name Pointee
, I think it's an improvement over Referent
because it's more clear what it means.
Contributor
gnzlbg commented on Oct 28, 2018
cc @ubsan
///
/// [dst]: https://doc.rust-lang.org/nomicon/exotic-sizes.html#dynamically-sized-types-dsts
#[lang = "pointee"]
pub trait Pointee {
oli-obk on Oct 28, 2018
Contributor
so... I'm assuming the compiler implements
default impl<T: ?Sized> Pointee for T { type Metadata = &'static Vtable; } impl<T: Sized> Pointee for T { type Metadata = (); } impl Pointee for str { type Metadata = usize; } impl<T: Sized> Pointee for [T] { type Metadata = usize; }
Which means theoretically we could make Vtable
generic over T
allowing the drop_in_place
method to take a raw pointer with the correct pointee type?
SimonSapin on Oct 28, 2018
Author
Contributor
These impls would be accurate in current Rust, but what I had in mind instead was that the compiler would automatically generate impls, similar to what it does for the std::marker::Unsize
trait. As far as the standard library is concerned these impls would be "magic", not based on specialization.
Regardless, yes, making VTable
generic with a type parameter for the trait object type is possible.
Outdated
(Answer: they can use a different metadata type like `[&'static VTable; N]`.)
`VTable` could be made generic with a type parameter for the trait object type that it describes.
This would avoid forcing that the size, alignment, and destruction pointers
SimonSapin on Oct 28, 2018
Author
Contributor
Without a type parameter, x.size()
with x: &'static VTable
necessarily executes the same code for any vtable. With a type parameter, x: &'static VTable<dyn Foo>
and x: &'static VTable<dyn Bar>
are different types and could execute different code. (For example, do table lookup with different offsets.) However, keeping the offset of size
the same within all vtables might be desirable regardless of this API.
Outdated
`VTable` could be made generic with a type parameter for the trait object type that it describes.
This would avoid forcing that the size, alignment, and destruction pointers
be in the same location (offset) for every vtables.
But keeping them in the same location is probaly desirable anyway to keep code size
Outdated
type Metadata;
}
/// Pointers to types implementing this trait alias are
Contributor
Author
SimonSapin commented on Oct 28, 2018
@mikeyhew I’ve very open to adding a type parameter to VTable
, I’ll just wait to get some more feedback on the RFC as-is before making significant changes.
As to supporting super-fat pointers with multiple vtable pointers, as mentioned in the alternatives section I believe this design doesn’t prevent it. Types that don’t exist yet and are added to the language in the future (possibly custom DSTs) can have a different metadata type. For dyn A + B
that could be [&'static VTable; 2]
, for example. This proposal does however force dyn C
with trait C: A + B {}
to keep only having a single vtable pointer.
pub unsafe fn drop_in_place(&self, data: *mut ()) { ... }
}
```
SimonSapin on Oct 28, 2018
Author
Contributor
I came up short trying to think of a reason not to do this at all (as opposed to doing it differently). Suggestions welcome.
Outdated
and (hopefully) more compatible with future custom DSTs proposals,
this RFC resolves the question of what happens
if trait objects with super-fat pointers with multiple vtable pointers are ever added.
(Answer: they can use a different metadata type like `[&'static VTable; N]`.)
Centril on Oct 28, 2018
Contributor
Should then [&'static VTable; 1]
for dyn SomeTrait
be used to make that transition smoother and to fit better with const generics?
SimonSapin on Oct 28, 2018
Author
Contributor
This would make some sense if we were definitely gonna have super-fat pointers with multiple separate vtable pointers as fat pointer metadata. But if we don’t and end up with a different solution to upcasting, we’ll end up with a always-single-item arrays for no reason. This isn’t really the thread to get into that discussion, but my opinion is that super-fat pointer have a significant enough size cost that I’d much prefer a different solution.
Perhaps an alternative for this RFC, more neutral with respect super-fat pointers v.s. not, would be to have type Metadata = VTable<Self>;
for trait objects. (See other comments about VTable
’s possible type paramater.) With the pointer/reference indirection hidden away in private fields of the VTable
type, this design would be compatible with having VTable<dyn A + B>
contain two pointers in the future.
SimonSapin on Oct 28, 2018
Author
Contributor
Also, is there a use case for generic code that accepts any trait object with any number of vtable pointer but not other kinds of DSTs?
Outdated
and (hopefully) more compatible with future custom DSTs proposals,
this RFC resolves the question of what happens
if trait objects with super-fat pointers with multiple vtable pointers are ever added.
(Answer: they can use a different metadata type like `[&'static VTable; N]`.)
Centril on Oct 28, 2018
Contributor
Are we doing the proposals in the right order? Shouldn't we focus on dealing with dyn A + B + C
, upcasting, and such things first? Also, cc #2035.
SimonSapin on Oct 28, 2018
Author
Contributor
I believe the design proposed here is compatible enough with various options for multi-traits trait objects and upcasting such that there isn’t a strong dependency, and we don’t need to block this RFC on everything else being settled.
Outdated
* The name of `Pointee`. [Internals thread #6663][6663] used `Referent`.
* The location of `VTable`. Is another module more appropriate than `std::ptr`?
Centril on Oct 28, 2018
Contributor
and should it be called Dictionary
instead? ("type class dictionary")
kennytm on Oct 28, 2018
Member
Big -1 to calling it Dictionary since this typically means a key-value map (example: C#, Swift, Python).
Furthermore, here in Rust the VTable
is implemented as an array of function pointers, not a HashMap (unlike e.g. Python where it is really implemented as a dict
), so calling it Dictionary obscures the alleged complexity.
SimonSapin on Oct 28, 2018
Author
Contributor
Even if the implementation happened to use HashMap
, I’d prefer VTable
since it’s more descriptive of the role of this type. (As opposed to: dictionary of what?) I believe that vtable is a well-enough established term of art.
Regarding making VTable
generic, if it looks like this:
struct VTable<Dyn> { … }
… then it could be used with any type as a parameter. What does VTable<u32>
mean? If we want to restrict VTable
’s parameter to only types where it makes sense (dyn SomeTrait
, dyn SomeTrait + SomeAutoTrait
, etc.), we’ll need a dedicated public trait:
struct VTable<Dyn> where Dyn: ?Sized + std::marker::DynTrait { … }
Do we want such a trait?
Hmm, maybe we could get away with this?
struct VTable<Dyn> where Dyn: ?Sized + Pointee<Metadata=Self> { … }
Contributor
Gankra commented on Oct 28, 2018
Is there any way to ensure minimal compiler time is wasted performing monomorphization on useless vtable type params? If not, I would rather the API just be less safe.
@gankro the word "useless" sounds a little strong there... the purpose it to make sure you can't use a vtable from a *const dyn Trait
to make a *const dyn SomeOtherTrait
, at least not without doing something obviously hacky like transmute
.
Contributor
Gankra commented on Oct 29, 2018
You're supposing a situation where I somehow am writing code with two vtable types floating around, in which case I have two data pointers, and nothing can stop me from swapping the data pointers, producing the exact same effect.
rfcbot commented on Nov 5, 2020 •
Team member @nikomatsakis has proposed to merge this. The next step is review by the rest of the tagged team members:
Concerns:
- remove-static-bound resolved by #2580 (comment)
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!
See this document for info about what commands tagged team members can give me.
Contributor
nikomatsakis commented on Nov 5, 2020
@rfcbot concern remove-static-bound
As discussed in the lang meeting, the 'static
bound on Metadata
is not necessary and should be removed.
Member
kennytm commented on Nov 6, 2020
since
T: 'static
impliesT::Metadata: 'static
anyway by Rust's rules
But what about T: 'a
? Like what should happen below?
use std::marker::PhantomData; use std::hash::Hash; struct Weird<'a> { x: PhantomData<&'a ()>, } pub trait Pointee { type Metadata: Copy + Send + Sync + Ord + Hash + Unpin /* + 'static */; } impl<'a> Pointee for Weird<'a> { type Metadata = PhantomData<&'a mut &'a ()>; } fn ok<'a, 'b: 'a>( mut a: *const Weird<'a>, b: *const Weird<'b>, ) { // should the next line be rejected? a = b; } fn not_ok<'a, 'b: 'a>( mut a: (*const (), <Weird<'a> as Pointee>::Metadata), b: (*const (), <Weird<'b> as Pointee>::Metadata), ) { // the next line will be E0623 lifetime mismatch due to the invariance a = b; }
* Is this about the field inside `DynMetadata`?
No, this is about the type of the data
argument of from_raw_parts
.
Contributor
nikomatsakis commented on Nov 6, 2020
But what about T: 'a? Like what should happen below?
T: 'a => T::Metadata: 'a
in general, yes.
However, @kennytm, I am now reading your example more closely, and I see I didn't address the substance of it really. The point is that because *const T
(and &T
) are covariant with respect to T
, that implies that the Metadata
might change too as T
is adjusted, but we don't know that the resulting T::Metadata
will be valid. That's...a good point.
Now I'm wondering about other sorts of subtyping that we have in Rust, and whether they to can cause problems. Seems like yes. In particular, we have subtyping from fn
types with bound lifetimes, though I often wish we didn't. So conceivably you could do something like
impl Pointee for Weird<fn(&u8)> { type Metadata = X; } impl<T> Pointee for Weird<T> { type Metadata = Y; }
and that would get us into some awkward trouble using a similar construction, where you "upcast" from *const Weird<fn(&u8)>
to (say) *const Weird<fn(&'static u8)>
(which works).
So the TL;DR is that I don't think that a 'static
bound actually solves the problem you're concerned about, but indeed it is a real problem that we would have to address in some way if we permitted user-provided impls.
@nikomatsakis as far as I understand fn(T) -> ..
is contravariant
with respect to T, and https://github.com/sunshowers/lifetime-variance-example/blob/master/src/lib.rs suggests that that is fine, what can materially go wrong with the playground example you linked?
Contributor
nikomatsakis commented on Nov 10, 2020
@guswynn the problem is that one could, if we permitted user-defined Pointee
impls, conceivably change the Metadata
types for those two types. In that case, if you had a *const Weird<fn(&u8)>
and you upcast it to the supertype *const Weird<fn(&'static u8)>
(for example), the metadata would have changed. That doesn't work because subtype and supertype have to have the same representation (including metadata).
@nikomatsakis does any set of dangerous impl's require full-specialization? as far as I can tell, only min_specialization wouldn't permit this overlap: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=74c83f9a0748c1043e479c0547243968 (actually, #![feature(specialization)]
doesn't even work here either, and I'm not sure the status of specializable associated types, rust-lang/rust#31844 looks like it has some unresolved details)
Contributor
nikomatsakis commented on Nov 14, 2020
@guswynn this isn't really about specialization -- these two impls are not considered specializations of one another, they apply to distinct types. Currently we give a future-compatibility warning, but I expect that we will eventually accept that pattern so as not to break wasm-bindgen.
Member
scottmcm commented on Nov 16, 2020
why
Hash
andOrd
? because of impls on*const T
How much does it matter that these impls care about the data portion? Is it even desirable? IIRC it's an anti-pattern to care about it on *const dyn Trait
anyway since it might have different values in different places.
I guess we're stuck with it for Ord
because of *const [T]
?
Contributor
Author
SimonSapin commented on Nov 18, 2020
Yes it’s a problem that vtable pointers can have different addresses for the same vtable generated independently in separate compilation units, but it’s still a compatibility constraint that Hash
and Ord
are implemented for *const dyn Trait
. At they moment they also hash/compare the vtable pointer address, even though that might give unexpected results. I’m not sure how compatible it would be to change that behavior to ignore vtable pointers entirely.
Wouldn't rustc be able to use a linkonce_odr
comdat
for vtables to merge them, just like C++ does for classes that don't have a specific owning .cpp
file?
Contributor
bjorn3 commented on Nov 18, 2020
That doesn't work across dynamic libraries for example.
That doesn't work across dynamic libraries for example.
That's not entirely true, GCC seems to think it does:
https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gccint/WHOPR.html
The whole program mode assumptions are slightly more complex in C++, where inline functions in headers are put into COMDAT sections. COMDAT function and variables can be defined by multiple object files and their bodies are unified at link-time and dynamic link-time.
Contributor
Amanieu commented on Nov 18, 2020
The whole program mode assumptions are slightly more complex in C++, where inline functions in headers are put into COMDAT sections. COMDAT function and variables can be defined by multiple object files and their bodies are unified at link-time and dynamic link-time.
This is only true on ELF platforms. Windows and macOS don't use ELF and have much more limited run-time linking.
Contributor
Author
SimonSapin commented 18 days ago
I’ve pushed a few changes:
- Remove the
'static
bound as requested. This should resolve the concern filed by @nikomatsakis - Parameterize
DynMetadata
so it’s used likeDynMetadata<dyn SomeTrait>
. After recent discussions I have a mild preference for it but I left a unresolved question about changing it back. MentionDynMetadata<u64>
as a type you can name but not obtain a meaningful value of. - Add an unresolved question about adding
into_raw_parts
, potentially instead of themetadata
function. This was suggested in #2984 (comment)
I’m still in interested in opinions on unresolved questions, in particular the API design details.
Ping @Amanieu, @BurntSushi, @cramertj, @pnkfelix, and @withoutboats for checkboxes in #2580 (comment).
Is
*const ()
the way we want to spell "untyped pointer" here? Is there any way we can make this forward-compatible with a potential future version ofc_void
that's in core rather than just std? (Note that if we end up going with one of the proposals that uses a typed pointer here, that's fine, but if we're going to use an untyped pointer, I'd rather not have multiple incompatible untyped pointer types.)
Sorry I misunderstood this earlier. That’s a good question.
c_void
does exist in core
these days, but at the language level it’s just another Sized
type like ()
(except of size 1). What makes it special is only its use convention in libc
and other FFI crates.
If the language had truly untyped pointers they would indeed be perfect for from_raw_parts
, but it doesn’t and I’m not aware of a proposal to add them. I picked *const ()
as what felt like the closest thing that exist (more so than enum c_void { #[doc(hidden)] __variant1, #[doc(hidden)] __variant2 }
) but I’m open to changing it.
Contributor
nikomatsakis commented 3 days ago
@rfcbot resolve remove-static-bound
rfcbot commented 3 days ago
This is now entering its final comment period, as per the review above.
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK