5

Github RFC: Hidden trait implementations by Centril · Pull Request #2529 · rust-...

 3 years ago
source link: https://github.com/rust-lang/rfcs/pull/2529
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Conversation

Copy link

Contributor

Centril commented on Aug 24, 2018

edited

memo Summary

Allow a visibility modifier on a trait implementation such that:

  • it is public wrt. coherence and overlap (this property is meant to preserve coherence).
  • it is not usable in contexts where it is hidden (not visible)

An example:

#[derive(Clone)]
struct Foo(..);

crate impl Copy for Foo {} // Can only be used in the current crate!

sparkling_heart Thanks

To @aturon, @Mark-Simulacrum, and @kennytm for reviewing the draft version of this RFC.

Copy link

Member

scottmcm commented on Aug 24, 2018

edited

How does this interact with blanket impls? Like suppose it were crate impl<T> From<T> for T { ... }. It feels really weird that you end up violating overlap for something you can't even use, and far more than it does for "no, you can't implement that because it's for someone else's type".

Edit: A better description: for a concrete crate impl, it's inobservable to other crates that the impl exists there at all, since they couldn't impl it anyway. But for a generic crate impl, something you can't see affects what you can do, which feels really weird. (And different from things like crate fn and crate const similar.)

Copy link

Contributor

Author

Centril commented on Aug 25, 2018

edited

@scottmcm

How does this interact with blanket impls?

The interaction is described here.
In this case, if a situation occurred where crate A defined crate impl<T> AFrom<T> for T { ... } and crate B, which is dependent on A, defined impl AFrom<BType> for BType { .. }, then an error would occur in crate B with an error message about overlap (I'd also include a note that the implementation in A is hidden to be more helpful..)

While this may seem weird, I think it is also useful in some cases. Consider a conditional and hidden blanket impl like crate impl<T: Copy> AFrom<BType> for BType { .. } This allows crate A to ensure that other crates can't implement something, so crate A retains the ability to do so. One strategy to do so is to intentionally panic in the implementation; but that may not be right in most cases and you'd have to be careful not to use these implementations internally to not panic your own code. For marker traits, you don't need to add any items so there's no need to panic.

However, I don't think this sort of intentional prevention will be the main use for this. It is something that you can do, but the main use case is to allow users to implement traits they actually need to use for internal purposes but to not expose implementations and avoid guaranteeing them.

But for a generic crate impl, something you can't see affects what you can do, which feels really weird.

That depends on what you mean by see here. The error message in the link above is:

error[E0119]: conflicting implementations of trait `crate_A::Property` for type `Thing`:
 --> src/main.rs:<line>:<column>
  |
L | impl Property<Thing> for Thing {
  | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: conflicting implementation in crate `A`:
          - crate impl<T> crate_A::Property<T> for T { ... }
    note: the implementation exists, but is hidden.

Note in particular that the author of crate B can see that there is a hidden implementation in crate A that is in the way. This enables B's author to contact A's author and see if they can perhaps work something out.

As a final note, the reason why E0119 must be emitted is that we must preserve coherence because to violate coherence would be unsound and cause breaking changes to unsafe code which may assume coherence as an invariant. The canonical example of this is the hash table problem.

Copy link

Contributor

TimNN commented on Aug 26, 2018

I think this RFC needs a section about how it interacts with specialisation, consider this:

// crate A

trait Property {
  fn get(&self) -> u32;
}

impl<T> Property for T { 
  default fn get(&self) -> u32 { 42 }
}

crate impl Property for Thing {
  fn get(&self) -> u32 { 4242 }
}

Crate A would see the specialised implementation of Property whereas crate B would see the default implementation. What happens if crate B passes a Thing to function taking <T: Property> of crate A?

Copy link

Contributor

Author

Centril commented on Aug 26, 2018

edited

@TimNN Nicely spotted! I hadn't considered that interaction, so thank you for raising this. :)

Crate A would see the specialised implementation of Property whereas crate B would see the default implementation.

If by "see" we mean "use", then if this were allowed to happen, then it would cause the type system to be unsound because it would be incoherent (i.e: allow different dynamic semantics depending on where the implementation was resolved in).

So I see two principal ways to deal with this:

  1. Let the crate impl Property for Thing be unobservable directly such that you couldn't directly write in crate B: <Thing as Property>::get(&a_thing) but you could observe the implementation indirectly through <T: Property>. Then this wouldn't be incoherent. One could justify this approach by noting that an implementation is already provided, so the specialization is not expected to be so wildly different, and if it were removed, an implementation would be there so no error would occur. Under a situation where parametricity is culturally assumed by users, this might work; however, it could also be considered foot-gunny when specialization of associated types are involved. It may also cause surprises for users.

  2. Require that any specialized implementation have a visibility which is ≥ that of the base implementation which is specialized. This would cause the above snippet to become a type error at the crate impl....
    Currently, I think this is my preferred approach since I think it will be less surprising. Another advantage is that 2) is the more conservative approach and it is forward-compatible with 1) so we could change to 1) if we determine in the future that it is needed and there is popular demand for it (I don't expect so, but who knows...).

What do you think?

Copy link

Contributor

TimNN commented on Aug 26, 2018

What do you think?

(2) would be my preference as well (also because IMO it's simpler to understand than option (1)).

Copy link

Contributor

Author

Centril commented on Aug 26, 2018

@TimNN Yeah that's a great point! In fact, it felt heavier to write the explanation for 1) so I wrote the one for 2) first =P. The explanation for 2) is also more obvious I think since you can fit it into an error message.

Copy link

burdges commented on Aug 26, 2018

edited

I'm kinda dubious about the motivation here: Are we trying to expose TestRng: FromEntropy or just use it internally? Are we not exposing impls because TestRng violates their assumptions, but just happens to need code that looks identical? I suspect so but this hints towards an error in admitting the FromEntropy machinery. Who observes that machinery? This sounds messy.

I suspect this would usually be best addressed with hidden types and simplified delegation, but given that delegation has languished so long..

I'm wondering if some lint scheme might achieve this better, so the impls would technically be exposed, but any user outside the current crate gets a warning specified at trait declaration time.

#[hide(crate,deny,"X is not small, but we audit our internal Copy usage, and fix benchmark regressions.")]
impl Copy for X {}

#[hide(crate,allow,"We use a slow conversion in X: Hash so you should use a precursor to X or batch convert to X::Affine instead.")]
impl Hash for X { ... }

I'm thinking lints prevent crate authors from really depend on hiding impls, so they'll be more conservative, while simultaneously admitting a wider range of warnings:

  • "A: B does not uphold assumption .., so breaks if you do .."
  • "A: B is used internally but is not considered part of the stable API because .."

It just seems like hiding is too blunt an instrument and people who want this should explain when the impl is or is not usable. A lint could permit hiding generic impls that exist for whatever reason, but would probably not hiding anything automatically, instead the hider should write out lints for each problematic trait.

Copy link

Contributor

Author

Centril commented on Aug 26, 2018

@burdges

Are we trying to expose TestRng: FromEntropy or just use it internally?

In the case of TestRng we are trying to make use of FromEntropy internally; but we don't want to expose SeedableRng as a semver guarantee to the outside world. In other words, this is about code reuse without promising things we shouldn't.

Are we not exposing impls because TestRng violates their assumptions, but just happens to need code that looks identical?

TestRng violates no assumptions; we are just not ready to expose the fact that TestRng is SeedableRng because that would set certain properties in stone until the next breaking release.

Other examples, as discussed in the motivation, include not wanting to expose impl From<SizeRange> for Range<usize> as a public API or @nikomatsakis's desire to easily #[derive(Default)] but only for internal usage.

I suspect this would usually be best addressed with hidden types and simplified delegation, but given that delegation has languished so long..

This is discussed in depth in the section on alternatives. Remember that you both have to 1) define the newtype struct, 2) wrap (and sometimes unwrap) the base type in the newtype every time you need the crate-local-only implementations. 3) create the delegating implementations. Thus, comparatively speaking, using forms of GeneralizedNewtypeDeriving or delegation per #2393 requires a considerably larger, relatively speaking, annotation burden as compared to just adding crate in front of impl. This holds in particular when there's not just one trait you have to implement, but several.

Also consider that delegation or newtype-deriving does not generally work and can be quite restrictive. For example, if we take a look at RFC #2393, it does not work at the moment for: associated types, associated consts, fns without a receiver, fns where the receiver is not (&mut?)? self, fns where any argument or the return type is Self. These restrictions rule out quite a few traits which you might want to delegate for. One example is Iterator which due to its associated type, can't be delegated for. Another is PartialEq which can't be delegated for because it has an argument of type Self. A further problem is that sometimes, due to requirements such as T: 'static and because you can't give away ownership of the base object you are newtype deriving over (so the newtype has to be struct New<'a>(&'a BaseType);), delegation and new-type deriving simply won't work.

All in all, I believe using the newtype approach is an indirect approach that does not highlight the author's intent and that adds a non-trivial amount of extra thinking and reading burden on the author. Thus, I think it is a worse solution for readability, maintainability, and for writing ergonomics.

I'm wondering if some lint scheme might achieve this better, so the impls would technically be exposed, but any user outside the current crate gets a warning specified at trait declaration time.

I will add a discussion about this in the section on alternatives. However, when dealing with unsafe code or some sort of API invariant you'd like to maintain, sometimes accidentally exposing an implementation can lead to those invariants breaking or worse yet, unsoundness. Thus, a lint is not a true alternative since it would not allow you to retain these guarantees. If a user truly wants to hide away implementations from other users, it seems to me more principled to offer a type system backed guarantee rather than merely a suggestion.

Furthermore, it is not clear how this #[hide(..)] would interact with #[derive(..)].

I would also like to add that comparatively speaking, crate impl is just adding the notion of visibility modifiers on an item, so it is more in line with Rust as-is today. Thus, I think it is smaller in terms of the cost to the complexity budget.

As for wanting to provide more tailored error messages, we could expose better facilities for custom error messages in general, and hidden implementations could be part of that. For example, you might have some sort of #[on_visibility_error("Sorry we are not exposing this because...")]. This could work generally for visibility modifiers on other items such as structs, fns, etc.

However, I think that most of the time, you would just like to hide an implementation without explaining yourself. In the cases where I have needed this, that applied. If I had to provide an explanation, there is a chance that it would have been too heavy a burden wherefore I might have accidentally publicized the implementation instead.

It sounds like visibility modifiers for impls might not adhere to the same rules as visibility modifiers for inherent items or non-public traits though?

Are you sure it's wise to permit hiding all traits? Sync? Drop? CoerceUnsized?

Can I hide only a supertrait? It sounds strange to hide FnOnce but not FnMut, but we do similar things with hidden traits all the time.

How does hiding interact with generic impls? If I hide X: Iterator then what happens to X: IntoIterator? What happens if my public function returns a Cow<'a,T> where T: Clone is hidden?

It's tempting to keep hidden impls in line with existing hidden traits here. That results in incomplete hiding, but incomplete hiding is what you want at least half the time anyways, and incomplete hiding is at least fixable with more visibility modifiers, ala specialization.

Can you forbid hiding an impl for your own tait via syntaxes enabled by specialization, amybe impl<T: MyTrait> MyTrait for T {}?

I do think lints provide answers to most of these questions, namely the burden is on the hider for both coverage and information. It's impossible for these answers to always be correct though because the space is more complex than with existing visibility modifiers.

Copy link

Contributor

Author

Centril commented on Aug 26, 2018

@burdges

It sounds like visibility modifiers for impls might not adhere to the same rules as visibility modifiers for inherent items or non-public traits though?

In what sense? The differences where they exist are specified in the reference except for the interaction with specialization which I've proposed a rule for in the above discussion.

Are you sure it's wise to permit hiding all traits? Sync? Drop? CoerceUnsized?

Rules for auto traits are specified in point 8.
For example, the following would be rejected:

pub(self) impl !Send for MyUnsafeType {}

As for Drop, it might be prudent to reject implementations with visibility $vis_a for Drop for a type with visibility $vis_b where $vis_a < $vis_b. We could give well defined semantics for the case of $vis_a < $vis_b by saying that the Drop implementation will be used everywhere (so no coherence issues), but that you can not directly pass a type to somewhere <T: Drop> is required if that fact is not visible in your scope. This might be surprising, so we could do a similar restriction as the one proposed in rule 2) for specialization. This rule could be relaxed in the future in a backwards compatible manner.

Can you elaborate on CoerceUnsized?

Can I hide only a supertrait?

No. This is specified in 4.
For example, you may not write:

#[derive(crate Clone, pub Copy)]
struct Foo;

but you may write:

#[derive(pub Clone, crate Copy)]
// equivalently:
#[derive(Clone, crate Copy)]
struct Foo;

How does hiding interact with generic impls? If I hide X: Iterator then what happens to X: IntoIterator?

You can't hide a bound. You can hide the fact that some type ConcreteX implements Iterator. That entails that the fact X: Iterator is not satisfied in some context where the implementation is hidden. If the path to X: IntoIterator goes through X: Iterator (which was not satisfied in the given context) then X: IntoIterator is not satisfied either. This is again specified in 4.

What happens if my public function returns a Cow<'a, T> where T: Clone is hidden?

If T is a generic parameter and has the bound T: Clone then that function may not be called if Clone is a hidden implementation in context. This also follows from 4.

If you believe there are faults in the technical specification of this RFC, I invite you to leave line comments where appropriate so that I may fix those faults.

Can you forbid hiding an impl for your own tait via syntaxes enabled by specialization, amybe impl<T: MyTrait> MyTrait for T {}?

Elaborate? I have discussed the situation wrt. blanket impls here: #2529 (comment).

I do think lints provide answers to most of these questions, namely the burden is on the hider for both coverage and information.

I don't believe in the linting approach because fundamentally, lints are not about encapsulation and API abstraction. We use visibility modifiers for that to enforce properties, so it seems to me natural to use $vis impl.

I'd missed that hidden bounds were forbidden. That makes this less complex and much less useful, which sounds good.

I do still think hidden impls should always be justified in comments. As otherwise downstream could reasonably just fork the upstream under the assumption that upstream was being overly shy about its public API. Clippy could maybe complain if a hidden impl lacked a comment?

I'd expect the push for genericity will create more situations where crate authors might hide impls when they should instead merely provide a lint against using them.

As an example, if you create a HashMap with a non-cryptographic Hasher then repeatedly Cloneing it may increase attack surface. We clearly do not want to remove or hide HashMap: Clone, especially as RandomState is cryptographic, but one could easily imagine crate authors making that choice. Instead, we should merely warn users whenever they utilize the bound HashMap: Clone.

There are also plausible concerns around Default in which one actually should expose Default but also warn the users.

Opposed. Case it addresses is not common enough to warrant extension, cost is nontrivial, risk of mistaking it for modular instances is high, and existing pattern of using private newtypes is sufficient and not too burdensome in the rare cases this occurs.

@graydon

Case it addresses is not common enough to warrant extension

imo One of the use cases is very common but possibly regularly overlooked in crate API design (and buried in the RFC among talk of coherence / newtype).

From the RFC

A further example from proptest is when the author of this RFC mistakenly exposed impl From for Range publicly while the API was only needed internally. Another occasion where a similar scenario occurred was when [email protected] accidentally had a public dependency on regex_syntax due to yet another a From implementation.

The case I'm concerned about I think is the same that happened for regex. When creating an impl Error for your crate, you can't implement the Froms for your dependencies for handy use of ? without exposing them in your public API. imo this nearly nullifies the value of the into performed by ?.

risk of mistaking it for modular instances is high

Personally, I'm not seeing it as something that would be mistaken.

cost is nontrivial,

I can't speak to cost or whether this causes any soundness issues. If this is possible, I think it would be a big help to reduce overhead for API design best practices (currently more trivial to implement Froms when you shouldn't than to hide your implementation details) but I can accept it not happening for cost or technical reasons.

Copy link

Contributor

Author

Centril commented on Jan 12, 2019

@epage

I can't speak to cost or whether this causes any soundness issues.

Unfortunately, the current state of this proposal does when combined with specialization (see #2529 (comment), #2529 (comment)). For a possible resolution of the soundness issue see: #2529 (comment). Since this is rather subtle and not clear whether this is sufficiently scrutable for users, I have been putting of further work on this RFC for now in favor of more pressing concerns.

As for cost, @nikomatsakis would know more; I think Chalk should make this easier.

Copy link

CAD97 commented on Jan 12, 2019

edited

Where I very much want something like this is for implementing traits that mention types from private dependencies. If you'll note, all the concrete examples of use of this potential feature fall into this case.

If we restrict it to this case, standard orphan rules should cover all potential problems with hiding the implementation, so it wouldn't have to "leak" into error messages.

I agree that the "full" scoped implementation is problematic. But I think crate-local implementation of traits mentioning foreign types is a less problematic subset.

(EDIT: specialization and blanket implementation make this not perfectly true. I still think that the "private dependency" case should always be sound, though, because the implementation can always be added later, right? Specialization is a hard problem, and I haven't really kept up with that nor its relevance to this.)

Are you in opposition to just that subset, @graydon? It could be accessible via a broader public/private dependency split (which would otherwise have to forbid said implementations entirely) or more granularly (to allow use with public dependency types).

At the very least, I don't think From<private dependency type> is an uncommon thing to want to have, and don't see how a private newtype can recover all of that functionality. (T::from_type works, of course, but is a hit in usability and means all of the Into::into API ergonomics go out the window as well.)

@CAD97 not sure, that's a different proposal than what's on the table here. I like minimizing accidental exports and global leakage (indeed it's why I argued against having typeclasses initially) but the ship sailed: reasoning about the impl relation is global. Starting to invent secondary ways of making it non-global in addition to the visibility rules around the traits and types involved in the relation seems to me like inviting more confusion than you could gain from avoid from the cases the existing system is misused.

I'm unsure if this has been brought up, seeing as I don't have time to read the entire thread, but it'd also be great if it were considered to allow using this to implement external traits for external types; I can't count the amount of times I've wanted to do this for a binary I've been doing which I then need to pollute with wrapper types (this especially happens with Display in my case of printing a lot of pretty information) or extension traits to make it possible.

Copy link

Member

jswrenn commented on Nov 14, 2020

Apologies if this has already come up, but type privacy (described by RFC 2145) contributes a notion of implementation visibility that can be used to achieve scoped trait impls.

The semantics of this pattern might provide a useful, minimal foundation on which to build a pub(self) impl feature.

related #493

Copy link

Member

scottmcm commented 11 days ago

We discussed this in the backlog bonanza meeting today. We're definitely interested in something like this eventually -- the safe transmute work would like it, for example -- but this is currently the kind of big idea that's not a good fit for the current roadmap cycle, and implementation-wise would probably end up waiting until the chalk work is integrated anyway.

@rfcbot fcp postpone

Copy link

rfcbot commented 11 days ago

edited by cramertj

Team member @scottmcm has proposed to postpone this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

Copy link

rfcbot commented 11 days ago

bellThis is now entering its final comment period, as per the review above. bell

Copy link

rfcbot commented yesterday

The final comment period, with a disposition to postpone, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC is now postponed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Reviewers

scottmcm

Assignees

Centril

Projects

None yet

Milestone

No milestone

Linked issues

Successfully merging this pull request may close these issues.

None yet

13 participants

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK