
RFC: Partial Types by VitWW · Pull Request #3420 · rust-lang/rfcs · GitHub

source link: https://github.com/rust-lang/rfcs/pull/3420

RFC: Partial Types #3420


@VitWW commented on Apr 18, 2023 (edited)

This RFC proposes Partial Types as a universal, type-safe solution to "partial (not) borrowing"-like problems.

Rendered

struct Point {x: f64, y: f64, was_x: f64, was_y: f64}

let mut p1_full = Point {x: 1.0, y: 2.0, was_x: 4.0, was_y: 5.0};
    // type with full access
    // p1_full : mut Point;
    // p1_full : mut %full Point;


let p_just_x = %{x, %cut} Point {x: 1.0};
    // partial initializing
    // p_just_x : %{%permit Self::x, %deny Self::{y, was_x, was_y}} Point;


let ref_p_just_x = & %max p_just_x;
    // partial not borrowing(referencing)
    // ref_p_just_x : & %{%permit x, %deny {y, was_x, was_y}} Point;
 
// partial parameters   
fn x_restore(&mut p1 : &mut %{x, %any} Point, & p2 : & %{was_x, %any} Point) {
    *p1.x = *p2.was_x;
}

// partial arguments: partial not borrowing and partial not referencing
x_restore(&mut %min p1_full, & %min p1_full);

where

  • %full and %{..} are type access;
  • %permit and %deny are field access;
  • %min and %max are access filters;
  • %any and %cut are quasi-fields

This is an alternative to Partial borrowing issue#1215, View patterns internals#16879, Permissions #3380, Field projection #3318, Fields in Traits #1546, ImplFields issue#3269

lebensterben, clarfonthey, chenyukang, konsumlamm, burdges, kennytm, slanterns, ssokolow, and lukechu10 reacted with thumbs down emoji.
Veykril, clarfonthey, Lokathor, shepmaster, seritools, chenyukang, rodrimati1992, ssokolow, and aznhe21 reacted with confused emoji.

ehuss added the T-lang label (Relevant to the language subteam, which will review and decide on the RFC.) on Apr 18, 2023

Safe, flexible, controllable partial parameters for functions and partial not consumption (including partial not borrowing) are highly needed, and this feature unlocks a huge amount of possibilities.

Partial borrowing is already possible in Rust, as partial referencing and partial moves.

But partial parameters are currently forbidden, as is qualified consumption: partial not borrowing, partial not referencing, partial not moving and partial initializing.
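For readers less familiar with the starting point, here is a minimal sketch in today's stable Rust of the two things the quoted text says already work: partial borrowing of fields inside one function body, and a partial move. The `Record` type and both function names are hypothetical and not taken from the RFC.

```rust
// Hypothetical type for illustration; not taken from the RFC.
struct Record {
    label: String,
    x: f64,
    was_x: f64,
}

/// Partial borrowing: within one function body the borrow checker tracks
/// fields separately, so these two borrows do not conflict.
fn restore_x(r: &mut Record) {
    let x = &mut r.x;     // mutable borrow of `x` only
    let was_x = &r.was_x; // shared borrow of a different field
    *x = *was_x;
}

/// Partial move: `label` is moved out, yet the remaining fields stay usable.
fn split_label(r: Record) -> (String, f64) {
    let label = r.label; // `r` as a whole is now partially moved
    (label, r.x)         // reading another field is still allowed
}
```

What cannot be expressed today is a signature that says "this parameter only touches `x`", which is the gap the thread is arguing about.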

I feel like the motivation for partial borrows is very clear, but the motivation for this implementation of it is not. "Unlocking huge amounts of possibilities" doesn't seem terribly convincing to me; for example, we could implement the duck-typing templates system from C++ to unlock loads of possibilities, but we don't because it's clunky, errors at monomorphisation time by design, and encourages bad code. What real cases does this feature unlock that make it feel worthwhile?

lebensterben and felix91gr reacted with thumbs up emoji

First, this proposal almost entirely keeps the existing style of writing Rust (in contrast to duck typing).

  1. It is fully backward-compatible. This proposal DOES NOT change or block any existing feature, before or after implementation.
  2. It adds some safe flexibility to safe code by safe methods.
  3. Simplicity in the binary: type integrity just tells the compiler, through the type, that some fields are forbidden for anyone to use, ever. That allows ordinary references and ordinary variables to be used as "partial". No extra actions with variables or pointers are needed.
  4. Any partial type error is a compile error, and all types are erased after type-checking, so there is no extra cost in the binary.
  5. It is a universal rule, which means minimal special cases in the implementation.
  6. It is a minimal universal extension: all other proposals offer less than this at the same or greater cost.

We need detailed integrity to write non-virtual, specifically typed parameters in functions, including trait implementations.

An abstraction is added for detailing integrity: we assume that **every** variable is a `struct`-like object (even if it is not).

What value do we gain from assuming that every variable is a struct? It doesn't really feel simpler to me, and all it requires is adding extra .self specifications to the end of every path.

Good question.

  1. We get unification
  2. Enums allow unit-like values anyway (regardless of whether we want to represent numbers or not)

p1 : &mut %{self.*} Point;

p1 : &mut %{self.x, self.y, self.z, self.t, self.w} Point;

p1 : &mut %{self.{x, y, z, w}} Point;

It would make way more sense to me if these were listed as Self::field instead of self.field, especially with *, for two reasons:

  • One, it looks closer to the existing use syntax
  • Two, Self:: is used for type information whereas self. is used for expressions, and this is type information

Agreed. I've changed "self." to "Self::".
But it has a disadvantage in impl blocks, where we get an "inner Self" and an "outer Self".

// case (D7)
impl Point {
    pub fn x_refmut(&mut self : &mut %{Self::x, %any} Self) -> &mut f64 {
        &mut self.x
    }
}

As I see it, detailed access can contain nothing other than Self::, so the prefix is entirely unnecessary and I have removed it.
So, the code looks prettier!

impl Point {
    pub fn x_refmut(&mut self : &mut %{x, %any} Self) -> &mut f64 {
        &mut self.x
    }
}

We assume that each field can be in one of two specific field integrities: `%fit` and `%deny`.

We must also reserve `%miss` as a keyword for a field integrity in future ReExtended Partial Types, which allow creating **safe** self-referential types.

`%fit` is the default field integrity, and it means we have access to this field and can use it as we wish. But if we try to access a `%deny` field, it causes a compile error.

I'm going to be honest, even after reading this multiple times, this doesn't make any sense to me at all. It doesn't help that "fit" is a word that has other meanings, and doesn't immediately strike me as being an abbreviation of "field integrity."

If the purpose of partial types is to only borrow a portion of a type, then why would you need to label what's included? Shouldn't the type be fully defined by what portion is being used?

Author

OK, I've replaced "fit" with "permit" (and "integrity" with "access") to make the meaning more obvious.

About labels: it is a good question.
What really changes? It just adds labels for the fields we want to borrow.

or use a `where` clause if the integrity is extra verbose:

```rust
// case (D6)
// FROM case (D5)
fn x_store(&mut p1 : &mut %fit_sv_x PointExtra, & p2 : & %fit_x PointExtra)
    where %fit_sv_x : %{self.saved_x, %any},
          %fit_x : %{self.x, %any}
{
    *p1.saved_x = *p2.x
}

fn x_restore(&mut p1 : &mut %fit_x PointExtra, & p2 : & %fit_sv_x PointExtra)
    where %fit_sv_x : %{self.saved_x, %any},
          %fit_x : %{self.x, %any}
{
    *p1.x = *p2.saved_x;
}
```

This doesn't seem like good syntax to me, since where is usually used to indicate bounds on parameters, whereas this is defining parameters instead. Wouldn't it make more sense to simply use type aliases for this?

For example, type PointSaveXMut<'a> = &'a mut %{self.saved_x, %any} PointExtra could then allow using PointSaveXMut<'_> directly.
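The alias mechanism referred to here already exists for ordinary reference types. The sketch below shows only that mechanism in today's Rust; the `%{..}` access part has no equivalent and is left out. `PointExtra` is a cut-down hypothetical stand-in and `PointExtraMut` is an invented name.

```rust
// Hypothetical, reduced stand-in for the RFC's `PointExtra`.
struct PointExtra {
    x: f64,
    saved_x: f64,
}

// A plain alias for a reference type with a lifetime parameter.
// The RFC's `%{saved_x, %any}` access annotation cannot be written today,
// so this alias still grants access to every field.
type PointExtraMut<'a> = &'a mut PointExtra;

fn x_store(p: PointExtraMut<'_>) {
    p.saved_x = p.x;
}

fn x_restore(p: PointExtraMut<'_>) {
    p.x = p.saved_x;
}
```

The alias keeps long reference types out of function signatures, which is the readability point being made about the `where` form.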

VitWW reacted with thumbs up emoji

Author

Good point! I like it! I added your example

Contributor

So, I've left a lot of comments, but I have two main thoughts on the RFC.

The first one is that this RFC is… very difficult to understand. I don't think it properly explains things, it makes a few weird assumptions about the type system, and doesn't explain the motivation properly.

However, after thinking about it, I do think that partial types in some form is probably the best solution to the partial borrow problem. It makes a lot of sense to simply ignore parts of a type itself when borrowing, since the language already considers it unsafe to write to padding and otherwise "undefined" portions of structs. The flexibility is also justified because it helps avoid the case where you need to create dedicated "subtypes" when refactoring something that operates on a larger struct so that you can split up the data properly.
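As an illustration of the "dedicated subtypes" refactoring mentioned above, here is a sketch of how the split is usually done in today's Rust. All type and function names (`Game`, `Physics`, `Render`, `tick`) are hypothetical.

```rust
// Hypothetical example: the large struct is split into sub-structs purely so
// that methods can borrow the parts independently.
struct Physics {
    positions: Vec<f64>,
    velocity: f64,
}

struct Render {
    frames_drawn: u32,
}

struct Game {
    physics: Physics,
    render: Render,
}

impl Physics {
    fn step(&mut self, dt: f64) {
        // `positions` is borrowed mutably, `velocity` is only read.
        for p in &mut self.positions {
            *p += self.velocity * dt;
        }
    }
}

impl Render {
    fn draw(&mut self, positions: &[f64]) -> usize {
        self.frames_drawn += 1;
        positions.len()
    }
}

fn tick(game: &mut Game) -> usize {
    game.physics.step(0.5);
    // `render` is borrowed mutably while `physics.positions` is borrowed
    // shared. This only works because they live in separate sub-structs.
    game.render.draw(&game.physics.positions)
}
```

If `draw` were instead a `&mut self` method on `Game`, the call would borrow the whole struct and the shared borrow of `positions` would be rejected.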

This RFC recommends partial types through the guise of "integrity," which doesn't really make sense. The defining feature of partial types should just be what primitive fields are included, and there shouldn't be any extra caveats on that. If a particular field needs special treatment, we have dedicated wrapper types for that, like MaybeUninit.

Additionally, this RFC doesn't comment at all on how arrays, enums, and unions are included. It makes sense to not include anything besides basic structs and integer primitives at first, but not knowing at all how this could be extended to, for example, enums, makes it difficult to adopt. Partial types on enums in particular explicitly require enum subset types, where only particular variants are allowed; otherwise, how can you choose what fields to borrow if you're not sure if they even exist?
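The enum concern can be made concrete with a small sketch in today's Rust. The `Shape` enum and `width_mut` are hypothetical; the point is only that a field of one variant cannot be borrowed without first checking which variant is present.

```rust
// Hypothetical enum: `width` exists only in the `Rect` variant.
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
}

/// Borrowing a single field of an enum has to go through a match,
/// and the result is optional because the field may not exist.
fn width_mut(shape: &mut Shape) -> Option<&mut f64> {
    match shape {
        Shape::Rect { width, .. } => Some(width),
        Shape::Circle { .. } => None,
    }
}
```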

Essentially… I wouldn't be in favour of this RFC being adopted, even its syntax, without heavy revision. However, I like the idea of partial types, and feel like I would like to see them in the language.

VitWW, chenyukang, and AaronKutch reacted with thumbs up emoji

Contributor

the language already considers it unsafe to write to padding and otherwise "undefined" portions of structs.

I would say that it's actually the opposite: writing to (u16, u8) always semantically writes uninit to the padding byte.
That's why it's not safe to (for example) cast &mut (u16, u16) to &mut (u16, u8).
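To make the padding byte visible, here is a small sketch. It uses a hypothetical `#[repr(C)]` struct rather than the tuple from the comment, because the layout of tuples is not specified while `repr(C)` layout is.

```rust
use std::mem::{align_of, size_of};

// Hypothetical stand-in for `(u16, u8)` with a defined (C) layout.
#[repr(C)]
struct Pair {
    a: u16,
    b: u8,
}

/// Returns (total size, alignment, number of padding bytes).
fn pair_layout() -> (usize, usize, usize) {
    let size = size_of::<Pair>();
    let payload = size_of::<u16>() + size_of::<u8>();
    (size, align_of::<Pair>(), size - payload)
}
```

On common targets this reports a size of 4, an alignment of 2, and one padding byte after `b`. That trailing byte is the one a whole-value write is allowed to leave uninitialized.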

Contributor

Also what is a "sub-full_type", which is mentioned at the start of the guide section?

Note: It is highly recommended to deprecate the % operator as a remainder function (there is still no ambiguity in writing "\s+%\s+") and replace it with another operator (for example: %mod / %rem / mod / rem), so that it is not confused with type access.

This is probably the most unlikely part of the proposal. I cannot imagine that being a realistic outcome.

Author

VitWW commented on Apr 19, 2023 (edited)

@clarfonthey Thanks for review!

However, after thinking about it, I do think that partial types in some form is probably the best solution to the partial borrow problem. It makes a lot of sense to simply ignore parts of a type itself when borrowing, since the language already considers it unsafe to write to padding and otherwise "undefined" portions of structs. The flexibility is also justified because it helps avoid the case where you need to create dedicated "subtypes" when refactoring something that operates on a larger struct so that you can split up the data properly.

A good advantage of my proposal is that it allows detailed access to be added incrementally to different types, assuming that the rest are "%full" but have no internal structure (like units/numbers).

Structs and tuples are the first and most important candidates for detailed access.

From the point of view of the type system (and detailed access), unions are indistinguishable from structs, but since unions were already unsafe, they remain unsafe with detailed access.

For arrays and vectors it is unclear how to add detailed access.

Author

@Lokathor
As I mentioned, it is not mandatory to deprecate the "mod %" operator.

Contributor

the language already considers it unsafe to write to padding and otherwise "undefined" portions of structs.

I would say that it's actually the opposite: writing to (u16, u8) always semantically writes uninit to the padding byte. That's why it's not safe to (for example) cast &mut (u16, u16) to &mut (u16, u8).

Huh, I wasn't aware this is how the semantics actually work. Is there any actual benefit to "poisoning" the padding bytes like this, or could this potentially be modified for the case of partial types?

Contributor

Huh, I wasn't aware this is how the semantics actually work. Is there any actual benefit to "poisoning" the padding bytes like this,

It means that you can put the type in a register, or two registers, or in memory, and not have to worry about preserving the padding bytes (depending on where the data is stored it may be faster to either copy the padding or clear it, or even put it in a place where there is no data storage for the padding at all).

or could this potentially be modified for the case of partial types?

I don't see any reason why not; the compiler has all the information regarding a partial type copy and can desugar it to a series of field copies. I imagine there are some restrictions on when a partial type copy is legal though, since the borrow checker supports paths like foo[_].x (meaning "the x field of some array element of foo") which I'm not sure make sense as standalone types.
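A rough sketch of what "a series of field copies" could look like if written by hand today. The selection of fields (`x` and `was_x`) and the function name are assumptions made for illustration; nothing here is generated by the compiler.

```rust
// Same shape as the `Point` in the PR description.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: f64,
    y: f64,
    was_x: f64,
    was_y: f64,
}

/// Hand-written equivalent of copying a "partial" `Point` that only
/// covers `x` and `was_x`: every other field of `dst` is left untouched.
fn copy_x_fields(dst: &mut Point, src: &Point) {
    dst.x = src.x;
    dst.was_x = src.was_x;
}
```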

clarfonthey reacted with thumbs up emoji

Isn't https://github.com/nikomatsakis/fields-in-traits-rfc the proper venue for partial borrowing?

I do wonder if "fields" may be a distraction here; maybe traits should have associated lifetime-like objects that limit method disjointness, without saying how the impls realize the disjointness.

pub trait T {
    /// 'a,'b,'c are lifetime-like annotation which reference disjoint fields  
    disjoint 'a, 'b, 'c;

    /// 'x,'y are also reference disjoint fields, but they've no disjointness from 'a,'b,'c
    /// except lifetime +'z outlives fields within lifetime +'b, making them a subset.
    disjoint 'x, 'y, 'z: 'b;  

    /// The compiler infers a mapping from lifetime-like annotation to fields which
    /// satisfies all trait methods, but the trait itself is unambiguous about which
    /// methods can be called together.
    fn foo(& +'b mut 'y self) -> & +'z Foo;
    /// The borrow from bar prevents a borrow from foo, but not vice versa.
    fn bar(& +'y mut self) -> &'y mut Bar;
}

I've not thought this all through, and fields-in-traits provide a cleaner solution for many cases, but actually an inference based solution brings many advantages by virtue of never discussing fields, ala no Self::x etc. They'd work in associated types too.

Author

@burdges I added this RFC to the alternatives and mentioned it.
Lifetimes were added to Rust because the compiler is "weak", and lifetimes help it understand what is happening with references and other variables.
But the "Fields in Traits" RFC is based on the assumption of a "clever" compiler that can understand what to do if we add more lifetimes.

My proposal of partial(-access) types just uses what is already in Rust. As a result, we have a mathematical guarantee that using variables in parallel is as safe as using them in sequence.

We know three approaches to this problem (1) explicitly named fields like this proposal, (2) expose fields via some trait mechanism, or (3) infer the fields somehow. Of these, (2) and (3) work for traits, but afaik (1) makes no sense for traits, which makes proposals like this one not too useful in practice.

I think (2) lacked any inference originally, so no, it did not require a "clever compiler" originally, but I'm not sure exactly how https://github.com/nikomatsakis/fields-in-traits-rfc evolved later.

An inference solution ala (3) requires a "clever compiler" but likely not so much different than what's being done anyways. In essence, "access filters" become relative lifetimes, aka lifetime modifiers relative to 'self, so an associated item of the kind type 'a<'self>: 'self . All the methods, associated types, etc. specify what fields they access, return, etc. so the compiler finds if your impl admits some partial assignment of fields to relative lifetimes. It rejects your impl if no such partial assignment exists.

In practice, we'd typically have two or three methods which require simultaneous mutable borrows, so this looks roughly like

trait T {
    // type 'a<'self>: 'self;  // suppressed as redundant 
    // type 'b<'self>: 'self;  // suppressed as redundant
    // type 'c<'self>: 'self;  // suppressed as redundant
    disjoint 'a, 'b, 'c;
    fn foo(&'a mut 'c self) -> Foo<'a>; 
    fn bar(&'b+'c mut self) -> &'b mut Bar; 
}

We need no knowledge of the fields when using this trait, but our trait requires the methods foo and bar never collide on the mutable borrows they return.
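For comparison, the closest thing expressible in today's Rust is a single trait method that hands out both borrows at once. This is a sketch of that workaround, not of the `disjoint` syntax above; `Parts`, `foo_and_bar`, `MyT`, and `bump_both` are hypothetical names.

```rust
struct Foo(u32);
struct Bar(u32);

// Today a trait cannot promise that two separate `&mut self` methods touch
// disjoint fields, so the disjointness is packed into one method instead.
trait Parts {
    fn foo_and_bar(&mut self) -> (&mut Foo, &mut Bar);
}

struct MyT {
    foo: Foo,
    bar: Bar,
}

impl Parts for MyT {
    fn foo_and_bar(&mut self) -> (&mut Foo, &mut Bar) {
        // Inside the impl the fields are known, so the split is accepted.
        (&mut self.foo, &mut self.bar)
    }
}

/// Generic code can hold both mutable borrows without knowing the fields.
fn bump_both<T: Parts>(t: &mut T) {
    let (foo, bar) = t.foo_and_bar();
    foo.0 += 1;
    bar.0 += 10;
}
```

The cost is that every combination of simultaneously needed borrows needs its own method, which is what the relative-lifetime idea is trying to avoid.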

It typically matters more what borrows outlive a method than what actually goes into the self receiver. In this example, 'c says foo and bar could touch other fields separate from those they return, so if omitted then foo and bar cannot overlap at all in what they access. As written, you could not have distinct closures using foo and bar due to 'c being mutable in bar, but you could do so if instead you wrote fn bar(&'b mut 'c self) -> &'b mut Bar;

You cannot so easily distinguish "what I need now" vs "what I need later" if you focus upon the fields more explicitly, ala (1) or (2), because those fields no longer have meaning to the outgoing borrowed types Foo<'a> or &'b mut Bar. At best your access patterns would become littered with explicit lifetimes.

I'm convinced the compiler could always infer these relative lifetimes correctly without help from the programmer. If I'm wrong then you could imagine aiding the compiler, like:

impl T for MyT {
    type 'a<'self>: 'self = { field1, mut field2 };
}

We've now come back to roughly your access patterns, except they describe what borrows outlive method bodies better, and they appear more readable ala separate lines, etc.

Author

VitWW commented on Apr 20, 2023 (edited)

@burdges
Your code is a "Guide-level explanation": if I write it like this, then the compiler does the rest.
FT-RFC is a variation of the "partial parameters" proposal.
Unfortunately, it is not enough. The compiler still needs "partial (not) consumption".

My proposal is honest and it is minimal.
In brief, FT-RFC adds an additional layer of disjoint traits over my proposal. That means my proposal is simpler.

So, based on this, I know for example that FT-RFC still needs multi-selfs if we need several functions which read a common field but write different fields:

pub fn mf1_rfc(&mut self1 : &mut %{field1, %any} Self, &self2 : & %{common, %any} Self)  
{ /* ... */ }
    
pub fn mf2_rfc(&mut self1 : &mut %{field2, %any} Self, &self2 : & %{common, %any} Self)  
{ /* ... */ }
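For reference, the "read a common field, write different fields" pattern can be written in today's Rust by passing the fields individually, at the cost of losing the method syntax. This is a sketch with hypothetical names (`State`, `mf1`, `mf2`, `run`), not the multi-self form proposed above.

```rust
// Hypothetical struct with one shared field and two independently written ones.
struct State {
    field1: i32,
    field2: i32,
    common: i32,
}

// Each function receives exactly the fields it needs instead of `&mut self`.
fn mf1(field1: &mut i32, common: &i32) {
    *field1 += *common;
}

fn mf2(field2: &mut i32, common: &i32) {
    *field2 *= *common;
}

fn run(s: &mut State) {
    // One shared borrow of `common` stays alive across both calls, next to
    // mutable borrows of two other fields.
    let common = &s.common;
    mf1(&mut s.field1, common);
    mf2(&mut s.field2, common);
}
```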

Author

VitWW commented on Apr 20, 2023 (edited)

@clarfonthey Since you understand the core idea of partial types, and that this idea is the minimal and best solution, we could cooperate on this proposal.
I'm open to, and ready for, drastic changes.

In theory, partial types work with any product type (PT = T1 and T2 and T3 ....), but not with sum types (ST = T1 or T2 or T3 ...).
So, the most promising candidates are structs (maybe units) and tuples.
In theory partial types could be extended to arrays and vectors, but it is still unclear to me how.
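For arrays and vectors, the nearest existing tool in the standard library is `split_at_mut`, which splits one mutable slice into two non-overlapping mutable slices. The sketch below shows only that existing API and is not a design for partial types on arrays; the helper name is hypothetical.

```rust
/// Adds the sum of the left part to every element of the right part.
/// `split_at_mut` yields two disjoint `&mut [i32]` from one slice.
fn add_left_sum_to_right(values: &mut [i32], mid: usize) {
    let (left, right) = values.split_at_mut(mid);
    let sum: i32 = left.iter().sum();
    for v in right.iter_mut() {
        *v += sum;
    }
}
```

Unlike struct fields, the split point is a runtime value, which may be part of why a type-level description of array parts is harder.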

The main idea is to give the compiler a mathematical guarantee that using variables in parallel is as safe as using them in sequence.
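A small sketch of the "parallel is as safe as sequential" idea using what stable Rust already offers: disjoint field borrows handed to scoped threads. This assumes Rust 1.63 or later for `std::thread::scope`; the reduced `Point` and the function name are hypothetical.

```rust
// Reduced, hypothetical version of the `Point` from the PR description.
struct Point {
    x: f64,
    y: f64,
}

/// Updates two different fields from two threads at the same time.
fn update_in_parallel(p: &mut Point) {
    // Destructuring a `&mut Point` yields one `&mut f64` per field.
    let Point { x, y } = p;
    std::thread::scope(|s| {
        // Each thread owns a mutable borrow of a different field, so the
        // borrow checker accepts this without locks.
        s.spawn(move || *x += 1.0);
        s.spawn(move || *y *= 2.0);
    });
    // All scoped threads have been joined when `scope` returns.
}
```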

VitWW marked this pull request as draft on April 24, 2023 16:05

Author

This proposal is hard to read; most reactions are "confused".
So I decided to fully rewrite the RFC and close this one.

💖 Thanks

To @clarfonthey for their reviews of the draft proposal.


