8

Tracking issue for RFC 2046, label-break-value · Issue #48594 · rust-lang/rust ·...

 2 years ago
source link: https://github.com/rust-lang/rust/issues/48594
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Tracking issue for RFC 2046, label-break-value #48594

2 of 4 tasks

Centril opened this issue on Feb 27, 2018 · 171 comments

2 of 4 tasks

Comments

Contributor

Centril commented on Feb 27, 2018

edited

This is a tracking issue for RFC 2046 (rust-lang/rfcs#2046).

Steps:

Unresolved questions:

EliSnow, emilyaherbert, stanislav-tkach, mikhail-krainik, ActuallyaDeviloper, YohananDiamond, Lynnesbian, maroider, ahlinc, mpfaff, and 14 more reacted with thumbs up emojiest31, phaylon, dhardy, Avi-D-coder, emilyaherbert, Coder-256, Caduser2020, YohananDiamond, maroider, miraclx, and 9 more reacted with hooray emoji All reactions

Centril

added B-RFC-approved Approved by a merged RFC but not yet implemented. T-lang Relevant to the language team, which will review and decide on the PR/issue. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature.

labels

on Feb 27, 2018

Contributor

SoniEx2 commented on Feb 28, 2018

Using "return" would have interesting implications for labeled ? (tryop questionmark operator thingy).

Contributor

est31 commented on Mar 1, 2018

Use return as the keyword instead of break?

@mark-i-m and @joshtriplett have already spoken out against return, but I'll join in given that it is still apparently an unresolved question.

break (and continue) are only usable with a loop.
If you can break something, you can continue it. (I don't think there's an obvious syntax to pick for continue on a block.)

In C, C++, Java, C#, JavaScript and probably more languages you are usually breaking a switch statement in order to prevent fall-through. Rust has solved this much nicer with | in patterns but people coming from those languages won't really see break as something only for for loops. Especially as Java and JavaScript expose the feature via break as well and not return.

The "rules to remember" argument works into the other direction really well however. From what I can tell it is a commonality of the mentioned languages as well as Rust that return only applies to functions and nothing else. So if you see a return, you know that a function is being left.

Labeling a block doesn't cause errors on existing unlabeled breaks

First, I think this happens very rarely as the labeled break feature is admittedly not something that will be used 10 times per 1000 lines. After all, it will only apply to unlabeled breaks that would cross the boundary of the block, not unlabeled breaks inside the block. Second, users of Rust are accustomed to complaints / error messages by the compiler, they will happily fix them! Third (this is the strongest point I think), if instead of labeling a block you wrap it into a loop, you already need to watch out for unlabeled breaks and there is no error message that conveniently lists the break statements, you've got to hunt for them yourself :).

mark-i-m and eternaleye reacted with thumbs up emoji All reactions

Contributor

nikomatsakis commented on Mar 1, 2018

Especially as Java and JavaScript expose the feature via break as well and not return.

This to me is the killer point. break from blocks is a thing in many languages. return from blocks...not so much.

estebank, Centril, varkor, SoniEx2, ahlinc, and tuguzT reacted with thumbs up emoji All reactions

Contributor

Author

Centril commented on Mar 1, 2018

Personally, I share @joshtriplett's view on using break instead of return, but it seemed to me like the discussion hadn't been resolved on the RFC... If you believe the question is resolved in the lang team, tick the box with a note =)

Contributor

est31 commented on Apr 15, 2018

Just saying that I'm working on this. Don't need mentor instructions. Just to not duplicate any efforts. Expect a PR soon.

Centril, scottmcm, and Pauan reacted with hooray emoji All reactions

Member

scottmcm commented on Apr 15, 2018

edited

I'm still in favour of return over break, but I can agree to disagree here. Question resolved.

SoniEx2, dobrakmato, and ahlinc reacted with thumbs up emojiKolsky reacted with thumbs down emojiest31, Centril, and ciphergoth reacted with heart emoji All reactions

Contributor

topecongiro commented on May 21, 2018

Currently (with rustc 1.28.0-nightly (a1d4a9503 2018-05-20)) rustc does not allow unsafe on a labeled block. Is this expected?

Member

scottmcm commented on May 21, 2018

@topecongiro Yes, I believe it's intentional that this is currently allowed only on plain blocks. It might change in future, but given that this is such a low-level and unusual feature, I'm inclined towards that being a feature rather than a restriction. (On the extreme, I certainly don't want else 'a: {.)

mark-i-m, topecongiro, est31, and miraclx reacted with thumbs up emoji All reactions

Member

mark-i-m commented on May 21, 2018

edited

Definitely agree. Unsafe + unusual control flow sounds like something to discourage.

In a pinch, though, you could use:

'a: {
unsafe {...}
}

Right?

topecongiro, scottmcm, and miraclx reacted with thumbs up emoji All reactions

Contributor

SoniEx2 commented on May 22, 2018

edited

Actually, altho else does create a new lexical scope, it's not a block. The whole if-else is a block (kinda). So no, you wouldn't get else 'a: { you'd get 'a: if ... else {.

Member

eddyb commented on May 22, 2018

else contains a block (expression). there is no "new lexical scope" without blocks.
An even worse surface syntax position than else would be 'a: while foo 'b: {...}.
(interestingly enough, continue 'a is break 'b, we might want to rely on that at least internally)

Centril, miraclx, and jonas-schievink reacted with heart emoji All reactions

Contributor

Author

Centril commented on May 22, 2018

(interestingly enough, continue 'a is break 'b, we might want to rely on that at least internally)

That's a great observation!

miraclx reacted with thumbs up emoji All reactions

Contributor

SoniEx2 commented on May 22, 2018

I think labels should be part of block-containing expressions, not blocks themselves. We already have precedent for this with loop. (As it happens, a plain block itself is also a block-containing expression. But things like if and loop are block-containing expressions without being blocks I guess.)

(Things like while or for shouldn't support label-break-value, because they could or could not return a value based on whether they complete normally or with a break.)

Member

mark-i-m commented on May 22, 2018

@eddyb

(interestingly enough, continue 'a is break 'b, we might want to rely on that at least internally)

Only if break 'b re-checks the loop condition...

Member

eddyb commented on May 23, 2018

@mark-i-m It's equivalent to 'a: while foo {'b: {...}}, the break wouldn't check the loop condition, the loop itself would, because the loop condition is checked before each iteration of the body block.

Member

mark-i-m commented on May 23, 2018

Woah, I find that highly unintuitive. I expect break 'b to be basically goto 'b, meaning we never exit the loop body and the condition is not checked again...

Member

mark-i-m commented on May 23, 2018

Oh man_facepalming I see...

Member

mark-i-m commented on May 23, 2018

This is why I don't like labeled break/continue :/

Contributor

rpjohnst commented on May 23, 2018

Well, we specifically don't have the ability to label these weird inner blocks, so I don't see the problem. break always means "leave this block" and, given the above restriction, there's no way for that to mean anything other than "goto the spot after the associated closing brace."

scottmcm, RagnarGrootKoerkamp, JoJoJet, and riking reacted with thumbs up emoji All reactions

Member

mark-i-m commented on May 23, 2018

My confusion was not specific to weird inner blocks, but I don't really want to reopen the discussion. That already happened and the community decided to add it.

Contributor

SoniEx2 commented on May 24, 2018

Okay, I understand accessibility is a big issue with programming languages... however, labeled break is extremely useful if you write code like me.

So, how can we make labeled break more accessible?

scottmcm reacted with confused emojimark-i-m reacted with heart emoji All reactions

Member

mark-i-m commented on May 24, 2018

So, how can we make labeled break more accessible?

That's a great question. Some ideas I had:

  • We should collect some samples of how people use this in the wild. We can look for undesirable patterns or lazy habits.
  • We should have opinionated style around when it is considered poor practice to use labeled break/continue.
  • We may be able to add lints for some patterns that could be turned into loops/iterator combinators mechanically (I can't think of any such patterns off the top of my head, though).

As a first (admittedly biased) sample, my last (and first) encounter with labeled break in real code was not stellar: https://github.com/rust-lang/rust/pull/48456/files#diff-3ac60df36be32d72842bf5351fc2bb1dL51. I respectfully suggest that if the original author had put in slightly more effort they could have avoided using labeled break in that case altogether... This is an example of the sort of practice I would like to discourage if possible.

Contributor

rpjohnst commented on May 24, 2018

That's... not labeled break?

scottmcm and Pauan reacted with thumbs up emoji All reactions

Contributor

Coder-256 commented on May 30

edited

do { ... } while(cond) -> loop { ...; if !cond { break; } }

@jgarvin Not sure about their exact use case, but one issue with that method is it doesn't work properly with continue, though it's still fairly trivial to accomplish with a macro (note: I'd replace first in that example with something less likely to overlap with existing variables, e.g. __do_loop_first)

@buzmeg In the meantime until this is stabilized, as mentioned in the RFC there is always the alternative of using a loop, e.g. 'label: loop { ...; break 'label foo; }. Everything that can be done with this PR is already possible; ie. this PR is not strictly necessary to enable any new behavior, it's just syntax sugar since loops can be a bit messy/hard to read. The overall structure remains nearly identical, so you may still need to restructure things.

Of course in some very complex cases there is just no real replacement for goto, but this PR only partly solves that problem anyway. IMHO, goto is still far too problematic to be worth supporting, but this is a good feature that fills in part of the gap, and I would also like to see it stabilized.

EDIT: I mistakenly wrote that this RFC would enable certain code that is not possible to write today; I think this used to be true, but I think that now the compiler is smart enough to recognize that a loop will only run once if it ends with break. Also, seems that macros are somewhat hygenic (perhaps the reference should be updated?)

WaffleLapkin and jgarvin reacted with heart emoji All reactions

Contributor

WaffleLapkin commented on May 31

@Coder-256 rust macros are (somewhat) hygienic, so first can't actually overlap with anything. See playground. The only thing that I know of that can break is const, but constants usually are named in screaming snake case, so they shouldn't clash with first either.

Coder-256 reacted with thumbs up emoji All reactions

buzmeg commented on May 31

edited

I'm confused why lack of do/while would require a lot of restructuring? do { ... } while(cond) -> loop { ...; if !cond { break; } } doesn't look like it introduces any new nesting.

This is an effect of two interacting issues

If your C-code is

do {
    do {
        <some long code>
        if error goto cleanup;
    while (cond1)

    <retry loop>
    if error goto cleanup;
} while (cond2)

recoverycode: ...

That winds up:

'exittorecovery: loop {
    loop {
        loop {
            <some long code>
            if error {break 'exittorecovery}  // Don't put this after the cond1 test or it's a bug
            if cond1 {break;}
        }

        <retry loop>
        if error {break 'exittorecovery}  // Don't put this after the cond2 test or it's a bug
        if cond2 {break;}
    }
    break 'exittorecovery;  // Don't forget this or it's a bug (fortunately it's a pretty obvious infinite loop)
}

recoverycode; ...

This is one of the few instances where Rust introduces mechanisms for a bug that doesn't exist in C.

It's not possible to execute code after the conditional test in C because of the structure of do/while while it's quite possible in the Rust replacement idiom. In addition, the C code has no "break" statements so it's not possible to confuse which break goes where while in the Rust code it's actively difficult to keep track of the various break statements.

I'm certainly not going to argue for "goto" as that's simply not possible with Rust semantics. Neverthless, not having do/while makes this a lot uglier than it should be.

However, syntax does sometimes matter for humans.

Aside: Yes, you can argue that the code shouldn't be written that way in Rust. And you would be correct. However, some of us have to start with legacy codebases before they can be modernized. This isn't theoretical: See: https://github.com/ARMmbed/DAPLink/blob/main/source/daplink/cmsis-dap/DAP.c

Of course, I'm not an expert, so it is entirely possible I'm missing something obvious in Rust that would make this discussion moot. If so, I apologize for the noise.

Contributor

digama0 commented on May 31

edited

@buzmeg Why not rename 'exittorecovery to 'cleanup? You should just think of break 'cleanup as being the rust way to write goto cleanup in the C code.

As for flipping the if statements, it's true that you could put code after the test but that's not necessarily a bug, the rust code is just more general about how you want to structure the loop exit test. Obviously if you want behavior equivalent to a do-while loop though it should be the last thing in the loop.

I think the code as written is essentially exactly the way it should be written (assuming rust doesn't spawn any new syntax in the near future), and there is a pretty mechanical way to obtain that code by syntactic transformation of the C code you started with. The do-while transformation has already been explained and is easy enough to understand, and the goto transformation looks like this:

code { goto foo; }
foo:

==>

'foo: loop {
  code { break 'foo; }
  break 'foo;
}

This only works for downward goto (for upward goto you would use continue 'foo instead), but most cleanup blocks in C have this structure. (Alternatively, you can and in most cases should use Drop impls for this, but this is the most faithful translation of the C code.) The only thing you have to watch out for is that any unlabeled break or continue (assuming the label construct is itself inside a loop) needs to be explicitly targeted at the outer loop.

Contributor

clarfonthey commented on Jun 5

I don't think this was explicitly part of the RFC but another thing that might be worth including before stabilisation: right now labels aren't allowed before statements or expressions, particularly ones that feel like blocks, e.g. the below just shows an unhelpful syntax error:

let val = 'label: match val {
    // ...
};

I think that this could be implemented for all expressions since even something weird like the below could be possible:

let val = 'label: (1, if other == 3 { break 'label (2, 3) } else { other });

But obviously, I feel like this would make the most sense to explicitly deny for now and then allow only specific subsets that make sense, like match and if expressions.

At least until those types of expressions are valid, I think that the parser should accept any 'label: expr and recover to assuming 'label: { expr }, with an automatically applicable fix for it. We can change this to just accepting these at a later point if we feel like it's a good idea.

RagnarGrootKoerkamp, Coder-256, zesterer, bluebear94, sharnoff, lucy, and afetisov reacted with thumbs up emojigThorondorsen and WaffleLapkin reacted with eyes emoji All reactions

Member

Amanieu commented on Jul 14

I used this feature in my code for a loop which needs to check some complex conditions for a special case, and otherwise fall back to a normal case:

for ... {
    // Special case which requires multiple complex conditions.
    'skip: {
        if !condition1() {
            break 'skip;
        }

        if !condition2() {
            break 'skip;
        }

        if !condition3() {
            break 'skip;
        }

        handle_special_case();
        // Falls through to normal case.
    }

    handle_normal_case();
}
repnop reacted with thumbs up emojimark-i-m reacted with confused emoji All reactions

Member

jyn514 commented on Jul 16

edited

Stabilization proposal

The feature was implemented in #50045 by @est31 and has been in nightly since 2018-05-16 (over 4 years now).
There are no open issues other than the tracking issue. There is a strong consensus that break is the right keyword and we should not use return.

There have been several concerns raised about this feature on the tracking issue (other than the one about tests, which has been fixed, and an interaction with try blocks, which has been fixed).

Many different examples of code that's simpler using this feature have been provided:

Additionally, @petrochenkov notes that this is strictly more powerful than labelled loops due to macros which accidentally exit a loop instead of being consumed by the macro matchers: #48594 (comment)

@nrc later resolved their concern, mostly because of the aforementioned macro problems.
@joshtriplett suggested that macros could be able to generate IR directly
(#48594 (comment)) but there are no open RFCs,
and the design space seems rather speculative.

@joshtriplett later resolved his concerns, due to a symmetry between this feature and existing labelled break: #48594 (comment)

@withoutboats has regrettably left the language team.

@joshtriplett later posted that the lang team would consider starting an FCP given a stabilization report: #48594 (comment)

Thus, I move to stabilize label-break-value (RFC 2046).

To make it easier to see exactly what changes are proposed, I've also opened a stabilization PR: #99332. This should not be merged until and unless the FCP finishes.

Report

Interactions with other features

Labels follow the hygiene of local variables.

label-break-value is permitted within try blocks:

let _: Result<(), ()> = try {
    'foo: {
        Err(())?;
        break 'foo;
    }
};

label-break-value is disallowed within closures, generators, and async blocks:

'a: {
    || break 'a
    //~^ ERROR use of unreachable label `'a`
    //~| ERROR `break` inside of a closure
}

label-break-value is disallowed on BlockExpression; it can only occur as a LoopExpression:

fn labeled_match() {
    match false 'b: { //~ ERROR block label not supported here
        _ => {}
    }
}

macro_rules! m {
    ($b:block) => {
        'lab: $b; //~ ERROR cannot use a `block` macro fragment here
        unsafe $b; //~ ERROR cannot use a `block` macro fragment here
        |x: u8| -> () $b; //~ ERROR cannot use a `block` macro fragment here
    }
}

fn foo() {
    m!({});
}
Rodrigodd, Skgland, jhpratt, slanterns, est31, jonhoo, Nilstrieb, lcnr, JamesHallowell, AhmadAlHallak, and Nokel81 reacted with thumbs up emojiWaffleLapkin, nyuichi, lcnr, JamesHallowell, glaebhoerl, repnop, SoniEx2, and afetisov reacted with heart emoji All reactions

Member

jyn514 commented on Jul 16

@joshtriplett could you please start an FCP?

Member

joshtriplett commented on Jul 17

@jyn514 Thank you for the detailed summary!

@rfcbot merge

jyn514 and RalfJung reacted with heart emoji All reactions

rfcbot commented on Jul 17

edited by scottmcm

Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

rfcbot

added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it.

labels

on Jul 17

Contributor

lcnr commented on Jul 19

Really exited about this heart I don't need it too frequently, but it feels intuitive and there were a bunch of places where it would have meaningfully simplified my code.

label-break-value is permitted within try blocks:

let _: Result<(), ()> = try {
    'foo: {
        Err(())?;
        break 'foo;
    }
};

Did you intend to show that you can exit try blocks using label-break-value? Which is also the behavior for labelled loops and what I expect.

#![feature(try_blocks)]
#![feature(label_break_value)]
fn main() {
    'foo: {
        let _: Result<(), ()> = try {
            Err(())?;
            break 'foo;
        };
    }
}
jyn514 reacted with heart emoji All reactions

Member

jyn514 commented on Jul 19

ah, good catch, thanks - yes, both are allowed. I took my example directly from one of the existing tests, I'll add yours to the same test in #99332.

Member

jyn514 commented 18 days ago

Hi @cramertj @nikomatsakis @scottmcm - have you gotten a chance to look at this change? I'm happy to answer questions or do more research. Message ID: ***@***.***>
DesmondWillowbrook, slanterns, and JamesHallowell reacted with thumbs up emoji All reactions

Member

scottmcm commented 6 days ago

edited

I remembered the labels involved in this as not being hygienic, but when I went to test that out I was pleased to see that I was wrong about that: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=02c73a78c1098db8fe2e8dd1624f205b. So that resolves my biggest worry about this (since I suspect 90% of loop labels in the wild are just 'a, and that people would pick the same for in-macro uses of this).

One objection here a few years ago mentioned graydon's limits to (some) growth blog post that I wanted to address specifically, since how I see that applying here is different.

First, I think we've already paid most of the costs here because of the existence of labelled loops. 'a: while x > y { was already there in Rust 1.0. The extension of that to break 'a value for 'a: loop in RFC#1624 has been stable for over 5 years. And while neither is common (nor something you'd put in the first few weeks of a rust course, likely), I've not seen a bunch of confusion or bugs related to them. So yes, 'a: { is technically another thing to learn/grok/evaluate, but given that one already needs to do the same for 'a: loop {, 'a: while, and 'a: for, I just don't see it as a problem. (And we already have any "why can't I use that label as a reference lifetime?" confusion possibilities from labelled loops, so this doesn't introduce such things.)

Second, this is the structured programming scope exiting construct. All of those that we have already -- break, continue, and return -- are just sugar for this. For example, loop { … A …; continue; … B … } is loop { 'continue: { … A …; break 'continue; … B … } }, loop { … A …; break; … B … } is 'break: { loop { … A …; break 'break; … B … } }, and fn foo() { … A …; return; … B … } is fn foo() { 'fn: { … A …; break 'fn; … B … } }. Thus the overall size of the "core" of rust plausibly decreases with this feature, as the semantics of a bunch of different things can be described by translation to the one core thing. And the translation of this into a CFG is well-understood, so it's completely disappeared by the time we're in MIR, meaning that this adds no complexity to things working at that level -- like borrow checking or MIR semantics projects. Not to mention that, on an implementation complexity front, the compiler will likely need to support it internally anyway as part of implementing other features, meaning exposing it has even less impact than it might seem. (Notably, the ? RFC ended up needing this to describe its behaviour, as further demonstration that a Rust specification will likely end up containing something like this anyway.)

So this isn't something I want to see often in Rust code, but I think it's well worth supporting for those few places where it's appropriate, since the cost seems to be low to me. For example, I could imagine a r-a refactoring for inlining foo() where sometimes the cleanest way for it to do that would be as a 'foo: { ... } block so that return xs in the inlined function can become break 'foo xs. Certainly most of the time I'd expect it try to rewrite it to if-elses or something to avoid needing that, but I'm confident there could still be cases where that's not the best option. (Honestly, with a time machine I think I'd put this in 1.0 and never have labelled loops at all -- that way the unusual thing of it would stand out even more, the grammar is that much simpler, there's no extra continue 'foo concept, and one can still do all the same things.)

I still weakly prefer return 'label for this over break 'label, but as I said above, but I'm fine with break since others seem to have much stronger opinions here that break is the better choice.

@rfcbot reviewed


Oh, one other thing as I was re-skimming the thread. I still consider it a good thing that this requires a block. This isn't the kind of thing that's so pervasive that eliding it is worth allowing. I do agree it would be possible to allow it in more places, but I really don't want that. (This reminds me of attributes, for example -- in general I think it's good to put attributes on blocks, so it's clear to what they apply, rather than support them everywhere and subject people to wondering what a + #[b] c * d means. And similarly, I really don't want to have to define what a + 'b: c * break 'b d means. Having a block gives the label an obvious lexical scope for resolution -- a + 'b: { c } * break 'b d is then clearly an error, and a + 'b: { c * break 'b d } clearly works, but is silly, giving a dead code lint.)

jonhoo, crumblingstatue, johnschug, zesterer, eddyb, est31, ModProg, DesmondWillowbrook, varkor, and Rageking8 reacted with thumbs up emojijyn514, ChayimFriedman2, slanterns, clarfonthey, zesterer, est31, rodrimati1992, and afetisov reacted with heart emoji All reactions

rfcbot

added the final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. label

6 days ago

rfcbot commented 6 days ago

bellThis is now entering its final comment period, as per the review above. bell

slanterns, crumblingstatue, Skgland, WaffleLapkin, est31, a1phyr, jhpratt, ModProg, jyn514, zesterer, and 2 more reacted with hooray emoji All reactions

rfcbot

removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label

6 days ago

Contributor

nikomatsakis commented 3 days ago

@rfcbot reviewed

Member

RalfJung commented yesterday

edited

label-break-value is disallowed within closures, generators, and async blocks:

To be clear, I assume it is allowed if both the break and the label are inside the closure/generator/async -- what is forbidden is crossing that boundary?

scottmcm, est31, and Skgland reacted with thumbs up emoji All reactions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Assignees

No one assigned

Labels
B-RFC-implemented Approved by a merged RFC and implemented. B-unstable Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. F-label_break_value `#![feature(label_break_value)]` final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. S-tracking-needs-summary It's hard to tell what's been done and what hasn't! Someone should do some investigation. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects

None yet

Milestone

No milestone

Development

No branches or pull requests

43 participants
and others

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK