7

proc-macro: Stop wrapping `ident` matchers into groups by petrochenkov · Pull Re...

 2 years ago
source link: https://github.com/rust-lang/rust/pull/92472
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Conversation

Copy link

Contributor

@petrochenkov petrochenkov commented on Jan 1

ident is always a single token and can be treated in the same way as tt.
r? @Aaron1011

Copy link

Contributor

Author

petrochenkov commented on Jan 1

@bors try

Copy link

Contributor

bors commented on Jan 1

hourglass Trying commit c179945 with merge ef8c760...

Copy link

Contributor

bors commented on Jan 1

sunny Try build successful - checks-actions
Build commit: ef8c760 (ef8c760c4c3d93b561a5b431502e92f4b4773aca)

Copy link

Contributor

Author

petrochenkov commented on Jan 1

@craterbot check

Copy link

Collaborator

craterbot commented on Jan 1

ok_hand Experiment pr-92472 created and queued.
robot Automatically detected try build ef8c760
mag You can check out the queue and this experiment's details.

information_sourceCrater is a tool to run experiments across parts of the Rust ecosystem. Learn more

Copy link

Collaborator

craterbot commented on Jan 1

construction Experiment pr-92472 is now running

information_sourceCrater is a tool to run experiments across parts of the Rust ecosystem. Learn more

Copy link

Collaborator

craterbot commented on Jan 4

tada Experiment pr-92472 is completed!
bar_chart 44 regressed and 16 fixed (203512 total)
newspaperOpen the full report.

warning If you notice any spurious failure please add them to the blacklist!
information_sourceCrater is a tool to run experiments across parts of the Rust ecosystem. Learn more

Copy link

Member

Aaron1011 commented on Jan 5

This seems analogous to automatically performing cfg-expansion before invoking attribute macros. It's "usually" the right thing to do, but it hides information (#[cfg] attributes, the Span of the group) that some macros might want to use.

Copy link

Contributor

Author

petrochenkov commented on Jan 7

@Aaron1011
See #91166 for some background for this PR.

I think groups is a way to preserve parsing priorities first of all, not to maintain the macro_rules matcher substitution stacks.
(From that point of view ident doesn't need to be grouped, it's not affected by parsing priorities as a single token.)
Otherwise we would wrap tokens substituted with tt too.
Moreover, even ident doesn't preserve the whole substitution stack, only the last substitution.

I think it would be useful to maintain these stacks in the compiler somehow, including for tts, for which they are currently not tracked at all.
That would be useful for spans, because the interpolated token spans are too damn useful for joining them with neighboring spans, like getting a reasonable span for the $a + b expression, which currently happens if $a is an ident, but not if $a is a tt.
I just don't think groups is the right tool for that.

Copy link

Contributor

Author

petrochenkov commented on Jan 7

The main source of regressions here is rhai which generates a definition and use of a variable with different contexts (one is plain Span::call_site() and another is inherited from a part of input), but those two contexts happen to coincide if ident matchers are wrapped (the group is used as that "part of input").
So, it's not about group parsing, just an accidental hygiene issue.
(I'll look at other regressions later.)

Copy link

Contributor

Author

petrochenkov commented on Jan 8

Besides rhai there are only 3 non-spurious root regressions:

  • ruly-4.1.0 and displaydoc-lite-0.1.3 - the proc macros are internal, do manual parsing, and expect a fixed input, which is generated by macro_rules and happened to include groups at the time of writing. The macros panic if the input differ from the expected.
  • MNTRA.compiler_project.6bd7adf91e1532d3ebac49efef132b4d456cc757 - some kind of a hygiene context mismatch similar to rhai, I didn't figure out the precise details.

So, I think the conclusion is that we should be able make this change, if we want it.

Copy link

Contributor

Author

petrochenkov commented on Jan 8

FWIW, among those "16 fixed" there are some legitimate fixes for regressions caused by (not so) recent proc macro bugfixes that resulted in new groups being produced :)

Copy link

Member

dtolnay commented on Jan 8

I agree — I think we can go forward with this. Great!

Copy link

Member

JohnCSimon commented on Jan 23

triage: looks like this is ready for review

@rustbot label: -S-waiting-on-author +S-waiting-on-review

Copy link

Contributor

Author

petrochenkov commented on Jan 27

This is an observable language and library change that needs to go through some team.
I'm not sure which exactly, so I'll nominate it for both, please remove the nomination if you think it's not yours responsibility.

Code matched by macro_rules matchers like ty or expr (and using TokenKind::Interpolated internally in rustc) is converted to token streams when passed to proc macros.
For majority of matchers, with exception of tt, such token streams are wrapped into a TokenTree::Group with Delimiter::None to preserve parsing priorities (the typical example being $expr * 3 where $expr is 1 + 2, which is tokenized as ⟪ 1 + 2 ⟫ * 3 instead of 1 + 2 * 3).

This PR removes this wrapping from the ident matcher which is always a single token, so it's not affected by any parsing priorities and is very similar to tt.

  • Before: $ident + 1 => ⟪ i ⟫ + 1.
  • After: $ident + 1 => i + 1.

dtolnay

changed the title [experiment] proc-macro: Stop wrapping ident matchers into groups

proc-macro: Stop wrapping ident matchers into groups

on Feb 8

Copy link

Member

dtolnay commented on Feb 8

@rust-lang/lang, @rust-lang/libs-api:
@rfcbot fcp merge

See petrochenkov's summary of the change in #92472 (comment).

We discussed this change in the libs-api meeting on February 2 and I believe those present had no objections at the time.

Copy link

rfcbot commented on Feb 8

edited

Team member @dtolnay has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

Copy link

Member

dtolnay commented on Feb 8

FCP for lang is about the vague question "what happens if identifiers are not one token in the future". In the libs-api meeting we felt it would be okay to wrap multiple-token identifiers in a None group even while continuing to have no None group around identifiers which are a single token, which will always exist.

Copy link

Member

Aaron1011 commented on Feb 8

edited

Note that this will remove some Span information that proc-macros could have been using: #92472 (comment). I don't know of any case where this is used in practice (and as @petrochenkov said, groups are probably the wrong tool for making this information available) - the most obvious one would be setting token spans to 'nicer' locations (e.g. a location within a macro_rules definition). However, I think everyone should be aware that this has a (small) chance of removing functionality that a proc-macro is actually relying on.

Copy link

Contributor

nikomatsakis commented on Feb 9

@rustbot label +relnotes

This clearly needs a note in the release notes -- can someone knowledge (@dtolnay or @petrochenkov, perhaps?) draft one?

Copy link

Contributor

nikomatsakis commented on Feb 9

@rfcbot concern documentation

@petrochenkov are the current tokenization rules documented somewhere (e.g., the rust reference?). Some quick searching did not find them, if so.

I am happy to make this change but I'd like the current tokenization rules to be documented in the reference (or in some location linked to by the reference).

Copy link

Contributor

Author

petrochenkov commented 29 days ago

edited

@nikomatsakis

are the current tokenization rules documented somewhere (e.g., the rust reference?)

Probably not.
Not so long ago pretty-printing was often used for tokenization, so there was not many rules besides "try as best as we can".
I'll try to document this in the reference.

Copy link

Member

scottmcm commented 29 days ago

The references says, regarding macro_rules https://doc.rust-lang.org/reference/macros-by-example.html#transcribing

The ident, lifetime, and tt fragment types are an exception, and can be matched by literal tokens.

That implies to me that something like this behaviour has basically already happened for MBE, and thus it makes sense to do for proc macros too.

(Please correct me if I'm understanding this wrong. I'm pretty new to looking at macro details.)

@rfcbot reviewed

Copy link

Contributor

nikomatsakis commented 29 days ago

@petrochenkov that would be awesome -- it doesn't have to be the world's best docs, if it has ssome "TODO" items seems ok, but having some docs with key concerns would be great.

Copy link

Contributor

nikomatsakis commented 24 days ago

@rfcbot reviewed

Copy link

Contributor

nikomatsakis commented 10 days ago

@rfcbot resolve documentation

Copy link

rfcbot commented 10 days ago

bellThis is now entering its final comment period, as per the review above. bell

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Reviewers

No reviews

Projects

None yet

Milestone

No milestone

Linked issues

Successfully merging this pull request may close these issues.

None yet

12 participants

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK