Emit simpler code from format_args by dtolnay · Pull Request #91359 · rust-lang/...
source link: https://github.com/rust-lang/rust/pull/91359
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Conversation
I made this PR so that cargo expand
dumps a less overwhelming amount of formatting-related code.
println!("rust")
Before:
{ ::std::io::_print(::core::fmt::Arguments::new_v1(&["rust\n"], &match () { _args => [], })); };
After:
{ ::std::io::_print(::core::fmt::Arguments::new_v1(&["rust\n"], &[])); };
println!("{}", x)
Before:
{ ::std::io::_print(::core::fmt::Arguments::new_v1( &["", "\n"], &match (&x,) { _args => [::core::fmt::ArgumentV1::new( _args.0, ::core::fmt::Display::fmt, )], }, )); };
After:
{ ::std::io::_print(::core::fmt::Arguments::new_v1( &["", "\n"], &[::core::fmt::ArgumentV1::new(&x, ::core::fmt::Display::fmt)], )); };
(rust-highfive has picked a reviewer for you, use r? to override)
// Right now there is a bug such that for the expression:
// foo(bar(&1))
// the lifetime of `1` doesn't outlast the call to `bar`, so it's not
// valid for the call to `foo`. To work around this all arguments to the
I am planning to make Clippy less sensitive to macro changes at some point in the near future (rust-lang/rust-clippy#7843). You might want to hold this until that lands.
Labeling blocked on rust-lang/rust-clippy#7843.
@rustbot label -S-waiting-on-review +S-blocked
The latest upstream changes (presumably #91945) made this pull request unmergeable. Please resolve the merge conflicts.
- Rebase to resolve conflict with #91881
Clippy should be out of your way after the next sync!
@rustbot label -S-blocked +S-waiting-on-review
Let's check if this has any implications for compile time.
@bors try @rust-timer queue
Awaiting bors try build completion.
@rustbot label: +S-waiting-on-perf
Thanks for the PR, @dtolnay! Looks like a really nice simplification. I left a couple of suggestions below.
Is there anything that could go subtly wrong with this? Like hygiene or borrow checking? Should we ask someone with more experience in macro expansion to take a look too?
// we can generate simpler code.
let nicely_ordered = fmt_arg_index_and_ty
.clone()
.is_sorted_by(|(i, _), (j, _)| (i < j).then_some(Ordering::Less));
Is that different from .is_sorted_by_key(|(i, _)| i)
?
Yeah, we need to catch permuted as well as repeated elements. Yours is true on 0 0
which would result in an incorrect expansion.
That's very subtle. It would be great if you could add a comment. And maybe find a simpler (if more verbose) way to express this. I personally cannot decode what's going on here in a reasonable amount of time.
array_windows
(#75027) may be clearer for this. The sequence has no indices out of order or repeated if: for every adjacent pair of elements, the first one's index is less than the second one's index.
// Figure out whether there are permuted or repeated elements. If not,
// we can generate simpler code.
//
// The sequence has no indices out of order or repeated if: for every
// adjacent pair of elements, the first one's index is less than the
// second one's index.
let nicely_ordered =
fmt_arg_index_and_ty.array_windows().all(|[(i, _i_ty), (j, _j_ty)]| i < j);
// The following Iterator<Item = (usize, &ArgumentType)> has one item
// per element of our output slice, identifying the index of which
// element of original_args it's passing, and that argument's type.
let fmt_arg_index_and_ty = self
Would it make sense to use a SmallVec for this?
SmallVec<(usize, &ArgumentType)>
instead of Iterator<Item = (usize, &ArgumentType)>
? I'm not sure why that would be better. Do you think any of the code below would get clearer? We only ever need to iterate it forward.
The sequence would be "cached" on the stack instead of evaluated twice. But I'm fine with keeping this as an iterator either.
Is there anything that could go subtly wrong with this? Like hygiene or borrow checking? Should we ask someone with more experience in macro expansion to take a look too?
From what I know about macros, I do not expect any difference in hygiene or borrow checking between the two expansions.
The only other thing to be mindful of is const promotion. The kind of thing in the following snippet -- even though both macros nominally appear equivalent, only one of them works when a promotion is required. However in the PR we are only going in the direction of can't promote ⟶ can promote so nothing would break, and conversely in format_args nothing relies on promotion anyway because the relevant argument of core::fmt::Arguments::new_v1
is args: &'a [ArgumentV1<'a>]
, not &'static [ArgumentV1<'a>]
, so there isn't any case of new code that would work after the PR that didn't work before the PR.
macro_rules! a { ($e:expr) => { match ($e,) { (e,) => &[e], } }; } macro_rules! b { ($e:expr) => { &[$e] }; } fn main() { let _: &'static [i32] = a!(1); // fail let _: &'static [i32] = b!(1); // ok }
error[E0716]: temporary value dropped while borrowed --> src/main.rs:4:22 | 4 | (e,) => &[e], | ^^- | | | | | temporary value is freed at the end of this statement | creates a temporary which is freed while still in use ... 16 | let _: &'static [i32] = a!(1); // fail | -------------- ----- in this macro invocation | | | type annotation requires that borrow lasts for `'static`
…, so there isn't any case of new code that would work after the PR that didn't work before the PR.
To elaborate slightly: if core::fmt::ArgumentV1::new
were made const fn
in the future then this PR would make a difference. It would mean that the following would work whereas it wouldn't before:
let _: core::fmt::Arguments<'static> = format_args!("{}", 1);
where the formatted values are all consts.
Today ArgumentV1::new is not const so the PR doesn't make any code newly compile.
Try build successful - checks-actions
Build commit: 3a6b090 (3a6b09020f099db8b66b764a71e2ef5608e6c4ea
)
Finished benchmarking commit (3a6b090): comparison url.
Summary: This change led to very large relevant mixed results in compiler performance.
- Moderate improvement in instruction counts (up to -2.0% on
full
builds ofcranelift-codegen
) - Very large regression in instruction counts (up to 7.4% on
full
builds ofkeccak
)
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged
along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.
@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression
@bors retry
Yield priority to 1.58.1.
Test timed out
@bors retry
The "x86_64-msvc-1" job timed out during "Testing rustc_macros stage1 (x86_64-pc-windows-msvc -> x86_64-pc-windows-msvc)". All other jobs succeeded. Gonna gamble that this is an infra glitch and not an msvc-specific issue with this PR. We'll find out after the next round.
@bors retry
Yielding priority to 1.58.1 security point release.
Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 0bcacb3 to master...
Finished benchmarking commit (0bcacb3): comparison url.
Summary: This change led to large relevant mixed results in compiler performance.
- Large improvement in instruction counts (up to -2.2% on
full
builds ofcranelift-codegen check
) - Large regression in instruction counts (up to 3.0% on
full
builds ofhtml5ever opt
)
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.
Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged
along with sufficient written justification. If you cannot justify the regressions please open an issue or create a new PR that fixes the regressions, add a comment linking to the newly created issue or PR, and then add the perf-regression-triaged
label to this PR.
@rustbot label: +perf-regression
the latest performance report nearly exactly matches the predicted performance and @Mark-Simulacrum had a great analysis for where that came from.
@rustbot label: +perf-regression-triaged
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK