New issue

Some perf optimizations and logging #87203

Conversation

Copy link

Contributor

jackh726 commented 7 days ago

Various bits of (potential) perf optimizations and some logging additions/changes pulled out from #85499

The only not extremely straightforward change is adding needs_normalization in trait::project. This is just a perf optimization to avoid fold through a type with only opaque types in UserFacing mode (as they aren't replaced).

This should be a simple PR for anyone to review, so I'm going to let highfive assign. But I'll go ahead and cc @nikomatsakis in case he has time today.

Copy link

Collaborator

rust-highfive commented 7 days ago

r? @matthewjasper

(rust-highfive has picked a reviewer for you, use r? to override)

Copy link

Contributor

Author

jackh726 commented 7 days ago

Copy link

Collaborator

rust-timer commented 7 days ago

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

Copy link

Contributor

bors commented 7 days ago

Trying commit 5e1b781 with merge e2e811f...

The only not extremely straightforward change is adding needs_normalization in trait::project. This is just a perf optimization to avoid fold through a type with only opaque types in UserFacing mode (as they aren't replaced).

Is there a chance you can extract it to a separate commit? It seems like that would help a targeted revert (if necessary) and review.

Copy link

Contributor

Author

jackh726 commented 7 days ago

The only not extremely straightforward change is adding needs_normalization in trait::project. This is just a perf optimization to avoid fold through a type with only opaque types in UserFacing mode (as they aren't replaced).

Is there a chance you can extract it to a separate commit? It seems like that would help a targeted revert (if necessary) and review.

Sure, I'll do that when the try build has finished.

Copy link

Contributor

bors commented 7 days ago

Try build successful - checks-actions
Build commit: e2e811f (e2e811f95353b3dc2ef55ab8201023faade67192)

Copy link

Collaborator

rust-timer commented 7 days ago

Copy link

Contributor

nikomatsakis commented 7 days ago

r=me once perf is complete

jackh726

force-pushed the

jackh726:logging

branch 2 times, most recently from 5a65ad1 to eb5b836

6 days ago

Copy link

Collaborator

rust-timer commented 6 days ago

Finished benchmarking try commit (e2e811f): comparison url.

Summary: This change led to significant mixed results in compiler performance.

Very large improvement in instruction counts (up to -21.5% on full builds of deeply-nested-async-check)
Moderate regression in instruction counts (up to 1.9% on incr-unchanged builds of deeply-nested-async-debug)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR led to changes in compiler perf.

Next Steps: If you can justify the regressions found in this perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

Copy link

Contributor

Author

jackh726 commented 6 days ago

Copy link

Collaborator

rust-timer commented 6 days ago

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

Copy link

Contributor

bors commented 6 days ago

Trying commit fe78232 with merge 22d8be5...

Copy link

Contributor

bors commented 6 days ago

Try build successful - checks-actions
Build commit: 22d8be5 (22d8be524c5ecb4aaef5b971d28e73641eeee4db)

Copy link

Collaborator

rust-timer commented 6 days ago

Copy link

Collaborator

rust-timer commented 6 days ago

Finished benchmarking try commit (22d8be5): comparison url.

Summary: This change led to significant mixed results in compiler performance.

Very large improvement in instruction counts (up to -21.4% on full builds of deeply-nested-async-check)
Moderate regression in instruction counts (up to 2.1% on incr-unchanged builds of deeply-nested-async-opt)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf +perf-regression

Copy link

Contributor

Author

jackh726 commented 6 days ago

Ok, so looks like inlining changes did help a bit. But still seeing a regression. So let's continue to search.

Copy link

Contributor

Author

jackh726 commented 6 days ago

Copy link

Collaborator

rust-timer commented 6 days ago

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

Copy link

Contributor

bors commented 6 days ago

Trying commit 4288833 with merge c935d20...

Copy link

Contributor

tmiasko commented 6 days ago

cachegrind diff from previous try build for ctfe-stress-4-check:

--------------------------------------------------------------------------------
Ir                    file:function
--------------------------------------------------------------------------------
444,251,381 (54.05%)  ???:rustc_mir::interpret::place::<impl rustc_mir::interpret::eval_context::InterpCx<M>>::eval_place
281,869,444 (34.29%)  ???:rustc_middle::ty::structural_impls::<impl rustc_middle::ty::fold::TypeFoldable for &rustc_middle::ty::TyS>::has_type_flags
 50,234,830 ( 6.11%)  ???:rustc_mir::interpret::place::<impl rustc_mir::interpret::eval_context::InterpCx<M>>::ref_to_mplace
 34,603,022 ( 4.21%)  ???:rustc_mir::interpret::step::<impl rustc_mir::interpret::eval_context::InterpCx<M>>::eval_rvalue_into_place

has_type_flags definitely should have been marked as inline. As far as I can see it was also partially responsible for regression in eval_place.

Copy link

Contributor

bors commented 6 days ago

Try build successful - checks-actions
Build commit: c935d20 (c935d2011581bef6e0a684430dcf22ad2a8ecd32)

Copy link

Collaborator

rust-timer commented 6 days ago

Copy link

Collaborator

rust-timer commented 6 days ago

Finished benchmarking try commit (c935d20): comparison url.

Summary: This change led to significant improvements in compiler performance.

Very large improvement in instruction counts (up to -21.7% on full builds of deeply-nested-async-check)

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Copy link

Contributor

Author

jackh726 commented 6 days ago

Okay readded the inlining change since that did seem good. Removed the has_type_flags for this PR, but I do that could be a nice win in a future PR, since it's a bit intersection vs a fold.

@bors r=nikomatsakis

Copy link

Contributor