Don't allocate on SimplifyCfg/Locals/Const on every MIR pass by miguelraz · Pull...
source link: https://github.com/rust-lang/rust/pull/110477
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Conversation
Contributor
Hey! 👋🏾 This is a first PR attempt to see if I could speed up some rustc internals.
Thought process:
pub struct SimplifyCfg {
label: String,
}
in compiler/src/rustc_mir_transform/simplify.rs fires multiple times per MIR analysis. This means that a likely string allocation is happening in each of these runs, which may add up, as they are not being lazily allocated or cached in between the different passes.
...yes, I know that adding a global static array is probably not the future-proof solution, but I wanted to lob this now as a proof of concept to see if it's worth shaving off a few cycles and then making more robust.
Collaborator
(rustbot has picked a reviewer for you, use r? to override) |
added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
labels
Collaborator
Some changes occurred to MIR optimizations cc @rust-lang/wg-mir-opt |
changed the title
try interning SimplifyCfg strings
Don't allocate on SimplifyCfg/Locals/Const on every MIR pass
Member
Do you have any evidence to suggest that these are expensive operations? I don't think this code is really worth the extra complication and possibly introducing new panic edges to the compiler just to avoid some string formatting. |
This comment has been minimized.
added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label
Contributor
Try build successful - checks-actions |
This comment has been minimized.
Collaborator
Finished benchmarking commit (b89ba5d): comparison URL. Overall result: improvements - no action neededBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)Results CyclesResults |
removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label
Apparently keccak and codegen-cranelift have been particularly noisy recently, so I've been advised to ignore those. As for the other positive perf results, I'll have to take a closer look at them to see if they're legit or just noise as well. (Or maybe I'll queue another perf run in the morning and see if these perf results stick?) |
Contributor
comment worth noting from a sidebar conversation about (current behavior, as it relates to this PR):
|
Member
Anyways, @miguelraz, I did some thinking. I think the right approach for this would be to make Something like:
|
Contributor
The improvement is probably legit. I've seen major regressions in the past resulting from over-calling |
Contributor
This would also reduce the size from the size of the string ref (pointer and length) to the size of the enum, which will save about 7~15 bytes per instance of this struct that is in memory at any given moment. Not the most important win compared to allocation overhead, but y'know, everything counts in large amounts. |
Member
cachegrind results from libc Debug Full:
lots of noise, but there are some memcpys and core::fmt, so this looks like a legit improvement. Awesome! |
Member
I quickly profiled a few other allocation sites inside |
Member
Contributor
Even simpler: we can make |
Contributor
Author
@cjgillot yes, I just realized that and push that very change, thanks for the tip! |
@miguelraz can you squash this into one commit? Other than that, this PR looks good to go.
Member
@bors r=compiler-errors |
added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
labels
pub fn new(label: &str) -> Self { |
||
SimplifyConstCondition { label: format!("SimplifyConstCondition-{}", label) } |
||
} |
||
pub enum SimplifyConstConditionPassName { |
(nit) no need to block the PR on this, but if you touch this code again in the future could you rename this to just SimplifyConstCondition
? That would make it consistent with all the other ones
Contributor
Author
Thanks for this, finally got around to this fix.
#110657
Contributor
Test successful - checks-actions |
Collaborator
Finished benchmarking commit (9e7f72c): comparison URL. Overall result: improvements - no action needed@rustbot label: -perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)Results CyclesResults |
Contributor
Author
For the record, this PR was a revived attempt of #108026. |
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet
Successfully merging this pull request may close these issues.
None yet
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK