Introduce `DynSend` and `DynSync` auto trait for parallel compiler by SparrowLii...
source link: https://github.com/rust-lang/rust/pull/107586
Introduce `DynSend` and `DynSync` auto trait for parallel compiler #107586
Contributor
Part of parallel-rustc #101566.
This PR introduces the `DynSend` / `DynSync` traits and the `FromDyn` / `IntoDyn` structures in `rustc_data_structures::marker`. `FromDyn` dynamically checks data structures for thread safety when switching to a parallel environment (such as when calling `par_for_each_in`). The check happens only when `-Z threads > 1`, so it doesn't affect the compile efficiency of single-threaded mode.
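The idea in the description can be sketched on stable Rust as follows. This is only an illustration under stated assumptions, not the code from the PR: `DynSend` is written here as a manually implemented `unsafe` marker trait (the PR uses an auto trait, which is unstable), and `PARALLEL`, `Payload`, and `double_on_worker` are hypothetical names made up for the example.

```rust
use std::marker::PhantomData;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

// Hypothetical global mode flag; in the PR the mode comes from `-Z threads`.
static PARALLEL: AtomicBool = AtomicBool::new(false);

pub fn set_parallel() {
    PARALLEL.store(true, Ordering::Relaxed);
}

pub fn is_parallel() -> bool {
    PARALLEL.load(Ordering::Relaxed)
}

/// Stand-in for the `DynSend` auto trait: "safe to send once parallel mode is on".
/// Assumption: a manual marker trait replaces the unstable auto trait.
pub unsafe trait DynSend {}

/// Wrapper that turns the dynamic guarantee into a static `Send` impl.
pub struct FromDyn<T>(T);

impl<T: DynSend> FromDyn<T> {
    pub fn from(value: T) -> Self {
        // The runtime check: refuse to cross into parallel code otherwise.
        assert!(is_parallel(), "FromDyn used outside parallel mode");
        FromDyn(value)
    }

    pub fn into_inner(self) -> T {
        self.0
    }
}

// SAFETY (for this sketch): a `FromDyn` can only be built after the check above.
unsafe impl<T: DynSend> Send for FromDyn<T> {}

/// A type that is statically `!Send` because of the raw-pointer marker.
pub struct Payload {
    pub value: u32,
    _not_send: PhantomData<*const ()>,
}

impl Payload {
    pub fn new(value: u32) -> Self {
        Payload { value, _not_send: PhantomData }
    }
}

// SAFETY (for this sketch): `Payload` holds no thread-bound state.
unsafe impl DynSend for Payload {}

/// Moves a `Payload` to another thread through `FromDyn`.
pub fn double_on_worker(payload: Payload) -> u32 {
    let wrapped = FromDyn::from(payload);
    // The closure captures `wrapped` as a whole, so only `FromDyn<Payload>`
    // needs to be `Send`; the inner `Payload` is unwrapped on the worker.
    thread::spawn(move || wrapped.into_inner().value * 2)
        .join()
        .expect("worker thread panicked")
}
```

Calling `double_on_worker` before `set_parallel()` panics in `FromDyn::from`, which is the dynamic check; single-threaded code that never builds a `FromDyn` never pays for it.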
r? @cjgillot
added A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
labels
@@ -952,6 +958,151 @@ fn analysis(tcx: TyCtxt<'_>, (): ()) -> Result<()> {
    Ok(())
}

fn non_par_analysis(tcx: TyCtxt<'_>) -> Result<()> {
This code duplication is really unfortunate.
Contributor
Author
That's right. Haven't thought of an elegant way to write it yet, but I'll fix that soon
// runtime whether these non-shared data structures actually exist.
unsafe impl<'tcx> DynSendSyncCheck for TyCtxt<'tcx> {
    #[inline]
    fn check_send_sync(&self) {
Can you use `let GlobalCtxt { a, b, c } = self` for exhaustiveness checking?
Contributor
Author
Yea, it makes sense!
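The suggestion relies on a struct pattern without `..`, which must name every field. A minimal sketch of that technique, using a hypothetical three-field `GlobalCtxt`, a hand-written stand-in for the `DynSync` marker, and a hypothetical `assert_dyn_sync` helper (none of these match the real compiler types):

```rust
/// Hypothetical stand-in for a context struct with several fields.
pub struct GlobalCtxt {
    pub crate_name: String,
    pub node_ids: Vec<u32>,
    pub query_count: u64,
}

/// Stand-in for the `DynSync` marker (assumption: manual trait, not an auto trait).
pub unsafe trait DynSync {}
unsafe impl DynSync for String {}
unsafe impl DynSync for Vec<u32> {}
unsafe impl DynSync for u64 {}

/// Compiles only for types that carry the marker.
fn assert_dyn_sync<T: DynSync>(_value: &T) {}

/// Visits every field of the context exactly once.
pub fn check_send_sync(gcx: &GlobalCtxt) {
    // No `..` in the pattern: adding a field to `GlobalCtxt` without
    // listing it here becomes a compile error instead of a silent gap.
    let GlobalCtxt { crate_name, node_ids, query_count } = gcx;
    assert_dyn_sync(crate_name);
    assert_dyn_sync(node_ids);
    assert_dyn_sync(query_count);
}
```

With this shape, a newly added field cannot be forgotten in the check, because the destructuring stops compiling until the field is named.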
// Only set by the `-Z threads` compile option
pub unsafe fn set_parallel() {
    let p = SyncUnsafeCell::raw_get(&PARALLEL as *const _);
    *p = true;
}
First of all, it would be great to have a doc comment here, especially given that this is an `unsafe` function. Second of all, at first glance it seems like this can be more simply written as `*PARALLEL.get() = true`, am I missing something? Lastly, is `is_parallel` hot? Can we use an `AtomicUsize` instead?
Even if it's a little hot, it's unlikely that an atomic integer will have a performance impact, since this is just reading from it.
Contributor
Author
Thanks for the review! `PARALLEL` will only be set once, so I want to take advantage of this to minimize the cost of reading it. `with_context_opt` might be hot, but I doubt it is necessary to check thread safety there. Apart from that, I don't think `is_parallel()` is hot, since it is only used in relatively top-level logic to determine whether to switch to a parallel environment.
You should first benchmark it before going for the more unsafe variant. Atomics have little to no overhead, depending on the exact use and ordering (which I think can be relaxed here because we don't need to sync any other writes?).
Contributor
Author
OK, I changed it to `AtomicBool` instead. Can you help run a perf? Thanks!
I think we can just use Relaxed, yea
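A minimal sketch of the set-once flag discussed above, using `AtomicBool` with `Ordering::Relaxed`. The names `PARALLEL`, `set_parallel`, and `is_parallel` follow the thread; `workers_see_flag` is a hypothetical helper added only to exercise the flag, and the code is not taken from the PR.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

static PARALLEL: AtomicBool = AtomicBool::new(false);

/// Intended to be called once, before any worker thread starts.
pub fn set_parallel() {
    // Relaxed is enough for the flag itself: it does not publish any
    // other writes that readers would need to observe.
    PARALLEL.store(true, Ordering::Relaxed);
}

pub fn is_parallel() -> bool {
    PARALLEL.load(Ordering::Relaxed)
}

/// Sets the flag, then reads it from `n` freshly spawned threads.
pub fn workers_see_flag(n: usize) -> Vec<bool> {
    set_parallel();
    // `thread::spawn` orders the earlier store before each worker's load.
    let handles: Vec<_> = (0..n).map(|_| thread::spawn(is_parallel)).collect();
    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}
```

Compared with the `SyncUnsafeCell` version, this needs no `unsafe` at all, which is why benchmarking the atomic first is the lower-risk path.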
added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label
Contributor
Try build successful - checks-actions
Collaborator
Finished benchmarking commit (62ba597): comparison URL.
Overall result: regressions - ACTION NEEDED
Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.
Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @bors rollup=never
Instruction count: This is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage): Results
Cycles: Results
added perf-regression Performance regressions
and removed S-waiting-on-perf Status: Waiting on a perf run to be completed.
labels
Contributor
Author
I think there may be two reasons for the regression:
Contributor
I've been thinking over how to best approach specialization. I think the dynamic dispatch entry point to It seems like a good idea to land locks with a runtime switch first so there is an optimized baseline to compare with specialization. I suggest finishing my branch by extracting just the lock implementation and moving it to a new |
Contributor
Author
In my local test, Also as I guessed, |
Contributor
Author
Can we run another perf? Thanks!
added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label
Contributor
Try build successful - checks-actions
Contributor
The latest upstream changes (presumably #110243) made this pull request unmergeable. Please resolve the merge conflicts.
@SparrowLii if I read the comment thread correctly, @cjgillot should approve this PR after one last rebase? Thanks! @rustbot author
added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
labels
Contributor
Author
@cjgillot Can it be merged now? :) I don't have merge privileges, so I need your help.
added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
labels
Contributor
@bors r+ rollup=never
added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
labels
Contributor
Test successful - checks-actions
Collaborator
Finished benchmarking commit (dd8ec9c): comparison URL.
Overall result: no relevant changes - no action needed
@rustbot label: -perf-regression
Instruction count: This benchmark run did not return any relevant results for this metric.
Max RSS (memory usage): Results
Cycles: This benchmark run did not return any relevant results for this metric.
Binary size: This benchmark run did not return any relevant results for this metric.
Bootstrap: 660.401s -> 660.339s (-0.01%)