Introduce `DynSend` and `DynSync` auto trait for parallel compiler #107586

Conversation

Contributor

part of parallel-rustc #101566

This PR introduces the DynSend / DynSync traits and the FromDyn / IntoDyn structures in rustc_data_structures::marker. FromDyn dynamically checks data structures for thread safety when switching to a parallel environment (such as calling par_for_each_in). The check happens only when -Z threads > 1, so it doesn't affect compile efficiency in single-threaded mode.
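To make the idea concrete, here is a minimal sketch of a runtime-checked wrapper on stable Rust. It is an illustration under stated assumptions, not the code in this PR: the PR uses auto traits (which need nightly), while the sketch uses plain unsafe marker traits with manual impls. The flag DYN_THREAD_SAFE, the helpers set_dyn_thread_safe / is_dyn_thread_safe, the type LocalOnly, and the function run_on_worker are all hypothetical names chosen for the example.

```rust
use std::marker::PhantomData;
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical global flag standing in for "-Z threads > 1".
static DYN_THREAD_SAFE: AtomicBool = AtomicBool::new(false);

pub fn set_dyn_thread_safe(on: bool) {
    DYN_THREAD_SAFE.store(on, Ordering::Relaxed);
}

pub fn is_dyn_thread_safe() -> bool {
    DYN_THREAD_SAFE.load(Ordering::Relaxed)
}

// Marker traits: "safe to send/share once the runtime switch is on".
// Plain unsafe traits here; the PR defines them as auto traits instead.
pub unsafe trait DynSend {}
pub unsafe trait DynSync {}

// A demo type that is statically !Send because of the raw-pointer marker,
// although it only holds a u32 and is harmless to move between threads.
pub struct LocalOnly {
    pub value: u32,
    _not_send: PhantomData<*const ()>,
}

impl LocalOnly {
    pub fn new(value: u32) -> Self {
        LocalOnly { value, _not_send: PhantomData }
    }
}

// Assumption for the sketch: we promise LocalOnly is fine to send.
unsafe impl DynSend for LocalOnly {}

// Wrapper that turns the dynamic promise into a static Send/Sync impl.
pub struct FromDyn<T>(T);

impl<T> FromDyn<T> {
    // The runtime check: the wrapper may only be built in parallel mode.
    pub fn from(val: T) -> Self {
        assert!(is_dyn_thread_safe(), "FromDyn used outside parallel mode");
        FromDyn(val)
    }

    pub fn into_inner(self) -> T {
        self.0
    }
}

unsafe impl<T: DynSend> Send for FromDyn<T> {}
unsafe impl<T: DynSync> Sync for FromDyn<T> {}

// Usage: move a DynSend value into a worker thread through the wrapper.
pub fn run_on_worker(value: u32) -> u32 {
    let wrapped = FromDyn::from(LocalOnly::new(value));
    std::thread::spawn(move || wrapped.into_inner().value)
        .join()
        .unwrap()
}
```

Calling run_on_worker before set_dyn_thread_safe(true) panics in the constructor, which is the point of the design: the thread-safety obligation is checked once at the boundary where code enters the parallel environment, rather than on every access.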

r? @cjgillot

FilipAndersson245 reacted with thumbs up emoji

rustbot

added A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

labels

Feb 2, 2023

compiler/rustc_interface/src/passes.rs

Outdated

		@@ -952,6 +958,151 @@ fn analysis(tcx: TyCtxt<'_>, (): ()) -> Result<()> {
		Ok(())
		}

		fn non_par_analysis(tcx: TyCtxt<'_>) -> Result<()> {

This code duplication is really unfortunate.

Contributor

Author

That's right. I haven't thought of an elegant way to write it yet, but I'll fix that soon.

compiler/rustc_middle/src/ty/context.rs

Outdated

		// runtime whether these non-shared data structures actually exist.
		unsafe impl<'tcx> DynSendSyncCheck for TyCtxt<'tcx> {
		#[inline]
		fn check_send_sync(&self) {

Can you use let GlobalCtxt { a, b, c } = self for exhaustiveness checking?
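For readers unfamiliar with the pattern, a small sketch of what exhaustive destructuring buys. The struct GlobalState, its fields, and the function describe are hypothetical stand-ins, not the real GlobalCtxt.

```rust
// Hypothetical struct standing in for a context type with several fields.
pub struct GlobalState {
    pub counter: u32,
    pub name: String,
    pub verbose: bool,
}

pub fn describe(state: &GlobalState) -> String {
    // No `..` in the pattern: if a field is later added to GlobalState,
    // this line stops compiling until the new field is handled here too.
    let GlobalState { counter, name, verbose } = state;
    format!("{name}:{counter}:{verbose}")
}
```

With a field-by-field check written as state.counter, state.name, and so on, a newly added field would be silently skipped; the destructuring form turns that omission into a compile error.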

Contributor

Author

Yea, it makes sense!

compiler/rustc_data_structures/src/sync.rs

Outdated

		// Only set by the `-Z threads` compile option
		pub unsafe fn set_parallel() {
		let p = SyncUnsafeCell::raw_get(&PARALLEL as *const _);
		*p = true;
		}

First of all, it would be great to have a doc comment here, especially given that this is an unsafe function. Second of all, at first glance it seems like this can be more simply written as *PARALLEL.get() = true, am I missing something? Lastly, is is_parallel hot? Can we use an AtomicUsize instead?

Even if it's a little hot, it's unlikely that an atomic integer will have a performance impact, since this is just reading from it.

Contributor

Author

Thanks for the review! PARALLEL will only be set once, so I want to take advantage of this to minimize the cost of reading it. with_context_opt might be hot, but I doubt it is necessary to check thread safety there. Apart from that, I think is_parallel() is not hot, since it is only used in relatively top-level logic to determine whether to switch to a parallel environment.

You should first benchmark it before going for the more unsafe variant. Atomics have no to minimal overhead depending on the exact use and ordering (which I think can be relaxed here because we don't need to sync any other writes?).

WaffleLapkin reacted with thumbs up emoji

Contributor

Author

OK, I changed it to AtomicBool instead. Can you help run a perf test? Thanks!
I think we can just use Relaxed, yea

WaffleLapkin reacted with heart emoji

rustbot

added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label

Feb 2, 2023

Contributor

Trying commit 4086eba with merge 62ba597...

Contributor

Try build successful - checks-actions
Build commit: 62ba597 (62ba597a41741055fcf131dcee8b691cc9445515)

Collaborator

Finished benchmarking commit (62ba597): comparison URL.

Overall result: regressions - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

Category	mean	range	count
Regressions (primary)	0.6%	[0.3%, 1.2%]	134
Regressions (secondary)	0.7%	[0.1%, 1.9%]	77
Improvements (primary)	-	-	0
Improvements (secondary)	-	-	0
All (primary)	0.6%	[0.3%, 1.2%]	134

Max RSS (memory usage)

Results

Cycles

Results

rustbot

added perf-regression Performance regressions

and removed S-waiting-on-perf Status: Waiting on a perf run to be completed.

labels

Feb 2, 2023

Contributor

Author

I think there may be two reasons for the regression:
1. Multiple if let Some(shared_data) = shared_data.as_ref() checks in graph.rs.
2. is_parallel() was hot and inefficient.
I'll try to fix them tomorrow.

Contributor

I've been thinking over how to best approach specialization. I think the dynamic dispatch entry point to rustc_query_impl would be a good place to branch. Using proof objects instead of GAT seems more flexible, at least for locks. I can write up some details on those later.

It seems like a good idea to land locks with a runtime switch first so there is an optimized baseline to compare with specialization. I suggest finishing my branch by extracting just the lock implementation and moving it to a new lock module under sync. You can also add a mode module with a global atomic with 3 states (uninit, on, off). Use a compare and swap to ensure it can only move from uninit to one of the other states. I'm not quite sure what's going on with the DynSendSyncCheck trait, but the manual listing of fields is a bit awkward. We can however literally copy Send and Sync from the standard library and I'd suggest doing so, placing them in a marker module under sync with a rename.
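A minimal sketch of the three-state switch described above, assuming an AtomicU8 and relaxed ordering. The type Mode, its methods, and the constants are hypothetical names for illustration; this is not the code that eventually landed.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

const UNINIT: u8 = 0;
const OFF: u8 = 1;
const ON: u8 = 2;

// Write-once mode flag with three states: uninit, off, on.
pub struct Mode(AtomicU8);

impl Mode {
    pub const fn new() -> Self {
        Mode(AtomicU8::new(UNINIT))
    }

    // Try to move from UNINIT to ON or OFF. Returns true if this call
    // performed the transition, false if the mode was already decided.
    pub fn set(&self, parallel: bool) -> bool {
        let new = if parallel { ON } else { OFF };
        self.0
            .compare_exchange(UNINIT, new, Ordering::Relaxed, Ordering::Relaxed)
            .is_ok()
    }

    // None while uninitialized, otherwise Some(is_parallel).
    pub fn get(&self) -> Option<bool> {
        match self.0.load(Ordering::Relaxed) {
            OFF => Some(false),
            ON => Some(true),
            _ => None,
        }
    }
}

// Because `new` is a const fn, the flag can live in a static.
pub static MODE: Mode = Mode::new();
```

The compare-and-swap guarantees that the state can only leave UNINIT once, so a later call with a different value cannot flip the mode after parts of the program have already observed it.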

Contributor

Author

In my local tests, is_parallel() appears to be the main cause of the regression. After I changed it to a const fn that always returns false, the regression was no longer visible. In addition, the performance of AtomicBool, SyncUnsafeCell, and static mut is not much different.

Also, as I guessed, with_context_opt is hot; calling is_parallel() in it looks like the main reason.

Contributor

Author

Can we run another perf? Thanks!

rustbot

added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label

Feb 3, 2023

Contributor

Trying commit b2d7910 with merge 2949dcd...

Contributor

Try build successful - checks-actions
Build commit: 2949dcd (2949dcde96d9502e79a5af27f252db8c97e8533e)

Contributor

The latest upstream changes (presumably #110243) made this pull request unmergeable. Please resolve the merge conflicts.

@SparrowLii if I read the comment thread correctly @cjgillot should approve this PR after one last rebase? Thanks!

@rustbot author

SparrowLii reacted with eyes emoji

rustbot

added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.

and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.

labels

May 3, 2023

Contributor

Author

@cjgillot Can it be merged now? : ) I don't have privileges so I need your help

cjgillot

added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.

and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.

labels

May 7, 2023

Contributor

@bors r+ rollup=never

Contributor

Commit d7e3e5b has been approved by cjgillot

It is now in the queue for this repository.

bors

added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.

labels

May 13, 2023

Contributor

Testing commit d7e3e5b with merge dd8ec9c...

Contributor

Test successful - checks-actions
Approved by: cjgillot
Pushing dd8ec9c to master...

bors

added the merged-by-bors This PR was explicitly merged by bors label

May 13, 2023

bors

merged commit dd8ec9c into

rust-lang:master

May 13, 2023

12 checks passed

Collaborator

Finished benchmarking commit (dd8ec9c): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 660.401s -> 660.339s (-0.01%)

Reviewers

Zoxc

Zoxc left review comments

cjgillot

cjgillot left review comments

nnethercote

nnethercote left review comments

bjorn3

bjorn3 left review comments

WaffleLapkin

WaffleLapkin left review comments

Nilstrieb

Nilstrieb left review comments

Assignees

cjgillot

Labels

A-query-system Area: The rustc query system (https://rustc-dev-guide.rust-lang.org/query.html) A-translation Area: Translation infrastructure, and migrating existing diagnostics to SessionDiagnostic merged-by-bors This PR was explicitly merged by bors S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Projects

None yet

Milestone

1.71.0

Development

Successfully merging this pull request may close these issues.

None yet

13 participants

Introduce `DynSend` and `DynSync` auto trait for parallel compiler by SparrowLii...
