0

Tracking Issue for `#[instruction_set]` attribute (RFC 2867) · Issue #74727 · ru...

 2 years ago
source link: https://github.com/rust-lang/rust/issues/74727
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Tracking Issue for #[instruction_set] attribute (RFC 2867) #74727

1 of 3 tasks

nikomatsakis opened this issue on Jul 24, 2020 · 31 comments

Comments

Contributor

nikomatsakis commented on Jul 24, 2020

edited by pnkfelix

This is a tracking issue for the RFC "#[instruction_set(...)] attribute" (rust-lang/rfcs#2867).

The feature gate for the issue is #![feature(isa_attribute)].

The lang-team liaison for this feature is @pnkfelix.

About tracking issues

Tracking issues are used to record the overall progress of implementation.
They are also uses as hubs connecting to other relevant issues, e.g., bugs or open design questions.
A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature.
Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Steps

Unresolved Questions

  • How do we ensure that instruction_set and inline assembly always interact correctly? This isn't an implementation blocker but needs to be resolved before Stabilization of the attribute. (Currently, LLVM will not inline a32 functions into t32 functions and vice versa, because they count as different code targets. However, this is not necessarily a guarantee from LLVM, it could just be the current implementation, so more investigation is needed.)

Implementation history

GrayJack reacted with heart emoji All reactions

nikomatsakis

added B-RFC-approved Approved by a merged RFC but not yet implemented. T-lang Relevant to the language team, which will review and decide on the PR/issue. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature.

labels

on Jul 24, 2020

jonas-schievink

added A-codegen Area: Code generation O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state

labels

on Jul 24, 2020

nikomatsakis

added the F-isa_attribute Related to #[instruction_set] attribute introduced by RFC 2867 label

on Jul 28, 2020

Contributor

xd009642 commented on Aug 22, 2020

I'm going to take a shot at implementing this RFC. I've already got an initial idea of where to start, adding it to CodegenFnAttrs in librustc_middle. From there I imagine it's just a case of getting it to the llvm codegen. I'll likely open a WIP PR and ask any questions I have there once I've made some progress

Contributor

xd009642 commented on Sep 1, 2020

@pnkfelix do you mind if I message you for guidance once I have a more concrete idea of what my questions are?

Member

jonas-schievink commented on Sep 2, 2020

edited

I'll write up some pointers.

  • The symbols to be used in the attribute syntax have to be added to symbol.rs.
  • The instruction_set attribute has to be added as a gated attribute in builtin_attrs.rs
  • A new instruction_set field needs to be added to CodegenFnAttrs, and a new InstructionSet enum needs to be added.
  • The codegen_fn_attrs query has to look at the attributes and compute the right value for CodegenFnAttrs. It also has to check:
    • That the instruction_set feature flag is enabled. (actually, adding the attribute might already do this)
    • That the current compilation target (available in tcx.sess.target.target) supports switching between ARM and Thumb mode (notably, this is the case for the thumbv4t-... target, but not for the thumbv6/7/8m-... targets; I'm not sure what the best way to codify this is, maybe by extending target specs).
    • That the attribute is syntactically correct.
    • ...and otherwise, emit an appropriate error (that should also be tested for).
  • The LLVM codegen backend has to be adjusted to apply the target feature attribute to the function, this happens in attributes::from_fn_attrs.
    Thumb mode is enabled on ARM targets by enabling the thumb-mode target feature. If thumb mode is to be disabled, the target feature has to be turned off by prefixing it with - instead of +. This can be probably be done here.
xd009642 reacted with thumbs up emoji All reactions

Contributor

xd009642 commented on Sep 2, 2020

edited

So I've added the symbols to symbol.rs, and instruction_set to CodegenFnAttrs added an InstructionSet enum, updated codegen_fn_attrs (although not all of the sub-bullet points, it's very happy path only right now).

I'll look at the target stuff next since that seems to be the biggest part I'm missing in codegen_fn_attrs and I guess talk to @Lokathor to see what thumb targets I need to do this for.

Thanks that fills in a lot of my open questions smile

Contributor

Lokathor commented on Sep 3, 2020

I don't know the best way to, from inside the compiler, tell if the target is an ARM arch target. And, even then, to tell if the default codegen will be a32 or t32.

I can say that, looking at the platform support page, all the tier2 and tier3 targets that start with arm are ARM with a32 as the default codegen, and then all the targets in tier2 and tier3 that start with thumb are ARM with t32 as the default codegen. For example, thumbv4t-none-eabi, thumbv6m-none-eabi, and so on.

Ideally this works for any target_arch="arm" target, even a custom target profile.

Contributor

ketsuban commented on Sep 3, 2020

If there's already a way to tell a target that starts with arm from one that starts with thumb (which tells you whether the default is -thumb_mode or +thumb_mode) I think I agree with @jonas-schievink that the best way forward is probably a new boolean field (has_thumb_interworking?) in the target spec.

Contributor

xd009642 commented on Sep 6, 2020

So I've attempted to do all the points laid out above to the best of my effort (minus the tests for appropriate errors as I had a feeling my errors are likely to change). There's probably a lot to change as this is my first time working on the compiler, but I think it's ready for some initial feedback - #76260 .

Current status from today's @rust-lang/lang meeting: updating to account for targets that only support one or another instruction set.

Contributor

xd009642 commented on Sep 28, 2020

@joshtriplett are there minutes for the meeting? Or should I just wait till the RFC or issue is updated and adjust the PR?

Contributor

Lokathor commented on Sep 28, 2020

@xd009642 I was at the meeting today and attempted to convey the status, though I may have done so inexpertly.

What I said was that:

  • Your PR is waiting for some T-compiler feedback about how to handle targets that can't switch their instruction set.
  • That probably the solution would be to add an extra flag about if the target allows instruction set switching.

I don't think that the RFC needs an update at all, or that your PR needs any changes other than whatever T-compiler asks for. Feel free to work out any blockers/concerns on the PR with them, and also feel free to comment anything more if I missed a detail.

Contributor

xd009642 commented on Sep 28, 2020

So far there's an additional has_thumb_interworking option on the targets, and only ones with that as true can have the instruction_set applied otherwise it's an unsupported target (false being the default). But yeah I need some CI feedback and potentially some feedback on the UI tests

Contributor

Lokathor commented on Oct 9, 2020

Update: PR merged, starting tomorrow you can use it in your nightly work, presumably we'll have a trial period of at least few releases before considering possible stabilization.

lifning, Skirmisher, memoryruins, Sh3Rm4n, and HeroicKatora reacted with hooray emoji All reactions

Is it possible for core library functions to get inlined despite not being specified as one ISA or the other?
My use case is wanting to possibly take advantage of instructions like a32's ldmia/stmia for some heavy lifting in a perf-critical bit of interrupt handler code living in IWRAM, but looking at rustc's emitted asm, i notice usages of things like core::array::<impl core::ops::index::Index<I> for [T; N]>::index still require branching into thumb_func's that live in ROM.

Contributor

ketsuban commented on Nov 25, 2020

The problem as I understand it is there's only one (T32) copy of the standard library and Rust just emits calls into it despite the function being A32, which LLVM then refuses to inline.

A solution could be for Cargo's unstable build-std attribute to be aware of has_thumb_interworking and include a copy of the standard library compiled for the other code target.

lifning commented on Nov 25, 2020

edited

There're definitely other things besides core that the lack of ability to inline kind of makes a drag too, such as the phantom_fields defined by https://docs.rs/crate/gba-proc-macro (which i think my current best bet is to just.. duplicate each function into foo() and a32_foo() versions and call the latter in my code, but I don't like the idea of forking every dependency crate and making duplicate a32 versions of every stub function i need to call)

For this case it would definitely benefit me to be able to compile arbitrary dependencies in both modes too, though I'm not sure what that would look like ideally. A lot of code here isn't currently delivering on the zero-cost abstraction promise, in other words :P

Contributor

Lokathor commented on Apr 7, 2021

edited by pnkfelix

Update 2021-04-06

Picking a target that supports using both a32 and t32 code, and that is tier2 so we can see it on goldbolt, please look at the following code: https://rust.godbolt.org/z/9v914d8fe https://rust.godbolt.org/z/91cPYq8bn. Note the build args.

  • -Copt-level=3: this is as release as it gets
  • --target armv5te-unknown-linux-gnueabi: all code (including core) is a32 by default, and t32 can be used.

The exact details of the ARM calling conventions aren't quite important here, but the thing to note within the assembly output is:

  • vol_read_a32 gets inlined. It uses the read_volatile method which is also inlined. You get a fully zero-cost ldr r4, [r0].
  • vol_read_t32 does not get inlined. It uses the read_volatile method which doesn't get inlined, you get a shockingly expensive double function call for what should be a single instruction.

Meanwhile let's look at if we can paper over the issue with inline assembly:

  • asm_read_a32 is inlined, but now it's two instructions instead of one. more on that in a moment.
  • asm_read_t32 is not inlined but (because it's not calling to an intrinisc) at least it's only a single function call and not a double function call.
  • A moment ago I said there's an extra move instruction. Depending on how you jigger the assembly you can get that move extra move either in the asm_read_a32 or the asm_read_t32. Assuming that you use the same assembly pattern for both functions, whichever pattern you pick you have to slow one of them down by a single move. That's not zero cost.

"Loka that's just one example though?" Well, sure, but it's a very important one because being able to use these zero-cost intrinsics is the sales pitch of rust. Here's another example: https://rust.godbolt.org/z/bG3bhTd4z:

  • core::iter::traits::iterator::Iterator::copied: does literally nothing at all.
  • core::slice::<impl [T]>::iter: a single add operation.
  • core::convert::num::<impl core::convert::From<u8> for u32>::from: this does a useless input&255 op as a way to turn u8 to u32 for reasons I can't even guess at. Note that if we remove the instruction_set attribute from this function the properly inlined version doesn't have the 255 anywhere in the code, so somehow llvm is just tripping itself over nothing at all.

Summary

  • Does It Work? Yes
  • Is It Zero Cost? Not even remotely
  • Is It Fit To Be Stable? Interesting question I'm so glad you asked. I think arguably yes. Like the one situation where people are using this (folks doing GBA stuff) is already stuck on Nightly until either the thumbv4t-none-eabi target becomes Tier 2 so that we don't have to use build-std, or build-std becomes stable. So, making this stable immediately isn't going to suddenly unblock Stable GBA developments. However... I don't think that there's a way to improve the developer interface for this attribute. The interface between the programmer and the compiler of "this is how you declare this quality for the function you're describing to the compiler" is so simple that there's not really any other way that it could work. The performance problems are "just" a codegen quality issue. That's a very load-bearing "just", but it's true. The problem is that rust relies a crazy amount on inlining, more than you ever normally think about, and so for the codegen here to get better we need to make inlining happen properly for functions using the non-default instruction set. Maybe more stuff gets inlined in MIR, maybe rustc looks at the call-graph of alternate instruction set functions and then for each thing it calls that doesn't have inline assembly and doesn't have a specific instruciton_set assigned it just emits LLVM IR for the function twice, once in a32 and once in t32, and then LLVM can do its thing at the cost of a little compile time. Whatever the fix ends up being, I don't think it would change the current developer interface for using the attribute in code.
  • Overall Current Status: It was mentioned in a lang team meeting that some things can be rated as +0 instead of +1, and right now this is a good example of a +0 unstable feature.
ketsuban reacted with thumbs up emoji All reactions

Contributor

repnop commented on Apr 7, 2021

one thing for consideration is having this not be its own stand-alone attribute applied to functions, but being an available option as a #[repr(instruction_set = "...")] instead since its affecting the representation of the function. food for thought :)

Contributor

Author

nikomatsakis commented on May 13, 2021

@Lokathor any update this month? Thanks for the update last month, we have a meeting next Wed and we'll talk it over.

Contributor

Lokathor commented on May 13, 2021

I did have a build that managed to create an illegal instruction in debug mode but not release mode, though that might be related to the incremental compilation bug that was recently resolved. I haven't had time to check again since that announcement.

Contributor

Author

nikomatsakis commented on Jul 7, 2021

Hey @Lokathor -- so we discussed this in our planning meeting today. Re-reading the April update you posted, it seems like this may indeed be ready for stabilization. Do you know whether anyone is using it in practice? In the blog post I publish today I can put out a "call for testing", as well.

Contributor

Lokathor commented on Jul 7, 2021

I've only heard from the five or so people working on the gba crate. We have working examples of using the attribute in the examples folder of the crate.

Contributor

Lokathor commented on Jun 21

edited

@nikomatsakis Having worked with this some more, I can say that things are ready for stabilization.

the code that you get out of this is pretty bad, but that's just a quality-of-life issue. The interface for how a user marks functions with this attribute wouldn't change at all even if the compiler gets smarter about codegen.

EDIT: particularly, this works with inline_asm, which was the major concern way back when.

joshtriplett

added the I-lang-nominated Indicates that an issue has been nominated for discussion during a lang team meeting. label

on Jun 21

Member

joshtriplett commented on Jun 21

We discussed this in today's @rust-lang/lang meeting. Kicking off stabilization of this seems fine. @pnkfelix, as the liaison, want to start a stabilization FCP if you concur?

joshtriplett

added the S-tracking-ready-to-stabilize This is ready to stabilize; it may need a stabilization report and a PR label

on Jul 13

Member

pnkfelix commented on Aug 2

Okay, I reviewed the update from last year and fixed the first godbolt link (asm! has stabilized since it had been posted).

Its an interesting question, whether to stabilize something that has clear performance issues (in terms of not meeting typical zero-cost expectations). But that is a quality-of-life issue; if users can find utility in this feature even in its current state, then that's an argument for stabilization.

I'll go ahead and fire off a stabilization FCP.

@rfcbot fcp merge

rfcbot commented on Aug 2

edited by cramertj

Team member @pnkfelix has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

rfcbot

added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it.

labels

on Aug 2

Contributor

xd009642 commented on Aug 2

First stabilisation report so let me know if I've done anything wrong pray

Request for Stabilization

Summary

The instruction set attribute provides a new function attribute for architectures which support more than one form of machine code. Currently, this is just implemented for CPUs in the ARM family which support a program switching between "ARM code" and "Thumb code". Although other instruction sets could be added in future as unstable features.

These two instruction sets appear as follows:

Arm code:

#[instruction_set(arm::a32)]
fn arm_code_fn() {
    // function compiled using arm code
}

Thumb code:

#[instruction_set(arm::t32)]
fn thumb_code_fn() {
    // function compiled using thumb code
}

Documentation

Currently, the only usage is homebrew GameBoy Advance development for ARMv4t and the RFC. Given the number of uses of this feature is currently limited to these people this is potentially fine.

Tests

Unresolved Questions

  • Code generation is poor, what work can improve this in future is unresolved (to my knowledge)

Mark-Simulacrum

removed the I-lang-nominated Indicates that an issue has been nominated for discussion during a lang team meeting. label

23 days ago

Contributor

Author

nikomatsakis commented 23 days ago

@rfcbot concern documentation-pr

We need an open PR adding this to the Rust reference!

Contributor

xd009642 commented 23 days ago

Contributor

Author

nikomatsakis commented 9 days ago

@rfcbot resolve documentation-pr

Thanks @xd009642 !

xd009642 reacted with thumbs up emoji All reactions

rfcbot

added the final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. label

8 days ago

rfcbot commented 8 days ago

bellThis is now entering its final comment period, as per the review above. bell

rfcbot

removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label

8 days ago

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Assignees

No one assigned

Labels
A-codegen Area: Code generation B-RFC-approved Approved by a merged RFC but not yet implemented. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. F-isa_attribute Related to #[instruction_set] attribute introduced by RFC 2867 final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. O-Arm Target: 32-bit Arm processors (armv6, armv7, thumb...), including 64-bit Arm in AArch32 state S-tracking-ready-to-stabilize This is ready to stabilize; it may need a stabilization report and a PR T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects

None yet

Milestone

No milestone

Development

No branches or pull requests

11 participants

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK