Boucher: rustc_codegen_gcc can now bootstrap rustc

source link: https://lwn.net/Articles/889989/

[Posted April 1, 2022 by jake]
On his blog, Antoni Boucher updates the status of rustc_codegen_gcc, which "is a GCC codegen for rustc, meaning that it can be loaded by the existing rustc frontend, but benefits from GCC by having more architectures supported and having access to GCC’s optimizations". A significant milestone has been reached: "the GCC codegen has made enough progress to be able to compile rustc itself". For the Rust programming language, rustc is the standard compiler, so this work will eventually allow programs to be built for a number of architectures that are not supported by rustc. He also made progress beyond just building the compiler as he "was able to compile rustc using the GCC codegen and use the resulting rustc to compile a Hello World".

Boucher: rustc_codegen_gcc can now bootstrap rustc

Posted Apr 1, 2022 15:08 UTC (Fri) by flussence (subscriber, #85566) [Link]

If there's anyone keeping track of the details, I have a question: by how much does this shorten the bare-metal bootstrap chain needed to get to a modern Rust? I'm vaguely aware there was a project to do this for an entire OS, but afaik Rust previously needed to be built up via its own lineage.

Boucher: rustc_codegen_gcc can now bootstrap rustc

Posted Apr 1, 2022 16:03 UTC (Fri) by Gaelan (subscriber, #145108) [Link]

The previous state of the art was mrustc [0], which is implemented in C++ and is generally capable of compiling a rustc a few versions behind the latest; then you use that to compile a newer version of rustc, then use that version of rustc to compile an even newer rustc, and so on until you’re at the latest version.

[0]: https://github.com/thepowersgang/mrustc

Boucher: rustc_codegen_gcc can now bootstrap rustc

Posted Apr 1, 2022 16:09 UTC (Fri) by moltonel (guest, #45207) [Link]

This doesn't change the bootstrap chain, it "only" opens up the possibility to (cross)compile to platforms only supported by GCC. Note that platform support also needs to be added to rustc/stdlib itself, regardless of the backend used.

Bootstrapping is typically done by cross-compiling using rustc version N-1. If you want to bootstrap from a C compiler, you can use mrustc, which compiles rustc-1.54 as C source, and use that to build 1.55, 1.56, etc., up to the version you need. That mrustc chain gets shortened about once a year.

There's a long-term goal to be able to compile rustc N using rustc N-2 or older, but it'll be a while yet. There's also gccrs, which will use gcc's bootstrap machinery, but it's unclear how desirable it'll be as a Rust compiler.
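The mrustc chain described above can be sketched as a list of (builder, built) steps. This is only an illustration: the function name `bootstrap_chain` is hypothetical, and it assumes every step advances by exactly one minor version, as the N-1 rule implies.

```rust
/// Lists the (builder, built) steps needed to go from an mrustc-built seed
/// compiler to a target rustc, assuming each rustc 1.N must be built by 1.(N-1).
/// Arguments are minor version numbers (54 means rustc 1.54).
pub fn bootstrap_chain(seed_minor: u32, target_minor: u32) -> Vec<(String, String)> {
    let mut steps = Vec::new();
    if target_minor < seed_minor {
        // The target is older than the seed, so this chain cannot reach it.
        return steps;
    }
    // First step: mrustc (itself built by a C++ compiler) builds the seed rustc.
    steps.push(("mrustc".to_string(), format!("1.{}", seed_minor)));
    // Every later step uses the previous rustc to build the next one.
    for minor in (seed_minor + 1)..=target_minor {
        steps.push((format!("1.{}", minor - 1), format!("1.{}", minor)));
    }
    steps
}
```

Calling `bootstrap_chain(54, 57)` yields four steps, which is why moving the mrustc seed forward shortens the chain by one compiler build per minor version.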

Posted Apr 1, 2022 15:15 UTC (Fri) by artefact (guest, #154379) [Link]

>this work will eventually allow programs to be built for a number of architectures that are not supported by rustc

By rustc's LLVM codegen, which is the current default. rustc_codegen_gcc allows rustc to target architectures supported by gcc.

Boucher: rustc_codegen_gcc can now bootstrap rustc

Posted Apr 1, 2022 18:32 UTC (Fri) by jhoblitt (subscriber, #77733) [Link]

I understand it is the early days but have there been any comparisons [yet] of the asm generated between the llvm and gcc backends? The README includes the statement "A secondary goal is to check if using the gcc backend will provide any run-time speed improvement for the programs compiled using rustc.".

Boucher: rustc_codegen_gcc can now bootstrap rustc

Posted Apr 1, 2022 19:15 UTC (Fri) by david.a.wheeler (subscriber, #72896) [Link]

> I understand it is the early days but have there been any comparisons [yet] of the asm generated between the llvm and gcc backends?

I wouldn't bother checking those comparisons right now. They just got it working at *all*. GCC's back-end does a lot, but I expect that this new front-end will need to provide more information & be refined further to fully use the GCC back-end's optimizations.

Boucher: rustc_codegen_gcc can now bootstrap rustc

Posted Apr 1, 2022 22:10 UTC (Fri) by developer122 (subscriber, #152928) [Link]

What I'd like to know is about correctness. One recent Rust community discovery is that GCC and LLVM don't agree on how 128-bit numbers should be expressed in memory, for example. The stuff of nightmares: https://gankra.github.io/blah/c-isnt-a-language/

Unfortunately, after the code has gone through several rounds of GCC optimization, that may be hard to verify just by comparing binaries.
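One way to check a layout question like this directly, rather than by comparing binaries, is to ask the compiler what it uses. The sketch below is an assumption-laden illustration: `Padded` and `u128_layout` are hypothetical names, and the alignment reported depends on the target and the compiler version, so no particular value is claimed here.

```rust
use std::mem::{align_of, size_of};

/// A C-layout struct that puts a single byte in front of a 128-bit integer,
/// so the padding inserted before `value` exposes the alignment in use.
#[repr(C)]
pub struct Padded {
    pub tag: u8,
    pub value: u128,
}

/// Returns (size of u128, alignment of u128, size of `Padded`), all in bytes.
pub fn u128_layout() -> (usize, usize, usize) {
    (size_of::<u128>(), align_of::<u128>(), size_of::<Padded>())
}
```

Running the same query through each backend for the same target and comparing the three numbers would show whether they agree on where a 128-bit field sits inside a struct.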

Boucher: rustc_codegen_gcc can now bootstrap rustc

Posted Apr 2, 2022 12:06 UTC (Sat) by Vipketsh (guest, #134480) [Link]

The int128 thing isn't great but I would hazard a guess that it's not a big deal because it has not been widely used in ABIs. It has been found and, I would hope, the appropriate fix will be made to whichever compiler needs to be fixed.

Otherwise, I have to say that article seems to be more about generating hysteria against C than about making people aware of the issues. What that article paints as C's evil spewed onto the world (the ABI) is actually used by *every* compiled language, not because it originates in C, but because it has been tuned to the platforms involved. I shudder to imagine the mess if C, Rust, etc. all had their own private calling conventions and structure layouts.

None of the examples given are wrong, but they are painted in a way that suggests the difficulty in matching their calling convention is exclusively C's fault and that calling C code somehow cannot be avoided. It can; it just may be easier to, say, call into GTK (despite all the pains involved) than to write a new toolkit in your new language. The author's first example is possibly the worst: if you want to interact with an OS (do I/O) you need to match *some* convention -- the OS-defined one (system calls) or some wrapping thereof (e.g. C). Neither may be easy, but that is not C's fault. The rant about parsing C being difficult is about singling out a specific language and making it look as bad as possible. Firstly, there is no reason to have to interact with C (as mentioned above) and secondly, every single programming language is filled with quirks and is hard to parse.

The intended humor about a long target list is dishonest at best and manipulative at worst. ABIs are matched to the target architecture (ARM, x86, etc.) not only for performance reasons (e.g. endianness) but also out of necessity coming from inherent differences in how the machines operate (e.g. where return addresses are placed). It cannot be avoided, nor is it C's fault. Then there is the historical context that ABIs have been changed at various times to get some more convenient property (e.g. performance). There is no suggestion of what an alternative could be.

The example about opaque structs and symbol versioning is just plain wrong. The whole point of using an opaque struct in an ABI is so you can change the struct without changing the ABI. There is no reason that you have to version symbols and very few libraries do so. Furthermore, if you wish to maintain old and new versions of your APIs, there is no reason to hide them behind symbol versions -- just expose both and the user can select (at their leisure) which one to use, so the whole problem explained is side-stepped.
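The opaque-struct point can be illustrated with a minimal sketch. Everything here is hypothetical (`Widget`, `widget_new`, `widget_area`, `widget_free`); a real shared library would additionally export these under unmangled symbol names. Callers only ever hold a pointer, so the fields can change without any exported signature changing.

```rust
/// Hypothetical library type. Its fields are private, so callers cannot
/// depend on its size or layout; they only ever see a pointer to it.
pub struct Widget {
    width: u32,
    height: u32,
    // A later library version could add fields here without changing
    // the signature of any function below.
}

/// Allocates a widget and hands ownership to the caller as an opaque pointer.
pub extern "C" fn widget_new(width: u32, height: u32) -> *mut Widget {
    Box::into_raw(Box::new(Widget { width, height }))
}

/// Reads through the opaque pointer on the caller's behalf.
///
/// # Safety
/// `w` must come from `widget_new` and must not have been freed.
pub unsafe extern "C" fn widget_area(w: *const Widget) -> u64 {
    let w = unsafe { &*w };
    u64::from(w.width) * u64::from(w.height)
}

/// Releases a widget; a null pointer is ignored.
///
/// # Safety
/// `w` must be null or come from `widget_new`, and must not be used afterwards.
pub unsafe extern "C" fn widget_free(w: *mut Widget) {
    if !w.is_null() {
        drop(unsafe { Box::from_raw(w) });
    }
}
```

Because allocation, access, and release all go through the library's own functions, growing `Widget` affects only code inside the library.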

The minidump example tries to shoehorn a problem (fixed binary file layout) into a structure layout ABI problem. Whoever designed it made a choice to carefully give the structure the same layout as the file, but that is not the only way, nor would it be impossible to handle if the structure did not match the file layout. Pretty much every function from the Windows API could have been used as an example here, with the difference that the hysteria about reserved fields and structure size alignments could not be written.
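As an illustration of handling a fixed file layout without relying on structure layout, the sketch below decodes fields byte by byte. The header format is invented for this example (a little-endian `u32` followed by two little-endian `u16` values) and is not the real minidump format; `Header` and `parse_header` are hypothetical names.

```rust
/// In-memory form of a hypothetical 8-byte file header. The compiler is free
/// to lay this struct out however it likes; the file layout is handled below.
#[derive(Debug, PartialEq)]
pub struct Header {
    pub magic: u32,
    pub version: u16,
    pub stream_count: u16,
}

/// Decodes the header field by field from little-endian bytes.
/// Returns `None` if fewer than 8 bytes are available.
pub fn parse_header(bytes: &[u8]) -> Option<Header> {
    if bytes.len() < 8 {
        return None;
    }
    Some(Header {
        magic: u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]),
        version: u16::from_le_bytes([bytes[4], bytes[5]]),
        stream_count: u16::from_le_bytes([bytes[6], bytes[7]]),
    })
}
```

The cost of this approach is a little decoding code per field; in exchange, padding, alignment, and host endianness no longer matter to the file format.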
