RFC: Syntax for embedding cargo-script manifests by epage · Pull Request #3503 ·...
source link: https://github.com/rust-lang/rfcs/pull/3503
RFC: Syntax for embedding cargo-script manifests #3503
Conversation
This is for the T-lang side of #3502
Example:
#!/usr/bin/env cargo
```cargo
[dependencies]
clap = { version = "4.2", features = ["derive"] }
```
use clap::Parser;

#[derive(Parser, Debug)]
#[clap(version)]
struct Args {
    #[clap(short, long, help = "Path to config")]
    config: Option<std::path::PathBuf>,
}

fn main() {
    let args = Args::parse();
    println!("{:?}", args);
}
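
To make the proposed layout concrete, here is a minimal sketch of how an external tool might split such a file into an infostring, a manifest, and the remaining Rust code. The names `Frontmatter` and `split_frontmatter` are hypothetical and are not taken from cargo or rustc; the sketch also assumes the opening fence sits on the line directly after the optional shebang, which is stricter than a real implementation would likely be.

```rust
/// Result of splitting a script into its embedded manifest and Rust code.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq)]
pub struct Frontmatter<'a> {
    /// Text after the opening backticks, e.g. "cargo".
    pub info: &'a str,
    /// The lines between the two fences, verbatim.
    pub manifest: &'a str,
    /// Everything after the closing fence.
    pub rest: &'a str,
}

/// Hypothetical helper: returns `None` when no frontmatter block is found.
pub fn split_frontmatter(src: &str) -> Option<Frontmatter<'_>> {
    // Skip an optional shebang line (but not an inner attribute like `#![...]`).
    let mut body = src;
    if body.starts_with("#!") && !body.starts_with("#![") {
        let end = body.find('\n').map(|i| i + 1).unwrap_or(body.len());
        body = &body[end..];
    }

    // Assumption: the opening fence is the very next line.
    let (open, after_open) = body.split_once('\n')?;
    let open = open.trim_end();
    let ticks = open.len() - open.trim_start_matches('`').len();
    if ticks < 3 {
        return None;
    }
    let fence = &open[..ticks];
    let info = open[ticks..].trim();

    // The closing fence is a line made of exactly the same run of backticks.
    let mut offset = 0;
    for line in after_open.split_inclusive('\n') {
        if line.trim_end() == fence {
            return Some(Frontmatter {
                info,
                manifest: &after_open[..offset],
                rest: &after_open[offset + line.len()..],
            });
        }
        offset += line.len();
    }
    None
}
```

Returning the infostring separately means the caller, not the splitter, decides whether `cargo` is the only accepted value, which is the restriction the thread goes on to debate.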
Label added: T-lang (relevant to the language subteam, which will review and decide on the RFC).
Member
In PL/Rust (a Rust subset that works as a postgres procedural language handler) we use a somewhat hacky syntax like (see https://github.com/tcdi/plrust/blob/main/doc/src/dependencies.md for example source)
I greatly prefer the approach in this RFC (and will likely push PL/Rust to transition to it if the RFC is accepted), but it's probably worth noting as prior art.
This comment was marked as resolved.
Member
If you think this RFC through a little bit further, then you could generalize this to make rustc ignore all ``` delimited blocks, to say that rustc should just ignore all these sections and leave them to other tools, similarly to how it already does for the shebang. And then you'd see how similar triple backtick blocks are to
IMO it's bad style to add cargo-specific extensions to Rust's syntax. Just say that ``` delimited blocks are ignored by rustc and that parser implementations are suggested to add them.
Contributor
Author
Thanks! I've added this as prior art and would love more interoperability (one of the stated motivations for the cargo script RFC).
Outdated
Comment on lines
306 to 316
### Alternative 7: Extended Shebang

````rust
#!/usr/bin/env cargo
# ```cargo
# [dependencies]
# foo = "1.2.3"
# ```

fn main() {}
````
Contributor
Author
@bstrie (moved here for context / easier to follow)
I would like to consider alternative 7, the extended shebang. I don't think the backticks and the redundant cargo specifier should be necessary, producing this:
#!/usr/bin/env cargo
# [dependencies]
# foo = "1.2.3"
fn main() {}
I think this looks quite good. It's fewer lines than the proposed syntax, and mirrors both the shebang syntax and the attribute syntax.
I understand that people might want this to be generalizable/extensible, but this could suffice for now and any discussion about how to generalize the "info" portion can be left for a future discussion. If people think that it's important to make it generalizable right now, then I'd be interested to hear some concrete use cases.
EDIT: although I suppose an unmentioned downside of this is that # [dependencies] might look a bit too much like an attribute.
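
The ambiguity mentioned in the EDIT can be sketched in code. The function below is a hypothetical reader for the fence-free variant (it is not part of the RFC or of cargo): it collects every `#`-prefixed line that follows the shebang. Under that rule, an ordinary attribute on the first Rust item is indistinguishable from a manifest line.

```rust
/// Hypothetical reader for the fence-free "extended shebang" variant:
/// gathers the `#`-prefixed lines that directly follow the shebang.
pub fn extended_shebang_manifest(src: &str) -> String {
    let mut lines = src.lines().peekable();

    // Drop the shebang itself, if present (simplified detection).
    let has_shebang = lines
        .peek()
        .map_or(false, |l| l.starts_with("#!") && !l.starts_with("#!["));
    if has_shebang {
        lines.next();
    }

    let mut manifest = String::new();
    for line in lines {
        match line.strip_prefix('#') {
            // Remove one optional space after the `#`, keep the rest verbatim.
            Some(content) => {
                manifest.push_str(content.strip_prefix(' ').unwrap_or(content));
                manifest.push('\n');
            }
            // First line without a leading `#` ends the manifest.
            None => break,
        }
    }
    manifest
}
```

For a script whose first item is `#[derive(Debug)] struct S;` placed directly after the manifest lines, this reader would swallow the `#[derive(Debug)]` line as manifest text. A delimiter around the block is one way to avoid having to guess where the manifest ends.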
Contributor
Author
As you mentioned, there is the syntax ambiguity and backticks let us "escape" this block of # lines.
Yes, the cargo specifier is redundant to the reader but each use of cargo is for a different purpose
- One is for execution
- One is for parsing
I've noted in the Future Possibilities that we could relax the requirement on having cargo in the infostring in the future. I would lean towards defaulting to cargo rather than parsing the shebang because shebang parsing is messy.
We also wanted to leave the door open (slightly) for adding additional frontmatter blocks, like if we decide to embed lockfiles.
Contributor
Author
Is this meant to positively suggest something or to point out a slippery slope? I'm not too sure of the intent.
I'm not seeing how this is different from the suggested future state. We are starting with it being locked to cargo initially as we work out the design / usage and then can remove that restriction, which is noted in Future Possibilities.
Outdated
[drawbacks]: #drawbacks

- A new concept for Rust syntax, adding to overall cognitive load
- Requires people escape markdown code fences with an extra backtick which they are likely not used to doing (or aware even exists)
Contributor
Does this refer to markdown inside multiline TOML strings? It's not really clear. Do any of Cargo's manifest fields even support markdown syntax?
Contributor
It may be referring to general usage outside of Rust source files, for example if I wanted to express this syntax here in this comment I would have to escape the backticks somehow (or indent by four spaces, but that doesn't allow syntax highlighting).
Contributor
Ooohh I see. In that case there's also the option of using the (older, I think?) syntax of prefixing the .rs snippet with four spaces instead of ````, though I'm not sure if it's as well-known as code fences, and it has the downside of not supporting language tags.
Contributor
Author
394387b adds some context. This is about sharing snippets over GitHub or Zulip. Since you are putting a code fence inside of a code fence, the outer one needs to use 4 backticks.
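
The "4 backticks" rule generalizes: an outer fence that is longer than any run of backticks inside the snippet cannot be closed by accident. Below is a small sketch of that calculation; the function names are made up for illustration, and the behavior is assumed to match CommonMark-style renderers, which the thread notes not every tool follows.

```rust
/// Length of the longest run of consecutive backticks in `text`.
pub fn longest_backtick_run(text: &str) -> usize {
    let mut longest = 0usize;
    let mut current = 0usize;
    for ch in text.chars() {
        if ch == '`' {
            current += 1;
            longest = longest.max(current);
        } else {
            current = 0;
        }
    }
    longest
}

/// Wrap `snippet` in a fence that is one backtick longer than anything
/// inside it, and never shorter than the usual three.
pub fn wrap_in_fence(snippet: &str, lang: &str) -> String {
    let ticks = "`".repeat(longest_backtick_run(snippet).max(2) + 1);
    let body = snippet.trim_end_matches('\n');
    format!("{}{}\n{}\n{}", ticks, lang, body, ticks)
}
```

A script with no embedded manifest still gets the familiar three-backtick fence, while one containing a three-backtick `cargo` block is wrapped in four. The concern raised later in the thread is that people pasting by hand will not perform this step.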
Outdated
- Parsers are available to make this work (e.g. `syn`)

Downsides
- The `cargo` macro would need to come from somewhere (`std`?) which means it is taking on `cargo`-specific knowledge
Contributor
For the macro approach, I don't think it would be necessary to embed any Cargo-specific knowledge in std. In every other approach the data here is stored in a glorified comment, which means we're fine if it gets thrown away as far as Rust is concerned. The macro here could simply expand to nothing, and trust that other tooling will parse the macro body as needed (which is easier than it sounds, since the macro body should just be treated as raw tokens rather than anything that needs parsing). Rather than calling it cargo, call it build! or meta! or something. Although I suppose the fact that it will still need to lex to Rust tokens might be limiting compared to a string or a comment, unless we want to make it magical.
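
A rough sketch of the idea in this comment, using an ordinary `macro_rules!` definition rather than anything built into the compiler. The macro names `meta` and `meta_tokens` and the function `manifest_as_tokens` are illustrative assumptions only. The sketch shows a macro that discards its body, plus a companion that exposes what the compiler actually received: a token stream, not the original text.

```rust
/// Hypothetical `meta!` macro: accepts any token stream and expands to nothing.
macro_rules! meta {
    ($($tokens:tt)*) => {};
}

// This manifest happens to lex as Rust tokens, so it compiles and vanishes.
// A TOML literal string such as '4.2' would be rejected by the Rust lexer,
// which is the limitation the comment above alludes to.
meta! {
    [dependencies]
    clap = { version = "4.2", features = ["derive"] }
}

/// Companion macro for illustration: turns the received tokens back into text.
macro_rules! meta_tokens {
    ($($tokens:tt)*) => { stringify!($($tokens)*) };
}

/// Returns the manifest as the compiler saw it. Line breaks are not part of
/// the token stream, so the exact spacing of the result is not guaranteed.
pub fn manifest_as_tokens() -> &'static str {
    meta_tokens!([dependencies] clap = "4.2")
}
```

Because TOML is line-oriented, a tool reading the macro body would probably need the original source text rather than the token stream, which is part of why a string or comment-like carrier is less restrictive.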
Contributor
Author
I added mention of meta! in 42077b7. I don't bother exploring it due to the other problems with macros.
# Future possibilities

[future-possibilities]: #future-possibilities

- Treat `cargo` as the default infostring
What is the main reason for not including this in this proposal?
Contributor
Author
I clarified this a little in 842f722
Contributor
Author
Basically, we want to start with the absolute minimal approach and see what we feel needs to be relaxed from there (which is backwards compatible) rather than make a lot of assumptions and then regret them.
This comment was marked as resolved.
Contributor
Author
Member
To the end user? Yes. But the "end user" view of a language is broken all the time. Say when |
- When discussing with a Rust crash course teacher, it was felt their students would have a hard time learning to write these manifests from scratch
- Unpredictable location (both the doc comment and the cargo code block within it)
- From talking to a teacher, users are more forgiving of not understanding the details for structure data in an unstructured format (doc comments / comments) but something that looks meaningful, they will want to understand it all requiring dealing with all of the concepts
- The attribute approach requires explaining multiple "advanced" topics: One teacher doesn't get to teaching any attributes until the second level in his crash course series and two teachers have found it difficult to teach people raw strings
I don't fully buy into this objection and the one above: not everything has to be explained as a pre-requisite to it being used in a course.
Let's take println! as an example: the Rust book introduces println! in the first chapter but it doesn't provide any macro discussion beyond "println! is a macro, we'll talk about that later", where "later" is in chapter 19 (!).
I don't see what makes a cargo attribute any different here from println! or a #[derive(Debug)].
Elaborating further: aren't we over-indexing on language newcomers here?
Contributor
Author
Let's take println! as an example: the Rust book introduces println! in the first chapter but it doesn't provide any macro discussion beyond "println! is a macro, we'll talk about that later", where "later" is in chapter 19 (!).
While I'm speaking for myself and not that person, I feel there is a big difference between println! and
#![cargo(manifest = r#"
[package]
edition = "2018"
"#)]
With println! the name (mostly) makes sense and it was a weird ! after it that can be glossed over. That's less the case with attributes.
Elaborating further: aren't we over-indexing on language newcomers here?
I suspect we under-index on language newcomers.
That aside, one of the big use cases for this specific feature is helping users of all levels figure out how to write the code they need, including looking at written material (books, blogs, messages from coworkers, etc).
I'm not saying we should discount newcomers entirely, but "including looking at written material (books, blogs, messages from coworkers, etc)" mostly involves copy-pasting snippets. After looking at one or two examples they should become accustomed to the syntax and able to use it, regardless of them understanding the ins and outs of attributes.
Contributor
Yes, just teach this as a magic syntax at first. Later on people will understand what it means.
Contributor
Author
I'm not referring to user perception but how we parse the syntax. And yes, we have an edition right around the corner but it would be very limiting if a feature like this can only be used with
I think it's worth noting that the proposed syntax exactly reverses the meaning of fenced code blocks: these "turn code off" but in markdown they turn code on. I also think it's worth leaving a fly-by comment at least to advocate for not using markdown fenced code blocks (I will try to respond; personal life means I may not, thus fly-by. I care but time).

It seems to me that the biggest motivation for this is perceived learnability, but the effect is that in all contexts that use markdown, not taking the extra step of escaping means breaking the code. This is called out in the RFC but if I don't have a lot of time and I want to give someone a repro of something I want to just copy/paste it and not remember that there's this extra specific Rust step. Someone has to deal with that. I am almost sure that even Rust veterans reporting issues who happen to be in a hurry at the moment are going to forget. It's an annoying chore for the reporter or an annoying chore for the maintainer and it's often already hard enough to get people reporting/finding time to deal with things.

Of course if this is ever extended that goes from annoying to potentially much more problematic: should it ever not only be used at the top of files, then it's time to play hunt the problems and, from that perspective, begins closing doors to any future extensions.

The only advantage I can see for making this choice is learnability, and I'm skeptical that the effect is as large as it seems (for one thing Jira doesn't really use markdown as it is; the implicit assumption is that a new coder knows markdown). The static site generator alternative of using

I will add that I was involved in another discussion like this back on the un-indented strings RFC, was told in that discussion that fences could still be escaped, and had to look it up for this comment anyway. Not only for fences, but how to write ``` by itself in a sentence.

Not to mention that markdown isn't exactly a standard beast so who knows if what I used on GitHub works elsewhere? I don't personally know of problematic implementations but given the lack of standards/conformance to spec, are we even sure there isn't some issue tracker somewhere where "overloading" markdown like this isn't possible because it doesn't respect the escaping? It seems to me that the unspoken assumption is "if x claims to be using markdown, then x will work in this specific way" and that's never really been true.

TLDR: it really feels messy when there's other non-overloaded syntactical constructs that would work, most of which are even in the RFC. Learners only learn it once, the rest of us deal with it forever.
Contributor
Author
If advocating for another solution, please also address the concerns with that format.
Contributor
Author
Sorry, forgot to call this out earlier but I think an important note for anyone reviewing this RFC is that this is not a cargo team decision but a language team decision. There are subjective aspects of this. There are aspects where people will prioritize things differently than others. I've geared things towards what I expect will work for the language team and will adjust as they direct otherwise. This does not mean that input isn't useful; it has already improved the RFC and can help provide more perspectives for my recommendation and for the language team in their decision. Let's make sure we recognize that multiple experienced, well-reasoned people can come to different conclusions on this and it might not look like our ideal (myself included).
I would be happy with any other option. I'm not addressing specific alternatives because all other alternatives have two extremely useful properties:
I am not aware of any other language which knowingly decided that requiring editing the program in order to paste into issue trackers etc. would be necessary because it chose to embed a subset of markdown because (as the RFC currently reads to me) it decided that was the most learnable option. And indeed, here is what I would consider a likely problem with learnability anyway:
I'm not the person to ask about readability. I'm actually blind and in addition I'm one of those weirdos who doesn't find C++ that bad so any opinion I have there is likely not great. For the record I favor

As an example of a "fast" (e.g. ctrl+a ctrl+v) system where I'd expect this to break, I've seen Slack treat markdown in a copy/paste as formatting. I believe you have to toggle something on to get markdown (I forget; it's useful for me because I can type formatting without clicking around). Snippets exist and clearly are the right answer but me and my coworkers don't bother with them half the time.

One great thing about Rust is that while it has edge cases, most of them don't compile and most of the rest don't break the program; markdown fences, by contrast, silently break sharing with others.
Contributor
Author
@bestouff that is a slight variant of one of the options and I'd recommend talking to the downsides if proposing it. Note that you left off the manifest being a string. That is dependent on whether the attribute parsing code can correctly handle TOML syntax (now and in the future) being embedded within it, which I've not tried to verify. In using strings, unless we try to shift people's style to single quotes, it will likely require a raw string literal.
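
To illustrate the raw-string point: TOML manifests conventionally use double-quoted strings, so embedding one in an ordinary Rust string literal means escaping every quote, while a raw string literal can hold the text unchanged. The constant names below are only for illustration.

```rust
/// The manifest in an ordinary string literal: every `"` must be escaped
/// and line breaks are written as `\n`.
pub const ESCAPED: &str =
    "[dependencies]\nclap = { version = \"4.2\", features = [\"derive\"] }\n";

/// The same manifest in a raw string literal: the TOML appears verbatim.
pub const RAW: &str = r#"[dependencies]
clap = { version = "4.2", features = ["derive"] }
"#;
```

Both constants hold the same text. The raw form is easier to copy in and out of a `Cargo.toml`, at the cost of requiring readers to know the `r#"..."#` syntax, which is the teaching concern raised earlier in the thread.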
Contributor
@epage Yes, sorry, I found it simpler to learn than Alternative 3 but I'm not sure it's rustc-parseable. I removed my comment but you were too quick!
Contributor
With my lang hat on, I don't see a reason we should RFC a feature that only allows

At the language level we should acknowledge that not all projects get to use cargo, and the generalization here seems trivial to do in the RFC. Note that I'm fine with the RFC being conservative in other ways (only allowing one, right after the shebang, etc).

Taking off my lang hat now – using
Outdated
- Users can edit/copy/paste the manifest without dealing with leading characters

Downsides
- Too general that people might abuse it
Contributor
As a general comment: I don't agree with this as a downside. I'm not even sure what you mean by "abuse" since that varies greatly by use case. Will people embed 1,200 source files in a single file? In general probably not, but if they do, they probably have a good reason.
For example, if I want to run a minimizer like creduce (which supports Rust) to reproduce a compiler issue, it requires embedding my reproducer in a single file. Some tooling assumes this because C/C++ compilation units can always be embedded in a single file, unlike Rust. Then the tool can take care of minimizing for me. Obviously I wouldn't do this for everyday software development, but in my mind it's a completely valid use case.
Label added: I-lang-nominated (indicates that an issue has been nominated for prioritizing at the next lang team meeting).
Contributor
Author
Had considered loosening this up before any more official word from the lang team and realized there are syntax questions we don't really have an answer to (and our source of inspiration doesn't have good answers for). I expanded on this and also gave a suggested starting point for syntax if we decide to bring those decisions into this RFC.
I figured what the string should be would be best left for #3502 (as noted here) which goes into more detail. Feel free to add your thoughts there! If you feel that is a t-lang or a joint t-lang + t-cargo decision, we can talk about it!
So, I've been following along a bit and had a couple of comments. I am also an instructor and a CS Ed researcher.

I think that no matter what we do, we are going to increase the cognitive load for students unless it's extraordinarily explicit in its immediate interpretation. I don't think that code fences give you that. The reason for that is 2-fold.

First, the code fence means that you have no distinguishing factor inline with the embedded toml. This means that students will basically have to switch modes when they are looking at different parts of the file. The only way this becomes even slightly viable is with syntax highlighting but even then, now you have to mix syntax highlighting for toml (which isn't a lot but does exist) and rust into one file, adding to the complexity of ensuring you have a diverse enough color palette to maintain good contrast and instant recognition.

Second, code fences are not explicit in their intent. There's nothing about them that inherently says, oh, by the way, we have an embedded file here. I don't think anyone has really managed to do that phenomenally from what I've seen but I will grant that I haven't looked that hard for languages that embed other languages inside of them. JSX, Bash here docs, and Doc comments come to mind but that's all I've got currently.

My personal opinion is that if the goal is to ensure minimal increase in cognitive load, there are a few general approaches that I would suggest:
But a bit more thought out and considered in the extension to the syntax.
All of these feel to me like they address the idea of remaining explicit and not locking people into cargo.

Final thoughts: even if we do use code fences or whatever, many students don't fully comprehend what a use or import (or whatever your language adds to support including libraries and other code) does until much later. They just type it out because they know that if they don't have it, their code will break. I don't think it's wrong to simply provide a well documented template file to students at the beginning. Make sure it covers the common libraries they will be using. Then, through the semester/quarter you take opportunities to come back to it and refine their understanding of why the things in the template are there until they can begin adding dependencies on their own.

As it is, as an instructor, I prefer to autograde students' programs and I would never in a million years use their dependency list. That's a quick way for a clever student to decide that they want to access the container that the autograder is running in and then try to either manipulate it, access the answers when they shouldn't using something I'm not familiar with, or do something more malicious or mischievous. It's the same reason playground doesn't let you use your own Cargo.toml file. It's just not worth the risk. Especially when the machines I'm using are either mine, my university's, or a free service's. All of which would probably not take kindly to a student mucking around inside of their stuff or not take kindly to what may appear to be a spike of usage from me.