RFC: `c"…"` string literals by m-ou-se · Pull Request #3348 · rust-lan...
source link: https://github.com/rust-lang/rfcs/pull/3348
Member
Author
m-ou-se commented Nov 15, 2022
Three weeks ago, the lang team said they would be interested in potentially doing this in the future. So here's an RFC. :)
Contributor
clarfonthey commented Nov 15, 2022
I'm on board. I'd even consider that a future extension might be to allow One other potential thing to think about is whether
Accepted escape codes: [Quote](https://doc.rust-lang.org/reference/tokens.html#quote-escapes) & [Unicode](https://doc.rust-lang.org/reference/tokens.html#unicode-escapes) & [Byte](https://doc.rust-lang.org/reference/tokens.html#byte-escapes).

Unicode characters are accepted and encoded as UTF-8. That is, `c"🦀"`, `c"\u{1F980}"` and `c"\xf0\x9f\xa6\x80"` are all accepted and equivalent.
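The equivalence stated in the quoted RFC text can be checked directly. The following is a minimal sketch, assuming a toolchain where the literal is available (Rust 1.77 or later on the 2021 edition); the helper name `crab_spellings` is made up for illustration and is not part of the RFC.

```rust
use std::ffi::CStr;

/// Hypothetical helper: the three spellings of the crab emoji quoted above.
fn crab_spellings() -> [&'static CStr; 3] {
    [
        c"🦀",                // the character itself, encoded as UTF-8
        c"\u{1F980}",         // Unicode escape, encoded as UTF-8
        c"\xf0\x9f\xa6\x80",  // the same four UTF-8 bytes as byte escapes
    ]
}

fn main() {
    let [literal, unicode_escape, byte_escapes] = crab_spellings();

    // All three literals produce the same `&CStr`.
    assert_eq!(literal, unicode_escape);
    assert_eq!(unicode_escape, byte_escapes);

    // The contents are the UTF-8 encoding of U+1F980 ...
    assert_eq!(literal.to_bytes(), "🦀".as_bytes());
    // ... followed by the nul terminator the literal appends implicitly.
    assert_eq!(literal.to_bytes_with_nul().len(), 5);
}
```

Note that the `\xf0` style escapes are what a regular `"…"` string literal would reject, since on their own they are not ASCII.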
I wish byte string literals had this support too, so big on this!
Member
Author
m-ou-se Nov 15, 2022
It might be worth proposing that in a separate RFC. That would also resolve one unresolved question of `concat_bytes`, if we accept that mixing UTF-8 and non-UTF-8 in byte strings is okay.
Wrote an RFC for that: #3349
I was hoping to make things like An alternative would be to allow literals like
One concern I have is that if single-letter prefixes become common, extending the language with new prefixes can become confusing. Although, if
Member
nagisa commented Nov 15, 2022
I have two rhetorical questions with regard to the RFC text:
Do we even support
The exact same as would happen when using regular string literals. For example,
- Also add `c'…'` C character literals? (`u8`, `i8`, `c_char`, or something more flexible?)

- Should we make `&CStr` a thin pointer before stabilizing this? (If so, how?)
I think this should be a blocker on stabilization, yeah.
I don't see how this feature is blocked by that at all really. It produces an `&'static CStr` regardless of what `&CStr` itself is made of.
@Kixiron To be clear, I think considering that question should be a blocker for stabilization.
Given that a major use case of this will be FFI, it seems important that we have a simple, not-error-prone way of passing a C string to C functions. If we decide that `&CStr` wasn't that mechanism, then we should decide what that mechanism should be, and make sure `c"..."` works well with that.
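To make the FFI concern concrete, here is a minimal sketch of handing a `&'static CStr` to C-style code. It assumes nothing beyond the existing `std::ffi` API; `c_style_len` is a hypothetical stand-in for a real C function taking `const char *`, and the string is built with the pre-RFC `CStr::from_bytes_with_nul` so the sketch does not depend on the new literal.

```rust
use std::ffi::{c_char, CStr};

/// Hypothetical stand-in for a C function that takes `const char *`;
/// it walks the buffer until the nul terminator, like `strlen`.
unsafe fn c_style_len(ptr: *const c_char) -> usize {
    let mut len = 0;
    // SAFETY: the caller guarantees `ptr` points to a nul-terminated buffer.
    while unsafe { *ptr.add(len) } != 0 {
        len += 1;
    }
    len
}

/// Pre-RFC spelling; the RFC's `c"hello"` is meant to yield the same value.
fn greeting() -> &'static CStr {
    CStr::from_bytes_with_nul(b"hello\0").expect("literal ends with a single nul")
}

fn main() {
    let s = greeting();
    // `as_ptr` hands out a thin `*const c_char`, whatever `&CStr` itself is made of.
    let len = unsafe { c_style_len(s.as_ptr()) };
    assert_eq!(len, 5);
}
```

The call site only ever sees the raw pointer returned by `as_ptr`, which is why the representation of `&CStr` and the literal syntax can be discussed separately.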
rfcbot commented Nov 17, 2022 •
Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me.
added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it.
labels
removed the I-lang-nominated Indicates that an issue has been nominated for prioritizing at the next lang team meeting. label
Member
Author
m-ou-se commented Nov 17, 2022
I implemented this yesterday evening, but it got a bit ugly/verbose because this literal type accepts yet another different set of escape codes. I'll clean it up more before sending a PR, but the code would get simpler if #3349 were also accepted. :)
Member
Author
m-ou-se commented Nov 17, 2022
Ah sorry for the confusion. I should clarify: I think I do not think that we should have many different prefixes (for paths, os strings, wide strings, utf16 strings, allocated strings, Cows, etc. etc. etc.), because that quickly gets out of hand. The macro shorthand syntax could've helped with that, but there are other options too: The "custom string literal processors" thing that @the8472 mentions sounds like what I described in #3348 (comment), some kind of "const"
I'm still confused as to why macros and prefixes are being presented as incompatible ideas. If we don't want to play the "add a prefix" game for every type and #3349 is accepted, why can't we have a macro like this:
To be honest, this is not the first time I’m finding it unfortunate that editions are being used as a motivation for dismissing concerns about future-proofness of proposed changes. Changing some existing behaviour across an edition boundary isn’t as trivial a question as “is the change technically feasible?” It is also a question of how widespread the use of that functionality is, how confusing it would be to have multiple flavours of the language where specific syntactic constructs behave differently, etc. In practice there are likely very few syntactic constructs we would be willing to change, even if we do horribly regret them. More likely we’d deprecate the old and add some new construct instead, possibly acknowledging that it is a less-than-ideal thing to do but still a superior alternative.
Lonami commented Nov 17, 2022
I haven't seen it mentioned, but perhaps the following alternative could be considered: The compiler would refuse to compile
I'd argue that's what the c prefix is for. Also, the compiler would have to reject strings that don't have the nul byte, so it seems like requiring it explicitly would just be an annoying thing you have to do to please the compiler.
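For comparison, the existing runtime API already behaves the way this exchange describes: the caller writes the nul byte explicitly, and anything without exactly one trailing nul is rejected. The following is a small sketch using only `CStr::from_bytes_with_nul`; the wrapper `checked_c_str` is a hypothetical name for illustration, and the check happens at run time rather than in the compiler.

```rust
use std::ffi::CStr;

/// Hypothetical helper: accept the bytes only if they are already a valid
/// C string, i.e. exactly one nul byte, located at the very end.
fn checked_c_str(bytes: &[u8]) -> Option<&CStr> {
    CStr::from_bytes_with_nul(bytes).ok()
}

fn main() {
    // Explicit trailing nul: accepted.
    assert!(checked_c_str(b"hello\0").is_some());
    // Missing terminator: rejected.
    assert!(checked_c_str(b"hello").is_none());
    // Interior nul before the terminator: rejected as well.
    assert!(checked_c_str(b"he\0llo\0").is_none());
}
```

With the `c` prefix, the terminator is appended by the compiler instead, so the first case needs no explicit `\0` and the second cannot occur.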
As an additional data point, in gtk-rs we have a very similar use case and these C string literals would almost fit there but we need the UTF-8 guarantee that normal strings would give. Currently there's a Not sure if there's anything that could adapt this proposal to also allow for that use case but with
The "non-portable" argument against having EDIT: to be clear, I'm not saying that
# Motivation

[motivation]: #motivation

Looking at the [amount of `cstr!()` invocations just on GitHub](https://cs.github.com/?scopeName=All+repos&scope=&q=cstr%21+lang%3Arust) (about 3.2k files with matches) it seems like C string literals
Contributor
nikomatsakis commented Nov 29, 2022
@rfcbot reviewed
added the final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substantial objections are raised. label
rfcbot commented Nov 29, 2022
This is now entering its final comment period, as per the review above.
removed the proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. label
added finished-final-comment-period The final comment period is finished for this RFC. to-announce
and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substantial objections are raised.
labels
rfcbot commented Dec 9, 2022
The final comment period, with a disposition to merge, as per the review above, is now complete. As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed. This will be merged soon.
CAD97 commented Dec 13, 2022
Just a minor additional note: I want to second that even if Just having (Polymorphic string literals is probably the ideal long-term position, but having However, as a data point, the
Contributor
tmandry commented Dec 14, 2022
Huzzah! The @rust-lang/lang team has decided to accept this RFC. To track further discussion, subscribe to the tracking issue here: