Add `Iterator::array_chunks` (take N+1) by WaffleLapkin · Pull Request #100026 ·...
source link: https://github.com/rust-lang/rust/pull/100026
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Add Iterator::array_chunks
(take N+1)
#100026
Conversation
Contributor
WaffleLapkin commented 18 days ago
A revival of #92393.
r? @Mark-Simulacrum
cc @rossmacarthur @scottmcm @the8472
I've tried to address most of the review comments on the previous attempt. The only thing I didn't address is try_fold
implementation, I've left the "custom" one for now, not sure what exactly should it use.
added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label
Collaborator
rustbot commented 18 days ago
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
|
added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label
This comment has been minimized.
Note that I'm not sure if this the right approach and some design work is needed (#81615). |
Member
scottmcm commented 18 days ago
+1 that this should just use https://doc.rust-lang.org/nightly/std/iter/trait.Iterator.html#method.next_chunk -- conveniently, that should mean that this PR gets substantially simpler, since the |
added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
labels
let mut acc = init; |
||
let mut iter = self.iter.by_ref().rev(); |
||
// NB remainder is handled by `next_back_remainder`, so |
||
// `next_chunk` can't return `Err` with non-empty remainder |
||
// (assuming correct `I as ExactSizeIterator` impl). |
||
while let Ok(mut chunk) = iter.next_chunk() { |
||
chunk.reverse(); |
||
acc = f(acc, chunk)? |
||
} |
Contributor
the8472 18 days ago
by_ref
AND a double-reverse? that must be horrendously slow.
Imo we either need a next_chunk_back
or only limit it to forward iteration. The former could be done in a separate PR, but we need to decide on an approach.
Contributor
Author
WaffleLapkin commented 15 days ago
@rustbot ready |
added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
labels
self.next_back_remainder(); |
||
let mut acc = init; |
||
let mut iter = self.iter.by_ref().rev(); |
suggestion: You have a Sized
iterator here (since it's a field), so you can avoid some of the by_ref
optimization penalty by using https://github.com/rust-lang/rust/blob/master/library/core/src/iter/adapters/by_ref_sized.rs instead of by_ref()
.
Self: Sized, |
||
F: FnMut(B, Self::Item) -> B, |
||
{ |
||
self.try_fold(init, |acc, x| NeverShortCircuit(f(acc, x))).0 |
To avoid making a closure type generic over I
too:
self.try_fold(init, |acc, x| NeverShortCircuit(f(acc, x))).0 | |
self.try_fold(init, NeverShortCircuit::wrap_mut_2(f)).0 |
(and the same in rfold
)
Member
scottmcm left a comment
I have a few comments, but overall I think this is essentially fine for nightly. (Lots of open questions for stabilization, but that's ok.)
Can you please open a tracking issue, add it to the unstable attributes, and address my other comments?
added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.
labels
// Take the last `rem` elements out of `self.iter`. |
||
let mut remainder = |
||
// SAFETY: `unwrap_err` always succeeds because x % N < N for all x. |
||
unsafe { self.iter.by_ref().rev().take(rem).next_chunk().unwrap_err_unchecked() }; |
I don't think I'll block the PR on this, but unwrap_err_unchecked
here scares me -- I added a note to the tracking issue.
My instinct for now is that %N
is definitely trustworthy, so that combined with take
is sufficient to say this is ok. But I'm still afraid, since it's not TrustedLen
, and wonder if someone can come up with an evil example here where it somehow gets a full chunk and thus is UB in safe code.
(At least the obvious things, like a len()
that always says usize
, is protected against because of the modulo.)
As already noted in another comment I don't like this entire method. Either we should add next_chunk_back
or remove the DEI impl.
So, an evil impl can return less than rem
elements, but actual <= rem < N
still holds (I don't think there is any way to trick Take
into returning more than rem
elements) so the chunk will be an error no matter what.
An evil impl can just return a wrong length, so after this iterator returns a number of elements not divisible by N
. But this is fine too, an Err
will just be ignored:
Contributor
Author
WaffleLapkin 6 days ago
That being said, as @the8472 already said, we should probably rewrite the DEI impl anyway (I'll try to do this, after this lands).
Yeah, the %
is certainly fine. What I'm not yet convinced of is that there's no way to make take
misbehave somehow -- for example, Take::try_fold
uses the inner try_fold
, so maybe there'd be a way for that to be implemented wrongly-but-not-UB that could result in too many things getting put in the array somehow.
But I agree that just making a nicer implementation without the unsafe{}
is the right plan.
Contributor
Author
WaffleLapkin 6 days ago
Actually you are right
Overriding try_fold
with some nasty unsafe to skip over Break(_)
allows you to trick Take
into returning more elements that it should (playground).
n
can underflow, if a bad iterator impl skips over Break(_)
:
The fact that unsafe code can't trust Take
is scary.
I think the problem there is still in the unsafe
in Misbehave
-- it protects against double-drop, but duplicating arbitrary values can violate safety invariants in general. (For example, if the type is !Clone + !Default
, I can use you moving it into me to track resource consumption, like a ZST tracker for a global semaphore.) So that specific example is just "well unsound unsafe
-using code breaks everything".
But yeah, even though I've not been able to come up with a safe trick that would do it, things along those lines are what make me worried about it. (And certainly if specialization existed then it would be possible to do particularly weird things in safe code because it can violate parametricity even worse than in normal code.)
Contributor
Author
WaffleLapkin 5 days ago
The example could check, that type_id::<B>() == type_id::<usize>()
or something similar, which would IMO make the code sound.
Yeah, I thought about that one, but TypeId::of::<B>()
needs B: 'static
, which we don't have here. Regardless, it's yet another chink in the armour that makes be scared we'll break through eventually.
Member
scottmcm commented 7 days ago
Running over this again I noticed one thing I'm wondering about, but I don't think it needs to block going to nightly. Thanks for pushing this through! @bors r+ |
Contributor
bors commented 7 days ago
added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.
labels
Contributor
the8472 commented 7 days ago
Clearly a bug! :D |
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
None yet
Successfully merging this pull request may close these issues.
None yet
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK