6

Add `Iterator::array_chunks` (take N+1) by WaffleLapkin · Pull Request #100026 ·...

 2 years ago
source link: https://github.com/rust-lang/rust/pull/100026
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

Add Iterator::array_chunks (take N+1) #100026

Conversation

Contributor

@WaffleLapkin WaffleLapkin commented 18 days ago

A revival of #92393.

r? @Mark-Simulacrum
cc @rossmacarthur @scottmcm @the8472

I've tried to address most of the review comments on the previous attempt. The only thing I didn't address is try_fold implementation, I've left the "custom" one for now, not sure what exactly should it use.

jdahlstrom and Dushistov reacted with thumbs up emojiscottmcm, Dushistov, compiler-errors, and sdroege reacted with heart emoji All reactions

rustbot

added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label

18 days ago

Collaborator

rustbot commented 18 days ago

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with @rustbot label +T-libs-api -T-libs to tag it appropriately. If this PR contains changes to any unstable APIs please edit the PR description to add a link to the relevant API Change Proposal or create one if you haven't already. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

rust-highfive

added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label

18 days ago

This comment has been minimized.

Contributor

the8472 commented 18 days ago

edited

Note that iter.next_chunk() exists now (#98326). And I expect these things to optimize poorly by default, so we'll need specialized implementations (as in #98553) for all sources that can yield chunks more efficiently and then plumb that through adapters that don't alter lengths like Map, Copied, Take etc. otherwise one won't get the autovectorized code one might expect.

I'm not sure if this the right approach and some design work is needed (#81615).

Member

scottmcm commented 18 days ago

+1 that this should just use https://doc.rust-lang.org/nightly/std/iter/trait.Iterator.html#method.next_chunk -- conveniently, that should mean that this PR gets substantially simpler, since the unsafe can live over there instead of here.

scottmcm

added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.

and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.

labels

18 days ago

let mut acc = init;

let mut iter = self.iter.by_ref().rev();

// NB remainder is handled by `next_back_remainder`, so

// `next_chunk` can't return `Err` with non-empty remainder

// (assuming correct `I as ExactSizeIterator` impl).

while let Ok(mut chunk) = iter.next_chunk() {

chunk.reverse();

acc = f(acc, chunk)?

}

Contributor

@the8472 the8472 18 days ago

by_ref AND a double-reverse? grimacing that must be horrendously slow.

Imo we either need a next_chunk_back or only limit it to forward iteration. The former could be done in a separate PR, but we need to decide on an approach.

scottmcm reacted with thumbs up emoji All reactions

Contributor

Author

@WaffleLapkin WaffleLapkin 18 days ago

I'm sorry you had to see this impl sweat_smile

All reactions

Contributor

Author

WaffleLapkin commented 15 days ago

@rustbot ready

rustbot

added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.

and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.

labels

15 days ago

self.next_back_remainder();

let mut acc = init;

let mut iter = self.iter.by_ref().rev();

suggestion: You have a Sized iterator here (since it's a field), so you can avoid some of the by_ref optimization penalty by using https://github.com/rust-lang/rust/blob/master/library/core/src/iter/adapters/by_ref_sized.rs instead of by_ref().

All reactions

Self: Sized,

F: FnMut(B, Self::Item) -> B,

{

self.try_fold(init, |acc, x| NeverShortCircuit(f(acc, x))).0

To avoid making a closure type generic over I too:

Suggested change
self.try_fold(init, |acc, x| NeverShortCircuit(f(acc, x))).0
self.try_fold(init, NeverShortCircuit::wrap_mut_2(f)).0

(and the same in rfold)

All reactions

Member

@scottmcm scottmcm left a comment

I have a few comments, but overall I think this is essentially fine for nightly. (Lots of open questions for stabilization, but that's ok.)

Can you please open a tracking issue, add it to the unstable attributes, and address my other comments?

All reactions

scottmcm

added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.

and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties.

labels

10 days ago

// Take the last `rem` elements out of `self.iter`.

let mut remainder =

// SAFETY: `unwrap_err` always succeeds because x % N < N for all x.

unsafe { self.iter.by_ref().rev().take(rem).next_chunk().unwrap_err_unchecked() };

I don't think I'll block the PR on this, but unwrap_err_unchecked here scares me -- I added a note to the tracking issue.

My instinct for now is that %N is definitely trustworthy, so that combined with take is sufficient to say this is ok. But I'm still afraid, since it's not TrustedLen, and wonder if someone can come up with an evil example here where it somehow gets a full chunk and thus is UB in safe code.

(At least the obvious things, like a len() that always says usize, is protected against because of the modulo.)

All reactions

Contributor

@the8472 the8472 7 days ago

edited

As already noted in another comment I don't like this entire method. Either we should add next_chunk_back or remove the DEI impl.

scottmcm reacted with thumbs up emoji All reactions

Contributor

Author

@WaffleLapkin WaffleLapkin 6 days ago

edited

So, an evil impl can return less than rem elements, but actual <= rem < N still holds (I don't think there is any way to trick Take into returning more than rem elements) so the chunk will be an error no matter what.

An evil impl can just return a wrong length, so after this iterator returns a number of elements not divisible by N. But this is fine too, an Err will just be ignored:

All reactions

Contributor

Author

@WaffleLapkin WaffleLapkin 6 days ago

That being said, as @the8472 already said, we should probably rewrite the DEI impl anyway (I'll try to do this, after this lands).

All reactions

Yeah, the % is certainly fine. What I'm not yet convinced of is that there's no way to make take misbehave somehow -- for example, Take::try_fold uses the inner try_fold, so maybe there'd be a way for that to be implemented wrongly-but-not-UB that could result in too many things getting put in the array somehow.

But I agree that just making a nicer implementation without the unsafe{} is the right plan.

All reactions

Contributor

Author

@WaffleLapkin WaffleLapkin 6 days ago

Actually you are right fearful

Overriding try_fold with some nasty unsafe to skip over Break(_) allows you to trick Take into returning more elements that it should (playground).

n can underflow, if a bad iterator impl skips over Break(_):

The fact that unsafe code can't trust Take is scary.

All reactions

I think the problem there is still in the unsafe in Misbehave -- it protects against double-drop, but duplicating arbitrary values can violate safety invariants in general. (For example, if the type is !Clone + !Default, I can use you moving it into me to track resource consumption, like a ZST tracker for a global semaphore.) So that specific example is just "well unsound unsafe-using code breaks everything".

But yeah, even though I've not been able to come up with a safe trick that would do it, things along those lines are what make me worried about it. (And certainly if specialization existed then it would be possible to do particularly weird things in safe code because it can violate parametricity even worse than in normal code.)

All reactions

Contributor

Author

@WaffleLapkin WaffleLapkin 5 days ago

The example could check, that type_id::<B>() == type_id::<usize>() or something similar, which would IMO make the code sound.

All reactions

Yeah, I thought about that one, but TypeId::of::<B>() needs B: 'static, which we don't have here. Regardless, it's yet another chink in the armour that makes be scared we'll break through eventually.

WaffleLapkin reacted with thumbs up emoji All reactions

Member

scottmcm commented 7 days ago

Running over this again I noticed one thing I'm wondering about, but I don't think it needs to block going to nightly. Thanks for pushing this through!

@bors r+

WaffleLapkin reacted with heart emoji All reactions

Contributor

bors commented 7 days ago

pushpin Commit 5fbcde1 has been approved by scottmcm

It is now in the queue for this repository.

bors

added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.

and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author.

labels

7 days ago

Contributor

the8472 commented 7 days ago

(take N+1)

Clearly a bug! :D

WaffleLapkin reacted with laugh emoji All reactions

bors

merged commit 482a6ea into

rust-lang:master

5 days ago

10 checks passed

rustbot

added this to the 1.65.0 milestone

5 days ago

rust-timer

added a commit to rust-lang-ci/rust that referenced this issue

5 days ago

WaffleLapkin

deleted the array-chunks branch

5 days ago

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Assignees

scottmcm

Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects

None yet

Milestone

1.65.0

Development

Successfully merging this pull request may close these issues.

None yet

9 participants

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK