6

Rust iterators tips and tricks

 3 years ago
source link: https://robinmoussu.gitlab.io/blog/post/2021-03-25_rust_iterators_tips_and_tricks/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client
A few tips on how to implement iterators.
source: Image by DarkmoonArt_de - https://pixabay.com/fr/illustrations/boussole-carte-nautique-antique-3408928/ [free for commercial use]

Tips n°1: add an iter() function for your custom collection

If you create your own collection, for example a struct that encapsulate a Vec, you will probably want to provide an iter() function. This way users of your collection will be able to access the elements of your collection, without leaking implementation details. Of course, you could create a new type that implement the Iterator trait, but to be honest, even if it’s not complicated it’s fastidious! Fortunately, there is a much easier way to do it:

struct MyCollection {
    data: Vec<i32>, // or any other type that itself has an `iter()` method
    // ...
}

impl MyCollection {
    fn iter(&self) -> impl Iterator {
        self.data.iter()
    }
    // ...
}

And that’s all!

ducky
Really?
error[E0759]: `self` has an anonymous lifetime `'_` but it needs to satisfy a `'static` lifetime requirement
 --> src/lib.rs:8:19
  |
7 |     fn iter(&self) -> impl Iterator {
  |             ----- this data with an anonymous lifetime `'_`...
8 |         self.data.iter()
  |         --------- ^^^^
  |         |
  |         ...is captured here...
  |
note: ...and is required to live as long as `'static` here
 --> src/lib.rs:7:23
  |
7 |     fn iter(&self) -> impl Iterator {
  |                       ^^^^^^^^^^^^^
help: to declare that the `impl Trait` captures data from argument `self`, you can add an explicit `'_` lifetime bound
  |
7 |     fn iter(&self) -> impl Iterator + '_ {
  |                                     ^^^^

That’s right, I forgot to the '_ lifetime marker for the return type of iter(). Let fix that:

fn iter(&self) -> impl Iterator + '_ {
    self.data.iter()
}

This is called lifetime elision. It’s strictly equivalent to this:

fn iter<'a>(&'a self) -> impl Iterator + 'a {
    self.data.iter()
}

But fortunately, the compiler is smart enough to understand that the anonymous lifetime '_ should be bound to the lifetime of the only reference &self.

ducky
Thanks for the details

You’re welcome!

And sorry, I forgot to induce you my rubber duck friend. He is very useful when programming.

Playground

Tips n°2: returning one of multiple iterators of different types

If you come from high level language, at one point you will probably try to create a function like this one:

fn forward_or_backward<T>(v: &Vec<T>, forward: bool) -> impl Iterator + '_
{
    if forward {
        v.iter()
    } else {
        v.iter().rev()
    }
}

Both v.iter() and v.iter().rev() returns a type that implements the Iterator trait, so it should work, shouldn’t it?

ducky
Not exactly. The concrete type of all the branches of any conditional expression (like if, match, any kind of loop, …) must match.

Oh right. I wish that the compiler would be smart enough to create a newtype automatically, but that’s not the case currently and the lang team is already working on much more important and exciting features.

Anyway, lets look at the error:

error[E0308]: `if` and `else` have incompatible types
 --> src/main.rs:6:9
  |
3 | /     if forward {
4 | |         v.iter()
  | |         -------- expected because of this
5 | |     } else {
6 | |         v.iter().rev()
  | |         ^^^^^^^^^^^^^^ expected struct `std::slice::Iter`, found struct `Rev`
7 | |     }
  | |_____- `if` and `else` have incompatible types
  |
  = note: expected type `std::slice::Iter<'_, _>`
           found struct `Rev<std::slice::Iter<'_, _>>`
help: you could change the return type to be a boxed trait object
  |
1 | fn forward_or_backward<T>(v: &Vec<T>, forward: bool) -> Box<dyn Iterator<Item=&T> + '_>
  |                                                         ^^^^^^^                       ^
help: if you change the return type to expect trait objects, box the returned expressions
  |
4 |         Box::new(v.iter())
5 |     } else {
6 |         Box::new(v.iter().rev())

The advice isn’t bad per see, but why would we want to pay for dynamic allocation and dynamic dispatch when we could use static dispatch. Let’s implement it:

First, we will need an enum to store either branch. We could totally use something like the either crate, but for the sake of explaining the details, we will hand roll our own implementation.

enum Either<Left, Right> {
    Left(Left),
    Right(Right),
}

Now we want to implement the Iterator trait for our newtype. Of course, we can only do it if both Left and Right are themselves iterators. And both of those iterators must yield elements of the same type.

impl <Left, Right, Item> Iterator for Either<Left, Right>
where
    Left: Iterator<Item=Item>,
    Right: Iterator<Item=Item>,
{
    type Item = Item;
    fn next(&mut self) -> Option<Self::Item> {
        match self {
            Self::Left(it) => it.next(),
            Self::Right(it) => it.next(),
        }
    }
}
ducky
Shouldn’t we add a specialization for the nth() and fold() method?

The documentation says:

Also note that Iterator provides a default implementation of methods such as nth and fold which call next internally.

However, it is also possible to write a custom implementation of methods like nth and fold if an iterator can compute them more efficiently without calling next.

So, it’s not required, but it would probably be a good idea to delegate the call to nth() and fold() to the implementation of Left and Right, just like we did for next(). Since any methods of the Iterator trait could have been specialized in either the Left or the Right type, it would be a good to do it for all functions.

The same reasoning can be applied to trait like DoubleEndedIterator, ExactSizeIterator, and FusedIterator.

Let’s take for example the ExactSizeIterator trait. We want to implement it only if both variants implement it. And if they do, we want to forward the implementation to the underlying type:

// implement ExactSizeIterator iif both branches implement that trait
// themselves.
impl<Left, Right> ExactSizeIterator for Either<Left, Right>
where
    Left: ExactSizeIterator,
    Right: ExactSizeIterator,
{
    fn len(&self) -> usize {
        match self {
            Self::Left(it) => it.len(),
            Self::Right(it) => it.len(),
        }
    }
    fn is_empty(&self) -> bool {
        match self {
            Self::Left(it) => it.is_empty(),
            Self::Right(it) => it.is_empty(),
        }
    }
}
ducky
Didn’t you forget something?

Don’t tell me I forgot something again!

error[E0271]: type mismatch resolving `<Right as Iterator>::Item == <Left as Iterator>::Item`
  --> src/main.rs:50:19
   |
50 | impl<Left, Right> std::iter::ExactSizeIterator for Either<Left, Right>
   |      ----  -----  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ expected type parameter `Right`, found type parameter `Left`
   |      |     |
   |      |     expected type parameter
   |      found type parameter
   |
   = note: expected associated type `<Right as Iterator>::Item`
              found associated type `<Left as Iterator>::Item`
   = note: a type parameter was expected, but a different one was found; you might be missing a type parameter or trait bound
   = note: for more information, visit https://doc.rust-lang.org/book/ch10-02-traits.html#traits-as-parameters
   = note: required because of the requirements on the impl of `Iterator` for `Either<Left, Right>`

Ok, I forgot to make it explicit that both variants are iterators that yield the same type.

impl<Left, Right, T> std::iter::ExactSizeIterator for Either<Left, Right>
where
    Left: std::iter::ExactSizeIterator + Iterator<Item=T>,
    Right: std::iter::ExactSizeIterator + Iterator<Item=T>,
ducky
Everything should be ready to be tested.
fn main() {
    let v = vec![0, 1, 3, 7];

    // We collect the result in a Vec to be able to print them easily
    let forward: Vec<_> = forward_or_backward(&v, true).collect();
    let backward: Vec<_> = forward_or_backward(&v, false).collect();
    println!("forward: {:?}", forward); // 0, 1, 3, 7
    println!("backward: {:?}", backward); // 7, 3, 1, 0
}
error[E0277]: `<impl Iterator as Iterator>::Item` doesn't implement `Debug`
  --> src/main.rs:87:31
   |
87 |     println!("forward: {:?}", forward); // 0, 1, 3, 7
   |                               ^^^^^^^ `<impl Iterator as Iterator>::Item` cannot be formatted using `{:?}` because it doesn't implement `Debug`
   |
   = help: the trait `Debug` is not implemented for `<impl Iterator as Iterator>::Item`
   = note: required because of the requirements on the impl of `Debug` for `Vec<<impl Iterator as Iterator>::Item>`
   = note: required by `std::fmt::Debug::fmt`
   = note: this error originates in a macro (in Nightly builds, run with -Z macro-backtrace for more info)
ducky
That’s strange, we are iterating over integers, and they implement the Debug trait, so why does this doesn’t work?

Let me think…

When we implemented forward_or_backward(), we didn’t forward the type of the item being iterated over:

fn forward_or_backward<T>(v: &Vec<T>, forward: bool) -> impl Iterator + '_

As you can see the type of the Item that the iterator will yield is unknow. This is very obvious if we don’t use type inference when collecting the values in a vector:

let forward: Vec<i32> = forward_or_backward(&v, true).collect();
error[E0277]: a value of type `Vec<i32>` cannot be built from an iterator over elements of type `<impl Iterator as Iterator>::Item`
  --> src/main.rs:85:59
   |
85 |     let forward: Vec<i32> = forward_or_backward(&v, true).collect();
   |                                                           ^^^^^^^ value of type `Vec<i32>` cannot be built from `std::iter::Iterator<Item=<impl Iterator as Iterator>::Item>`
   |
   = help: the trait `FromIterator<<impl Iterator as Iterator>::Item>` is not implemented for `Vec<i32>`

So we just need to modify the declaration of forward_or_backward to fix it:

fn forward_or_backward<T>(v: &Vec<T>, forward: bool) -> impl Iterator<Item=T> + '_

This time it works!

Playground


There is much more to say about iterators, it’s one of the most useful trait in Rust, but that’s all for today.

I hope those tips may help you in the future.
Discuss it on reddit.

Creative Commons License


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK