What if `vector<T>::iterator` were just `T*`? – Arthur O'Dwyer – Stuff mos...

source link: https://quuxplusone.github.io/blog/2022/03/03/why-isnt-vector-iterator-just-t-star/

What if vector<T>::iterator were just T*?

Every major C++ standard library vendor defines wrappers for their contiguous iterator types — vector<T>::iterator and so on. You might think that they should just define these iterators as aliases for T* instead.

| Type | libstdc++ | libc++ | MSVC |
|---|---|---|---|
| initializer_list<T>::iterator (mandated) | const T* | const T* | const T* |
| array<T, 10>::iterator | T* | T* | std::_Array_iterator<int,10> |
| span<T>::iterator | __gnu_cxx::__normal_iterator<int*, std::span<int>> | std::__wrap_iter<int*> | std::_Span_iterator<int> |
| string::iterator | __gnu_cxx::__normal_iterator<char*, std::string> | std::__wrap_iter<char*> | std::_String_iterator<std::_String_val<std::_Simple_types<char>>> |
| string_view::iterator | const char* | const char* | std::_String_view_iterator<std::char_traits<char>> |
| ranges::iterator_t<valarray<T>> | T* | T* | T* |
| vector<T>::iterator | __gnu_cxx::__normal_iterator<T*, std::vector<T>> | std::__wrap_iter<T*> | std::_Vector_iterator<std::_Vector_val<std::_Simple_types<T>>> |

All these wrappers impose a pretty high cost in terms of compilation speed, debug performance, and even object-file size (because of all those super-long mangled names). Could a library vendor decide to simply change their definition of vector<T>::iterator to T*?

Nothing prevents a green-field STL implementation from using T* for all these types (as far as I know), but you’ll never see any existing library vendor switch to T* by default: it would break too much existing code. Besides the obvious ABI breakage, it would actually be an API break as well! Here’s a list of code snippets that would change their behavior if vector<T>::iterator became an alias for T*. (Godbolt.)

Implicit conversion from T*

void overload_resolution() {
    extern void f(std::vector<int>::iterator);
    extern void f(const char *);
    f(nullptr);
}

The null pointer constant nullptr of course can be implicitly converted to T*; but there’s no reason it should be implicitly convertible to vector<T>::iterator. libc++’s iterator wrapper makes that conversion private; libstdc++ makes it public but explicit.

If vector<int>::iterator were an alias for int*, then the call to f above would become ambiguous. Today, in practice, every library vendor considers it unambiguous.
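
The ambiguity can be checked at compile time without depending on any particular standard library. The following is a minimal C++17 sketch, not any vendor's code: WrapIter, the tag types, f_wrapped, f_raw, and f_raw_callable are all hypothetical names introduced for illustration. WrapIter stands in for a class-type iterator with an explicit constructor from the raw pointer.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for a vendor's iterator wrapper; not a real library type.
template<class Ptr>
class WrapIter {
    Ptr p_{};
public:
    WrapIter() = default;
    explicit WrapIter(Ptr p) : p_(p) {}  // explicit: nullptr does not convert implicitly
    Ptr base() const { return p_; }
};

// Tag types that record which overload was selected.
struct TookIterator {};
struct TookCString {};

// Today's situation: the iterator is a class type.
TookIterator f_wrapped(WrapIter<int*>);
TookCString  f_wrapped(const char*);

// The what-if: the iterator is int*.
TookIterator f_raw(int*);
TookCString  f_raw(const char*);

// Detects whether f_raw(arg) is a well-formed, unambiguous call.
template<class Arg, class = void>
struct f_raw_callable : std::false_type {};
template<class Arg>
struct f_raw_callable<Arg, std::void_t<decltype(f_raw(std::declval<Arg>()))>>
    : std::true_type {};

// With a class-type iterator, nullptr can only mean the const char* overload.
static_assert(std::is_same_v<decltype(f_wrapped(nullptr)), TookCString>);
// With raw pointers, nullptr converts equally well to both: the call is ambiguous.
static_assert(!f_raw_callable<std::nullptr_t>::value);
// An actual int* argument is still unambiguous.
static_assert(f_raw_callable<int*>::value);
```

The functions are only declared, never defined, because they are used solely inside decltype.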

Mixing iterators from different kinds of containers

void container_conversion() {
    std::vector<char>::iterator vi;
    std::string::iterator si;
    std::swap(vi, si);
}

If both vector<char>::iterator and string::iterator were aliases for char*, then this would compile. In fact, it compiles on libc++ today, because libc++ makes both iterator types an alias for __wrap_iter<char*>; but it won’t compile on libstdc++ or MSVC because their aliases are less SCARY (and thus more expensive) than libc++’s.
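
The difference between the two wrapper designs can be sketched with toy types. Everything below (TaggedIter, PlainIter, and the four toy containers) is a hypothetical illustration, assuming C++17. TaggedIter mimics a wrapper that carries the container type as a template parameter, and PlainIter mimics one that depends only on the pointer type.

```cpp
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical wrapper that carries the container type as a tag.
template<class Ptr, class Container>
struct TaggedIter { Ptr p; };

// Hypothetical wrapper that depends only on the pointer type.
template<class Ptr>
struct PlainIter { Ptr p; };

// Toy containers that only expose an iterator typedef.
template<class T> struct TaggedVec { using iterator = TaggedIter<T*, TaggedVec>; };
struct TaggedStr                   { using iterator = TaggedIter<char*, TaggedStr>; };
template<class T> struct PlainVec  { using iterator = PlainIter<T*>; };
struct PlainStr                    { using iterator = PlainIter<char*>; };

// Tagged wrappers: distinct types, so std::swap(vi, si) would fail template deduction.
constexpr bool tagged_same =
    std::is_same_v<TaggedVec<char>::iterator, TaggedStr::iterator>;
// Untagged wrappers: one and the same type, so mixing them compiles.
constexpr bool plain_same =
    std::is_same_v<PlainVec<char>::iterator, PlainStr::iterator>;

static_assert(!tagged_same);
static_assert(plain_same);

// The same query against the real library. The answer is vendor-dependent,
// so it is deliberately not asserted here.
constexpr bool real_same =
    std::is_same_v<std::vector<char>::iterator, std::string::iterator>;
```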

Derived-to-base conversions

struct Base {};
struct Derived : Base {};

void derived_to_base_conversion() {
    std::vector<Derived>::iterator id;
    std::vector<Base>::iterator ib;
    ib = id;
}

Derived* is implicitly convertible to Base*; but there’s no reason to permit conversion of iterator-to-Derived to iterator-to-Base. libstdc++’s iterator wrapper disallows such conversions.
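
One way a wrapper might permit the iterator-to-const_iterator conversion while rejecting derived-to-base is a constrained converting constructor. This is a minimal C++17 sketch under that assumption. WrapIter here is a hypothetical type, not the actual libstdc++ implementation.

```cpp
#include <type_traits>

struct Base {};
struct Derived : Base {};

// Hypothetical iterator wrapper; not a real library type.
template<class Ptr>
class WrapIter {
    Ptr p_{};
public:
    WrapIter() = default;
    explicit WrapIter(Ptr p) : p_(p) {}

    // Converting constructor, enabled only when the conversion U* -> Ptr
    // merely adds const to the element type (iterator -> const_iterator).
    template<class U, std::enable_if_t<std::is_same_v<Ptr, const U*>, int> = 0>
    WrapIter(const WrapIter<U*>& other) : p_(other.base()) {}

    Ptr base() const { return p_; }
};

// Raw pointers convert derived-to-base implicitly...
static_assert(std::is_convertible_v<Derived*, Base*>);
// ...but the wrapper rejects that conversion,
static_assert(!std::is_convertible_v<WrapIter<Derived*>, WrapIter<Base*>>);
// while still allowing iterator -> const_iterator,
static_assert(std::is_convertible_v<WrapIter<Base*>, WrapIter<const Base*>>);
// and never const_iterator -> iterator.
static_assert(!std::is_convertible_v<WrapIter<const Base*>, WrapIter<Base*>>);
```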

Const-qualification conversions

void const_contravariance() {
    std::vector<Base>::iterator p;
    const std::vector<Base>::const_iterator *pp;
    pp = &p;
}

Base* and const Base* are similar types, but __wrap_iter<Base*> and __wrap_iter<const Base*> are not similar. No iterator wrapper can allow this kind of pointer conversion, because it involves only primitive types.
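
The two cases can be stated as compile-time checks. In this small C++17 sketch, WrapIter is a hypothetical aggregate standing in for any class-type iterator wrapper. The first assertion is the built-in qualification conversion that the snippet above would rely on if iterators were raw pointers.

```cpp
#include <type_traits>

struct Base {};

// Hypothetical class-type iterator; not a real library type.
template<class Ptr>
struct WrapIter { Ptr p; };

// With raw pointers: iterator is Base*, const_iterator is const Base*.
// A pointer to an iterator (Base**) converts to a pointer to a const
// const_iterator (const Base* const*) by a qualification conversion.
static_assert(std::is_convertible_v<Base**, const Base* const*>);

// With class types, the two wrappers are unrelated classes, so no
// pointer conversion exists between them.
static_assert(!std::is_convertible_v<WrapIter<Base*>*,
                                     const WrapIter<const Base*>*>);
```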

Built-in types don’t trigger as much ADL

template<class T> struct Holder { T t; };

void avoid_adl() {
    struct Incomplete;
    std::vector<Holder<Incomplete>*>::iterator i;
    (void)(i == i);
}

See “ADL can interfere even with uglified names” (2019-09-26) for details on the Holder<Incomplete> pattern. For our purposes here, suffice it to say that equality-comparing two pointers of type Holder<Incomplete>** won’t trigger the completion of Holder<Incomplete>; but comparing two objects of the class type __wrap_iter<Holder<Incomplete>**> will trigger the completion of Holder<Incomplete>.
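
The raw-pointer half of that claim can be shown in compilable form. This is a hedged C++17 sketch: compare_raw and compare_wrapped are hypothetical helper names, and WrapIter is a hypothetical wrapper with a hidden-friend operator==. The class-type half is left commented out because, following the reasoning above, it is expected to be a hard error rather than something a test can observe.

```cpp
template<class T> struct Holder { T t; };
struct Incomplete;  // declared, never defined

// Built-in pointer comparison: neither operand has class or enumeration type,
// so no overload resolution and no ADL take place, and Holder<Incomplete>
// is never instantiated.
constexpr bool compare_raw(Holder<Incomplete>** a, Holder<Incomplete>** b) {
    return a == b;
}

// Hypothetical class-type iterator with a hidden-friend operator==.
template<class Ptr>
struct WrapIter {
    Ptr p;
    friend constexpr bool operator==(WrapIter a, WrapIter b) { return a.p == b.p; }
};

// Expected NOT to compile if uncommented: looking up operator== by ADL makes
// Holder<Incomplete> an associated class, which forces its instantiation,
// and its member `T t` then has incomplete type.
//
// bool compare_wrapped(WrapIter<Holder<Incomplete>**> a,
//                      WrapIter<Holder<Incomplete>**> b) {
//     return a == b;
// }

static_assert(compare_raw(nullptr, nullptr));
```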

That completion behavior could be altered by ADL-proofing vector<T>::iterator, that is, by defining it as __wrap_iter<T*>::type. See “How hana::type<T> disables ADL” (2019-04-09), and my recent LWG paper P2538 “ADL-proof std::projected” for details. But of course that would be both an ABI and API break. For example, it would alter the ADL behavior of the following snippet:

struct A {
    // Warning: Don't try this at home!
    // iter_swap customizations rightly belong on the
    // iterator type, not on the element type!
    friend void iter_swap(std::vector<A>::iterator, std::vector<A>::iterator);
};

void respect_adl() {
    std::vector<A>::iterator i;
    iter_swap(i, i); // A's friend, or std::iter_swap?
}

If vector<A>::iterator is A*, this calls the only candidate: A’s hidden-friend iter_swap. If vector<A>::iterator is std::__wrap_iter<A*>, it’ll also consider std::iter_swap as a candidate, but A’s hidden friend will be the better match. But, if vector<A>::iterator is std::__wrap_iter<A*>::type, then std::iter_swap will be the only candidate!
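
The three lookup outcomes can be reproduced with toy namespaces. The sketch below assumes C++17. The namespaces lib and user, the function probe, and the types WrapIter, AdlProof, and finds_probe are all hypothetical names. For brevity, probe is a namespace-scope function rather than a hidden friend, which exercises the same associated-namespace rules. A nested ::type is a member class rather than a template specialization, so the template arguments of its enclosing class do not contribute associated namespaces.

```cpp
#include <type_traits>
#include <utility>

namespace user {
    struct A {};
    // Reachable only through ADL via namespace user.
    template<class X> constexpr int probe(X) { return 1; }
}

namespace lib {
    // Ordinary wrapper: its template argument drags in namespace user.
    template<class Ptr> struct WrapIter { Ptr p; };

    // ADL-proofed wrapper: the iterator is the nested member class `type`.
    template<class Ptr> struct AdlProof {
        struct type { Ptr p; };
    };
}

// Detects whether an unqualified call probe(x) finds a candidate via ADL.
template<class X, class = void>
struct finds_probe : std::false_type {};
template<class X>
struct finds_probe<X, std::void_t<decltype(probe(std::declval<X>()))>>
    : std::true_type {};

// Raw pointer: user is an associated namespace of user::A*.
static_assert(finds_probe<user::A*>::value);
// Plain wrapper: user is associated through the template argument.
static_assert(finds_probe<lib::WrapIter<user::A*>>::value);
// ADL-proofed wrapper: only lib is associated, so probe is not found.
static_assert(!finds_probe<lib::AdlProof<user::A*>::type>::value);
```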

Modifiable prvalues

Hat tip to Ben Craig for this addition:

void modify_prvalue() {
    std::vector<int> v = {1,2,3};
    auto it = --v.end();
}

If vector<int>::iterator is a class type, then this code is valid C++; the variable it is initialized with the result of v.end().operator--(). But if vector<int>::iterator is int*, then --v.end() is ill-formed: the built-in -- operator requires a modifiable lvalue, and v.end() would be a prvalue!
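
A detection trait makes the difference visible at compile time. This C++17 sketch uses hypothetical names throughout (WrapIter, can_decrement_rvalue, last_element). It also shows std::prev, which takes its iterator by value and therefore works whether the iterator is a class type or a raw pointer.

```cpp
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical class-type iterator; not a real library type.
template<class Ptr>
struct WrapIter {
    Ptr p{};
    // A member function without a ref-qualifier may be called on an rvalue.
    WrapIter& operator--() { --p; return *this; }
};

// Detects whether `--rvalue` is well-formed for the type It.
template<class It, class = void>
struct can_decrement_rvalue : std::false_type {};
template<class It>
struct can_decrement_rvalue<It, std::void_t<decltype(--std::declval<It>())>>
    : std::true_type {};

// Class type: the call resolves to the member operator--.
static_assert(can_decrement_rvalue<WrapIter<int*>>::value);
// Raw pointer: the built-in -- needs a modifiable lvalue.
static_assert(!can_decrement_rvalue<int*>::value);

// Vendor-dependent answer for the real iterator type; not asserted here.
constexpr bool real_vector_iterator_allows_it =
    can_decrement_rvalue<std::vector<int>::iterator>::value;

// Portable spelling for either kind of iterator.
// Precondition: v is not empty.
inline int last_element(const std::vector<int>& v) {
    return *std::prev(v.end());
}
```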

