Is Zero a Butterfly?

source link: https://shafik.github.io/c++/2021/01/03/is_zero_a_butterfly.html

03 Jan 2021

Is Zero a Butterfly? Wait? What? That sounds like a pretty odd question to ask about C++, but let’s explore a little and see where it takes us.

So what is zero in C++? Perhaps the most obvious answer is that it is an integer, and while this is true, it misses a bigger story about zero’s superpowers in C++.

What is super about zero?

Well first some fun trivia. Zero is an octal literal, see the grammar for integer literals:

octal-literal:
  0
  octal-literal '(opt) octal-digit

It is the only single-digit octal literal. Knowing this won’t help you do anything special, but it might impress some of your friends.
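As a minimal sketch of that trivia: a leading 0 is what makes an integer literal octal, so 0 on its own is simply the shortest such literal. The helper name octal_demo_ok is made up for illustration.

```cpp
#include <cassert>

// A leading 0 makes an integer literal octal, so 010 is eight, not ten.
static_assert(010 == 8, "010 is an octal literal");
// Zero has the same value in every base, so being octal changes nothing.
static_assert(0 == 0x0, "zero is zero in any base");

// Hypothetical helper: repeats the same checks at run time.
inline bool octal_demo_ok() {
    int eight = 010;  // octal 10
    int zero = 0;     // the single-digit octal literal
    assert(eight == 8);
    assert(zero == 0);
    return eight == 8 && zero == 0;
}
```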

Zero is a kind of super-literal (not official terminology): it can be converted to all sorts of types, some mundane but many quite surprising. Most would not be surprised that the following code is valid:

void f(int);
f(0); // Zero is an integer, not surprising.
f(1); // Nope, not surprising either.

Zero is an integer after all, and we will probably not be surprised that the following works either (see it live):

void f1(bool);
void f2(char);
f1(0); // Ok, we are used to non-zero values being true and zero values being false.
f2(0); // Perhaps a little icky but char is an integer type.
f2(1); // Why not 1 as well?

We are probably used to thinking of zero values as converting to false and non-zero values as converting to true, so this is not likely surprising. We might not be as thrilled about zero converting to a char without any explicit conversion, but char, signed char and unsigned char are all integer types.
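The conversions described above can be checked with a short sketch. The helpers as_bool, as_char and conversions_demo are hypothetical names that only hand back what they receive.

```cpp
#include <cassert>

// Hypothetical helpers that just return their argument after conversion.
inline bool as_bool(bool b) { return b; }
inline char as_char(char c) { return c; }

inline void conversions_demo() {
    assert(as_bool(0) == false);   // zero converts to false
    assert(as_bool(1) == true);    // non-zero converts to true
    assert(as_bool(42) == true);   // any non-zero value, not only 1
    assert(as_char(0) == '\0');    // zero converts to the null character
    assert(as_char(1) == '\x01');  // 1 converts too, zero is not special here
}
```

Nothing in this block is specific to zero, which is the point: for bool and char, zero behaves like any other integer.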

So how about pointers? Is zero a pointer? How about non-zero values? Are non-zero integers also pointers? Are we surprised by the following (see it live)?

void f(int *p);
f(0); // Ok, zero is a null pointer constant.
f(1); // Error, 1 is not a valid pointer constant.
f(reinterpret_cast<int*>(1)); // Ok (although questionable) implementation-defined
                              // mapping from integer to pointer.

So now we start seeing some interesting behavior (powers). Zero is special; [conv.ptr]p1 tells us that:

A null pointer constant is an integer literal ([lex.icon]) with value zero or a prvalue of type std::nullptr_t.

but other integers do not have this super ability, and if we want to convert them to pointers we must use reinterpret_cast, which gives a questionable but implementation-defined mapping. We inherit the zero conversion directly from C, and before C++11 constant expressions that evaluated to zero were also considered null pointer constants. So the following was also valid prior to C++11 (see it live):

void f(int *p);
f( 1 - 1 );  // Ok, prior to C++11, 1-1 is an integral constant expression.
f( ~-1 );    // Also ok prior to C++11!
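A small sketch of the null pointer constant rule quoted above, assuming a C++11 or later compiler. The names Picked, which, is_null and null_pointer_demo are hypothetical. The overload pair also shows that zero is still an int first: when both an int and a pointer overload exist, the exact match wins.

```cpp
#include <cassert>

// Hypothetical overload set that reports which overload was chosen.
enum class Picked { Int, Pointer };
inline Picked which(int)  { return Picked::Int; }
inline Picked which(int*) { return Picked::Pointer; }

// Hypothetical helper taking only a pointer.
inline bool is_null(int* p) { return p == nullptr; }

inline void null_pointer_demo() {
    assert(is_null(0));        // the literal 0 is a null pointer constant
    assert(is_null(nullptr));  // so is a prvalue of type std::nullptr_t
    // is_null(1);             // would not compile: 1 is not a null pointer constant

    assert(which(0) == Picked::Int);            // exact match beats the pointer conversion
    assert(which(nullptr) == Picked::Pointer);  // nullptr never converts to int
}
```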

How about arrays? Is zero an array (😱) (see it live)?

void f(char arr[5]){} // Identical to
                      // void f(char *arr);
f(0); // Ok, zero is a null pointer constant

Ok, this probably feels like cheating but it works and is likely pretty surprising. Here we are seeing the effect of a parameter of type array of T being adjusted to pointer to T, see [dcl.fct]p5:

After determining the type of each parameter, any parameter of type “array of T” or of function type T is adjusted to be “pointer to T”.

This is another thing that C++ has inherited directly from C. In C++ we can avoid this adjustment by passing the array by reference. This is used, for example, by std::end to determine the size of the array:

template< class T, std::size_t N >
T* end( T (&array)[N] );

but C-style arrays are not idiomatic C++, and we should prefer std::array or std::vector. Generic code often still needs to support C-style arrays, as in the std::end example above.
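To make the array-by-reference idea concrete, here is a minimal sketch in the spirit of the std::end signature above. The names array_length, array_end and array_reference_demo are hypothetical and are not the standard library's implementation.

```cpp
#include <cassert>
#include <cstddef>

// Taking the array by reference keeps its type "array of N T",
// so the template can deduce N instead of seeing a bare pointer.
template <class T, std::size_t N>
constexpr std::size_t array_length(T (&)[N]) { return N; }

// Same shape as the std::end overload shown above: one past the last element.
template <class T, std::size_t N>
constexpr T* array_end(T (&array)[N]) { return array + N; }

inline void array_reference_demo() {
    char buffer[5] = {'a', 'b', 'c', 'd', 'e'};
    assert(array_length(buffer) == 5);
    assert(array_end(buffer) == buffer + 5);
    // array_length(0);  // would not compile: zero cannot bind to an array reference
}
```

Unlike the char arr[5] parameter earlier, a reference-to-array parameter is not adjusted to a pointer, so zero no longer sneaks in.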

Ok, so what about std::string? You may be saying, no way, zero can’t be a std::string can it? Let’s see (see it live):

void f(std::string s){}
f(0);             // Compiles, zero converts to a null const char*; undefined behavior, may crash
f("hello world"); // Ok, a string literal will decay to a pointer to a const char
f(reinterpret_cast<const char *>(1)); // Undefined behavior, 1 is not a valid pointer to a const char

What we are seeing here is that std::string has a constructor that takes a const char* (I am being loose here, it really should be CharT) and copies the contents:

constexpr basic_string( const CharT* s,
                        const Allocator& alloc = Allocator() );

Due to the legacy C conversion of zero to a null pointer constant, we can convert zero to a const char* and thus call the std::string constructor that expects a null-terminated C-style string to copy. It is worth noting that this is an easy trap to fall into:

std::string s(0); // Same problem, does not create a std::string
                  // of size one holding the char value 0
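If a string of size one holding the char value 0 is what was intended, one way to say so explicitly is the count-and-character constructor. This is only a sketch; one_null_char and string_construction_demo are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: a string of length one whose only character is '\0'.
inline std::string one_null_char() {
    return std::string(1, '\0');  // (count, character) constructor, no pointer involved
}

inline void string_construction_demo() {
    std::string s = one_null_char();
    assert(s.size() == 1);
    assert(s[0] == '\0');

    std::string empty;  // default construction if an empty string was the goal
    assert(empty.empty());
}
```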

Ironically the following does not work (see it live):

std::string s;
s=0;  // Ill-formed, ambiguous overload: are we passing a char or a const char*?
s=65; // Ok, likely will result in string storing character A

Assigning zero to a std::string does not work: std::string has an operator= overload that takes char and another that takes const char*, so assigning zero is ambiguous.
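One way around the ambiguity is to give the argument an unambiguous type. The sketch below uses hypothetical helpers assign_char and assignment_demo; the 65 case yields 'A' only on the assumption of an ASCII-compatible execution character set.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: the parameter is already a char, so only
// operator=(char) is a candidate for the assignment.
inline std::string assign_char(char c) {
    std::string s;
    s = c;
    return s;
}

inline void assignment_demo() {
    // A char-typed zero is not a null pointer constant, so nothing is ambiguous.
    std::string zero = assign_char('\0');
    assert(zero.size() == 1);
    assert(zero[0] == '\0');

    // 65 is not a null pointer constant either, so only the char overload is viable.
    std::string s;
    s = 65;
    assert(s.size() == 1);
    assert(s[0] == static_cast<char>(65));  // 'A' assuming ASCII
}
```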

One last place where zero is special is in declaring pure virtual member functions:

struct A{
  virtual void f() = 0; // Zero yet again!
};

I am not too happy to report that zero was used here to avoid having to add a new keyword so close to release. Nothing technical prevented an alternative such as pure or abstract; the obstacle was the difficulty of getting a new keyword accepted at that stage.
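For completeness, a minimal sketch of what the = 0 marker does in practice. Shape, Triangle and sides_of are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical interface: "= 0" makes sides() pure virtual and Shape abstract.
struct Shape {
    virtual int sides() const = 0;
    virtual ~Shape() = default;
};

// Overriding every pure virtual function makes the derived class concrete.
struct Triangle : Shape {
    int sides() const override { return 3; }
};

static_assert(std::is_abstract<Shape>::value, "Shape cannot be instantiated");
static_assert(!std::is_abstract<Triangle>::value, "Triangle can be instantiated");

// Calls through the base class dispatch to the override.
inline int sides_of(const Shape& s) { return s.sides(); }
```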

So this brings an end to zero’s super fun, but it does leave us with one burning question (How does one turn a meme into an article?). Also see the original meme that inspired this post:

[Image: the original "Is this a" meme]

Thank you to those who provided feedback on or reviewed this write-up: Ólafur Waage and Jan Wilmans.

Of course in the end, all errors are the author’s.

