Obscure C99 Array Features

 2 years ago
source link: https://dev.to/pauljlucas/obscure-c99-array-features-3270

Introduction

C99 introduced a number of new features for arrays. Even though C99 is over 20 years old, you seldom see these new features used in the wild. Because of that, you’re less likely to be familiar with them and so less likely to use them in your own code (but maybe you shouldn’t!). So here’s a tour of those features.

Variable Length Arrays

Prior to C99, all arrays had to be declared to be of a fixed length (known at compile-time). Introduced in C99, variable length arrays (VLAs) can be declared to be of a variable length (not known until run-time). For example:

size_t n = 10;
// ...
int a[n];

Not only that, but the sizeof operator, historically a compile-time operator, is now sometimes a run-time operator — when its argument is a VLA:

size_t sz = sizeof(a) / sizeof(*a); // sz = 10

(The first sizeof is evaluated at run-time because its argument is a VLA; the second sizeof is still evaluated at compile-time.)
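
As a minimal sketch of the run-time sizeof, the function below (vla_length is a hypothetical name, not from the original article) declares a VLA and recovers its length. It assumes n is at least 1 and small enough to fit on the stack:

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: returns the element count of a local VLA of length n.
// Assumes n >= 1 and that n ints fit comfortably on the stack.
size_t vla_length( size_t n ) {
    int a[n];                           // VLA: length known only at run-time
    size_t const bytes = sizeof(a);     // evaluated at run-time
    assert( bytes == n * sizeof(int) );
    return bytes / sizeof(*a);          // sizeof(*a) is a compile-time constant
}
```

Calling vla_length( 10 ) should yield 10, the same value as sz above.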

Variable Length Arrays Caveat

One serious caveat to VLAs is that, if the length is too big, it will silently overflow the stack. Additionally, unlike malloc() returning NULL upon failure, there's no way to detect when a VLA overflows the stack. Hence, if the size can be “too big,” your code has to guard against it:

#define A_LEN_MAX 1024     // a macro, so it is a constant expression
// ...

if ( n > A_LEN_MAX ) {
    // do something, e.g., report an error and return
}
int a[n];

However, if you know that A_LEN_MAX is the maximum safe size, then you might as well just declare a to be of that size and not use a VLA.

Incidentally, you could do small size optimization:

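void f( size_t n ) {
    int a[A_LEN_MAX];      // not a VLA only if A_LEN_MAX is a macro or enum constant
    int *const p = n <= A_LEN_MAX ? a : malloc( n * sizeof(*p) );
    if ( p == NULL )
        return;            // malloc() failed
    // use only 'p' to access array
    if ( p != a )
        free( p );
}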

That is, if n isn’t too big, use the (fixed sized) array on the stack; otherwise, use a dynamically sized array in the heap. This has the advantage of saving on the calls to malloc() and free() for “small” n yet still works for “large” n. But notice that a VLA is not being used.
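
Here is a fuller sketch of the same idea that actually does some work. The names BUF_LEN_MAX, sum_squares, and sum_squares_demo are hypothetical, as is the limit of 64. The limit is an enum constant so that the stack buffer is an ordinary fixed-size array and not a VLA:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { BUF_LEN_MAX = 64 };  // hypothetical limit; enum constant, so 'buf' is not a VLA

// Hypothetical example: computes 0*0 + 1*1 + ... + (n-1)*(n-1) using a scratch
// array.  Returns -1 if the heap allocation fails.
long sum_squares( size_t n ) {
    long buf[BUF_LEN_MAX];              // fixed-size stack buffer
    long *const p = n <= (size_t)BUF_LEN_MAX ? buf : malloc( n * sizeof(*p) );
    if ( p == NULL )
        return -1;
    for ( size_t i = 0; i < n; ++i )
        p[i] = (long)(i * i);
    long sum = 0;
    for ( size_t i = 0; i < n; ++i )
        sum += p[i];
    if ( p != buf )                     // free only what malloc() returned
        free( p );
    return sum;
}

void sum_squares_demo( void ) {
    assert( sum_squares( 4 ) == 14 );        // small n: stack buffer
    assert( sum_squares( 100 ) == 328350 );  // large n: heap
}
```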

Hence, the moral is: use VLAs only when you don’t know the size at compile time but can guarantee that it won’t be “too big.” However, this is pretty much never true.

In hindsight, VLAs, though they seem convenient at times, are problematic, so much so that C11 made VLAs an optional feature. Incidentally, C++ never adopted VLAs from C99.

Array Syntax for Function Parameters

As you should be aware, array syntax can be used to declare function parameters, but, as you should also be aware, it’s just syntactic sugar since the compiler rewrites such parameters as pointers:

void f( int a[] );         // int *a

Note that I’m intentionally writing “array syntax for parameters” and not “array parameters” because “array parameters,” despite appearances, simply don’t exist in C.

The only potential benefit of using array syntax for parameters is that it conveys to the human reader that a is presumed to be a pointer to at least one int rather than to exactly one int. However, it’s only a presumption and not a guarantee since you can call such a function with a null pointer:

f( NULL );                 // f’s 'a' will be NULL

Note that adding a size doesn’t help:

void f( int a[10] );       // int *a

While again this might convey to a human reader that a is presumed to be an array of 10 ints, the compiler ignores the size.

Non-Null Array Syntax Pointers for Function Parameters

One of the features added in C99 was the ability to declare an “array” function parameter that must not be null and be of a minimum size:

void f( int a[static 10] );

If you try to pass either NULL or an array that has fewer than 10 ints, the compiler may warn you (clang does, for example), but it isn’t required to; either way, the behavior is undefined.

This marks yet another overloading of the static keyword in C since this static has nothing to do with either linkage or duration.
Incidentally, C99 did not introduce a parallel way to specify that a pointer parameter must not be null.
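
A small sketch of how this might look in use (sum10 and sum10_demo are hypothetical names). Passing an array with more than 10 elements is fine because static gives only a minimum:

```c
#include <assert.h>

// Hypothetical function: the caller promises a non-null pointer to the first
// element of an array of at least 10 ints.
int sum10( int a[static 10] ) {
    int sum = 0;
    for ( int i = 0; i < 10; ++i )
        sum += a[i];
    return sum;
}

void sum10_demo( void ) {
    int big[12] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 };
    assert( sum10( big ) == 55 );   // OK: 12 >= 10; only the first 10 are summed
    // sum10( NULL ), or passing an int[5], would be undefined behavior.
}
```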

Array Syntax for Function Parameters Caveat

Using array syntax for function parameters can also be dangerous:

void f( int a[10] ) {      // int *a
    // BUG: sizeof(a) is the size of a pointer, not of 10 ints
    for ( size_t i = 0; i < sizeof(a)/sizeof(*a); ++i ) {
        // ...
    }
}

The intention is to iterate over all the elements of the array, but, despite the sizeof expression being correct for an array, a is, again, not an array, but a pointer; so you’ll get the size of a pointer divided by the size of an int. Fortunately, gcc will warn about this.
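
One common way around this, sketched below, is to pass the element count explicitly and to use the sizeof idiom only where the operand is a real array. The names sum, ARRAY_LEN, and sum_demo are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical fix: the caller passes the element count explicitly.
int sum( size_t n, int const a[] ) {    // int const *a
    int total = 0;
    for ( size_t i = 0; i < n; ++i )
        total += a[i];
    return total;
}

// Valid only when A is a real array, never a parameter.
#define ARRAY_LEN(A) (sizeof(A) / sizeof(*(A)))

void sum_demo( void ) {
    int ra[] = { 1, 2, 3, 4 };
    assert( ARRAY_LEN( ra ) == 4 );     // 'ra' is a real array here
    assert( sum( ARRAY_LEN( ra ), ra ) == 10 );
}
```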

Qualified Array Syntax for Function Parameters

To drive home that parameters declared with array syntax really are pointers, you can change them:

int ra[10];                // real array

void f( int pa[] ) {       // int *pa
    ++ra;                  // error (as expected)
    ++pa;                  // OK (surprisingly)

C99 also added the ability to qualify the rewritten pointer:

void f( int pa[const] ) {  // int *const pa
    ++pa;                  // error now

In addition to const, you can also qualify the pointer with volatile and restrict (CVR).
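
For instance, restrict inside the [] might be used as in this sketch (copy_ints and copy_ints_demo are hypothetical names) to promise that two “arrays” don’t overlap:

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical function: 'dst' and 'src' are promised not to overlap.
// Same as: void copy_ints( size_t n, int *restrict dst, int const *restrict src )
void copy_ints( size_t n, int dst[restrict], int const src[restrict] ) {
    for ( size_t i = 0; i < n; ++i )
        dst[i] = src[i];
}

void copy_ints_demo( void ) {
    int const from[3] = { 7, 8, 9 };
    int to[3] = { 0 };
    copy_ints( 3, to, from );
    assert( to[0] == 7 && to[1] == 8 && to[2] == 9 );
}
```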

Note that neither of these:

void f( int const pa[] );  // pointer to const int
void f( const int pa[] );  // same as above

is the same thing as int pa[const]: the const outside the [] refers to the int and not pa.

Why doesn’t the compiler convert parameters with array syntax to const pointers? Because const wasn’t a part of C when Dennis Ritchie created it: the array-to-pointer rewriting rule predates const, which arrived only with ANSI C (C89).

Variable Length Array Syntax for Function Parameters

C99 also added the ability to use VLAs for function parameters:

void f( size_t n, int a[n] ) { // int *a

That is, the size of the “array” is given by an integral parameter that precedes it. Note, however, that a is still a pointer. Despite having the size information at run-time, sizeof(a) will still return the size of the pointer. Hence, this “feature” serves only to convey to the human reader that n is the presumed size of the “array.”

However, this “feature” is actually useful for arrays of higher dimension. But before we get to that, a quick refresher on multidimensional array syntax for function parameters.

Multidimensional Array Syntax for Function Parameters

As you should be aware, array syntax can also be used to declare function parameters for multidimensional “arrays”:

void f( int a[10][20] );   // int (*a)[20]

The rule that the compiler converts array syntax for a function parameter into a pointer happens only for the first (left-most) dimension; the remaining dimension(s) keep their “array-ness.” Hence, a is a pointer to an actual array of 20 ints.

Note that the parentheses are necessary: without them, it would be an array of 20 pointers to int. FYI: to help decipher cryptic C declarations, you can use cdecl.

Pointers to array don’t often occur in C programs since the name of an array “decays” into a pointer to its first element. In most cases, that’s good enough even though the size information is lost. However, pointers to array retain the size information:

int (*p3)[3];              // pointer to array 3 of int
int (*p5)[5];              // pointer to array 5 of int

p5 = p3;                   // warning: incompatible pointers

Part of the reason pointers to array aren’t used much is because it’s clunky to access array elements since you have to dereference the pointer first:

int e1 = (*p3)[1];         // must dereference p3 first

However, you can dereference the pointer once into another pointer then use that pointer:

int *p = *p3;
int e1 = p[1];             // same as: (*p3)[1]
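
To make the difference concrete, this sketch (ptr_to_array_demo is a hypothetical name) compares a pointer to array with an ordinary decayed pointer to the same storage:

```c
#include <assert.h>
#include <stddef.h>

void ptr_to_array_demo( void ) {
    int a[3] = { 10, 20, 30 };
    int (*p3)[3] = &a;                  // pointer to the whole array
    int *p = a;                         // decayed: pointer to the first element

    assert( sizeof(*p3) == 3 * sizeof(int) );   // size information retained
    assert( sizeof(*p)  == sizeof(int) );       // size information lost
    assert( (*p3)[1] == 20 && p[1] == 20 );     // same element either way
}
```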

Multidimensional VLAs for Function Parameters

Multidimensional array syntax for parameters can be used for VLAs:

void f( size_t m, size_t n, int a[m][n] ) {

Since the compiler rewrites only the first array dimension as a pointer, the above is really:

void f( size_t m, size_t n, int (*a)[n] ) {

that is, a is a pointer to a VLA of n ints. In this case, the VLA is actually a useful feature since the n allows the compiler to know the length of each row of the array. Additionally, sizeof (the first one) once again becomes a run-time operator:

size_t sz = sizeof(*a) / sizeof(**a); // sz = n

Unlike VLAs in general, VLA syntax used for function parameters is safe since the actual array passed to the function can be (and often is) a normal array:

void f( size_t m, size_t n, int a[m][n] ) {
  // ...
}

void g() {
  int a[10][20];
  f( 10, 20, a );
}

There’s no new VLA being created at run-time here, so it can’t overflow the stack.
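
A complete sketch of this pattern, with a function body that uses ordinary a[i][j] indexing, might look like the following (matrix_sum and matrix_sum_demo are hypothetical names):

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical function: sums all elements of an m-by-n matrix.
int matrix_sum( size_t m, size_t n, int a[m][n] ) {    // int (*a)[n]
    assert( sizeof(*a) == n * sizeof(int) );    // run-time sizeof: one row
    int sum = 0;
    for ( size_t i = 0; i < m; ++i )
        for ( size_t j = 0; j < n; ++j )
            sum += a[i][j];                     // row stride comes from n
    return sum;
}

void matrix_sum_demo( void ) {
    int a[2][3] = { { 1, 2, 3 }, { 4, 5, 6 } }; // ordinary fixed-size array
    assert( matrix_sum( 2, 3, a ) == 21 );
}
```

The same function works for any dimensions, which a parameter declared as int a[2][3] could not do.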

Multidimensional VLA Parameters for Function Declarations

When declaring (as opposed to defining) functions, C allows you to omit the parameter names; however, if you do that, then there’s no name to specify the size of the VLA; but C99 added a new syntax for this case:

void f( size_t, size_t, int[][] );    // error
void f( size_t, size_t, int[*][*] );  // OK

That is, you use * to denote a VLA of an unnamed size. Note that since the first array dimension is always converted to a pointer, the * is needed only starting with the second dimension:

void f( size_t, size_t, int[][*] );   // same as previous

Hence, you never need * when using single dimension array syntax.
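
As a sketch of how a * prototype pairs with its definition (trace and trace_demo are hypothetical names), the declaration leaves the sizes unnamed while the definition names them:

```c
#include <assert.h>
#include <stddef.h>

// Prototype with unnamed parameters: '*' stands in for the unnamed size.
int trace( size_t, int[*][*] );

// Definition: the parameters are named, so the sizes can refer to 'n'.
int trace( size_t n, int a[n][n] ) {
    int sum = 0;
    for ( size_t i = 0; i < n; ++i )
        sum += a[i][i];                 // sum of the main diagonal
    return sum;
}

void trace_demo( void ) {
    int id[3][3] = { { 1, 0, 0 }, { 0, 1, 0 }, { 0, 0, 1 } };
    assert( trace( 3, id ) == 3 );
}
```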

Conclusion

C99 added the new array features of:

  1. VLAs (which are unsafe, so you probably shouldn’t use them).
  2. VLA syntax for function parameters (which is safe).
  3. static that requires non-null, minimum-sized arrays be passed as parameters.
  4. The ability to const, volatile, or restrict qualify the pointers that array-syntax function parameters are rewritten to.
