I looked at the usage of Array of length zero in C, and I feel quite inspired. However, from the standard's perspective (as opposed to compiler extensions), this feature exists only in C (since C99) and not in C++. Let’s dig into it.
First, let’s understand what an incomplete type is:
[ISO/IEC 14882:2014] A class that has been declared but not defined, an enumeration type in certain contexts (7.2), or an array of unknown size or of incomplete element type, is an incompletely-defined object type. Incompletely-defined object types and the void types are incomplete types (3.9.1)
In the C language ([ISO/IEC 9899:1999]), an incomplete type is:
- The void type comprises an empty set of values; it is an incomplete type that cannot be completed.
- An array type of unknown size is an incomplete type.
- A structure or union type of unknown content (as described in 6.7.2.3) is an incomplete type.
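The three cases listed above can be illustrated with a small sketch. The names `node`, `table`, `no_node`, `node_size`, `table_len` and `table_at` are hypothetical and chosen only for this example; the point is that a structure or array type can start out incomplete and be completed later in the same translation unit, whereas `void` can never be completed.

```c
#include <assert.h>
#include <stddef.h>

/* Incomplete structure type: declared, but its content is not yet known. */
struct node;

/* Incomplete array type: the size is unknown at this point. */
extern int table[];

/* A pointer to an incomplete type is itself a complete type,
   so it can be used before the structure is defined. */
struct node *no_node(void) { return NULL; }

/* Completing both types: from here on sizeof can be applied to them. */
struct node { int value; struct node *next; };
int table[4] = { 1, 2, 3, 4 };

size_t node_size(void) { return sizeof(struct node); }
size_t table_len(void) { return sizeof table / sizeof table[0]; }

/* Bounds-checked access, only to show the completed array in use. */
int table_at(size_t i)
{
    assert(i < table_len());
    return table[i];
}
```

Applying `sizeof` to `struct node` or `table` before the completing declarations would be rejected by the compiler, because the size of an incomplete type is not known.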
OK, the groundwork is set; let’s continue with Array of length zero.
In C, a struct can contain an incomplete type (but not arbitrarily):
[ISO/IEC 9899:1999] A structure or union shall not contain a member with incomplete or function type (hence, a structure shall not contain an instance of itself, but may contain a pointer to an instance of itself), except that the last member of a structure with more than one named member may have incomplete array type; such a structure (and any union containing, possibly recursively, a member that is such a structure) shall not be a member of a structure or an element of an array.
This means: the last member of a structure with more than one named member can be an array of incomplete type (C99 calls this a flexible array member).
Note: this feature was introduced in C99; compare the requirements for members of struct and union in ANSI C89 (ISO C90):
[ISO/IEC 9899:1990] A structure or union shall not contain a member with incomplete or function type. Hence it shall not contain an instance of itself (but may contain a pointer to an instance of itself).
ANSI C89 stipulates that struct and union cannot contain members of incomplete type or function type.
Knowing which standard version introduced a feature helps you write robust, portable code and rule out external factors, such as how well a given compiler version supports that standard.
With the feature introduced in C99, Array of length zero can be used to dynamically expand a struct:
1 | typedef struct A{ |
Although a pointer can also be used:
1 | typedef struct A{ |
However, using Array of length zero saves the overhead of sizeof(char*) (as well as the alignment padding), and the first method places the struct and its data in one contiguous allocation, which can reduce memory fragmentation.
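As a rough sketch of the two approaches compared above, the following C99 code allocates a variable-length buffer both ways. The type and function names (`FlexBuf`, `PtrBuf`, `flexbuf_new`, `ptrbuf_new`, `ptrbuf_free`) and the `len` field are assumptions made for illustration; they are not the structs from the article.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical example type: header and payload share one allocation. */
typedef struct {
    size_t len;
    char   data[];          /* flexible array member, must be last */
} FlexBuf;

/* Hypothetical example type: payload lives in a second allocation. */
typedef struct {
    size_t len;
    char  *data;
} PtrBuf;

/* One malloc: sizeof(FlexBuf) covers the header, len covers the payload. */
FlexBuf *flexbuf_new(const char *src, size_t len)
{
    assert(src != NULL);
    FlexBuf *b = malloc(sizeof *b + len);
    if (b == NULL)
        return NULL;
    b->len = len;
    memcpy(b->data, src, len);
    return b;               /* release with a single free(b) */
}

/* Two mallocs: one for the header, one for the payload. */
PtrBuf *ptrbuf_new(const char *src, size_t len)
{
    assert(src != NULL);
    PtrBuf *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->data = malloc(len ? len : 1);   /* avoid a zero-size request */
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    b->len = len;
    memcpy(b->data, src, len);
    return b;
}

/* The pointer version needs both blocks released, payload first. */
void ptrbuf_free(PtrBuf *b)
{
    if (b != NULL) {
        free(b->data);
        free(b);
    }
}
```

The flexible-array version needs one allocation and one `free`, and the payload directly follows the header in memory. The pointer version costs an extra pointer per object plus a second heap block that may end up anywhere.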
Using sizeof(Astruct) shows that its size is just sizeof(char), while sizeof(AstructPtr) is sizeof(char) + padding + sizeof(char*):
1 | printf("%llu\n",sizeof(Astruct)); |
From this perspective, it does save space: one pointer plus its alignment padding per object. For specifics on member padding and alignment in struct, you can refer to my article: Structure Member Memory Alignment Issues.
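The size comparison can be checked with a short sketch like the one below. The type names `WithFlex` and `WithPtr` and the helper functions are hypothetical stand-ins for the article's two structs; the exact numbers depend on the platform, so the code computes the padding with `offsetof` rather than assuming a value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the flexible-array version. */
typedef struct {
    char a;
    char alz[];             /* contributes no storage to sizeof */
} WithFlex;

/* Hypothetical stand-in for the pointer version. */
typedef struct {
    char  a;
    char *ptr;              /* usually forces padding after 'a' */
} WithPtr;

size_t flex_size(void) { return sizeof(WithFlex); }
size_t ptr_size(void)  { return sizeof(WithPtr); }

/* Bytes of padding inserted between 'a' and 'ptr'. */
size_t ptr_padding(void)
{
    assert(offsetof(WithPtr, ptr) >= sizeof(char));
    return offsetof(WithPtr, ptr) - sizeof(char);
}
```

On a typical 64-bit target one would expect `flex_size()` to be 1 and `ptr_size()` to be 16 (1 byte, 7 bytes of padding, 8 bytes of pointer), but these values are implementation-dependent. When printing them, the C99 format specifier for `size_t` is `%zu`.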
However, pay attention to the last sentence of the C99 citation: a structure that ends in an Array of length zero shall not itself be a member of another structure or an element of an array, so the array must always sit at the very end of the complete object's layout. This is also why the feature cannot exist in C++: C++ has inheritance, so a derived class contains the members of its base class, but the C++ standard does not fully specify how a class is laid out in memory, so it cannot guarantee that the last member of an inherited base class is also the last member of the derived class. Nor can C++ guarantee that the last declared member of a class sits at the end of an instance in memory (for example, the location of the virtual table pointer depends on the implementation). This touches on the C++ object model; for details, see Inside The C++ Object Model.
That is, in C++, a non-static data member of a class cannot have an incomplete type:
[ISO/IEC 14882:2014] Non-static data members shall not have incomplete types. In particular, a class C shall not contain a non-static member of class C, but it can contain a pointer or reference to an object of class C.
Therefore, the Array of length zero technique from C cannot be used in standard C++; where it compiles, it does so only as a compiler extension. From the C++ standard's perspective, code that uses this technique has Undefined Behavior.
Although C++ can also use {} to initialize an array of unknown size, the initializer list must not be empty:
[ISO/IEC 14882:2014] An array of unknown size initialized with a brace-enclosed initializer-list containing n initializer-clauses, where n shall be greater than zero, is defined as having n elements (8.3.4).
    int x[] = { 1, 3, 5 };
declares and initializes x as a one-dimensional array that has three elements since no size was specified and there are three initializers. An empty initializer list {} shall not be used as the initializer-clause for an array of unknown bound.
The syntax provides for empty initializer-lists, but nonetheless C++ does not have zero length arrays.
Thus, in C++, access to A::alz is undefined behavior (if the code compiles at all):
1 | struct A{ |
The above code compiled successfully without any warnings using MinGW-W64 G++ 6.2.0 and Clang++ 3.9 x86_64-w64-windows-gnu with the following parameters:
1 | g++ -o test test.cc |
In VS2015, compiling gives a warning:
warning C4200: non-standard extension: zero size array in structure/union
Here are some articles and discussions about Array of length zero: