Reading The Design and Evolution of C++, with Answers to Some Questions

The Design and Evolution of C++ is a book by Bjarne Stroustrup, the creator of C++. It discusses the reasoning and trade-offs behind the language, from conception and design through to implementation. It is arguably the only book on the market about language design written from the language designer’s own perspective.

For many topics it is not enough to know how something works; understanding why it works that way leads to a deeper grasp of the subject. Fortunately, The Design and Evolution of C++ is exactly that kind of book. I recently realized that for many aspects of C++ I knew the how but not the why (a great deal was sacrificed for compatibility with C). This article is my set of reading notes and a record of those whys, which I will keep organizing over time.

What is Object-Oriented Programming?

... This “What is” paper (What is ‘‘Object-Oriented Programming’’? (1991 revised version)) defines a set of problems that, in my view, a language supporting data abstraction and object-oriented programming should solve, and gives examples of the language features required. The result reaffirms the importance of C++’s “multi-paradigm” nature:
“Object-oriented programming is programming using inheritance mechanisms. Data abstraction is programming using user-defined types. With few exceptions, object-oriented programming will be able to and should support data abstraction. These techniques require effective and correct support. Data abstraction essentially needs to be supported in the form of language features, while object-oriented programming further requires support from programming environments. To achieve universality, a language supporting data abstraction or object-oriented features must effectively utilize traditional hardware.”

More content about C++ and object orientation can be found in these articles:
Why C++ is not just an Object-Oriented Programming Language
What is ‘‘Object-Oriented Programming’’? (1991 revised version)
and SOLID (Single Responsibility, Open/Closed Principle, Liskov Substitution, Interface Segregation, and Dependency Inversion)

Why does C++ not have a Garbage Collection (GC) mechanism?

Before 1985, garbage collection was considered, but it was judged unsuitable for a language already in use for real-time processing and low-level systems work (for example, device drivers). In those days, garbage collectors were less sophisticated than they are today, and the processing power and memory capacity of a typical computer were a small fraction of what today’s systems offer.

Had C with Classes (or even C++) been defined to require garbage collection, it would certainly have been more elegant, but it would also have been stillborn.

What is the relationship between C++ and C?

... Another aspect of the compatibility question is more critical: “In what ways must C++ differ from C in order to achieve its goals?” and “In what ways must C++ be compatible with C in order to achieve its goals?”

... A common opinion gradually emerged: there should be no arbitrary incompatibilities between C++ and ANSI C [Stroustrup, 1986], but incompatibilities may exist as long as they are not arbitrary. ... This principle became widely known as: C++ should be as close to C as possible, but no closer.

For example, the following code is valid in C:

#include <stdio.h>

struct outer
{
    struct inner
    {
        int i;
    };
    int j;
};

int main(int argc, char* argv[]) {
    struct inner x = {1};
    printf("%d\n", x.i);
}

However, it is illegal in C++:

error: variable has incomplete type ‘struct inner’

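In C, a struct nested inside another struct does not get a scope of its own, so inner is visible at file scope. In C++, a class (or struct) introduces its own scope, much like a namespace, so inner is a member of outer and has to be named outer::inner. The correct way to write it in C++ is: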

#include <stdio.h>

struct outer {
    struct inner {
        int i;
    };
    int j;
};

int main(int argc, char* argv[]) {
    // In C++ the 'struct' keyword here is optional.
    // Comparing the intermediate code (Clang with "-S -emit-llvm") shows no difference between the two spellings.
    struct outer::inner x = {1};
    printf("%d\n", x.i);
}

Similarly, consider this case:

#include <stdio.h>

struct outer {
} val;

int main(int argc, char* argv[])
{
    // sizeof yields a size_t, so print it with %zu rather than %d
    printf("%zu\n", sizeof(val));
}

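Compiled as C and as C++ (GCC/G++), this code produces different results: GCC prints 0 and G++ prints 1. Note that a struct with no members is not valid ISO C; GCC accepts it as an extension and gives it size 0.
See the following intermediate code (LLVM IR):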

// GCC
%struct.outer = type {}
@x = common global %struct.outer zeroinitializer, align 1

// G++
%struct.outer = type { i8 }
@x = global %struct.outer zeroinitializer, align 1

In fact, when outer is compiled as C++, the compiler inserts one hidden byte into it, so that any two objects of the class get distinct addresses in memory (Inside the C++ Object Model, p. 84).

In C++ every instance must have a unique address, and a zero-sized type could not guarantee that: consecutive elements of an array, for example, would share an address. Compilers therefore give an empty class a size of at least one byte.

The C++ standard (ISO/IEC 14882:2014, C++14) states this rule in the Classes clause (p. 214):

Complete objects and member subobjects of class type shall have nonzero size. [^108][Note: Class objects can be assigned, passed as arguments to functions, and returned by functions (except objects of classes for which copying or moving has been restricted; see 12.8). Other plausible operators, such as equality comparison, can be defined by the user; see 13.5. —end note ]
[^108]: Base class subobjects are not so constrained.

Footnote 108[^108] covers the following situation:

#include <iostream>

class X {};
class Y : X {};  // X is an empty base class of Y

int main() {
    // What do sizeof(X) and sizeof(Y) yield?
    std::cout << sizeof(X) << '\n'
              << sizeof(Y) << '\n';
    // Mainstream compilers print 1 for both: the empty base adds no size to Y
}
The long interval between C++03 and C++11 release dates (TC++PL4)

[TC++PL 4th] One reason for the long gap between the two standards is that most members of the committee (including me) were under the mistaken impression that the ISO rules required a ‘‘waiting period’’ after a standard was issued before starting work on new features. Consequently, serious work on new language features did not start until 2002. Other reasons included the increased size of modern languages and their foundation libraries.

That is the end of the article. If you have any questions, feel free to leave a comment.

Title: Reading The Design and Evolution of C++, with Answers to Some Questions
Author:LIPENGZHA
Publish Date:2016/09/09 08:15
Word Count:4.5k Words
Link:https://en.imzlp.com/posts/30227/
License: CC BY-NC-SA 4.0
Reprinting of the full article is prohibited.