C++ supports object-oriented programming, and inheritance and polymorphism are among its most important features. Earlier articles discussed various aspects of C++ inheritance and class members at length. This article investigates one way compilers implement C++ polymorphism: virtual function tables.
The C++ standard (ISO/IEC 14882:2014) states:
Virtual functions support dynamic binding and object-oriented programming. A class that declares or inherits a virtual function is called a polymorphic class.
Note: The C++ standard does not specify how polymorphism is implemented, so the mechanism is an implementation detail of each compiler and ABI. Different compilers may implement polymorphism differently, and the experiments below may give different results on other platforms.
Therefore, it is necessary to outline the compilation environment for the code in this article. The code compilation uses the C++14 standard (-std=c++14):
### When is the virtual function table initialized?
When is the virtual table pointer stored in a class instance? Let's analyze the following simple code through its LLVM IR:
```cpp
#include <iostream>

class A {
public:
    virtual void vfunc_one(int) { std::cout << "A::vfunc_one" << std::endl; }
    virtual void vfunc_two(int) { std::cout << "A::vfunc_two" << std::endl; }
private:
    int ival;
};

class B : public A {
public:
    virtual void vfunc_one(int) { std::cout << "B::vfunc_one" << std::endl; }
    virtual void vfunc_two(int) { std::cout << "B::vfunc_two" << std::endl; }
    char cval;
};

int main() {
    B bobj;  // constructing bobj is what triggers the vptr initialization
    return 0;
}
```
Let’s look at the object layout for types A and B:
```
// class object layout
*** Dumping AST Record Layout
         0 | class A
         0 |   (A vtable pointer)
         8 |   int ival
           | [sizeof=16, dsize=12, align=8,
           |  nvsize=12, nvalign=8]

*** Dumping AST Record Layout
         0 | class B
         0 |   class A (primary base)
         0 |     (A vtable pointer)
         8 |     int ival
        12 |   char cval
           | [sizeof=16, dsize=13, align=8,
           |  nvsize=13, nvalign=8]

// class memory align
%class.B = type { %class.A.base, i8, [3 x i8] }
%class.A.base = type <{ i32 (...)**, i32 }>
%class.A = type <{ i32 (...)**, i32, [4 x i8] }>
```
In Clang's implementation, the vptr sits at the start of the object and is an ordinary pointer (8 bytes in my compilation environment). The memory layout of A is vptr (sizeof(void*) = 8) + ival (sizeof(int) = 4) + 4 bytes of padding = 16 bytes. The memory layout of B is A's base-class subobject without its tail padding (vptr + ival = 12 bytes) + cval (sizeof(char) = 1) + 3 bytes of padding = 16 bytes. In other words, cval is placed in what would be A's tail padding, which is why the IR uses %class.A.base rather than %class.A for the base. For details on memory alignment, you can refer to my previous article: Memory Alignment Issues of Structure Members.
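The figures above can also be checked from C++ itself. The following is a minimal sketch, not part of the original experiment: the helper names `min_size_of_A` and `print_layout` are made up for illustration, and the accessor `value()` is added only so that the private member is used. The concrete numbers it prints (16 and 16 in my environment) depend on the compiler and ABI, so treat them as observations rather than guarantees.

```cpp
#include <cstddef>
#include <iostream>

class A {
public:
    virtual void vfunc_one(int) {}
    virtual void vfunc_two(int) {}
    int value() const { return ival; }  // illustrative accessor, not in the article's code
private:
    int ival = 0;
};

class B : public A {
public:
    void vfunc_one(int) override {}
    void vfunc_two(int) override {}
    char cval = 0;
};

// Lower bound for sizeof(A), assuming the hidden vptr is one ordinary pointer:
// the pointer plus the int, before any padding is added.
constexpr std::size_t min_size_of_A() { return sizeof(void*) + sizeof(int); }

// Prints the quantities discussed above; the values are ABI-specific.
void print_layout(std::ostream& os) {
    os << "sizeof(void*) = " << sizeof(void*) << '\n'
       << "sizeof(A) = " << sizeof(A) << ", alignof(A) = " << alignof(A) << '\n'
       << "sizeof(B) = " << sizeof(B) << ", alignof(B) = " << alignof(B) << '\n';
}
```

Calling `print_layout(std::cout)` from `main` prints the sizes for the current platform. If `sizeof(B)` comes out larger than `sizeof(A)`, the ABI in use probably does not place `cval` in the base class's tail padding.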
The key LLVM IR instruction generated for the above C++ code is:
```
// set vptr in _ZN1BC2Ev
store i32 (...)** bitcast (i8** getelementptr inbounds ({ [4 x i8*] }, { [4 x i8*] }* @_ZTV1B, i32 0, inrange i32 0, i32 2) to i32 (...)**), i32 (...)*** %5, align 8
```
The address of element 2 of _ZTV1B is stored into the vptr. _ZTV1B is the mangled name of B's virtual function table (not of the vptr itself), which can be checked with c++filt:
```
$ c++filt _ZTV1B
vtable for B
```
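The same lookup can be done from inside a program. This is a small sketch that assumes a GCC or Clang toolchain, whose C++ runtime provides `abi::__cxa_demangle` in `<cxxabi.h>`; that function is not part of standard C++, and the wrapper name `demangle` is my own.

```cpp
#include <cstdlib>    // std::free
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang runtime, not standard C++)
#include <string>

// Returns the demangled form of `mangled`, or the input unchanged on failure.
std::string demangle(const char* mangled) {
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return mangled;
    }
    std::string result(text);
    std::free(text);  // the buffer is malloc-allocated by __cxa_demangle
    return result;
}
```

For example, `demangle("_ZTV1B")` yields `vtable for B`, and `demangle("_ZTI1B")` yields `typeinfo for B`, the symbol that appears as the second entry of the table later in this article.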
In summary, the compiler emits the vptr initialization inside the constructor, after the call to the base class's constructor (which assigns its own vptr first). Therefore, a virtual function called in the base class's constructor is not dispatched to the derived class's override.
### Storage location of virtual function tables, and their initialization as seen from assembly
This section expands and elaborates on the previous analysis:
How is the virtual table pointer initialized?
Does each instance have its own copy of the virtual function table?
Is there polymorphic behavior when calling virtual functions within the base class’s constructor?
With the above questions in mind, let’s look at the following simple example:
```
# Virtual function table for class B
    .lcomm   _ZStL8__ioinit,1                # @_ZStL8__ioinit
    .section .rdata$_ZTV1B,"dr",discard,_ZTV1B
    .globl   _ZTV1B                          # @_ZTV1B
    .p2align 3
_ZTV1B:
    .quad    0
    .quad    _ZTI1B
    .quad    _ZN1B4funcEv
    .quad    _ZN1B5func2Ev

# Virtual function table for class A
    .section .rdata$_ZTV1A,"dr",discard,_ZTV1A
    .globl   _ZTV1A                          # @_ZTV1A
    .p2align 3
_ZTV1A:
    .quad    0
    .quad    _ZTI1A
    .quad    _ZN1A4funcEv
    .quad    _ZN1A5func2Ev
```
We can see that the virtual function tables for class A and class B are stored under the symbols _ZTV1A and _ZTV1B, which contain the pointers to the virtual functions and are located in the read-only data section (.rdata). Thus, there is only one copy of each table for the whole program, shared by all instances of the class. Their layout is as follows:
```
# LLVM-IR
B: [4 x i8*] [
    i8* null,
    i8* bitcast ({ i8*, i8*, i8* }* @_ZTI1B to i8*),
    i8* bitcast (void (%class.B*)* @_ZN1B4funcEv to i8*),
    i8* bitcast (void (%class.B*)* @_ZN1B5func2Ev to i8*)
]

A: [4 x i8*] [
    i8* null,
    i8* bitcast ({ i8*, i8* }* @_ZTI1A to i8*),
    i8* bitcast (void (%class.A*)* @_ZN1A4funcEv to i8*),
    i8* bitcast (void (%class.A*)* @_ZN1A5func2Ev to i8*)
]
```
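That each class has a single table shared by all of its instances can be observed at run time by comparing the vptr values of several objects. The sketch below is deliberately non-portable: it assumes the vptr is the first pointer-sized word of the object, which matches the layout shown earlier but is not guaranteed by the standard. The helper names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

class A {
public:
    virtual void func() {}
    virtual void func2() {}
};

class B : public A {
public:
    void func() override {}
    void func2() override {}
};

// Copies the first pointer-sized word of the object representation.
// ASSUMPTION: the vptr lives at offset 0, as in the record layout above.
std::uintptr_t read_vptr(const A& obj) {
    std::uintptr_t word = 0;
    std::memcpy(&word, static_cast<const void*>(&obj), sizeof(word));
    return word;
}

// Two instances of the same class are expected to refer to the same table.
bool instances_share_table() {
    B first;
    B second;
    return read_vptr(first) == read_vptr(second);
}

// Instances of different classes are expected to refer to different tables.
bool classes_use_distinct_tables() {
    A base;
    B derived;
    return read_vptr(base) != read_vptr(derived);
}
```

On the toolchains I would expect this article to apply to, both functions return `true`: the objects only carry a pointer, and the tables themselves stay in read-only data.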
The class constructors take the address of the virtual function table from the .rdata section and assign it to the instance's vptr. Note that besides the function pointers, the table also holds the offset to the top of the object, used to adjust this (the first element), and a pointer to the class's type information (the second element).
When assigning to the instance’s vptr, an offset is applied:
```
leaq    _ZTV1A(%rip), %rax
addq    $16, %rax
```
This skips the first two elements of the virtual table structure (2 × 8 bytes), so the vptr points directly at the slot that stores the address of the first virtual function.
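To make the role of that 16-byte offset more concrete, here is a hand-written model of the same table shape in plain C++. It is only an illustration under my own assumptions, not what the compiler generates: `VTable`, `Object`, `Slot`, and all function names are hypothetical, and a string stands in for the pointer to the type information.

```cpp
#include <cstddef>
#include <string>

struct Object;                                 // hand-rolled stand-in for a polymorphic object
using Slot = std::string (*)(const Object*);   // one "virtual function" entry

// Mirrors the four-entry tables shown above (a model, not compiler output).
struct VTable {
    std::ptrdiff_t offset_to_top;  // first element: 0 for these classes
    const char*    type_name;      // stand-in for the pointer to _ZTI1A / _ZTI1B
    Slot           slots[2];       // func, func2
};

struct Object {
    const Slot* vptr;              // points at slots[0], past the two leading entries
};

std::string a_func(const Object*)  { return "A::func"; }
std::string a_func2(const Object*) { return "A::func2"; }
std::string b_func(const Object*)  { return "B::func"; }
std::string b_func2(const Object*) { return "B::func2"; }

const VTable vtable_A{0, "A", {a_func, a_func2}};
const VTable vtable_B{0, "B", {b_func, b_func2}};

// What a constructor does in this model: store the address of the first slot.
Object make_A() { return Object{vtable_A.slots}; }
Object make_B() { return Object{vtable_B.slots}; }

// A "virtual call": load the slot through the vptr, then call it.
std::string call_slot(const Object& obj, std::size_t index) {
    return obj.vptr[index](&obj);
}

// Distance from the start of the table to the slot the vptr refers to.
std::size_t address_point_offset() { return offsetof(VTable, slots); }
```

With 8-byte pointers, `address_point_offset()` is 16, the same constant as in the `addq $16, %rax` instruction, and `call_slot(make_B(), 0)` returns `"B::func"` without the caller knowing which table the object carries.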
It is also worth emphasizing that constructing a B first runs A's constructor. At that point the vptr store in B's constructor has not yet executed, so a virtual function called inside A's constructor shows no polymorphic behavior: the instance's vptr still points to A's virtual function table.
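This behavior is guaranteed by the language and easy to observe. The following sketch records which override actually runs at each stage of construction; `call_log`, `invoke`, and `construct_B_and_log` are hypothetical helpers introduced for the demonstration, and the virtual destructor is my addition.

```cpp
#include <string>
#include <vector>

// Collects the name of every override that actually runs.
std::vector<std::string>& call_log() {
    static std::vector<std::string> log;
    return log;
}

class A;
void invoke(A& obj);  // performs an ordinary virtual call; defined below

class A {
public:
    A() { invoke(*this); }  // runs while the object is still "only an A"
    virtual ~A() = default;
    virtual void func() { call_log().push_back("A::func"); }
};

class B : public A {
public:
    B() { invoke(*this); }  // runs after B's own vptr has been stored
    void func() override { call_log().push_back("B::func"); }
};

void invoke(A& obj) { obj.func(); }

// Constructs one B and returns the overrides that ran, in order.
std::vector<std::string> construct_B_and_log() {
    call_log().clear();
    B instance;
    return call_log();
}
```

`construct_B_and_log()` returns `A::func` followed by `B::func`: the call made from A's constructor reaches A's version even though the object being built is a B.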
For further material on object models, I recommend reading "Inside the C++ Object Model". Published in 1996, in the early days of C++, it is dated in places, since compilers may now implement some features differently (the standard does not prescribe an implementation). It still has reference value for the ideas behind the implementation, and there is currently no equivalent book available.
That concludes the article. If you have any questions, feel free to leave a comment.