Analyzing C++ Code Semantics Through IR Code

IR is the intermediate representation generated by LLVM. By analyzing the IR, we can see how the compiler interprets the code we write, which makes its semantics clearer. The syntax and semantics of the IR are described in the LLVM Language Reference Manual.

The command to generate IR code using Clang/LLVM is as follows:

clang++ -S -emit-llvm source.cpp

This will generate a source.ll file, which is the IR code we need to analyze.
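
Any small translation unit is enough to try the command. The following is a minimal sketch of a hypothetical source.cpp. The names add_one and check_add_one are assumptions made for illustration, and extern "C" is used only so that the function shows up under its unmangled name in source.ll.

```cpp
#include <cassert>

// Hypothetical example function, not taken from the article.
// extern "C" disables C++ name mangling, so the IR refers to it as @add_one.
extern "C" int add_one(int value) {
    // Without optimization, this local is expected to appear in the IR
    // as an alloca followed by a store and a load.
    int result = value + 1;
    return result;
}

// Small self-check that a main() could call.
void check_add_one() {
    assert(add_one(41) == 42);
}
```

Since the file has no main, it can be translated to IR but not linked into a program on its own, which is all the command above requires.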

Let’s start with a very simple example:

int x;
cout<<x<<endl;

We all know that an object with automatic or dynamic storage duration that is not initialized has an indeterminate value. The C++ standard puts it this way:

If no initializer is specified for an object, the object is default-initialized. When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced (5.17).

So how does the compiler deal with this? Here is the intermediate code, with irrelevant parts omitted for brevity:

%2 = alloca i32, align 4
%3 = load i32, i32* %2, align 4
%4 = call dereferenceable(272) %"class.std::basic_ostream"* @_ZNSolsEi(%"class.std::basic_ostream"* @_ZSt4cout, i32 %3)
%5 = call dereferenceable(272) %"class.std::basic_ostream"* @_ZNSolsEPFRSoS_E(%"class.std::basic_ostream"* %4, %"class.std::basic_ostream"* (%"class.std::basic_ostream"*)* @_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_)

We can see that defining an automatic object of type int and printing it (for the difference between a declaration and a definition, see the article on declarations and definitions in C++) performs the following steps:

  1. Allocate stack space for an i32; %2 holds the address of that space.
  2. Load the i32 value stored at that address into %3.
  3. %4 and %5 call operator<< on cout, first to print the value %3 and then to apply endl.

Note: No operations were performed on the allocated memory here, so the value obtained from it is indeterminate.
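
The symbols in the call instructions, such as @_ZNSolsEi and @_ZSt4cout, are mangled C++ names. As a reading aid, here is a minimal sketch that demangles them. It assumes a GCC- or Clang-based toolchain that follows the Itanium C++ ABI and ships <cxxabi.h>; the helper name demangle is hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangle an Itanium C++ ABI symbol name.
// Returns an empty string if the name cannot be demangled.
// Assumption: the toolchain provides abi::__cxa_demangle in <cxxabi.h>.
std::string demangle(const char* mangled) {
    int status = 0;
    char* text = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    std::string result = (status == 0 && text != nullptr) ? text : "";
    std::free(text);  // the buffer is malloc-allocated; free(nullptr) is a no-op
    return result;
}

// Small self-check that a main() could call.
void check_demangle() {
    assert(demangle("_ZSt4cout") == "std::cout");
}
```

Note that the leading @ is IR syntax and not part of the symbol, so it has to be dropped before demangling. With this helper, _ZNSolsEi should come out as the operator<< overload of std::ostream that takes an int. On the command line, the c++filt tool does the same job.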

Now let’s compare it with an operation that has initialization:

int x=111;
cout<<x<<endl;

The IR code is:

%2 = alloca i32, align 4
store i32 111, i32* %2, align 4
%3 = load i32, i32* %2, align 4
%4 = call dereferenceable(272) %"class.std::basic_ostream"* @_ZNSolsEi(%"class.std::basic_ostream"* @_ZSt4cout, i32 %3)
%5 = call dereferenceable(272) %"class.std::basic_ostream"* @_ZNSolsEPFRSoS_E(%"class.std::basic_ostream"* %4, %"class.std::basic_ostream"* (%"class.std::basic_ostream"*)* @_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_)

Note the added second line:

store i32 111, i32* %2, align 4

This line indicates storing the i32 value 111 into the space pointed to by i32* %2.
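
To make the comparison easier to reproduce, here is a minimal sketch with two well-defined variants whose IR can be inspected side by side. The function names are hypothetical. The uninitialized case is deliberately left out, because reading an indeterminate value is undefined behavior.

```cpp
#include <cassert>

// Copy-initialized local: the unoptimized IR is expected to store 111
// into the alloca'd slot before loading it back.
int copy_initialized() {
    int x = 111;
    return x;
}

// Value-initialized local: empty braces zero-initialize an int, so the
// unoptimized IR is expected to contain a store of 0 before the load.
int value_initialized() {
    int x{};
    return x;
}

// Small self-check that a main() could call.
void check_initialized_locals() {
    assert(copy_initialized() == 111);
    assert(value_initialized() == 0);
}
```

Generating IR for this file with the command shown earlier should give one store per function, which is exactly the line that is missing in the uninitialized example.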

Next, let’s analyze another question: does initializing only some elements of an array also initialize the remaining elements?

int iarr[123]={111,222,333};

We already know the answer, because the standards state the following:

[ISO/IEC 14882:2014]

float y[4][3] = {{ 1 }, { 2 }, { 3 }, { 4 }};

initializes the first column of y (regarded as a two-dimensional array) and leaves the rest zero.

[ISO/IEC 9899:1999]

If there are fewer initializers in a brace-enclosed list than there are elements or members of an aggregate, or fewer characters in a string literal used to initialize an array of known size than there are elements in the array, the remainder of the aggregate shall be initialized implicitly the same as objects that have static storage duration.

So how does the compiler handle this? Its IR code is as follows:

%2 = alloca [123 x i32], align 16
%3 = bitcast [123 x i32]* %2 to i8*
call void @llvm.memset.p0i8.i64(i8* %3, i8 0, i64 492, i32 16, i1 false)
%4 = bitcast i8* %3 to [123 x i32]*
; Compute each element's address and store its initializer
%5 = getelementptr [123 x i32], [123 x i32]* %4, i32 0, i32 0
store i32 111, i32* %5
%6 = getelementptr [123 x i32], [123 x i32]* %4, i32 0, i32 1
store i32 222, i32* %6
%7 = getelementptr [123 x i32], [123 x i32]* %4, i32 0, i32 2
store i32 333, i32* %7

The steps are:

  1. Allocate space for 123 i32 elements; %2 holds the address of that space.
  2. Bitcast that address to an i8* pointer, %3 (the starting address of the array).
  3. Call memset to set 492 bytes (123 elements of 4 bytes each) to 0, starting at %3.
  4. Bitcast %3 back to an array pointer, %4, and use getelementptr on it to compute the addresses of the elements with indices 0 (%5), 1 (%6), and 2 (%7).
  5. Store the values from the initializer list to those addresses in sequence (111 to %5, 222 to %6, 333 to %7).
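
The steps above can be sketched by hand in C++ and compared against aggregate initialization. This is only an illustration of what the IR does, not code the compiler emits; the function names are hypothetical, and the 492-byte figure assumes a 4-byte int.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>

// Hand-written equivalent of the IR: zero every byte of the array,
// then store the three explicit initializers.
void fill_like_ir(int (&arr)[123]) {
    std::memset(arr, 0, sizeof(arr));  // 492 bytes when int is 4 bytes wide
    arr[0] = 111;
    arr[1] = 222;
    arr[2] = 333;
}

// Returns true if the manual version matches aggregate initialization.
bool matches_aggregate_init() {
    int expected[123] = {111, 222, 333};  // the other 120 elements are zero
    int manual[123];
    fill_like_ir(manual);
    return std::equal(std::begin(expected), std::end(expected),
                      std::begin(manual));
}

// Small self-check that a main() could call.
void check_array_init() {
    assert(matches_aggregate_init());
}
```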

Next, let’s analyze the example from CppQuiz#Q5. The original question is as follows:

#include <iostream>

struct AClass {
    AClass() { std::cout << "AClass"; }
};
struct BClass {
    BClass() { std::cout << "BClass"; }
};

class CClass {
public:
    CClass() : aobj(), bobj() {}

private:
    BClass bobj;
    AClass aobj;
};

int main()
{
    CClass();
}

Question: What is the execution result of this code under C++11?

Suppose we don’t know what the program prints (the answer is available at the link above) and analyze it from the IR instead.
We construct a temporary CClass object in main, and from the mem-initializer list we might expect its members to be initialized in the order aobj, then bobj. What actually happens?
The IR code is as follows (only CClass’s constructor really matters here):

; main function
; Function Attrs: norecurse uwtable
define i32 @main() #4 {
%1 = alloca %class.CClass, align 1
call void @_ZN6CClassC2Ev(%class.CClass* %1)
ret i32 0
}
; CClass's constructor
; Function Attrs: uwtable
define linkonce_odr void @_ZN6CClassC2Ev(%class.CClass*) unnamed_addr #0 comdat align 2 {
%2 = alloca %class.CClass*, align 8
store %class.CClass* %0, %class.CClass** %2, align 8
%3 = load %class.CClass*, %class.CClass** %2, align 8
%4 = getelementptr inbounds %class.CClass, %class.CClass* %3, i32 0, i32 0
call void @_ZN6BClassC2Ev(%struct.BClass* %4)
%5 = getelementptr inbounds %class.CClass, %class.CClass* %3, i32 0, i32 1
call void @_ZN6AClassC2Ev(%struct.AClass* %5)
ret void
}
; BClass's constructor
; Function Attrs: uwtable
define linkonce_odr void @_ZN6BClassC2Ev(%struct.BClass*) unnamed_addr #0 comdat align 2 {
%2 = alloca %struct.BClass*, align 8
store %struct.BClass* %0, %struct.BClass** %2, align 8
%3 = load %struct.BClass*, %struct.BClass** %2, align 8
%4 = call dereferenceable(272) %"class.std::basic_ostream"* @_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc(%"class.std::basic_ostream"* dereferenceable(272) @_ZSt4cout, i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str, i32 0, i32 0))
ret void
}
; AClass's constructor
; Function Attrs: uwtable
define linkonce_odr void @_ZN6AClassC2Ev(%struct.AClass*) unnamed_addr #0 comdat align 2 {
%2 = alloca %struct.AClass*, align 8
store %struct.AClass* %0, %struct.AClass** %2, align 8
%3 = load %struct.AClass*, %struct.AClass** %2, align 8
%4 = call dereferenceable(272) %"class.std::basic_ostream"* @_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc(%"class.std::basic_ostream"* dereferenceable(272) @_ZSt4cout, i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.1, i32 0, i32 0))
ret void
}

We can see that although the mem-initializer list in CClass’s constructor names aobj before bobj (suggesting that aobj is constructed first):

CClass() : aobj(), bobj() {}

the compiler actually constructs the members in this order (BClass’s constructor is called first, then AClass’s):

%4 = getelementptr inbounds %class.CClass, %class.CClass* %3, i32 0, i32 0
call void @_ZN6BClassC2Ev(%struct.BClass* %4)
%5 = getelementptr inbounds %class.CClass, %class.CClass* %3, i32 0, i32 1
call void @_ZN6AClassC2Ev(%struct.AClass* %5)

That is, members are not initialized in the order of the mem-initializer list, but in the order they are declared in the class definition, so the program prints BClassAClass.

[ISO/IEC 14882:2014] non-static data members are initialized in the order they were declared in the class definition (again regardless of the order of the mem-initializers).
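
The same result can be checked at run time without reading the IR. The following is a minimal sketch that mirrors the quiz code but appends each constructor name to a string instead of printing it; init_log and construction_order are hypothetical helpers added for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: collects constructor names instead of printing them.
std::string& init_log() {
    static std::string log;
    return log;
}

struct AClass {
    AClass() { init_log() += "AClass"; }
};
struct BClass {
    BClass() { init_log() += "BClass"; }
};

class CClass {
public:
    // The mem-initializer list names aobj first, as in the quiz.
    CClass() : aobj(), bobj() {}

private:
    BClass bobj;  // declared first, so constructed first
    AClass aobj;  // declared second, so constructed second
};

// Constructs one CClass and returns the recorded construction order.
std::string construction_order() {
    init_log().clear();
    CClass object;
    (void)object;  // the object is only needed for its constructor calls
    return init_log();
}

// Small self-check that a main() could call.
void check_construction_order() {
    assert(construction_order() == "BClassAClass");
}
```

GCC and Clang can report this mismatch between the mem-initializer list and the declaration order through the -Wreorder warning, so the sketch may produce that warning when compiled with warnings enabled.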

I will stop here for now. Reading the IR alongside the language standard in this way is a powerful tool for tracking down low-level logic errors in code.

If you want to compile from IR code to assembly code, you can use llc:

$ llc test.ll -o test.s

Title: Analyzing C++ Code Semantics Through IR Code
Author: LIPENGZHA
Publish Date: 2017/03/08 11:46
Word Count: 3.6k Words
Link:https://en.imzlp.com/posts/20479/
License: CC BY-NC-SA 4.0
Reprinting of the full article is prohibited.