C++ Template Metaprogramming Data Collection

This article mainly summarizes my insights during the learning process of template metaprogramming, and I will also include some template metaprogramming code I have written here.

Template Code

Template Metaprogramming for Base Conversion

Implemented a template metaprogram version to convert from decimal to base 2~10:

template<int N,int Z>
struct Convert{
  enum:size_t{
    resault=(N<Z)?N:(10*Convert<N/Z,Z>::resault+N%Z)
  };
};

template<int Z>
struct Convert<0,Z>{enum:size_t{resault=0};};

You can use it like this:

int main()
{
  printf("%d\n",Convert<12345,2>::resault);
  printf("%d\n",Convert<12345,8>::resault);
}

The range of parameters received cannot exceed size_t.

Looking at the IR code of the above main function, you can see that there is no runtime overhead.

define i32 @main() #4 {
  %1 = call i32 (i8*, ...) @_ZL6printfPKcz(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i32 0, i32 0), i64 11000000111001)
  %2 = call i32 (i8*, ...) @_ZL6printfPKcz(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i32 0, i32 0), i64 30071)
  ret i32 0
}

Additionally, here is a template metaprogram version for integer power:

template<int N,int Z>
struct mtpow{
  enum:size_t{
    resault=N*mtpow<N,Z-1>::resault
  };
};
template<int N>
struct mtpow<N,1>{
  enum:size_t{
    resault=N
  };
};

int main()
{
  printf("%d\n",mtpow<3,2>::resault);
}

Template Metaprogramming for Fibonacci Sequence

There is no need to elaborate on what the Fibonacci sequence is. Today, I couldn’t sleep during lunch break, so I whipped up a template metaprogram version that calculates the Nth term of the Fibonacci sequence and the sum of the first N terms with zero runtime overhead.
Without further ado, here is the code:

template<int N>
struct FibonacciN
{
  enum:size_t{
    resault=FibonacciN<N-1>::resault+FibonacciN<N-2>::resault
  };
};

template<>
struct FibonacciN<1>{
  enum:size_t{resault=1};
};
template<>
struct FibonacciN<2>{
  enum:size_t{resault=1};
};

template<int N>
struct FibonacciNSum{
  enum:size_t{
    resault=FibonacciN<N>::resault+FibonacciNSum<N-1>::resault
  };
};

template<>
struct FibonacciNSum<1>{
  enum:size_t{
    resault=FibonacciN<1>::resault
  };
};

int main()
{
  printf("%d\n",FibonacciN<5>::resault);
  printf("%d\n",FibonacciNSum<5>::resault);
}

Let’s take a look at the LLVM-IR code (for brevity, only the main function part):

define i32 @main(i32, i8**) #4 {
  %3 = alloca i32, align 4
  %4 = alloca i32, align 4
  %5 = alloca i8**, align 8
  store i32 0, i32* %3, align 4
  store i32 %0, i32* %4, align 4
  store i8** %1, i8*** %5, align 8
  %6 = call i32 (i8*, ...) @_ZL6printfPKcz(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i32 0, i32 0), i32 5)
  %7 = call i32 (i8*, ...) @_ZL6printfPKcz(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i32 0, i32 0), i32 12)
  ret i32 0
}

You can see that the direct outputs in printf are just constant 5 and 12, which means there is completely no runtime overhead.

Compile-Time Type Checking

How can we determine at compile time if a type is a class type?
Based on the SFINAE concept in C++, we can execute different logic based on whether the match is successful or fails:

struct True{
  char x;
};
struct False{
  char x[2];
};
template<typename T>
struct IsClass{
  template<typename C> static True isClass(int C::*){return True{};}
  template<typename C> static False isClass(...){return False{};}
  enum{ISCLASS=sizeof(IsClass<T>::isClass<T>(0))==sizeof(True)};
};
int main()
{
  if(IsClass<double>::ISCLASS){
    cout<<"yes"<<endl;
  }else{
    cout<<"no"<<endl;
  }
}

// output: no

The main point here is True isClass(int C::*). Remember, as I mentioned in a previous article Pointers to Class Members Are Not Pointers in C++, 0 can be converted to a pointer to a member. The idea is that if type T has a pointer to a class member, then it is a class; otherwise, it is not.
Based on the same idea, we can also implement a check to see if there exists a conversion from type A to another B:

template<typename A,typename B>
struct haveConvert{
  static True Convert(B){return True{};}
  static False Convert(...){return False{};}
  enum{CHECK=sizeof(haveConvert<A, B>::Convert(A{}))==sizeof(True)};
};
int main()
{
  if(haveConvert<string,double>::CHECK){
    cout<<"yes"<<endl;
  }else{
    cout<<"no"<<endl;
  }
}
// output: no

There’s also completely no runtime overhead. The code here is already determined during compilation, so at runtime it’s just:

if(0){
  cout<<"yes"<<endl;
}else{
  cout<<"no"<<endl;
}

After optimization by the compiler, dead code is eliminated:

1	cout<<"no"<<endl;

From the LLVM-IR code, you can see that during runtime only the output of cout is executed:

@.str = private unnamed_addr constant [3 x i8] c"no\00", align 1

; Function Attrs: norecurse uwtable
define i32 @main() #4 {
  %1 = call dereferenceable(272) %"class.std::basic_ostream"* @_ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc(%"class.std::basic_ostream"* dereferenceable(272) @_ZSt4cout, i8* getelementptr inbounds ([3 x i8], [3 x i8]* @.str, i32 0, i32 0))
  %2 = call dereferenceable(272) %"class.std::basic_ostream"* @_ZNSolsEPFRSoS_E(%"class.std::basic_ostream"* %1, %"class.std::basic_ostream"* (%"class.std::basic_ostream"*)* @_ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_)
  ret i32 0
}

This encapsulates the essence of template programming: having zero runtime overhead is such a big deal!!

Using Self-Type in Class Members

How to define a class member that can access the current class type?
You can use the following syntax:

template<typename T>
struct base{
  using selfType=T;
};

template<typename T>
struct foo:public base<foo<T>>{};

Although it does seem a bit forced…

Passing Templates as Template Parameters

This refers to template template parameters. If we have a demand where the second parameter of the template is a template whose parameters are of the type of the first parameter:

1 2	// Expect to use string as a template argument for vector inside TempClass TempClass<string,vector> test;

However, you cannot use the above syntax, nor can you use TempClass<string,vector<>> like this.
Because it needs to pass a template, not a type of vector<T>. You can use the following syntax:

using namespace std;
template<class X,template<class Y=X> class Z>
struct A
{
  Z<> x; // equal Z<X>, because default template parameter of template Z is X
  void getZname(){
    cout<<typeid(this->x).name()<<endl;
  }
};
template<class T>
using Vector=vector<T>;

int main(int argc,char* argv[])
{
  A<string,Vector> x;
  x.getZname();
  vector<string> y;
  cout<<typeid(y).name()<<endl;
  return 0;
}

Template Array Reference Parameters Will Not Implicitly Convert to Pointers

First, look at this template, which is used to get the size of array elements:

template<class T, size_t N>
size_t arr_size(T (&)[N])
{
  return N;
}

Usage: pass an array type, and get its element count from the return value:

int main()
{
  int i_array[10]={};
  cout<<arr_size(i_array)<<endl;
}

// output
10

Now, let’s analyze why the arr_size template function is written this way.
First, this template class takes two parameters, one is type T and the other is the size N, the compiler matches these two template parameters based on the passed arguments. The meaning of this template is to get the size of the elements (N) based on the type of the array (e.g., int [N]).
The most important part is the function parameter declaration:

1	size_t arr_size(T (&)[N])

There are two questions:

Why do we need &? Can’t we use pass by value?
Why do we need parentheses? Would T &[] not work?

First, let’s address the first question: Why do we need to pass by reference? Can’t pass by value work?
Pass by value does not work. We can write a function that passes limited array argument elements to analyze what the actual argument looks like when passed by value:

void test_array(int (arr)[10]){}

int main()
{
  int i_array[10]={0};
  test_array(i_array);
}

As usual, let’s look directly at the LLVM-IR code generated by the above:

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @_Z10test_arrayPi(i32*) #4 {
  %2 = alloca i32*, align 8
  store i32* %0, i32** %2, align 8
  ret void
}

; Function Attrs: noinline norecurse nounwind optnone uwtable
define dso_local i32 @main() #5 {
  %1 = alloca i32, align 4
  %2 = alloca [10 x i32], align 16
  store i32 0, i32* %1, align 4
  %3 = bitcast [10 x i32]* %2 to i8*
  call void @llvm.memset.p0i8.i64(i8* align 16 %3, i8 0, i64 40, i1 false)
  %4 = getelementptr inbounds [10 x i32], [10 x i32]* %2, i32 0, i32 0
  call void @_Z10test_arrayPi(i32* %4)
  ret i32 0
}

Directly observing the function signature of test_array:

1	void @_Z10test_arrayPi(i32*)

There is an implicit conversion from array to pointer! At this point, the type information of the array is lost.
If we use pass by value, we cannot retrieve the number of elements in the array during type deduction because the count is lost as the values are passed in.
So does passing by reference solve the problem?
Yes, passing by reference won’t have implicit conversion because reference types need to bind to the original type, and pointer types and array types are different:

void test_array(int (&arr)[10]){}

int main()
{
  int i_array[10]={0};
  test_array(i_array);
}

Its LLVM-IR code is:

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @_Z10test_arrayRA10_i([10 x i32]* dereferenceable(40)) #0 {
  %2 = alloca [10 x i32]*, align 8
  store [10 x i32]* %0, [10 x i32]** %2, align 8
  ret void
}

; Function Attrs: noinline norecurse nounwind optnone uwtable
define dso_local i32 @main() #1 {
  %1 = alloca [10 x i32], align 16
  %2 = bitcast [10 x i32]* %1 to i8*
  call void @llvm.memset.p0i8.i64(i8* align 16 %2, i8 0, i64 40, i1 false)
  call void @_Z10test_arrayRA10_i([10 x i32]* dereferenceable(40) %1)
  ret i32 0
}

You can see that the type of the test_array parameter is also an array type now, retaining the type information, so this usage can be utilized in templates.

Now, for the second question: Why do we need parentheses? Would T &arr_ref[N] not work?
No, let’s first look at what the type semantics of T &arr_ref[N] is:

1	int &i_arr_ref[10]; // error: 'i_arr_ref' declared as array of references of type 'int &'

First, the [] has higher precedence than &, so the binding of T &arr_ref[N] is interpreted as T &(arr_ref[10]), which is an array of references type, but there is no array of references type in the C++ standard:

[ISO/IEC 14882:2014 §8.3.2.5] There shall be no references to references, no arrays of references, and no pointers to references.

That means there can be no references to references, no arrays of references, and no pointers to references, and hence the type T &[N] is illegal.
What if we add parentheses?

1	int (&i_arr_ref)[10]; // error: declaration of reference variable 'i_arr_ref' requires an initializer

Initialization would look like:

1 2	int i_array[10]={0}; int (&i_arr_ref)[10]=i_array;

This declaration type is a reference to an array of type int [10], which is legal. The corresponding LLVM-IR code is:

define dso_local i32 @main() #1 {
  %1 = alloca [10 x i32], align 16
  %2 = alloca [10 x i32]*, align 8
  %3 = bitcast [10 x i32]* %1 to i8*
  call void @llvm.memset.p0i8.i64(i8* align 16 %3, i8 0, i64 40, i1 false)

  %4 = store [10 x i32]* %1, [10 x i32]** %2, align 8
  ret i32 0
}

In fact, there are similar declarations for pointers to array objects (which differ from pointer types to array names):

1	int (*pointer_to_array)[10];

I have mentioned earlier that references are implemented by pointers in the compiler. The following code generates exactly the same LLVM-IR code:

1
2
3

int i_array[10]={0};
int (*pointer_to_array)[10]=&i_array;
int (&i_array_ref)[10]=i_array;

Corresponding LLVM-IR code is:

define dso_local i32 @main() #5 {
  // declare i_array
  %1 = alloca [10 x i32], align 16
  // declare pointer_to_array
  %2 = alloca [10 x i32]*, align 8
  // declare i_arr_ref
  %3 = alloca [10 x i32]*, align 8

  // initialize i_array
  %4 = bitcast [10 x i32]* %1 to i8*
  call void @llvm.memset.p0i8.i64(i8* align 16 %4, i8 0, i64 40, i1 false)

  // initialize pointer_to_array
  store [10 x i32]* %1, [10 x i32]** %2, align 8
  // initialize i_arr_ref
  store [10 x i32]* %1, [10 x i32]** %3, align 8
  ret i32 0
}

Template Metaprogramming Resources

The Development History of Templates

Excerpted from an article by Herb Sutter, chairman of the C++ standards committee, titled “Exceptional C++: 40 New Engineering Puzzles, Programming Problems, And Solutions.”

The first design proposal for C++ templates was put forward by Bjarne Stroustrup in October 1988 [Stroustrup88].
In 1990, Margaret Ellis and Bjarne Stroustrup co-authored The Annotated C++ Reference Manual (also known as ARM) [Ellis90]. That same year, the ISO/ANSI C++ standards committee continued to progress and used ARM as the foundational document. ARM was the first reference containing descriptions of templates; the templates described were quite rudimentary, with the whole specification occupying only 10 pages. At that time, the focus was entirely on supporting parameterized types and functions, with the example given in ARM being a List container that could store objects of various types, along with a sort algorithm that could sort sequences of various types. However, even back then, people were already considering whether to grant templates a separate compilation capability. Cfront (Stroustrup’s C++ compiler) supported a “separate” compilation of templates, although the approach at the time was not scalable (as noted above).
From 1990 to 1996, dozens of C++ compiler developers emerged, each adopting different methods to implement templates while the standards committee significantly improved and expanded (complicating) templates. The in-depth book C++ Templates [Vandevoorde03] devotes about 133 out of 552 pages to a complete description of standard C++ templates, and the entire book is dedicated to detailing this language feature and how to use it effectively.
In the early to mid-1990s, the C++ standards committee focused primarily on making templates more robust and practical, thereby providing better support for their basic usage. Few would doubt that they created a highly flexible somewhat quirky miracle, which later became known as Turing complete metaprogramming; programs could be written that were as complex as desired, executing at compile time. However, the template metaprogramming and advanced library design techniques that are now prevalent in C++ were not anticipated by those who initially gave life to C++ templates—the decisions made then have shaped everything we have today. Most of the template techniques that became popular did not even exist between 1990 and 1996! Remember, at the end of 1994, Stepanov submitted his STL library to the standards committee for the first time, and in 1995 STL was adopted as a standard, viewed as a groundbreaking achievement at the time. Yet today, STL is “just” a container and algorithm library. Of course, STL was indeed a revolutionary breakthrough in 1995, and even today, it remains a powerful testament to what distinguishes C++ from other languages; however, from today’s standards, STL is merely a small trial run for C++ templates.