Why shouldn't we overload && and || and , (comma)?

C++ provides the two logical operators || and && as well as the , (comma) operator as part of its basic syntax. The language allows us to overload all three for a class, but we should avoid doing so. In this article, I quote the relevant wording from the standard and explain why these operators should not be overloaded.

In summary: the built-in || and && have short-circuit evaluation semantics, and overloading them turns them into ordinary function calls whose semantics are entirely different from the built-in versions. Likewise, the built-in , operator guarantees left-to-right evaluation; an overloaded one becomes a function call and loses that guarantee under the C++14 rules quoted below.

First, let’s briefly review the logical operators && and || in C++. Since this is basic syntax, I’ll skip the usage explanation and quote the standard directly (ISO/IEC 14882:2014):

logical AND operator: The && operator groups left-to-right. The operands are both contextually converted to bool (Clause 4). The result is true if both operands are true and false otherwise. Unlike &, && guarantees left-to-right evaluation: the second operand is not evaluated if the first operand is false.
The result is a bool. If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second expression.

logical OR operator: The || operator groups left-to-right. The operands are both contextually converted to bool (Clause 4). It returns true if either of its operands is true, and false otherwise. Unlike |, || guarantees left-to-right evaluation; moreover, the second operand is not evaluated if the first operand evaluates to true.
The result is a bool. If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second expression.

The wording that matters most for “why you should not overload operator&& and operator||” is the following:

  • AND: the second operand is not evaluated if the first operand is false. If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second expression.
  • OR: the second operand is not evaluated if the first operand evaluates to true. If the second expression is evaluated, every value computation and side effect associated with the first expression is sequenced before every value computation and side effect associated with the second expression.

This means that && and || have the property of short-circuit evaluation: "The operators && and || will not evaluate their second argument unless doing so is necessary." (The C++ Programming Language, 4th edition)

However, if you overload && and ||, they simply become functions, and every argument is evaluated before the function body runs:

[ISO/IEC 14882:2014] All side effects of argument evaluations are sequenced before the function is entered.

Moreover, the C++ standard clearly states that the order of evaluation of function arguments is unspecified.

[ISO/IEC 14882:2014 §8.3.6.9] The order of evaluation of function arguments is unspecified.

Thus, the side effects of both operands take place before the function body is entered. This breaks the short-circuit evaluation property of the logical operators and conflicts with the semantics of the built-in operators, so we should not overload && and ||.

Let’s look at the following example:

#include <cstdio>

int main()
{
    int a = 0;
    int b = 0;
    ++a || ++b;              // built-in ||: ++a yields 1 (true), so ++b is skipped
    std::printf("%d\n", b);
}
// output: 0

Since ++a yields 1, which converts to true, ++b is never executed. This is short-circuit evaluation at work. Now, consider this code:

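#include <cstdio>

struct A {
    int ival = 0;
    A& operator++() {
        ival++;
        return *this;
    }
    // Overloaded ||: an ordinary member function, so x is evaluated before the call.
    template <typename U>
    bool operator||(U& x) {
        return ival != 0 || x;
    }
};

int main()
{
    A aobj;
    int b = 0;
    ++aobj || ++b;           // calls A::operator||<int>(int&)
    std::printf("%d\n", b);
}
// output: 1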

Here, || is overloaded within class A, so ++aobj||++b; becomes a function call. Now, let’s look at its LLVM-IR code:

define i32 @main() #4 {
%1 = alloca %struct.A, align 4
%2 = alloca i32, align 4
call void @_ZN1AC2Ev(%struct.A* %1) #3
store i32 0, i32* %2, align 4
%3 = call dereferenceable(4) %struct.A* @_ZN1AppEv(%struct.A* %1)
%4 = load i32, i32* %2, align 4
%5 = add nsw i32 %4, 1
store i32 %5, i32* %2, align 4
%6 = call zeroext i1 @_ZN1AooIiEEbRT_(%struct.A* %3, i32* dereferenceable(4) %2)
%7 = load i32, i32* %2, align 4
%8 = call i32 (i8*, ...) @_ZL6printfPKcz(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i32 0, i32 0), i32 %7)
ret i32 0
}

Focus on these lines, which execute the operation from ++aobj||++b;:

%3 = call dereferenceable(4) %struct.A* @_ZN1AppEv(%struct.A* %1)
%4 = load i32, i32* %2, align 4
%5 = add nsw i32 %4, 1
store i32 %5, i32* %2, align 4
%6 = call zeroext i1 @_ZN1AooIiEEbRT_(%struct.A* %3, i32* dereferenceable(4) %2)

You may have doubts about names like _ZN1AppEv and _ZN1AooIiEEbRT_. They appear to be function calls, yet it’s unclear which function they correspond to. I detailed this in my other two articles: Analysis of C/C++ Compilation and Linking Models and Why extern “C” is Needed.

Let’s directly look at what the symbols _ZN1AppEv and _ZN1AooIiEEbRT_ represent:

$ c++filt _ZN1AppEv
A::operator++()
$ c++filt _ZN1AooIiEEbRT_
bool A::operator||<int>(int&)

From the order in which these symbols appear in the IR, we can see that this compiler executes ++aobj||++b; as follows:

  1. First, aobj is incremented.
  2. Then b is incremented.
  3. Finally, A::operator|| is called.

This shows that ++b is always executed, regardless of the result of ++aobj: short-circuit evaluation is lost. The same logic applies to &&.

Now, let’s examine why operator, should not be overloaded. In C and C++, the built-in , operator evaluates its operands from left to right:

[ISO/IEC 14882:2014] A pair of expressions separated by a comma is evaluated left-to-right; the left expression is a discarded-value expression.

Thus, if operator, is overloaded, there are two issues:

  1. Both operands are evaluated as function arguments before the function body is entered.
  2. The order in which the operands of the overloaded operator, are evaluated is unspecified.

This conflicts with the semantics of the built-in , operator, so we should not overload operator,.

Consider the following code:

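struct A {
    int ival = 0;
    // Postfix increment (returns a reference here, kept as in the original example).
    A& operator++(int) {
        ival++;
        return *this;
    }
    // Overloaded comma: an ordinary member function that returns its right operand.
    template <typename T>
    T operator,(T x) {
        return x;
    }
};

int main()
{
    A aobj;
    int b = 0;
    aobj++, b = aobj.ival;   // calls A::operator,<int>(int)
}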

Here, aobj++,b=aobj.ival; invokes the class A’s operator,, which is a function call. As a function call, it adheres to C++’s parameter evaluation principles: The order of evaluation of function arguments is unspecified.

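Unspecified means the implementation may evaluate the operands in either order. So when the right operand of an overloaded operator, depends on a side effect of the left operand, the result is not guaranteed: here b could end up as either 0 or 1. I tested two widely used compilers (GCC 6.2 and Clang 3.9) and found that both evaluate the operands in the order they are written, but the C++14 standard quoted here gives no such guarantee. (C++17 later changed this rule: an overloaded operator written in operator notation evaluates its operands in the order prescribed for the built-in operator. Overloaded && and || still do not short-circuit.)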

Let’s check the LLVM-IR code for the main function:

define i32 @main() #4 {
%1 = alloca i32, align 4
%2 = alloca %struct.A, align 4
%3 = alloca i32, align 4
store i32 0, i32* %1, align 4
call void @_ZN1AC2Ev(%struct.A* %2) #3
store i32 0, i32* %3, align 4
%4 = call dereferenceable(4) %struct.A* @_ZN1AppEi(%struct.A* %2, i32 0)
%5 = getelementptr inbounds %struct.A, %struct.A* %2, i32 0, i32 0
%6 = load i32, i32* %5, align 4
store i32 %6, i32* %3, align 4
%7 = call i32 @_ZN1AcmIiEET_S1_(%struct.A* %4, i32 %6)
ret i32 0
}

For aobj++,b=aobj.ival; this compiler first executes aobj's operator++(int) and then executes b=aobj.ival, before finally calling A::operator,.

In summary, mainstream C++ compilers may happen to produce behavior similar to the built-in operator, but an overloaded operator, is a function call, and the C++14 standard quoted here does not give it the left-to-right guarantee of the built-in version.


Title:Why shouldn't we overload && and || and , (comma)?
Author:LIPENGZHA
Publish Date:2017/06/24 22:21
Word Count:5.2k Words
Link:https://en.imzlp.com/posts/11306/
License: CC BY-NC-SA 4.0
Reprinting of the full article is prohibited.