A Study of the main Prototype and Program Termination Behavior

Many versions of the main function prototype circulate in C and C++, and different books write it differently. In this post I look at what the standards themselves (C89/99/11 and C++98/03/11/14) specify as "standard behavior", and at what happens after main returns.

The most common are the following:

void main()
main()
int main()
int main(void)
int main(int argc,char *argv[])

void main()

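First, from the standards' perspective, void main() is incorrect: none of them (C89/99/11 and C++98/03/11/14) lists it as a valid form of main. C++ explicitly requires main to return int, and C99/11 permit other forms only in an implementation-defined manner, so void main() is never portable.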

That said, APUE mentions one reason people write void main(), although I am not sure whether this passage is what popularized the practice:

The problem is that these compilers don’t know that an exit from main is the same as a return. One way around these warnings, which become annoying after a while, is to use return instead of exit from main. But doing this prevents us from using the UNIX System’s grep utility to locate all calls to exit from a program. Another solution is to declare main as returning void, instead of int, and continue calling exit. This gets rid of the compiler warning but doesn’t look right (especially in a programming text), and can generate other compiler warnings since the return type of main is supposed to be a signed integer.

Another possibility is that it originated in embedded systems, where there is no operating system to return to, so a return value is meaningless. (This suggestion comes from Zhihu user @James Swineson.)

main()

In K&R C and C89, if a function does not explicitly declare a return type, it defaults to int:

C89 describes the syntax of function definitions as follows (note the opt subscript on declaration-specifiers):

$$\text{declaration-specifiers}_{\text{opt}}\hspace{2mm}\text{declarator}\hspace{2mm}\text{declaration-list}_{\text{opt}}\hspace{2mm}\text{compound-statement}$$

In the C89 grammar, declaration-specifiers may consist of:

  • storage-class-specifier
  • type-specifier
  • type-qualifier

This indicates that in C89, the function’s return type can be omitted.

The description in K&R C is as follows:

Various parts may be absent; a minimal function is

dummy() {}

which does nothing and returns nothing. A do-nothing function like this is sometimes useful as a placeholder during program development. If the return type is omitted, int is assumed.

This means:

func(){}
// is equivalent to
int func(){}

However, C99 removed this rule (note that declaration-specifiers no longer has the opt subscript):

$$\text{declaration-specifiers}\hspace{2mm}\text{declarator}\hspace{2mm}\text{declaration-list}_{\text{opt}}\hspace{2mm}\text{compound-statement}$$

In summary, in C89 a function's return type may be omitted, in which case it defaults to int. A main function written as main() is therefore implicitly int main().

int main()

int main() and int main(void) have different meanings in C:

int main()
// is not equivalent to
int main(void)

In C, an empty parameter list (i.e., neither providing a parameter list nor using void) means no information about the number or types of the parameters is supplied:

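#include <stdio.h>

int func(){
    printf("func()\n");
    return 0;
}

int main(void){
    /* func() has no prototype, so compilers before C23 do not check the
       arguments; the call still reaches func(), although passing extra
       arguments is formally undefined behavior. */
    func(1,2,3,4);
    return 0;
}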

The empty list in a function declarator that is not part of the function’s definition specifies that no information about the number or types of the parameters is supplied.

C99/11 Standard

The C99/11 standard explicitly defines two prototypes for the standard main function:

The function called at program startup is named main. The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters:

int main(void) { /* ... */ }

or with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared):

int main(int argc, char *argv[]) { /* ... */ }

or equivalent; or in some other implementation-defined manner.
If they are declared, the parameters to the main function shall obey the following constraints:

  • The value of argc shall be nonnegative.
  • argv[argc] shall be a null pointer.
  • If the value of argc is greater than zero, the array members argv[0] through argv[argc-1] inclusive shall contain pointers to strings, which are given implementation-defined values by the host environment prior to program startup. The intent is to supply to the program information determined prior to program startup from elsewhere in the hosted environment. If the host environment is not capable of supplying strings with letters in both uppercase and lowercase, the implementation shall ensure that the strings are received in lowercase.
  • If the value of argc is greater than zero, the string pointed to by argv[0] represents the program name; argv[0][0] shall be the null character if the program name is not available from the host environment. If the value of argc is greater than one, the strings pointed to by argv[1] through argv[argc-1] represent the program parameters.
  • The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program and retain their last-stored values between program startup and program termination.
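The constraints above can be exercised with a small sketch in C. The helper name print_program_args is hypothetical and not part of any standard; it only assumes the guarantees quoted above (argc is nonnegative and argv[argc] is a null pointer), and it walks to the null sentinel rather than counting with argc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: prints the program name and each program
 * parameter, and returns the number of parameters found
 * (the program name in argv[0] is not counted). */
int print_program_args(int argc, char *argv[])
{
    int count = 0;

    /* Both conditions are guaranteed for main's parameters
     * in a hosted environment. */
    assert(argc >= 0);
    assert(argv[argc] == NULL);

    if (argc > 0)
        printf("program name: \"%s\"\n", argv[0]); /* may be empty */

    /* Start after argv[0] when it exists, stop at the null pointer. */
    for (char **p = argv + (argc > 0 ? 1 : 0); *p != NULL; ++p)
        printf("parameter %d: %s\n", ++count, *p);

    return count;
}
```

In a complete program this would be called from int main(int argc, char *argv[]) as print_program_args(argc, argv). Note that the argc == 0 case is handled: argv[0] is then the null pointer and the loop body never runs.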

C++ Standard

Because C and C++ treat an empty parameter list differently (in C++ it means the function takes no parameters), the C++ standard describes main differently from ISO C. Every implementation must accept both of the following types for main:

  • a function of () returning int and
  • a function of (int, pointer to pointer to char) returning int

main return value

main must return a value because, in both C and C++, returning from main passes the returned value as the argument to exit/std::exit, which then terminates the program.

If status is zero or EXIT_SUCCESS, an implementation-defined form of the status successful termination is returned.

ISO C99/11:
If the return type of the main function is a type compatible with int, a return from the initial call to the main function is equivalent to calling the exit function with the value returned by the main function as its argument; reaching the } that terminates the main function returns a value of 0. If the return type is not compatible with int, the termination status returned to the host environment is unspecified.

ISO C++11/14:
A return statement in main has the effect of leaving the main function (destroying any objects with automatic storage duration) and calling std::exit with the return value as the argument. If control reaches the end of main without encountering a return statement, the effect is that of executing

return 0;
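As a rough illustration of how the return value becomes the exit status, here is a minimal sketch in C. The helper run_program and its usage message are hypothetical; the only standard pieces are EXIT_SUCCESS and EXIT_FAILURE from <stdlib.h>.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: does the program's work and reports the outcome
 * as a status that main can return unchanged. This sketch only checks
 * that at least one program parameter was supplied. */
int run_program(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <name>\n",
                argc > 0 ? argv[0] : "program");
        return EXIT_FAILURE;
    }
    printf("hello, %s\n", argv[1]);
    return EXIT_SUCCESS;
}

/* A complete program would forward the status from main:
 *
 *     int main(int argc, char *argv[])
 *     {
 *         return run_program(argc, argv);
 *     }
 *
 * which, per the standard text quoted above, has the same effect as
 * calling exit(run_program(argc, argv)).
 */
```

Returning EXIT_SUCCESS or EXIT_FAILURE instead of arbitrary integers keeps the status portable, since any other value is reported to the host in an implementation-defined way.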

exit

#include <stdlib.h>
void exit(int status);

The exit function causes normal program termination to occur. If more than one call to the exit function is executed by a program, the behavior is undefined.

  • First, all functions registered by the atexit function are called, in the reverse order of their registration, except that a function is called after any previously registered functions that had already been called at the time it was registered. If, during the call to any such function, a call to the longjmp function is made that would terminate the call to the registered function, the behavior is undefined.
  • Next, all open streams with unwritten buffered data are flushed, all open streams are closed, and all files created by the tmpfile function are removed.
  • Finally, control is returned to the host environment. If the value of status is zero or EXIT_SUCCESS, an implementation-defined form of the status successful termination is returned. If the value of status is EXIT_FAILURE, an implementation-defined form of the status unsuccessful termination is returned. Otherwise the status returned is implementation-defined.

The exit function cannot return to its caller.
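The ordering rule for atexit handlers is easy to observe with a short sketch in C. The function and handler names below are made up for illustration; only atexit and puts are standard library calls.

```c
#include <stdio.h>
#include <stdlib.h>

/* Handlers run in the reverse order of their registration. */
static void registered_first(void)
{
    puts("registered_first: runs last");
}

static void registered_second(void)
{
    puts("registered_second: runs first");
}

/* Hypothetical helper: registers both handlers.
 * Returns 0 on success, 1 if either registration fails
 * (atexit itself returns nonzero on failure). */
int register_exit_handlers(void)
{
    if (atexit(registered_first) != 0)
        return 1;
    if (atexit(registered_second) != 0)
        return 1;
    return 0;
}
```

If main calls register_exit_handlers() and then returns (or calls exit), registered_second prints before registered_first. The text written by puts still reaches the terminal because, as described above, exit flushes open streams only after the handlers have run.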

_Exit

#include <stdlib.h>
void _Exit(int status);

The _Exit function causes normal program termination to occur and control to be returned to the host environment. No functions registered by the atexit function or signal handlers registered by the signal function are called. The status returned to the host environment is determined in the same way as for the exit function (7.20.4.3). Whether open streams with unwritten buffered data are flushed, open streams are closed, or temporary files are removed is implementation-defined.
The _Exit function cannot return to its caller.


Title: A Study of the main Prototype and Program Termination Behavior
Author: LIPENGZHA
Publish Date: 2017/02/27 15:30
Word Count: 6.8k Words
Link:https://en.imzlp.com/posts/15272/
License: CC BY-NC-SA 4.0
Reprinting of the full article is prohibited.