int main() vs void main()
What would be the best way to start a blog that talks about building safer code? It would be to talk about the first thing you think of when you start coding: the “main” function (as seen in C/C++).
We’ve been brought up with many different variations of this popular function. Some books/teachers like to write it as “int main”, some write it as “void main” and then there are some who make do with just “main”, forgoing the return type altogether. So, which one out of them is correct? Or does it even make a difference?
The answer to the second question is “YES”. It makes a world of difference because, depending on your compiler and platform, the wrong form could:
- do nothing
- or give you a compile time warning
- or crash the program
- or cause problems in your invocation environment
Now, we go back to the first question. Which is the correct form and why?
The answer is that “int main” is the correct type for C++.
For C it is a bit trickier, and I’d say “int main” is the recommended way.
The simple reasoning is “because the C and C++ standards say so”. (See this, however, which is what leads to a bit of confusion and makes it implementation-dependent in C.)
But let’s take a brief look at the practical reasons, because you might wonder, “My compiler doesn’t give me a warning for void main, so why should I care?” (If your compiler does that, then it’s time to switch to something else. Did I hear you are using a Microsoft compiler?)
There is something called “startup code” (more on this soon), which is what runs first when you start your program. It sets things up (e.g. initializes the stack, heap, .bss etc.) and then calls “main”. Generally, this startup code expects main to return an integer value and place it onto the stack. If we don’t return anything (as in void main’s case), the startup code will still take something in its place, and you have no control over what that is. For example, if the startup code expects the value to be in a certain range, it will crash if it sees something outside it. Or imagine the caller popping an int off the stack when our main never put one there: that would corrupt the stack for good. (Ever seen a program that runs fine but segfaults while exiting?)
Moreover, if you are running a script that depends on such a program, a garbage value will be returned to it, with unpredictable consequences.
End Note: Some would say that leaving out the type altogether and just writing “main” should be the way to go, because the compiler automatically assumes a return type of int for such functions. However, this is not reliable: that “implicit int” rule was removed from the language in C99 and was never part of standard C++, so whether such code compiles depends entirely on the compiler.
Please post your comments about this article or if there is something that I missed out on. Also let us know if there is something in particular that you’d want us to talk about.
© Safer Code | int main() vs void main()
This is an excellent explanation of something so simple that most people miss it. It’s the first time I’ve seen this explanation, but when I asked a friend who does a lot more programming in these languages than I do (I work mainly in ABAP nowadays), she confirmed it’s true.
Your conclusions about “not returning a value from main could cause a crash” (or other errant behavior) SHOULD be incorrect. (I say “should” because I’m assuming a properly written shell. More about that later). The return value from main() is simply returned to the shell (that launched the executable). The purpose of this return value is not even defined, but traditionally, some shell scripts use this return value to gauge whether the executable succeeded, under the assumption that an executable will return a 0 value to the shell if the executable succeeded, or a non-zero value if there was a failure. Note that this is simply a traditional rule of thumb, and there is no guarantee that an executable will in fact follow this rule of thumb. You’ve discovered that many executables, especially ones written to run from a GUI, don’t. Consequently, a shell script should use the return value from main ONLY when running an executable that does in fact follow this rule of thumb. (How does a shell script know whether a given executable follows this rule of thumb? There’s no way to know programmatically. The guy writing the shell script has to look at the source code of the executable to determine if it follows this rule of thumb). The shell itself does nothing with this return value (or it shouldn’t. If it does, it’s a poorly written shell that makes a bad assumption, and does something really, really stupid and unnecessary. In that case, your gripe is with the guy writing the shell, not the guy writing the executable. Yes, software that makes bad assumptions typically does crash. But the bad assumption here would be with the shell writer). Given the proliferation of GUI software, which typically doesn’t even have a main() function at all (because the “GUI shell” is using an entirely different startup scheme), this traditional rule of thumb should now be considered obsolete.
Therefore, unless you’re using an obsolete shell and/or poorly written shell scripts, there are absolutely no negative ramifications to not returning a value from main. If you ARE using said tools, replace them… immediately.
P.S. The reason that Microsoft’s C compiler doesn’t complain about a void return is that the C compiler is written for modern software running on a modern OS, and therefore properly acknowledges the above rule of thumb as obsolete. Furthermore, the Windows shell doesn’t even care about main’s return value, and automatically substitutes a 0 value to any shell script because the shell knows that this rule of thumb should be considered obsolete. Anyway, most software written for Windows is GUI based (with a WinMain function instead of main). The MS compiler, and shell, do the right thing. If you’re using a non-MS shell that doesn’t, replace it. (Or at the least, never ever run a GUI environment such as Gnome, KDE, Enlightenment, etc, or any software written for those environments on that thing. Just use really old console-based apps that follow that rule of thumb. Good luck).
JG makes a very good point. It is the reason that mission-critical production applications in an Enterprise environment should (IMHO) *never* be run on a Windows machine. Windows lacks the sort of beefy command line that makes running shell scripts possible for system automation (and you can’t easily write fully automated programs using a GUI). While this may change should Microsoft ever decide to release something better than that toy called DOS, the fact of the matter is that business automation generally involves some flavor of Unix (with some using Solaris, BSD, or Linux thrown in), where the return value of a program is vital to flow control.
Just because Microsoft compilers ignore the return value of a C/C++ program doesn’t give the programmer a license to write bad, non-standard code. And just because Microsoft tells you that the return code is obsolete (do they really?) *doesn’t* make it so, and *doesn’t* make their way the new standard.
Remember the old joke: “How many Microsoft officers does it take to change a light bulb? A: None, they simply declare Darkness (TM) to be the new standard (Patent Pending).”
Moreover, the Linux GUI is usually used with a client-server pair, so even when running under a GUI, it’s a great idea to keep your programs behaving in a civilized way.
A good point? JG doesn’t know what he’s talking about. The article isn’t about shells crashing because of bad return values, and doesn’t mention that anywhere. It only mentions startup code that can crash if main isn’t declared properly. This startup code is added by your compiler and does all of the preparation before the main function begins, and the clean-up afterward. That code could crash if main is declared improperly (and if your compiler allows it). His point about a “properly written” shell and “bad assumptions” misses the entire point of the article.
Furthermore ranting about the use of the return code betrays his inexperience. The exit code of an application is specific to the application in the same way that the return code of a C function is specific to the function. There’s no reason to go off complaining about it.
It’s better to use “int main” even if “void main” works fine with your compiler, because one day you might be writing code where the return values matter. Just initialise a status to 0 and return that if you don’t care about the exit status.
A compiler generates code for calling functions and passing return values; they may very well be in a register and not on the stack. It’d be a pretty shoddy compiler that permits void main and then risks a stack underflow by “expecting a return value on a stack”. In the past, if exit() was not called, the return value of the last function would often get used, and unusual values could get passed back to the system environment when the compiler-generated exit routine then called exit() for you.
Why would the compiler’s startup code “expect any value in a range” when it’s something that the programmer has to set? It’s going to use it as an index into something, you think???? Errrr no.
If you don’t declare main as an int, you should be calling exit anyway, if you care about portability.
The OS, when it cleans up the process/task memory, is what would pass a return value back to the calling environment. If the compiler doesn’t generate correct termination code, then I guess the OS will terminate the program when it makes a memory access violation. The segmentation violation on exit is rational compiler behaviour if void is an incorrect type for main.
IMNSHO, an article on “safer programming” style should have advised initialising the status to failure, to guard against “falling through” errors, and forcing an explicit setting of a successful outcome.
There’s not much point in speculating on (buggy) compiler implementations, though I guess having a dig at M$ will increase the page views….
Huh? Every PC C/C++ compiler I’ve ever seen returns a C/C++ function’s value in a CPU register. For example, this is how Microsoft C compilers, Borland C compilers, Intel’s C compilers, gcc, etc work. (For intel CPUs, the AX or EAX register is historically used for returning a function’s value). This is pretty much a standard carried over from the days of assembly language programming when it was a rule of thumb to return a function’s value in a register, rather than messing with the stack, as you’d have to rearrange where the caller’s return address was pushed before you actually issued your return instruction. Remember that the return instruction pulls the caller’s return address off the top of the stack and stuffs it into the “instruction pointer” register. But if you’ve pushed your own return value onto the stack before doing that return, well… think about it. (And yes, I’ve done lots of asm programming, as well as compiler design). So to make the C compiler’s design easier, this standard was kept. If a function actually returned a value on the stack, that means both the caller and called function would have to do some extra stack cleanup. This was deemed to be unnecessary overhead, since all basic C datatypes (aside from a struct) can fit into a register (or two), and such extra overhead for every single function call would have been a poorly performing, more RAM intensive, choice.
A C/C++ function’s return value is NOT pushed onto the stack. Let me repeat this: A C/C++ function’s return value is NOT pushed onto the stack, and therefore has absolutely no risk to “stack underflow”, and will not crash the compiler’s exit code (not startup code, as someone mistakenly alluded. How can a termination value crash code that is executed only upon startup???). I mean, how the hell can you crash the compiler’s exit code just by stuffing some integer into the register used for main’s return??? There was never any standard for what is done with that integer, so the exit code shouldn’t be making any assumptions about it at all. But it definitely isn’t a pointer (ie, “index into something” as Rob put it). (And that rules out any sort of “memory corruption” too. The return value has absolutely no reference to any memory address). It would be a really, really horribly designed compiler that produces code that crashes as a result of leaving some random integer value in the return register for main. Please, if anyone has a specific compiler that does that — and I seriously doubt anyone can produce such as an example because it appears that people are pontificating based upon what they THINK C/C++ compilers do rather than studying actual disassembled C/C++ code (from a number of compilers) as I’ve done for some years — then cite that specific compiler so I can be sure to avoid it. Find me the example. I want to see that it actually exists.
As an aside, because a C compiler puts a function’s return value in a CPU register is why you can’t return a structure from a C function. (ie, You have to return a _pointer_ to the structure). It’s because a structure typically is larger than 16 or 32 bits, and therefore will not fit into a (16 or 32 bit) CPU register. (On the other hand, a pointer will). Functions that return a double or long long use two registers. (On Intel CPU’s, the standard is EAX and EDX).
Folks, take your C compiler, and use the compiler flag that outputs the intermediary asm code. You’ll see that the return value of main() is being put into the EAX register (for Intel CPUs). Repeat this experiment with every C/C++ compiler you have. I’d be very interested to know exactly which one deviates from this standard. You can learn a LOT about what compilers do by disassembling and studying the actual code they produce. I very much recommend that folks do that here, since it’s clear to me that some people are making wholly _unsubstantiated_ allegations that would quickly be disproven by even a cursory inspection of the compiler’s actual output. Here, let me show you the very _first_ web page I googled for “gcc main disassembly”:
http://www.milw0rm.com/papers/47
Look at the code on that page. Notice the last instruction “movl $0, %eax”? That’s the C compiler’s assembly of the C instruction “return 0”. And it’s putting the return value (of 0) into the EAX register, and immediately returning, as it should. Compile that example with your own, or other, C compilers and look at the disassembled result. I would hope that this experiment results in enlightenment.
Case closed. Perhaps someone who thinks he’s “experienced”, and that others “don’t know what [they're] talking about” should avail himself of the information I’ve provided him in these posts, stop making his own unsubstantiated allegations, and learn something.
The way C/C++ functions handle parameters/return values is dictated by calling conventions. Knowing that the default C calling convention is cdecl (in VC++ at least) gives you a point. But what you said about returning a value from a command line application being obsolete is simply NOT TRUE. If you’re a VC++ developer, and I assume you are because of your comments, you must know that when you make a post-build event calling commands, i.e. regsvr32.exe or copy or xcopy, etc., you get a return value from those commands and if it’s non-zero the entire build process fails. So, even Microsoft uses return values for their programs. Can you say now that returning values from processes is outdated and obsolete?
> The way C/C++ functions handle parameters/return values is dictated by calling conventions.
That’s half correct. The calling convention deals _only_ with passing arguments (ie, input parameters) to a function. The calling convention has nothing whatsoever to do with the way that a function’s return value is returned to the caller (ie, the output). As mentioned before, a C function returns its value in a CPU register. It does not return its value on the stack. The keywords cdecl, pascal, etc, indicate whether the caller is responsible for cleaning up the stack in regards to the input parameters, or whether the called function does. That’s entirely different from a return value. cdecl, pascal, etc calling conventions have absolutely nothing to do with returning a value.
> regsvr32.exe or copy or xcopy, etc., you get a return value from those commands
I presume you’re talking about some sort of dialog box in a MS IDE that lets you type in the name of some executables, and it runs those exes (presumably passing some args to them via the STARTUPINFO struct — the “command line” as you’d probably think of it). In this case, it’s not the Windows shell running the executable, but rather the MS IDE calling the API ShellExecute() (well, most likely CreateProcess if running console mode apps. And it sounds like that’s what that dialog box is really there for). So my comments about how the Windows shell throws away the executable’s return value isn’t applicable here. (But it does).
It wouldn’t surprise me that MS has written its own console mode apps to follow the standard convention for main’s return value. (The examples you mention are all MS console mode apps). And I’m sure they presume that these console mode apps are what you’ll primarily be running from the MS IDE. So, I can see why they may abort the process when one returns a non-zero value. MS is probably assuming you aren’t going to run any other executable than a console mode app that follows the old standard.
But that’s not to say that if you happen to run an exe that declares main as returning void, your build process won’t abort at that point, even if the exe successfully did its job. It’s also not to say that if you run an app, the build process won’t continue even if the app failed to do its job. (Many GUI apps don’t bother returning anything but 0, or rather, they let the C exit code do that for them. After all, these apps aren’t designed to be run from the shell, nor from some “IDE build dialog”.) Try running MS Word from your “build dialog” and deliberately make it fail (such as by asking it to load a file that doesn’t exist), and see if you get any sort of accurate error reporting back to your build dialog.
I repeat, never ever trust that an exe today will follow the old standard. It may. Or it may not. And because you can’t be sure of a particular exe until you actually test it, that’s why the old guideline should be considered obsolete. It’s not reliable. Depending upon what exes you choose to run from that dialog box, you may find that a build process suddenly is inexplicably “broken”. (ie, It aborts when you don’t expect it to, or doesn’t when you do expect it to have noticed some failure).
> you’re a VC++ developer
I use MS dev tools when writing Win32 software. I use other tools when writing software for other OS’s.
> As an aside, because a C compiler puts a function’s return value in a
> CPU register is why you can’t return a structure from a C function.
> (ie, You have to return a _pointer_ to the structure).
What I want to say is that the C standard does allow a function to return a structure (though it is not good practice because of the memory overhead). The caller allocates the space for the return value on the stack.
given a sample program:
# cat test.c
struct A {
    int b, c, d, e;
};
struct A foo(void)
{
    struct A retval;
    retval.b = 1;
    retval.c = 2;
    retval.d = 3;
    retval.e = 4;
    return retval;
}
void test_foo(void)
{
    struct A b = foo();
}
# gcc -S test.c
# cat test.s
    .file "test.c"
    .text
.globl foo
    .type foo, @function
foo:
    pushl %ebp
    movl %esp, %ebp
    subl $24, %esp
    movl 8(%ebp), %edx
    movl $1, -24(%ebp)
    movl $2, -20(%ebp)
    movl $3, -16(%ebp)
    movl $4, -12(%ebp)
    movl -24(%ebp), %eax
    movl %eax, (%edx)
    movl -20(%ebp), %eax
    movl %eax, 4(%edx)
    movl -16(%ebp), %eax
    movl %eax, 8(%edx)
    movl -12(%ebp), %eax
    movl %eax, 12(%edx)
    movl %edx, %eax
    leave
    ret $4
    .size foo, .-foo
.globl test_foo
    .type test_foo, @function
test_foo:
    pushl %ebp
    movl %esp, %ebp
    subl $24, %esp
    leal -24(%ebp), %eax
    subl $12, %esp
    pushl %eax
    call foo
    addl $12, %esp
    leave
    ret
jd you are such an idiot.
try the following in a windows cmd shell
xcopy
echo %ERRORLEVEL%
xcopy c:\autoexec.bat c:\autoexec.tmp
echo %ERRORLEVEL%
See how the first echo prints 4, and the second one 0. Even Windows programs return 0 for successful completion; this behaviour was taken from Unix and still holds true today. The Windows entry point, WinMain, is declared as int WinMain().
You’ve run one MS console mode command, and concluded that all software doesn’t just return 0 regardless of failure? That is basing your conclusion upon the flimsiest of evidence. Try running MS Word, and pass it the name of a non-existent file to load. See what you get back then.
It seems to me that this talk about registers is distracting from the point. Any compiler that allows you to write void main() and then produces code that can bomb is a bad compiler. Either it should complain if you don’t have int main(), or it should construct code that works if you have void main(). That being said, if we are going to allow void main() then all compilers should allow it, and we should change the standard. If we change the standard to allow void main() then it should be handled consistently, perhaps meaning that main does not deliver any kind of exit status and that the compiler should take care of it. As to return values being in registers, we should remember that there are other processors out there, some of which do not have registers available for the purpose and which therefore have to return values on the stack. If you want to write truly portable code you should make no assumptions about how parameters or results are passed.
This matter is rather simple, people make a fuss about it because they don’t know the standards.
First, for your convenience I’ll explain some terms used in both standards:
Hosted environment = programming with an operating system (Windows, Linux, Unix, RTOS etc)
Freestanding environment = programming without an operating system, or programming the operating system itself.
We then have 4 cases:
C in a hosted environment
C in a freestanding environment
C++ in a hosted environment
C++ in a freestanding environment
The C++ standard chapter 3.6.1:
“Main function
A program shall contain a global function called main, which is the designated start of the program. It is implementation-defined whether a program in a freestanding environment is required to define a main function.” /–/
“An implementation shall not predefine the main function. This function shall not be overloaded. It shall have a return type of type int, but otherwise its type is implementation-defined. All implementations shall allow both of the following definitions of main:
int main() { /* … */ }
and
int main(int argc, char* argv[]) { /* … */ } ”
The C standard:
“5.1.2.1 Freestanding environment
In a freestanding environment (in which C program execution may take place without any benefit of an operating system), the name and type of the function called at program startup are implementation-defined.”
/–/
“5.1.2.2 Hosted environment” /–/
“The function called at program startup is named main. The implementation declares no prototype for this function. It shall be defined with a return type of int and with no parameters:
int main(void) { /* … */ }
or with two parameters (referred to here as argc and argv, though any names may be used, as they are local to the function in which they are declared):
int main(int argc, char *argv[]) { /* … */ }
or equivalent; or in some other implementation-defined manner.”
The above is the absolute truth. You can’t argue with the standard. You can however misinterpret it. In the C standard they end the statement with “or in some other implementation-defined manner.” This refers to the -arguments- of main, not the return type. I believe this was clarified in some errata, though I can’t cite it.
The C++ standard clearly states that:
In a freestanding environment, you don’t need to declare main(), but if you do, it must return int. Now this doesn’t make any sense as there is no underlying OS to return the value to, yet the standard says so, so we must obey. In freestanding environments you never end the program, so the calling convention of main() isn’t important. In a hosted environment, main() always returns int without exceptions.
The C standard states that:
In a freestanding environment, main() may return any type.
In a hosted environment, main() shall only return int.
In a hosted environment, the following is therefore not allowed in C/C++:
main()
void main()
nor are further variants like WinMain() legal C/C++. This has always been the case, the definition of main() has not been changed in any standard since the first ANSI/ISO C released in 1990. Early “K&R C” might have allowed all kind of crazy stuff, but it was not standardized so it doesn’t matter.
This means that main() and void main() will not compile on a C/C++ compiler made for hosted environments. There will be no runtime crashes, because the code will not even compile. If it compiles, the compiler does not follow ISO C/C++, ie it is not a C/C++ compiler but something else. If you run your C/C++ program on a “something else compiler”, yes then unexpected things are bound to happen.
This also means that most Microsoft compilers aren’t C/C++ compilers. Early Borland compilers are not C/C++ compilers. And so on.
The debate above about whether return values are pushed on the stack or not was clearly written by narrow-minded Windows/desktop programmers. Yes, return values -are- saved on the stack in some cases. What Windows programmers don’t know is that there are thousands of different embedded processors out there, all with different architectures. 90% of the computers in the world are not desktop computers. The computers controlling your car save return values on the stack.
The article is about the C/C++ language, not about disassembled Windows code. How things are done in Windows may be interesting to know, but not relevant to a C/C++ debate.
Regardless of whether the return value is pushed on the stack or returned in a register, stack corruption doesn’t occur because you declare a function like main to be void. If that were the case, you would not be able to declare any function in C/C++ void. The return type of the function tells the compiler what value to expect the function to return. If that value is to be passed on the stack, then the compiler knows to allocate space on the stack for that return value. void tells the compiler that it doesn’t need to allocate space on the stack for the return value. Now if the calling system requires that a return value be present, then the compiler for that system should either provide the stub code for returning a default value to that system or flag void main() as a violation. However, it’s ludicrous to expect compilers for other systems, which either don’t require a return value or don’t even use a return value from applications, to flag void main() as a violation. Why require extra code to be added just to satisfy cross-compliance with other systems, especially when the app is not going to be used on those systems? Of course you can have the compiler for that system strip out the unnecessary code, but why write the code in the first place? If I’m writing a word processor app then it doesn’t need to run on an embedded controller in an automobile. I doubt a fuel injection system needs to write a Dear Abby letter.