In C++, am I paying for what I am not eating?

C++ Problem Overview

Let's consider the following hello world examples in C and C++:

main.c

#include <stdio.h>

int main()
{
	printf("Hello world\n");
	return 0;
}

main.cpp

#include <iostream>

int main()
{
	std::cout<<"Hello world"<<std::endl;
	return 0;
}

When I compile them in godbolt to assembly, the size of the C code is only 9 lines (gcc -O3):

.LC0:
        .string "Hello world"
main:
        sub     rsp, 8
        mov     edi, OFFSET FLAT:.LC0
        call    puts
        xor     eax, eax
        add     rsp, 8
        ret

But the size of the C++ code is 22 lines (g++ -O3):

.LC0:
        .string "Hello world"
main:
        sub     rsp, 8
        mov     edx, 11
        mov     esi, OFFSET FLAT:.LC0
        mov     edi, OFFSET FLAT:_ZSt4cout
        call    std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)
        mov     edi, OFFSET FLAT:_ZSt4cout
        call    std::basic_ostream<char, std::char_traits<char> >& std::endl<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&)
        xor     eax, eax
        add     rsp, 8
        ret
_GLOBAL__sub_I_main:
        sub     rsp, 8
        mov     edi, OFFSET FLAT:_ZStL8__ioinit
        call    std::ios_base::Init::Init() [complete object constructor]
        mov     edx, OFFSET FLAT:__dso_handle
        mov     esi, OFFSET FLAT:_ZStL8__ioinit
        mov     edi, OFFSET FLAT:_ZNSt8ios_base4InitD1Ev
        add     rsp, 8
        jmp     __cxa_atexit

... which is much larger.

It is famous that in C++ you pay for what you eat. So, in this case, what am I paying for?

C++ Solutions

Solution 1 - C++

> So, in this case, what am I paying for?

std::cout is more powerful and complicated than printf. It supports things like locales, stateful formatting flags, and more.

If you don't need those, use std::printf or std::puts - they're available in <cstdio>.

> It is famous that in C++ you pay for what you eat.

I also want to make it clear that C++ != The C++ Standard Library. The Standard Library is supposed to be general-purpose and "fast enough", but it will often be slower than a specialized implementation of what you need.

On the other hand, the C++ language strives to make it possible to write code without paying unnecessary extra hidden costs (e.g. opt-in virtual, no garbage collection).

Solution 2 - C++

You are not comparing C and C++. You are comparing printf and std::cout, which are capable of different things (locales, stateful formatting, etc).

Try to use the following code for comparison. Godbolt generates the same assembly for both files (tested with gcc 8.2, -O3).

main.c:

#include <stdio.h>

int main()
{
    int arr[6] = {1, 2, 3, 4, 5, 6};
    for (int i = 0; i < 6; ++i)
    {
        printf("%d\n", arr[i]);
    }
    return 0;
}

main.cpp:

#include <array>
#include <cstdio>

int main()
{
    std::array<int, 6> arr {1, 2, 3, 4, 5, 6};
    for (auto x : arr)
    {
        std::printf("%d\n", x);
    }
}

Solution 3 - C++

Your listings are indeed comparing apples and oranges, but not for the reason implied in most other answers.

Let’s check what your code actually does:

C:

print a single string, "Hello world\n"

C++:

stream the string "Hello world" into std::cout
stream the std::endl manipulator into std::cout

Apparently your C++ code is doing twice as much work. For a fair comparison we should combine this:

#include <iostream>

int main()
{
    std::cout<<"Hello world\n";
    return 0;
}

… and suddenly your assembly code for main looks very similar to C’s:

main:
        sub     rsp, 8
        mov     esi, OFFSET FLAT:.LC0
        mov     edi, OFFSET FLAT:_ZSt4cout
        call    std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
        xor     eax, eax
        add     rsp, 8
        ret

In fact, we can compare the C and C++ code line by line, and there are very few differences:

sub     rsp, 8                      sub     rsp, 8
mov     edi, OFFSET FLAT:.LC0   |   mov     esi, OFFSET FLAT:.LC0
                                >   mov     edi, OFFSET FLAT:_ZSt4cout
call    puts                    |   call    std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)
xor     eax, eax                    xor     eax, eax
add     rsp, 8                      add     rsp, 8
ret                                 ret

The only real difference is that in C++ we call operator << with two arguments (std::cout and the string). We could remove even that slight difference by using a closer C equivalent: fprintf, which also has a first argument specifying the stream.

This leaves the assembly code for _GLOBAL__sub_I_main, which is generated for C++ but not C. This is the only true overhead that’s visible in this assembly listing (there’s more, invisible overhead for both languages, of course). This code performs a one-time setup of some C++ standard library functions at the start of the C++ program.

But, as explained in other answers, the relevant difference between these two programs won’t be found in the assembly output of the main function since all the heavy lifting happens behind the scenes.

Solution 4 - C++

What you are paying for is calling a heavy library (though not as heavy as the console output itself). You initialize an ostream object, which involves some hidden storage. Then you call std::endl, which is not a synonym for \n. The iostream library lets you adjust many settings and puts the burden on the processor rather than on the programmer. This is what you are paying for.

Let's review the code:

.LC0:
        .string "Hello world"
main:

Writing the string into the ostream object cout

    sub     rsp, 8
    mov     edx, 11
    mov     esi, OFFSET FLAT:.LC0
    mov     edi, OFFSET FLAT:_ZSt4cout
    call    std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)

Calling cout again to print a new line and flush

    mov     edi, OFFSET FLAT:_ZSt4cout
    call    std::basic_ostream<char, std::char_traits<char> >& std::endl<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&)
    xor     eax, eax
    add     rsp, 8
    ret

Static storage initialization:

_GLOBAL__sub_I_main:
        sub     rsp, 8
        mov     edi, OFFSET FLAT:_ZStL8__ioinit
        call    std::ios_base::Init::Init() [complete object constructor]
        mov     edx, OFFSET FLAT:__dso_handle
        mov     esi, OFFSET FLAT:_ZStL8__ioinit
        mov     edi, OFFSET FLAT:_ZNSt8ios_base4InitD1Ev
        add     rsp, 8
        jmp     __cxa_atexit

Also, it is essential to distinguish between the language and the library.

BTW, this is just a part of the story. You do not know what is written in the functions you are calling.

Solution 5 - C++

> It is famous that in C++ you pay for what you eat. So, in this case, what am I paying for?

That's simple. You pay for std::cout. "You pay for only what you eat" doesn't mean "you always get best prices". Sure, printf is cheaper. One can argue that std::cout is safer and more versatile, thus its greater cost is justified (it costs more, but provides more value), but that misses the point. You don't use printf, you use std::cout, so you pay for using std::cout. You don't pay for using printf.

A good example is virtual functions. Virtual functions have some runtime cost and space requirements - but only if you actually use them. If you don't use virtual functions, you don't pay anything.

A few remarks

Even if the C++ code compiles to more assembly instructions, it's still only a handful of instructions, and any performance overhead is likely dwarfed by the actual I/O operations.
Actually, sometimes it's even better than "in C++ you pay for what you eat". For example, the compiler can deduce that a virtual function call is not needed in some circumstances and transform it into a non-virtual call. That means you may get virtual functions for free. Isn't that great?

Solution 6 - C++

The "assembly listing for printf" is NOT for printf, but for puts (a compiler optimization of some kind?); printf is quite a bit more complex than puts... don't forget!

Solution 7 - C++

I see some valid answers here, but I'm going to get a little bit more into the detail.

Jump to the summary below for the answer to your main question if you don't want to go through this entire wall of text.

Abstraction

> So, in this case, what am I paying for?

You are paying for abstraction. Being able to write simpler and more human friendly code comes at a cost. In C++, which is an object-oriented language, almost everything is an object. When you use any object, three main things will always happen under the hood:

Object creation, basically memory allocation for the object itself and its data.
Object initialization (usually via some init() method). Usually memory allocation happens under the hood as the first thing in this step.
Object destruction (not always).

You don't see it in the code, but every single time you use an object all of the three above things need to happen somehow. If you were to do everything manually the code would obviously be way longer.

Now, abstraction can be made efficiently without adding overhead: method inlining and other techniques can be used by both compilers and programmers to remove overheads of abstraction, but this is not your case.

What's really happening in C++?

Here it is, broken down:

A static std::ios_base::Init object is constructed before main runs; its constructor makes sure the standard stream objects are initialized.
The std::cout object is therefore ready to use by the time main starts.
Your string is loaded and passed, together with its length, to std::__ostream_insert, a library helper function (used by the << operator) that writes the characters to the stream.
std::endl is then called with std::cout as its argument; it writes a newline and flushes the stream.
The destructor of the std::ios_base::Init object is registered with __cxa_atexit (along with the object's address and __dso_handle), so that it runs when the program exits and the streams get flushed and cleaned up.

So using C == not paying for anything?

In the C code, very few steps are happening:

Your string is loaded and passed to puts via the edi register.
puts gets called.

No objects anywhere, hence no need to initialize/destroy anything.

This however doesn't mean that you're not "paying" for anything in C. You are still paying for abstraction; initialization of the C standard library and dynamic resolution of the printf function (or, actually, puts, which the compiler substitutes since you don't need any format string) still happen under the hood.

If you were to write this program in pure assembly it would look something like this:

jmp start

msg db "Hello world", 10    ; 10 is the newline byte

start:
	mov rdi, 1          ; file descriptor 1: stdout
	mov rsi, offset msg
	mov rdx, 12         ; message length, including the newline
	mov rax, 1          ; write
	syscall
	xor rdi, rdi
	mov rax, 60         ; exit
	syscall

Which basically only results in invoking the write syscall followed by the exit syscall. Now this would be the bare minimum to accomplish the same thing.

To summarize

C is far more bare-bones: it does only the minimum that is needed and leaves full control to the user, who can optimize and customize basically anything they want. You tell the processor to load a string into a register and then call a library function that uses that string. C++, on the other hand, is far more complex and abstract. This is an enormous advantage when writing complicated code, and allows for code that is easier to write and more human friendly, but it comes at a cost. There's always going to be a performance drawback in C++ compared to C in cases like this, since C++ offers more than what's needed to accomplish such basic tasks, and thus adds more overhead.

Answering your main question:

> Am I paying for what I am not eating?

In this specific case, yes. You are not taking advantage of anything that C++ has to offer more than C, but that's just because there's nothing in that simple piece of code that C++ could help you with: it is so simple that you really do not need C++ at all.

Oh, and just one more thing!

The advantages of C++ may not look obvious at first glance, since you wrote a very simple and small program, but look at a slightly more complex example and see the difference (both programs do the exact same thing):

C:

#include <stdio.h>
#include <stdlib.h>

int cmp(const void *a, const void *b) {
	int x = *(const int *)a, y = *(const int *)b;
	return (x > y) - (x < y); /* avoids the overflow that x - y can cause */
}

int main(void) {
	int i, n, *arr;

	printf("How many integers do you want to input? ");
	scanf("%d", &n);

	arr = malloc(sizeof(int) * n);

	for (i = 0; i < n; i++) {
		printf("Index %d: ", i);
		scanf("%d", &arr[i]);
	}

	qsort(arr, n, sizeof(int), cmp);

	puts("Here are your numbers, ordered:");

	for (i = 0; i < n; i++)
		printf("%d\n", arr[i]);

	free(arr);

	return 0;
}

C++:

#include <iostream>
#include <vector>
#include <algorithm>

using namespace std;

int main(void) {
	int n;

	cout << "How many integers do you want to input? ";
	cin >> n;

	vector<int> vec(n);

	for (size_t i = 0; i < vec.size(); i++) {
		cout << "Index " << i << ": ";
		cin >> vec[i];
	}

	sort(vec.begin(), vec.end());

	cout << "Here are your numbers, ordered:" << endl;

	for (int item : vec)
		cout << item << endl;

	return 0;
}

Hopefully you can clearly see what I mean here. Also notice how in C you have to manage memory at a lower level using malloc and free, how you need to be more careful about indexing and sizes, and how you need to be very specific when taking input and printing.

Solution 8 - C++

There are a few misconceptions to start with. First, the C++ program does not result in 22 instructions, it's more like 22,000 of them (I pulled that number from my hat, but it's approximately in the ballpark). Also, the C code doesn't result in 9 instructions, either. Those are only the ones you see.

What the C code does is, after doing a lot of stuff that you don't see, it calls a function from the CRT (which is usually but not necessarily present as shared lib), then does not check for the return value or handle errors, and bails out. Depending on compiler and optimization settings it doesn't even really call printf but puts, or something even more primitive.
You could have written more or less the same program (except for some invisible init functions) in C++ as well, if only you called that same function the same way. Or, if you want to be super-correct, that same function prefixed with std::.

The corresponding C++ code is in reality not at all the same thing. While the whole of <iostream> is well known for being a fat ugly pig that adds an immense overhead to small programs (in a "real" program you don't really notice it that much), a somewhat fairer interpretation is that it does an awful lot of stuff that you don't see and which just works. This includes, but is not limited to, magical formatting of pretty much any haphazard stuff, including different number formats and locales and whatnot, as well as buffering and proper error handling. Error handling? Well yes, guess what, outputting a string can actually fail, and unlike the C program, the C++ program would not ignore this silently. Considering what std::ostream does under the hood, without anyone being aware of it, it's actually pretty lightweight. Not that I use it, because I hate the stream syntax with a passion. But still, it's pretty awesome if you consider what it does.

But sure, C++ overall is not as efficient as C can be. It cannot be as efficient since it is not the same thing and it isn't doing the same thing. If nothing else, C++ generates exceptions (and code to generate, handle, or fail on them) and it gives some guarantees that C doesn't give. So, sure, a C++ program kinda necessarily needs to be a little bit bigger. In the big picture, however, this does not matter in any way. On the contrary, for real programs, I have quite often found C++ performing better because, for one reason or another, it seems to lend itself to more favorable optimizations. Don't ask me why in particular, I wouldn't know.

If, instead of fire-and-forget-hope-for-the-best you care to write C code which is correct (i.e. you actually check for errors, and the program behaves correctly in presence of errors) then the difference is marginal, if existent.

Solution 9 - C++

You are paying for a mistake. In the 80s, when compilers weren't good enough to check format strings, operator overloading was seen as a good way to enforce some semblance of type safety during I/O. However, every one of its banner features is either implemented badly or conceptually bankrupt from the start:

<iomanip>

The most repugnant part of the C++ stream io api is the existence of this formatting header library. Besides being stateful and ugly and error prone, it couples formatting to the stream.

Suppose you want to print out a line with an 8-digit zero-filled hex unsigned int, followed by a space, followed by a double with 3 decimal places. With <cstdio>, you get to read a concise format string. With <ostream>, you have to save the old state, set alignment to right, set fill character, set fill width, set base to hex, output the integer, restore saved state (otherwise your integer formatting will pollute your float formatting), output the space, set notation to fixed, set precision, output the double and the newline, then restore the old formatting.

// <cstdio>
std::printf( "%08x %.3lf\n", ival, fval );

// <ostream> & <iomanip>
std::ios old_fmt {nullptr};
old_fmt.copyfmt (std::cout);
std::cout << std::right << std::setfill('0') << std::setw(8) << std::hex << ival;
std::cout.copyfmt (old_fmt);
std::cout << " " << std::fixed << std::setprecision(3) << fval << "\n";
std::cout.copyfmt (old_fmt);

Operator Overloading

<iostream> is the poster child of how not to use operator overloading:

std::cout << 2 << 3 && 0 << 5;

Performance

std::cout is several times slower than printf(). The rampant featuritis and virtual dispatch do take their toll.

Thread Safety

Both <cstdio> and <iostream> are thread safe in that every function call is atomic. But, printf() gets a lot more done per call. If you run the following program with the <cstdio> option, you will see only a row of f. If you use <iostream> on a multicore machine, you will likely see something else.

// g++ -Wall -Wextra -Wpedantic -pthread -std=c++17 cout.test.cpp

#define USE_STREAM 1
#define REPS 50
#define THREADS 10

#include <thread>
#include <vector>

#if USE_STREAM
	#include <iostream>
#else
	#include <cstdio>
#endif

void task()
{
	for ( int i = 0; i < REPS; ++i )
#if USE_STREAM
		std::cout << std::hex << 15 << std::dec;
#else
		std::printf ( "%x", 15);
#endif

}

int main()
{
	auto threads = std::vector<std::thread> {};
	for ( int i = 0; i < THREADS; ++i )
		threads.emplace_back(task);

	for ( auto & t : threads )
		t.join();

#if USE_STREAM
	std::cout << "\n<iostream>\n";
#else
	std::printf ( "\n<cstdio>\n" );
#endif
}

The retort to this example is that most people exercise discipline to never write to a single file descriptor from multiple threads anyway. Well, in that case, you'll have to observe that <iostream> will helpfully grab a lock on every << and every >>. Whereas in <cstdio>, you won't be locking as often, and you even have the option of not locking.

<iostream> expends more locks to achieve a less consistent result.

Solution 10 - C++

In addition to what all the other answers have said, there's also the fact that std::endl is not the same as '\n'.

This is an unfortunately common misconception. std::endl does not mean "new line", it means "print new line and then flush the stream". Flushing is not cheap!

Completely ignoring the differences between printf and std::cout for a moment, to be functionally equivalent to your C example, your C++ example ought to look like this:

#include <iostream>

int main()
{
    std::cout << "Hello world\n";
    return 0;
}

And here is what your examples should look like if you include flushing:

C

#include <stdio.h>

int main()
{
    printf("Hello world\n");
    fflush(stdout);
    return 0;
}

C++

#include <iostream>

int main()
{
    std::cout << "Hello world\n";
    std::cout << std::flush;
    return 0;
}

When comparing code, you should always be careful that you're comparing like for like and that you understand the implications of what your code is doing. Sometimes even the simplest examples are more complicated than some people realise.

Solution 11 - C++

While the existing technical answers are correct, I think that the question ultimately stems from this misconception:

> It is famous that in C++ you pay for what you eat.

This is just marketing talk from the C++ community. (To be fair, there's marketing talk in every language community.) It doesn't mean anything concrete that you can seriously depend on.

"You pay for what you use" is supposed to mean that a C++ feature only has overhead if you're using that feature. But the definition of "a feature" is not infinitely granular. Often you will end up activating features that have multiple aspects, and even though you only need a subset of those aspects, it's often not practical or possible for the implementation to bring the feature in partially.

In general, many (though arguably not all) languages strive to be efficient, with varying degrees of success. C++ is somewhere on the scale, but there is nothing special or magical about its design that would allow it to be perfectly successful in this goal.

Solution 12 - C++

The Input / Output functions in C++ are elegantly written and are designed so they are simple to use. In many respects they are a showcase for the object-orientated features in C++.

You do indeed give up a bit of performance in return, but that's negligible compared to the time taken by your operating system to handle the functions at a lower level.

You can always fall back to the C style functions as they are part of the C++ standard, or perhaps give up portability altogether and use direct calls to your operating system.

Solution 13 - C++

As you have seen in other answers, you pay when you link in general libraries and call complex constructors. There is no particular question here, more a gripe. I'll point out some real-world aspects:

Bjarne had a core design principle to never let efficiency be a reason for staying in C rather than C++. That said, one needs to be careful to get these efficiencies, and there are occasional efficiencies that always worked but were not 'technically' within the C spec. For example, the layout of bit fields was not really specified.
Try looking through ostream. Oh my god, it's bloated! I wouldn't be surprised to find a flight simulator in there. Even stdlib's printf() usually runs about 50K. These aren't lazy programmers: half of the printf size has to do with indirect precision arguments that most people never use. Almost every really constrained processor's library creates its own output code instead of printf.
The increase in size usually provides a more contained and flexible experience. As an analogy, a vending machine will sell a cup of coffee-like-substance for a few coins and the whole transaction takes under a minute. Dropping into a good restaurant involves a table setting, being seated, ordering, waiting, getting a nice cup, getting a bill, paying in your choice of forms, adding a tip, and being wished a good day on your way out. It's a different experience, and more convenient if you are dropping in with friends for a complex meal.
People still write ANSI C, though rarely K&R C. My experience is we always compile it with a C++ compiler using a few configuration tweaks to limit what is dragged in. There are good arguments for other languages: Go removes the polymorphic overhead and crazy preprocessor; there have been some good arguments for smarter field packing and memory layout. IMHO, any language design should start with a listing of goals, much like the Zen of Python.

It's been a fun discussion. You ask why can't you have magically small, simple, elegant, complete, and flexible libraries?

There is no answer. There will not be an answer. That is the answer.

Content Type	Original Author	Original Content on Stackoverflow
Question	Saher	View Question on Stackoverflow
Solution 1 - C++	Vittorio Romeo	View Answer on Stackoverflow
Solution 2 - C++	pschill	View Answer on Stackoverflow
Solution 3 - C++	Konrad Rudolph	View Answer on Stackoverflow
Solution 4 - C++	Arash	View Answer on Stackoverflow
Solution 5 - C++	el.pescado - нет войне	View Answer on Stackoverflow
Solution 6 - C++	Álvaro Gustavo López	View Answer on Stackoverflow
Solution 7 - C++	Marco Bonelli	View Answer on Stackoverflow
Solution 8 - C++	Damon	View Answer on Stackoverflow
Solution 9 - C++	KevinZ	View Answer on Stackoverflow
Solution 10 - C++	Pharap	View Answer on Stackoverflow
Solution 11 - C++	Theodoros Chatzigiannakis	View Answer on Stackoverflow
Solution 12 - C++	Bathsheba	View Answer on Stackoverflow
Solution 13 - C++	Charles Merriam	View Answer on Stackoverflow