To GC or Not To GC

Tags: C++, Performance, Garbage Collection, D, Heap Memory

C++ Problem Overview


I've recently seen two really nice and educational language talks:

The first one, by Herb Sutter, presents all the nice and cool features of C++0x, why C++'s future seems brighter than ever, and how M$ is said to be a good guy in this game. The talk revolves around efficiency and how minimizing heap activity very often improves performance.

The other one, by Andrei Alexandrescu, motivates a transition from C/C++ to his new game-changer, D. Most of D's features seem really well motivated and designed. One thing, however, surprised me, namely that D pushes for garbage collection and that all class instances are created solely by reference. Even more confusingly, the D Programming Language reference manual, specifically in the section about Resource Management, states the following, quote:

Garbage collection eliminates the tedious, error prone memory allocation tracking code necessary in C and C++. This not only means much faster development time and lower maintenance costs, but the resulting program frequently runs faster!

This conflicts with Sutter's constant talk about minimizing heap activity. I strongly respect both Sutter's and Alexandrescu's insights, so I feel a bit confused about these two key questions:

  1. Doesn't creating class instances solely by reference result in a lot of unnecessary heap activity?

  2. In which cases can we use Garbage Collection without sacrificing run-time performance?

C++ Solutions


Solution 1 - C++

To directly answer your two questions:

  1. Yes, creating class instances by reference does result in a lot of heap activity, but:

    a. In D, you have struct as well as class. A struct has value semantics and can do everything a class can, except polymorphism.

    b. Polymorphism and value semantics have never worked well together, due to the slicing problem (see the C++ sketch after this list).

    c. In D, if you really need to allocate a class instance on the stack in some performance-critical code and don't care about the loss of safety, you can do so without unreasonable hassle via the scoped function.

  2. GC can be comparable to or faster than manual memory management if:

    a. You still allocate on the stack where possible (as you typically do in D) instead of relying on the heap for everything (as you often do in other GC'd languages).

    b. You have a top-of-the-line garbage collector (D's current GC implementation is admittedly somewhat naive, though it has seen some major optimizations in the past few releases, so it's not as bad as it was).

    c. You're allocating mostly small objects. If you allocate mostly large arrays and performance ends up being a problem, you may want to switch a few of these to the C heap (you have access to C's malloc and free in D) or, if it has a scoped lifetime, some other allocator like RegionAllocator. (RegionAllocator is currently being discussed and refined for eventual inclusion in D's standard library).

    d. You don't care that much about space efficiency. If you make the GC run too frequently to keep the memory footprint ultra-low, performance will suffer.
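
To make point 1b concrete, here is a minimal C++ sketch of the slicing problem (Animal and Dog are made-up illustrative types, not anything from the talks or from D's library):

#include <iostream>

struct Animal {
    virtual void speak() const { std::cout << "...\n"; }
    virtual ~Animal() = default;
};

struct Dog : Animal {
    void speak() const override { std::cout << "Woof\n"; }
};

int main() {
    Dog d;
    Animal a = d;   // value copy: the Dog-specific part is sliced off
    a.speak();      // prints "...", not "Woof"; polymorphism is lost
    Animal& r = d;  // reference semantics preserve the dynamic type
    r.speak();      // prints "Woof"
}

Because a by-value copy keeps only the base-class part of the object, polymorphic types are normally handled by reference or pointer, which is essentially the trade-off behind D's class/struct split.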

Solution 2 - C++

The reason creating an object on the heap is slower than creating it on the stack is that the memory allocation methods need to deal with things like heap fragmentation. Allocating memory on the stack is as simple as incrementing the stack pointer (a constant-time operation).

Yet with a compacting garbage collector, you don't have to worry about heap fragmentation, so heap allocations can be as fast as stack allocations. The Garbage Collection page for the D Programming Language explains this in more detail.
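
To make the pointer-bump idea concrete, here is a minimal C++ arena-allocator sketch (BumpArena is a hypothetical illustrative class, not D's GC or any real collector): each allocation is just an alignment round-up, a bounds check, and an offset increment.

#include <cstddef>

// Hypothetical bump ("arena") allocator. A compacting collector can hand out
// memory from its contiguous free region in essentially the same way.
class BumpArena {
public:
    explicit BumpArena(std::size_t capacity)
        : buffer_(new std::byte[capacity]), capacity_(capacity), offset_(0) {}
    ~BumpArena() { delete[] buffer_; }

    // 'align' must be a power of two; the buffer from new[] is already
    // aligned for std::max_align_t.
    void* allocate(std::size_t size, std::size_t align = alignof(std::max_align_t)) {
        std::size_t aligned = (offset_ + align - 1) & ~(align - 1); // round up
        if (aligned > capacity_ || capacity_ - aligned < size)
            return nullptr;               // arena exhausted
        offset_ = aligned + size;         // the "bump": a single addition
        return buffer_ + aligned;
    }

private:
    std::byte* buffer_;
    std::size_t capacity_;
    std::size_t offset_;
};

This sketch never frees individual objects; the whole arena is released at once, which is roughly the situation a compacting GC gives you between collections.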

The assertion that GC'd languages run faster probably assumes that many programs allocate memory on the heap much more often than on the stack. If heap allocation really is faster in a GC'd language, then it follows that you have just optimized a huge part of most programs (heap allocation).

Solution 3 - C++

An answer to 1):

As long as your heap is contiguous, allocating on it is just as cheap as allocating on the stack.

On top of that, as long as you allocate objects that lie next to each other, your memory caching performance will be great.

As long as you don't have to run the garbage collector, no performance is lost, and the heap stays contiguous.

That's the good news :)

Answer to 2):

GC technology has advanced greatly; collectors even come in real-time flavors nowadays. That means that guaranteeing contiguous memory is a policy-driven, implementation-dependent issue.

So if

  • you can afford a real-time GC,
  • there are enough pauses between allocations in your application, and
  • the collector can keep your free list as one contiguous free block,

you may end up with better performance.

Answer to unasked question:

If developers are freed from memory-management issues, they may have more time to spend on real performance and scalability aspects in their code. That's a non-technical factor coming into play, too.

Solution 4 - C++

It's not a choice between "garbage collection" and "tedious, error-prone" handwritten code. Smart pointers that are truly smart can give you stack semantics, which means you never type "delete", yet you aren't paying for garbage collection. Here's another video by Herb that makes the point: safe and fast is what we want.
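
For instance, a plain local object or a std::unique_ptr in modern C++ gives deterministic cleanup with no collector involved (Widget and useWidget are placeholder names for illustration):

#include <memory>

struct Widget {
    void doWork() {}
};

void useWidget() {
    // Heap allocation with stack-like, scoped lifetime: the destructor runs
    // automatically when 'w' goes out of scope; no 'delete', no GC pause.
    auto w = std::make_unique<Widget>();
    w->doWork();
}   // Widget destroyed here, deterministically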

Solution 5 - C++

Another point to consider is the 80:20 rule. It is likely that the vast majority of the places you allocate are irrelevant, and you won't gain much over a GC there even if you could push the cost to zero. If you accept that, then the simplicity you gain by using a GC can offset the cost of using it. This is particularly true if you can avoid making copies. What D provides is a GC for the 80% of cases and access to stack allocation and malloc for the 20%.

Solution 6 - C++

Even if you had an ideal garbage collector, it would still be slower than creating things on the stack, so you need a language that allows both at the same time. Furthermore, the only way to achieve the same performance with a garbage collector as with manually managed memory allocations (done the right way) is to make it do the same things with memory that an experienced developer would have done, and in many cases that would require garbage-collection decisions to be made at compile time and executed at run time.

Usually, garbage collection makes things slower: languages that work only with dynamic memory are slower, the predictability of their programs' execution is lower, and their latency is higher. Frankly, I personally don't see why one would need a garbage collector. Managing memory manually is not hard, at least not in C++. Of course, I wouldn't mind the compiler generating code that cleans everything up for me just as I would have done, but that doesn't seem possible at the moment.

Solution 7 - C++

In many cases a compiler can optimize heap-allocation back to stack allocation. This is the case if your object doesn't escape the local scope.

A decent compiler will almost certainly make x stack-allocated in the following example:

void f() {
    Foo* x = new Foo();
    x->doStuff(); // Assuming doStuff doesn't assign 'this' anywhere
    // delete x or assume the GC gets it
}

What the compiler does is called escape analysis.

Also, D could in theory have a moving GC, which means potential performance improvements by improved cache usage when the GC compacts your heap objects together. It also combats heap fragmentation as explained in Jack Edmonds' answer. Similar things can be done with manual memory management, but it's extra work.

Solution 8 - C++

An incremental, low-priority GC will collect garbage when high-priority tasks are not running. The high-priority threads will run faster since no memory deallocation will be done. This is the idea behind Henriksson's real-time Java GC; see http://www.oracle.com/technetwork/articles/javase/index-138577.html

Solution 9 - C++

Garbage collection does in fact slow code down. It adds extra functionality to the program that has to run in addition to your code. There are other problems with it as well, such as the GC not running until memory is actually needed, which can result in small memory leaks. Another issue is that if a reference is not removed properly, the GC will not pick the object up, which once again results in a leak.

My other issue with GC is that it somewhat promotes laziness in programmers. I'm an advocate of learning the low-level concepts of memory management before jumping to a higher level. It's like mathematics: you learn how to solve for the roots of a quadratic, or how to take a derivative by hand first, and then you learn how to do it on the calculator. Use these things as tools, not crutches.

If you don't want to hit your performance, be smart about the GC and your heap vs stack usage.

Solution 10 - C++

My point is that GC is inferior to malloc when you do normal procedural programming. You just go from procedure to procedure, allocate and free, use global variables, and declare some functions _inline or _register. This is C style.

But once you move to a higher abstraction layer, you need at least reference counting: you pass objects by reference, count the references, and free the object once the counter reaches zero. This is good, and superior to malloc once the amount and hierarchy of objects become too difficult to manage manually. This is the C++ style. You define constructors and destructors that increment and decrement the counters, and you copy-on-modify, so a shared object splits in two once one party modifies part of it while another party still needs the original value. That way you can pass huge amounts of data from function to function without thinking about whether you need to copy the data here or just send a pointer there. The ref-counting makes those decisions for you.
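
As a rough C++ illustration of that style, std::shared_ptr does the reference counting, and a small copy-on-write wrapper splits the shared data only when one party modifies it (CowString is a made-up example type, not a standard facility):

#include <memory>
#include <string>
#include <utility>

// Minimal copy-on-write sketch: copies share one buffer through a reference
// count, and the buffer is cloned only when a writer needs its own copy.
class CowString {
public:
    explicit CowString(std::string s)
        : data_(std::make_shared<std::string>(std::move(s))) {}

    const std::string& read() const { return *data_; }

    void append(const std::string& tail) {
        if (data_.use_count() > 1)                          // still shared?
            data_ = std::make_shared<std::string>(*data_);  // clone before writing
        *data_ += tail;
    }

private:
    std::shared_ptr<std::string> data_;
};

// CowString a("hello");
// CowString b = a;     // cheap copy: only the reference count goes up
// b.append(" world");  // b clones its buffer; a still reads "hello"

Note that use_count() is only a reliable check in single-threaded code, so this is an illustration of the idea rather than production-quality copy-on-write.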

Then comes a whole new world: closures, functional programming, duck typing, circular references, asynchronous execution. Code and data start mixing; you find yourself passing functions as parameters more often than normal data. You realize that metaprogramming can be done without macros or templates. Your code starts to soar into the sky and lose its solid ground, because you are executing something inside callbacks of callbacks of callbacks, data becomes unrooted, things become asynchronous, and you get addicted to closure variables. This is where a timer-based, memory-walking GC is the only possible solution; otherwise closures and circular references are not possible at all. This is the JavaScript way.

You mentioned D, but D is still an improved C++, so malloc or ref counting in constructors, stack allocation, and global variables (even if they are complicated trees of entities of all kinds) are probably what you will choose.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Nordlöw | View Question on Stackoverflow
Solution 1 - C++ | dsimcha | View Answer on Stackoverflow
Solution 2 - C++ | Jack Edmonds | View Answer on Stackoverflow
Solution 3 - C++ | xtofl | View Answer on Stackoverflow
Solution 4 - C++ | Kate Gregory | View Answer on Stackoverflow
Solution 5 - C++ | BCS | View Answer on Stackoverflow
Solution 6 - C++ | user405725 | View Answer on Stackoverflow
Solution 7 - C++ | mpartel | View Answer on Stackoverflow
Solution 8 - C++ | Jonas W | View Answer on Stackoverflow
Solution 9 - C++ | MGZero | View Answer on Stackoverflow
Solution 10 - C++ | exebook | View Answer on Stackoverflow