Why does the delete[] syntax exist in C++?

Tags: C++, Memory Management, Syntax, Language Lawyer, Standards

C++ Problem Overview


Every time somebody asks a question about delete[] on here, there is always a pretty general "that's how C++ does it, use delete[]" kind of response. Coming from a vanilla C background what I don't understand is why there needs to be a different invocation at all.

With malloc()/free() your options are to get a pointer to a contiguous block of memory and to free a block of contiguous memory. Something in implementation land comes along and knows what size the block you allocated was based on the base address, for when you have to free it.

There is no function free_array(). I've seen some crazy theories on other, tangentially related questions, such as that calling delete ptr will only free the top of the array, not the whole array. Or, more correctly, that the behavior is undefined. And sure... if this were the first version of C++ and someone had made a weird design choice, that would make sense. But why, with $PRESENT_YEAR's standard of C++, has it not been overloaded???

It seems the only extra thing C++ adds is walking the array and calling destructors, and I think this may be the crux of it: a separate operator exists literally to save us a single runtime length lookup (or a nullptr at the end of the list), in exchange for torturing every new C++ programmer, and every programmer who had a fuzzy day and forgot that there is a different reserved word.

Can someone please clarify once and for all if there is a reason besides "that's what the standard says and nobody questions it"?

C++ Solutions


Solution 1 - C++

Objects in C++ often have destructors that need to run at the end of their lifetime. delete[] makes sure the destructor of each element of the array is called. Doing this carries an unspecified overhead (the implementation has to be able to find out how many elements there are), while plain delete has no such overhead. This is why there are two forms of delete expression: one for arrays, which pays the overhead, and one for single objects, which does not.

In order to only have one version, an implementation would need a mechanism for tracking extra information about every pointer. But one of the founding principles of C++ is that the user shouldn't be forced to pay a cost that they don't absolutely have to.

Always delete what you new and always delete[] what you new[]. But in modern C++, new and new[] are generally not used anymore. Use std::make_unique, std::make_shared, std::vector or other more expressive and safer alternatives.
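As a rough illustration of those alternatives, the sketch below counts destructor calls when a std::unique_ptr<T[]> and a std::vector<T> go out of scope. The Tracked type, the g_destroyed counter and both function names are made up for this example; the point is only that the library types pick the matching form of delete for you.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical counter for this sketch: number of destructor calls so far.
int g_destroyed = 0;

// Hypothetical element type whose destructor just bumps the counter.
struct Tracked {
    ~Tracked() { ++g_destroyed; }
};

// Returns how many destructors run when a std::unique_ptr<Tracked[]>
// owning n elements goes out of scope.
int destroyed_by_unique_ptr(std::size_t n) {
    g_destroyed = 0;
    {
        // The array form of make_unique allocates with new[] ...
        auto items = std::make_unique<Tracked[]>(n);
    }   // ... and unique_ptr<Tracked[]> releases it with delete[] here.
    return g_destroyed;
}

// Same measurement for std::vector, which destroys its own elements.
int destroyed_by_vector(std::size_t n) {
    g_destroyed = 0;
    {
        std::vector<Tracked> items(n);
    }
    return g_destroyed;
}
```

Calling destroyed_by_unique_ptr(3) or destroyed_by_vector(3) should report one destructor call per element, with no delete or delete[] written by hand.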

Solution 2 - C++

Basically, malloc and free allocate memory, and new and delete create and destroy objects. So you have to know what the objects are.
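A small sketch of that distinction, using a made-up Noisy type and two counters that exist only for this example: malloc/free hand out and take back raw storage, while new/delete additionally run the constructor and destructor. The third function does the two halves by hand with placement new and an explicit destructor call.

```cpp
#include <cstdlib>
#include <new>

// Hypothetical counters for this sketch.
int g_constructed = 0;
int g_destroyed = 0;

// Hypothetical type that records its construction and destruction.
struct Noisy {
    Noisy()  { ++g_constructed; }
    ~Noisy() { ++g_destroyed; }
};

// malloc/free: storage only. No constructor or destructor runs.
void raw_storage_only() {
    void* p = std::malloc(sizeof(Noisy));
    std::free(p);
}

// new/delete: allocate and construct, then destroy and release.
void object_lifetime() {
    Noisy* n = new Noisy;
    delete n;
}

// The same two steps written out separately.
void by_hand() {
    void* p = std::malloc(sizeof(Noisy));
    if (p == nullptr) return;      // allocation failed, nothing to do
    Noisy* n = ::new (p) Noisy;    // placement new: construct in existing storage
    n->~Noisy();                   // explicit destructor call
    std::free(p);                  // release the raw storage
}
```

After raw_storage_only() both counters are still zero; each of the other two functions adds one construction and one destruction.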

To elaborate on the unspecified overhead François Andrieux's answer mentions, you can see my answer on this question, in which I examined what one specific implementation does (Visual C++ 2013, 32-bit). Other implementations may or may not do something similar.

When new[] was used with an array of objects with a non-trivial destructor, it allocated 4 extra bytes and returned a pointer shifted 4 bytes forward. When delete[] needs to know how many objects there are, it takes the pointer, steps back 4 bytes, and treats the number stored at that address as the element count. It then calls the destructor on each object (the size of each object is known from the type of the pointer passed). Finally, to release the block at its true start, it passes the address 4 bytes before the given pointer to the deallocation function.

On this implementation, passing an array allocated with new[] to a regular delete results in calling a single destructor, of the first element, followed by passing the wrong address to the deallocation function, corrupting the heap. Don't do it!
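The scheme described above can be imitated in ordinary code. The following is only a sketch under stated assumptions, not what any particular compiler emits: array_new, array_delete, kHeader, Widget and g_live are all names invented here, the header is padded to alignof(std::max_align_t) rather than 4 bytes so the elements stay aligned, and exception safety is ignored.

```cpp
#include <cstddef>
#include <cstring>
#include <new>

// Bytes reserved in front of the elements for the element count.
// Padding to max_align_t keeps the first element suitably aligned.
constexpr std::size_t kHeader = alignof(std::max_align_t);

// Allocate a header plus n elements, record n in the header,
// construct the elements, and return a pointer to the first one.
template <typename T>
T* array_new(std::size_t n) {
    static_assert(alignof(T) <= alignof(std::max_align_t),
                  "over-aligned types are not handled in this sketch");
    static_assert(sizeof(std::size_t) <= kHeader,
                  "header must be able to hold the count");
    auto* raw = static_cast<unsigned char*>(
        ::operator new(kHeader + n * sizeof(T)));
    std::memcpy(raw, &n, sizeof n);                 // store the count
    T* first = reinterpret_cast<T*>(raw + kHeader); // shifted pointer
    for (std::size_t i = 0; i < n; ++i)
        ::new (static_cast<void*>(first + i)) T();  // construct each element
    return first;
}

// Step back to the header, read the count, destroy the elements in
// reverse order, then release the block at its true start address.
template <typename T>
void array_delete(T* first) {
    if (first == nullptr) return;
    unsigned char* raw = reinterpret_cast<unsigned char*>(first) - kHeader;
    std::size_t n = 0;
    std::memcpy(&n, raw, sizeof n);
    while (n > 0) first[--n].~T();
    ::operator delete(raw);
}

// Hypothetical element type used to watch constructions and destructions.
int g_live = 0;
struct Widget {
    Widget()  { ++g_live; }
    ~Widget() { --g_live; }
};
```

With these helpers, array_new<Widget>(5) leaves g_live at 5 and array_delete brings it back to 0. Handing the shifted pointer straight to ::operator delete instead would pass an address the allocator never returned, which is the mismatch described above.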

Solution 3 - C++

Something not mentioned in the other (all good) answers is that the root cause of this is that arrays - inherited from C - have never been a "first-class" thing in C++.

They have primitive C semantics rather than C++ semantics, and therefore lack the C++ compiler and runtime support that would let you, or the compiler and runtime systems, do useful things with pointers to them.

In fact, they're so unsupported by C++ that a pointer to an array of things looks just like a pointer to a single thing. That, in particular, would not happen if arrays were proper parts of the language - even as part of a library, like string or vector.
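A compile-time check makes this concrete. In the sketch below (Widget and the two constant names are invented for the example), both forms of new yield exactly the same pointer type, whereas a library type such as std::unique_ptr keeps the single-object and array cases apart in the type system.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical type for this sketch.
struct Widget {
    ~Widget() {}
};

// new Widget and new Widget[3] both have the static type Widget*.
// (decltype does not evaluate its operand, so nothing is allocated.)
constexpr bool raw_pointers_look_alike =
    std::is_same<decltype(new Widget), decltype(new Widget[3])>::value;

// The smart-pointer specializations are distinct types, so the
// right form of delete can be chosen from the type alone.
constexpr bool smart_pointers_differ =
    !std::is_same<std::unique_ptr<Widget>, std::unique_ptr<Widget[]>>::value;

static_assert(raw_pointers_look_alike, "both new forms yield Widget*");
static_assert(smart_pointers_differ, "unique_ptr distinguishes T from T[]");
```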

This wart on the C++ language happened because of this heritage from C. And it remains part of the language - even though we now have std::array for fixed-length arrays and (have always had) std::vector for variable-length arrays - largely for purposes of compatibility: Being able to call out from C++ to operating system APIs and to libraries written in other languages using C-language interop.

And ... because there are truckloads of books and websites and classrooms out there teaching arrays very early in their C++ pedagogy, a) because that makes it possible to write useful and interesting examples early on that do in fact call OS APIs, and of course b) because of the awesome power of "that's the way we've always done it".

Solution 4 - C++

Generally, C++ compilers and their associated runtimes build on top of the platform's C runtime; in this case, in particular, on the C memory manager.

The C memory manager allows you to free a block of memory without knowing its size, but there is no standard way to get the size of the block from the runtime and there is no guarantee that the block that was actually allocated is exactly the size you requested. It may well be larger.

Thus the block size stored by the C memory manager can't usefully be used to enable higher-level functionality. If higher-level functionality needs information on the size of the allocation then it must store it itself. (And C++ delete[] does need this for types with destructors, to run them for every element.)

C++ also has an attitude of "you only pay for what you use". Storing an extra length field for every allocation (separate from the underlying allocator's bookkeeping) would not fit well with this attitude.

Since the normal way to represent an array of unknown (at compile time) size in C and C++ is with a pointer to its first element, there is no way the compiler can distinguish between a single object allocation and an array allocation based on the type system. So it leaves it up to the programmer to distinguish.

Solution 5 - C++

The cover story is that a separate delete[] is required because of C++'s relationship with C.

The new operator can make a dynamically allocated object of almost any object type.

But, due to the C heritage, a pointer to an object type is ambiguous between two abstractions:

  • being the location of a single object, and
  • being the base of a dynamic array.

The delete versus delete[] situation just follows from that.

However, that does not ring true, because, in spite of the above observations being true, a single delete operator could be used. It does not logically follow that two operators are required.

Here is an informal proof. The new T operator invocation (single object case) could implicitly behave as if it were new T[1]. That is to say, every new could always allocate an array. When no array syntax is mentioned, it could be implicit that an array of [1] will be allocated. Then there would just have to exist a single delete which behaves like today's delete[].

Why isn't that design followed?

I think it boils down to the usual: it's a goat that was sacrificed to the gods of efficiency. When you allocate an array with new [], extra storage is allocated for meta-data to keep track of the number of elements, so that delete [] can know how many elements need to be iterated for destruction. When you allocate a single object with new, no such meta-data is required. The object can be constructed directly in the memory which comes from the underlying allocator without any extra header.

It's a part of "don't pay for what you don't use" in terms of run-time costs. If you're allocating single objects, you don't have to "pay" for any representational overhead in those objects to deal with the possibility that any dynamic object referenced by pointer might be an array. However, you are burdened with the responsibility of encoding that information in the way you allocate the object with the array new and subsequently delete it.
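One way to observe this cost is to give a class its own allocation functions and record how many bytes each new-expression asks for. This is a sketch with invented names (Probe, g_last_request, bytes_for_single, bytes_for_array). The size of any array overhead is unspecified, so the code only relies on the array request being at least n * sizeof(Probe).

```cpp
#include <cstddef>
#include <new>

// Hypothetical global for this sketch: bytes most recently requested.
std::size_t g_last_request = 0;

struct Probe {
    int value = 0;
    ~Probe() {}  // non-trivial destructor: delete[] must know the element count

    // Class-specific allocation functions that record the requested size
    // and forward to the global ones.
    static void* operator new(std::size_t bytes) {
        g_last_request = bytes;
        return ::operator new(bytes);
    }
    static void* operator new[](std::size_t bytes) {
        g_last_request = bytes;
        return ::operator new[](bytes);
    }
    static void operator delete(void* p) noexcept { ::operator delete(p); }
    static void operator delete[](void* p) noexcept { ::operator delete[](p); }
};

// Bytes requested for one object allocated with plain new.
std::size_t bytes_for_single() {
    Probe* p = new Probe;
    const std::size_t bytes = g_last_request;
    delete p;
    return bytes;
}

// Bytes requested for n objects allocated with new[].
std::size_t bytes_for_array(std::size_t n) {
    Probe* p = new Probe[n];
    const std::size_t bytes = g_last_request;
    delete[] p;
    return bytes;
}
```

bytes_for_single() is exactly sizeof(Probe): no header is paid for. bytes_for_array(1) may come out larger than sizeof(Probe) on implementations that store an element count, which is the price a "new T always means new T[1]" design would charge on every single-object allocation.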

Solution 6 - C++

An example might help. When you allocate a C-style array of objects, those objects may have their own destructors that need to be called. Plain delete does not do that for the elements of an array. It destroys a single object (which may itself be a container object), but not a C-style array. You need delete[] for those.

Here is an example:

#include <iostream>
#include <stdlib.h>
#include <string>

using std::cerr;
using std::cout;
using std::endl;

class silly_string : private std::string {
  public:
    silly_string(const char* const s) :
      std::string(s) {}
    ~silly_string() {
      cout.flush();
      cerr << "Deleting \"" << *this << "\"."
           << endl;
      // The destructor of the base class is now implicitly invoked.
    }

  friend std::ostream& operator<< ( std::ostream&, const silly_string& );
};

std::ostream& operator<< ( std::ostream& out, const silly_string& s )
{
  // Cast to a reference to the (private) base so no temporary copy is made.
  return out << static_cast<const std::string&>(s);
}

int main()
{
  constexpr size_t nwords = 2;
  silly_string *const words = new silly_string[nwords]{
    "hello,",
    "world!" };

  cout << words[0] << ' '
       << words[1] << '\n';

  delete[] words;

  return EXIT_SUCCESS;
}

That test program explicitly instruments the destructor calls. It’s obviously a contrived example. For one thing, a program does not need to free memory immediately before it terminates and releases all its resources. But it does demonstrate what happens and in what order.

Some compilers, such as clang++, are smart enough to warn you if you leave out the [] in delete[] words;, but if you force it to compile the buggy code anyway, you get heap corruption.

Solution 7 - C++

delete is an operator that destroys objects created by a new expression, whether they are arrays or single (non-array) objects.

It comes in two forms: delete for single objects and delete[] for arrays. The new operator performs dynamic memory allocation, placing objects in heap memory, so the delete operator returns that memory to the heap. The pointer itself is not destroyed; the object or memory block it points to is. A delete expression has type void and does not yield a value.

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type      Original Author     Original Content on Stack Overflow
Question          awiebe              View Question on Stack Overflow
Solution 1 - C++  François Andrieux   View Answer on Stack Overflow
Solution 2 - C++  milleniumbug        View Answer on Stack Overflow
Solution 3 - C++  davidbak            View Answer on Stack Overflow
Solution 4 - C++  plugwash            View Answer on Stack Overflow
Solution 5 - C++  Kaz                 View Answer on Stack Overflow
Solution 6 - C++  Davislor            View Answer on Stack Overflow
Solution 7 - C++  Aditya-ai           View Answer on Stack Overflow