C++ doesn't tell you the size of a dynamic array. But why?

C++, Dynamic Arrays

C++ Problem Overview


I know that there is no way in C++ to obtain the size of a dynamically created array, such as:

int* a;
a = new int[n];

What I would like to know is: Why? Did people just forget this in the specification of C++, or is there a technical reason for this?

Isn't the information stored somewhere? After all, the command

delete[] a;

seems to know how much memory it has to release, so it seems to me that delete[] has some way of knowing the size of a.

C++ Solutions


Solution 1 - C++

This follows from the fundamental rule of "don't pay for what you don't need". In your example, delete[] a; doesn't need to know the number of elements in the array, because int doesn't have a destructor. If you had written:

std::string* a;
a = new std::string[n];
...
delete [] a;

then delete[] has to call destructors (and needs to know how many to call), in which case new[] has to save that count. However, since the count doesn't need to be saved in every case, Bjarne decided not to give access to it.

(In hindsight, I think this was a mistake ...)
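To make the destructor point concrete, here is a minimal sketch. The type Tracked and the helper count_destructor_calls are hypothetical names invented for this illustration; the sketch only shows that delete[] runs one destructor per element, not how any particular implementation stores the count.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical element type that counts how many destructors have run.
struct Tracked {
    static int destroyed;
    ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// Allocates n elements with new[], releases them with delete[], and
// returns how many destructor calls delete[] made.
inline int count_destructor_calls(std::size_t n) {
    Tracked::destroyed = 0;
    Tracked* a = new Tracked[n];
    delete[] a;  // has to run ~Tracked() once per element
    return Tracked::destroyed;
}
```

Calling count_destructor_calls(5) returns 5, even though delete[] is given nothing but the pointer, so the element count must have been recorded somewhere by new[].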

Even with int of course, something has to know about the size of the allocated memory, but:

  • Many allocators round up the size to some convenient multiple (say 64 bytes) for alignment and convenience reasons. The allocator knows that a block is 64 bytes long - but it doesn't know whether that is because n was 1 ... or 16.

  • The C++ run-time library may not have access to the size of the allocated block. If, for example, new and delete are using malloc and free under the hood, then the C++ library has no way to know the size of a block returned by malloc. (Usually, of course, new and malloc are both part of the same library, but not always.)
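The rounding argument in the first bullet can be sketched as follows. The 64-byte granularity is the figure from the text; kBlockSize and rounded_block_size are hypothetical names, and real allocators use their own size classes.

```cpp
#include <cassert>
#include <cstddef>

// Assumed allocator granularity (the example figure from the text).
constexpr std::size_t kBlockSize = 64;

// Rounds a requested byte count up to the next multiple of kBlockSize.
constexpr std::size_t rounded_block_size(std::size_t requested) {
    return (requested + kBlockSize - 1) / kBlockSize * kBlockSize;
}
```

With 4-byte ints, a request for 1 int (4 bytes) and a request for 16 ints (64 bytes) both round to the same 64-byte block, so the block size alone cannot recover n.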

Solution 2 - C++

One fundamental reason is that there is no difference between a pointer to the first element of a dynamically allocated array of T and a pointer to any other T.

Consider a fictitious function that returns the number of elements a pointer points to.
Let's call it "size".

Sounds really nice, right?

If it weren't for the fact that all pointers are created equal:

char* p = new char[10];
size_t ps = size(p+1);  // What?

char a[10] = {0};
size_t as = size(a);     // Hmm...
size_t bs = size(a + 1); // Wut?

char i = 0;
size_t is = size(&i);  // OK?

You could argue that the first should be 9, the second 10, the third 9, and the last 1, but to accomplish this you need to add a "size tag" on every single object.
With a 64-bit size tag, a single char would occupy 128 bits of storage on a 64-bit machine (8 bytes for the tag plus 1 byte for the char, padded to 16 bytes because of alignment). This is sixteen times more than what is necessary.
(Above, the ten-character array a would require at least 168 bytes: ten tagged chars at 16 bytes each, plus 8 bytes for the array's own tag.)

This may be convenient, but it's also unacceptably expensive.
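The cost of tagging every object can be checked with sizeof. This is only a sketch: TaggedChar and TaggedCharArray10 are hypothetical types, and the exact numbers assume a typical platform where std::size_t is as wide and as strictly aligned as a pointer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical "char with a size tag": the tag forces padding after value.
struct TaggedChar {
    std::size_t count;  // number of elements reachable from this object
    char value;
};

// Hypothetical tagged array of ten tagged chars, as in the example above.
struct TaggedCharArray10 {
    std::size_t count;
    TaggedChar elements[10];
};
```

On a typical 64-bit platform sizeof(TaggedChar) is 16 and sizeof(TaggedCharArray10) is 168, compared with 1 and 10 bytes for the untagged versions.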

You could of course envision a version that is only well-defined if the argument really is a pointer to the first element of a dynamic allocation by the default operator new, but this isn't nearly as useful as one might think.
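A restricted version of that kind might look like the sketch below. It does not use the default operator new[]; instead it assumes a pair of hypothetical helpers (new_counted_ints, delete_counted_ints) that place the element count in a header in front of the array. counted_size is only meaningful for the exact pointer returned by new_counted_ints; passing p + 1 or the address of a local variable would read unrelated memory, which is the limitation described above.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Header wide enough that the ints following it stay suitably aligned.
constexpr std::size_t kHeaderBytes = sizeof(std::max_align_t);

// Allocates n zeroed ints preceded by a header holding n.
// (Sketch only: overflow of n * sizeof(int) is not checked.)
inline int* new_counted_ints(std::size_t n) {
    unsigned char* raw = static_cast<unsigned char*>(
        ::operator new(kHeaderBytes + n * sizeof(int)));
    new (raw) std::size_t(n);                        // store the count
    int* first = reinterpret_cast<int*>(raw + kHeaderBytes);
    for (std::size_t i = 0; i < n; ++i) new (first + i) int(0);
    return first;
}

// Valid ONLY for a pointer returned by new_counted_ints.
inline std::size_t counted_size(const int* first) {
    const unsigned char* raw =
        reinterpret_cast<const unsigned char*>(first) - kHeaderBytes;
    return *reinterpret_cast<const std::size_t*>(raw);
}

// Releases a block obtained from new_counted_ints.
inline void delete_counted_ints(int* first) {
    ::operator delete(reinterpret_cast<unsigned char*>(first) - kHeaderBytes);
}
```

Every allocation now pays for the header whether or not the caller ever asks for the size.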

Solution 3 - C++

You are right that some part of the system will have to know something about the size. But getting that information is probably not covered by the API of the memory management system (think malloc/free), and the exact size that you requested may not be known, because it may have been rounded up.

Solution 4 - C++

There is a curious case of overloading the operator delete that I found in the form of:

void operator delete[](void *p, size_t size);

The size parameter appears to receive the size (in bytes) of the block of memory to which p points. If this is true, it is reasonable to at least hope that it is the value that was passed to operator new[] and, therefore, would merely need to be divided by sizeof(type) to deliver the number of elements stored in the array, provided the implementation adds no bookkeeping overhead of its own to the block.
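One way to test this idea is with class-specific allocation functions. In the sketch below, Sample, g_new_bytes, g_delete_bytes and run_sized_delete_demo are hypothetical names; the class declares only the two-parameter operator delete[], so that is the form a delete[] expression calls. The byte count may include implementation-defined bookkeeping, so the sketch compares it with the size requested from operator new[] rather than with n * sizeof(Sample).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Byte counts recorded by the class-specific functions below.
static std::size_t g_new_bytes = 0;
static std::size_t g_delete_bytes = 0;

struct Sample {
    int value = 0;
    ~Sample() {}  // user-provided, so delete[] has destructors to run

    static void* operator new[](std::size_t bytes) {
        g_new_bytes = bytes;
        void* p = std::malloc(bytes);
        if (!p) throw std::bad_alloc();
        return p;
    }
    // Sized form: the delete[] expression supplies the block size.
    static void operator delete[](void* p, std::size_t bytes) noexcept {
        g_delete_bytes = bytes;
        std::free(p);
    }
};

// Allocates and releases n Samples so both counters get filled in.
inline void run_sized_delete_demo(std::size_t n) {
    Sample* a = new Sample[n];
    delete[] a;
}
```

After run_sized_delete_demo(4), g_delete_bytes equals g_new_bytes, and both are at least 4 * sizeof(Sample); any excess is overhead that would make a plain division by sizeof(type) come out wrong.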

As for the "why" part of your question, Martin's rule of "don't pay for what you don't need" seems the most logical.

Solution 5 - C++

You will often find that memory managers will only allocate space in a certain multiple, 64 bytes for example.

So, you may ask for new int[4], i.e. 16 bytes, but the memory manager will allocate 64 bytes for your request. To free this memory it doesn't need to know how much memory you asked for, only that it has allocated you one block of 64 bytes.

The next question may be: can it not store the requested size? This is an added overhead which not everybody is prepared to pay for. An Arduino Uno, for example, only has 2 KB of RAM, and in that context 4 bytes for each allocation suddenly becomes significant.

If you need that functionality then you have std::vector (or equivalent), or you have higher-level languages. C and C++ were designed to let you work with as little overhead as you choose, and this is one example.
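For comparison, a minimal sketch of the std::vector route, where the container carries its own element count. The function name element_count_after_growth is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Creates a vector of n zero-initialised ints, appends one more element,
// and returns the element count the vector reports.
inline std::size_t element_count_after_growth(std::size_t n) {
    std::vector<int> a(n);
    a.push_back(42);
    return a.size();  // the vector tracks its own size
}
```

Here you opt in to the bookkeeping by choosing the container, rather than every new[] paying for it.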

Solution 6 - C++

There's no way to know how you are going to use that array. The allocation size does not necessarily match the number of elements, so you cannot just use the allocation size (even if it were available).

This is a deep flaw in other languages, not in C++. You achieve the functionality you desire with std::vector yet still retain raw access to arrays. Retaining that raw access is critical for any code that actually has to do some work.

Many times you will perform operations on subsets of the array, and when you have extra book-keeping built into the language, you have to reallocate the sub-arrays and copy the data out in order to manipulate them with an API that expects a managed array.

Just consider the trite case of sorting the data elements. If you have managed arrays then you can't use recursion without copying data to create new sub-arrays to pass recursively.
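As an illustration of recursing on sub-arrays through raw pointers, here is a sketch of an in-place quicksort (Lomuto partition, last element as pivot). It is one possible sort chosen for brevity, not a claim about what any library does; each recursive call receives a pointer into the same storage plus a length, and nothing is copied.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Sorts data[0..n) in place.
inline void quicksort(int* data, std::size_t n) {
    if (n < 2) return;
    const int pivot = data[n - 1];
    std::size_t store = 0;  // data[0..store) holds elements < pivot
    for (std::size_t i = 0; i + 1 < n; ++i) {
        if (data[i] < pivot) {
            std::swap(data[i], data[store]);
            ++store;
        }
    }
    std::swap(data[store], data[n - 1]);         // pivot to final position
    quicksort(data, store);                      // left part, same storage
    quicksort(data + store + 1, n - store - 1);  // right part, same storage
}
```

The second recursive call passes data + store + 1, a pointer into the middle of the array, which is exactly the kind of pointer that could not carry a meaningful built-in size.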

Another example is an FFT which recursively manipulates the data starting with 2x2 "butterflies" and works its way back to the whole array.

To fix the managed array you now need "something else" to patch over this defect, and that "something else" is called iterators. (You now have managed arrays but almost never pass them to any functions, because you need iterators more than 90% of the time.)
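A short sketch of the iterator approach with the standard library: std::sort is given a pair of iterators into a std::vector and works on that sub-range only. The wrapper name sort_subrange is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sorts only the elements v[first..last) in place; the rest of v is untouched
// and no sub-array is copied.
inline void sort_subrange(std::vector<int>& v, std::size_t first, std::size_t last) {
    assert(first <= last && last <= v.size());
    std::sort(v.begin() + static_cast<std::ptrdiff_t>(first),
              v.begin() + static_cast<std::ptrdiff_t>(last));
}
```

The function being called never sees the managed container itself, only the iterator pair.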

Solution 7 - C++

The size of an array allocated with new[] is not visibly stored anywhere, so you can't access it. Also, the new[] operator doesn't return an array, just a pointer to the array's first element. If you want to know the size of a dynamic array, you must store it manually or use classes from libraries, such as std::vector.
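Storing the size manually can be as simple as keeping it next to the pointer. In this sketch, IntBuffer and make_int_buffer are hypothetical names; std::unique_ptr<int[]> is used only so that delete[] is called automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical pairing of a new[]-allocated array with its element count.
struct IntBuffer {
    std::unique_ptr<int[]> data;  // owns the array, calls delete[] on destruction
    std::size_t size;             // element count, recorded at allocation time
};

// Allocates n zero-initialised ints and remembers n alongside the pointer.
inline IntBuffer make_int_buffer(std::size_t n) {
    return IntBuffer{std::unique_ptr<int[]>(new int[n]()), n};
}
```

After IntBuffer b = make_int_buffer(7), b.size is 7 and b.data[i] gives raw element access.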

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | jarauh | View Question on Stackoverflow
Solution 1 - C++ | Martin Bonner supports Monica | View Answer on Stackoverflow
Solution 2 - C++ | molbdnilo | View Answer on Stackoverflow
Solution 3 - C++ | Carsten S | View Answer on Stackoverflow
Solution 4 - C++ | Jean Louw | View Answer on Stackoverflow
Solution 5 - C++ | DewiW | View Answer on Stackoverflow
Solution 6 - C++ | Quazil | View Answer on Stackoverflow
Solution 7 - C++ | Gor | View Answer on Stackoverflow