std::tuple sizeof, is it a missed optimization?

C++TuplesLanguage LawyerPadding

C++ Problem Overview


I've checked all major compilers, and sizeof(std::tuple<int, char, int, char>) is 16 for all of them. Presumably they just put elements in order into the tuple, so some space is wasted because of alignment.

If tuple stored elements internally like: int, int, char, char, then its sizeof could be 12.

Is it possible for an implementation to do this, or is it forbidden by some rule in the standard?

C++ Solutions


Solution 1 - C++

> std::tuple sizeof, is it a missed optimization?

Yep.

> Is it possible for an implementation to do this[?]

Yep.

> [Is] it forbidden by some rule in the standard?

Nope!

Reading through [tuple], there is no constraint placed upon the implementation to store the members in template-argument order.

In fact, every passage I can find seems to go to lengths to avoid making any reference to member-declaration order at all: get<N>() is used in the description of operational semantics. Other wording is stated in terms of "elements" rather than "members", which seems like quite a deliberate abstraction.

In fact, some implementations do apparently store the members in reverse order, at least, probably simply due to the way they use inheritance recursively to unpack the template arguments (and because, as above, they're permitted to).

Speaking specifically about your hypothetical optimisation, though, I'm not aware of any implementation that doesn't store elements in [some trivial function of] the user-given order; I'm guessing that it would be "hard" to come up with such an order and to provide the machinery for std::get, at least as compared to the amount of gain you'd get from doing so. If you are really concerned about padding, you may of course choose your element order carefully to avoid it (on some given platform), much as you would with a class (without delving into the world of "packed" attributes). (A "packed" tuple could be an interesting proposal…)

Solution 2 - C++

Yes, it's possible and has been (mostly) done by R. Martinho Fernandes. He used to have a blog called Flaming Danger Zone, which is now down for some reason, but its sources are still available on github.

Here are the all four parts of the Size Matters series on this exact topic: 1, 2, 3, 4.

You might wish to view them raw since github doesn't understand C++ highlighting markup used and renders code snippets as unreadable oneliners.

He essentially computes a permutation for tuple indices via C++11 template meta-program, that sorts elements by alignment in non-ascending order, stores the elements according to it, and then applies it to the index on every access.

Solution 3 - C++

They could. One possible reason they don’t: some architectures, including x86, have an indexing mode that can address an address base + size × index in a single instruction—but only when size is a power of 2. Or it might be slightly faster to do a load or store aligned to a 16-byte boundary. This could make code that addresses arrays of std::tuple slightly faster and more compact if they add four padding bytes.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestiongezaView Question on Stackoverflow
Solution 1 - C++Lightness Races in OrbitView Answer on Stackoverflow
Solution 2 - C++yuri kilochekView Answer on Stackoverflow
Solution 3 - C++DavislorView Answer on Stackoverflow