C++20 bit_cast vs reinterpret_cast

Tags: C++, Language Lawyer, Type Alias, C++20

C++ Problem Overview


According to the last meeting of the ISO C++ Committee, bit_cast will be introduced in the C++20 standard.

I know that reinterpret_cast is not suitable for this job because of the type aliasing rules. My question is: why did the committee choose not to extend reinterpret_cast to treat an object as its bit-sequence representation, and instead provide this functionality as a new construct?

C++ Solutions


Solution 1 - C++

Well, there is one obvious reason: because it wouldn't do everything that bit_cast does. Even in the C++20 world where we can allocate memory at compile time, reinterpret_cast is forbidden in constexpr functions. One of the explicit goals of bit_cast is to be able to do these sorts of things at compile-time:

> Furthermore, it is currently impossible to implement a constexpr bit-cast function, as memcpy itself isn’t constexpr. Marking the proposed function as constexpr doesn’t require or prevent memcpy from becoming constexpr, but requires compiler support. This leaves implementations free to use their own internal solution (e.g. LLVM has a bitcast opcode).

Now, you could say that you could just extend this specific usage of reinterpret_cast to constexpr contexts. But that makes the rules complicated. Instead of simply knowing that reinterpret_cast can't be used in constexpr code, period, you would have to remember which specific forms of reinterpret_cast can and can't be used.

Also, there are practical concerns. Even if you wanted to go the reinterpret_cast route, std::bit_cast is a library function. And it's always easier to get a library feature through the committee than a language feature, even if it would receive some compiler support.

Then there's the more subjective stuff. reinterpret_cast is generally considered an inherently dangerous operation, indicative of "cheating" the type system in some way. By contrast, bit_cast is not. It is generating a new object as if by copying its value representation from an existing one. It's a low-level tool, but it's not a tool that messes with the type system. So it would be strange to spell a "safe" operation the same way you spell a "dangerous" one.
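The phrase "as if by copying its value representation" can be pictured with a memcpy-based sketch, in the spirit of the usual pre-C++20 workaround. The names bit_cast_sketch and round_trip are hypothetical; this is not the standard library's implementation, and unlike std::bit_cast it cannot be constexpr and needs the destination type to be trivially default-constructible.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for std::bit_cast, for illustration only.
template <class To, class From>
To bit_cast_sketch(const From& src) noexcept {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<From>::value,
                  "From must be trivially copyable");
    static_assert(std::is_trivially_copyable<To>::value,
                  "To must be trivially copyable");
    static_assert(std::is_trivially_default_constructible<To>::value,
                  "this sketch needs a trivially default-constructible To");
    To dst;                               // a genuinely new object of type To
    std::memcpy(&dst, &src, sizeof(To));  // copy the bytes of src into it
    return dst;                           // src is never accessed as a To
}

// Round trip: float -> integer bits -> float.
// Assumption: float is 32 bits wide (checked by the size static_assert).
inline float round_trip(float f) {
    std::uint32_t bits = bit_cast_sketch<std::uint32_t>(f);
    return bit_cast_sketch<float>(bits);
}
```

No pointer or reference of the wrong type is ever formed, which is why this style of conversion does not "mess with the type system".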

Indeed, if you did spell them the same way, it would raise the question of why this (hypothetical, since reinterpret_cast from float to int is ill-formed today) would be reasonably well-defined:

float f = 20.4f;
int i = reinterpret_cast<int>(f);

But this is somehow bad:

float f = 20.4f;
int &i = reinterpret_cast<int &>(f);

And sure, a language lawyer or someone familiar with the strict aliasing rule would understand why the latter is bad. But for the layperson, if it is fine to use reinterpret_cast to do a bit-conversion, it is unclear why it is wrong to use reinterpret_cast to convert pointers/references and interpret an existing object as a converted type.

Different tools should be spelled differently.

Solution 2 - C++

There is a fundamental mismatch between the high-level nature of the C and C++ language standards, as modern compilers strictly interpret them, and the notion that you can use reinterpret_cast to reinterpret a bunch of bytes as another object. Note that in most cases the so-called "strict aliasing" rule is not even needed to disqualify an attempt at reinterpreting bytes, because the code wouldn't have defined behavior in the first place: reinterpret_cast<float*>(&Int) isn't even a pointer to a float object; it's a pointer to an integer that happens to have the wrong type. You can't dereference it, as no float object was created at that place; if there were one, its lifetime wouldn't have started; and if its lifetime had started, it would be uninitialized.

Bytes that happen to represent a valid float bit pattern just can't be interpreted as such if you don't have a proper float object here.
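One way to stay within these rules is to create a real float object first and then copy the bytes into it. A minimal sketch, with hypothetical helper names store_float and load_float:

```cpp
#include <cstring>

// Hypothetical helper: writes the bytes of f into the buffer at out,
// which must have room for sizeof(float) bytes.
inline void store_float(float f, unsigned char* out) {
    std::memcpy(out, &f, sizeof f);
}

// Hypothetical helper: reads sizeof(float) bytes into a genuine float object.
inline float load_float(const unsigned char* bytes) {
    float value;                               // the float object exists here
    std::memcpy(&value, bytes, sizeof value);  // give it the stored bytes
    return value;
    // By contrast, *reinterpret_cast<const float*>(bytes) would read through
    // a pointer that does not point to any float object.
}
```

The buffer only ever holds bytes; a float comes into existence when load_float declares one and fills it.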

A valid non-null pointer isn't just a typed value holding the start address of an area that happens to be properly aligned; a valid non-null pointer points to a particular object (or one past the end of an array, or of a trivial "array" of one object).

I don't even see the scalar reinterpretation casts sanctioned by "strict aliasing" (signed/unsigned mix) as possibly valid, as no signed (resp. unsigned) integer object exists at that address (and the compiler obviously cannot use the original unsigned (resp. signed) value either).

Either way, C++ has a broken design because it is a mix of different languages, some very low-level and some very high-level.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | bogdan tudose | View Question on Stackoverflow
Solution 1 - C++ | Nicol Bolas | View Answer on Stackoverflow
Solution 2 - C++ | curiousguy | View Answer on Stackoverflow