Why does C++ promote an int to a float when a float cannot represent all int values?

C++TypesIntFloating Point-ConversionPromotions

C++ Problem Overview


Say I have the following:

int i = 23;
float f = 3.14;
if (i == f) // do something

i will be promoted to a float and the two float numbers will be compared, but can a float represent all int values? Why not promote both the int and the float to a double?

C++ Solutions


Solution 1 - C++

When int is promoted to unsigned in the integral promotions, negative values are also lost (which leads to such fun as 0u < -1 being true).

Like most mechanisms in C (that are inherited in C++), the usual arithmetic conversions should be understood in terms of hardware operations. The makers of C were very familiar with the assembly language of the machines with which they worked, and they wrote C to make immediate sense to themselves and people like themselves when writing things that would until then have been written in assembly (such as the UNIX kernel).

Now, processors, as a rule, do not have mixed-type instructions (add float to double, compare int to float, etc.) because it would be a huge waste of real estate on the wafer -- you'd have to implement as many times more opcodes as you want to support different types. That you only have instructions for "add int to int," "compare float to float", "multiply unsigned with unsigned" etc. makes the usual arithmetic conversions necessary in the first place -- they are a mapping of two types to the instruction family that makes most sense to use with them.

From the point of view of someone who's used to writing low-level machine code, if you have mixed types, the assembler instructions you're most likely to consider in the general case are those that require the least conversions. This is particularly the case with floating points, where conversions are runtime-expensive, and particularly back in the early 1970s, when C was developed, computers were slow, and when floating point calculations were done in software. This shows in the usual arithmetic conversions -- only one operand is ever converted (with the single exception of long/unsigned int, where the long may be converted to unsigned long, which does not require anything to be done on most machines. Perhaps not on any where the exception applies).

So, the usual arithmetic conversions are written to do what an assembly coder would do most of the time: you have two types that don't fit, convert one to the other so that it does. This is what you'd do in assembler code unless you had a specific reason to do otherwise, and to people who are used to writing assembler code and do have a specific reason to force a different conversion, explicitly requesting that conversion is natural. After all, you can simply write

if((double) i < (double) f)

It is interesting to note in this context, by the way, that unsigned is higher in the hierarchy than int, so that comparing int with unsigned will end in an unsigned comparison (hence the 0u < -1 bit from the beginning). I suspect this to be an indicator that people in olden times considered unsigned less as a restriction on int than as an extension of its value range: We don't need the sign right now, so let's use the extra bit for a larger value range. You'd use it if you had reason to expect that an int would overflow -- a much bigger worry in a world of 16-bit ints.

Solution 2 - C++

Even double may not be able to represent all int values, depending on how much bits does int contain.

Why not promote both the int and the float to a double?

Probably because it's more costly to convert both types to double than use one of the operands, which is already a float, as float. It would also introduce special rules for comparison operators incompatible with rules for arithmetic operators.

There's also no guarantee how floating point types will be represented, so it would be a blind shot to assume that converting int to double (or even long double) for comparison will solve anything.

Solution 3 - C++

The type promotion rules are designed to be simple and to work in a predictable manner. The types in C/C++ are naturally "sorted" by the range of values they can represent. See this for details. Although floating point types cannot represent all integers represented by integral types because they can't represent the same number of significant digits, they might be able to represent a wider range.

To have predictable behavior, when requiring type promotions, the numeric types are always converted to the type with the larger range to avoid overflow in the smaller one. Imagine this:

int i = 23464364; // more digits than float can represent!
float f = 123.4212E36f; // larger range than int can represent!
if (i == f) { /* do something */ }

If the conversion was done towards the integral type, the float f would certainly overflow when converted to int, leading to undefined behavior. On the other hand, converting i to f only causes a loss of precision which is irrelevant since f has the same precision so it's still possible that the comparison succeeds. It's up to the programmer at that point to interpret the result of the comparison according to the application requirements.

Finally, besides the fact that double precision floating point numbers suffer from the same problem representing integers (limited number of significant digits), using promotion on both types would lead to having a higher precision representation for i, while f is doomed to have the original precision, so the comparison will not succeed if i has a more significant digits than f to begin with. Now that is also undefined behavior: the comparison might succeed for some couples (i,f) but not for others.

Solution 4 - C++

> can a float represent all int values?

For a typical modern system where both int and float are stored in 32 bits, no. Something's gotta give. 32 bits' worth of integers doesn't map 1-to-1 onto a same-sized set that includes fractions.

> The i will be promoted to a float and the two float numbers will be compared…

Not necessarily. You don't really know what precision will apply. C++14 §5/12:

> The values of the floating operands and the results of floating expressions may be represented in greater precision and range than that required by the type; the types are not changed thereby.

Although i after promotion has nominal type float, the value may be represented using double hardware. C++ doesn't guarantee floating-point precision loss or overflow. (This is not new in C++14; it's inherited from C since olden days.)

> Why not promote both the int and the float to a double?

If you want optimal precision everywhere, use double instead and you'll never see a float. Or long double, but that might run slower. The rules are designed to be relatively sensible for the majority of use-cases of limited-precision types, considering that one machine may offer several alternative precisions.

Most of the time, fast and loose is good enough, so the machine is free to do whatever is easiest. That might mean a rounded, single-precision comparison, or double precision and no rounding.

But, such rules are ultimately compromises, and sometimes they fail. To precisely specify arithmetic in C++ (or C), it helps to make conversions and promotions explicit. Many style guides for extra-reliable software prohibit using implicit conversions altogether, and most compilers offer warnings to help you expunge them.

To learn about how these compromises came about, you can peruse the C rationale document. (The latest edition covers up to C99.) It is not just senseless baggage from the days of the PDP-11 or K&R.

Solution 5 - C++

It is fascinating that a number of answers here argue from the origin of the C language, explicitly naming K&R and historical baggage as the reason that an int is converted to a float when combined with a float.

This is pointing the blame to the wrong parties. In K&R C, there was no such thing as a float calculation. All floating point operations were done in double precision. For that reason, an integer (or anything else) was never implicitly converted to a float, but only to a double. A float also could not be the type of a function argument: you had to pass a pointer to float if you really, really, really wanted to avoid conversion into a double. For that reason, the functions

int x(float a)
{ ... }

and

int y(a)
float a;
{ ... }

have different calling conventions. The first gets a float argument, the second (by now no longer permissable as syntax) gets a double argument.

Single-precision floating point arithmetic and function arguments were only introduced with ANSI C. Kernighan/Ritchie is innocent.

Now with the newly available single float expressions (single float previously was only a storage format), there also had to be new type conversions. Whatever the ANSI C team picked here (and I would be at a loss for a better choice) is not the fault of K&R.

Solution 6 - C++

Q1: Can a float represent all int values?

IEE754 can represent all integers exactly as floats, up to about 223, as mentioned in this answer.

Q2: Why not promote both the int and the float to a double?

>The rules in the Standard for these conversions are slight modifications of those in K&R: the modifications accommodate the added types and the value preserving rules. Explicit license was added to perform calculations in a “wider” type than absolutely necessary, since this can sometimes produce smaller and faster code, not to mention the correct answer more often. Calculations can also be performed in a “narrower” type by the as if rule so long as the same end result is obtained. Explicit casting can always be used to obtain a value in a desired type.

Source

Performing calculations in a wider type means that given float f1; and float f2;, f1 + f2 might be calculated in double precision. And it means that given int i; and float f;, i == f might be calculated in double precision. But it isn't required to calculate i == f in double precision, as hvd stated in the comment.

Also C standard says so. These are known as the usual arithmetic conversions . The following description is taken straight from the ANSI C standard.

> ...if either operand has type float , the other operand is converted to type float .

Source and you can see it in the ref too.

A relevant link is this answer. A more analytic source is here.

Here is another way to explain this: The usual arithmetic conversions are implicitly performed to cast their values in a common type. Compiler first performs integer promotion, if operands still have different types then they are converted to the type that appears highest in the following hierarchy:

enter image description here

Source.

Solution 7 - C++

When a programming language is created some decisions are made intuitively.

For instance why not convert int+float to int+int instead of float+float or double+double? Why call int->float a promotion if it holds the same about of bits? Why not call float->int a promotion?

If you rely on implicit type conversions you should know how they work, otherwise just convert manually.

Some language could have been designed without any automatic type conversions at all. And not every decision during a design phase could have been made logically with a good reason.

JavaScript with it's duck typing has even more obscure decisions under the hood. Designing an absolutely logical language is impossible, I think it goes to Godel incompleteness theorem. You have to balance logic, intuition, practice and ideals.

Solution 8 - C++

The question is why: Because it is fast, easy to explain, easy to compile, and these were all very important reasons at the time when the C language was developed.

You could have had a different rule: That for every comparison of arithmetic values, the result is that of comparing the actual numerical values. That would be somewhere between trivial if one of the expressions compared is a constant, one additional instruction when comparing signed and unsigned int, and quite difficult if you compare long long and double and want correct results when the long long cannot be represented as double. (0u < -1 would be false, because it would compare the numerical values 0 and -1 without considering their types).

In Swift, the problem is solved easily by disallowing operations between different types.

Solution 9 - C++

The rules are written for 16 bit ints (smallest required size). Your compiler with 32 bit ints surely converts both sides to double. There are no float registers in modern hardware anyway so it has to convert to double. Now if you have 64 bit ints I'm not too sure what it does. long double would be appropriate (normally 80 bits but it's not even standard).

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
Questionuser4344762View Question on Stackoverflow
Solution 1 - C++WintermuteView Answer on Stackoverflow
Solution 2 - C++mipView Answer on Stackoverflow
Solution 3 - C++Giovanni BottaView Answer on Stackoverflow
Solution 4 - C++PotatoswatterView Answer on Stackoverflow
Solution 5 - C++user4467309View Answer on Stackoverflow
Solution 6 - C++gsamarasView Answer on Stackoverflow
Solution 7 - C++exebookView Answer on Stackoverflow
Solution 8 - C++gnasher729View Answer on Stackoverflow
Solution 9 - C++JoshuaView Answer on Stackoverflow