0.1 float is greater than 0.1 double. I expected it to be false

Tags: C++, C, Floating Point, Double, Rounding

C++ Problem Overview


Let:

double d = 0.1;
float f = 0.1;

should the expression

(f > d)

return true or false?

Empirically, the answer is true. However, I expected it to be false.

0.1 cannot be represented exactly in binary, and double has 15 to 16 decimal digits of precision while float has only about 7. So I assumed that both values are less than 0.1, with the double being closer to 0.1.

I need an exact explanation of why the result is true.

C++ Solutions


Solution 1 - C++

I'd say the answer depends on the rounding mode when converting the double to float. float has 24 binary bits of precision, and double has 53. In binary, 0.1 is:

0.1₁₀ = 0.0001100110011001100110011001100110011001100110011…₂
             ^        ^         ^   ^ 
             1       10        20  24

So if we round up at the 24th digit, we'll get

0.1₁₀ ~ 0.000110011001100110011001101

which is greater than the exact value and the more precise approximation at 53 digits.
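The rounding step described above can be reproduced with plain integer arithmetic. The following is a minimal sketch, not a definitive implementation: `round_tenth` and `float_tenth_exceeds_double_tenth` are hypothetical helper names, and the sketch assumes round-to-nearest. Because the leading 1 bit of 0.1 sits at 2^-4, 24 significant bits correspond to 27 fractional bits and 53 significant bits to 56.

```cpp
#include <cassert>
#include <cstdint>

// Round 1/10 to `frac_bits` binary fractional digits (round to nearest).
// The result is the integer numerator of a fraction over 2^frac_bits.
std::uint64_t round_tenth(unsigned frac_bits) {
    assert(frac_bits >= 1 && frac_bits < 64);
    const std::uint64_t scaled = std::uint64_t{1} << frac_bits;  // 2^frac_bits
    const std::uint64_t quotient = scaled / 10;    // digits kept by truncation
    const std::uint64_t remainder = scaled % 10;   // part that truncation drops
    // Round up when the dropped part exceeds half a unit. An exact tie
    // (remainder == 5) cannot happen because a power of two is never 5 mod 10.
    return remainder > 5 ? quotient + 1 : quotient;
}

// True when 1/10 rounded to 24 significant bits is larger than
// 1/10 rounded to 53 significant bits.
bool float_tenth_exceeds_double_tenth() {
    const std::uint64_t f = round_tenth(27);  // 24 significant bits
    const std::uint64_t d = round_tenth(56);  // 53 significant bits
    return (f << (56 - 27)) > d;              // compare both over 2^56
}
```

Both roundings go up here (the numerators are 0xCCCCCD and 0x1999999999999A), but the 24-bit one overshoots by far more, which is why the float compares greater.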

Solution 2 - C++

The number 0.1 will be rounded to the closest floating-point representation with the given precision. This approximation might be either greater than or less than 0.1, so without looking at the actual values, you can't predict whether the single precision or double precision approximation is greater.

Here's what the double precision value gets rounded to (using a Python interpreter):

>>> "%.55f" % 0.1
'0.1000000000000000055511151231257827021181583404541015625'

And here's the single precision value:

>>> import numpy
>>> "%.55f" % numpy.float32("0.1")
'0.1000000014901161193847656250000000000000000000000000000'

So you can see that the single precision approximation is greater.
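The same check can be sketched in C++. The helper name `to_decimal` is hypothetical, and whether all 55 printed digits are exact depends on the C library in use (an assumption here; glibc is known to print the exact expansion).

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Format `value` with `digits` decimal places using the C library.
// A float argument is converted to double first, and that conversion is
// exact, so the digits shown describe the float's stored value.
std::string to_decimal(double value, int digits) {
    assert(digits >= 0 && digits <= 100);
    char buffer[128];
    // snprintf never writes past the buffer; very large values would simply
    // be truncated, which does not matter for numbers near 0.1.
    std::snprintf(buffer, sizeof buffer, "%.*f", digits, value);
    return buffer;
}
```

Calling `to_decimal(0.1, 55)` and `to_decimal(0.1f, 55)` should reproduce the two strings shown above on such a library.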

Solution 3 - C++

If you convert .1 to binary you get:

0.000110011001100110011001100110011001100110011001100...
repeating forever

Mapping to data types, you get:

float(.1)  = %.000110011001100110011001101
                                         ^--- note rounding
double(.1) = %.00011001100110011001100110011001100110011001100110011010

Convert that to base 10:

float(.1)  = .100000001490116119384765625
double(.1) = .1000000000000000055511151231257827021181583404541015625

This was taken from an article by Bruce Dawson: "Doubles are not floats, so don't compare them".
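To inspect how the two types actually store 0.1, the raw bit patterns can be copied out with `std::memcpy`. This is a sketch that assumes IEEE 754 binary32 and binary64 formats (checked at compile time); `bits_of` and `low_bits` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 &&
              std::numeric_limits<double>::is_iec559,
              "this sketch assumes IEEE 754 float and double");
static_assert(sizeof(float) == 4 && sizeof(double) == 8,
              "this sketch assumes 32-bit float and 64-bit double");

// Copy the object representation of a float into an integer.
std::uint32_t bits_of(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Copy the object representation of a double into an integer.
std::uint64_t bits_of(double value) {
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Keep only the low `width` bits, e.g. the stored fraction field
// (23 bits for float, 52 bits for double; the leading 1 is implicit).
std::uint64_t low_bits(std::uint64_t bits, unsigned width) {
    assert(width > 0 && width < 64);
    return bits & ((std::uint64_t{1} << width) - 1);
}
```

On such a platform, `low_bits(bits_of(0.1f), 23)` is 0x4CCCCD and `low_bits(bits_of(0.1), 52)` is 0x999999999999A. Both end in a digit that was rounded up from the repeating 1100 pattern, but the float is rounded at a much coarser position.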

Solution 4 - C++

I think Eric Lippert's comment on the question is actually the clearest explanation, so I'll repost it as an answer:

> Suppose you are computing 1/9 in 3-digit decimal and 6-digit decimal. 0.111 < 0.111111, right?
>
> Now suppose you are computing 6/9. 0.667 > 0.666667, right?
>
> You can't have it that 6/9 in three digit decimal is 0.666 because that is not the closest 3-digit decimal to 6/9!
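The decimal analogy can be checked mechanically. The sketch below uses exact integer arithmetic so that no binary floating point is involved; all function names are hypothetical, and it assumes a non-negative numerator and a positive denominator.

```cpp
#include <cassert>
#include <cstdint>

// 10^digits as an integer; digits up to 18 fit in 64 bits.
std::int64_t power_of_ten(int digits) {
    assert(digits >= 0 && digits <= 18);
    std::int64_t result = 1;
    for (int i = 0; i < digits; ++i) result *= 10;
    return result;
}

// Round num/den to `digits` decimal places (round half up).
// The result is the integer numerator of a fraction over 10^digits.
std::int64_t round_to_places(std::int64_t num, std::int64_t den, int digits) {
    assert(num >= 0 && den > 0);
    const std::int64_t scaled = num * power_of_ten(digits);
    return (2 * scaled + den) / (2 * den);
}

// Compare num/den rounded to 3 places against the same value rounded to
// 6 places. Returns -1, 0 or +1 when the 3-place value is less than,
// equal to, or greater than the 6-place value.
int compare_3_vs_6_places(std::int64_t num, std::int64_t den) {
    const std::int64_t coarse = round_to_places(num, den, 3) * 1000;  // over 10^6
    const std::int64_t fine = round_to_places(num, den, 6);
    return (coarse > fine) - (coarse < fine);
}
```

`compare_3_vs_6_places(1, 9)` gives -1 (0.111 < 0.111111) while `compare_3_vs_6_places(6, 9)` gives +1 (0.667 > 0.666667): the direction depends on which way the shorter value happened to round.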

Solution 5 - C++

Since it can't be exactly represented, comparing 1/10 in base 2 is like comparing 1/7 in base 10.

1/7 = 0.142857142857... but comparing at different base 10 precisions (3 versus 6 decimal places) we have 0.143 > 0.142857.

Solution 6 - C++

Just to add to the other answers talking about IEEE-754 and x86: the issue is even more complicated than they make it seem. There is not "one" representation of 0.1 in IEEE-754 - there are two. Either rounding the last digit down or up would be valid. This difference can and does actually occur, because the x87 floating-point unit on x86 does not use 64 bits for its internal floating-point computations; it actually uses 80 bits! This is called double extended-precision.

So, even among just x86 compilers, it sometimes happens that the same number is represented in two different ways, because some compute its binary representation with 64 bits, while others use 80.


In fact, it can happen even with the same compiler, even on the same machine!

#include <iostream>
#include <cmath>

void foo(double x, double y)
{
  if (std::cos(x) != std::cos(y)) {
    std::cout << "Huh?!?\n";  //← you might end up here when x == y!!
  }
}

int main()
{
  foo(1.0, 1.0);
  return 0;
}

See "Why is cos(x) != cos(y) even though x == y?" for more info.
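Whether intermediate results may carry extra precision is reported by the standard `FLT_EVAL_METHOD` macro from `<cfloat>`. The sketch below describes its value and shows one commonly cited workaround, storing results through a `volatile double` so they are rounded to double before comparison. The helper names are hypothetical, and the workaround is an assumption about typical compilers rather than a guarantee from the language standard.

```cpp
#include <cfloat>
#include <cmath>
#include <string>

// Describe a FLT_EVAL_METHOD value using the meanings given by the C standard.
std::string describe_eval_method(int method) {
    switch (method) {
        case 0:  return "operations use the range and precision of their type";
        case 1:  return "float and double operations are evaluated as double";
        case 2:  return "all operations are evaluated as long double";
        default: return "indeterminable or implementation-defined";
    }
}

// Writing to a volatile double makes the compiler store the value in
// double format, which drops any extra precision held in a register.
double stored_as_double(double value) {
    volatile double stored = value;
    return stored;
}

// Compare two cosines only after both have been rounded to double.
bool cos_equal(double x, double y) {
    return stored_as_double(std::cos(x)) == stored_as_double(std::cos(y));
}
```

`describe_eval_method(FLT_EVAL_METHOD)` reports the mode of the current build. A value of 2 is the setting under which the surprising branch in `foo` above becomes possible.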

Solution 7 - C++

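The conversion rank of double is greater than that of float, so in the comparison f > d the usual arithmetic conversions convert f to double before comparing. That conversion is exact, so the comparison is really between the float approximation of 0.1 and the double approximation of 0.1. Unsuffixed floating constants are double; the f suffix makes the constant a float. On typical IEEE 754 implementations 0.1f holds the same value that float f = 0.1 ends up with, so the suffix does not change the outcome. Note that (f > d) has type int in C, so it must be printed with %d. Passing it to %f is undefined behavior and may print 0.000000, which is easily mistaken for false.

#include <stdio.h>

int main(void)
{
     double d = 0.1;
     float f = 0.1f;   /* the f suffix makes the constant a float */

     /* (f > d) is an int, so print it with %d;
        on typical IEEE 754 systems this prints 1 */
     printf("%d\n", (f > d));

     return 0;
}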
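The conversion rules involved here can be verified at compile time in C++. This is a minimal sketch with hypothetical helper names; the run-time checks assume IEEE 754 arithmetic with round-to-nearest.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// Literal types: an unsuffixed constant is double, the f suffix gives float.
static_assert(std::is_same<decltype(0.1), double>::value, "0.1 is a double");
static_assert(std::is_same<decltype(0.1f), float>::value, "0.1f is a float");

// Mixed arithmetic converts the float operand to double.
static_assert(std::is_same<decltype(0.1f + 0.1), double>::value,
              "float + double yields double");

// In C++ a comparison yields bool (in C it yields int).
static_assert(std::is_same<decltype(0.1f > 0.1), bool>::value,
              "comparison yields bool");

// Widening a float to double is exact, so converting back gives the
// original float. NaN is excluded because it never compares equal.
bool widening_round_trips(float value) {
    assert(!std::isnan(value));
    return static_cast<float>(static_cast<double>(value)) == value;
}

// True if initializing a float from 0.1 and from 0.1f gives different values.
bool suffix_changes_value() {
    const float from_double = 0.1;   // double constant converted to float
    const float from_float = 0.1f;   // float constant
    return from_double != from_float;
}
```

Under those assumptions `suffix_changes_value()` returns false, so the comparison `f > d` gives the same answer with or without the suffix.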

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type     | Original Author             | Original Content on Stack Overflow
Question         | Hesham Eraqi                | View Question on Stack Overflow
Solution 1 - C++ | kennytm                     | View Answer on Stack Overflow
Solution 2 - C++ | Sven Marnach                | View Answer on Stack Overflow
Solution 3 - C++ | static_cast                 | View Answer on Stack Overflow
Solution 4 - C++ | Kip                         | View Answer on Stack Overflow
Solution 5 - C++ | xan                         | View Answer on Stack Overflow
Solution 6 - C++ | BlueRaja - Danny Pflughoeft | View Answer on Stack Overflow
Solution 7 - C++ | Odimegwu David              | View Answer on Stack Overflow