Compare double to zero using epsilon

C++, Floating Point, Double

C++ Problem Overview


Today, I was looking through some C++ code (written by somebody else) and found this section:

double someValue = ...
if (someValue <  std::numeric_limits<double>::epsilon() && 
    someValue > -std::numeric_limits<double>::epsilon()) {
  someValue = 0.0;
}

I'm trying to figure out whether this even makes sense.

The documentation for epsilon() says:

> The function returns the difference between 1 and the smallest value greater than 1 that is representable [by a double].

Does this apply to 0 as well, i.e. epsilon() is the smallest value greater than 0? Or are there numbers between 0 and 0 + epsilon that can be represented by a double?

If not, then isn't the comparison equivalent to someValue == 0.0?

C++ Solutions


Solution 1 - C++

Assuming a 64-bit IEEE double, there is a 52-bit mantissa and an 11-bit exponent. Let's break it down into bits:

1.0000 00000000 00000000 00000000 00000000 00000000 00000000 × 2^0 = 1

The smallest representable number greater than 1:

1.0000 00000000 00000000 00000000 00000000 00000000 00000001 × 2^0 = 1 + 2^-52

Therefore:

epsilon = (1 + 2^-52) - 1 = 2^-52

Are there any numbers between 0 and epsilon? Plenty... E.g. the minimal positive representable (normal) number is:

1.0000 00000000 00000000 00000000 00000000 00000000 00000000 × 2^-1022 = 2^-1022

In fact there are (1022 - 52 + 1)×2^52 = 4372995238176751616 numbers between 0 and epsilon, which is 47% of all the positive representable numbers...
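The count can be cross-checked from the bit patterns. This is a minimal sketch, assuming a 64-bit IEEE 754 double whose bytes are laid out like a 64-bit integer; `bitsOf` and `doublesBelow` are illustrative names, not part of the original answer.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Return the raw bit pattern of a double (assumes a 64-bit IEEE 754 double).
std::uint64_t bitsOf(double value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "expects a 64-bit double");
    std::uint64_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// For non-negative doubles the bit patterns are ordered like the values,
// so the pattern of x equals the number of doubles in the range [0, x).
std::uint64_t doublesBelow(double nonNegative) {
    return bitsOf(nonNegative);
}
```

Here `doublesBelow(std::numeric_limits<double>::epsilon())` evaluates to 971 × 2^52 = 4372995238176751616, the figure quoted above (that count includes 0 itself).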

Solution 2 - C++

The test certainly is not the same as someValue == 0. The whole idea of floating-point numbers is that they store an exponent and a significand. They therefore represent a value with a certain number of binary significant figures of precision (53 in the case of an IEEE double). The representable values are much more densely packed near 0 than they are near 1.

To use a more familiar decimal system, suppose you store a decimal value "to 4 significant figures" with exponent. Then the next representable value greater than 1 is 1.001 * 10^0, and epsilon is 1.000 * 10^-3. But 1.000 * 10^-4 is also representable, assuming that the exponent can store -4. You can take my word for it that an IEEE double can store exponents less than the exponent of epsilon.

You can't tell from this code alone whether it makes sense or not to use epsilon specifically as the bound, you need to look at the context. It may be that epsilon is a reasonable estimate of the error in the calculation that produced someValue, and it may be that it isn't.
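To illustrate what "look at the context" might mean in practice, here is a hedged sketch of a tolerance that scales epsilon by the magnitude of the inputs. The helper `isNegligible` and its `scale` and `ulps` parameters are hypothetical; choosing them well depends on the calculation that produced the value.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: treat `value` as zero when it is no larger than a few
// rounding errors at the magnitude `scale` of the inputs that produced it.
// Both `scale` and `ulps` are assumptions the caller must pick from context.
bool isNegligible(double value, double scale, double ulps = 4.0) {
    const double tolerance =
        ulps * std::numeric_limits<double>::epsilon() * std::fabs(scale);
    return std::fabs(value) <= tolerance;
}
```

For example, `(0.1 + 0.2) - 0.3` is not exactly zero in double arithmetic, but `isNegligible` accepts it with `scale` set to 1.0, while a value such as 1e-10 is rejected at that scale.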

Solution 3 - C++

There are numbers between 0 and epsilon, because epsilon is the difference between 1 and the next representable number above 1, not the difference between 0 and the next representable number above 0 (if it were, that code would do very little):

#include <limits>

int main ()
{
  struct Doubles
  {
      double one;
      double epsilon;
      double half_epsilon;
  } values;

  values.one = 1.0;
  values.epsilon = std::numeric_limits<double>::epsilon();
  values.half_epsilon = values.epsilon / 2.0;
}

Using a debugger, stop the program at the end of main and look at the results: you'll see that epsilon / 2 is distinct from epsilon, zero and one.

So the code in the question takes values strictly between -epsilon and +epsilon and sets them to zero.
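The same check can be done without a debugger. The sketch below wraps the logic of the snippet from the question in a function; the name `snapToZero` is made up for illustration.

```cpp
#include <limits>

// Same logic as the snippet in the question, wrapped in a function:
// values strictly between -epsilon and +epsilon become 0.0.
double snapToZero(double value) {
    const double eps = std::numeric_limits<double>::epsilon();
    if (value < eps && value > -eps) {
        return 0.0;
    }
    return value;
}
```

Calling `snapToZero(eps / 2.0)` returns 0.0 even though `eps / 2.0` is a perfectly valid non-zero double, while `snapToZero(eps)` returns epsilon unchanged.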

Solution 4 - C++

An approximation of epsilon (the smallest possible difference) around a number (1.0, 0.0, ...) can be printed with the following program. It prints the following output:
epsilon for 0.0 is 4.940656e-324
epsilon for 1.0 is 2.220446e-16
A little thought makes it clear that the epsilon gets smaller as the number we examine gets smaller, because the exponent can adjust to the size of that number.

#include <stdio.h>
#include <assert.h>
double getEps (double m) {
  double approx=1.0;
  double lastApprox=0.0;
  while (m+approx!=m) {
    lastApprox=approx;
    approx/=2.0;
  }
  assert (lastApprox!=0);
  return lastApprox;
}
int main () {
  printf ("epsilon for 0.0 is %e\n", getEps (0.0));
  printf ("epsilon for 1.0 is %e\n", getEps (1.0));
  return 0;
}
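The loop above finds the gap by repeated halving. As a cross-check, the standard library can report the gap directly through std::nextafter. This is a small sketch; `gapAbove` is an illustrative name, and it assumes subnormal numbers are not flushed to zero by the platform.

```cpp
#include <cmath>
#include <limits>

// Gap between x and the next representable double above it.
double gapAbove(double x) {
    return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}
```

On an IEEE 754 platform, `gapAbove(0.0)` equals `std::numeric_limits<double>::denorm_min()` and `gapAbove(1.0)` equals `std::numeric_limits<double>::epsilon()`, the same two values that the program prints.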

Solution 5 - C++

The difference between X and the next representable value after X varies according to X.
epsilon() is only the difference between 1 and the next representable value after 1.
The difference between 0 and the next representable value after 0 is not epsilon().

Instead, you can use std::nextafter to compare a double value with 0 as follows:

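#include <cmath>   // std::nextafter
#include <limits>  // std::numeric_limits

// True when b equals a or is one of a's two nearest representable neighbours.
bool same(double a, double b)
{
  return std::nextafter(a, std::numeric_limits<double>::lowest()) <= b
    && std::nextafter(a, std::numeric_limits<double>::max()) >= b;
}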

double someValue = ...
if (same (someValue, 0.0)) {
  someValue = 0.0;
}

Solution 6 - C++

Suppose we are working with toy floating point numbers that fit in a 16 bit register. There is a sign bit, a 5 bit exponent, and a 10 bit mantissa.

The value of this floating point number is the mantissa, interpreted as a binary fraction, times two to the power of the exponent.

Around 1 the exponent equals zero. So the smallest digit of the mantissa is one part in 1024.

Near 1/2 the exponent is minus one, so the smallest part of the mantissa is half as large. With a five-bit exponent it can reach negative 16, at which point the smallest part of the mantissa is worth about one part in 64 million (2^-26). And at an exponent of negative 16, the value itself is around one part in 64 thousand (2^-16), much closer to zero than the epsilon around one we calculated above!

Now this is a toy floating point model that does not reflect all the quirks of a real floating point system, but the ability to represent values smaller than epsilon is reasonably similar to that of real floating point values.
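The toy format can be played with in a few lines of code. This sketch assumes an implied leading 1 in front of the 10 mantissa bits (as in IEEE formats) and an exponent range of -16 to 15; `toyValue` and `toyEpsilon` are names invented for the illustration.

```cpp
#include <cmath>

// Toy format from the text: 10-bit mantissa, 5-bit exponent in [-16, 15].
// Assumption for this sketch: an implied leading 1, as in IEEE formats.
double toyValue(int mantissa, int exponent) {
    const double fraction = 1.0 + mantissa / 1024.0; // mantissa in [0, 1023]
    return std::ldexp(fraction, exponent);           // fraction * 2^exponent
}

// Gap between 1.0 and the next toy value: the toy "epsilon".
double toyEpsilon() {
    return toyValue(1, 0) - toyValue(0, 0);
}
```

With these definitions, `toyEpsilon()` is 1/1024, while `toyValue(0, -16)` is a representable value far smaller than that.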

Solution 7 - C++

You can't apply this to 0, because a double has separate mantissa and exponent parts. Thanks to the exponent you can store very small numbers, much smaller than epsilon, but when you try to do something like (1.0 - "very small number") you'll get 1.0. Epsilon is an indicator not of value, but of the precision of a value, which lives in the mantissa. It shows how many correct consecutive decimal digits of a number we can store.
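A short sketch of that effect follows; `isAbsorbed` is a hypothetical helper name. A tiny value survives on its own but disappears when combined with a much larger one.

```cpp
// True when adding `tiny` to `big` leaves `big` unchanged, i.e. `tiny` is
// below the precision available at the magnitude of `big`.
bool isAbsorbed(double big, double tiny) {
    return big + tiny == big;
}
```

For instance, `isAbsorbed(1.0, 1e-20)` is true, while `isAbsorbed(0.0, 1e-20)` is false: the same 1e-20 is representable and meaningful near zero.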

Solution 8 - C++

I think that depends on the precision of your computer. Take a look at this table: you can see that if your epsilon is represented by a double, but your precision is higher, the comparison is not equivalent to

someValue == 0.0

Good question anyway!

Solution 9 - C++

So let's say the system cannot distinguish 1.000000000000000000000 and 1.000000000000000000001, that is, 1.0 and 1.0 + 1e-20. Do you think there are still some values that can be represented between -1e-20 and +1e-20?

Solution 10 - C++

With IEEE floating-point, between the smallest non-zero positive value and the smallest non-zero negative value, there exist two values: positive zero and negative zero. Testing whether a value is between the smallest non-zero values is equivalent to testing for equality with zero; the assignment, however, may have an effect, since it would change a negative zero to a positive zero.

It would be conceivable that a floating-point format might have three values between the smallest finite positive and negative values: positive infinitesimal, unsigned zero, and negative infinitesimal. I am not familiar with any floating-point formats that in fact work that way, but such a behavior would be perfectly reasonable and arguably better than that of IEEE (perhaps not enough better to be worth adding extra hardware to support it, but mathematically 1/(1/INF), 1/(-1/INF), and 1/(1-1) should represent three distinct cases illustrating three different zeroes). I don't know whether any C standard would mandate that signed infinitesimals, if they exist, would have to compare equal to zero. If they do not, code like the above could usefully ensure that e.g. dividing a number repeatedly by two would eventually yield zero rather than being stuck on "infinitesimal".
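The signed-zero effect mentioned at the start of this answer can be observed with std::signbit. This is a minimal sketch, assuming IEEE 754 semantics and no compiler flags that ignore signed zeros (such as fast-math options); `normalizeZero` is an illustrative name.

```cpp
#include <cmath>

// Mirrors the question's pattern with an exact-zero test: the comparison
// treats -0.0 and +0.0 as equal, but the assignment stores +0.0.
double normalizeZero(double value) {
    if (value == 0.0) {
        value = 0.0;
    }
    return value;
}
```

Here `-0.0 == 0.0` holds, yet `std::signbit(-0.0)` is true and `std::signbit(normalizeZero(-0.0))` is false, so the assignment is not a no-op.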

Solution 11 - C++

Also, a good reason for having such a function is to remove "denormals" (those very small numbers that can no longer use the implied leading "1" and have a special FP representation). Why would you want to do this? Because some machines (in particular, some older Pentium 4s) get really, really slow when processing denormals. Others just get somewhat slower. If your application doesn't really need these very small numbers, flushing them to zero is a good solution. Good places to consider this are the last steps of any IIR filters or decay functions.

See also: https://stackoverflow.com/questions/9314534/why-does-changing-0-1f-to-0-slow-down-performance-by-10x

and http://en.wikipedia.org/wiki/Denormal_number
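If removing denormals is the actual goal, a narrower test than the epsilon comparison is possible. The following is only a sketch of that idea using std::fpclassify; `flushDenormal` is a hypothetical name, and unlike the code in the question it leaves small normal numbers untouched.

```cpp
#include <cmath>
#include <limits>

// Replace subnormal (denormal) values with zero; leave everything else alone.
double flushDenormal(double value) {
    return std::fpclassify(value) == FP_SUBNORMAL ? 0.0 : value;
}
```

With this version, `std::numeric_limits<double>::denorm_min()` is flushed to zero, while `std::numeric_limits<double>::min()` (the smallest normal value) and larger magnitudes pass through unchanged.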

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | Sebastian Krysmanski | View Question on Stackoverflow |
| Solution 1 - C++ | Yakov Galka | View Answer on Stackoverflow |
| Solution 2 - C++ | Steve Jessop | View Answer on Stackoverflow |
| Solution 3 - C++ | Skizz | View Answer on Stackoverflow |
| Solution 4 - C++ | pbhd | View Answer on Stackoverflow |
| Solution 5 - C++ | Daniel Laügt | View Answer on Stackoverflow |
| Solution 6 - C++ | Yakk - Adam Nevraumont | View Answer on Stackoverflow |
| Solution 7 - C++ | Arsenii Fomin | View Answer on Stackoverflow |
| Solution 8 - C++ | Luca Davanzo | View Answer on Stackoverflow |
| Solution 9 - C++ | cababunga | View Answer on Stackoverflow |
| Solution 10 - C++ | supercat | View Answer on Stackoverflow |
| Solution 11 - C++ | Dithermaster | View Answer on Stackoverflow |