Argument order to std::min changes compiler output for floating-point


Android Problem Overview


I was fiddling in Compiler Explorer, and I found that the order of arguments passed to std::min changes the emitted assembly.

Here's the example on Godbolt Compiler Explorer

#include <algorithm>

double std_min_xy(double x, double y) {
    return std::min(x, y);
}

double std_min_yx(double x, double y) {
    return std::min(y, x);
}

This compiles (with -O3 on clang 9.0.0, for example) to:

std_min_xy(double, double):                       # @std_min_xy(double, double)
        minsd   xmm1, xmm0
        movapd  xmm0, xmm1
        ret
std_min_yx(double, double):                       # @std_min_yx(double, double)
        minsd   xmm0, xmm1
        ret

This persists if I change the std::min to an old-school ternary operator. It also persists across all the modern compilers I tried out (clang, gcc, icc).

The underlying instruction is minsd. Reading the documentation, the first argument of minsd is also the destination for the answer. Apparently xmm0 is where my function is supposed to put its return value, so if xmm0 is used as the first argument, there is no movapd needed. But if xmm0 is the second argument, then it has to movapd xmm0, xmm1 to get the value into xmm0. (editor's note: yes, x86-64 System V passes FP args in xmm0, xmm1, etc., and returns in xmm0.)

My question: why doesn't the compiler switch the order of the arguments itself, so that this movapd isn't necessary? It surely must know that the order of arguments to minsd does not change the answer? Is there some side-effect that I'm not appreciating?

Android Solutions


Solution 1 - Android

minsd a,b is not commutative for some special FP values, and neither is std::min, unless you use -ffast-math.

minsd a,b exactly implements (a<b) ? a : b, including everything that implies about signed zero and NaN in strict IEEE-754 semantics (i.e. it keeps the source operand, b, on unordered (see Footnote 1) or equal). As Artyer points out, -0.0 and +0.0 compare equal (i.e. -0. < 0. is false), but they are distinct.

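std::min is defined in terms of a < comparison expression (cppreference), with (b<a) ? b : a as a possible implementation, so it returns its first argument when the two compare equal or unordered. That is why std::min(x, y) maps to minsd with the operands swapped. This is unlike std::fmin, which treats a single NaN operand as missing data and returns the other operand regardless of argument order, among other things. (fmin originally came from the C math library, not a C++ template.)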
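The relationship between the two can be sketched in plain C++. This is a minimal sketch, assuming IEEE-754 doubles and a build without -ffast-math; `minsd_model`, `same_value` and `check_std_min_matches_swapped_minsd` are hypothetical names introduced only for illustration. It models `minsd a, b` as described above and checks that `std::min(x, y)` agrees with the model called with swapped operands, which is consistent with the question's assembly where `std::min(x, y)` became `minsd xmm1, xmm0`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical model of `minsd a, b`: the result is a only when a < b is
// true; on equal or unordered inputs the source operand b is kept.
inline double minsd_model(double a, double b) {
    return (a < b) ? a : b;
}

// Compares two results including the sign of zero; any NaN matches any NaN.
inline bool same_value(double p, double q) {
    if (std::isnan(p) || std::isnan(q)) {
        return std::isnan(p) && std::isnan(q);
    }
    return p == q && std::signbit(p) == std::signbit(q);
}

// Asserts that std::min(x, y) agrees with minsd_model(y, x) for a few
// ordinary and special values (signed zeros and a quiet NaN).
inline void check_std_min_matches_swapped_minsd() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const double samples[] = {-0.0, +0.0, 1.0, -1.0, nan};
    for (double x : samples) {
        for (double y : samples) {
            assert(same_value(std::min(x, y), minsd_model(y, x)));
        }
    }
}
```

Calling `check_std_min_matches_swapped_minsd()` from a test or from `main` exercises every pair. The model itself is not commutative: `minsd_model(+0.0, -0.0)` yields -0.0 while `minsd_model(-0.0, +0.0)` yields +0.0.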

See https://stackoverflow.com/questions/40196817/what-is-the-instruction-that-gives-branchless-fp-min-and-max-on-x86 for much more detail about minss/minsd / maxss/maxsd (and the corresponding intrinsics, which follow the same non-commutative rules except in some GCC versions).

Footnote 1: Remember that NaN<b is false for any b, and for any comparison predicate. e.g. NaN == b is false, and so is NaN > b. Even NaN == NaN is false. When one or more of a pair are NaN, they are "unordered" wrt. each other.
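The footnote can be checked directly. The following is a small sketch, assuming IEEE-754 doubles and no -ffast-math (which would let the compiler assume NaNs never occur); `is_unordered` and `check_nan_comparisons` are hypothetical helpers, not standard library functions.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: true when none of <, >, == holds between a and b.
// Under IEEE-754 rules that only happens when at least one operand is NaN.
inline bool is_unordered(double a, double b) {
    return !(a < b) && !(a > b) && !(a == b);
}

// Asserts that every ordered comparison involving a NaN is false.
inline void check_nan_comparisons() {
    const double nan1 = std::numeric_limits<double>::quiet_NaN();
    const double nan2 = std::numeric_limits<double>::quiet_NaN();
    assert(!(nan1 < 1.0));
    assert(!(nan1 > 1.0));
    assert(!(nan1 == 1.0));
    assert(!(nan1 == nan2));         // even NaN == NaN is false
    assert(is_unordered(nan1, 1.0));
    assert(!is_unordered(0.0, -0.0)); // signed zeros are ordered: they compare equal
}
```

Because `a < b` is false for unordered operands, an expression of the form `(a<b) ? a : b` always falls through to `b` when a NaN is involved.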


With -ffast-math (to tell the compiler to assume no NaNs, and other assumptions and approximations), compilers will optimize either function to a single minsd. https://godbolt.org/z/a7oK91

For GCC, see https://gcc.gnu.org/wiki/FloatingPointMath
clang supports similar options, including -ffast-math as a catch-all.

Some of those options should be enabled by almost everyone except weird legacy codebases, e.g. -fno-math-errno. (See this Q&A for more about recommended math optimizations.) gcc -fno-trapping-math is also a good idea, because -ftrapping-math doesn't fully work anyway, despite being on by default (some optimizations can still change the number of FP exceptions that would be raised if exceptions were unmasked, including sometimes even from 1 to 0 or 0 to non-zero, IIRC). gcc -ftrapping-math also blocks some optimizations that are 100% safe even wrt. exception semantics, so it's pretty bad. In code that doesn't use fenv.h, you'll never know the difference.

But treating std::min as commutative is only possible with options that assume no NaNs and ignore signed-zero semantics, so it definitely can't be called "safe" for code that cares about exactly what happens with NaNs. For example, -ffinite-math-only assumes no NaNs (and no infinities).

clang -funsafe-math-optimizations -ffinite-math-only will do the optimization you're looking for. (unsafe-math-optimizations implies a bunch of more specific options, including not caring about signed zero semantics).

Solution 2 - Android

Consider: std::signbit(std::min(+0.0, -0.0)) == false && std::signbit(std::min(-0.0, +0.0)) == true.

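The other difference involves NaNs: every comparison with a NaN is false, so std::min (implemented as (b<a) ? b : a) returns its first argument whenever either argument is NaN, while minsd returns its second (source) operand. Swapping the arguments can therefore change which NaN comes back, or whether a NaN comes back at all.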
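A runnable version of these observations might look like the sketch below. It assumes IEEE-754 doubles, a build without -ffast-math, and the common `(b < a) ? b : a` implementation of std::min; `check_min_argument_order` is a hypothetical name. The NaN checks in particular reflect what mainstream standard libraries do in practice rather than a guarantee spelled out for NaN inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Asserts that swapping the arguments of std::min changes the result for
// signed zeros and for a single NaN operand.
inline void check_min_argument_order() {
    // +0.0 and -0.0 compare equal, so std::min returns its first argument.
    assert(std::signbit(std::min(+0.0, -0.0)) == false);
    assert(std::signbit(std::min(-0.0, +0.0)) == true);

    // With one NaN operand the comparison is false, so again the first
    // argument comes back, whichever one that is.
    const double nan = std::numeric_limits<double>::quiet_NaN();
    assert(std::isnan(std::min(nan, 1.0)));
    assert(std::min(1.0, nan) == 1.0);
}
```

For ordinary, distinct, non-NaN values the order makes no difference, which is why the distinction is easy to overlook.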


You can allow gcc to reorder the arguments by using the -funsafe-math-optimizations -ffinite-math-only options (both enabled by -ffast-math). unsafe-math-optimizations allows the compiler to not care about signed zero, and finite-math-only allows it to not care about NaNs.

Solution 3 - Android

To expand on the existing answers that say std::min isn't commutative: Here's a concrete example that reliably distinguishes std_min_xy from std_min_yx. Godbolt:

bool distinguish1() {
    return 1 / std_min_xy(0.0, -0.0) > 0.0;
}
bool distinguish2() {
    return 1 / std_min_yx(0.0, -0.0) > 0.0;
}

distinguish1() evaluates to 1 / 0.0 > 0.0, i.e. INFTY > 0.0, or true.
distinguish2() evaluates to 1 / -0.0 > 0.0, i.e. -INFTY > 0.0, or false.
(All this under IEEE rules, of course. I don't think the C++ standard mandates that compilers preserve this particular behavior. Honestly, I was surprised that the expression -0.0 actually evaluated to a negative zero in the first place!)

-ffinite-math-only eliminates this way of telling the difference, and -ffinite-math-only -funsafe-math-optimizations completely eliminates the difference in codegen.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | RaveTheTadpole | View Question on Stackoverflow
Solution 1 - Android | Peter Cordes | View Answer on Stackoverflow
Solution 2 - Android | Artyer | View Answer on Stackoverflow
Solution 3 - Android | Quuxplusone | View Answer on Stackoverflow