Why does GCC call libc's sqrt() without using its result?

C++ Problem Overview

Using GCC 6.3, the following C++ code:

#include <cmath>
#include <iostream>

void norm(double r, double i)
{
	double n = std::sqrt(r * r + i * i);
	std::cout << "norm = " << n;
}

generates the following x86-64 assembly:

norm(double, double):
        mulsd   %xmm1, %xmm1
        subq    $24, %rsp
        mulsd   %xmm0, %xmm0
        addsd   %xmm1, %xmm0
        pxor    %xmm1, %xmm1
        ucomisd %xmm0, %xmm1
        sqrtsd  %xmm0, %xmm2
        movsd   %xmm2, 8(%rsp)
        jbe     .L2
        call    sqrt
.L2:
        movl    std::cout, %edi
        movl    $7, %edx
        movl    $.LC1, %esi
        call    std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)
        movsd   8(%rsp), %xmm0
        movl    std::cout, %edi
        addq    $24, %rsp
        jmp     std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<double>(double)

For the call to std::sqrt, GCC first computes the result with sqrtsd and saves it onto the stack. Then, depending on the outcome of the ucomisd/jbe comparison, it calls the libc sqrt function. However, it never saves xmm0 after that call: before the second call to operator<<, it restores the value from the stack (because xmm0 was clobbered by the first call to operator<<).

Replacing the output statement with a simpler std::cout << n; makes it even more obvious:

subq    $24, %rsp
movsd   %xmm1, 8(%rsp)
call    sqrt
movsd   8(%rsp), %xmm1
movl    std::cout, %edi
addq    $24, %rsp
movapd  %xmm1, %xmm0
jmp     std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<double>(double)

Why is GCC not using the xmm0 value computed by libc sqrt?

C++ Solutions

Solution 1 - C++

GCC doesn't need to call sqrt to compute the result; the SQRTSD instruction has already calculated it. The call to sqrt exists only to produce the side effects the standard requires when a negative number is passed to sqrt (for example, setting errno and/or raising a floating-point exception). The PXOR, UCOMISD, and JBE instructions test whether the argument is less than 0 and skip the call to sqrt if it isn't. When the call does happen, its return value in xmm0 can be ignored, because the result is NaN either way and the copy saved on the stack is just as good.
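The side effects mentioned above can be observed directly. The following is a minimal sketch, not part of the original answer: the names SqrtEffects and sqrt_effects are made up for illustration, and whether errno is actually set depends on the C library (glibc reports errors through both errno and floating-point exceptions, other libraries may use only one of them, which math_errhandling indicates).

```cpp
#include <cerrno>
#include <cfenv>
#include <cmath>

// Observable effects of one std::sqrt call (illustrative helper type).
struct SqrtEffects {
    double value;        // the returned result
    bool errno_is_edom;  // true if errno == EDOM after the call
    bool fe_invalid;     // true if FE_INVALID was raised by the call
};

// Hypothetical helper: calls std::sqrt(x) and records its side effects.
SqrtEffects sqrt_effects(double x)
{
    // volatile keeps the compiler from folding the call at compile time.
    volatile double input = x;

    // Start from a clean state so only this call's effects are recorded.
    errno = 0;
    std::feclearexcept(FE_ALL_EXCEPT);

    double value = std::sqrt(input);

    SqrtEffects e;
    e.value = value;
    e.errno_is_edom = (errno == EDOM);
    e.fe_invalid = (std::fetestexcept(FE_INVALID) != 0);
    return e;
}
```

Calling sqrt_effects(4.0) should report no error, while sqrt_effects(-1.0) should return NaN and, on a library that reports math errors through errno, leave errno set to EDOM. That errno update is what the inline SQRTSD cannot do on its own, which is why the library call is kept for negative arguments. Compiling with -fno-math-errno tells GCC that errno need not be set, and it then emits only the SQRTSD.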

Attributions