Is multiplication faster than float division?

C++, C, Performance, Optimization

C++ Problem Overview


In C/C++, you can set up the following code:

double a, b, c;
...
c = (a + b) / 2;

This does the exact same thing as:

c = (a + b) * 0.5;

I'm wondering which is better to use. Is one operation fundamentally faster than the other?

C++ Solutions


Solution 1 - C++

Multiplication is faster than division. At university I was taught that division takes about six times as long as multiplication. The actual timings are architecture dependent, but in general multiplication is not slower than division and is usually considerably faster. Prefer multiplication wherever the rounding errors allow.

So, for example, this would typically be slower ...

for (int i=0; i<arraySize; i++) {
    a[i] = b[i] / x;
}

... than this ...

double y = 1.0 / x;  // reciprocal computed once, outside the loop
for (int i=0; i<arraySize; i++) {
    a[i] = b[i] * y;
}

Of course, because of rounding errors you'll lose (a little) precision with the second method, but unless you are repeatedly calculating x=1/x; that's unlikely to cause much of an issue.
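To put a number on "a little", here is a minimal sketch that measures the largest relative difference between the two loops above. The helper name max_relative_difference is made up for illustration, and the sketch assumes x is nonzero and all values are finite and well inside the normal double range.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original answer): returns the largest
// relative difference between b[i] / x and b[i] * (1.0 / x).
// Assumes x != 0 and finite inputs.
double max_relative_difference(const std::vector<double>& b, double x) {
    const double y = 1.0 / x;  // reciprocal computed once, rounded once
    double worst = 0.0;
    for (std::size_t i = 0; i < b.size(); ++i) {
        const double by_division = b[i] / x;    // a single rounding
        const double by_reciprocal = b[i] * y;  // a second rounding on top of y's
        if (by_division != 0.0) {
            const double rel =
                std::fabs(by_division - by_reciprocal) / std::fabs(by_division);
            if (rel > worst) worst = rel;
        }
    }
    return worst;
}
```

For a divisor that is a power of two, such as 2.0, the reciprocal is exact and the difference should be zero. For other divisors the two results can differ in the last bit or so, which is the precision loss mentioned above.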

Edit:

Just for reference, I've dug up a third-party comparison of operation timings by searching on Google.

http://gmplib.org/~tege/x86-timing.pdf

Look at the numbers for MUL and DIV. They indicate a difference of between 5 and 10 times, depending on the processor.

Solution 2 - C++

It is quite likely that the compiler will convert a divide to a multiply in this case, if it "thinks" it's faster. Dividing by 2 in floating point may also be faster than other float divides. If the compiler doesn't convert it, it MAY be faster to use multiply, but that is not certain; it depends on the processor itself.

The gain from manually using multiply instead of divide can be quite large in cases where the compiler can't determine that it's "safe" to do so (e.g. 0.1 can't be stored exactly in a floating point number; as a single-precision float it becomes 0.10000000149011612, so multiplying by it is not guaranteed to give the same result as dividing by 10). See below for figures on AMD processors, which can be taken as representative for the class.
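The "safe" versus "unsafe" distinction can be checked directly. The sketch below counts how many small integers give different results for divide and multiply-by-reciprocal. The function name count_mismatches is made up for illustration, and the sketch assumes ordinary IEEE 754 double arithmetic without fast-math style compiler options.

```cpp
// Hypothetical helper (not from the original answer): counts the integers i
// in [1, n] for which i / divisor and i * reciprocal are not bit-for-bit
// equal as doubles. Assumes IEEE 754 doubles and no fast-math options.
int count_mismatches(int n, double divisor, double reciprocal) {
    int mismatches = 0;
    for (int i = 1; i <= n; ++i) {
        const double v = static_cast<double>(i);
        if (v / divisor != v * reciprocal) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Calling count_mismatches(1000, 2.0, 0.5) should report no mismatches, because 0.5 is exactly representable and both forms round the same real value. Calling count_mismatches(1000, 10.0, 0.1) should report at least one (3.0 / 10.0 and 3.0 * 0.1 already differ), which is why a compiler that respects strict floating point semantics cannot make that substitution on its own.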

To tell whether your compiler does this well or not, why not write a bit of code to experiment? Make sure you write it so that the compiler doesn't just calculate a constant value and discard all the calculation in the loop, though.
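One possible shape for such an experiment is sketched below. All names here (Timings, time_pass_ns, compare_divide_and_multiply) are made up for illustration. The divisor is read through a volatile so the compiler cannot treat it as a constant, and a checksum of the results is returned so the loops are not discarded. Treat it as a rough sketch, not a rigorous benchmark: a single pass is noisy, and the optimizer may still vectorize or reorder the work.

```cpp
#include <chrono>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical result type for this sketch.
struct Timings {
    long long divide_ns;
    long long multiply_ns;
    double checksum;  // consumes the results so the loops are not dead code
};

// Times one pass of op over in, writing to out; returns nanoseconds.
template <typename Op>
long long time_pass_ns(const std::vector<double>& in, std::vector<double>& out, Op op) {
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = op(in[i]);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

Timings compare_divide_and_multiply(std::size_t n, double x) {
    std::vector<double> in(n), out(n);
    for (std::size_t i = 0; i < n; ++i) {
        in[i] = 1.0 + static_cast<double>(i);
    }

    volatile double runtime_x = x;  // hides the value from constant folding
    const double d = runtime_x;
    const double y = 1.0 / d;       // reciprocal computed once

    Timings t{};
    t.divide_ns = time_pass_ns(in, out, [d](double v) { return v / d; });
    t.checksum += std::accumulate(out.begin(), out.end(), 0.0);
    t.multiply_ns = time_pass_ns(in, out, [y](double v) { return v * y; });
    t.checksum += std::accumulate(out.begin(), out.end(), 0.0);
    return t;
}
```

Run it with a large n, repeat it several times, and compare divide_ns with multiply_ns at the optimization level you actually ship with. Inspecting the generated assembly is a useful cross-check on whether the compiler kept the divide.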

Edit:

AMD's optimisation guide for Family 15h processors gives figures for FDIV and FMUL of 42 and 6 cycles respectively. The SSE versions are a little closer: 24 (single) or 27 (double) cycles for DIVPS, DIVPD, DIVSS and DIVSD (divide), and 6 cycles for all forms of multiply.

From memory, Intel's figures aren't that far off.

Solution 3 - C++

Floating point multiplication usually takes fewer cycles than floating point division. But with literal operands, the optimizer is well aware of this kind of micro-optimization.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | wodesuck | View Question on Stackoverflow
Solution 1 - C++ | Philip Couling | View Answer on Stackoverflow
Solution 2 - C++ | Mats Petersson | View Answer on Stackoverflow
Solution 3 - C++ | ouah | View Answer on Stackoverflow