Is 'float a = 3.0;' a correct statement?

C++, C++11

C++ Problem Overview


If I have the following declaration:

float a = 3.0 ;

is that an error? I read in a book that 3.0 is a double value and that I have to specify it as float a = 3.0f. Is it so?

C++ Solutions


Solution 1 - C++

It is not an error to declare float a = 3.0 : if you do, the compiler will convert the double literal 3.0 to a float for you.


However, you should use the float literals notation in specific scenarios.

  1. For performance reasons:

    Specifically, consider:

     float foo(float x) { return x * 0.42; }
    

    Here x is promoted to double, the multiplication is done in double precision, and the result is converted back to float, so you pay for the conversions at run time on every call. To avoid that, write:

     float foo(float x) { return x * 0.42f; } // OK, no conversion required
    
  2. To avoid bugs when comparing results:

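    For example, the following comparison fails, because x is promoted to double and the float nearest to 4.2 is not the same value as the double nearest to 4.2: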

     float x = 4.2;
     if (x == 4.2)
        std::cout << "oops"; // Not executed!
    

    We can fix it with the float literal notation :

     if (x == 4.2f)
        std::cout << "ok !"; // Executed!
    

    (Note: of course, this is not how you should compare float or double numbers for equality in general)

  3. To call the correct overloaded function (for the same reason):

    Example:

     void foo(float f) { std::cout << "\nfloat"; }
     
     void foo(double d) { std::cout << "\ndouble"; }
     
     int main()
     {       
         foo(42.0);   // calls double overload
         foo(42.0f);  // calls float overload
         return 0;
     }
    
  4. As noted by Cyber, in a type deduction context, it is necessary to help the compiler deduce a float :

    In case of auto :

     auto d = 3;      // int
     auto e = 3.0;    // double
     auto f = 3.0f;   // float
    

    And similarly, in case of template type deduction :

     void foo(float f) { std::cout << "\nfloat"; }
     
     void foo(double d) { std::cout << "\ndouble"; }
     
     template<typename T>
     void bar(T t)
     {
           foo(t);
     }
     
     int main()
     {   
         bar(42.0);   // Deduce double
         bar(42.0f);  // Deduce float
         
         return 0;
     }
    


Solution 2 - C++

The compiler will turn any of the following literals into floats, because you declared the variable as a float.

float a = 3;     // converted to float
float b = 3.0;   // converted to float
float c = 3.0f;  // float

It would matter if you used auto (or another form of type deduction), for example:

auto d = 3;      // int
auto e = 3.0;    // double
auto f = 3.0f;   // float

Solution 3 - C++

Floating point literals without a suffix are of type double. This is covered in the draft C++ standard, section 2.14.4 Floating literals:

>[...]The type of a floating literal is double unless explicitly specified by a suffix.[...]

So is it an error to assign 3.0, a double literal, to a float?

float a = 3.0;

No, it is not. The value will be converted, which is covered in section 4.8 Floating point conversions:

> A prvalue of floating point type can be converted to a prvalue of another floating point type. If the source value can be exactly represented in the destination type, the result of the conversion is that exact representation. If the source value is between two adjacent destination values, the result of the conversion is an implementation-defined choice of either of those values. Otherwise, the behavior is undefined.

We can read more details on the implications of this in GotW #67: double or nothing which says:

> This means that a double constant can be implicitly (i.e., silently) converted to a float constant, even if doing so loses precision (i.e., data). This was allowed to remain for C compatibility and usability reasons, but it's worth keeping in mind when you do floating-point work.
>
> A quality compiler will warn you if you try to do something that's undefined behavior, namely put a double quantity into a float that's less than the minimum, or greater than the maximum, value that a float is able to represent. A really good compiler will provide an optional warning if you try to do something that may be defined but could lose information, namely put a double quantity into a float that is between the minimum and maximum values representable by a float, but which can't be represented exactly as a float.

So there are caveats for the general case that you should be aware of.

From a practical perspective, in this case the results will most likely be the same, even though technically there is a conversion. We can see this by trying out the following code on godbolt:

#include <iostream>

float func1()
{
  return 3.0; // a double literal
}


float func2()
{
  return 3.0f ; // a float literal
}

int main()
{  
  std::cout << func1() << ":" << func2() << std::endl ;
  return 0;
}

and we see that the results for func1 and func2 are identical, using both clang and gcc:

func1():
	movss	xmm0, DWORD PTR .LC0[rip]
	ret
func2():
	movss	xmm0, DWORD PTR .LC0[rip]
	ret

As Pascal points out in this comment, you won't always be able to count on this. Using 0.1 and 0.1f respectively causes the generated assembly to differ, since the conversions must now be performed at run time. The following code:

float func1(float x )
{
  return x*0.1; // a double literal
}

float func2(float x)
{
  return x*0.1f ; // a float literal
}

results in the following assembly:

func1(float):  
	cvtss2sd	%xmm0, %xmm0	# x, D.31147    
	mulsd	.LC0(%rip), %xmm0	#, D.31147
	cvtsd2ss	%xmm0, %xmm0	# D.31147, D.31148
	ret
func2(float):
	mulss	.LC2(%rip), %xmm0	#, D.31155
	ret

Regardless of whether you can determine if the conversion will have a performance impact, using the correct type better documents your intention. Using an explicit conversion, for example static_cast, also helps to clarify that the conversion was intended rather than accidental; an accidental conversion may signify a bug or potential bug.

Note

As supercat points out, multiplication by e.g. 0.1 and 0.1f is not equivalent. I am just going to quote the comment because it was excellent and a summary probably would not do it justice:

> For example, if f was equal to 100000224 (which is exactly representable as a float), multiplying it by one tenth should yield a result which rounds down to 10000022, but multiplying by 0.1f will instead yield a result which erroneously rounds up to 10000023. If the intention is to divide by ten, multiplication by double constant 0.1 will likely be faster than division by 10f, and more precise than multiplication by 0.1f.

My original point was to demonstrate a false example given in another question, but this nicely demonstrates that subtle issues can exist even in toy examples.

Solution 4 - C++

It's not an error in the sense that the compiler will reject it, but it is an error in the sense that it may not be what you want.

As your book correctly states, 3.0 is a value of type double. There is an implicit conversion from double to float, so float a = 3.0; is a valid definition of a variable.

However, at least conceptually, this performs a needless conversion. Depending on the compiler, the conversion may be performed at compile time, or it may be saved for run time. A valid reason for saving it for run time is that floating-point conversions are difficult and may have unexpected side effects if the value cannot be represented exactly, and it's not always easy to verify whether the value can be represented exactly.

3.0f avoids that problem: although technically, the compiler is still allowed to calculate the constant at run time (it always is), here, there is absolutely no reason why any compiler might possibly do that.

Solution 5 - C++

While not an error, per se, it is a little sloppy. You know you want a float, so initialize it with a float.
Another programmer may come along and not be sure which part of the declaration is correct, the type or the initializer. Why not have them both be correct?
float Answer = 42.0f;

Solution 6 - C++

When you define a variable, it is initialized with the provided initializer. This may require converting the value of the initializer to the type of the variable that's being initialized. That's what's happening when you say float a = 3.0;: The value of the initializer is converted to float, and the result of the conversion becomes the initial value of a.

That's generally fine, but it doesn't hurt to write 3.0f to show that you're aware of what you're doing, and especially if you want to write auto a = 3.0f.

Solution 7 - C++

If you try out the following:

std::cout << sizeof(3.2f) <<":" << sizeof(3.2) << std::endl;

you will get output as:

4:8

This shows that 3.2f is a float, which takes 4 bytes on typical platforms, whereas 3.2 is interpreted as a double, which takes 8 bytes. The exact sizes are implementation-defined, but the difference should provide the answer that you are looking for.

Solution 8 - C++

With auto, the compiler deduces the type from the literal, and a floating literal without a suffix is a double, not a float. In other words, the language favors precision over efficiency here. If in doubt, use brace initializers to make the type explicit:

auto d = double{3}; // make a double
auto f = float{3}; // make a float
auto i = int{3}; // make an int

The story gets more interesting if you initialize from another variable, where the type-conversion rules apply: while it is legal to construct a double from a literal, it can't be constructed from an int variable without a possible narrowing conversion:

auto xxx = double{i}; // warning! narrowing conversion of 'i' from 'int' to 'double'

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | TESLA____ | View Question on Stackoverflow
Solution 1 - C++ | quantdev | View Answer on Stackoverflow
Solution 2 - C++ | Cory Kramer | View Answer on Stackoverflow
Solution 3 - C++ | Shafik Yaghmour | View Answer on Stackoverflow
Solution 4 - C++ | user743382 | View Answer on Stackoverflow
Solution 5 - C++ | Engineer | View Answer on Stackoverflow
Solution 6 - C++ | Kerrek SB | View Answer on Stackoverflow
Solution 7 - C++ | Dr. Debasish Jana | View Answer on Stackoverflow
Solution 8 - C++ | truschival | View Answer on Stackoverflow