Float and double datatype in Java

Java Problem Overview

The float data type is a single-precision 32-bit IEEE 754 floating point and the double data type is a double-precision 64-bit IEEE 754 floating point.

What does it mean? And when should I use float instead of double or vice-versa?

Java Solutions

Solution 1 - Java

The Wikipedia page on it is a good place to start.

To sum up:

float is represented in 32 bits, with 1 sign bit, 8 bits of exponent, and 23 bits of the significand (or what follows from a scientific-notation number: 2.33728*10¹²; 33728 is the significand).
double is represented in 64 bits, with 1 sign bit, 11 bits of exponent, and 52 bits of significand.

By default, Java uses double to represent its floating-point numerals (so a literal 3.14 is typed double). It's also the data type that will give you a much larger number range, so I would strongly encourage its use over float.

There may be certain libraries that actually force your usage of float, but in general - unless you can guarantee that your result will be small enough to fit in float's prescribed range, then it's best to opt with double.

If you require accuracy - for instance, you can't have a decimal value that is inaccurate (like 1/10 + 2/10), or you're doing anything with currency (for example, representing $10.33 in the system), then use a BigDecimal, which can support an arbitrary amount of precision and handle situations like that elegantly.

Solution 2 - Java

A float gives you approx. 6-7 decimal digits precision while a double gives you approx. 15-16. Also the range of numbers is larger for double.

A double needs 8 bytes of storage space while a float needs just 4 bytes.

Solution 3 - Java

Floating-point numbers, also known as real numbers, are used when evaluating expressions that require fractional precision. For example, calculations such as square root, or transcendentals such as sine and cosine, result in a value whose precision requires a floating-point type. Java implements the standard (IEEE–754) set of floatingpoint types and operators. There are two kinds of floating-point types, float and double, which represent single- and double-precision numbers, respectively. Their width and ranges are shown here:

   Name 	Width in Bits 	Range 
    double 	64 	            1 .7e–308 to 1.7e+308
    float 	32 	            3 .4e–038 to 3.4e+038

float

The type float specifies a single-precision value that uses 32 bits of storage. Single precision is faster on some processors and takes half as much space as double precision, but will become imprecise when the values are either very large or very small. Variables of type float are useful when you need a fractional component, but don't require a large degree of precision.

Here are some example float variable declarations:

float hightemp, lowtemp;

double

Double precision, as denoted by the double keyword, uses 64 bits to store a value. Double precision is actually faster than single precision on some modern processors that have been optimized for high-speed mathematical calculations. All transcendental math functions, such as sin( ), cos( ), and sqrt( ), return double values. When you need to maintain accuracy over many iterative calculations, or are manipulating large-valued numbers, double is the best choice.

Solution 4 - Java

This will give error:

public class MyClass {
    public static void main(String args[]) {
        float a = 0.5;
    }
}

/MyClass.java:3: error: incompatible types: possible lossy conversion from double to float float a = 0.5;

This will work perfectly fine

public class MyClass {
    public static void main(String args[]) {
        double a = 0.5;
    }
}

This will also work perfectly fine

public class MyClass {
    public static void main(String args[]) {
        float a = (float)0.5;
    }
}

Reason : Java by default stores real numbers as double to ensure higher precision.

Double takes more space but more precise during computation and float takes less space but less precise.

Solution 5 - Java

Java seems to have a bias towards using double for computations nonetheless:

Case in point the program I wrote earlier today, the methods didn't work when I used float, but now work great when I substituted float with double (in the NetBeans IDE):

package palettedos;
import java.util.*;

class Palettedos{
    private static Scanner Z = new Scanner(System.in);
    public static final double pi = 3.142;

    public static void main(String[]args){
        Palettedos A = new Palettedos();
        System.out.println("Enter the base and height of the triangle respectively");
        int base = Z.nextInt();
        int height = Z.nextInt();
        System.out.println("Enter the radius of the circle");
        int radius = Z.nextInt();
        System.out.println("Enter the length of the square");
        long length = Z.nextInt();
        double tArea = A.calculateArea(base, height);
        double cArea = A.calculateArea(radius);
        long sqArea = A.calculateArea(length);
        System.out.println("The area of the triangle is\t" + tArea);
        System.out.println("The area of the circle is\t" + cArea);
        System.out.println("The area of the square is\t" + sqArea);
    }

    double calculateArea(int base, int height){
        double triArea = 0.5*base*height;
        return triArea;
    }

    double calculateArea(int radius){
        double circArea = pi*radius*radius;
        return circArea;
    }

    long calculateArea(long length){
        long squaArea = length*length;
        return squaArea;
    }
}

Solution 6 - Java

According to the IEEE standards, float is a 32 bit representation of a real number while double is a 64 bit representation.

In Java programs we normally mostly see the use of double data type. It's just to avoid overflows as the range of numbers that can be accommodated using the double data type is more that the range when float is used.

Also when high precision is required, the use of double is encouraged. Few library methods that were implemented a long time ago still requires the use of float data type as a must (that is only because it was implemented using float, nothing else!).

But if you are certain that your program requires small numbers and an overflow won't occur with your use of float, then the use of float will largely improve your space complexity as floats require half the memory as required by double.

Solution 7 - Java

This example illustrates how to extract the sign (the leftmost bit), exponent (the 8 following bits) and mantissa (the 23 rightmost bits) from a float in Java.

int bits = Float.floatToIntBits(-0.005f);
int sign = bits >>> 31;
int exp = (bits >>> 23 & ((1 << 8) - 1)) - ((1 << 7) - 1);
int mantissa = bits & ((1 << 23) - 1);
System.out.println(sign + " " + exp + " " + mantissa + " " +
  Float.intBitsToFloat((sign << 31) | (exp + ((1 << 7) - 1)) << 23 | mantissa));

The same approach can be used for double’s (11 bit exponent and 52 bit mantissa).

long bits = Double.doubleToLongBits(-0.005);
long sign = bits >>> 63;
long exp = (bits >>> 52 & ((1 << 11) - 1)) - ((1 << 10) - 1);
long mantissa = bits & ((1L << 52) - 1);
System.out.println(sign + " " + exp + " " + mantissa + " " +
  Double.longBitsToDouble((sign << 63) | (exp + ((1 << 10) - 1)) << 52 | mantissa));

Credit: http://s-j.github.io/java-float/

Solution 8 - Java

You should use double instead of float for precise calculations, and float instead of double when using less accurate calculations. Float contains only decimal numbers, but double contains an IEEE754 double-precision floating point number, making it easier to contain and computate numbers more accurately. Hope this helps.

Solution 9 - Java

In regular programming calculations, we don’t use float. If we ensure that the result range is within the range of float data type then we can choose a float data type for saving memory. Generally, we use double because of two reasons:-

If we want to use the floating-point number as float data type then method caller must explicitly suffix F or f, because by default every floating-point number is treated as double. It increases the burden to the programmer. If we use a floating-point number as double data type then we don’t need to add any suffix.
Float is a single-precision data type means it occupies 4 bytes. Hence in large computations, we will not get a complete result. If we choose double data type, it occupies 8 bytes and we will get complete results.

Both float and double data types were designed especially for scientific calculations, where approximation errors are acceptable. If accuracy is the most prior concern then, it is recommended to use BigDecimal class instead of float or double data types. Source:- Float and double datatypes in Java

Content Type	Original Author	Original Content on Stackoverflow
Question	Leo	View Question on Stackoverflow
Solution 1 - Java	Makoto	View Answer on Stackoverflow
Solution 2 - Java	Henry	View Answer on Stackoverflow
Solution 3 - Java	Ye Win	View Answer on Stackoverflow
Solution 4 - Java	Himanshu Singh	View Answer on Stackoverflow
Solution 5 - Java	Raymond Wachaga	View Answer on Stackoverflow
Solution 6 - Java	Rubal	View Answer on Stackoverflow
Solution 7 - Java	okrunner	View Answer on Stackoverflow
Solution 8 - Java	boi yeet	View Answer on Stackoverflow
Solution 9 - Java	Rocco Jerry	View Answer on Stackoverflow