Getting a File's MD5 Checksum in Java

JavaMd5Checksum

Java Problem Overview


I am looking to use Java to get the MD5 checksum of a file. I was really surprised but I haven't been able to find anything that shows how to get the MD5 checksum of a file.

How is it done?

Java Solutions


Solution 1 - Java

There's an input stream decorator, java.security.DigestInputStream, so that you can compute the digest while using the input stream as you normally would, instead of having to make an extra pass over the data.

MessageDigest md = MessageDigest.getInstance("MD5");
try (InputStream is = Files.newInputStream(Paths.get("file.txt"));
     DigestInputStream dis = new DigestInputStream(is, md)) 
{
  /* Read decorated stream (dis) to EOF as normal... */
}
byte[] digest = md.digest();

Solution 2 - Java

Use DigestUtils from Apache Commons Codec library:

try (InputStream is = Files.newInputStream(Paths.get("file.zip"))) {
    String md5 = org.apache.commons.codec.digest.DigestUtils.md5Hex(is);
}

Solution 3 - Java

There's an example at Real's Java-How-to using the MessageDigest class.

Check that page for examples using CRC32 and SHA-1 as well.

import java.io.*;
import java.security.MessageDigest;

public class MD5Checksum {

   public static byte[] createChecksum(String filename) throws Exception {
       InputStream fis =  new FileInputStream(filename);

       byte[] buffer = new byte[1024];
       MessageDigest complete = MessageDigest.getInstance("MD5");
       int numRead;

       do {
           numRead = fis.read(buffer);
           if (numRead > 0) {
               complete.update(buffer, 0, numRead);
           }
       } while (numRead != -1);

       fis.close();
       return complete.digest();
   }

   // see this How-to for a faster way to convert
   // a byte array to a HEX string
   public static String getMD5Checksum(String filename) throws Exception {
       byte[] b = createChecksum(filename);
       String result = "";

       for (int i=0; i < b.length; i++) {
           result += Integer.toString( ( b[i] & 0xff ) + 0x100, 16).substring( 1 );
       }
       return result;
   }

   public static void main(String args[]) {
       try {
           System.out.println(getMD5Checksum("apache-tomcat-5.5.17.exe"));
           // output :
           //  0bb2827c5eacf570b6064e24e0e6653b
           // ref :
           //  http://www.apache.org/dist/
           //          tomcat/tomcat-5/v5.5.17/bin
           //              /apache-tomcat-5.5.17.exe.MD5
           //  0bb2827c5eacf570b6064e24e0e6653b *apache-tomcat-5.5.17.exe
       }
       catch (Exception e) {
           e.printStackTrace();
       }
   }
}

Solution 4 - Java

The com.google.common.hash API offers:

  • A unified user-friendly API for all hash functions
  • Seedable 32- and 128-bit implementations of murmur3
  • md5(), sha1(), sha256(), sha512() adapters, change only one line of code to switch between these, and murmur.
  • goodFastHash(int bits), for when you don't care what algorithm you use
  • General utilities for HashCode instances, like combineOrdered / combineUnordered

Read the User Guide (IO Explained, Hashing Explained).

For your use-case Files.hash() computes and returns the digest value for a file.

For example a [tag:SHA-1] digest calculation (change SHA-1 to MD5 to get MD5 digest)

HashCode hc = Files.asByteSource(file).hash(Hashing.sha1());
"SHA-1: " + hc.toString();

Note that [tag:CRC32] is much faster than [tag:MD5], so use [tag:CRC32] if you do not need a cryptographically secure checksum. Note also that [tag:MD5] should not be used to store passwords and the like since it is to easy to brute force, for passwords use [tag:bcrypt], [tag:scrypt] or [tag:SHA-256] instead.

For long term protection with hashes a Merkle signature scheme adds to the security and The Post Quantum Cryptography Study Group sponsored by the European Commission has recommended use of this cryptography for long term protection against quantum computers (ref).

Note that [tag:CRC32] has a higher collision rate than the others.

Solution 5 - Java

Using nio2 (Java 7+) and no external libraries:

byte[] b = Files.readAllBytes(Paths.get("/path/to/file"));
byte[] hash = MessageDigest.getInstance("MD5").digest(b);

To compare the result with an expected checksum:

String expected = "2252290BC44BEAD16AA1BF89948472E8";
String actual = DatatypeConverter.printHexBinary(hash);
System.out.println(expected.equalsIgnoreCase(actual) ? "MATCH" : "NO MATCH");

Solution 6 - Java

Guava now provides a new, consistent hashing API that is much more user-friendly than the various hashing APIs provided in the JDK. See Hashing Explained. For a file, you can get the MD5 sum, CRC32 (with version 14.0+) or many other hashes easily:

HashCode md5 = Files.hash(file, Hashing.md5());
byte[] md5Bytes = md5.asBytes();
String md5Hex = md5.toString();

HashCode crc32 = Files.hash(file, Hashing.crc32());
int crc32Int = crc32.asInt();

// the Checksum API returns a long, but it's padded with 0s for 32-bit CRC
// this is the value you would get if using that API directly
long checksumResult = crc32.padToLong();

Solution 7 - Java

Ok. I had to add. One line implementation for those who already have Spring and Apache Commons dependency or are planning to add it:

DigestUtils.md5DigestAsHex(FileUtils.readFileToByteArray(file))

For and Apache commons only option (credit @duleshi):

DigestUtils.md5Hex(FileUtils.readFileToByteArray(file))

Hope this helps someone.

Solution 8 - Java

A simple approach with no third party libraries using Java 7

String path = "your complete file path";
MessageDigest md = MessageDigest.getInstance("MD5");
md.update(Files.readAllBytes(Paths.get(path)));
byte[] digest = md.digest();

If you need to print this byte array. Use as below

System.out.println(Arrays.toString(digest));

If you need hex string out of this digest. Use as below

String digestInHex = DatatypeConverter.printHexBinary(digest).toUpperCase();
System.out.println(digestInHex);

where DatatypeConverter is javax.xml.bind.DatatypeConverter

Solution 9 - Java

I recently had to do this for just a dynamic string, MessageDigest can represent the hash in numerous ways. To get the signature of the file like you would get with the md5sum command I had to do something like the this:

try {
   String s = "TEST STRING";
   MessageDigest md5 = MessageDigest.getInstance("MD5");
   md5.update(s.getBytes(),0,s.length());
   String signature = new BigInteger(1,md5.digest()).toString(16);
   System.out.println("Signature: "+signature);

} catch (final NoSuchAlgorithmException e) {
   e.printStackTrace();
}

This obviously doesn't answer your question about how to do it specifically for a file, the above answer deals with that quiet nicely. I just spent a lot of time getting the sum to look like most application's display it, and thought you might run into the same trouble.

Solution 10 - Java

public static void main(String[] args) throws Exception {
    MessageDigest md = MessageDigest.getInstance("MD5");
    FileInputStream fis = new FileInputStream("c:\\apache\\cxf.jar");
 
    byte[] dataBytes = new byte[1024];
 
    int nread = 0;
    while ((nread = fis.read(dataBytes)) != -1) {
        md.update(dataBytes, 0, nread);
    };
    byte[] mdbytes = md.digest();
    StringBuffer sb = new StringBuffer();
    for (int i = 0; i < mdbytes.length; i++) {
        sb.append(Integer.toString((mdbytes[i] & 0xff) + 0x100, 16).substring(1));
    }
    System.out.println("Digest(in hex format):: " + sb.toString());
}

Or you may get more info http://www.asjava.com/core-java/java-md5-example/

Solution 11 - Java

We were using code that resembles the code above in a previous post using

...
String signature = new BigInteger(1,md5.digest()).toString(16);
...

However, watch out for using BigInteger.toString() here, as it will truncate leading zeros... (for an example, try s = "27", checksum should be "02e74f10e0327ad868d138f2b4fdd6f0")

I second the suggestion to use Apache Commons Codec, I replaced our own code with that.

Solution 12 - Java

public static String MD5Hash(String toHash) throws RuntimeException {
   try{
       return String.format("%032x", // produces lower case 32 char wide hexa left-padded with 0
      new BigInteger(1, // handles large POSITIVE numbers 
           MessageDigest.getInstance("MD5").digest(toHash.getBytes())));
   }
   catch (NoSuchAlgorithmException e) {
      // do whatever seems relevant
   }
}

Solution 13 - Java

Very fast & clean Java-method that doesn't rely on external libraries:

(Simply replace MD5 with SHA-1, SHA-256, SHA-384 or SHA-512 if you want those)

public String calcMD5() throws Exception{
		byte[] buffer = new byte[8192];
		MessageDigest md = MessageDigest.getInstance("MD5");
		
		DigestInputStream dis = new DigestInputStream(new FileInputStream(new File("Path to file")), md);
		try {
			while (dis.read(buffer) != -1);
		}finally{
			dis.close();
		}

		byte[] bytes = md.digest();
	
		// bytesToHex-method
		char[] hexChars = new char[bytes.length * 2];
		for ( int j = 0; j < bytes.length; j++ ) {
			int v = bytes[j] & 0xFF;
			hexChars[j * 2] = hexArray[v >>> 4];
			hexChars[j * 2 + 1] = hexArray[v & 0x0F];
		}

		return new String(hexChars);
}

Solution 14 - Java

String checksum = DigestUtils.md5Hex(new FileInputStream(filePath));

Solution 15 - Java

Here is a handy variation that makes use of InputStream.transferTo() from Java 9, and OutputStream.nullOutputStream() from Java 11. It requires no external libraries and does not need to load the entire file into memory.

public static String hashFile(String algorithm, File f) throws IOException, NoSuchAlgorithmException {
    MessageDigest md = MessageDigest.getInstance(algorithm);

    try(BufferedInputStream in = new BufferedInputStream((new FileInputStream(f)));
        DigestOutputStream out = new DigestOutputStream(OutputStream.nullOutputStream(), md)) {
        in.transferTo(out);
    }

    String fx = "%0" + (md.getDigestLength()*2) + "x";
    return String.format(fx, new BigInteger(1, md.digest()));
}

and

hashFile("SHA-512", Path.of("src", "test", "resources", "some.txt").toFile());

returns

"e30fa2784ba15be37833d569280e2163c6f106506dfb9b07dde67a24bfb90da65c661110cf2c5c6f71185754ee5ae3fd83a5465c92f72abd888b03187229da29"

Solution 16 - Java

Another implementation: Fast MD5 Implementation in Java

String hash = MD5.asHex(MD5.getHash(new File(filename)));

Solution 17 - Java

Standard Java Runtime Environment way:

public String checksum(File file) {
  try {
    InputStream fin = new FileInputStream(file);
    java.security.MessageDigest md5er =
        MessageDigest.getInstance("MD5");
    byte[] buffer = new byte[1024];
    int read;
    do {
      read = fin.read(buffer);
      if (read > 0)
        md5er.update(buffer, 0, read);
    } while (read != -1);
    fin.close();
    byte[] digest = md5er.digest();
    if (digest == null)
      return null;
    String strDigest = "0x";
    for (int i = 0; i < digest.length; i++) {
      strDigest += Integer.toString((digest[i] & 0xff) 
                + 0x100, 16).substring(1).toUpperCase();
    }
    return strDigest;
  } catch (Exception e) {
    return null;
  }
}

The result is equal of linux md5sum utility.

Solution 18 - Java

Here is a simple function that wraps around Sunil's code so that it takes a File as a parameter. The function does not need any external libraries, but it does require Java 7.

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

import javax.xml.bind.DatatypeConverter;

public class Checksum {
    
    /**
     * Generates an MD5 checksum as a String.
     * @param file The file that is being checksummed.
     * @return Hex string of the checksum value.
     * @throws NoSuchAlgorithmException
     * @throws IOException
     */
    public static String generate(File file) throws NoSuchAlgorithmException,IOException {
    
        MessageDigest messageDigest = MessageDigest.getInstance("MD5");
        messageDigest.update(Files.readAllBytes(file.toPath()));
        byte[] hash = messageDigest.digest();
    
        return DatatypeConverter.printHexBinary(hash).toUpperCase();
    }
    
    public static void main(String argv[]) throws NoSuchAlgorithmException, IOException {
        File file = new File("/Users/foo.bar/Documents/file.jar");          
        String hex = Checksum.generate(file);
        System.out.printf("hex=%s\n", hex);            
    }


}

Example output:

hex=B117DD0C3CBBD009AC4EF65B6D75C97B

Solution 19 - Java

If you're using ANT to build, this is dead-simple. Add the following to your build.xml:

<checksum file="${jarFile}" todir="${toDir}"/>

Where jarFile is the JAR you want to generate the MD5 against, and toDir is the directory you want to place the MD5 file.

More info here.

Solution 20 - Java

Google guava provides a new API. Find the one below :

public static HashCode hash(File file,
            HashFunction hashFunction)
                     throws IOException

Computes the hash code of the file using hashFunction.

Parameters:
    file - the file to read
    hashFunction - the hash function to use to hash the data
Returns:
    the HashCode of all of the bytes in the file
Throws:
    IOException - if an I/O error occurs
Since:
    12.0

Solution 21 - Java

public static String getMd5OfFile(String filePath)
{
    String returnVal = "";
	try 
	{
		InputStream   input   = new FileInputStream(filePath); 
		byte[]        buffer  = new byte[1024];
	    MessageDigest md5Hash = MessageDigest.getInstance("MD5");
	    int           numRead = 0;
	    while (numRead != -1)
	    {
	        numRead = input.read(buffer);
	        if (numRead > 0)
	        {
	        	md5Hash.update(buffer, 0, numRead);
	        }
	    }
	    input.close();
	    
	    byte [] md5Bytes = md5Hash.digest();
	    for (int i=0; i < md5Bytes.length; i++)
	    {
	        returnVal += Integer.toString( ( md5Bytes[i] & 0xff ) + 0x100, 16).substring( 1 );
	    }
    } 
	catch(Throwable t) {t.printStackTrace();}
    return returnVal.toUpperCase();
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionJackView Question on Stackoverflow
Solution 1 - JavaericksonView Answer on Stackoverflow
Solution 2 - JavaLeif GruenwoldtView Answer on Stackoverflow
Solution 3 - JavaBill the LizardView Answer on Stackoverflow
Solution 4 - JavaoluiesView Answer on Stackoverflow
Solution 5 - JavaassyliasView Answer on Stackoverflow
Solution 6 - JavaColinDView Answer on Stackoverflow
Solution 7 - JavaMickJView Answer on Stackoverflow
Solution 8 - JavasunilView Answer on Stackoverflow
Solution 9 - JavaBrian GianforcaroView Answer on Stackoverflow
Solution 10 - JavaJamView Answer on Stackoverflow
Solution 11 - Javauser552999View Answer on Stackoverflow
Solution 12 - JavaF.XView Answer on Stackoverflow
Solution 13 - JavaDavidView Answer on Stackoverflow
Solution 14 - JavaRavikiran kalalView Answer on Stackoverflow
Solution 15 - JavaBillView Answer on Stackoverflow
Solution 16 - JavaLukasz R.View Answer on Stackoverflow
Solution 17 - JavagotozeroView Answer on Stackoverflow
Solution 18 - Javastackoverflowuser2010View Answer on Stackoverflow
Solution 19 - JavaMatt BrockView Answer on Stackoverflow
Solution 20 - JavaBalaji Boggaram RamanarayanView Answer on Stackoverflow
Solution 21 - JavaXXXView Answer on Stackoverflow