How to calculate md5 hash of a file using javascript

JavascriptMd5

Javascript Problem Overview


Is there a way to calculate the MD5 hash of a file before the upload to the server using Javascript?

Javascript Solutions


Solution 1 - Javascript

While there are JS implementations of the MD5 algorithm, older browsers are generally unable to read files from the local filesystem.

I wrote that in 2009. So what about new browsers?

With a browser that supports the FileAPI, you can read the contents of a file - the user has to have selected it, either with an <input> element or drag-and-drop. As of Jan 2013, here's how the major browsers stack up:

How?

See the answer below by Benny Neugebauer which uses the MD5 function of CryptoJS

Solution 2 - Javascript

I've made a library that implements incremental md5 in order to hash large files efficiently. Basically you read a file in chunks (to keep memory low) and hash it incrementally. You got basic usage and examples in the readme.

Be aware that you need HTML5 FileAPI, so be sure to check for it. There is a full example in the test folder.

https://github.com/satazor/SparkMD5

Solution 3 - Javascript

it is pretty easy to calculate the MD5 hash using the MD5 function of CryptoJS and the HTML5 FileReader API. The following code snippet shows how you can read the binary data and calculate the MD5 hash from an image that has been dragged into your Browser:

var holder = document.getElementById('holder');

holder.ondragover = function() {
  return false;
};

holder.ondragend = function() {
  return false;
};

holder.ondrop = function(event) {
  event.preventDefault();

  var file = event.dataTransfer.files[0];
  var reader = new FileReader();

  reader.onload = function(event) {
    var binary = event.target.result;
    var md5 = CryptoJS.MD5(binary).toString();
    console.log(md5);
  };

  reader.readAsBinaryString(file);
};

I recommend to add some CSS to see the Drag & Drop area:

#holder {
  border: 10px dashed #ccc;
  width: 300px;
  height: 300px;
}

#holder.hover {
  border: 10px dashed #333;
}

More about the Drag & Drop functionality can be found here: File API & FileReader

I tested the sample in Google Chrome Version 32.

Solution 4 - Javascript

The following snippet shows an example, which can archive a throughput of 400 MB/s while reading and hashing the file.

It is using a library called hash-wasm, which is based on WebAssembly and calculates the hash faster than js-only libraries. As of 2020, all modern browsers support WebAssembly.

const chunkSize = 64 * 1024 * 1024;
const fileReader = new FileReader();
let hasher = null;

function hashChunk(chunk) {
  return new Promise((resolve, reject) => {
    fileReader.onload = async(e) => {
      const view = new Uint8Array(e.target.result);
      hasher.update(view);
      resolve();
    };

    fileReader.readAsArrayBuffer(chunk);
  });
}

const readFile = async(file) => {
  if (hasher) {
    hasher.init();
  } else {
    hasher = await hashwasm.createMD5();
  }

  const chunkNumber = Math.floor(file.size / chunkSize);

  for (let i = 0; i <= chunkNumber; i++) {
    const chunk = file.slice(
      chunkSize * i,
      Math.min(chunkSize * (i + 1), file.size)
    );
    await hashChunk(chunk);
  }

  const hash = hasher.digest();
  return Promise.resolve(hash);
};

const fileSelector = document.getElementById("file-input");
const resultElement = document.getElementById("result");

fileSelector.addEventListener("change", async(event) => {
  const file = event.target.files[0];

  resultElement.innerHTML = "Loading...";
  const start = Date.now();
  const hash = await readFile(file);
  const end = Date.now();
  const duration = end - start;
  const fileSizeMB = file.size / 1024 / 1024;
  const throughput = fileSizeMB / (duration / 1000);
  resultElement.innerHTML = `
    Hash: ${hash}<br>
    Duration: ${duration} ms<br>
    Throughput: ${throughput.toFixed(2)} MB/s
  `;
});

<script src="https://cdn.jsdelivr.net/npm/hash-wasm"></script>
<!-- defines the global `hashwasm` variable -->

<input type="file" id="file-input">
<div id="result"></div>

Solution 5 - Javascript

HTML5 + spark-md5 and Q

Assuming your'e using a modern browser (that supports HTML5 File API), here's how you calculate the MD5 Hash of a large file (it will calculate the hash on variable chunks)

function calculateMD5Hash(file, bufferSize) {
  var def = Q.defer();

  var fileReader = new FileReader();
  var fileSlicer = File.prototype.slice || File.prototype.mozSlice || File.prototype.webkitSlice;
  var hashAlgorithm = new SparkMD5();
  var totalParts = Math.ceil(file.size / bufferSize);
  var currentPart = 0;
  var startTime = new Date().getTime();

  fileReader.onload = function(e) {
    currentPart += 1;

    def.notify({
      currentPart: currentPart,
      totalParts: totalParts
    });

    var buffer = e.target.result;
    hashAlgorithm.appendBinary(buffer);

    if (currentPart < totalParts) {
      processNextPart();
      return;
    }

    def.resolve({
      hashResult: hashAlgorithm.end(),
      duration: new Date().getTime() - startTime
    });
  };

  fileReader.onerror = function(e) {
    def.reject(e);
  };

  function processNextPart() {
    var start = currentPart * bufferSize;
    var end = Math.min(start + bufferSize, file.size);
    fileReader.readAsBinaryString(fileSlicer.call(file, start, end));
  }

  processNextPart();
  return def.promise;
}

function calculate() {

  var input = document.getElementById('file');
  if (!input.files.length) {
    return;
  }

  var file = input.files[0];
  var bufferSize = Math.pow(1024, 2) * 10; // 10MB

  calculateMD5Hash(file, bufferSize).then(
    function(result) {
      // Success
      console.log(result);
    },
    function(err) {
      // There was an error,
    },
    function(progress) {
      // We get notified of the progress as it is executed
      console.log(progress.currentPart, 'of', progress.totalParts, 'Total bytes:', progress.currentPart * bufferSize, 'of', progress.totalParts * bufferSize);
    });
}

<script src="https://cdnjs.cloudflare.com/ajax/libs/q.js/1.4.1/q.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/spark-md5/2.0.2/spark-md5.min.js"></script>

<div>
  <input type="file" id="file"/>
  <input type="button" onclick="calculate();" value="Calculate" class="btn primary" />
</div>

Solution 6 - Javascript

You need to to use FileAPI. It is available in the latest FF & Chrome, but not IE9. Grab any md5 JS implementation suggested above. I've tried this and abandoned it because JS was too slow (minutes on large image files). Might revisit it if someone rewrites MD5 using typed arrays.

Code would look something like this:

HTML:     
<input type="file" id="file-dialog" multiple="true" accept="image/*">

JS (w JQuery)

$("#file-dialog").change(function() {
  handleFiles(this.files);
});

function handleFiles(files) {
    for (var i=0; i<files.length; i++) {
        var reader = new FileReader();
        reader.onload = function() {
        var md5 = binl_md5(reader.result, reader.result.length);
            console.log("MD5 is " + md5);
        };
        reader.onerror = function() {
            console.error("Could not read the file");
        };
        reader.readAsBinaryString(files.item(i));
     }
 }

Solution 7 - Javascript

> Apart from the impossibility to get > file system access in JS, I would not > put any trust at all in a > client-generated checksum. So > generating the checksum on the server > is mandatory in any case. – Tomalak > Apr 20 '09 at 14:05

Which is useless in most cases. You want the MD5 computed at client side, so that you can compare it with the code recomputed at server side and conclude the upload went wrong if they differ. I have needed to do that in applications working with large files of scientific data, where receiving uncorrupted files were key. My cases was simple, cause users had the MD5 already computed from their data analysis tools, so I just needed to ask it to them with a text field.

Solution 8 - Javascript

To get the hash of files, there are a lot of options. Normally the problem is that it's really slow to get the hash of big files.

I created a little library that get the hash of files, with the 64kb of the start of the file and the 64kb of the end of it.

Live example: http://marcu87.github.com/hashme/ and library: https://github.com/marcu87/hashme

Solution 9 - Javascript

There is a couple scripts out there on the internet to create an MD5 Hash.

The one from webtoolkit is good, http://www.webtoolkit.info/javascript-md5.html

Although, I don't believe it will have access to the local filesystem as that access is limited.

Solution 10 - Javascript

hope you have found a good solution by now. If not, the solution below is an ES6 promise implementation based on js-spark-md5

import SparkMD5 from 'spark-md5';

// Read in chunks of 2MB
const CHUCK_SIZE = 2097152;

/**
 * Incrementally calculate checksum of a given file based on MD5 algorithm
 */
export const checksum = (file) =>
  new Promise((resolve, reject) => {
    let currentChunk = 0;
    const chunks = Math.ceil(file.size / CHUCK_SIZE);
    const blobSlice =
      File.prototype.slice ||
      File.prototype.mozSlice ||
      File.prototype.webkitSlice;
    const spark = new SparkMD5.ArrayBuffer();
    const fileReader = new FileReader();

    const loadNext = () => {
      const start = currentChunk * CHUCK_SIZE;
      const end =
        start + CHUCK_SIZE >= file.size ? file.size : start + CHUCK_SIZE;

      // Selectively read the file and only store part of it in memory.
      // This allows client-side applications to process huge files without the need for huge memory
      fileReader.readAsArrayBuffer(blobSlice.call(file, start, end));
    };

    fileReader.onload = e => {
      spark.append(e.target.result);
      currentChunk++;

      if (currentChunk < chunks) loadNext();
      else resolve(spark.end());
    };

    fileReader.onerror = () => {
      return reject('Calculating file checksum failed');
    };

    loadNext();
  });

Solution 11 - Javascript

If sha256 is also fine:

  async sha256(file: File) {
    // get byte array of file
    let buffer = await file.arrayBuffer();

    // hash the message
    const hashBuffer = await crypto.subtle.digest('SHA-256', buffer);

    // convert ArrayBuffer to Array
    const hashArray = Array.from(new Uint8Array(hashBuffer));

    // convert bytes to hex string
    const hashHex = hashArray.map(b => b.toString(16).padStart(2, '0')).join('');
    return hashHex;
  }

Solution 12 - Javascript

I don't believe there is a way in javascript to access the contents of a file upload. So you therefore cannot look at the file contents to generate an MD5 sum.

You can however send the file to the server, which can then send an MD5 sum back or send the file contents back .. but that's a lot of work and probably not worthwhile for your purposes.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionLuRsTView Question on Stackoverflow
Solution 1 - JavascriptPaul DixonView Answer on Stackoverflow
Solution 2 - JavascriptsatazorView Answer on Stackoverflow
Solution 3 - JavascriptBenny NeugebauerView Answer on Stackoverflow
Solution 4 - JavascriptBiró DaniView Answer on Stackoverflow
Solution 5 - JavascriptJossef Harush KadouriView Answer on Stackoverflow
Solution 6 - JavascriptAleksandar ToticView Answer on Stackoverflow
Solution 7 - JavascriptMarcoView Answer on Stackoverflow
Solution 8 - JavascriptMarco AntonioView Answer on Stackoverflow
Solution 9 - JavascriptbendeweyView Answer on Stackoverflow
Solution 10 - JavascriptZico DengView Answer on Stackoverflow
Solution 11 - JavascriptwutzebaerView Answer on Stackoverflow
Solution 12 - JavascriptkbosakView Answer on Stackoverflow