What are the pros and cons of fs.createReadStream vs fs.readFile in node.js?

Tags: Javascript, File, node.js, Fs

Javascript Problem Overview


I'm mucking about with node.js and have discovered two ways of reading a file and sending it down the wire, once I've established that it exists and have sent the proper MIME type with writeHead:

// read the entire file into memory and then spit it out

fs.readFile(filename, function(err, data){
  if (err) throw err;
  response.write(data, 'utf8');
  response.end();
});

// read and pass the file as a stream of chunks

fs.createReadStream(filename, {
  'flags': 'r',
  'encoding': 'binary',
  'mode': 0666,
  'bufferSize': 4 * 1024
}).addListener( "data", function(chunk) {
  response.write(chunk, 'binary');
}).addListener( "close",function() {
  response.end();
});

Am I correct in assuming that fs.createReadStream might provide a better user experience if the file in question was something large, like a video? It feels like it might be less block-ish; is this true? Are there other pros, cons, caveats, or gotchas I need to know?

Javascript Solutions


Solution 1 - Javascript

A better approach, if you are just going to hook up "data" to "write()" and "close" to "end()":

// 0.3.x style
fs.createReadStream(filename, {
  'bufferSize': 4 * 1024
}).pipe(response)

// 0.2.x style
sys.pump(fs.createReadStream(filename, {
  'bufferSize': 4 * 1024
}), response)

The read.pipe(write) or sys.pump(read, write) approach has the benefit of also adding flow control. So, if the write stream cannot accept data as quickly, it'll tell the read stream to back off, so as to minimize the amount of data getting buffered in memory.
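For reference: on modern Node the same backpressure-aware hookup is usually written with stream.pipeline, which applies the same flow control as pipe() but also forwards errors and tears down both streams on failure. A minimal sketch, assuming a hypothetical ./big_file and port 8000:

const fs = require('fs');
const http = require('http');
const { pipeline } = require('stream');

http.createServer((req, res) => {
  // pipeline() applies the same backpressure as pipe(), and additionally
  // destroys both streams and surfaces the error if either side fails.
  pipeline(fs.createReadStream('./big_file'), res, (err) => {
    if (err) console.error('stream failed:', err);
  });
}).listen(8000);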

The flags:"r" and mode:0666 are implied by the fact that it is a FileReadStream. The binary encoding is deprecated -- if an encoding is not specified, it'll just work with the raw data buffers.
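As a quick illustration (the path is just an example), with no encoding option the 'data' events hand you raw Buffers directly:

const fs = require('fs');

fs.createReadStream('./some_file').on('data', (chunk) => {
  console.log(Buffer.isBuffer(chunk)); // true: raw data, no encoding applied
});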

Also, you could add some other goodies that will make your file serving a whole lot slicker:

  1. Sniff for req.headers.range and see if it matches a string like /bytes=([0-9]*)-([0-9]*)/. If so, you want to just stream from that start to end location. (A missing number means 0 or "the end".)
  2. Hash the inode and creation time from the stat() call into an ETag header. If you get a request header with "if-none-match" matching that header, send back a 304 Not Modified.
  3. Check the if-modified-since header against the mtime date on the stat object. 304 if it wasn't modified since the date provided.

Also, in general, if you can, send a Content-Length header. (You're stat-ing the file, so you should have this.)
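A minimal sketch pulling points 1-3 and the Content-Length advice together, using modern Node APIs. The path, port, md5-over-ino/size/mtime ETag recipe, and the simplified date comparison are illustrative assumptions, not the only correct choices:

const fs = require('fs');
const http = require('http');
const crypto = require('crypto');

const filename = './big_file'; // hypothetical path

http.createServer((req, res) => {
  fs.stat(filename, (err, stat) => {
    if (err) { res.writeHead(404); return res.end(); }

    // 2. Hash inode, size, and mtime from the stat() call into an ETag.
    const etag = crypto.createHash('md5')
      .update(`${stat.ino}-${stat.size}-${stat.mtimeMs}`)
      .digest('hex');

    // 2 & 3. Conditional requests: reply 304 if the client's copy is current.
    const since = req.headers['if-modified-since'];
    if (req.headers['if-none-match'] === etag ||
        (since && stat.mtime <= new Date(since))) {
      res.writeHead(304);
      return res.end();
    }

    // 1. Range requests: a missing start means 0, a missing end means EOF.
    const match = /bytes=([0-9]*)-([0-9]*)/.exec(req.headers.range || '');
    let start = 0;
    let end = stat.size - 1;
    let status = 200;
    if (match) {
      if (match[1]) start = parseInt(match[1], 10);
      if (match[2]) end = parseInt(match[2], 10);
      status = 206; // Partial Content
    }

    const headers = {
      'Content-Length': end - start + 1, // you stat-ed the file, so you have it
      'ETag': etag,
      'Last-Modified': stat.mtime.toUTCString()
    };
    if (status === 206) {
      headers['Content-Range'] = `bytes ${start}-${end}/${stat.size}`;
    }
    res.writeHead(status, headers);

    // start/end are inclusive byte offsets understood by createReadStream.
    fs.createReadStream(filename, { start, end }).pipe(res);
  });
}).listen(8000);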

Solution 2 - Javascript

fs.readFile will load the entire file into memory, as you pointed out, whereas fs.createReadStream will read the file in chunks of the size you specify.

The client will also start receiving data sooner with fs.createReadStream, since chunks are sent out as they are read, whereas fs.readFile reads the entire file and only then starts sending it to the client. This might be negligible, but it can make a difference if the file is very big and the disks are slow.

Think about this, though: if you run these two functions on a 100 MB file, the first will use 100 MB of memory to load the file, while the latter will use at most 4 KB (the bufferSize in your example).
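If you want to pick the chunk size yourself on modern Node, the option is called highWaterMark (bufferSize was the 0.x name used in the question; fs streams default to 64 KB). A small sketch with the question's 4 KB size and an assumed file path:

const fs = require('fs');

const stream = fs.createReadStream('./big_file', { highWaterMark: 4 * 1024 });

stream.on('data', (chunk) => {
  // Each chunk is at most 4 KB, so that is roughly the peak memory held here.
  console.log(chunk.length);
});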

Edit: I really don't see any reason why you'd use fs.readFile, especially since you said you will be opening large files.

Solution 3 - Javascript

If it's a big file, "readFile" will hog memory, because it buffers the entire file content in memory, and that can hang your system. A read stream, by contrast, reads in chunks.

Run this code and observe the memory usage in the Performance tab of Task Manager.

const fs = require('fs');

// Step 1: generate a test file bigger than the largest Buffer fs.readFile
// can return (roughly 2-4 GB depending on the Node version): ten million
// copies of this line come to about 4.5 GB. The loop respects backpressure
// so the writer itself stays small in memory.
const file = fs.createWriteStream('./big_file');
const line = 'Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.\n';

let i = 0;
(function writeLines() {
  // Pause when write() returns false and resume on 'drain'.
  while (i < 10000000) {
    i++;
    if (!file.write(line)) return file.once('drain', writeLines);
  }
  file.end();
})();

// Step 2 (run separately, once the file exists): read it back in one shot.
fs.readFile('./big_file', (err, data) => {
  if (err) throw err;
  console.log("done !!");
});

In fact, you won't see the "done !!" message: "readFile" can't read the file content, because no single Buffer is big enough to hold it (on recent Node versions it fails fast with ERR_FS_FILE_TOO_LARGE).

Now, instead of "readFile", use "createReadStream" and monitor the memory usage.
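For comparison, a minimal streaming version of the same read. The handler deliberately does nothing with each chunk, which is enough to watch memory stay flat while the whole file passes through:

const fs = require('fs');

fs.createReadStream('./big_file')
  .on('data', () => { /* consume each ~64 KB chunk and drop it */ })
  .on('end', () => console.log('done !!'));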

Note: the code is based on Samer Buna's Node course on Pluralsight.

Solution 4 - Javascript

Another, perhaps not so well known, thing is that I believe Node is better at cleaning up unused memory after fs.readFile than after fs.createReadStream. You should test this to verify what works best. Also, I know this has gotten better with every new version of Node (i.e., the garbage collector has become smarter in these kinds of situations).
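If you want to measure this yourself, a small sketch using process.memoryUsage (a real Node API; the one-second interval is an arbitrary choice) that you can paste alongside either approach:

const mb = (bytes) => (bytes / 1024 / 1024).toFixed(1) + ' MB';

// Print resident set size and V8 heap once a second while the test runs.
setInterval(() => {
  const { rss, heapUsed } = process.memoryUsage();
  console.log(`rss ${mb(rss)}, heap ${mb(heapUsed)}`);
}, 1000);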

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type            | Original Author      | Original Content on Stackoverflow
Question                | Kent Brewster        | View Question on Stackoverflow
Solution 1 - Javascript | isaacs               | View Answer on Stackoverflow
Solution 2 - Javascript | Christian Joudrey    | View Answer on Stackoverflow
Solution 3 - Javascript | Deen John            | View Answer on Stackoverflow
Solution 4 - Javascript | carl-johan.blomqvist | View Answer on Stackoverflow