Linux: When to use scatter/gather IO (readv, writev) vs a large buffer with fread


Linux Problem Overview


With scatter/gather I/O (i.e. readv and writev), Linux reads into multiple buffers and writes from multiple buffers in a single call.

Say I have a vector of 3 buffers. I can use readv, or I can use a single buffer whose size is the combined size of the 3 buffers and do an fread.

Hence, I am confused: For which cases should scatter/gather be used and when should a single large buffer be used?

Linux Solutions


Solution 1 - Linux

The main conveniences offered by readv and writev are:

  1. They allow working with noncontiguous blocks of data, i.e. the buffers need not be part of one array, but can be separately allocated.
  2. The I/O is 'atomic', i.e. if you do a writev, all the elements in the vector are written in one contiguous operation, and writes done by other processes will not occur in between them.

For example, say your data is naturally segmented and comes from different sources:

struct foo *my_foo;
struct bar *my_bar;
struct baz *my_baz;

my_foo = get_my_foo();
my_bar = get_my_bar();
my_baz = get_my_baz();

Now the three 'buffers' do not form one big contiguous block. But you want to write them contiguously into a file, for whatever reason (say, for example, they are fields in the header of a file format).

If you use write, you have to choose between:

  1. Copying them into one block of memory using, say, memcpy (overhead), followed by a single write call. Then the write will be atomic.
  2. Making three separate calls to write (overhead). Also, write calls from other processes can be interleaved between these writes (not atomic).
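To make the trade-off concrete, here is a minimal sketch of those two write-based options. The helper names write_coalesced and write_separately are hypothetical (they are not part of any library), and error handling is kept deliberately short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper for option 1: copy the three segments into one
 * temporary block, then issue a single write(). Costs a malloc and
 * three memcpy calls, but only one system call. */
ssize_t write_coalesced(int fd,
                        const void *a, size_t alen,
                        const void *b, size_t blen,
                        const void *c, size_t clen)
{
    size_t total = alen + blen + clen;
    if (total == 0)
        return 0;

    char *tmp = malloc(total);
    if (tmp == NULL)
        return -1;

    size_t off = 0;
    memcpy(tmp + off, a, alen); off += alen;
    memcpy(tmp + off, b, blen); off += blen;
    memcpy(tmp + off, c, clen); off += clen;
    assert(off == total);

    ssize_t n = write(fd, tmp, total);
    free(tmp);
    return n;
}

/* Hypothetical helper for option 2: three write() calls, no copying.
 * Another process writing to the same file may slip in between calls.
 * Returns the number of bytes written, or -1 if a write() fails. */
ssize_t write_separately(int fd,
                         const void *a, size_t alen,
                         const void *b, size_t blen,
                         const void *c, size_t clen)
{
    const void *bufs[3] = { a, b, c };
    const size_t lens[3] = { alen, blen, clen };
    ssize_t total = 0;

    for (int i = 0; i < 3; i++) {
        ssize_t n = write(fd, bufs[i], lens[i]);
        if (n < 0)
            return -1;
        total += n;
        if ((size_t)n < lens[i])
            break;              /* short write: stop and report progress */
    }
    return total;
}
```

Both helpers produce the same bytes in the file when nothing else is writing to it; they differ only in the cost (copy versus extra system calls) and in whether other writers can get in between.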

If you use writev instead, it's all good:

  1. You make exactly one system call, and no memcpy to make a single buffer from the three.
  2. Also, the three buffers are written atomically, as one block write. i.e. if other processes also write, then these writes will not come in between the writes of the three vectors.

So you would do something like:

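#include <sys/uio.h>   /* struct iovec, writev */

struct iovec iov[3];
ssize_t bytes_written;

/* One iovec entry per separately allocated buffer, in output order. */
iov[0].iov_base = my_foo;
iov[0].iov_len = sizeof (struct foo);
iov[1].iov_base = my_bar;
iov[1].iov_len = sizeof (struct bar);
iov[2].iov_base = my_baz;
iov[2].iov_len = sizeof (struct baz);

/* A single system call writes all three buffers back to back. */
bytes_written = writev (fd, iov, 3);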
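
The read side works the same way. As a rough sketch, the following pairs a writev-based writer with a readv-based reader that scatters the bytes straight back into three separate structs. The field layouts of struct foo, struct bar and struct baz, and the helper names write_header and read_header, are assumptions made up for illustration; the original only names the struct types.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical layouts standing in for the structs named above. */
struct foo { uint32_t magic; };
struct bar { uint16_t version; uint16_t flags; };
struct baz { char name[8]; };

/* Gather: write the three structs contiguously with one writev().
 * Returns 0 if every byte was written, -1 on error or short write. */
int write_header(int fd, const struct foo *f,
                 const struct bar *b, const struct baz *z)
{
    assert(f && b && z);
    struct iovec iov[3];
    /* iov_base is a plain void *, so const is cast away here;
     * writev() only reads from these buffers. */
    iov[0].iov_base = (void *)f;  iov[0].iov_len = sizeof *f;
    iov[1].iov_base = (void *)b;  iov[1].iov_len = sizeof *b;
    iov[2].iov_base = (void *)z;  iov[2].iov_len = sizeof *z;

    const ssize_t want = (ssize_t)(sizeof *f + sizeof *b + sizeof *z);
    return writev(fd, iov, 3) == want ? 0 : -1;
}

/* Scatter: one readv() fills the three structs in order, so no
 * intermediate buffer or memcpy is needed.
 * Returns 0 if every byte was read, -1 on error or short read. */
int read_header(int fd, struct foo *f, struct bar *b, struct baz *z)
{
    assert(f && b && z);
    struct iovec iov[3];
    iov[0].iov_base = f;  iov[0].iov_len = sizeof *f;
    iov[1].iov_base = b;  iov[1].iov_len = sizeof *b;
    iov[2].iov_base = z;  iov[2].iov_len = sizeof *z;

    const ssize_t want = (ssize_t)(sizeof *f + sizeof *b + sizeof *z);
    return readv(fd, iov, 3) == want ? 0 : -1;
}
```

Treating a short transfer as failure keeps the sketch small; real code may prefer to loop and retry with the remaining bytes, since both calls can transfer less than the total requested.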

Sources:

  1. http://pubs.opengroup.org/onlinepubs/009604499/functions/writev.html
  2. http://linux.die.net/man/2/readv

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type          Original Author    Original Content on Stackoverflow
Question              Jimm               View Question on Stackoverflow
Solution 1 - Linux    ArjunShankar       View Answer on Stackoverflow