What is the difference between buffer and cache memory in Linux?

Linux, Caching, Memory, Buffer

Linux Problem Overview


To me it's not clear what the difference is between the two Linux memory concepts: buffer and cache. I've read through this post and it seems to me that the difference between them is the expiration policy:

  1. buffer's policy is first-in, first-out
  2. cache's policy is Least Recently Used.

Am I right?

In particular, I'm looking at the two commands: free and vmstat

james@utopia:~$ vmstat -S M
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
5  0      0    173     67    912    0    0    19    59   75 1087 24  4 71  1
james@utopia:~$ free -m
             total       used       free     shared    buffers     cached
Mem:          2007       1834        172          0         67        914
-/+ buffers/cache:        853       1153
Swap:         2859          0       2859
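
For reference, both free and vmstat take these numbers from /proc/meminfo. A minimal shell sketch (assuming the usual MemTotal/MemFree/Buffers/Cached fields; values are in kB) showing where the buffers/cached columns come from and how the older "-/+ buffers/cache" row is derived:

#!/bin/sh
# Reproduce free's buffers/cached columns from /proc/meminfo (values in kB).
buffers=$(awk '/^Buffers:/ {print $2}' /proc/meminfo)
cached=$(awk '/^Cached:/ {print $2}' /proc/meminfo)
memtotal=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
memfree=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)
used=$((memtotal - memfree))
echo "buffers: $((buffers / 1024)) MB, cached: $((cached / 1024)) MB"
# The "-/+ buffers/cache" row treats buffer and cache memory as reclaimable,
# so it is subtracted from "used" and added to "free".
echo "used - buffers/cache: $(( (used - buffers - cached) / 1024 )) MB"
echo "free + buffers/cache: $(( (memfree + buffers + cached) / 1024 )) MB"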

Linux Solutions


Solution 1 - Linux

> Buffers are associated with a specific block device, and cover caching of filesystem metadata as well as tracking in-flight pages. The cache only contains parked file data. That is, the buffers remember what's in directories, what file permissions are, and keep track of what memory is being written from or read to for a particular block device. The cache only contains the contents of the files themselves.

quote link

Solution 2 - Linux

Cited answer (for reference):

> Short answer: Cached is the size of the page cache. Buffers is the size of in-memory block I/O buffers. Cached matters; Buffers is largely irrelevant.
>
> Long answer: Cached is the size of the Linux page cache, minus the memory in the swap cache, which is represented by SwapCached (thus the total page cache size is Cached + SwapCached). Linux performs all file I/O through the page cache. Writes are implemented as simply marking as dirty the corresponding pages in the page cache; the flusher threads then periodically write back to disk any dirty pages. Reads are implemented by returning the data from the page cache; if the data is not yet in the cache, it is first populated. On a modern Linux system, Cached can easily be several gigabytes. It will shrink only in response to memory pressure. The system will purge the page cache along with swapping data out to disk to make available more memory as needed.
>
> Buffers are in-memory block I/O buffers. They are relatively short-lived. Prior to Linux kernel version 2.4, Linux had separate page and buffer caches. Since 2.4, the page and buffer cache are unified and Buffers is raw disk blocks not represented in the page cache (i.e., not file data). The Buffers metric is thus of minimal importance. On most systems, Buffers is often only tens of megabytes.
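
Following the answer above, the total page cache size is Cached + SwapCached from /proc/meminfo. A small sketch (field names as they appear on a typical kernel) that adds them up:

# Total page cache = Cached + SwapCached (both reported in kB).
awk '/^Cached:|^SwapCached:/ {sum += $2; print}
     END {printf "total page cache: %d MB\n", sum / 1024}' /proc/meminfo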

Solution 3 - Linux

"Buffers" represent how much portion of RAM is dedicated to cache disk blocks. "Cached" is similar like "Buffers", only this time it caches pages from file reading.

quote from:

Solution 4 - Linux

It's not quite as simple as this, but it might help to understand:

Buffer is for storing file metadata (permissions, location, etc.). Every memory page is kept track of here.

Cache is for storing actual file contents.

Solution 5 - Linux

Explained by Red Hat:

Cache Pages:

A cache is the part of memory which transparently stores data so that future requests for that data can be served faster. This memory is utilized by the kernel to cache disk data and improve I/O performance.

The Linux kernel is built to use as much RAM as it can to cache information from your local and remote filesystems and disks. As time passes and various reads and writes are performed on the system, the kernel tries to keep in memory the data used by the processes running on the system, or data that those processes are likely to use in the near future. The cache is not reclaimed when a process stops or exits; however, when other processes require more memory than is freely available, the kernel runs heuristics to reclaim memory by shrinking the cache and allocating that memory to the new process.

When any kind of file/data is requested, the kernel looks for a copy of the part of the file the user is acting on and, if no such copy exists, it allocates one new page of cache memory and fills it with the appropriate contents read from the disk.

The data that is stored within a cache might be values that have been computed earlier or duplicates of original values that are stored elsewhere on the disk. When some data is requested, the cache is first checked to see whether it contains that data. The data can be retrieved more quickly from the cache than from its original source.

SysV shared memory segments are also accounted as cache, though they do not represent any data on the disks. One can check the size of the shared memory segments using the ipcs -m command and checking the bytes column.
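
As a rough illustration of that point, the sketch below sums the bytes column of ipcs -m; the output layout assumed here is the usual util-linux one, so the column position may differ on other systems:

# Sum the "bytes" column of ipcs -m to see how much SysV shared memory
# is currently allocated (this memory is accounted under the cache).
ipcs -m | awk '$5 ~ /^[0-9]+$/ {sum += $5}
               END {printf "SysV shm total: %.1f MB\n", sum / 1048576}'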

Buffers:

Buffers are the disk-block representation of the data that is stored in the page cache. Buffers contain the metadata of the files/data that reside in the page cache. Example: when data that is present in the page cache is requested, the kernel first checks the buffers, which contain the metadata pointing to the actual files/data in the page cache. Once the actual block address of the file is known from the metadata, the kernel picks it up for processing.

Solution 6 - Linux

Buffer and cache, in short:

A buffer is something that has yet to be "written" to disk.

A cache is something that has been "read" from the disk and stored for later use.
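
To see the "yet to be written" idea in action, one rough sketch (the file path is only an example; the Dirty counter comes from /proc/meminfo):

# Write some data and watch it sit in memory as dirty pages before write-back.
dd if=/dev/zero of=/tmp/buffer-demo bs=1M count=200 2>/dev/null
grep ^Dirty /proc/meminfo   # typically shows a large Dirty value right after the write
sync                        # force the dirty pages out to disk
grep ^Dirty /proc/meminfo   # Dirty drops back toward zero
rm /tmp/buffer-demo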

Solution 7 - Linux

I think this page will help you understand the difference between buffer and cache in depth: http://www.tldp.org/LDP/sag/html/buffer-cache.html

Reading from a disk is very slow compared to accessing (real) memory. In addition, it is common to read the same part of a disk several times during relatively short periods of time. For example, one might first read an e-mail message, then read the letter into an editor when replying to it, then make the mail program read it again when copying it to a folder. Or, consider how often the command ls might be run on a system with many users. By reading the information from disk only once and then keeping it in memory until no longer needed, one can speed up all but the first read. This is called disk buffering, and the memory used for the purpose is called the buffer cache.

Since memory is, unfortunately, a finite, nay, scarce resource, the buffer cache usually cannot be big enough (it can't hold all the data one ever wants to use). When the cache fills up, the data that has been unused for the longest time is discarded and the memory thus freed is used for the new data.

Disk buffering works for writes as well. On the one hand, data that is written is often soon read again (e.g., a source code file is saved to a file, then read by the compiler), so putting data that is written in the cache is a good idea. On the other hand, by only putting the data into the cache, not writing it to disk at once, the program that writes runs quicker. The writes can then be done in the background, without slowing down the other programs.
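
A quick way to see this read-side caching is to time the same read twice; a sketch along those lines (dropping caches needs root, the file path is only an example, and on systems where /tmp is a tmpfs the effect will not show, so use a path on a real disk):

# The second read of the same file is served from the page cache, not the disk.
dd if=/dev/zero of=/tmp/cache-demo bs=1M count=500 2>/dev/null
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null   # start with a cold cache
time cat /tmp/cache-demo > /dev/null   # first read: goes to disk
time cat /tmp/cache-demo > /dev/null   # second read: served from memory, much faster
rm /tmp/cache-demo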

Solution 8 - Linux

Seth Robertson's Link 2 said: "For a thorough understanding of those terms, refer to a Linux kernel book such as Linux Kernel Development by Robert M. Love."

I found some content about 'buffer' in the 2nd edition of the book.

> Although the physical device itself is addressable at the sector level, the kernel performs all disk operations in terms of blocks.
>
> When a block is stored in memory (say, after a read or pending a write), it is stored in a 'buffer'. Each 'buffer' is associated with exactly one block. The 'buffer' serves as the object that represents a disk block in memory.
>
> A 'buffer' is the in-memory representation of a single physical disk block.
>
> Block I/O operations manipulate a single disk block at a time. A common block I/O operation is reading and writing inodes. The kernel provides the bread() function to perform a low-level read of a single block from disk. Via 'buffers', disk blocks are mapped to their associated in-memory pages.
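
Those sector and block sizes can be inspected from user space; a sketch (the device name /dev/sda is only an example, and blockdev needs root):

stat -fc 'filesystem block size: %s bytes' /   # e.g. 4096
sudo blockdev --getss /dev/sda                 # logical sector size, e.g. 512
sudo blockdev --getbsz /dev/sda                # block size used for I/O on the device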

Solution 9 - Linux

A buffer is an area of memory used to temporarily store data while it is being moved from one place to another.

A cache is a temporary storage area used to store frequently accessed data for rapid access. Once the data is stored in the cache, future uses can access the cached copy rather than re-fetch the original data, so the average access time is shorter.

Note: buffers and caches can be allocated on disk as well.

Solution 10 - Linux

Quote from the book: Introduction to Information Retrieval

Cache

> We want to keep as much data as possible in memory, especially those data that we need to access frequently. We call the technique of keeping frequently used disk data in main memory caching.

Buffer

> Operating systems generally read and write entire blocks. Thus, reading a single byte from disk can take as much time as reading the entire block. Block sizes of 8, 16, 32, and 64 kilobytes (KB) are common. We call the part of main memory where a block being read or written is stored a buffer.

Solution 11 - Linux

Buffer contains metadata, which helps improve write performance.

Cache contains the file content itself (sometimes yet to be written to disk), which improves read performance.

Solution 12 - Linux

Cache: this is space the kernel sets aside in physical RAM to store pages (the page cache). We then need some sort of index to get the addresses of pages in the cache; this is where the buffers for the page cache come in, which hold the metadata of the page cache.

Solution 13 - Linux

For starters, the general concept is helpful: a buffer is an area of memory used to temporarily store data while it is being moved from one place to another. A cache, on the other hand, is a temporary storage area used to store frequently accessed data for rapid access.

In Linux: the cache is called the page cache. It is the portion of system memory that the kernel reserves for caching filesystem disk accesses, to make overall performance faster. During Linux read system calls, the kernel checks whether the cache contains the requested blocks of data; if it does, that is a successful cache hit, and the cache returns the data without doing any I/O to the disk. The Linux approach is a write-back cache: data is first written to cache memory and marked dirty until it is synchronized to disk. The kernel maintains internal data structures to decide which data to evict when the cache needs additional space; for example, when memory usage reaches certain thresholds, background tasks start writing dirty data to disk, thereby freeing cache memory.

Reading from a disk is very slow compared to accessing (real) memory. In addition, it is common to read the same part of a disk several times during relatively short periods of time. For example, one might first read an e-mail message, then read the letter into an editor when replying to it, then make the mail program read it again when copying it to a folder. Or, consider how often the command ls might be run on a system with many users. By reading the information from disk only once and then keeping it in memory until no longer needed, one can speed up all but the first read. This is called disk buffering, and the memory used for the purpose is called the buffer cache.
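
The thresholds that drive this write-back behaviour are ordinary vm sysctls; a sketch for inspecting them along with the current amount of dirty data (defaults vary by distribution):

sysctl vm.dirty_background_ratio   # % of memory at which background write-back starts
sysctl vm.dirty_ratio              # % at which writing processes are throttled and must write out
sysctl vm.dirty_expire_centisecs   # age after which dirty data becomes eligible for write-back
grep -E '^(Dirty|Writeback):' /proc/meminfo   # current dirty and in-flight write-back amounts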

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
--- | --- | ---
Question | James.Xu | View Question on Stackoverflow
Solution 1 - Linux | xoy | View Answer on Stackoverflow
Solution 2 - Linux | socketpair | View Answer on Stackoverflow
Solution 3 - Linux | Seth Robertson | View Answer on Stackoverflow
Solution 4 - Linux | n00ber | View Answer on Stackoverflow
Solution 5 - Linux | Ijaz Ahmad | View Answer on Stackoverflow
Solution 6 - Linux | ChaiZhi | View Answer on Stackoverflow
Solution 7 - Linux | Eric | View Answer on Stackoverflow
Solution 8 - Linux | Chao Yin | View Answer on Stackoverflow
Solution 9 - Linux | Abigail | View Answer on Stackoverflow
Solution 10 - Linux | yantaq | View Answer on Stackoverflow
Solution 11 - Linux | karthik | View Answer on Stackoverflow
Solution 12 - Linux | Mayank C Koli | View Answer on Stackoverflow
Solution 13 - Linux | Hunter8 | View Answer on Stackoverflow