malloc vs mmap in C


C Problem Overview


I built two programs, one using malloc and the other using mmap. The program using mmap runs in much less time than the one using malloc.

I know, for example, that with mmap you avoid read/write calls to the system, and there are fewer memory accesses.

But are there any other reasons why mmap has an advantage over malloc?

Thanks a lot

C Solutions


Solution 1 - C

Look folks, contrary to common belief, mmap is indeed a memory allocation function similar to malloc.

Mapping a file is only one use of it. You can also use it as a memory allocation function by passing -1 as the file descriptor (together with the MAP_ANONYMOUS flag).
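As a rough illustration of the point above, here is a minimal sketch of mmap used purely as an allocator. The helper names anon_alloc and anon_free are made up for this example, and it assumes a Linux-like system where MAP_ANONYMOUS is available.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: get len bytes of zero-filled memory straight from
 * the kernel through an anonymous private mapping. Returns NULL on failure. */
void *anon_alloc(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: unlike free(), munmap() needs the mapping length. */
int anon_free(void *p, size_t len)
{
    return munmap(p, len);
}
```

A caller would do something like `char *buf = anon_alloc(1 << 20);` and later `anon_free(buf, 1 << 20);`. Note that mmap reports failure with MAP_FAILED, not NULL, which is why the helper translates it.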

So the common approach is to use malloc for tiny objects and mmap for large ones.

This is a good strategy.

I use alloca() for variables that only need function scope.

Solution 2 - C

I assume that you are referring to using mmap and malloc for reading data from files. In that case you pretty much got the main point:

  • using fread/fwrite you have to make many calls to the OS.
  • using mmap you appear to get access to the entire file in one operation. This is not entirely true because the OS probably maps the file one memory page at a time, but it is still a lot faster.
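To make the two approaches in the list concrete, here is a sketch that sums the bytes of a file once through a mapping and once through fread. The function names and the 4096-byte buffer size are assumptions chosen for the example, not anything from the original programs.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Sum all bytes of a file through a read-only mapping.
 * Returns -1 on error; an empty file yields 0. */
long sum_bytes_mmap(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }  /* mmap rejects length 0 */

    const unsigned char *data =
        mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (data == MAP_FAILED)
        return -1;

    long sum = 0;
    for (off_t i = 0; i < st.st_size; i++)
        sum += data[i];             /* pages are faulted in on first touch */

    munmap((void *)data, (size_t)st.st_size);
    return sum;
}

/* The same sum using buffered stdio reads, for comparison. */
long sum_bytes_fread(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    unsigned char buf[4096];
    long sum = 0;
    size_t n;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        for (size_t i = 0; i < n; i++)
            sum += buf[i];

    fclose(f);
    return sum;
}
```

Both functions should return the same value for the same file. Which one is faster depends on file size and access pattern, so it is worth timing both on your own data.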

Solution 3 - C

mmap doesn't actually load the file into memory, so it will load faster, but editing it will be slower.

Another point is that mmap doesn't use any memory up front, but it does take up address space. On a 64-bit machine, most of the address space has no memory behind it, so you can map huge files, say 5 GB, that you would not want to malloc.
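The address-space point can be tried out with an anonymous mapping as well. The following is a sketch, assuming Linux with its default overcommit settings; reserve_and_touch is a hypothetical name, and MAP_NORESERVE is used here so the kernel does not reserve swap for the region.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Reserve len bytes of address space, touch only the first and last byte,
 * then release everything. Returns 0 on success, -1 on failure. */
int reserve_and_touch(size_t len)
{
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE,
                            -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Only the pages that are touched need physical memory;
     * the rest of the range stays unbacked. */
    p[0] = 1;
    p[len - 1] = 2;
    int ok = (p[0] == 1 && p[len - 1] == 2);

    munmap(p, len);
    return ok ? 0 : -1;
}
```

Calling it with a size well above what you would comfortably malloc, for example `reserve_and_touch((size_t)1 << 30)`, should normally succeed on a 64-bit system, although strict overcommit settings or address-space limits can still make the mapping fail.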

Solution 4 - C

Both malloc and mmap are slow at times. It depends mostly on the usage pattern:

mmap: The kernel paging subsystem works in page-sized units. This means that if you read a whole page from a file and do so repeatedly (good locality), mmap will be fine. Conversely, if you map that 5 GB file and access it in a scattered pattern, the kernel will swap pages in and out a lot. In addition to the actual I/O, the page management takes time. If you are concerned about latency, avoid this access pattern: the Linux page reclaim mechanism tends to be bursty and will cause noticeable lags, and the cache pollution will slow down other processes.

malloc: It is fine when you need memory that is not in page-sized units, but you cannot sanely do things like mlock(). In terms of I/O, the speed depends very much on how you do it. fread/fwrite may map pages behind the scenes or buffer in userspace, so localized access will be rather fast. read/write go directly through the kernel, so small scattered accesses will still cause I/O due to cache misses, but the amount of data actually transferred from kernel to userspace will be slightly less. I do not know whether that is measurable.

Unless they are mlock()'ed, user pages may be swapped out or written back at any time. This takes time, too. So on systems with little memory, the variant that maps the least memory will win. With the Linux kernel, every system has too little memory, since unused pages are used for caching I/O, and the kernel may take noticeable time to make them available if memory use or I/O is bursty.
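Since mlock() works on whole pages, a page-aligned mmap region is the natural thing to lock. Here is a small sketch of that; lock_one_page is a hypothetical name, and whether the lock is granted depends on the process limits (RLIMIT_MEMLOCK) and privileges.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map one page and try to pin it in RAM.
 * Returns 1 if the page was locked, 0 if mlock() was refused
 * (for example because of RLIMIT_MEMLOCK), -1 if mmap() failed. */
int lock_one_page(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int locked = (mlock(p, page) == 0);  /* pinned pages are not swapped out */
    if (locked)
        munlock(p, page);

    munmap(p, page);
    return locked;
}
```

With malloc'ed memory the same call would also pin whatever else happens to share those pages, which is presumably what makes it awkward there.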

Solution 5 - C

mmap doesn't actually read the file. It just maps it into the address space. That's why it's so fast: there is no disk I/O until you actually access that region of the address space.

malloc is simply a mapping of address space to memory.

Solution 6 - C

> mmap does not grant RAM. It grants address space.

When the address space is accessed, a page fault occurs. During the page fault, RAM is provided one page at a time, typically 4096 bytes.

The content of that RAM is provided as well. If the address space is backed by a file, the file's content appears. If it is backed by MAP_ANONYMOUS, zero-initialized RAM appears.
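The fault-on-first-access behaviour can be observed with mincore(), which reports which pages of a range are resident. This is only a sketch, assuming Linux; touch_and_count and the 16-page region are made up for the example.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#define NPAGES 16

/* Count the pages that mincore() reports as resident. */
static long count_resident(const unsigned char *vec, size_t n)
{
    long count = 0;
    for (size_t i = 0; i < n; i++)
        count += vec[i] & 1;     /* bit 0 set means the page is in RAM */
    return count;
}

/* Map NPAGES anonymous pages, write to one of them, and report how many
 * pages were resident before and after. Returns 0 on success, -1 on error. */
int touch_and_count(long *before, long *after)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = NPAGES * page;
    unsigned char vec[NPAGES];

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    int rc = -1;
    if (mincore(p, len, vec) == 0) {
        *before = count_resident(vec, NPAGES);
        p[3 * page] = 42;        /* first write to this page triggers a fault */
        if (mincore(p, len, vec) == 0) {
            *after = count_resident(vec, NPAGES);
            rc = 0;
        }
    }
    munmap(p, len);
    return rc;
}
```

On a typical Linux system I would expect the count to go up after the write, but the exact numbers depend on the kernel and on whether transparent huge pages are in play, so treat them as something to inspect and not rely on.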

This gives two benefits. First, RAM can be initialized exactly as desired. Second, RAM is not provided until it is required.

For small requests (below the allocator's mmap threshold, which defaults to 128 KiB in glibc and can grow dynamically), malloc expands the program break. The program break cannot be contracted while addresses close to it are still in use, so freed RAM might not be returned to the kernel. An analogy: can socks be removed before shoes?

munmap returns RAM to the kernel immediately. Using mmap and munmap reduces the likelihood of swapping, whereas expanding the program break through malloc increases it.

malloc can allocate less than a page of memory, which leads to discontiguous memory. Kernel memory can fragment as well. Neither is perfect.

The kernel can defragment RAM on any idle processor, creating 2-megabyte transparent huge pages. Providing 2 MB with a single page fault, instead of 512 page faults, is a significant performance gain.

mmap has at least one notable drawback: a pipe file descriptor cannot serve as the backing for a mapping. Depending on the system, the call may fail (Linux typically reports ENODEV) or appear to succeed, but in either case the data supplied through the pipe does not appear at the mapped address.

However, with MAP_ANONYMOUS the data can be read from the pipe file descriptor into the address that mmap provides. This is less efficient, but it achieves the desired outcome. A file descriptor attached to a pipe can be identified by a failed lseek call with errno set to ESPIPE.
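The pipe workaround described above might look roughly like this. Both function names are hypothetical, and the sketch assumes the caller picks a capacity up front and later releases the buffer with munmap using that same capacity.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if fd is a pipe, FIFO or socket (lseek fails with ESPIPE),
 * 0 otherwise. */
int is_unseekable(int fd)
{
    return lseek(fd, 0, SEEK_CUR) == (off_t)-1 && errno == ESPIPE;
}

/* Read up to cap bytes from fd into a fresh anonymous mapping.
 * On success stores the mapping in *out and returns the byte count;
 * returns -1 on failure. Release with munmap(*out, cap). */
ssize_t read_into_anon(int fd, size_t cap, unsigned char **out)
{
    unsigned char *buf = mmap(NULL, cap, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    size_t total = 0;
    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            munmap(buf, cap);
            return -1;
        }
        if (n == 0)                 /* end of stream: write end was closed */
            break;
        total += (size_t)n;
    }
    *out = buf;
    return (ssize_t)total;
}
```

A program could call is_unseekable() on its input first, map regular files directly, and fall back to read_into_anon() for pipes.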

On computers that can address an entire megabyte and run a disk-based operating system, malloc is essential. If you use the C library's getline function, malloc and free will probably be used.

On an operating system with a kernel, why use malloc instead of mmap? Does mmap seem complicated compared to malloc? To invoke munmap, the size of the previously requested address space must also be supplied. Is malloc more portable? Does malloc seem more convenient?
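The inconvenience of having to pass the size to munmap can be hidden behind a thin wrapper. The following is one possible sketch, not a recommendation: mm_alloc, mm_free and the 16-byte header are assumptions made for this example.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Bytes reserved in front of the user data for the mapping length.
 * 16 keeps the returned pointer suitably aligned on common ABIs. */
#define MM_HEADER 16

/* Hypothetical malloc-like wrapper: records the mapping length in a header
 * so the caller does not have to remember it. Returns NULL on failure. */
void *mm_alloc(size_t size)
{
    if (size > (size_t)-1 - MM_HEADER)
        return NULL;                 /* size + header would overflow */

    size_t total = size + MM_HEADER;
    unsigned char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    *(size_t *)base = total;         /* base is page-aligned, so this is safe */
    return base + MM_HEADER;
}

/* Counterpart of mm_alloc(): recovers the length and unmaps. */
int mm_free(void *ptr)
{
    if (ptr == NULL)
        return 0;
    unsigned char *base = (unsigned char *)ptr - MM_HEADER;
    return munmap(base, *(size_t *)base);
}
```

Every allocation made this way occupies at least one full page, so it only makes sense for larger blocks.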

Yet if performance is desired, mmap is used.

Last but not least, with MAP_SHARED, data can be shared with child processes. Avoiding pthreads is paramount. Sometimes clone can be avoided as well.
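Sharing with a child process could be sketched as follows, assuming fork() as the way the child is created. The function name share_with_child is invented for the example, and the parent only reads the value after the child has exited, so no further synchronization is needed here.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that stores value in a shared anonymous mapping, then
 * return what the parent reads from that mapping (-1 on failure). */
int share_with_child(int value)
{
    int *shared = mmap(NULL, sizeof *shared, PROT_READ | PROT_WRITE,
                       MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;
    *shared = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, sizeof *shared);
        return -1;
    }
    if (pid == 0) {
        *shared = value;         /* visible to the parent due to MAP_SHARED */
        _exit(0);
    }

    int status;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        result = *shared;        /* read after the child has exited */

    munmap(shared, sizeof *shared);
    return result;
}
```

With MAP_PRIVATE in place of MAP_SHARED the child's write would land in its own copy of the page and the parent would still see 0.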

Although subjective, the variable allocation methods, listed from most to least preferred, are: register/stack; mmap; global; malloc. Each has its own benefits and drawbacks. A sufficiently complicated program uses three or possibly all four methods.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Peter | View Question on Stackoverflow
Solution 1 - C | user410034 | View Answer on Stackoverflow
Solution 2 - C | Anton | View Answer on Stackoverflow
Solution 3 - C | Jeffrey Aylesworth | View Answer on Stackoverflow
Solution 4 - C | prostatanus | View Answer on Stackoverflow
Solution 5 - C | joemoe | View Answer on Stackoverflow
Solution 6 - C | loquacious | View Answer on Stackoverflow