Memory-mapping a file that grows in size

Issue

I am writing a program to read and write a file at the same time. More specifically, all write operations are appending new data to the end of the file and all read operations are reading random positions of the file.

I am thinking of creating a memory-mapped file (using mmap) to get efficient reads while writing via append (mode a in open). However, I don't think this will work, because a memory-mapped file cannot change in size*, unless I munmap it and then mmap it again.

While "munmap and then mmap the file again" works, it has many downsides. Not only do I need to perform 2 syscalls after every write (or before every read), which hurts performance, but the base address returned by the next mmap call after munmap could also differ from the previous one. Since I am planning to have other in-memory data structures store pointers to specific offsets in this memory-mapped file, that could be very inconvenient.

Are there more elegant and efficient ways of doing this? The program will be mostly running on Linux (but solutions with portability to other POSIX systems are preferred). I have read through the following posts, but none of them seems to give a definitive answer.

How to portably extend a file accessed using mmap()

Can the OS automatically grow an mmap backed file?

Fast resize of a mmap file

My intuition is to use mmap to "reserve" the file with a size large enough to accommodate its growth, say a few hundred GiB (a very reasonable assumption in my use case), and then somehow reflect changes in the file size in this mapped memory without invalidating it with munmap. However, I am aware that accessing data beyond the real file boundary could result in a bus error, and the documentation isn't clear about whether changes in file size will be reflected.

*I am not 100% sure about this, but I couldn't find any source on how to elegantly change the size of a memory-mapped file.

Solution

After some experimentation, I found a way to make it work.

First, mmap the file with PROT_NONE and a large enough size. On 64-bit systems, it can be as large as 1L << 46 (64 TiB); the example below reserves 1L << 40 (1 TiB). This does NOT consume physical memory* (at least on Linux). It only consumes address space (virtual memory) in this process.

void* ptr = mmap(NULL, (1L << 40), PROT_NONE, MAP_SHARED, fd, 0);

Then, give read (and/or write) permission to the part of the mapping that lies within the file length using mprotect. Note that the size needs to be aligned to the page size (which can be obtained with sysconf(_SC_PAGESIZE), usually 4096).

mprotect(ptr, aligned_size, PROT_READ | PROT_WRITE);
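The two steps above can be sketched together as follows. This is only a minimal sketch, not a definitive implementation: the helper names page_align_up, open_rw_append and map_growing_file are made up for illustration. It assumes the descriptor is opened with O_RDWR | O_APPEND rather than write-only append, because mmap fails with EACCES when the descriptor is not open for reading.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Round len up to the next multiple of the system page size. */
static size_t page_align_up(size_t len) {
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    return (len + (size_t)page - 1) / (size_t)page * (size_t)page;
}

/* Hypothetical helper: open for appending AND reading, since mmap
 * rejects a descriptor that is not open for reading. */
static int open_rw_append(const char *path) {
    return open(path, O_RDWR | O_APPEND | O_CREAT, 0644);
}

/* Reserve `reserve` bytes of address space over fd with PROT_NONE, then
 * make the pages that cover the current file contents readable.
 * Stores the readable length in *accessible.
 * Returns the base address, or MAP_FAILED on error. */
static void *map_growing_file(int fd, size_t reserve, size_t *accessible) {
    struct stat st;
    if (fstat(fd, &st) != 0)
        return MAP_FAILED;

    size_t len = page_align_up((size_t)st.st_size);
    if (len > reserve)
        return MAP_FAILED; /* file is already larger than the reservation */

    void *base = mmap(NULL, reserve, PROT_NONE, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED)
        return MAP_FAILED;

    /* Only the pages backed by file data become readable. */
    if (len > 0 && mprotect(base, len, PROT_READ) != 0) {
        munmap(base, reserve);
        return MAP_FAILED;
    }
    *accessible = len;
    return base;
}
```

A caller would open the file with open_rw_append, pass the descriptor and the desired reservation (for example (size_t)1 << 40) to map_growing_file, and keep the returned base address for the lifetime of the mapping.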

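However, be careful at the boundary when the file size is not page-size aligned. Within the last, partially filled page, bytes beyond the file length read as zeros, and writes to them are not carried through to the file. Accessing a page that lies entirely beyond the end of the file will trigger a bus error (SIGBUS), as documented in the mmap manual.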
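Because the mapping itself does not stop a reader at the exact end of the file, one option is to bounds-check every read against the tracked file length. The sketch below is illustrative only: struct file_view and view_read are hypothetical names, and it assumes the application keeps the current file length alongside the base address.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view of the mapping: base address plus current file length. */
struct file_view {
    const unsigned char *base;
    size_t file_size;
};

/* Copy len bytes starting at offset off into out.
 * Returns 0 on success, -1 if the range extends past the end of the file. */
static int view_read(const struct file_view *v, size_t off, size_t len,
                     void *out) {
    assert(v != NULL && out != NULL);
    /* Compare without computing off + len, which could overflow. */
    if (off > v->file_size || len > v->file_size - off)
        return -1;
    memcpy(out, v->base + off, len);
    return 0;
}
```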

Then you can use either the file descriptor fd or the mapped memory to read and write the file. Remember to use fsync or msync to persist the data after writing to it. A memory-mapped page with PROT_READ permission should show the latest file content after a write**. A page newly made accessible with mprotect will also show the updated content.
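As a rough illustration of this workflow, the sketch below appends through the file descriptor and then widens the readable window with mprotect, so the base address never changes. The names struct growing_map, gm_open and gm_append are hypothetical. It assumes a single writer (the file length is tracked locally), omits EINTR handling and fsync, and relies on the Linux behavior described above that data written with write shows up in a MAP_SHARED mapping.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

struct growing_map {
    int fd;
    unsigned char *base; /* stable for the lifetime of the mapping */
    size_t reserved;     /* bytes of address space reserved */
    size_t accessible;   /* page-aligned bytes currently PROT_READ */
    size_t file_size;    /* bytes of real file data */
};

static size_t page_align_up(size_t len) {
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    return (len + (size_t)page - 1) / (size_t)page * (size_t)page;
}

/* Open path for read + append and reserve `reserve` bytes over it. */
static int gm_open(struct growing_map *m, const char *path, size_t reserve) {
    struct stat st;
    void *p;
    m->fd = open(path, O_RDWR | O_APPEND | O_CREAT, 0644);
    if (m->fd < 0)
        return -1;
    if (fstat(m->fd, &st) != 0 || (size_t)st.st_size > reserve)
        goto fail;
    p = mmap(NULL, reserve, PROT_NONE, MAP_SHARED, m->fd, 0);
    if (p == MAP_FAILED)
        goto fail;
    m->base = p;
    m->reserved = reserve;
    m->file_size = (size_t)st.st_size;
    m->accessible = page_align_up(m->file_size);
    if (m->accessible > 0 && mprotect(p, m->accessible, PROT_READ) != 0) {
        munmap(p, reserve);
        goto fail;
    }
    return 0;
fail:
    close(m->fd);
    return -1;
}

/* Append len bytes via write(), then expose any newly backed pages. */
static int gm_append(struct growing_map *m, const void *data, size_t len) {
    const unsigned char *src = data;
    size_t done = 0;
    if (len > m->reserved - m->file_size)
        return -1; /* would outgrow the reserved address range */
    while (done < len) { /* write() may be short, so loop */
        ssize_t n = write(m->fd, src + done, len - done);
        if (n < 0) {
            m->file_size += done; /* record what did reach the file */
            return -1;
        }
        done += (size_t)n;
    }
    m->file_size += len;
    size_t needed = page_align_up(m->file_size);
    if (needed > m->accessible) {
        if (mprotect(m->base + m->accessible, needed - m->accessible,
                     PROT_READ) != 0)
            return -1;
        m->accessible = needed;
    }
    return 0;
}
```

After gm_append returns 0, bytes in the range [0, file_size) can be read directly through base, and pointers into the mapping that were stored earlier remain valid.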

Depending on the application, you might want to use ftruncate to pad the file size to a multiple of the system page size for the best performance. You might also want to use madvise to hint the access pattern and improve read performance: MADV_SEQUENTIAL when reading pages in order, or MADV_RANDOM for random reads like those described in the question.

*This behavior is not mentioned in the mmap manual. However, since PROT_NONE implies those pages are not accessible in any way, it is trivial for an OS implementation to not allocate any physical memory for them at all.

**This behavior, where a memory region mapped before a file write reflects the new data after the write is completed (fsync or msync), is also not mentioned in the manual (or at least I did not see it). But it seems to be the case at least on recent Linux kernels (4.x onward).
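A possible shape for these two optional tweaks is sketched below. The names pad_to_page, open_padded and advise_pattern are hypothetical. Note one assumption that matters: once the file is padded, its size no longer equals the length of the real data, so the application has to track the logical length itself, and with O_APPEND later writes would land after the padding.

```c
#define _DEFAULT_SOURCE /* for madvise() on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Extend the file to the next multiple of the page size with ftruncate.
 * The added bytes read as zeros. Returns the new size, or -1 on error. */
static off_t pad_to_page(int fd) {
    struct stat st;
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    if (fstat(fd, &st) != 0)
        return -1;
    off_t padded = (st.st_size + page - 1) / page * page;
    if (padded != st.st_size && ftruncate(fd, padded) != 0)
        return -1;
    return padded;
}

/* Hypothetical helper: open (or create) a file and pad it right away. */
static int open_padded(const char *path) {
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd >= 0 && pad_to_page(fd) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hint the expected access pattern for an accessible part of the mapping.
 * Returns 0 on success, -1 on error (as madvise does). */
static int advise_pattern(void *base, size_t len, int random_access) {
    return madvise(base, len, random_access ? MADV_RANDOM : MADV_SEQUENTIAL);
}
```

The hint is advisory only; the kernel is free to ignore it, so it is worth measuring before relying on it.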

Answered By – lewisxy

This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
