I see on the web many conflicting or unclear descriptions of the memory layout of a Linux process. The common diagram usually looks like:
And a common description would say that:
The data segment contains only global or static variable which have a predefined value and can be modified. Heap contains the dynamically allocated data that is stored in a memory section we refer that as heap section and this section typically starts where data segments ends.
And also:
The heap is, generally speaking, one specific memory region created by the C runtime, and managed by malloc (which in turn uses the brk and sbrk system calls to grow and shrink).
mmap is a way of creating new memory regions, independently of malloc (and so independently of the heap). munmap is simply its inverse, it releases these regions.
Many of those explanations seem outdated, and I find many discrepancies. For instance, many articles, like the answer above, claim that the heap is used by malloc, but malloc is actually a library call that uses either sbrk or mmap, as the malloc man page says:
Normally, malloc() allocates memory from the heap, and adjusts the size of the heap as required, using sbrk(2). When allocating blocks of memory larger than MMAP_THRESHOLD bytes, the glibc malloc() implementation allocates the memory as a private anonymous mapping using mmap(2).
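One way to see which path a given allocation takes is to compare the address malloc returns with the [heap] range in /proc/self/maps. The following is only a sketch, assuming Linux with glibc and CPython calling the C library through ctypes. heap_range and malloc_location are helper names made up for this illustration, and the 64 MiB request is chosen simply to be well above the default mmap threshold.

```python
import ctypes

# Assumption: Linux, with a libc that exports malloc/free (e.g. glibc).
libc = ctypes.CDLL(None)
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]


def heap_range():
    """Return (start, end) of the [heap] mapping, or None if it is absent."""
    try:
        with open("/proc/self/maps") as maps:
            for line in maps:
                if line.rstrip().endswith("[heap]"):
                    start, end = line.split()[0].split("-")
                    return int(start, 16), int(end, 16)
    except OSError:
        pass  # no /proc available (not Linux, or /proc not mounted)
    return None


def malloc_location(size):
    """Call malloc(size) and return (address, inside_heap)."""
    addr = libc.malloc(size)
    if not addr:
        raise MemoryError(f"malloc({size}) failed")
    heap = heap_range()
    inside = heap is not None and heap[0] <= addr < heap[1]
    libc.free(addr)
    return addr, inside


if __name__ == "__main__":
    for size in (64, 64 * 1024 * 1024):
        addr, inside = malloc_location(size)
        print(f"malloc({size}) -> {addr:#x}, inside [heap]: {inside}")
```

On a typical glibc system the small request would be expected to land inside [heap] and the large one outside it, but the result depends on the allocator in use and on how it is tuned.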
So if malloc is in many cases implemented by mmap, what's the difference between the heap and the mmap area?
Another apparent contradiction is that many articles (and the malloc man page itself) claim that brk/sbrk adjust the size of the heap, but their own man page says they actually adjust the size of the data segment:
brk() and sbrk() change the location of the program break, which defines the end of the process's data segment (i.e., the program break is the first location after the end of the uninitialized data segment).
So I'm trying to get a clear, up-to-date overall explanation of the memory layout of processes nowadays, covering the different segments, that also addresses these questions:
- What is the difference between the heap and the mmap areas? (From some tests I was attempting, by looking at the addresses I got from mmap and comparing them to the range of the heap in /proc/self/maps, it seems that some mmap-allocated pages are actually allocated inside the heap segment.)
- Does the break signify the end of the data segment, or the end of the heap?
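A test along these lines can be sketched as follows, assuming Linux and CPython with ctypes. read_maps, find_region and probe are hypothetical helper names, not part of any library. The sketch reads the current program break with sbrk(0), creates one private anonymous mapping, and looks both addresses up in /proc/self/maps.

```python
import ctypes
import mmap

# Assumption: Linux, with a libc that exports sbrk (e.g. glibc).
libc = ctypes.CDLL(None)
libc.sbrk.restype = ctypes.c_void_p
libc.sbrk.argtypes = [ctypes.c_ssize_t]


def read_maps():
    """Parse /proc/self/maps into (start, end, perms, name) tuples."""
    regions = []
    with open("/proc/self/maps") as maps:
        for line in maps:
            fields = line.split(None, 5)
            start, end = (int(x, 16) for x in fields[0].split("-"))
            name = fields[5].strip() if len(fields) > 5 else ""
            regions.append((start, end, fields[1], name))
    return regions


def find_region(regions, addr):
    """Return the region containing addr, or None."""
    for region in regions:
        if region[0] <= addr < region[1]:
            return region
    return None


def probe():
    """Return (program_break, heap_region, mmap_addr, mmap_region)."""
    m = mmap.mmap(-1, mmap.PAGESIZE,
                  flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    view = ctypes.c_char.from_buffer(m)   # exposes the mapping's address
    mmap_addr = ctypes.addressof(view)
    brk = libc.sbrk(0)                    # sbrk(0) only reads the break
    regions = read_maps()
    heap = next((r for r in regions if r[3] == "[heap]"), None)
    mmap_region = find_region(regions, mmap_addr)
    del view                              # release the buffer before closing
    m.close()
    return brk, heap, mmap_addr, mmap_region


if __name__ == "__main__":
    brk, heap, mmap_addr, mmap_region = probe()
    print(f"program break: {brk:#x}")
    if heap is not None:
        print(f"[heap]:        {heap[0]:#x}-{heap[1]:#x}")
    print(f"mmap address:  {mmap_addr:#x} in {mmap_region}")
```

Comparing the printed break with the end of the [heap] line, and checking which region the mmap address falls into, gives a concrete data point for both questions on a particular machine. It does not settle what the terminology should be.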
Other related questions:
