
Two hours ago I thought I had a full understanding of how the stack works (at least how it's handled in C). But I've noticed some behaviour in my programs that I did not expect.

We know the stack grows towards lower memory addresses (I'm talking about PCs, in my case: Intel 64 bits, Ubuntu). So when a new stack frame is created, the objects that belong to that frame have lower memory addresses than all the previous ones. What surprised me is this: the objects within a frame have higher memory addresses the later they were declared. This shocked me quite a bit because I thought variables that were declared earlier got higher memory addresses.

Let me show exactly what I mean with an example in C.

#include <stdio.h>

void foo(void)
{
    int firstVar = 1;
    int secondVar = 2;
    /* %p expects a void *, so cast the addresses explicitly */
    printf("firstVar is at: %p\n", (void *)&firstVar);
    printf("secondVar is at: %p\n", (void *)&secondVar);
}

int main(void)
{
    int mainVar = 0;
    printf("mainVar is at: %p\n", (void *)&mainVar);
    foo();
    return 0;
}

After compiling with gcc (-g, -ansi and -pedantic flags) the output is:

mainVar is at: 0x7ffd1ec0fadc
firstVar is at: 0x7ffd1ec0fab8
secondVar is at: 0x7ffd1ec0fabc

As expected, mainVar has a higher memory address than the variables in the foo() stack frame. However, firstVar has a lower memory address than secondVar even though it was declared first. The disassembly of foo() shows the same behaviour:

0x000000000040052d <+0>:    push   %rbp
0x000000000040052e <+1>:    mov    %rsp,%rbp
0x0000000000400531 <+4>:    sub    $0x10,%rsp
0x0000000000400535 <+8>:    movl   $0x1,-0x8(%rbp)
0x000000000040053c <+15>:   movl   $0x2,-0x4(%rbp)
...

The 1 is put four bytes before the 2, showing once again that firstVar has a lower memory address than secondVar.

My question is: why is that? According to all the bibliography I've read, objects within the same stack frame should have higher memory addresses the earlier they were declared. By bibliography I mean the internet (this site, for example) and reputable books. I'm using a very standard system, so I doubt any weird ELF or ABI stuff is going on...

Any ideas? Thanks for reading through.

  • You have no control over the layout of the data. The compiler may order the variables on the stack in any way it chooses. Sometimes it will move arrays after simple variables so that array overflows won't hit the simple variables. Sometimes it won't allocate any stack storage for variables; they'll live only in registers. But there's nothing you can predict about it, except by reverse engineering the generated assembly (and that analysis could be invalid the next time you compile the code). Commented Sep 22, 2015 at 5:57
  • Notice the sub $0x10,%rsp, which allocates room on the stack. This is the space where the local integers are placed. At that point the layout is up to the compiler, and in most compilers (and in hand-written assembler) it ends up as a "top-down chart", meaning the variables appear in the order in which the source text is written. However, alignment and other considerations can lead optimizers to order this region according to complex rules, so that is not always the case. It is the sub on rsp that works in the order and direction you expect, but the region it opens is organized more "manually". Commented Sep 22, 2015 at 6:05
  • These comments are short, so one more point. If they were pushes, you'd be correct: they'd appear in reverse order, one per push. But they aren't. The region is opened by the sub on rsp, so at that point it is just a block of RAM, used in any way the compiler likes. Commented Sep 22, 2015 at 6:06
  • cdecl defines the layout for the C function calling protocol. Under cdecl the parameters are pushed in the reverse of the order in which they are declared in the function. This link goes into the details: <tenouk.com/Bufferoverflowc/Bufferoverflow2a.html>. Note: the compiler does NOT rearrange the ordering of the parameters beyond the 'reverse ordering'. Google the cdecl calling convention for details and examples. And remember that Pascal and other languages have different conventions. Commented Sep 23, 2015 at 15:12
  • Here is a link to the cdecl conventions (C calling conventions), which any compatible C/C++ compiler must implement: <msdn.microsoft.com/en-us/library/zkwh89ks.aspx>. You might also want to look at the stdcall conventions used by the WINAPI: <msdn.microsoft.com/en-us/library/zxk0tw93.aspx>. Commented Sep 23, 2015 at 15:18

2 Answers


According to all the bibliography I've read, objects within the same stack frame should have higher memory addresses the earlier they were declared

The order in which local variables are placed on the stack is by no means standardized, nor is the format of the stack frame itself. The compiler is free to allocate local variables as it pleases, since this will not affect anything outside the function. The only exception would be variables that are handed back to the caller, but that is not the case here.

One observation:

gcc no optimizations:

mainVar is at:   000000000022FE4C
firstVar is at:  000000000022FE0C
secondVar is at: 000000000022FE08

gcc -O3 full optimization:

mainVar is at:   000000000022FE4C
firstVar is at:  000000000022FE08
secondVar is at: 000000000022FE0C

For whatever reason, the optimizer thought it would be beneficial to change the order in which those two variables were allocated. To know why, you would have to study the specific compiler's optimizer in detail. And that is only mildly useful knowledge.

What you don't see here, is that the optimizer would probably have loved to put those variables in CPU registers. But no can do, since you are printing their addresses and register variables don't have addresses. By using a variable's address, you are forcing it to get allocated on the stack.

So the only important thing to learn here is that you shouldn't write code which depends on the memory layout of the stack frame nor make any assumptions about memory layout that aren't guaranteed by the C standard.

If you need a specific order, you need to shove the C standard down the compiler's throat:

#include <stdio.h>

typedef struct
{
  int firstVar;
  int secondVar;
} reorder_this_if_you_can;

void foo(void)
{
    reorder_this_if_you_can re;

    /* struct members keep their declaration order in memory */
    printf("firstVar is at:\t %p\n", (void *)&re.firstVar);
    printf("secondVar is at: %p\n", (void *)&re.secondVar);
}

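Now the order is guaranteed no matter the level of optimization, because the C standard requires struct members to be placed at increasing addresses in the order they are declared.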


4 Comments

I cannot get the compiler to reverse the order of allocation with any level of optimization. I don't really mind if the variables are not laid out the way I read they would be. What bugs me is that, using a very standard system with a very standard compiler, I cannot get the compiler to put the variables in the 'usual' order. Anyway, the lesson for me is never to assume the order of allocation is one way or the other without checking.
@mane95 "The usual order" is actually least significant address first, because that's how all other variables with other kinds of storage duration are allocated. In general memory allocation terms, down-counting stacks are kind of a special case. And although far less common, there are CPUs with up-counting stacks as well.
so very wrong. google cdecl for what is actually used for C programs
@user3629249 What is wrong? This is a platform-independent answer applying to computers in general. There is nothing called cdecl in C, that's a non-standard PC extension for calling convention. And you seem confused: calling convention has nothing to do with the order of allocation of local variables, it is only related to allocation of parameters and return values. Feel free to post a question on SO asking what cdecl does and does not, or, you know, google cdecl.

Which compiler are you using? Compilers are very complex programs. Plus, they know C better than you do ;-) (which is in my case a good thing!) In any case, they are not obliged to follow your order of statements. What are your compilation settings? Do you optimize for speed or size? I assume neither?

What may be happening is that, since you use firstVar first (in the printf call), the compiler decides to position secondVar above firstVar. The stack memory for firstVar (which becomes free again before the stack memory of secondVar) can then be reused faster and more easily, should the need arise.

What happens if you swap the first two lines in function foo?

2 Comments

I'm using gcc. I was using no specific optimization, though trying with O0, O1, O2, O3, Os, Ofast and Og did not change the order. Swapping the first two lines does swap the memory addresses. Swapping the printf calls does nothing but swap the output. In short, the variables have lower memory addresses the earlier they were declared, no matter what optimization I use.
Then I can only say you just discovered an internal parsing mechanism of gcc ;-) Switch compilers, or switch versions, and you may get different results. If you really want to ensure your variables are laid out in memory the way you want, you have to resort to pragmas, linker scripts and/or assembly - or use structs of course, which was a good catch from Lundin
