This is a supplement to Ben Voigt's answer that shows what assembly code GCC actually generates for this scenario. I'll start by showing the generated code for Base1 and Base2 which is fairly easy to understand:
Base1* new_base1() { return new Base1(); }
Base2* new_base2() { return new Base2(); }
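The class definitions are not repeated here, so below is a minimal sketch that is consistent with the assembly in this answer. The member names (b1_data, b2_data, derivedData) come from the diagram further down and the initial values (1, 2, 999) come from the stores in the generated code. The return types, function bodies and the check_dispatch helper are my own placeholders, not the original code:

```cpp
#include <cassert>
#include <string>

// Hypothetical reconstruction: names and values are inferred from the
// assembly and the diagram; return types and bodies are placeholders.
struct Base1 {
    virtual std::string foo() { return "Base1::foo"; }
    int b1_data = 1;
};

struct Base2 {
    virtual std::string bar() { return "Base2::bar"; }
    int b2_data = 2;
};

struct Derived : Base1, Base2 {
    std::string foo() override { return "Derived::foo"; }
    std::string bar() override { return "Derived::bar"; }
    int derivedData = 999;
};

// Sanity check: virtual dispatch reaches Derived through either base pointer.
inline void check_dispatch() {
    Derived d;
    Base1* b1 = &d;
    Base2* b2 = &d;
    assert(b1->foo() == "Derived::foo");
    assert(b2->bar() == "Derived::bar");
}
```

There is deliberately no virtual destructor, since the vtables shown in this answer contain only foo and bar.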
Base1 and Base2 are constructed almost identically (code for stack manipulation, calling new, etc. omitted):
new_base1():
...
mov QWORD PTR [rax], OFFSET FLAT:vtable for Base1+16
mov DWORD PTR [rax+8], 1
...
new_base2():
...
mov QWORD PTR [rax], OFFSET FLAT:vtable for Base2+16
mov DWORD PTR [rax+8], 2
...
Each object gets a pointer to its class's vtable at offset 0 and its actual data at offset 8. Here are the vtables:
vtable for Base1:
.quad 0
.quad typeinfo for Base1
.quad Base1::foo()
vtable for Base2:
.quad 0
.quad typeinfo for Base2
.quad Base2::bar()
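The object layout described above (vtable pointer first, data right after it) can also be observed from C++ without reading assembly. This is only a sketch: Base1 here is a hypothetical stand-in with a placeholder foo, the helper names are mine, and the result depends on the ABI (it is 8 with GCC on x86-64, which is what the listing shows):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for Base1; only the layout matters here.
struct Base1 {
    virtual void foo() {}
    int b1_data = 1;
};

// Byte distance from the start of the object to b1_data.
// The pointer-to-integer conversion is implementation-defined, but on a
// flat address space (such as x86-64) the difference is the byte offset.
inline std::size_t b1_data_offset() {
    Base1 b;
    auto object_start = reinterpret_cast<std::uintptr_t>(&b);
    auto member_start = reinterpret_cast<std::uintptr_t>(&b.b1_data);
    return static_cast<std::size_t>(member_start - object_start);
}

// Self-check: the data sits directly behind one pointer-sized slot,
// which is where the compiler keeps the hidden vtable pointer.
inline void check_layout() {
    assert(b1_data_offset() == sizeof(void*));
}
```

The offset is computed from addresses rather than with offsetof because Base1 is not a standard-layout type, so offsetof is only conditionally supported for it.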
The vtable pointer stored in the object (OFFSET FLAT:vtable for Base1+16) skips the first two 8-byte entries of the vtable (the 0 and the typeinfo pointer), so it points directly at the first method. When we call a method, this pointer is dereferenced to get the function pointer:
void Base1_call_foo(Base1 * b1) { b1->foo(); }
void Base2_call_bar(Base2 * b2) { b2->bar(); }
Base1_call_foo(Base1*):
mov rax, QWORD PTR [rdi]
jmp [QWORD PTR [rax]]
Base2_call_bar(Base2*):
mov rax, QWORD PTR [rdi]
jmp [QWORD PTR [rax]]
At first this looks like a problem: both functions load the function pointer from the first slot of whatever vtable the object points to. So if we pass either of them a pointer to what is actually an instance of Derived, then however Derived's vtable is organized, it appears that both would call the same function! Obviously this doesn't happen, so let's see what Derived looks like:
Derived* new_derived() {
return new Derived();
}
new_derived():
...
mov QWORD PTR [rdx], OFFSET FLAT:vtable for Derived+16
mov DWORD PTR [rdx+8], 1
mov QWORD PTR [rdx+16], OFFSET FLAT:vtable for Derived+48
mov DWORD PTR [rdx+24], 2
mov DWORD PTR [rdx+28], 999
...
vtable for Derived:
.quad 0
.quad typeinfo for Derived
.quad Derived::foo()
.quad Derived::bar()
.quad -16
.quad typeinfo for Derived
.quad non-virtual thunk to Derived::bar()
We can see that Derived actually contains two vtable pointers: one at offset 0 that points to the Derived::foo and Derived::bar entries, and one at offset 16 that points to the "non-virtual thunk to Derived::bar" entry (this is the "trampoline" that Ben Voigt's answer mentions). These pointers are interleaved with the class data. To see how they are used, we can first look at how derived->foo() and derived->bar() are called:
void Derived_call_foo(Derived * d) { d->foo(); }
void Derived_call_bar(Derived * d) { d->bar(); }
Derived_call_foo(Derived*):
mov rax, QWORD PTR [rdi]
jmp [QWORD PTR [rax]]
Derived_call_bar(Derived*):
mov rax, QWORD PTR [rdi]
jmp [QWORD PTR [rax+8]]
They both call their respective function pointers from the first vtable. So what is the second vtable for? To answer that, we can look at what happens when we convert a Derived pointer to a Base1 or Base2 pointer:
Base1* Derived_cast_to_Base1(Derived * d) { return d; }
Base2* Derived_cast_to_Base2(Derived * d) { return d; }
Derived_cast_to_Base1(Derived*):
mov rax, rdi
ret
Derived_cast_to_Base2(Derived*):
mov rax, rdi
test rdi, rdi
je .L19
add rax, 16
.L19:
ret
Casting to Base1 is a no-op (we just copy the pointer). This works because the beginning of both the Derived object's layout and vtable match the layout of Base1: foo is the first function in the vtable and b1_data immediately follows the vtable pointer.
However, casting to Base2 instead returns a pointer 16 bytes into the object, where the second vtable pointer is located. (The test makes sure we leave a null pointer unchanged.) The vtable it points to starts with a pointer to the Derived::bar trampoline, and the vtable pointer itself is followed by b2_data, so this is compatible with a Base2 pointer! Indeed this is a sort of "Base2 component" inside Derived. Visually:
         +--------+---------+
Base1:   | vtable | b1_data |
         +---|----+---------+
             V
         +------------+
         | Base1::foo |
         +------------+

         +--------+---------+
Base2:   | vtable | b2_data |
         +---|----+---------+
             V
         +------------+
         | Base2::bar |
         +------------+

         +--------+---------+--------+---------+-------------+
Derived: | vtable | b1_data | vtable | b2_data | derivedData |
         +---|----+---------+---|----+---------+-------------+
             |                  V
             |        +--------------------+
             |        | Derived::bar thunk |
             V        +--------------------+
         +--------------+--------------+
         | Derived::foo | Derived::bar |
         +--------------+--------------+
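The pointer adjustment in the diagram can be observed directly by comparing addresses before and after the conversion. This is a sketch with hypothetical stand-in classes and helper names of my own (base1_offset, base2_offset, to_base2). The exact Base2 offset is ABI-dependent (16 with GCC on x86-64, matching the assembly above), so the check only requires it to be non-zero and positive:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins with the same shape as the classes discussed here.
struct Base1 { virtual void foo() {} int b1_data = 1; };
struct Base2 { virtual void bar() {} int b2_data = 2; };
struct Derived : Base1, Base2 {
    void foo() override {}
    void bar() override {}
    int derivedData = 999;
};

// Byte offset of the Base1 subobject inside a Derived object.
inline std::intptr_t base1_offset(Derived* d) {
    Base1* b1 = d;  // implicit upcast, no adjustment expected
    return reinterpret_cast<std::intptr_t>(b1) - reinterpret_cast<std::intptr_t>(d);
}

// Byte offset of the Base2 subobject inside a Derived object.
inline std::intptr_t base2_offset(Derived* d) {
    Base2* b2 = d;  // implicit upcast, the compiler adds the offset here
    return reinterpret_cast<std::intptr_t>(b2) - reinterpret_cast<std::intptr_t>(d);
}

// Plain upcast; a null Derived* must stay null, hence the test/je in the asm.
inline Base2* to_base2(Derived* d) { return d; }

inline void check_casts() {
    Derived d;
    assert(base1_offset(&d) == 0);                      // Base1 shares the address
    assert(base2_offset(&d) > 0);                       // Base2 lives further in
    assert(to_base2(nullptr) == nullptr);               // null is preserved
    assert(static_cast<Derived*>(to_base2(&d)) == &d);  // downcast undoes it
}
```

Note that static_cast back to Derived* subtracts the same offset again, which is the same arithmetic the trampoline performs.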
The one remaining function to look at is the trampoline:
.set .LTHUNK0,Derived::bar()
non-virtual thunk to Derived::bar():
sub rdi, 16
jmp .LTHUNK0
Here we subtract the offset of 16 that was added when converting from a Derived pointer, which gives back a pointer to the full object, and then jump to the actual Derived::bar implementation. This is safe because the vtable containing this function is only ever reached through the Base2 part of a Derived object, so the incoming pointer is always exactly 16 bytes past the start of the full object.
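To make the effect of the trampoline visible from C++, the sketch below records the this pointer that bar() receives when it is called through a Base2 pointer. The classes are hypothetical stand-ins, and last_this and bar_thunk_sketch are names I made up for illustration. bar_thunk_sketch is a hand-written approximation of the thunk, not what the compiler emits: static_cast also checks for null, which the real thunk does not need to do.

```cpp
#include <cassert>

// Hypothetical stand-ins with the same shape as the classes discussed here.
struct Base1 { virtual void foo() {} int b1_data = 1; };
struct Base2 { virtual void bar() {} int b2_data = 2; };
struct Derived : Base1, Base2 {
    void foo() override {}
    // Record the `this` pointer that bar() actually receives.
    void bar() override { last_this = this; }
    int derivedData = 999;
    static Derived* last_this;  // for observation only
};
Derived* Derived::last_this = nullptr;

// Hand-written approximation of the thunk: undo the Base2 adjustment, then
// call the real implementation non-virtually. The caller must pass a Base2*
// that really is part of a Derived, otherwise the behavior is undefined.
inline void bar_thunk_sketch(Base2* b2) {
    static_cast<Derived*>(b2)->Derived::bar();
}

inline void check_thunk() {
    Derived d;
    Base2* b2 = &d;  // adjusted pointer into the middle of d
    assert(static_cast<void*>(b2) != static_cast<void*>(&d));

    b2->bar();  // virtual call, goes through the compiler's thunk
    assert(Derived::last_this == &d);

    Derived::last_this = nullptr;
    bar_thunk_sketch(b2);  // manual equivalent
    assert(Derived::last_this == &d);
}
```

In both cases bar() sees the address of the full Derived object, even though the caller only held the adjusted Base2 pointer.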
See the full code here: https://godbolt.org/z/fdd1K4665