
I have a simple virtual machine that I made for fun. It works at a very low level and has no notion of types; everything is just an integer. There are some instructions for obtaining a pointer and for accessing memory through a pointer. The problem is that these pointers are simply stored as uint64_t, and all pointer arithmetic is integer arithmetic. The machine casts the value to void* when using it as a pointer. This kind of code is natural at the assembly level, but the C and C++ standards don't let programmers cast between integers and pointers safely, especially when the integer is used to change the value of the pointer.

I'm not trying to make this toy VM portable everywhere. It is only expected to work on x86 machines (32-bit and 64-bit), and it does seem to work fine even with full compiler optimizations. I think that's because pointers are represented no differently from integers on the x86 architecture.

What kind of solution is usually applied in situations where the language standards don't declare certain code safe, but it really should be safe on the targeted hardware, and the results do seem okay?

Or, as a more practical question, how can I keep a compiler (gcc) from performing breaking optimizations on code like

uint64_t registers[0x100];
registers[0] = (uint64_t)malloc(8);
registers[0] += 4;
registers[2] = 0;
memcpy(&registers[2], (void*)registers[0], 4);

The above isn't real code, but a certain sequence of bytecode instructions would actually do something similar to the above.

  • What breaking optimization do you expect gcc to perform on your code? Commented Jul 2, 2015 at 10:51
  • Is the VM bytecode the same on both x86 and x86_64? Or do you just need the code to be portable between architectures, so that VMs on 64-bit computers don't need to run on 32-bit computers and vice versa? Commented Jul 2, 2015 at 11:03

1 Answer


If you really need to cast a pointer to an integer, at least use uintptr_t, for spaghetti monster's sake! This type (along with its signed counterpart, intptr_t) is meant to be cast safely to and from a pointer. It is, however, not meant to be used for arithmetic (although that might be safe on a linear memory model where the conversion does not modify the representation of either value).

Even then, your code does not seem to make sense, and it might actively hinder the compiler's ability to optimize. Without deeper knowledge of what you intend to accomplish, I would say chances are there are better ways.
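
To illustrate the uintptr_t suggestion, here is a minimal sketch of the sequence from the question, assuming a flat address space such as x86. The names vm_store32, vm_load32 and run_example are hypothetical and only serve the illustration. The integer arithmetic on the address is implementation-defined rather than guaranteed by the standard; as far as I understand, gcc accepts it as long as the resulting pointer still refers to the same object as the original one.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Register file using uintptr_t, so every slot can hold a converted pointer.
   (Assumption: the VM can live with registers that are pointer-sized.) */
uintptr_t registers[0x100];

/* Hypothetical "store 32 bits" instruction: writes value to the address held
   in registers[addr_reg]. memcpy sidesteps alignment and aliasing issues. */
void vm_store32(unsigned addr_reg, uint32_t value)
{
    memcpy((void *)registers[addr_reg], &value, sizeof value);
}

/* Hypothetical "load 32 bits" instruction: reads 4 bytes from the address
   held in registers[addr_reg] into registers[dst_reg]. */
void vm_load32(unsigned dst_reg, unsigned addr_reg)
{
    uint32_t value;
    memcpy(&value, (const void *)registers[addr_reg], sizeof value);
    registers[dst_reg] = value;
}

/* Runs the sequence from the question and returns what ends up in
   registers[2]. */
uint32_t run_example(void)
{
    void *block = malloc(8);
    assert(block != NULL);           /* sketch only: handle failure properly in real code */

    registers[0] = (uintptr_t)block; /* pointer -> integer */
    registers[0] += 4;               /* integer arithmetic, stays inside the 8-byte block */

    vm_store32(0, 0xDEADBEEFu);      /* write through the adjusted address */
    registers[2] = 0;
    vm_load32(2, 0);                 /* read the same 4 bytes back */

    uint32_t result = (uint32_t)registers[2];
    free(block);
    return result;
}
```

Calling run_example() should give back the value that was stored through the adjusted address. Going through memcpy instead of dereferencing a cast pointer is a deliberate choice here: gcc typically compiles such fixed-size copies to plain loads and stores, so it should not cost anything at higher optimization levels.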
