5

I've used structs extensively and I've seen some interesting things, especially *value instead of value->first_value, where value is a pointer to a struct and first_value is its very first member. Is *value safe?

Also note that sizes aren't guaranteed because of alignment. What is the alignment value based on: the architecture or the register size?

We align data and code for faster execution. Can we tell the compiler not to do this, so that we can guarantee certain things about structs, like their size?

When doing pointer arithmetic on struct members in order to locate a member's offset, I take it you subtract on little-endian and add on big-endian machines, or does it just depend on the compiler?

What does malloc(0) really allocate?

The following code is for educational/discovery purposes; it's not meant to be of production quality.

#include <stdlib.h>
#include <stdio.h>

int main()
{
    /* sizeof yields a size_t, which is printed with %zu */
    printf("sizeof(struct {}) == %zu;\n", sizeof(struct {}));
    printf("sizeof(struct {int a}) == %zu;\n", sizeof(struct {int a;}));
    printf("sizeof(struct {int a; double b;}) == %zu;\n", sizeof(struct {int a; double b;}));
    printf("sizeof(struct {char c; double a; double b;}) == %zu;\n", sizeof(struct {char c; double a; double b;}));

    printf("malloc(0)) returns %p\n", malloc(0));
    printf("malloc(sizeof(struct {})) returns %p\n", malloc(sizeof(struct {})));

    struct {int a; double b;} *test = malloc(sizeof(struct {int a; double b;}));
    test->a = 10;
    test->b = 12.2;
    printf("test->a == %i, *test == %i \n", test->a, *(int *)test);
    printf("test->b == %f, offset of b is %i, *(test - offset_of_b) == %f\n",
        test->b, (int)((void *)test - (void *)&test->b),
        *(double *)((void *)test - ((void *)test - (void *)&test->b))); // find the offset of b, add it to the base,$

    free(test);
    return 0;
}

Calling gcc test.c followed by ./a.out, I get this:

sizeof(struct {}) == 0;
sizeof(struct {int a}) == 4;
sizeof(struct {int a; double b;}) == 16;
sizeof(struct {char c; double a; double b;}) == 24;
malloc(0)) returns 0x100100080
malloc(sizeof(struct {})) returns 0x100100090
test->a == 10, *test == 10 
test->b == 12.200000, offset of b is -8, *(test - offset_of_b) == 12.200000

Update: this is my machine:

gcc --version

i686-apple-darwin10-gcc-4.2.1 (GCC) 4.2.1 (Apple Inc. build 5666) (dot 3)
Copyright (C) 2007 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

uname -a

Darwin MacBookPro 10.8.0 Darwin Kernel Version 10.8.0: Tue Jun  7 16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386 i386
    "Also whats malloc(0) allocating? " Arghhhhhh! One question at a time, please. The two things you have asked here have nothing to do with one another, they should be in separate questions. Commented Jun 15, 2012 at 19:44

5 Answers 5

7

From 6.2.5/20:

A structure type describes a sequentially allocated nonempty set of member objects (and, in certain circumstances, an incomplete array), each of which has an optionally specified name and possibly distinct type.

To answer:

especially *value instead of value->first_value where value is a pointer to struct, first_value is the very first member, is *value safe?

see 6.7.2.1/15:

15 Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning. [1]

There may, however, be padding bytes at the end of the structure as well as in between members.
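The quoted guarantee can be checked with a small sketch. The names struct pair and first_member_via_cast are hypothetical, chosen to mirror the struct in the question; this is an illustration under those assumptions, not code from the standard.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct, mirroring the one used in the question. */
struct pair {
    int a;
    double b;
};

/* Reads the first member through a converted struct pointer.
   This relies on the rule that a pointer to a structure, suitably
   converted, points to its initial member. */
int first_member_via_cast(struct pair *p)
{
    /* There is no padding before the first member, so the addresses match. */
    assert((void *)p == (void *)&p->a);
    return *(int *)p;
}
```

Calling first_member_via_cast on a struct pair whose a member is 10 should yield 10, the same value as p->a. The cast only works this way for the first member; later members need their offset.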

In C, malloc( 0 ) is implementation defined. (As a side note, this is one of those little things where C and C++ differ.)

[1] Emphasis mine.


7 Comments

sequentially allocated nonempty set of member objects: so is it safe for me to assume that the order of members is guaranteed? Sets aren't ordered, are they?
@samy.vilar: Yes, the order is guaranteed. Here the term set is not used in the strict mathematical sense but to indicate a grouping of typed members.
There may however be padding bytes at the end of the structure (but never in-between members). Is this really true? Why was my offset 8 bytes? Is it because it's little endian? What about data alignment: wouldn't the compiler pad each member for faster access?
Yes, this is to honor alignment constraints. Endianness and alignment are two different things. See Data Structure Alignment for details.
Yes, I know, but the -8 byte offset caught me by surprise; I was expecting 4 or -4 since the first member was an int.
|
4

I've used structs extensively and I've seen some interesting things, especially *value instead of value->first_value where value is a pointer to struct, first_value is the very first member, is *value safe?

Yes, *value is safe; it yields a copy of the structure that value points at. But it is almost guaranteed to have a different type from *value->first_value, so the result of *value will almost always be different from *value->first_value.


Counter-example:

struct something { struct something *first_value; ... };
struct something data = { ... };
struct something *value = &data;
value->first_value = value;

Under this rather limited set of circumstances, you would get the same result from *value and *value->first_value. Under that scheme, the types would be the same (even if the values are not). In the general case, *value and *value->first_value have different types.


Also note that sizes aren't guaranteed because of alignment, but is alignment always on register size?

Since 'register size' is not a defined C concept, it isn't clear what you're asking. In the absence of pragmas (#pragma pack or similar), the elements of a structure will be aligned for optimal performance when the value is read (or written).

We align data/code for faster execution; can we tell compiler not to do this? So maybe we can guarantee certain things about structs, like their size?

The compiler is in charge of the size and layout of struct types. You can influence the layout by careful design and perhaps by #pragma pack or similar directives.
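As a rough sketch of what such a directive does, the following compares a default layout with a packed one. It assumes a compiler that accepts #pragma pack(push, 1) and #pragma pack(pop), which is an extension supported by GCC, Clang, and MSVC but not part of standard C. The struct and function names are hypothetical.

```c
#include <stddef.h>

/* Default layout: the compiler may insert padding after 'c'. */
struct natural {
    char c;
    double d;
};

/* Packed layout, requested through a non-standard pragma (assumption:
   the compiler supports the push/pop form of #pragma pack). */
#pragma pack(push, 1)
struct packed {
    char c;
    double d;
};
#pragma pack(pop)

/* Size of the default layout, padding included. */
size_t natural_size(void) { return sizeof(struct natural); }

/* Size of the packed layout: the member sizes with no padding. */
size_t packed_size(void) { return sizeof(struct packed); }
```

With packing set to 1, the size is the plain sum of the member sizes, so it becomes predictable for a given compiler. The cost is that d may be misaligned, which can be slower on some architectures and may fault on others.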

These questions normally arise when people are concerned about serializing data (or, rather, trying to avoid having to serialize data by processing structure elements one at a time). Generally, I think you're better off writing a function to do the serialization, building it up from component pieces.

When doing pointer arithmetic on struct members in order to locate member offset, I take it you do subtraction if little endian, addition for big endian, or does it just depend on the compiler?

You're probably best off not doing pointer arithmetic on struct members. If you must, use the offsetof() macro from <stddef.h> to handle the offsets correctly (and that means you're not doing pointer arithmetic directly). The first structure element is always at the lowest address, regardless of big-endianness or little-endianness. Indeed, endianness has no bearing on the layout of different members within a structure; it only has an effect on the byte order of values within a (basic data type) member of a structure.
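A minimal sketch of the offsetof() approach, using a hypothetical struct pair and helper locate_b that mirror the question's struct. The offset is always added to the base address, whatever the endianness.

```c
#include <stddef.h>

/* Hypothetical struct, mirroring the one used in the question. */
struct pair {
    int a;
    double b;
};

/* Returns the address of member b, computed from the base address
   and the offset reported by offsetof(). */
double *locate_b(struct pair *p)
{
    /* Arithmetic on char * is counted in bytes, which matches the
       unit of offsetof(). The offset is added, never subtracted. */
    return (double *)((char *)p + offsetof(struct pair, b));
}
```

The result is the same pointer as &p->b, so in real code the member access is simpler; the sketch only shows where the offset comes from. Casting to char * also avoids arithmetic on void *, which is a compiler extension rather than standard C.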

The C standard requires that the elements of a structure are laid out in the order that they are defined; the first element is at the lowest address, and the next at a higher address, and so on for each element. The compiler is not allowed to change the order. There can be no padding before the first element of the structure. There can be padding after any element of the structure as the compiler sees fit to ensure what it considers appropriate alignment. The size of a structure is such that you can allocate (N × size) bytes that are appropriately aligned (e.g. via malloc()) and treat the result as an array of the structure.
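The layout rules above can be observed with a short sketch. The struct interior and both helper functions are hypothetical names; the amount of padding reported is whatever the compiler in use chooses.

```c
#include <stddef.h>

/* Hypothetical struct with a small member followed by a larger one. */
struct interior {
    char c;
    double d;
};

/* Number of padding bytes the compiler placed between c and d. */
size_t padding_after_c(void)
{
    return offsetof(struct interior, d)
         - (offsetof(struct interior, c) + sizeof(char));
}

/* Distance in bytes between two consecutive array elements. */
size_t array_stride(void)
{
    struct interior arr[2];
    return (size_t)((char *)&arr[1] - (char *)&arr[0]);
}
```

The stride always equals sizeof(struct interior), which is why any trailing padding is counted in the size: it keeps every element of an array correctly aligned.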

9 Comments

You are assuming *value->first_value is a pointer; it may or may not be. And yes, it will have a different type, so it is always cast to the appropriate type. I asked this question for educational purposes; it's always a good idea to understand how things work under the hood :) ...
If you have: struct something { struct something *first_value; ... }; and struct something data = { ... }; and struct something *value = &data; and value->first_value = value;, then you would get the same result from *value and *value->first_value. Under that scheme, the types would be the same. In the general case, *value and *value->first_value have different types.
+1; for The compiler is in charge of the size and layout of struct types. You can influence by careful design and perhaps by #pragma pack or similar directives.
btw, what about dirkgently's answer and comments? There may however be padding bytes at the end of the structure (but never in-between members). Is this really true?
The sentence without the parenthetical comment is correct. The parenthetical comment is bogus. Witness: struct InteriorPad { char c; double d; }; On most systems, there will be 7 bytes of padding between c and d.
|
3

Calling malloc(0) will return a pointer that may be safely passed to free() at least once. If the same value is returned by multiple malloc(0) calls, it may be freed once for each such call. Obviously, if it returns NULL, that could be passed to free() an unlimited number of times without effect. Every call to malloc(0) which returns non-null should be balanced by a call to free() with the returned value.
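A minimal sketch of that balancing rule. The function name malloc_zero_returns_pointer is hypothetical; whether the call yields NULL or a pointer depends on the implementation.

```c
#include <stdlib.h>

/* Returns 1 if malloc(0) yielded a non-null pointer and 0 if it
   returned NULL. Either way the result is passed to free() exactly once. */
int malloc_zero_returns_pointer(void)
{
    void *p = malloc(0);        /* implementation-defined: NULL or a pointer */
    int non_null = (p != NULL);

    /* The pointer must not be dereferenced, only freed.
       free(NULL) is a no-op, so this is safe in both cases. */
    free(p);
    return non_null;
}
```

The return value may differ between C libraries, so portable code should not depend on it.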

6 Comments

Doesn't answer the question. Even before the "what is malloc(0) allocating?" was edited out, this answer only mentions what the call is returning.
@cHao: It may allocate something, or it may not. If it returns NULL, it doesn't allocate anything, but I don't think there's any way to tell for sure if it does. Balancing every call to malloc(0) with a free() of the returned pointer will ensure that anything malloc(0) does allocate will be freed, but if the implementation of free() were such that it would harmlessly do nothing when given some particular non-null pointer which didn't represent an actual allocation, such a pointer could be returned by malloc(0) without allocating anything.
See, now that answers the question. At least the question before it was edited to remove that part, and as well as a question like that can be answered. :)
Since he answered the question, I will add it back in, thank you. I guess it just depends on how the OS handles a 0-byte allocation. Still, it's curious: assuming that malloc allocates consecutively (which we can't), it looks like it is allocating 16 bytes: malloc(0)) returns 0x100100080, malloc(sizeof(struct {})) returns 0x100100090. Though we can't say for sure what's happening here.
@samy.vilar: I believe sizeof(struct{}) is a minimum of one in any implementation that allows such, is it not?
|
2

An inner structure is guaranteed to start at the same address as the enclosing structure if it is the enclosing structure's first member.

So *value and value->first access memory at the same address (but using different types) in the following:

struct St {
  long first;
} *value;

Also, the ordering between members of the structure is guaranteed to be the same as the declaration order.

To adjust alignment, you can use compiler-specific directives or use bitfields.

The alignment of structure members is usually based on what's best for accessing the individual members on the target platform.

Also, for malloc, it is possible that it keeps some bookkeeping data near the returned address, so even for a zero-size request it can return a valid address (just don't try to access anything via the returned address).

6 Comments

Also, the ordering between memebers of the structure is guaranteed to be the same as the declaration order is this part of the C standard?
It'd pretty much have to be. As low-level as C gets, it'd wreak havoc if structs could be interpreted differently between, say, the compiler you use and the one that built your OS.
Interesting, but I always thought structs were a language construct that had nothing to do with the underlying hardware architecture. I guess they need some cohesion when moving data around between the OS and the underlying compiled code.
@samy.vilar - In C (and C++) the language constructs very often have a close relationship to hardware architectures (at least to the most common ones at the time of their creation)
@samy.vilar - as I mentioned, alignment (thus "padding") is usually hardware based. You are correct that the whole structure does not map to anything specific on the hardware, but its (ultimate primitive) members do, which have an effect on the struct itself
|
0

It is important to learn how the size of a struct is determined. For example:

struct foo {
  int i;
  char c;
};

struct bar {
  int i;
  int j;
};

struct baz {
  int i;
  char c;
  int j;
};

sizeof(struct foo) = 8 bytes (32 bit arch)
sizeof(struct bar) = 8 bytes
sizeof(struct baz) = 12 bytes

What this means is that struct sizes and offsets have to follow two rules:

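1- On typical implementations, the size of a struct is a multiple of the alignment of its most strictly aligned member (which is why foo is 8 bytes, not 5).

2- A struct element must start at an offset that is a multiple of its own alignment. (In baz, int j cannot start at byte offset 5, so the three bytes after c are wasted padding.)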
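The padding in these structs can be measured instead of assumed. The sketch below reuses foo, bar, and baz from above; the two helper functions are hypothetical names, and the values they return depend on the compiler and target. On a common 32-bit or 64-bit x86 build with a 4-byte int, both would typically be 3, but the standard does not require that.

```c
#include <stddef.h>

struct foo { int i; char c; };
struct bar { int i; int j; };
struct baz { int i; char c; int j; };

/* Padding bytes after the last member of foo. */
size_t foo_tail_padding(void)
{
    return sizeof(struct foo) - (offsetof(struct foo, c) + sizeof(char));
}

/* Padding bytes between c and j inside baz. */
size_t baz_interior_padding(void)
{
    return offsetof(struct baz, j) - (offsetof(struct baz, c) + sizeof(char));
}
```

Measuring with offsetof() and sizeof keeps the check valid on implementations that choose a different alignment.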

7 Comments

(I realize this doesn't entirely answer your question, just a mistake I see often in code)
I'm pretty sure the C standard does not mandate alignment. Meaning int j could start at the 6th byte. It's the architecture that determines whether that's legal.
Uh oh. Was my assembly language class a lie? I'm doing some reading now, will remove if this pans out false
Assembly is not C. :) And C itself isn't bound by a particular CPU's rules. But as it turns out, x86 allows unaligned values as well, in many circumstances...it's just the performance sucks with them.
I might mention that it's "commonly" done that way (which it is). It's not necessarily that way, but it's common enough to be worth mentioning.
|
