8

I want to read structures from binary. In C++ I would do it like this:

stream.read((char*)&someStruct, sizeof(someStruct));

Is there a similar way in C#? The BinaryReader only works for built-in types. In .NET 4 there is a MemoryMappedViewAccessor. It provides methods like Read<T>, which seem to be what I want, except that I have to keep track manually of where in the file I want to read. Is there a better way?

3

5 Answers

16
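using System.IO;
using System.Runtime.InteropServices;

public static class StreamExtensions
{
    public static T ReadStruct<T>(this Stream stream) where T : struct
    {
        var sz = Marshal.SizeOf(typeof(T));
        var buffer = new byte[sz];
        // Stream.Read may return fewer bytes than requested, so loop until the buffer is full.
        for (int read = 0; read < sz; )
        {
            int n = stream.Read(buffer, read, sz - read);
            if (n == 0) throw new EndOfStreamException();
            read += n;
        }
        // Pin the buffer so the GC cannot move it while its address is in use.
        var pinnedBuffer = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            return (T) Marshal.PtrToStructure(
                pinnedBuffer.AddrOfPinnedObject(), typeof(T));
        }
        finally { pinnedBuffer.Free(); }
    }
}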

You need to ensure your struct is declared with [StructLayout] and possibly [FieldOffset] attributes so that it matches the binary layout in the file.

EDIT:

Usage:

SomeStruct s = stream.ReadStruct<SomeStruct>();
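To illustrate the [StructLayout] and [FieldOffset] remark above, here is a minimal sketch of a struct whose field offsets are stated explicitly. The RecordHeader type, its fields, the 12-byte layout, and the LayoutDemo helper are all invented for this example, and a little-endian machine is assumed.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical 12-byte on-disk record (invented for this example):
// offset 0: uint magic, 4: ushort version, 6: ushort flags, 8: uint length
[StructLayout(LayoutKind.Explicit, Size = 12)]
public struct RecordHeader
{
    [FieldOffset(0)] public uint Magic;
    [FieldOffset(4)] public ushort Version;
    [FieldOffset(6)] public ushort Flags;
    [FieldOffset(8)] public uint Length;
}

public static class LayoutDemo
{
    // Marshals a RecordHeader out of a byte array, using the same
    // pin-and-PtrToStructure technique as ReadStruct.
    public static RecordHeader Parse(byte[] data)
    {
        int size = Marshal.SizeOf(typeof(RecordHeader));
        if (data == null || data.Length < size)
            throw new ArgumentException("Not enough bytes for a RecordHeader.");

        GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        try
        {
            return (RecordHeader)Marshal.PtrToStructure(
                handle.AddrOfPinnedObject(), typeof(RecordHeader));
        }
        finally
        {
            handle.Free();
        }
    }
}
```

Comparing Marshal.SizeOf(typeof(RecordHeader)) against the record size given in the file format specification is a cheap way to catch a layout mistake early.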

5 Comments

@jesperll: That's a really bad idea, especially if the structure is not flat. If there are pointers anywhere in the structure, then the referenced structure/class will not be written to the output. Even worse, when read back in, the pointer will refer to invalid memory.
True. You get into problems if you have stuff like arrays in your struct, since they're not value types.
But if you're using it to parse a file format where much of it is header blocks with simple types, then it's quite doable.
+1 this is perfectly viable, I'm using pretty much the same code for reading in binary structures from ASF files - @casperOne I don't think the question was asking for complex object serialization/deserialization mechanisms
Very cool! Marshal.PtrToStructure() throws on enum types (possibly because you can't use [StructLayout] on enum?). For enums you can use typeof(T).GetEnumUnderlyingType() and it works.
4

Here is a slightly modified version of Jesper's code:

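public static T? ReadStructure<T>(this Stream stream) where T : struct
{
    if (stream == null)
        return null;

    int size = Marshal.SizeOf(typeof(T));
    byte[] bytes = new byte[size];
    // Stream.Read may legitimately return fewer bytes than requested,
    // so keep reading until the buffer is full or the stream ends.
    int total = 0;
    while (total < size)
    {
        int n = stream.Read(bytes, total, size - total);
        if (n == 0) // end of stream: can't build this structure!
            return null;
        total += n;
    }

    GCHandle handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
    try
    {
        return (T)Marshal.PtrToStructure(handle.AddrOfPinnedObject(), typeof(T));
    }
    finally
    {
        handle.Free();
    }
}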

It handles the EOF case by returning a nullable type: the result is null when the stream ends before a complete structure has been read.

2

It's possible to do something similar in C#, but then you would have to apply a lot of attributes to a structure so that you control exactly how it's laid out in memory. By default the runtime decides how members are laid out: fields of classes may be rearranged, and even with the sequential layout that C# structs get by default, padding is inserted for alignment, so the in-memory layout often does not match a file format unless you specify it explicitly.

The simplest way is usually to use the BinaryReader to read the separate members of the structure in the file, and put the values in properties in a class, i.e. manually deserialise the data into a class instance.

Normally it's reading the file that is the bottleneck in this operation, so the small overhead of reading the separate members doesn't affect the performance noticeably.
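As a rough sketch of the member-by-member approach described above: the PointRecord class, its fields, and the assumed file layout (int32, float32, float32 per record) are hypothetical and only serve as an illustration. BinaryReader reads these values as little-endian.

```csharp
using System.IO;

// Hypothetical record type; field names and layout are invented for this example.
public sealed class PointRecord
{
    public int Id { get; set; }
    public float X { get; set; }
    public float Y { get; set; }

    // Reads one record, member by member, in the order the file stores them.
    public static PointRecord ReadFrom(BinaryReader reader)
    {
        var record = new PointRecord();
        record.Id = reader.ReadInt32();
        record.X = reader.ReadSingle();
        record.Y = reader.ReadSingle();
        return record;
    }

    // Reads 'count' consecutive records; the loop is explicit but short.
    public static PointRecord[] ReadArray(BinaryReader reader, int count)
    {
        var result = new PointRecord[count];
        for (int i = 0; i < count; i++)
            result[i] = ReadFrom(reader);
        return result;
    }
}
```

BinaryReader throws EndOfStreamException when the stream ends in the middle of a value, so a truncated file surfaces as an exception rather than as a half-filled object.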

3 Comments

Sounds reasonable. Performance is not the main issue here, I just thought it is a bit inconvenient.
Thinking about it a little more, I don't want to use a loop, just to read an array of something.
@B_old: It's a lot easier to write the few lines of code to read the values one at a time than to get the attributes right for all members of a structure so that it's guaranteed to be laid out in memory exactly as the file is arranged. You won't get away from using a loop in some form, whatever solution you choose.
2

Just to elaborate on Guffa's and jesperll's answers, here is a sample that reads the file header of an ASF (WMV/WMA) file using basically the same ReadStruct method (just not as an extension method):

MemoryStream ms = new MemoryStream(headerData);
AsfFileHeader asfFileHeader = ReadStruct<AsfFileHeader>(ms);


[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi, Pack = 1)]
internal struct AsfFileHeader
{
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = 16)] 
    public byte[] object_id;
    public UInt64 object_size;
    public UInt32 header_object_count;
    public byte r1;
    public byte r2;
}
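The Pack = 1 setting in the declaration above matters because, with the default packing, the marshaller pads fields to their natural alignment. A minimal sketch of the difference follows; PaddedEntry, PackedEntry, and PackDemo are hypothetical names made up for this illustration.

```csharp
using System.Runtime.InteropServices;

// Default packing: Value is aligned to a 4-byte boundary,
// so 3 padding bytes follow Tag.
[StructLayout(LayoutKind.Sequential)]
public struct PaddedEntry
{
    public byte Tag;
    public uint Value;
}

// Pack = 1: no padding, the fields are stored back to back.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct PackedEntry
{
    public byte Tag;
    public uint Value;
}

public static class PackDemo
{
    // Marshalled size with default packing: 1 + 3 (padding) + 4 = 8 bytes
    public static int PaddedSize()
    {
        return Marshal.SizeOf(typeof(PaddedEntry));
    }

    // Marshalled size with Pack = 1: 1 + 4 = 5 bytes
    public static int PackedSize()
    {
        return Marshal.SizeOf(typeof(PackedEntry));
    }
}
```

If the file stores its fields without padding, as many binary formats do, the packed declaration is the one whose size and offsets line up with the data.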

1

There is no similar way in C#. Moreover, this is a deprecated way of serializing because it is not portable. Use http://www.codeproject.com/KB/cs/objserial.aspx instead.

5 Comments

In which cases is it not portable? I'm using that C++ code to read the same data on both x86 and x64, and it seems to work fine.
If you write the data on one platform (x86, for example) and read it on another (x64), then you may get problems.
That is exactly what I don't understand, because I'm doing it. Do you maybe have a link to something explaining the issue in a little more detail?
No, I don't have a link to a comprehensive explanation, sorry. But I know that memory layout differs between platforms, and even between builds made with different compiler arguments. There is no standard that says "fields in memory must appear in the same order as they appear in source code", "long is represented by 32 bits on all platforms", or "fields are aligned to 32-bit boundaries everywhere", and there cannot be such a standard. The fact that the C++ code works on both x86 and x64 just means that you are lucky. Try playing with compiler switches, or try compiling for ARM.
It tends to be more a matter of compiler implementation than of x86 vs. x64.
