
I was given a floating point variable and wanted to know what its byte representation is. So I went to IDEOne and wrote a simple program to do so. However, to my surprise, it causes a runtime error:

#include <stdio.h>
#include <assert.h>

int main()
{
    // These are their sizes here. So just to prove it.
    assert(sizeof(char) == 1);
    assert(sizeof(short) == 2);
    assert(sizeof(float) == 4);

    // Little endian
    union {
        short s;
        char c[2];
    } endian;
    endian.s = 0x00FF; // would be stored as FF 00 on little
    assert((char)endian.c[0] == (char)0xFF);
    assert((char)endian.c[1] == (char)0x00);

    union {
        float f;
        char c[4];
    } var;
    var.f = 0.0003401360590942204;
    printf("%x %x %x %x", var.c[3], var.c[2], var.c[1], var.c[0]); // little endian
}

On IDEOne, it outputs:

39 ffffffb2 54 4a

along with a runtime error. Why is there a runtime error and why is the b2 actually ffffffb2? My guess with the b2 is sign extension.

  • The ffffffb2 is printed because var.c[2] is a "char", which is a signed data type. printf will sign-extend this to a 32-bit integer (because that's what varargs do). You can either declare it as unsigned char c[4] or cast it in the printf. Commented Jul 25, 2013 at 21:04
  • Have you noticed the number of significant digits is beyond the limits? Also, use the 'f' suffix on the literal to specify a float, not a double. Commented Jul 25, 2013 at 21:05
  • Why ffffffb2 instead of b2 is sign extension as you guessed. But the run-time error? I hope you post what definitely caused it. (sign v. unsigned, no \n, not return, etc.) Commented Jul 25, 2013 at 22:04
  • @notNullGothik The value 0.0003401360590942204 is converted to float when assigned to var.f. Unless you suspect this number is prone to double-rounding, there is little point in adding the f suffix to the literal. Commented Jul 25, 2013 at 23:30
  • As already commented, var.c[i] gets promoted to int when passed to printf(), because this is how variadic functions work. However, the %x format expects a corresponding unsigned int. So in addition to using an array of unsigned char (each of which will still promote to int, because that is how C works), you should call printf("%x %x %x %x", (unsigned int) var.c[3], … Commented Jul 25, 2013 at 23:33

3 Answers

6

char is a signed type on this implementation. If it's 8 bits long and you put anything greater than 127 in it, it will overflow. Signed integer overflow is undefined behavior, and so is printing a signed value using a conversion specifier that expects an unsigned one (%x expects unsigned int, but char is promoted [implicitly converted] to signed int when passed to the variadic printf() function).

Bottom line - change char c[4] to unsigned char c[4] and it will work fine.


5 Comments

C99 5.2.4.2.1 p2 says the char may be signed or unsigned. It is implementation defined. It is stated a bit more plainly in 6.3.1.1 p3: "As discussed earlier, whether a 'plain' char is treated as signed is implementation-defined."
“Signed integer overflow is undefined behavior” -> Where is there an undefined overflow in this question? Overflows in conversions to a signed type are implementation-defined. I do not see any other type of overflow.
@PascalCuoq This says it's UB. (Perhaps it's IB in C++?)
@H2CO3 “Signed integer overflow is UB” is generally true but it lacks nuance. I am not disputing that for a sentence this length, it is as accurate as it can be, but the detail is that conversions to a signed type are in fact implementation-defined. The question you linked is for arithmetic overflows such as 0x10000 * 0x10000 on a 32-bit compiler. The only overflow I see in this question is for (char)0xFF, which most implementations (signed 8-bit char, wrap-around for overflow) define as (char)-1. In C99, this is in 6.3.1.3:3.
@PascalCuoq Thanks, I'll have a look at that clause.
5

Replacing char with unsigned char in the union and adding a return 0; at the end fixes all the problems: http://ideone.com/ienG2b.

4 Comments

Why do I need return 0;? I thought the standard said it's optional and the compiler must implicitly add it?
Any explanation as to why do these?
@ColeJohnson AFAIK that's C99 (which we should use). return 0; is not implicit in C89.
The unsigned problem was perfectly explained by @H2CO3. For the return, it's the first time I'm using ideone.com and I don't know which C they are using and how to configure it. So it was just a guess. Sorry.
2

Your approach is all kinds of wrong. Here's how you print a general object's binary representation:

#include <cstdio>
#include <cstddef>

template <typename T>
void hexdump(T const & x)
{
    unsigned char const * p = reinterpret_cast<unsigned char const *>(&x);
    for (std::size_t i = 0; i != sizeof(T); ++i)
    {
        std::printf("%02X", p[i]);
    }
}

The upshot is that you can always interpret any object as a character array and thus reveal its representation.

