
So, I have this piece of code:

f = open("crash.txt", "w")
junk = ("\xCC" * 1028)
f.write(junk)
f.close()

When I run this on Windows (Python 3.5.1), I get a file of repeated "CC" bytes when viewed as hex. That is as expected.

However, running this on Linux (Python 3.4.2), I get repeated "c38c" sequences instead.

I do not understand the output on Linux. Why does this happen, and how do I fix it?

  • What do you mean by c38c as hex? Do you mean \xc3\x8c? Commented Apr 4, 2016 at 6:33
  • @Reti43 Yes. When I look at the contents of the file in a hex editor. Commented Apr 4, 2016 at 6:35

1 Answer

You aren't writing raw bytes. In Python 3, string literals are Unicode strings, and they must be encoded before they can be written to a file. By default, open() also uses text mode, and the encoding it uses is locale.getpreferredencoding(). On US Windows, that is cp1252, but on Linux, it is usually UTF-8.

b'\xc3\x8c' is '\xcc' encoded in UTF-8.

b'\xcc' is '\xcc' encoded in cp1252.
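
The two encodings above can be checked directly in an interpreter. This is a small sketch; the variable names are only illustrative, and the value printed by locale.getpreferredencoding() depends on the machine it runs on.

```python
import locale

ch = "\xcc"  # a one-character Unicode string, U+00CC

# Encode the same character with each of the two encodings discussed above.
utf8_bytes = ch.encode("utf-8")
cp1252_bytes = ch.encode("cp1252")

print(utf8_bytes)    # b'\xc3\x8c' (two bytes)
print(cp1252_bytes)  # b'\xcc' (one byte)

# The encoding that open() falls back to in text mode; platform dependent.
print(locale.getpreferredencoding())
```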

Open the file in binary mode and write byte strings instead of Unicode to write "raw" bytes.

with open("crash.txt", "wb") as f:
    junk = b"\xCC" * 1028
    f.write(junk)
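
If the file has to stay in text mode for some reason, another option (not the approach shown above, just a possible alternative) is to pass an explicit encoding so the result no longer depends on the locale. The sketch below assumes latin-1, which maps each code point from U+0000 to U+00FF to the single byte with the same value; the temporary directory is used only to keep the example self-contained.

```python
import os
import tempfile

# Write into a temporary directory so the example does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "crash.txt")

# Text mode, but with an explicit encoding instead of the locale default.
with open(path, "w", encoding="latin-1") as f:
    f.write("\xCC" * 1028)

# Read the file back in binary mode to inspect the bytes actually written.
with open(path, "rb") as f:
    data = f.read()

print(len(data), data[:4])  # 1028 b'\xcc\xcc\xcc\xcc'
```

Binary mode is still the more direct choice when the data is not text at all.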

1 Comment

@Mark Tolonen I am trying to write a bytes-type variable to a file. Any suggestions?
