Python 2 and Python 3 produce different files when the same script writes bytes to a file.
I would like to know why Python 3 writes the bytes from 0x80 upward differently than Python 2, and what code changes are needed to get the same output as with Python 2.
The following Python code writes the bytes to a file.
#!/usr/bin/python
badchars = (
"\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10"
"\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20"
"\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30"
"\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f\x40"
"\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50"
"\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f\x60"
"\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70"
"\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80"
"\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90"
"\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0"
"\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0"
"\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0"
"\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0"
"\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0"
"\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0"
"\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff")
with open("output.txt", "w") as text_file:
    text_file.write(badchars)
With xxd we can see which bytes were written to output.txt. This is the file generated with Python 3:
└─$ xxd output.txt
00000000: 0102 0304 0506 0708 090a 0b0c 0d0e 0f10 ................
00000010: 1112 1314 1516 1718 191a 1b1c 1d1e 1f20 ...............
00000020: 2122 2324 2526 2728 292a 2b2c 2d2e 2f30 !"#$%&'()*+,-./0
00000030: 3132 3334 3536 3738 393a 3b3c 3d3e 3f40 123456789:;<=>?@
00000040: 4142 4344 4546 4748 494a 4b4c 4d4e 4f50 ABCDEFGHIJKLMNOP
00000050: 5152 5354 5556 5758 595a 5b5c 5d5e 5f60 QRSTUVWXYZ[\]^_`
00000060: 6162 6364 6566 6768 696a 6b6c 6d6e 6f70 abcdefghijklmnop
00000070: 7172 7374 7576 7778 797a 7b7c 7d7e 7fc2 qrstuvwxyz{|}~.. <-- difference starts here
00000080: 80c2 81c2 82c2 83c2 84c2 85c2 86c2 87c2 ................
00000090: 88c2 89c2 8ac2 8bc2 8cc2 8dc2 8ec2 8fc2 ................
000000a0: 90c2 91c2 92c2 93c2 94c2 95c2 96c2 97c2 ................
000000b0: 98c2 99c2 9ac2 9bc2 9cc2 9dc2 9ec2 9fc2 ................
000000c0: a0c2 a1c2 a2c2 a3c2 a4c2 a5c2 a6c2 a7c2 ................
000000d0: a8c2 a9c2 aac2 abc2 acc2 adc2 aec2 afc2 ................
000000e0: b0c2 b1c2 b2c2 b3c2 b4c2 b5c2 b6c2 b7c2 ................
000000f0: b8c2 b9c2 bac2 bbc2 bcc2 bdc2 bec2 bfc3 ................
00000100: 80c3 81c3 82c3 83c3 84c3 85c3 86c3 87c3 ................
00000110: 88c3 89c3 8ac3 8bc3 8cc3 8dc3 8ec3 8fc3 ................
00000120: 90c3 91c3 92c3 93c3 94c3 95c3 96c3 97c3 ................
00000130: 98c3 99c3 9ac3 9bc3 9cc3 9dc3 9ec3 9fc3 ................
00000140: a0c3 a1c3 a2c3 a3c3 a4c3 a5c3 a6c3 a7c3 ................
00000150: a8c3 a9c3 aac3 abc3 acc3 adc3 aec3 afc3 ................
00000160: b0c3 b1c3 b2c3 b3c3 b4c3 b5c3 b6c3 b7c3 ................
00000170: b8c3 b9c3 bac3 bbc3 bcc3 bdc3 bec3 bf ...............
The xxd output for output.txt generated with Python 2:
└─$ xxd output.txt
00000000: 0102 0304 0506 0708 090a 0b0c 0d0e 0f10 ................
00000010: 1112 1314 1516 1718 191a 1b1c 1d1e 1f20 ...............
00000020: 2122 2324 2526 2728 292a 2b2c 2d2e 2f30 !"#$%&'()*+,-./0
00000030: 3132 3334 3536 3738 393a 3b3c 3d3e 3f40 123456789:;<=>?@
00000040: 4142 4344 4546 4748 494a 4b4c 4d4e 4f50 ABCDEFGHIJKLMNOP
00000050: 5152 5354 5556 5758 595a 5b5c 5d5e 5f60 QRSTUVWXYZ[\]^_`
00000060: 6162 6364 6566 6768 696a 6b6c 6d6e 6f70 abcdefghijklmnop
00000070: 7172 7374 7576 7778 797a 7b7c 7d7e 7f80 qrstuvwxyz{|}~..
00000080: 8182 8384 8586 8788 898a 8b8c 8d8e 8f90 ................
00000090: 9192 9394 9596 9798 999a 9b9c 9d9e 9fa0 ................
000000a0: a1a2 a3a4 a5a6 a7a8 a9aa abac adae afb0 ................
000000b0: b1b2 b3b4 b5b6 b7b8 b9ba bbbc bdbe bfc0 ................
000000c0: c1c2 c3c4 c5c6 c7c8 c9ca cbcc cdce cfd0 ................
000000d0: d1d2 d3d4 d5d6 d7d8 d9da dbdc ddde dfe0 ................
000000e0: e1e2 e3e4 e5e6 e7e8 e9ea ebec edee eff0 ................
000000f0: f1f2 f3f4 f5f6 f7f8 f9fa fbfc fdfe ff ...............
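Use open(..., 'wb') if you are writing binary data. In Python 3 a str is Unicode text, and a file opened with 'w' encodes it on write (as UTF-8 here), so every character from \x80 upward becomes two bytes (the c2 xx and c3 xx pairs in the Python 3 dump). In binary mode you write a bytes object, for example a b"..." literal, and it reaches the file unchanged.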
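A minimal sketch of the binary-mode approach for Python 3, not a definitive rewrite of the script. It assumes the payload is exactly the byte values 0x01 through 0xff from the question, so it builds them with bytes(range(1, 256)) instead of the long literal; prefixing the original literal with b would work equally well. The two encode() checks at the end only illustrate where the extra c2/c3 bytes come from.

```python
# Python 3 only: build the 255 byte values 0x01..0xff as a bytes object, not a str.
badchars = bytes(range(1, 256))

# Binary mode ("wb") writes the bytes as-is; no text encoding is applied.
with open("output.txt", "wb") as binary_file:
    binary_file.write(badchars)

# Illustration: encoding a str is what changes the bytes.
text = "\x7f\x80\xff"
# latin-1 maps U+0000..U+00FF to single bytes one-to-one.
latin1_bytes = text.encode("latin-1")
# UTF-8 uses two bytes for every code point from U+0080 upward.
utf8_bytes = text.encode("utf-8")
```

If the data has to stay a str, opening the file with open("output.txt", "w", encoding="latin-1", newline="") is an alternative that should give the same bytes, but bytes plus 'wb' states the intent more directly.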