Test File Contents (in Hexadecimal)

00010203 04050607 08090A0B 0C0D0E0F 
10111213 14151617 18191A1B 1C1D1E1F 
20212223 24252627 28292A2B 2C2D2E2F 
30313233 34353637 38393A3B 3C3D3E3F 
40414243 44454647 48494A4B 4C4D4E4F 
50515253 54555657 58595A5B 5C5D5E5F 
60616263 64656667 68696A6B 6C6D6E6F 
70717273 74757677 78797A7B 7C7D7E7F 
80818283 84858687 88898A8B 8C8D8E8F 
90919293 94959697 98999A9B 9C9D9E9F 
A0A1A2A3 A4A5A6A7 A8A9AAAB ACADAEAF 
B0B1B2B3 B4B5B6B7 B8B9BABB BCBDBEBF 
C0C1C2C3 C4C5C6C7 C8C9CACB CCCDCECF 
D0D1D2D3 D4D5D6D7 D8D9DADB DCDDDEDF 
E0E1E2E3 E4E5E6E7 E8E9EAEB ECEDEEEF 
F0F1F2F3 F4F5F6F7 F8F9FAFB FCFDFEFF

Test Code

#open file 1
f1 = open('test.txt', 'rb')

#declare variables
address = 0

#read a byte
while(address < 256):
    byte = f1.read(1)
    print(byte)
    address = address + 1

What is Returned

b'\x00'
b'\x01'
b'\x02'
b'\x03'
b'\x04'
b'\x05'
b'\x06'
b'\x07'
b'\x08'
b'\t'
b'\n'
b'\x0b'
b'\x0c'
b'\r'
b'\x0e'
b'\x0f'
b'\x10'
b'\x11'
b'\x12'
b'\x13'
b'\x14'
b'\x15'
b'\x16'
b'\x17'
b'\x18'
b'\x19'
b'\x1a'
b'\x1b'
b'\x1c'
b'\x1d'
b'\x1e'
b'\x1f'
b' '
b'!'
b'"'
b'#'
b'$'
b'%'
b'&'
b"'"
b'('
b')'
b'*'
b'+'
b','
b'-'
b'.'
b'/'
b'0'
b'1'
b'2'
b'3'
b'4'
b'5'
b'6'
b'7'
b'8'
b'9'
b':'
b';'
b'<'
b'='
b'>'
b'?'
b'@'
b'A'
b'B'
b'C'
b'D'
b'E'
b'F'
b'G'
b'H'
b'I'
b'J'
b'K'
b'L'
b'M'
b'N'
b'O'
b'P'
b'Q'
b'R'
b'S'
b'T'
b'U'
b'V'
b'W'
b'X'
b'Y'
b'Z'
b'['
b'\\'
b']'
b'^'
b'_'
b'`'
b'a'
b'b'
b'c'
b'd'
b'e'
b'f'
b'g'
b'h'
b'i'
b'j'
b'k'
b'l'
b'm'
b'n'
b'o'
b'p'
b'q'
b'r'
b's'
b't'
b'u'
b'v'
b'w'
b'x'
b'y'
b'z'
b'{'
b'|'
b'}'
b'~'
b'\x7f'
b'\x80'
b'\x81'
b'\x82'
b'\x83'
b'\x84'
b'\x85'
b'\x86'
b'\x87'
b'\x88'
b'\x89'
b'\x8a'
b'\x8b'
b'\x8c'
b'\x8d'
b'\x8e'
b'\x8f'
b'\x90'
b'\x91'
b'\x92'
b'\x93'
b'\x94'
b'\x95'
b'\x96'
b'\x97'
b'\x98'
b'\x99'
b'\x9a'
b'\x9b'
b'\x9c'
b'\x9d'
b'\x9e'
b'\x9f'
b'\xa0'
b'\xa1'
b'\xa2'
b'\xa3'
b'\xa4'
b'\xa5'
b'\xa6'
b'\xa7'
b'\xa8'
b'\xa9'
b'\xaa'
b'\xab'
b'\xac'
b'\xad'
b'\xae'
b'\xaf'
b'\xb0'
b'\xb1'
b'\xb2'
b'\xb3'
b'\xb4'
b'\xb5'
b'\xb6'
b'\xb7'
b'\xb8'
b'\xb9'
b'\xba'
b'\xbb'
b'\xbc'
b'\xbd'
b'\xbe'
b'\xbf'
b'\xc0'
b'\xc1'
b'\xc2'
b'\xc3'
b'\xc4'
b'\xc5'
b'\xc6'
b'\xc7'
b'\xc8'
b'\xc9'
b'\xca'
b'\xcb'
b'\xcc'
b'\xcd'
b'\xce'
b'\xcf'
b'\xd0'
b'\xd1'
b'\xd2'
b'\xd3'
b'\xd4'
b'\xd5'
b'\xd6'
b'\xd7'
b'\xd8'
b'\xd9'
b'\xda'
b'\xdb'
b'\xdc'
b'\xdd'
b'\xde'
b'\xdf'
b'\xe0'
b'\xe1'
b'\xe2'
b'\xe3'
b'\xe4'
b'\xe5'
b'\xe6'
b'\xe7'
b'\xe8'
b'\xe9'
b'\xea'
b'\xeb'
b'\xec'
b'\xed'
b'\xee'
b'\xef'
b'\xf0'
b'\xf1'
b'\xf2'
b'\xf3'
b'\xf4'
b'\xf5'
b'\xf6'
b'\xf7'
b'\xf8'
b'\xf9'
b'\xfa'
b'\xfb'
b'\xfc'
b'\xfd'
b'\xfe'
b'\xff'

Running the code returns this. For my program to work correctly, I need values like b'!' to be returned as b'\x20'. What can I do to accomplish this? Thanks for your help!

  • Why would b'!' be displayed as 0x20, ever? 0x20 is a space in ASCII. 0x21 is '!'. Commented Aug 12, 2015 at 17:43
  • When you're hex editing files and want to automate long and tedious processes. In my case, hacking Super Smash Bros. Commented Aug 12, 2015 at 17:48
  • '!' == b'\x20' returns True Commented Aug 12, 2015 at 17:48
  • @Jordanbarkley: but when you come across a section with embedded text, you'd hope to see the actual text, not the series of escape codes. Python picks the latter option for the default representation. If you want a different display, format the data. Commented Aug 12, 2015 at 17:51
  • If you are specifically interested in working with the hex value for each character c, you can always do: hex(ord(c)) which in the case of c = '!' would yield the string '0x21' Commented Aug 12, 2015 at 17:52

1 Answer

The byte values are correct. Python just chooses to show you ASCII characters when possible, to aid debugging:

>>> bytes([0x21])
b'!'
>>> bytes([0x21])[0]
33

The actual byte value is still 33 decimal, 21 hexadecimal, but that byte maps to an ASCII character. Any printable ASCII codepoint will be displayed as such whenever you produce the representation (repr()) output for a bytes object, as that is far more readable. Certain characters (newline, carriage return) are displayed using their corresponding literal escape syntax, e.g. \n or \r, while only the remainder uses \xhh hex codes. Would you rather Python displays b'\x48\x65\x6c\x6c\x6f\x20\x77\x6f\x72\x6c\x64\x0a' or b'Hello world\n' when debugging code handling bytes?
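To make that point concrete, here is a small sketch showing that the two spellings in the question above denote the same bytes object; only the repr() output differs from what was typed. The variable names are just for illustration.

```python
# Two ways of writing the same 12 bytes as a literal
plain = b'Hello world\n'
escaped = b'\x48\x65\x6c\x6c\x6f\x20\x77\x6f\x72\x6c\x64\x0a'

# The objects are equal; the escapes are only a notation
print(plain == escaped)   # True

# repr() always picks the readable form, whichever way the literal was written
print(repr(escaped))      # b'Hello world\n'

# Indexing or iterating gives the underlying integer values
print(list(b'!\n\x00'))   # [33, 10, 0]
```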

If you want to display hex values, explicitly format the byte value:

print(format(byte[0], '02x'))

to display it as a 2-digit lowercase hex, or

print(format(byte[0], '#04x'))

to include a leading 0x. Use X for uppercase.

Demo:

>>> format(bytes([0x21])[0], '02x')
'21'
>>> format(bytes([0x21])[0], '#04x')
'0x21'
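Applied to the loop from the question, this could look roughly like the sketch below. The helper name iter_hex is made up for this example, and an in-memory io.BytesIO stream stands in for open('test.txt', 'rb') so the snippet runs without the test file.

```python
import io

def iter_hex(stream):
    """Yield each byte of a binary stream as a 2-digit lowercase hex string."""
    while True:
        byte = stream.read(1)
        if not byte:  # read() returns b'' at end of file
            break
        # byte is a bytes object of length 1; byte[0] is its integer value
        yield format(byte[0], '02x')

# Stand-in for open('test.txt', 'rb'): a stream holding bytes 0x00 through 0xff
f1 = io.BytesIO(bytes(range(256)))
for text in iter_hex(f1):
    print(text)
```

Stopping on an empty read, rather than counting to 256, means the same loop works for a file of any length.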

If you want to display a series of bytes, you can use the binascii.hexlify() function:

>>> from binascii import hexlify
>>> hexlify(b'Hello world\n')
b'48656c6c6f20776f726c640a'
>>> print(hexlify(b'Hello world\n').decode('ASCII'), b'Hello world\n', sep='\t')
48656c6c6f20776f726c640a    b'Hello world\n'

With a bit of formatting, you can make any binary file display in both hexadecimal and ASCII representations.
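As one possible take on that formatting, here is a minimal hex dump sketch. The function name hexdump, the 16-byte row width and the use of '.' for non-printable bytes are choices made for this example, not something prescribed by Python.

```python
def hexdump(data, width=16):
    """Return a hex + ASCII dump of a bytes object, one row per `width` bytes."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Each byte as two lowercase hex digits, separated by spaces
        hex_part = ' '.join(format(b, '02x') for b in chunk)
        # Printable ASCII (0x20-0x7e) is shown as-is; everything else as '.'
        ascii_part = ''.join(chr(b) if 0x20 <= b <= 0x7e else '.' for b in chunk)
        # Pad the hex column so the ASCII column lines up on a short last row
        lines.append('{:08x}  {:<{w}}  {}'.format(
            offset, hex_part, ascii_part, w=width * 3 - 1))
    return '\n'.join(lines)

# The 256-byte test file from the question, built in memory
print(hexdump(bytes(range(256))))
```

To dump a real file, pass in the result of reading it in binary mode, for example the bytes returned by f1.read().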

3 Comments

@muddyfish: no you can't. You can use codecs.encode(bytesobj, 'hex'), but bytes objects in Python 3 have no .encode() method. codecs.encode() with 'hex' as second argument delegates to binascii.hexlify() (binascii.hexlify() is an alias for binascii.b2a_hex()), so you may as well use it directly.
OK for Python 3, but it does work in Python 2. I just didn't notice that's what the OP was using.
@muddyfish: you'll only see the b'...' literal notation when printing in Python 3.
