10

I want to parse some data with Python and scapy. Therefore I have to analyse single bits. At the moment I have, for example, UDP packets with a payload like:

bytes = b'\x18\x00\x03\x61\xFF\xFF\x00\x05\x42\xFF\xFF\xFF\xFF'

Is there any elegant way to convert the bytes so that I can access single bits like:

bytes_as_bits = convert(bytes)
bit_at_index_42 = bytes_as_bits[42]
2
  • So.. what will be bit 8 for example? An MSB of the second byte? An LSB of it? Commented May 4, 2017 at 15:28
  • Have you tried something like ''.join(f'{byte:b}' for byte in bytes)? Commented May 4, 2017 at 15:29

6 Answers

14

That will work:

def access_bit(data, num):
    base = int(num // 8)
    shift = int(num % 8)
    return (data[base] >> shift) & 0x1

If you'd like to create a binary array you can use it like this:

[access_bit(data,i) for i in range(len(data)*8)]
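
The function above counts bits starting from the least significant bit of each byte. If bit 0 should instead be the most significant bit of the first byte (which is how protocol fields are usually drawn), a minimal sketch could look like the following. The name `access_bit_msb` is hypothetical and not part of the original answer.

```python
def access_bit_msb(data, num):
    """Return bit `num` of `data`, counting from the MSB of the first byte."""
    base, offset = divmod(num, 8)
    # Offset 0 is the most significant bit, so shift right by 7 - offset.
    return (data[base] >> (7 - offset)) & 0x1


payload = b'\x18\x00\x03\x61\xFF\xFF\x00\x05\x42\xFF\xFF\xFF\xFF'
print(access_bit_msb(payload, 3))   # first byte is 0x18 = 0b00011000
print(access_bit_msb(payload, 42))  # falls inside the sixth byte, 0xFF
```

Which convention is right depends on how the protocol you are parsing numbers its bits.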

8

If you would like a string of bits, or want to spare yourself from writing a function, I would use format() and ord(). Let me take a simpler example to illustrate:

bytes = '\xf0\x0f'
bytes_as_bits = ''.join(format(ord(byte), '08b') for byte in bytes)

This should output: '1111000000001111'

If you want LSB first you can just flip the output of format(), so:

bytes = '\xf0\x0f'
bytes_as_bits = ''.join(format(ord(byte), '08b')[::-1] for byte in bytes)

This should output: '0000111111110000'

Now you want to use b'\xf0\x0f' instead of '\xf0\x0f'. In Python 2 the code works the same, but in Python 3 iterating over a bytes object already yields integers, so you have to drop ord():

bytes = b'\xf0\x0f'
bytes_as_bits = ''.join(format(byte, '08b') for byte in bytes)

Flipping the string works the same way as before.

I found the format() functionality here. And the flipping ([::-1]) functionality here.
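
Applied to the payload from the question under Python 3, this might look like the sketch below. The variable names are illustrative, and each character of the string is '0' or '1', so it is converted with int() where a number is needed.

```python
payload = b'\x18\x00\x03\x61\xFF\xFF\x00\x05\x42\xFF\xFF\xFF\xFF'

# MSB-first: each byte becomes exactly eight characters, zero-padded.
bits_msb_first = ''.join(format(byte, '08b') for byte in payload)

# Index 42 falls in the sixth byte (0xFF).
bit_at_index_42 = int(bits_msb_first[42])
print(len(bits_msb_first), bit_at_index_42)
```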

1 Comment

This and the accepted answer have advantages over one another.
5

Hm, there is no built-in bits type in Python, but you can do something like

>>> bin(int.from_bytes(b"hello world", byteorder="big")).lstrip('0b')
'110100001100101011011000110110001101111001000000111011101101111011100100110110001100100'

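The .lstrip('0b') call strips any leading '0' and 'b' characters, which removes the '0b' prefix that bin() adds. Note that leading zero bits of the first byte are not preserved either: the output above has 87 characters, not 88, and an all-zero input yields an empty string.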
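
If string indices need to line up with bit positions in the original bytes, the result has to be zero-padded to eight characters per byte. A minimal sketch, assuming MSB-first order; `to_bit_string` is a hypothetical helper name:

```python
def to_bit_string(data):
    """Return `data` as a string of '0'/'1', eight characters per byte."""
    if not data:
        return ""
    width = len(data) * 8
    # The '0{width}b' format spec pads with leading zeros to the full width.
    return format(int.from_bytes(data, byteorder="big"), f"0{width}b")


print(to_bit_string(b"\x00\x01"))
print(len(to_bit_string(b"hello world")))
```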

3 Comments

How does your answer handle large byte arrays?
The payload doesn't seem to be large in this particular case.
Have you both even tried it before criticizing? It has 11 chars, that is 88 bits, and works just fine with much longer inputs.
3
>>> n=17
>>> [(n & (1<<x))>>x for x in [7,6,5,4,3,2,1,0]]
[0, 0, 0, 1, 0, 0, 0, 1]
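
The same shift-and-mask idea can be extended from a single integer to a whole bytes object. This is only a sketch; `bits_of` is a hypothetical name and the bits are emitted MSB-first within each byte, matching the order used above.

```python
def bits_of(data):
    """Return a flat list of bits, MSB-first within each byte."""
    return [(byte >> shift) & 1
            for byte in data
            for shift in range(7, -1, -1)]


print(bits_of(bytes([17])))
```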

Comments

0

To extend @Liran's answer I have added byteorder as an input argument, which defaults to 'big'. Note that I am not referring to bit packing within the bytes.

def access_bit(b: bytearray, n: int, byteorder: str = "big") -> int:
    """
    Returns the boolean value of the nth bit (n) from the byte array (b).
    The byteorder argument accepts the literal strings ['little', 'big'] and
    refers to the byte order endianness
    """
    base = int(n // 8)
    shift = int(n % 8)
    if byteorder == "big":
        return (b[-base - 1] >> shift) & 0x1
    elif byteorder == "little":
        return (b[base] >> shift) & 0x1
    else:
        raise KeyError("byteorder only recognises 'big' or 'little'")

access_bit(b, 0) returns the least significant bit of the least significant byte assuming big-endian

access_bit(b, 7) returns the most significant bit of the least significant byte assuming big-endian

access_bit(b, 0, 'little') returns the least significant bit of the least significant byte specifying little-endian

access_bit(b, 7, 'little') returns the most significant bit of the least significant byte specifying little-endian

Specifying an index n outside the range of the bytearray raises an IndexError (e.g. access_bit(b'\x05\x01', 16) fails because the maximum index for two bytes is 15).
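
For in-range indices, the same numbering can also be obtained by converting the bytes to one integer and shifting. The sketch below uses the hypothetical name `bit_of_int`; unlike the function above, it returns 0 for an out-of-range index instead of raising.

```python
def bit_of_int(b, n, byteorder="big"):
    """Return bit n of the integer that `b` represents in the given byte order."""
    value = int.from_bytes(b, byteorder)
    # Bit 0 is the least significant bit of the least significant byte.
    return (value >> n) & 0x1


data = b'\x05\x01'
print(bit_of_int(data, 0))            # LSB of 0x01 (big-endian: last byte)
print(bit_of_int(data, 2, 'little'))  # bit 2 of 0x05 (little-endian: first byte)
```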

Comments

0

I would just use a simple lambda expression to convert the bytes to a string:

>>> bytes = b'\x18\x00\x03\x61\xFF\xFF\x00\x05\x42\xFF\xFF\xFF\xFF'
>>> convert = lambda x: f"{int.from_bytes(x, 'big'):b}"
>>> bytes_as_bits = convert(bytes)
>>> bytes_as_bits[42]
'1'
>>> _

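'big' is the byteorder to be used. Be aware that the ':b' format drops leading zero bits, so string indices can be shifted relative to bit positions in the original bytes. The official Python documentation describes byteorder as follows: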

The byteorder argument determines the byte order used to represent the integer. If byteorder is "big", the most significant byte is at the beginning of the byte array. If byteorder is "little", the most significant byte is at the end of the byte array. To request the native byte order of the host system, use sys.byteorder as the byte order value.
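
To make the effect of byteorder concrete, here is a small sketch that converts the same two bytes both ways and pads the result to 16 characters. The variable names are illustrative only.

```python
import sys

data = b'\xf0\x0f'

# 'big': the first byte (0xf0) is the most significant one.
as_big = format(int.from_bytes(data, 'big'), '016b')
# 'little': the last byte (0x0f) is the most significant one.
as_little = format(int.from_bytes(data, 'little'), '016b')
# Native order of the host; matches one of the two above.
as_native = format(int.from_bytes(data, sys.byteorder), '016b')

print(as_big)     # 1111000000001111
print(as_little)  # 0000111111110000
```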

Comments
