6

This is similar to what I want to do: breaking a 32-bit number into individual fields

This is my typical "string" 00000000110000000000011000000000

I need to break it up into four equal parts:

00000000

11000000

00000110

00000000

I need to append the list to a new text file with the original string as a header.

I know how to split the string if there were separators such as spaces but my string is continuous.

These could be thought of as 32bit and 8bit binary numbers but they are just text in a text file (for now)!

I am brand new to programming in Python so please, I need patient details, no generalizations.

Do not assume I know anything.

Thank you,

Ralph

0

5 Answers

10

This should do what you want. See comprehensions for more details.

>>> s = "00000000110000000000011000000000"
>>> [s[i:i+8] for i in xrange(0, len(s), 8)]
['00000000', '11000000', '00000110', '00000000']

2 Comments

Thank you, but what is ['00000000', '11000000', '00000110', '00000000']?
This is a 'list' containing your strings, split at fixed positions. For example, after mylist = ['00000000', '11000000', '00000110', '00000000'], mylist[0] will give you the first element. See also here and here
3

For reference, here are a few alternatives for splitting strings into equal length parts:

>>> import re
>>> re.findall(r'.{1,8}', s, re.S)
['00000000', '11000000', '00000110', '00000000']

>>> map(''.join, zip(*[iter(s)]*8))
['00000000', '11000000', '00000110', '00000000']

The zip method for splitting a sequence into n-length groups is documented here, but it will only work for strings whose length is evenly divisible by n (which won't be an issue for this particular question); any leftover characters are silently dropped. If the string length is not evenly divisible by n you could replace zip with itertools.izip_longest(*[iter(s)]*8, fillvalue='').
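For a string whose length is not a multiple of 8, the padded variant might look like the sketch below. It assumes Python 3, where `itertools.izip_longest` is named `itertools.zip_longest`; the function name `split_fixed` is made up for illustration and is not part of any library.

```python
from itertools import zip_longest


def split_fixed(s, n):
    """Split s into pieces of length n; the last piece may be shorter."""
    # The same iterator repeated n times makes zip_longest pull n characters per group.
    groups = zip_longest(*[iter(s)] * n, fillvalue='')
    # Each group is a tuple of single characters; join it back into a string.
    return [''.join(group) for group in groups]


print(split_fixed("00000000110000000000011000000000", 8))
print(split_fixed("0000000011", 8))
```

The empty-string fill value means the padding disappears when the group is joined, so a short final piece comes out as-is instead of being dropped.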

1 Comment

If I have ['00000000', '11000000', '00000110', '00000000'] Why do I need to ask this question? I do not understand what is being said when you use ['00000000', '11000000', '00000110', '00000000']? The character makeup will be unknown until the line from the file is parsed. Or is ['00000000', '11000000', '00000110', '00000000'] the expected output? Thanks Ralph
3

+1 for Robert's answer. As for 'I need to append the list to a new text file with the original string as a header':

s = "00000000110000000000011000000000"
s += '\n' + '\n'.join(s[i:i+8] for i in xrange(0, len(s), 8))

will give

'00000000110000000000011000000000\n00000000\n11000000\n00000110\n00000000'

thus putting each 'byte' on a separate line as I understood from your question...

Edit: some notes to help you understand: A list [] (see here) contains your data, in this case, strings, between its brackets. The first item in a list is retrieved as in:

mylist[0]

In Python, a string is itself also an object, with specific methods that you can call. So '\n' (representing a newline character) is an object of type 'string', and you can call its method join() with your list as argument:

'\n'.join(mylist)

The elements in the list are then 'joined' together with the string '\n' in between each element. The result is no longer a list, but a string. Two strings can be added together, thus

s += '\n' + '\n'.join(mylist)

adds to s (which was already a string), the right part which is itself a 'sum' of strings. (I hope that clears some things up?)
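To get that combined string into a text file, something along these lines may work. It is only a sketch: it assumes Python 3 (`range` instead of `xrange`), and the function name `append_with_header` and the file name `output.txt` are placeholders chosen for illustration.

```python
def append_with_header(path, s, width=8):
    """Append s as a header line, followed by its fixed-width pieces."""
    pieces = [s[i:i + width] for i in range(0, len(s), width)]
    # Mode 'a' creates the file if it is missing and appends to it otherwise.
    with open(path, 'a') as out:
        out.write(s + '\n' + '\n'.join(pieces) + '\n')


append_with_header('output.txt', "00000000110000000000011000000000")
```

Calling the function again with another string adds a second header and its four pieces below the first block, since the file is opened in append mode.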

2 Comments

Thanks Remi, the "00000000110000000000011000000000" was given as an example; the string will need to be read from a text file so I imagine
For a long file you can read the 32-bit strings one at a time: with open('data.txt') as f: A2 = f.read(3); bits = f.read(33).strip() (the .strip() removes the trailing space)
1

Strings, lists and tuples can be sliced using the indexing operator []. Putting the : operator inside the brackets selects a range of positions (from the first index up to, but not including, the second), which gives you the individual fields. Try something like:

x = "00000000110000000000011000000000"
part1, part2, part3, part4 = x[:8], x[8:16], x[16:24], x[24:]

2 Comments

Thanks everybody, the indexing operator [] with the : operator appears to be the key!! I'll need to parse a text file with patterns, a typical pattern being: A1 00000000000000111000000000000000 00000000000001111100000000000000 00000000000011000110000000000000 00000000000110000011000000000000 00000000001100000001100000000000 00000000001111111111100000000000 00000000011111111111110000000000 00000000110000000000011000000000 00000001100000000000001100000000 00000011000000000000000110000000 00000110000000000000000011000000 00001100000000000000000001100000
OK, make sure to check the available string methods; knowing them is power... E.g. s.split() will split your pattern over the spaces, resulting in a list of the 32bit-strings!
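Combining `split()` with slicing, parsing one such pattern could be sketched as follows. This assumes Python 3 and assumes the first whitespace-separated token (such as `A1`) is a label and every later token is a 32-character bit string; `parse_pattern` is a hypothetical helper name, and the sample is shortened to two rows.

```python
def parse_pattern(text, width=8):
    """Return (label, rows), where each row is a (bits, pieces) pair."""
    tokens = text.split()  # split on any run of whitespace
    label, rows = tokens[0], []
    for bits in tokens[1:]:
        # Slice each bit string into consecutive pieces of `width` characters.
        pieces = [bits[i:i + width] for i in range(0, len(bits), width)]
        rows.append((bits, pieces))
    return label, rows


sample = "A1 00000000000000111000000000000000 00000000110000000000011000000000"
label, rows = parse_pattern(sample)
print(label)
for bits, pieces in rows:
    print(bits, pieces)
```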
0

You need substrings (slices). Note that x must be a string, so it has to be written in quotes:

x = "01234567"
x0 = x[0:2]
x1 = x[2:4]
x2 = x[4:6]
x3 = x[6:8]

So, x0 will hold '01', x1 will hold '23', etc.
