Recoding of binary file

Question

I have a file with the following contents:

1 5 9 14 15  
00000
10000
00010
11010
00010

I want to parse the file so that the following is output:

UUUUUUUUUUUUUU
YUUUUUUUUUUUUU
UUUUUUUUUUUUYY
YUUUYUUUUUUUYU
UUUUUUUUUUUUYU

The first row gives the column positions. In the remaining rows, a 0 becomes U and a 1 becomes Y. Between the first two mapped columns there are 4 unmapped columns, which means that for these four columns every row is U (that is, 0).

I tried the following in Python:

#!/usr/bin/env python2
import sys
with open(sys.argv[1]) as f:
    f.readline()
    for line in f:
        new = ''
        for char in line.rstrip():
            if char == '0':
                new += 'UU'
            elif char == '1':
                new +='YU'
        print new.rstrip()[:-1]

The problem is that this script only works if the positions are 2 apart, but the gaps can also be larger. How can I extend the script?

There is a problem when I run the code from Delimitry on the real data (dropbox.com/s/cf8rbv20bgyvssq/conv_inp?dl=0). I get an error:

Traceback (most recent call last):
  File "./con.py", line 8, in <module>
    for v in xrange(max(positions) + 1):
OverflowError: long int too large to convert to int

I made some edits to the question thinking I'd understand it better, but I still have no idea what the first line (1 5 9 14 15) is for.
– cheesysam, Commented Jun 23, 2015 at 13:21
Are you sure that 00010 is UUUUUUUUUUUUYY? It should be UUUUUUUUUUUUYU!
– Delimitry, Commented Jun 23, 2015 at 13:28

dlask · Accepted Answer · 2015-06-23 13:33:41Z

1

Just a guess.

Implement the converter:

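def convert(s):
    # Map every digit except the last to U/Y and join them with three U's
    # of padding; the last digit is dropped and always emitted as a single U.
    return "UUU".join({"0": "U", "1": "Y"}[c] for c in s[:-1]) + "U"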

And test it:

assert convert("00000") == "UUUUUUUUUUUUUU"
assert convert("10000") == "YUUUUUUUUUUUUU"
assert convert("00010") == "UUUUUUUUUUUUYU"
assert convert("11010") == "YUUUYUUUUUUUYU"
assert convert("00010") == "UUUUUUUUUUUUYU"

answered Jun 23, 2015 at 13:33


4 Comments

kagh Over a year ago

How could I just write the recoded output to a file without the quotation marks (")?

dlask Over a year ago

The quotation marks are there only in the source code. When you write a string to a file these quotation marks are not there.

kagh Over a year ago

How could I just use this as a file to execute?

dlask Over a year ago

You can use your original code, just reduced in this way: for line in f: print convert(line.rstrip()). It might be necessary to skip the first input line but you are able to solve this problem by yourself.
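
A minimal sketch of what the last comment describes, assuming the first line of the input is the positions header and should be skipped. The function name `recode_file` and the two file paths are hypothetical; `convert` is copied from the answer above. It is written for Python 3, although nothing in it should prevent it from running under Python 2.7 as well.

```python
def convert(s):
    # Converter from the answer above, unchanged.
    return "UUU".join({"0": "U", "1": "Y"}[c] for c in s[:-1]) + "U"


def recode_file(in_path, out_path):
    """Convert every row after the header and write one result per line."""
    with open(in_path) as src, open(out_path, "w") as dst:
        next(src, None)  # skip the first line (the positions header)
        for line in src:
            row = line.strip()
            if row:  # ignore blank lines
                # Writing a string to a file never adds quotation marks.
                dst.write(convert(row) + "\n")
```

Calling `recode_file("conv_inp", "conv_out")` (both names are placeholders) would then produce a plain text file with one recoded row per line.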

Delimitry · Accepted Answer · 2015-06-23 14:07:34Z

0

Check this code:

#!/usr/bin/env python2
import sys

def myxrange(to):
    x = 0
    while x < to:
        yield x
        x += 1

with open(sys.argv[1]) as f:
    positions = map(lambda x: long(x) - 1, f.readline().split())
    max_pos = max(positions)
    for line in f:
        new = ''
        for i in myxrange(max_pos + 1):
            if i in positions and line[positions.index(i)] == '1':
                new += 'Y'
            else:
                new += 'U'
        print new.rstrip()

edited Jun 23, 2015 at 14:07

answered Jun 23, 2015 at 13:34


3 Comments

kagh Over a year ago

The code works for the example but not for my real file. The real file is here; do you think you could have a quick look? dropbox.com/s/cf8rbv20bgyvssq/conv_inp?dl=0

kagh Over a year ago

The error is: Traceback (most recent call last): File "./con.py", line 8, in <module> for v in xrange(max(positions) + 1): OverflowError: long int too large to convert to int

kagh Over a year ago

Is there a possibility to make it fast? It is rather slow on the input, the file of more than 120 MB linked in the description.
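
One possible way to speed this up, sketched under assumptions rather than tested on the real file: the answer's loop checks `i in positions` and calls `positions.index(i)` for every output column, so the work per row grows with the row width times the number of positions. The sketch below instead starts each row from a template of U's and overwrites only the mapped columns. It follows the answer's interpretation of the header (1-based positions), assumes one full output row fits in memory, and uses the hypothetical names `recode_line` and `recode`. It is written for Python 3.

```python
def recode_line(bits, positions, template):
    """Return one output row: U everywhere, Y where a mapped bit is '1'."""
    row = bytearray(template)  # fresh copy of the all-U row
    for pos, bit in zip(positions, bits):
        if bit == "1":
            row[pos] = ord("Y")
    return row.decode("ascii")


def recode(lines):
    """Yield recoded rows from an iterable of input lines (header first)."""
    it = iter(lines)
    # Header holds 1-based column positions; convert to 0-based indices.
    positions = [int(x) - 1 for x in next(it).split()]
    template = b"U" * (max(positions) + 1)
    for line in it:
        bits = line.strip()
        if bits:  # ignore blank lines
            yield recode_line(bits, positions, template)
```

Because `recode` accepts any iterable of lines, it can be fed an open file object directly, for example `for row in recode(f): out.write(row + "\n")`, so the input is still processed one line at a time.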
