2

I have been getting indexing errors in Python. My code worked correctly when it read in a file and simply printed the desired output, but now I am trying to write the output to a file, and I seem to have an indexing problem when writing it. I've tried a couple of different things and left one attempt commented out. Either way, I keep getting an indexing error. EDIT: The original error may have been caused by a problem in Eclipse; when I run the code on the server, I have a new issue.

I can now get it to run and produce output to a .txt file; however, it only writes a single line of output.

with open("blast.txt") as blast_output:
    for line in blast_output:
        subFields = [item.split('|') for item in line.split()]
        #transId = str(subFields[0][0])
        #iso = str(subFields[0][1])
        #sp = str(subFields[1][3])
        #identity = str(subFields[2][0])
        out = open("parsed_blast.txt", "w")
        #out.write(transId + "\t" + iso + "\t" + sp + "\t" + identity)
        out.write((str(subFields[0][0]) + "\t" + str(subFields[0][1]) + "\t" + str(subFields[1][3]) + "\t" + str(subFields[2][0])))
        out.close()


IndexError: list index out of range

Input file looks like:

c0_g1_i1|m.1    gi|74665200|sp|Q9HGP0.1|PVG4_SCHPO      100.00  372     0       0       1       372     1       372     0.0       754
c1002_g1_i1|m.801       gi|1723464|sp|Q10302.1|YD49_SCHPO       100.00  646     0       0       1       646     1       646     0.0      1310
c1003_g1_i1|m.803       gi|74631197|sp|Q6BDR8.1|NSE4_SCHPO      100.00  246     0       0       1       246     1       246     1e-179    502
c1004_g1_i1|m.804       gi|74676184|sp|O94325.1|PEX5_SCHPO      100.00  598     0       0       1       598     1       598     0.0      1227
c1005_g1_i1|m.805       gi|9910811|sp|O42832.2|SPB1_SCHPO       100.00  802     0       0       1       802     1       802     0.0      1644
c1006_g1_i1|m.806       gi|74627042|sp|O94631.1|MRM1_SCHPO      100.00  255     0       0       1       255     47      301     0.0       525

Expected output

c0_g1_i1    m.1 Q9HGP0.1    100.00
c1002_g1_i1 m.801   Q10302.1    100.00
c1003_g1_i1 m.803   Q6BDR8.1    100.00
c1004_g1_i1 m.804   O94325.1    100.00
c1005_g1_i1 m.805   O42832.2    100.00
c1006_g1_i1 m.806   O94631.1    100.00

Instead, my output contains only one of the lines rather than all of them.

  • You need to include an example of the input file Commented Nov 13, 2016 at 0:51
  • Check the shape of subFields first. You will find out why you are getting this error. Commented Nov 13, 2016 at 0:52
  • It's not a problem with writing to the file but with your subFields variable; IndexError means that one of your indexes simply doesn't exist. It might be subFields[1][3] or subFields[2][0]. You should have guessed it from your commented-out attempt; Python always indicates the exact line the error is on. Commented Nov 13, 2016 at 0:55
  • @JamieLeigh what do you mean it prints [] when you write to the file? Please show the exact code that is going wrong, because the code you have written works for me with your sample input. Commented Nov 13, 2016 at 1:06
  • You are overwriting the same file again and again. Open the file outside the for loop or open it in append mode 'a'. Commented Nov 13, 2016 at 1:18

3 Answers

3

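You are overwriting the same file again and again: open("parsed_blast.txt", "w") runs on every pass through the loop, and mode "w" truncates the file each time, so only the last line survives. Open the file once, outside the for loop, or open it in append mode 'a'.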
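As a minimal sketch of the first option, the version below opens both files once in a single with statement and writes one row per input line. The function name parse_blast and its default paths are only illustrative, and it assumes every non-blank line has the same layout as the sample input in the question.

```python
def parse_blast(in_path="blast.txt", out_path="parsed_blast.txt"):
    """Write one tab-separated summary row per BLAST hit; return the row count."""
    written = 0
    # Both files are opened once, before the loop, so "w" truncates only once.
    with open(in_path) as blast_output, open(out_path, "w") as out:
        for line in blast_output:
            sub_fields = [item.split("|") for item in line.split()]
            if not sub_fields:
                continue  # skip blank lines
            row = [sub_fields[0][0], sub_fields[0][1],
                   sub_fields[1][3], sub_fields[2][0]]
            # The trailing newline keeps each hit on its own line.
            out.write("\t".join(row) + "\n")
            written += 1
    return written
```

Calling parse_blast() with the defaults reads blast.txt and rewrites parsed_blast.txt from scratch on each run, which is usually what you want when the output is derived entirely from the input.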


1

I suggest you read the whole file into a string.

with open("blast.txt", 'r') as fileIn:
    data = fileIn.read()

Then process the data (func here stands for whatever parsing function you write).

data = func(data)

Then write the result to the output file.

with open('bast_out.txt', 'w') as fileOut:
    fileOut.write(data)
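The answer leaves func undefined. One possible shape for it, assuming the column layout shown in the question's sample input, is sketched below; the body is a guess at what the placeholder is meant to do, not something the answer specifies.

```python
def func(data):
    """Convert raw BLAST tabular text into four tab-separated columns per hit."""
    rows = []
    for line in data.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or truncated lines
        query = fields[0].split("|")    # e.g. ["c0_g1_i1", "m.1"]
        subject = fields[1].split("|")  # e.g. ["gi", "74665200", "sp", "Q9HGP0.1", ...]
        rows.append("\t".join([query[0], query[1], subject[3], fields[2]]))
    # Terminate every row with a newline; an empty input gives an empty string.
    return "".join(row + "\n" for row in rows)
```

Because the function takes a string and returns a string, it can be tried on a single pasted line without touching any files.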


1

As @H Doucet said, read the whole thing into a string, then work with it. Leave the open() call out of the loop so the file is opened and closed only once, and make sure to open it in append mode. I've also cleaned up your out.write() call: there is no need to convert those list items to strings, because they already are strings. I also added a newline ("\n") to the end of each line.

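with open("blast.txt") as f:
    blast_output = f.read()

# Open the output once, outside the loop. Append mode keeps existing
# contents, so re-running the script adds the rows again.
with open("parsed_blast.txt", "a") as out:
    for line in blast_output.splitlines():
        subFields = [item.split('|') for item in line.split()]
        if not subFields:
            continue  # skip blank lines, which would raise IndexError
        out.write("{}\t{}\t{}\t{}\n".format(subFields[0][0], subFields[0][1],
                                            subFields[1][3], subFields[2][0]))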
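If the IndexError from the question still shows up, it can help to check the shape of each parsed line before indexing into it, as one of the comments suggests. The helper below is a hypothetical diagnostic (the name find_malformed_lines is made up here) that reports which lines lack the pieces the indexing expects.

```python
def find_malformed_lines(lines):
    """Return (line_number, text) pairs for lines the parser cannot index."""
    bad = []
    for number, line in enumerate(lines, start=1):
        sub_fields = [item.split("|") for item in line.split()]
        # The parser needs subFields[0][1], subFields[1][3] and subFields[2][0].
        ok = (len(sub_fields) >= 3
              and len(sub_fields[0]) >= 2
              and len(sub_fields[1]) >= 4)
        if not ok:
            bad.append((number, line.rstrip("\n")))
    return bad
```

Passing it an open file, for example find_malformed_lines(open("blast.txt")), should return an empty list when every line matches the sample layout; a blank line at the end of the file is a common culprit.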
