I am new to Python, and I want to extract the data from this format:

<seq id> <alignment start> <alignment end> <envelope start> <envelope end> <hmm acc> <hmm name> <type> <hmm start> <hmm end> <hmm length> <bit score> <E-value> <significance> <clan>

**FBpp0143497**      **5    151**      5    157 PF00339.22  **Arrestin_N**        Domain     1   135   149     83.4   **1.1e-23**   1 CL0135   
**FBpp0143497**    **183    323**    183    324 PF02752.15  Arrestin_C        Domain     1   137   138     58.5     **6e-16**   1 CL0135   
FBpp0131987     60    280     51    280 PF00089.19  Trypsin           Domain    14   219   219    127.7   3.7e-37   1 CL0124  

into this format, keeping only the sequence ID, alignment start, alignment end, HMM name, and E-value:

>FBpp0143497
 5      151        Arrestin_N     1.1e-23

>FBpp0143497
 183    323        Arrestin_C     6e-16
  • Please show the code you have so far. This is not "Write Code For Me.COM" Commented May 19, 2010 at 9:59
  • Possible duplicate of "problem in extracting the data from text file" Commented May 20, 2010 at 13:22

3 Answers

You could parse the file with the 'csv' module, using a space as the delimiter. See the documentation for csv.reader.
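A minimal sketch of the csv approach, assuming the real file contains no `**` emphasis markers (they appear to be highlighting in the question only) and that columns are separated by runs of spaces. The function name `extract_with_csv` and the in-memory sample are hypothetical. `skipinitialspace=True` makes the reader ignore the extra spaces that follow each delimiter, and empty fields left by trailing spaces are filtered out.

```python
import csv
import io

def extract_with_csv(handle):
    """Yield (seq_id, start, end, hmm_name, e_value) for each data row."""
    # A single space is the delimiter; skipinitialspace drops the padding
    # spaces that follow it, so runs of spaces act as one separator.
    reader = csv.reader(handle, delimiter=" ", skipinitialspace=True)
    for row in reader:
        fields = [f for f in row if f]  # drop empty fields from trailing spaces
        # Skip blank lines, short lines, and header/comment lines.
        if len(fields) < 13 or fields[0].startswith(("#", "<")):
            continue
        # Columns: 0 = seq id, 1 = alignment start, 2 = alignment end,
        # 6 = hmm name, 12 = E-value (order taken from the header line).
        yield fields[0], fields[1], fields[2], fields[6], fields[12]

# Sample rows copied from the question; a real script would pass an open file.
sample = io.StringIO(
    "FBpp0143497      5    151      5    157 PF00339.22  Arrestin_N        Domain     1   135   149     83.4   1.1e-23   1 CL0135   \n"
    "FBpp0143497    183    323    183    324 PF02752.15  Arrestin_C        Domain     1   137   138     58.5     6e-16   1 CL0135   \n"
)

for seq_id, start, end, name, e_value in extract_with_csv(sample):
    print(f">{seq_id}")
    print(f" {start}\t{end}\t{name}\t{e_value}")
    print()
```

To read from disk instead, open the file with `open(path, newline="")` and pass the file object to the function.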

As this is proteomic data, you could probably find a dedicated parser in the BioPython package.

You can use split() to separate the items at spaces and then print out the values you want from the returned list.
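As a rough sketch of this approach, assuming whitespace-separated columns in the order given by the header line and no `**` markers in the actual file: `str.split()` with no arguments splits on any run of whitespace, so the padding between columns needs no special handling. The helper name `format_hit` and the column widths are illustrative choices, not part of the original question.

```python
def format_hit(line):
    """Return the '>id' record for one data line, or None for non-data lines."""
    fields = line.split()  # splits on any run of whitespace
    # Skip blank lines, short lines, and header/comment lines.
    if len(fields) < 13 or fields[0].startswith(("#", "<")):
        return None
    seq_id, start, end = fields[0], fields[1], fields[2]
    hmm_name, e_value = fields[6], fields[12]
    # Left-align each value in a fixed-width column.
    return f">{seq_id}\n {start:<6} {end:<10} {hmm_name:<14} {e_value}"

# Sample rows copied from the question; a real script would loop over a file.
lines = [
    "FBpp0143497      5    151      5    157 PF00339.22  Arrestin_N        Domain     1   135   149     83.4   1.1e-23   1 CL0135   ",
    "FBpp0143497    183    323    183    324 PF02752.15  Arrestin_C        Domain     1   137   138     58.5     6e-16   1 CL0135   ",
]

for line in lines:
    record = format_hit(line)
    if record is not None:
        print(record)
        print()
```

The same loop works on a file object, for example `for line in open(path):`, since iterating over a file yields one line at a time.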