0

I have a text file in the following format (1 line):

[NN] ||| transplant ||| transplantation ||| PPDB2.0Score=5.24981 PPDB1.0Score=3.295900 -logp(LHS|e1)=0.18597 -logp(LHS|e2)=0.14031 -logp(e1|LHS)=11.83583 -logp(e1|e2)=1.80507 -logp(e1|e2,LHS)=1.46728 -logp(e2|LHS)=11.47593 -logp(e2|e1)=1.49083 -logp(e2|e1,LHS)=1.10738 AGigaSim=0.63439 Abstract=0 Adjacent=0 CharCountDiff=5 CharLogCR=0.40547 ContainsX=0 Equivalence=0.371472 Exclusion=0.000344 GlueRule=0 GoogleNgramSim=0.03067 Identity=0 Independent=0.078161 Lex(e1|e2)=9.64663 Lex(e2|e1)=59.48919 Lexical=1 LogCount=4.67283 MVLSASim=NA Monotonic=1 OtherRelated=0.372735 PhrasePenalty=1 RarityPenalty=0 ForwardEntailment=0.177287 SourceTerminalsButNoTarget=0 SourceWords=1 TargetComplexity=0.98821 TargetFormality=0.98464 TargetTerminalsButNoSource=0 TargetWords=1 UnalignedSource=0 UnalignedTarget=0 WordCountDiff=0 WordLenDiff=5.00000 WordLogCR=0 ||| 0-0 ||| OtherRelated

What I want is to extract transplant and transplantation. How would you do that? Each line in the text file is varying in length for the values between the ||| separator. To illustrate, here is a second example:

[VBZ] ||| reflects ||| understand ||| PPDB2.0Score=3.50769 PPDB1.0Score=21.844910 -logp(LHS|e1)=0.01251 -logp(LHS|e2)=10.87470 -logp(e1|LHS)=6.91653 -logp(e1|e2)=11.53225 -logp(e1|e2,LHS)=4.29729 -logp(e2|LHS)=16.55913 -logp(e2|e1)=10.31266 -logp(e2|e1,LHS)=13.93988 AGigaSim=0.54532 Abstract=0 Adjacent=0 CharCountDiff=2 CharLogCR=0.22314 ContainsX=0 Equivalence=0.006535 Exclusion=0.022332 GlueRule=0 GoogleNgramSim=0 Identity=0 Independent=0.456621 Lex(e1|e2)=62.90141 Lex(e2|e1)=62.90141 Lexical=1 LogCount=0 MVLSASim=NA Monotonic=1 OtherRelated=0.404562 PhrasePenalty=1 RarityPenalty=0.36788 ForwardEntailment=0.109950 SourceTerminalsButNoTarget=0 SourceWords=1 TargetComplexity=0.99354 TargetFormality=1.00000 TargetTerminalsButNoSource=0 TargetWords=1 UnalignedSource=0 UnalignedTarget=0 WordCountDiff=0 WordLenDiff=2.00000 WordLogCR=0 ||| 0-0 ||| Independent

The target words here are reflects and understands.

1

1 Answer 1

2

Split by ' ||| '?

your_text.split(' ||| ') Would give you a list of elements separated by ' ||| '

So

your_text.split(' ||| ')[1:3] would return ['reflects','understands']

Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.