added 14 characters in body
terdon

I have the following CSV file:

ID,PDBID,FirstResidue,SecondResidue,ThirdResidue,FourthResidue,Pattern
RZ_AUTO_1,4tov,1404,1405,1518,1519,CG/AA Canonical ribose-zipper
RZ_AUTO_2,4tov,1405,1406,1517,1518,GU/AA Naked ribose-zipper
RZ_AUTO_3,4tov,1043,1044,1047,1048,CC/GA Naked ribose-zipper
RZ_AUTO_4,4tov,1556,1557,1514,1515,CC/GA Naked ribose-zipper
RZ_AUTO_5,4tow,130,131,99,100,AU/CA Canonical ribose-zipper
RZ_AUTO_6,4tow,766,767,1524,1525,AA/CG Canonical ribose-zipper
RZ_AUTO_7,4tow,131,132,98,99,UC/AC Canonical ribose-zipper

I need to go through each row and print the rows where the FirstResidue/SecondResidue pair can be extended, meaning that the SecondResidue of one row is the FirstResidue of a different row with the same PDBID. For example, rows RZ_AUTO_1 and RZ_AUTO_2 form such a pair, as do rows RZ_AUTO_5 and RZ_AUTO_7. The output should look something like this:

RZ_AUTO_1,4tov,1404,1405,1518,1519,CG/AA Canonical ribose-zipper
RZ_AUTO_2,4tov,1405,1406,1517,1518,GU/AA Naked ribose-zipper
RZ_AUTO_5,4tow,130,131,99,100,AU/CA Canonical ribose-zipper
RZ_AUTO_7,4tow,131,132,98,99,UC/AC Canonical ribose-zipper

Is it possible to do this using awk or other Unix tools? I'm using OS X.
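
To make the matching rule concrete, here is a minimal sketch of it in Python rather than awk. It is only an illustration under some assumptions: the names `SAMPLE` and `extendable_rows` are hypothetical, the whole file is assumed to fit in memory, and the residue numbers are compared as plain strings. A row is kept if its SecondResidue equals the FirstResidue of a different row with the same PDBID, and the row that extends it is kept too.

```python
import csv
import io

# The sample data from the question, embedded so the sketch is self-contained.
SAMPLE = """\
ID,PDBID,FirstResidue,SecondResidue,ThirdResidue,FourthResidue,Pattern
RZ_AUTO_1,4tov,1404,1405,1518,1519,CG/AA Canonical ribose-zipper
RZ_AUTO_2,4tov,1405,1406,1517,1518,GU/AA Naked ribose-zipper
RZ_AUTO_3,4tov,1043,1044,1047,1048,CC/GA Naked ribose-zipper
RZ_AUTO_4,4tov,1556,1557,1514,1515,CC/GA Naked ribose-zipper
RZ_AUTO_5,4tow,130,131,99,100,AU/CA Canonical ribose-zipper
RZ_AUTO_6,4tow,766,767,1524,1525,AA/CG Canonical ribose-zipper
RZ_AUTO_7,4tow,131,132,98,99,UC/AC Canonical ribose-zipper
"""


def extendable_rows(lines):
    """Return the rows that take part in an extension, in file order.

    `lines` is any iterable of CSV lines that starts with the header row.
    """
    rows = list(csv.DictReader(lines))

    # Index rows by (PDBID, FirstResidue) so each lookup is a dict access.
    starts = {}
    for i, row in enumerate(rows):
        starts.setdefault((row["PDBID"], row["FirstResidue"]), []).append(i)

    keep = set()
    for i, row in enumerate(rows):
        # Rows with the same PDBID whose FirstResidue is this row's SecondResidue.
        for j in starts.get((row["PDBID"], row["SecondResidue"]), []):
            if j != i:  # the match has to be a different row
                keep.update((i, j))

    return [rows[i] for i in sorted(keep)]


if __name__ == "__main__":
    for row in extendable_rows(io.StringIO(SAMPLE)):
        print(",".join(row.values()))
```

To run it against a real file instead of the embedded sample, an open file object can be passed in place of `io.StringIO(SAMPLE)`, for example `extendable_rows(open("file.csv", newline=""))`, where `file.csv` is a placeholder name.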

deleted 6 characters in body; edited tags
Anthon

edited tags
terdon
Sri