Find position of specific column pattern

After some processing I ended up with a file like this one:

...
ALA 251
VAL 252
TYR 253
LYS 254
SER 255
ALA 256
ALA 257
MET 258
LEU 259
ASP 260
MET 261
THR 262
GLY 263
ALA 264
GLY 265
TYR 266
VAL 267
TRP 268
...

Let's call the first column "res" and the second "num". Note that "res" always consists of 3 letters, while "num" has 1 to 4 digits.

I am looking for a way to extract the position (the value in column "num") of the first "res" in an exact run of four successive "res" values, such as this one:

TYR
LYS
SER
ALA

In this case, given the file and the pattern above, the output should be:

253

I made several attempts with awk. It seems doable, but my skills aren't sufficient at the moment. I would be very grateful if anyone has a suggestion for this.

Best
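
To make the request concrete, here is a minimal sketch of one possible approach: a sliding window over the last four lines in awk. It is an illustration under stated assumptions, not a tested solution for the real file. The function name `find_pattern_start` is made up for this example, and the sketch assumes that successive lines correspond to successive residues (it compares neighbouring lines and does not check that the "num" values are consecutive).

```shell
#!/bin/sh
# find_pattern_start PATTERN [FILE...]
# Hypothetical helper (name invented for this sketch): prints the "num" of
# the first line of every run of consecutive lines whose "res" values equal
# the space-separated PATTERN. Reads standard input when no FILE is given.
find_pattern_start() {
    pat=$1
    shift
    awk -v pat="$pat" '
        BEGIN { n = split(pat, want, " ") }   # want[1..n] = residues to find
        {
            # Slide the window: drop the oldest line, append the current one.
            for (i = 1; i < n; i++) { res[i] = res[i + 1]; num[i] = num[i + 1] }
            res[n] = $1
            num[n] = $2
            # Skip to the next line as soon as one position differs.
            for (i = 1; i <= n; i++)
                if (res[i] != want[i]) next
            print num[1]                       # "num" of the first "res"
        }
    ' "$@"
}

# Usage with the first lines of the sample above; prints 253.
printf '%s\n' 'ALA 251' 'VAL 252' 'TYR 253' 'LYS 254' 'SER 255' 'ALA 256' 'ALA 257' |
    find_pattern_start 'TYR LYS SER ALA'
```

Because the window is rebuilt on every line, every occurrence of the pattern is reported, including overlapping ones. If only the first occurrence is wanted, adding `exit` after the `print` should be enough.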