I am trying to grep for a list of strings stored in 7253.txt, which looks like this:
rs11078372 
rs1124961 
rs11651880 
rs11659047 
rs1736209
using:
grep -o -f 7253.txt *.logistic > result.txt
across multiple *.logistic files. The files are large and this grep command takes forever.
.logistic files look like this:
#CHROM  POS  ID REF ALT A1  TEST    OBS_CT  OR  LOG(OR)_SE  Z_STAT  P
17  16933404    rs11867934  T   C   T   ADD 32232   0.974082    0.0279353   -0.940008   0.347213
So the strings from 7253.txt should be matched against the ID column in .logistic, and they must match exactly.
Is there a more efficient way to parse those *.logistic files?
There are 22 of these files, and they are named like: FINchr1.pheno.glm.logistic, FINchr2.pheno.glm.logistic...
It would be great if result.txt could contain the ID and P columns (3rd and 12th column) extracted from .logistic for the matching rows.
To extract only ID from .logistic I could do this:
awk 'FNR!=1 {print $3}' *.logistic | grep -o -w -F -f 7253.txt > result.txt
But how can I extract both the ID and P columns, which are the 3rd and 12th columns in .logistic?
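A minimal sketch of the direction I have in mind, in case it helps frame the question: load the IDs from 7253.txt into an awk array once, then print columns 3 and 12 for rows whose ID is in that array. This assumes the columns are whitespace-separated, that 7253.txt has one ID per line and is not empty, and that every data row has all 12 columns. The function name extract_id_p is just something I made up for illustration.

```shell
# Hypothetical helper (name is my own): print ID and P (columns 3 and 12)
# for rows whose ID is listed in the first argument.
# Remaining arguments are the .logistic files to scan.
extract_id_p() {
    ids_file=$1
    shift
    awk '
        # First file (the ID list): remember each ID; $1 ignores trailing spaces.
        NR == FNR { ids[$1]; next }
        # Other files: skip each header line, keep rows with an exact ID match.
        FNR > 1 && ($3 in ids) { print $3, $12 }
    ' "$ids_file" "$@"
}
```

It would be called as: extract_id_p 7253.txt *.logistic > result.txt. Since the lookup is an exact match on the whole ID field, rs1124961 should not match something like rs11249610, and each .logistic file is read only once.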
Thanks Ana