added 219 characters in body; edited tags


edited Aug 12, 2019 at 20:08


I am trying to grep a list of strings listed in 7253.txt which looks like this:

using:

grep -o -f 7253.txt *.logistic > result.txt

across multiple *.logistic files. The files are large, and this grep command takes forever.

The .logistic files look like this:

#CHROM  POS  ID REF ALT A1  TEST    OBS_CT  OR  LOG(OR)_SE  Z_STAT  P
17  16933404    rs11867934  T   C   T   ADD 32232   0.974082    0.0279353   -0.940008   0.347213

The strings from 7253.txt are matched against the ID column of the .logistic files, and they must match exactly.

Is there a more efficient way to parse those *.logistic files?

There are 22 of these files, named like FINchr1.pheno.glm.logistic, FINchr2.pheno.glm.logistic...

It would be great if result.txt contained the ID and P columns (3rd and 12th) extracted from the .logistic files.

To extract only the ID from the .logistic files, I can do this:

awk 'FNR!=1 {print $3}' *.logistic | grep -o -w -F -f 7253.txt > result.txt

But how can I extract both the ID and P columns, which are the 3rd and 12th columns in the .logistic files?
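A minimal sketch of one way this might be done in a single awk pass. It assumes 7253.txt holds one ID per line (its contents are not shown above) and that the .logistic files are whitespace-delimited. The sample files and the second data row are hypothetical, created only so the sketch runs on its own.

```shell
# Hypothetical sample data so the sketch is self-contained; the real inputs
# would be 7253.txt and the 22 FINchr*.pheno.glm.logistic files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'rs11867934\n' > 7253.txt
printf '#CHROM\tPOS\tID\tREF\tALT\tA1\tTEST\tOBS_CT\tOR\tLOG(OR)_SE\tZ_STAT\tP\n' > FINchr17.pheno.glm.logistic
printf '17\t16933404\trs11867934\tT\tC\tT\tADD\t32232\t0.974082\t0.0279353\t-0.940008\t0.347213\n' >> FINchr17.pheno.glm.logistic
# Invented row whose ID merely contains a wanted ID; it must not be reported.
printf '17\t16933500\trs118679345\tA\tG\tA\tADD\t32232\t1.01\t0.03\t0.33\t0.74\n' >> FINchr17.pheno.glm.logistic

# While reading the first file (NR==FNR), store each ID as an array key.
# For the remaining files, skip each header line (FNR==1) and print
# columns 3 and 12 only when column 3 is exactly one of the stored IDs.
awk 'NR==FNR { ids[$1]; next }
     FNR > 1 && ($3 in ids) { print $3, $12 }' 7253.txt *.logistic > result.txt

cat result.txt
```

Because the lookup is an exact array-key test on column 3, partial matches such as rs118679345 are not reported, and each .logistic file is read only once.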

Thanks, Ana


added 117 characters in body


edited Aug 12, 2019 at 19:41

Ana



asked Aug 12, 2019 at 18:37

Ana


How to parse strings from a file in multiple other files?


command-line grep
