user232326

Scan the header line to find every column whose header is not "*".
Record the index of each such column in an array a[].
On every following line, only the columns recorded in a[] can need a change.

That could be implemented as:

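awk -F'|' 'BEGIN{OFS=FS}
           NR==1 {  # header: remember every column whose header is not "*"
                   for(i=1;i<=NF;i++) if( $i != "*" ) a[i]
                 } 
           NR>1  {  # other lines: change "*" to "-" only in remembered columns
                   for(i in a)        if( $i == "*" ) $i="-"
                 } 
           1        # print every line
          ' file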

a|s|d|f|g|*|A|*|*|g|c|a|*|A|*
a|-|-|f|g|*|-|*|*|g|c|a|*|A|*
-|s|-|f|g|*|a|t|*|g|c|a|*|A|*
a|s|d|-|g|*|T|*|C|g|c|a|a|A|T

This makes only the changes that are needed, so it should be the fastest.
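
To try the idea without a file, here is a minimal sketch that feeds the same awk program a tiny made-up input. The three sample lines and the shell variable names `input` and `result` are assumptions for illustration only; they are not the data from the question.

```shell
# Hypothetical sample input: the first line is the header.
input='a|*|c
*|*|*
x|*|*'

# Same awk program as above, reading from a pipe instead of a file.
result=$(printf '%s\n' "$input" |
  awk -F'|' 'BEGIN{OFS=FS}
             NR==1 { for(i=1;i<=NF;i++) if( $i != "*" ) a[i] }
             NR>1  { for(i in a)        if( $i == "*" ) $i="-" }
             1')

printf '%s\n' "$result"
# prints:
# a|*|c
# -|*|-
# x|*|-
```

Column 2 has "*" in the header, so its "*" values are left alone; columns 1 and 3 do not, so any "*" found there becomes "-".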
