Removed initial sort as it's not required until the end of the pipeline
Chris Davies

What you have lost is the trailing ) on your stdin subclause. What I suspect you have gained is a headache trying to read your code.

Try this instead, which then lends itself to further optimisation because it's (more) readable:

tr '[:blank:]' $'\t' <file2 |
    awk '
        BEGIN { FS="\t"; OFS="\t" }
        { if ($2 == "u") print $0, $1; else print $0, $3 }
    ' |
    awk '
        { gsub(/ /,"\t"); l=$4; sub(/.*_/,"",l); print $2 "\t" $3 "\t" l }
    ' |
    sort |
    join -t $'\t' -a1 -e "u" -1 1 -2 1 -o 1.1,2.1,2.2 file1 - >out
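As one possible next step in that "further optimisation", the `tr` stage and the two `awk` stages could probably be folded into a single `awk` call. The sketch below is only that, a sketch: `reshape` is a hypothetical helper name, and it assumes every line of file2 has exactly three non-empty, blank-separated fields (with empty fields, the default field splitting would behave differently from `FS="\t"`).

```shell
# Hypothetical helper: one awk stage standing in for tr plus the two awk stages.
# Assumption: each input line has exactly three non-empty blank-separated fields.
reshape() {
    awk '
        {
            l = ($2 == "u") ? $1 : $3   # the value the first awk stage appended as $4
            sub(/.*_/, "", l)           # keep only the text after the last underscore
            print $2 "\t" $3 "\t" l     # same three tab-separated columns as before
        }
    '
}

# Illustrative input only; real data would come from file2.
printf 'a_1 u x_9\nk_2 v z_7\n' | reshape
```

Under those assumptions, the full pipeline would then shrink to `reshape <file2 | sort | join ... file1 - >out`, with the `join` arguments unchanged.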
Added additional sort into the pipeline to address concerns from comment

What you have lost is the trailing ) on your stdin subclause. What I suspect you have gained is a headache trying to read your code.

Try this instead, which then lends itself to further optimisation because it's (more) readable:

sort -k1,1 file2 |
    tr '[:blank:]' $'\t' |
    awk '
        BEGIN { FS="\t"; OFS="\t" }
        { if ($2 == "u") print $0, $1; else print $0, $3 }
    ' |
    awk '
        { gsub(/ /,"\t"); l=$4; sub(/.*_/,"",l); print $2 "\t" $3 "\t" l }
    ' |
    sort |
    join -t $'\t' -a1 -e "u" -1 1 -2 1 -o 1.1,2.1,2.2 file1 - >out

What you have lost is the trailing ) on your stdin subclause. What I suspect you have gained is a headache trying to read your code.

Try this instead, which then lends itself to further optimisation because it's (more) readable:

sort -k1,1 file2 |
    tr '[:blank:]' $'\t' |
    awk '
        BEGIN { FS="\t"; OFS="\t" }
        { if ($2 == "u") print $0, $1; else print $0, $3 }
    ' |
    awk '
        { gsub(/ /,"\t"); l=$4; sub(/.*_/,"",l); print $2 "\t" $3 "\t" l }
    ' |
    join -t $'\t' -a1 -e "u" -1 1 -2 1 -o 1.1,2.1,2.2 file1 - >out