I'd like an awk script to select the columns from a file based on a list of columns in another file. For example:

$ cat cols
3 2 6 4

$ cat text
a b c d e f g
h i j k l m n

$ awk_script cols text
c b f d
j i m k

So the 3rd, 2nd, 6th and 4th columns have been selected, in that order.

Thanks

2 Answers

You can use this:

awk 'NR==FNR{n=split($0,c);next}{for(i=1;i<=n;i++){printf "%s%s", $c[i], OFS};print ""}' cols text

We pass two input files to awk: first cols, then text. awk keeps the total number of input records read so far in the built-in variable NR, while FNR is the record number within the current file. The two are equal only while awk is reading the first file, so for the first (and only) line of cols both NR and FNR are 1, the condition NR==FNR is true, and the following block gets executed.
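To see how NR and FNR diverge once awk moves on to the second file, here is a minimal sketch that recreates the sample files from the question in a scratch directory (the mktemp directory and the variable name nr_fnr are only for illustration) and prints both counters for every input line:

```shell
# Recreate the two sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '3 2 6 4\n' > "$tmp/cols"
printf 'a b c d e f g\nh i j k l m n\n' > "$tmp/text"

# For every input line, print the file name, NR, FNR and which side of NR==FNR it falls on.
nr_fnr=$(cd "$tmp" && awk '{ print FILENAME, NR, FNR, (NR==FNR ? "first file" : "later file") }' cols text)
printf '%s\n' "$nr_fnr"

rm -rf "$tmp"
```

This should print "cols 1 1 first file", then "text 2 1 later file" and "text 3 2 later file": FNR restarts at 1 for text while NR keeps counting.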

{n=split($0,c);next} splits the whole line, which is stored in $0, into the array c using the field separator FS, and saves the number of columns to print in n. We use n later in a for loop. next tells awk to stop processing the current line and read the next line of input.
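As a small illustration of what split leaves behind, the following sketch feeds the cols line to awk on standard input and prints the returned count followed by each array index and value (the variable name split_demo is just for this example):

```shell
# split() fills the array c (indices start at 1) and returns the number of elements stored.
split_demo=$(printf '3 2 6 4\n' | awk '{
    n = split($0, c)                 # default separator is FS, i.e. runs of blanks
    print n                          # number of columns requested
    for (i = 1; i <= n; i++)
        print i, c[i]                # array index and the column number stored there
}')
printf '%s\n' "$split_demo"
```

The first output line is 4, followed by "1 3", "2 2", "3 6" and "4 4", so c[1] through c[n] hold the requested column numbers in order.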

The block {for(i=1;i<=n;i++){printf "%s%s", $c[i], OFS};print ""} gets executed on all other lines, since it is not prefixed with a condition and the line of cols has already been skipped by next. The for loop iterates through the column numbers in c and prints the corresponding fields, each followed by the output field separator OFS. Finally, print "" ends the line with the output record separator ORS.

Output:

c b f d
j i m k

2 Comments

Never do a printf with input data in the format field (printf $c[i]), use the full printf synopsis instead, printf "%s", $c[i]. Imagine the difference if $c[i] contained a printf formatting string, e.g. %s. You're missing printing OFS between fields. The loop should end at <=n, not <n+1 for clarity and efficiency. Also, you should use print "" instead of printf "\n" as it's briefer and uses whatever value ORS is set to instead of hard-coding the same value.
@EdMorton Thanks for the advice. Much appreciated! I've edited it. The solution now adds an additional OFS at the end of every line, but that should be good enough since OFS is a space.
$ awk 'NR==FNR{n=split($0,f);next} {for (i=1;i<=n;i++) printf "%s%s", $(f[i]), (i<n?OFS:ORS)}' cols text
c b f d
j i m k
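This version prints OFS only between fields and ORS after the last one, so no trailing separator is emitted. A sketch that makes this visible by setting OFS to a comma (the -v OFS=, setting, the scratch directory and the variable name selected are assumptions for illustration, not part of the answer):

```shell
# Recreate the two sample files from the question in a scratch directory.
tmp=$(mktemp -d)
printf '3 2 6 4\n' > "$tmp/cols"
printf 'a b c d e f g\nh i j k l m n\n' > "$tmp/text"

selected=$(cd "$tmp" && awk -v OFS=, '
    NR == FNR { n = split($0, f); next }   # first file: remember the column list
    {
        for (i = 1; i <= n; i++)
            printf "%s%s", $(f[i]), (i < n ? OFS : ORS)   # OFS between fields, ORS after the last
    }' cols text)
printf '%s\n' "$selected"

rm -rf "$tmp"
```

With the sample input this should give c,b,f,d and j,i,m,k, with no comma at the end of either line.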
