I have a tab-separated file that looks like this:
cg13201342 F ARNT;ARNT;ARNT;CTSK 3'UTR;3'UTR;3'UTR;TSS1500
cg05269359 F SCN4B;SCN4B;SCN4B;SCN4B 3'UTR;3'UTR;3'UTR;Body
cg06018296 R NEK3;NEK3;NEK3;NEK3 3'UTR;3'UTR;3'UTR;Body
cg05172994 F WDR20;WDR20;WDR20;WDR20 3'UTR;3'UTR;3'UTR;Body
Desired output:
cg13201342 F ARNT 3'UTR
cg13201342 F ARNT 3'UTR
cg13201342 F ARNT 3'UTR
cg13201342 F CTSK TSS1500
cg05269359 F SCN4B 3'UTR
.
.
and so on
I tried
awk 'BEGIN {
    FS = OFS = "\t"
}
{
    n = split($3, f, " *;*");
    for (i=1; i<=n; i++)
        print $1, f[i]
}' probe-genes-regions >chk
but that only splits the third column. I want the last column to be split together with the third column, so that each output row pairs the 1st item of the 3rd column with the 1st item of the last column, the 2nd item with the 2nd item, and so on.
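For reference, here is a minimal sketch of the pairing I am after. It assumes the 3rd and 4th columns always hold the same number of `;`-separated items (if they differ, it stops at the shorter one). The file names `sample.tsv` and `out.tsv` are placeholders for illustration, not my real files.

```shell
# Hypothetical two-line sample with the same tab-separated layout as above.
printf "cg13201342\tF\tARNT;ARNT;ARNT;CTSK\t3'UTR;3'UTR;3'UTR;TSS1500\n" > sample.tsv
printf "cg05269359\tF\tSCN4B;SCN4B;SCN4B;SCN4B\t3'UTR;3'UTR;3'UTR;Body\n" >> sample.tsv

awk 'BEGIN { FS = OFS = "\t" }
{
    # Split the 3rd and 4th columns on ";" into two parallel arrays.
    n = split($3, gene, ";")
    m = split($4, region, ";")
    # Assumption: both counts match; otherwise use the shorter length.
    if (m < n) n = m
    # Print one row per pair, repeating the first two columns.
    for (i = 1; i <= n; i++)
        print $1, $2, gene[i], region[i]
}' sample.tsv > out.tsv
```

With the two sample lines this should write eight rows to `out.tsv`, four per input line, in the same order as the desired output above.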