So, I've got a large number of files, each one with 8 columns and a lot of rows. Here's the head of one of them as an example:
ID Ct 1 2 3 4 5 6
1 0 consensus - - - - -
2 0 consensus - - - - -
3 0 consensus consensus consensus consensus consensus consensus
4 0 consensus - consensus - - -
5 0 - AT AT GC GC AT
6 0 consensus - - - consensus -
7 0 consensus - - - - -
8 0 consensus consensus consensus - consensus consensus
9 0 consensus - - - - -
I want to pull out all the rows where at least 5 of the last 6 columns are occupied, i.e. rows with fewer than 2 "-" entries in those columns. In the head above that would be IDs 3, 5 and 8 (lines 4, 6 and 9 of the file, counting the header).
I used to be able to do that with a simple awk script, because the program wrote the number of occupied columns into the second column (Ct). It doesn't seem to do that any more (Ct is 0 on every row in the head above), so I can't just filter on that column. What's the best way to do it?
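For reference, this is roughly the kind of thing I have in mind: count the "-" entries directly instead of relying on Ct. It's only a sketch, assuming whitespace-separated fields and a single header line. The file names sample.txt and filtered.txt are placeholders I made up for the example, and the sample data is just the head shown above.

```shell
# Recreate the example head as a test file (sample.txt is a placeholder name).
cat > sample.txt <<'EOF'
ID Ct 1 2 3 4 5 6
1 0 consensus - - - - -
2 0 consensus - - - - -
3 0 consensus consensus consensus consensus consensus consensus
4 0 consensus - consensus - - -
5 0 - AT AT GC GC AT
6 0 consensus - - - consensus -
7 0 consensus - - - - -
8 0 consensus consensus consensus - consensus consensus
9 0 consensus - - - - -
EOF

# Keep the header, plus every row with fewer than 2 "-" entries in columns 3..NF.
awk 'NR == 1 { print; next }          # pass the header line through unchanged
     {
         dashes = 0
         for (i = 3; i <= NF; i++)    # columns 3 onwards hold the six values
             if ($i == "-") dashes++
         if (dashes < 2) print        # at least 5 of the 6 columns occupied
     }' sample.txt > filtered.txt

cat filtered.txt
```

On the head above this keeps the header and the rows for IDs 3, 5 and 8. To run it over many files, the awk part could go in a loop over the file names, but I don't know if there's a neater way.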