I have a file with n lines.  (Each line refers to a “question”, and therefore they are labeled Q.1, Q.2, Q.3, ..., Q.n.)  Each line (question) has a “Marks” attribute, which has the value 2, 3, 4, 5, or 6.  There are n/5 lines with each value.

For example: A 10-line file (i.e., n=10) might look like

amol@mypc:~$ cat questions.txt
Q.1 2 Marks
Q.2 5 Marks
Q.3 4 Marks
Q.4 3 Marks
Q.5 6 Marks
Q.6 4 Marks
Q.7 3 Marks
Q.8 2 Marks
Q.9 6 Marks
Q.10 5 Marks

I know I can split this into five homogeneous (i.e., all the same) files with something like

amol@mypc:~$ grep " 2 Marks" questions.txt > questions2Marks.txt
amol@mypc:~$ grep " 3 Marks" questions.txt > questions3Marks.txt
amol@mypc:~$ grep " 4 Marks" questions.txt > questions4Marks.txt
amol@mypc:~$ grep " 5 Marks" questions.txt > questions5Marks.txt
amol@mypc:~$ grep " 6 Marks" questions.txt > questions6Marks.txt

Each of the resulting files will have n/5 lines.

I want to do the inverse operation, i.e., produce a transpose of the above result.  I want to split my questions.txt file into n/5 files: questions1.txt, questions2.txt, questions3.txt, ..., questionsM.txt (using M to represent n/5), where each file is five lines long and is heterogeneous (i.e., all different).

questions1.txt should contain

  • the first line in questions.txt with 2 Marks,
  • the first line in questions.txt with 3 Marks,
  • the first line in questions.txt with 4 Marks,
  • the first line in questions.txt with 5 Marks, and
  • the first line in questions.txt with 6 Marks,

in that order.  questions2.txt should contain the second line of each, etc.

So, for n=10, M obviously is 2.  I would want my example questions.txt from above broken down into these two files:

amol@mypc:~$ cat questions1.txt            
Q.1 2 Marks
Q.4 3 Marks
Q.3 4 Marks
Q.2 5 Marks
Q.5 6 Marks

amol@mypc:~$ cat questions2.txt            
Q.8 2 Marks
Q.7 3 Marks
Q.6 4 Marks
Q.10 5 Marks
Q.9 6 Marks

How can I achieve that using *nix tools (sed, awk, perl, shell script, etc.)?

  • So you want to read the file sequentially, and each time you get a group of values 2-3-4-5-6 from the second column, sort the group on that column, and write it to a numbered file? Commented Jul 24, 2015 at 13:33

2 Answers

sort -n -k2 -k1.3 file | awk '{$2!=a?x=1:x++} {print > ("file" x); a=$2}'

First, we need to sort the file correctly. -n sorts the file numerically, -k2 sorts according to the second field (the marks 2-6), and -k1.3 then sorts, within that order, by the first field starting from its 3rd character (ignoring the leading Q.). Now awk splits the output between ascending files (file1, file2, file3, ...): the counter x is reset to 1 whenever the marks value changes, so the i-th line of each marks group goes to file i.
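For readers who find the one-liner dense, here is a spelled-out sketch of the same idea, run against the sample from the question. It is a variation rather than the exact command above: the sort keys are written with explicit end positions (`-k2,2n -k1.3,1n`), the awk variable names `prev`, `i` and `out` are my own, and the scratch directory is only there so the demo does not clutter the current directory.

```shell
# Work in a scratch directory so the demo leaves no files behind elsewhere.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Recreate the 10-line sample file from the question.
cat > questions.txt <<'EOF'
Q.1 2 Marks
Q.2 5 Marks
Q.3 4 Marks
Q.4 3 Marks
Q.5 6 Marks
Q.6 4 Marks
Q.7 3 Marks
Q.8 2 Marks
Q.9 6 Marks
Q.10 5 Marks
EOF

# Key 1: field 2 (the marks), numeric.
# Key 2: field 1 from its 3rd character (the number after "Q."), numeric.
sort -k2,2n -k1.3,1n questions.txt |
awk '
    $2 != prev { i = 0 }      # marks value changed: restart the file counter
    {
        i++                   # position of this line within its marks group
        out = "file" i        # i-th line of every group goes to file<i>
        print > out
        prev = $2
    }
'

cat file1
cat file2
```

With the sample input this should leave exactly two files, file1 and file2, each holding one question per marks value in 2-3-4-5-6 order.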

The output looks like this, file1:

$ cat file1
Q.1 2 Marks
Q.4 3 Marks
Q.3 4 Marks
Q.2 5 Marks
Q.5 6 Marks

And file2:

$ cat file2
Q.8 2 Marks
Q.7 3 Marks
Q.6 4 Marks
Q.10 5 Marks
Q.9 6 Marks
  • Could also do awk '{print > "file"!(NR%2)+1}' Commented Jul 24, 2015 at 8:50
  • @chaos: You are quite correct. But my file has lots of questions, say 100, with 2, 3, 4, 5 and 6 Marks. How can I divide them into file1, file2, file3, ... up to file20 for 100 questions, so that each of the 20 files contains one question each of 2, 3, 4, 5 and 6 Marks? I hope you understand my query. Commented Jul 24, 2015 at 9:21
  • @amolveer I edited my answer, now it's working with multiple files. Commented Jul 24, 2015 at 9:33
  • @chaos : Perfect, cool man Commented Jul 24, 2015 at 9:37

An awk answer: this keeps the questions in the same order as in the source file.

$ awk '{filename = "questions" ++n[$2] ".txt"; print > filename}' questions.txt 
$ cat questions1.txt 
Q.1 2 Marks
Q.2 5 Marks
Q.3 4 Marks
Q.4 3 Marks
Q.5 6 Marks
$ cat questions2.txt 
Q.6 4 Marks
Q.7 3 Marks
Q.8 2 Marks
Q.9 6 Marks
Q.10 5 Marks
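
If the lines inside each output file should instead appear in 2-3-4-5-6 order, as the question asks, one possible variation is to sort on the marks column first and then use the same per-marks counter. This is only a sketch under some assumptions: it relies on the `-s` (stable) option, which GNU and BSD sort provide but POSIX does not require, and the variable name `out` is mine. It also appends with `>>` and closes each file after writing, so that a large number of output files does not hit the open-file limit; that means stale questionsN.txt files from an earlier run must be removed first (here a fresh scratch directory takes care of it).

```shell
# Fresh scratch directory: ">>" below appends, so no old output may exist.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Recreate the 10-line sample file from the question.
cat > questions.txt <<'EOF'
Q.1 2 Marks
Q.2 5 Marks
Q.3 4 Marks
Q.4 3 Marks
Q.5 6 Marks
Q.6 4 Marks
Q.7 3 Marks
Q.8 2 Marks
Q.9 6 Marks
Q.10 5 Marks
EOF

# Stable numeric sort on the marks column only: lines with equal marks
# keep the order they had in questions.txt.
sort -s -k2,2n questions.txt |
awk '{
    # n[$2] counts how many lines with this marks value were seen so far,
    # so the i-th such line is written to questions<i>.txt.
    out = "questions" (++n[$2]) ".txt"
    print >> out
    close(out)    # keep at most one output file open at a time
}'

cat questions1.txt
cat questions2.txt
```

For the sample input this should reproduce the questions1.txt and questions2.txt shown in the question.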
