I have a file as below

FHEAD
THEAD
TCUST
TITEM
TTEND
TTAIL
THEAD
TCUST
TCUST
TITEM
TITEM
TTEND
TTAIL
THEAD
TCUST
TITEM
TTEND
TTAIL
THEAD
TCUST
TCUST
TITEM
TTEND
TTAIL

I need to count the number of occurrences of ONLY the TCUST records between THEAD and TTAIL, and where TCUST occurs more than once, print the file name and those lines.

There will be multiple files so I need to print the filename as well.

Expected result is

THEAD
TCUST
TCUST
TITEM
TITEM
TTEND
TTAIL
THEAD
TCUST
TCUST
TITEM
TTEND
TTAIL
name of file

  • What line do you want to print? Or is it the line number? Of THEAD, TCUST lines? Or the count of TCUST lines? Each count separately or as a total? An expected result would help. Commented Nov 25, 2016 at 12:21
  • Hi Stéphane, Thanks for your reply. I want to find all occurrences of TCUST records where it is more than 1 between THEAD and TTAIL records, then print that line from THEAD to TTAIL (with more than 1 TCUST record) and also print the filename Commented Nov 25, 2016 at 12:37
  • Hi Sundeep- expected result is THEAD TCUST TCUST TTAIL THEAD TCUST TCUST TTAIL Commented Nov 25, 2016 at 12:43
  • @Sundeep. Yes. I want to extract lines between THEAD and TTAIL if TCUST occurs more than once Commented Nov 25, 2016 at 12:57
  • can you clarify: 1) the last line should be name of input file? and should that be printed only if there was at least one matching section? 2) can there be lines not matching TCUST between THEAD and TTAIL? Commented Nov 25, 2016 at 13:03

2 Answers

$ awk '
  /THEAD/{f=1; c=0; a = $0; next}
  f{a = a ORS $0; if(/TCUST/) c++}
  /TTAIL/{f=0; if(c > 1){print a; m=1} }
  ENDFILE{if(m) print FILENAME; m=0}
  ' ip.txt
THEAD
TCUST
TCUST
TITEM
TITEM
TTEND
TTAIL
THEAD
TCUST
TCUST
TITEM
TTEND
TTAIL
ip.txt
  • /THEAD/{f=1; c=0; a = $0; next} starting pattern, set flag and initialize counter. Save current line for later printing
  • f{a = a ORS $0; if(/TCUST/) c++} when flag is set, accumulate input lines in a variable and increment counter if line matches TCUST
  • /TTAIL/{f=0; if(c > 1){print a; m=1} } ending pattern, clear flag. Print contents of a if counter is greater than 1, and set variable m to record that at least one match was found
  • ENDFILE{if(m) print FILENAME; m=0} after all lines are processed for a file, print input file name if m is set and clear before next file is processed (Thanks @Costas for pointing out multiple file requirement)

Note: ENDFILE is GNU awk specific, I am not sure how to handle it without ENDFILE


Thanks @Costas for a solution that does not depend on the GNU-specific ENDFILE:

$ awk '
  FNR==1{if(m) print fname; m=0; fname=FILENAME}
  /THEAD/{f=1; c=0; a = $0; next}
  f{a = a ORS $0; if(/TCUST/) c++}
  /TTAIL/{f=0; if(c > 1){print a; m=1} }
  END{if(m) print fname}
  ' *.txt
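
The question also mentions printing the "line", which could be read as the line number where each matching block starts. As a rough sketch of that reading (not part of the answer above), the variant below reports the file name, the line number of the THEAD line, and the TCUST count for each block with more than one TCUST. The function name report_multi_tcust and the output format are assumptions made for illustration.

```shell
# report_multi_tcust (hypothetical name): for every THEAD..TTAIL block that
# holds more than one TCUST line, print "file:start_line: N TCUST records".
# Uses only portable awk features (no ENDFILE needed).
report_multi_tcust() {
  awk '
    FNR==1  {f=0}                          # new file: drop any unfinished block
    /THEAD/ {f=1; c=0; start=FNR; next}    # block starts: remember its line number
    f && /TCUST/ {c++}                     # count TCUST lines inside the block
    /TTAIL/ {                              # block ends: report if count exceeds 1
      if (f && c > 1) print FILENAME ":" start ": " c " TCUST records"
      f=0
    }
  ' "$@"
}
```

With the sample file from the question saved as ip.txt, `report_multi_tcust ip.txt` should print `ip.txt:7: 2 TCUST records` and `ip.txt:19: 2 TCUST records`.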
  • Hi Sundeep- Perfect.....works well !! Thanks for your help Commented Nov 25, 2016 at 14:27
  • Should add FNR==1{m=0} if several files Commented Nov 25, 2016 at 14:51
  • @Costas, thanks for pointing out fallacy for multiple files.. I have changed to ENDFILE which is gawk specific though.. Commented Nov 25, 2016 at 15:06
  • For non-GNU awk you should add FNR==1{if(m) print fname; m=0; fname=FILENAME} together with END{if(m) print fname} Commented Nov 26, 2016 at 11:25
  • @ Sundeep - How to move the file (having multiple TCUST records) to some backup dir.. I am getting fatal division by 0 error Commented Nov 29, 2016 at 11:06
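
On the last comment about moving such files to a backup directory: one possible approach, sketched here under assumptions, is to let awk print only the names of the matching files and leave the actual move to the shell, rather than building an mv command inside awk. The function names list_multi_tcust and move_multi_tcust and the destination argument are hypothetical, and the loop assumes file names do not contain newlines.

```shell
# list_multi_tcust (hypothetical name): print each file name once if the file
# has at least one THEAD..TTAIL block with more than one TCUST line.
list_multi_tcust() {
  awk '
    FNR==1  {seen=0; f=0}                  # new file: reset per-file state
    /THEAD/ {f=1; c=0; next}               # block starts: reset counter
    f && /TCUST/ {c++}                     # count TCUST lines inside the block
    /TTAIL/ {                              # block ends: print name on first match
      if (f && c > 1 && !seen) {print FILENAME; seen=1}
      f=0
    }
  ' "$@"
}

# move_multi_tcust DEST FILE... (hypothetical name): move the matching files
# into DEST, creating the directory if needed.
move_multi_tcust() {
  dest=$1
  shift
  mkdir -p -- "$dest" || return 1
  list_multi_tcust "$@" | while IFS= read -r file; do
    mv -- "$file" "$dest"/
  done
}
```

A call such as `move_multi_tcust backup ./*.txt` would then move only the files that contain a block with repeated TCUST records. Running `list_multi_tcust ./*.txt` first shows which files would be affected.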

With GNU sed, the task can be done with:

sed -sn '
    /THEAD/{:1;N;/TTAIL/! b1} #collect lines from THEAD to TTAIL
    /TCUST.*TCUST/{p;h}       #print if there are two TCUST and copy to hold space
    ${x;//F}                  #check hold space and print the file name if two TCUST were in it
    ' file1 file2 …
