How do I print line numbers but reset line counter at empty line?

Question

I have a file.txt containing:

this is the first
second line
not last line

fourth but first
second in list
seventh in file
seventh with nl

Normally I would just cat and pipe | it into nl like so:

$> cat file.txt | nl
1  this is the first
2  second line
3  not last line

4  fourth but first
5  second in list
6  seventh in file
7  seventh with nl

But I need the line numbers to reset when it encounters an empty line like so:

$> alias_or_function file.txt
1  this is the first
2  second line
3  not last line

1  fourth but first
2  second in list
3  seventh in file
4  seventh with nl

How could I do this using a quick function or alias in my ~/.zshrc?

A perl script can read the file in paragraph-at-a-time mode into an array, which you could then print with numbers, starting from 1 for each "paragraph". There is a learning curve, but perl is worth learning.
– waltinator, Commented Aug 22, 2020 at 5:41

Stéphane Chazelas · Accepted Answer · 2020-08-25 06:10:19Z

You could replace blank lines with \:\: which nl understands as the start of a new page body:

<your-file sed 's/^[[:space:]]*$/\\:\\:/' | nl

So as a function:

number-lines-of-paragraphs() {
  sed -e 's/^[[:space:]]*$/\\:\\:/' -- "$@" | nl
}

(note that nl will understand \:, \:\:, \:\:\: as header/body/footer delimiters if they occur in the input as well, which is why you generally can't use nl to add line numbers to arbitrary text).

You could also get the same output format without those caveats with awk as:

awk 'NF {printf "%6u\t%s\n", FNR, $0; next}; {FNR = 0; print}'

Or some of the variants posted by others here.

Above, the numbers are left padded to 6 characters and followed by a TAB character like in the default nl output format (where %6u\t%s\n is the equivalent of nl's default -s $'\t' -n rn -w 6), but you can of course adjust that format to your liking.
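As a rough illustration of adjusting that format, the sketch below pads the numbers to 3 characters and separates them from the text with two spaces. The function name is hypothetical, and a plain counter n stands in for FNR, so with several files the input is numbered as one stream.

```shell
# Hypothetical helper: same logic as the awk one-liner above, with a
# narrower number column ("%3d") and two spaces instead of a TAB.
# n is an ordinary awk variable used as the counter (an assumption of this
# sketch; the original assigns to FNR instead).
number_paragraph_lines_narrow() {
  awk 'NF {printf "%3d  %s\n", ++n, $0; next}; {n = 0; print}' "$@"
}
```

It can be used as a filter (some-command | number_paragraph_lines_narrow) or with file arguments.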

To make that a function that takes arbitrary file names as arguments, you run into one of awk's own caveats: it chokes on file names that contain = characters, as those are interpreted as awk variable assignments (at least if what's on the left of the first = looks like a valid awk variable name). That can be worked around with gawk as:

number-lines-of-paragraphs() {
  gawk -e '
    NF {printf "%6u\t%s\n", FNR, $0; next}
    {FNR = 0; print}' -E /dev/null "$@"
}

Note that if that function is passed several files, the line numbers will be reset at the start of each file. If you'd rather the contents of all files be taken as a single stream to be numbered as a whole like in the sed | nl approach, replace FNR with NR above.

In any case, both sed and gawk will understand - as meaning stdin, not the file called - in the current directory (use ./- to work around it).
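Where gawk's -E is not available, one common workaround for both caveats is to prefix relative file names with ./ before handing them to awk, since ./name=value no longer looks like a variable assignment. The following is only a sketch under that assumption; the function name and the counter variable n are made up for illustration.

```shell
# Hypothetical portable variant: rewrite each relative path as ./path so that
# awk does not treat "name=value" arguments as variable assignments.
# Side effect: "-" becomes "./-", i.e. the file called "-", not stdin.
number_lines_portable() {
  for f do
    case $f in
      (/*) ;;            # absolute paths are passed through unchanged
      (*)  f=./$f ;;     # everything else gets a ./ prefix
    esac
    set -- "$@" "$f"     # append the rewritten name ...
    shift                # ... and drop the original from the front
  done
  awk '
    FNR == 1 {n = 0}                            # restart at each new file
    NF {printf "%6d\t%s\n", ++n, $0; next}      # number non-blank lines
    {n = 0; print}                              # blank line: reset, print as is
  ' "$@"
}
```

Called without arguments, the loop does nothing and awk reads standard input as usual.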

fpmurphy · Accepted Answer · 2020-08-22 05:49:33Z

2

If you are willing to use awk:

$ cat nl.awk
{
   if ( $0 == "" ) {
      count = 0
      print
   } else
      print ++count, $0
}

Outputs:

$ awk -f nl.awk infile
1 this is the first
2 second line
3 not last line

1 fourth but first
2 second in list
3 seventh in file
4 seventh with nl
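
Since the question asks for something that can live in ~/.zshrc, the same logic could be inlined into a shell function rather than kept in a separate nl.awk file. This is a minimal sketch; the name nlreset is made up for illustration.

```shell
# Hypothetical ~/.zshrc function wrapping the awk script above.
# Empty lines reset the counter; all other lines are printed with a number.
nlreset() {
  awk '{
    if ($0 == "") { count = 0; print }
    else print ++count, $0
  }' "$@"
}
```

It would then be called as nlreset file.txt, or used at the end of a pipeline.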

I'm a bit confused. I've never used print in Unix before. I also don't know how to use awk, but I will definitely be reading up on it because I'm learning now that awk has its own syntax (I'm guessing).
– ntruter42, Commented Aug 22, 2020 at 12:46

@ntruter42, that's an awk script, not a shell script. So it's not a print command, but the print function in the awk language. Having said that, several shells including ksh and zsh have a print builtin. bash is an odd one out here, given that it has copied most things from ksh including many of its misdesigns, but not that most basic of builtins, the one to print text. sed also has a print command, abbreviated as p.
– Stéphane Chazelas, Commented Aug 22, 2020 at 13:51

@ntruter42 awk has its own builtin print and printf functions. Yes, awk has its own syntax. It is basically pattern { action }
– fpmurphy, Commented Aug 22, 2020 at 13:52

score 2 · Accepted Answer · 2020-08-24 22:54:10Z

Using awk:

awk '{ c=NF?++c:"" } {print c,$0}' file

It means:

If the line has any field (NF is non-zero, meaning it contains at least one non-blank character), increment c with ++c.
If there are no fields (the line is empty or contains only blanks), make the line counter empty.
Print the counter followed by the actual line with print c,$0.

Sadly, this short solution converts empty lines to lines that contain a space (or, actually, the value of OFS). If that is a problem, then use this (similar) solution:

awk 'NF{$0=++c" "$0}!NF{c=0}1' file

There is no reason to change empty lines to \:\: in this solution.
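
One practical difference between testing NF and testing $0 == "" is how a line that holds only whitespace is treated. The sketch below compares the two on a three-line input whose middle line is a single space; both function names are hypothetical and exist only for the comparison.

```shell
# Hypothetical helpers, defined only to compare the two kinds of test.
reset_on_empty() {   # resets only on truly empty lines
  awk '{ if ($0 == "") { c = 0; print } else print ++c, $0 }' "$@"
}
reset_on_blank() {   # resets on any line without fields (NF == 0)
  awk 'NF{$0=++c" "$0}!NF{c=0}1' "$@"
}

# The middle input line is a single space: blank, but not empty.
printf 'a\n \nb\n' | reset_on_empty
printf 'a\n \nb\n' | reset_on_blank
```

The first call keeps counting through the space-only line, so b is numbered 3; the second treats that line as a paragraph break, so b is numbered 1 again.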

edited Aug 24, 2020 at 22:54

answered Aug 23, 2020 at 6:32

user232326

jubilatious1 · Accepted Answer · 2025-03-03 19:36:11Z

Using Raku (formerly known as Perl_6)

~$ raku -ne 'state $i; .chars ?? put(++$i, "\t$_") !! (put ""; $i=0);'  file

In a comment, @waltinator suggests using Perl, so here's an answer written in Raku (which is in the Perl family). The one-liner above can be cut and pasted onto the command line. To start, we call Raku with the awk-like -ne (non-autoprinting, linewise) flags. A counter variable $i is declared with state, which means it only gets initialized once at the start of the program.

Raku's ternary operator Test ?? True !! False is used to test a line for .chars. You could write .chars.Bool or .chars.so, but .chars alone works. If True, we output the line prefixed with the incremented counter ++$i. If False, we output an empty string and reset the counter to zero with $i=0.

Sample Input:

this is the first
second line
not last line

fourth but first
second in list
seventh in file
seventh with nl

Sample Output:

1   this is the first
2   second line
3   not last line

1   fourth but first
2   second in list
3   seventh in file
4   seventh with nl

To run this as a standalone program you use the familiar #! shebang line, and call the iterator for lines(), which substitutes for the command-line flags in the one-liner. You can still keep the state $i; statement within the block, or move it outside and declare my $i; instead (see below):

#!/opt/local/bin/raku 

my $i; for lines() {
    .chars ?? put(++$i, "\t$_") !! (put ""; $i=0);
};

You save and run this program much like the awk answer posted above:*

~$ raku nbr.p6 infile

...or make the nbr.p6 script executable and run it directly as ./nbr.p6 infile.

* Purists will say that, after the language renaming, the correct extension for Raku scripts is now .raku, not .p6.

https://docs.raku.org/language/operators#index-entry-ternary
https://raku.org
