You could replace blank lines with \:\:, which nl understands as the start of a new logical page body, where numbering restarts:
<your-file sed 's/^[[:space:]]*$/\\:\\:/' | nl
So as a function:
number-lines-of-paragraphs() {
  sed -e 's/^[[:space:]]*$/\\:\\:/' -- "$@" | nl
}
(note that nl will also understand \:\:\:, \:\: and \: as header/body/footer delimiters if they occur in the input, which is why you generally can't use nl to add line numbers to arbitrary text).
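To see that caveat in action, here is a small sketch with made-up sample input, assuming an nl that follows the POSIX section-delimiter rules (GNU nl does). The third input line is a \:\: that was already in the text rather than one inserted by sed, and nl still treats it as a delimiter:

```shell
# Sample input (made up for illustration): the third line is a literal \:\:
# that was already present in the text.
out=$(printf '%s\n' 'first' 'second' '\:\:' 'third' | nl)

# Show what nl made of it.
printf '%s\n' "$out"
```

With such an nl, the \:\: line should come out as an empty line and "third" should be numbered 1 again, so text that happens to contain those sequences gets silently altered.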
You could also get the same output format without those caveats with awk:
awk 'NF {printf "%6u\t%s\n", FNR, $0; next}; {FNR = 0; print}'
Or some of the variants posted by others here.
Above, the numbers are left-padded to 6 characters and followed by a TAB character as in the default nl output format (%6u\t%s\n is the equivalent of nl's default -s $'\t' -n rn -w 6), but you can of course adjust that format to your liking.
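As an illustration of adjusting the format, here is a sketch that uses a 3-character-wide number followed by a dot and a space. The sample text and the variable names (sample, numbered) are made up for the example; only the printf format differs from the awk command above:

```shell
# Made-up sample: two paragraphs separated by a blank line.
sample=$(printf '%s\n' 'alpha' 'beta' '' 'gamma')

# Same awk program, but with "%3u. " as the prefix instead of the
# nl-like "%6u\t".
numbered=$(printf '%s\n' "$sample" |
  awk 'NF {printf "%3u. %s\n", FNR, $0; next}; {FNR = 0; print}')

printf '%s\n' "$numbered"
```

The blank line is printed as-is and resets FNR, so "gamma" is numbered 1 again.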
But making it a function that takes arbitrary file names as arguments is where you run into awk's own caveats, namely that it chokes on file names that contain = characters, as those are interpreted as awk variable assignments (at least if what's to the left of the first = looks like a valid awk variable name). That can be worked around with gawk as:
number-lines-of-paragraphs() {
  gawk -e '
    NF {printf "%6u\t%s\n", FNR, $0; next}
    {FNR = 0; print}' -E /dev/null "$@"
}
Note that if that function is passed several files, the line numbers will be reset at the start of each file. If you'd rather the contents of all files be taken as a single stream to be numbered as a whole like in the sed | nl approach, replace FNR with NR above.
In any case, both sed and gawk will understand - as meaning stdin, not the file called - in the current directory (use ./- to work around it).
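The same ./ prefix also sidesteps the = problem described earlier when gawk's -E is not available, since ./x is not a valid awk variable name. Here is a sketch of both behaviours, using a throwaway directory and a made-up file name (x=1.txt):

```shell
# Work in a throwaway directory so the relative file name can be used.
tmp=$(mktemp -d)
printf '%s\n' 'one' 'two' > "$tmp/x=1.txt"

# awk takes x=1.txt as "assign 1.txt to the variable x"; with no file
# operand left, it reads stdin (here /dev/null), so nothing is printed.
plain=$(cd "$tmp" && awk '{print FNR ": " $0}' 'x=1.txt' < /dev/null)

# With the ./ prefix, the operand no longer looks like an assignment
# and is opened as a file.
prefixed=$(cd "$tmp" && awk '{print FNR ": " $0}' './x=1.txt')

printf 'plain: [%s]\nprefixed:\n%s\n' "$plain" "$prefixed"
rm -rf "$tmp"
```

A function could apply that prefix to every relative file name it is given, but that is left out of this sketch.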
A perl script can read the file in paragraph-at-a-time mode into an array, which you could then print out with numbers, beginning at 1 with each "paragraph". There is a learning curve, but perl is worth learning.