
There is an example in this link about sed:

To delete the first number on all lines that start with a "#" use:

sed '/^#/ s/[0-9][0-9]*//'

What is the benefit of the first pattern (/^#/)? Couldn't it simply be:

sed 's/^#[0-9][0-9]*//'
  • I don't get the [0-9][0-9]*. Why not [0-9]\+? Commented Apr 18, 2012 at 6:39
  • @Bernhard One good reason is maximum portability. I don't think \+ is guaranteed by POSIX. Commented Apr 18, 2012 at 7:22
  • @Bernhard I just copied it from the link. But this Wikipedia article says that \+ is in POSIX extended regular expressions: en.wikipedia.org/wiki/Regular_expression#Syntax Commented Apr 18, 2012 at 7:54
  • POSIX sed uses BREs, though. Commented Apr 18, 2012 at 8:50
  • Every modern implementation of sed I've encountered can use EREs (sometimes with the flag -r, other times with the flag -E), and there is talk of adding this capability to the POSIX standard for sed. @jw013 is correct, though, that the current POSIX standard doesn't require sed to handle anything other than BREs. EREs handle plain +; some sed implementations enhance their BREs to also handle \+, but if I remember rightly, this is not part of POSIX. Instead of p\+ you could use p\{1,\}, which is a POSIX BRE. Commented Oct 16, 2012 at 15:51
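
To make the portability point in the comments above concrete, here is a small sketch comparing the two POSIX BRE spellings of "one or more digits". The sample line and the variable names are made up for illustration; only [0-9][0-9]* and \{1,\} come from the discussion.

```shell
# Sample line, made up for illustration.
line='#abc99def7'

# "One or more digits" written as one digit followed by zero or more digits.
star_form=$(printf '%s\n' "$line" | sed 's/[0-9][0-9]*//')

# The same idea written with a BRE interval expression, \{1,\}.
interval_form=$(printf '%s\n' "$line" | sed 's/[0-9]\{1,\}//')

# Both commands should print the same line, with only the first run of digits removed.
printf '%s\n' "$star_form" "$interval_form"
```

Both forms avoid \+, so they should behave the same on any sed that follows the POSIX BRE rules.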

2 Answers


The general format of sed commands is

[address[,address]] function

When a command has a single address, it operates on all lines that match that address. When a command has no address, it operates on every single line.

Reference: POSIX sed
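
As a rough illustration of the address forms, the sketch below runs the same substitution with no address, with one address, and with a two-address range. The three input lines and the variable names are invented for this example.

```shell
# Three lines of sample input, made up for illustration.
input=$(printf '%s\n' 'a1' '#b2' 'c3')

# No address: the function runs on every line.
no_addr=$(printf '%s\n' "$input" | sed 's/[0-9]/X/')

# One address: only lines matching /^#/ are touched.
one_addr=$(printf '%s\n' "$input" | sed '/^#/ s/[0-9]/X/')

# Two addresses: the inclusive range from line 2 through line 3.
two_addr=$(printf '%s\n' "$input" | sed '2,3 s/[0-9]/X/')

printf '%s\n' "no address:" "$no_addr" "one address:" "$one_addr" "range:" "$two_addr"
```

With no address all three digits are replaced, with /^#/ only the digit on the # line is replaced, and with 2,3 the digits on the second and third lines are replaced.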


Regarding your specific examples:

  • /^#/ s/[0-9][0-9]*//

    • This command has an address, /^#/, which matches all lines beginning with a #.

    • The substitution pattern is /[0-9][0-9]*/. This matches the first sequence of digits wherever it occurs in the line.

    • Plain English summary: delete the first sequence of digits in every line beginning with a #.

    • Example: # non-digits|5555|non-digits|5555 becomes # non-digits||non-digits|5555

  • s/^#[0-9][0-9]*//

    • There is no address, so this command operates on every single line.

    • The substitution pattern, /^#[0-9][0-9]*/, matches a sequence of consecutive digits preceded by a # anchored at the beginning of the line.

    • Plain English summary: delete # followed by a sequence of digits (and only that pattern) from the beginning of every line.

    • Example: #5555|non-digits|5555 becomes |non-digits|5555, but # non-digits|5555|non-digits|5555 is unchanged because the substitution pattern does not match.
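
If you want to check the difference yourself, a sketch like the following runs both commands on the two example lines above. The helper names with_address and anchored are hypothetical, chosen only for this illustration.

```shell
# The two example lines from the explanation above.
line1='# non-digits|5555|non-digits|5555'
line2='#5555|non-digits|5555'

# Hypothetical helpers: each feeds one line through one of the two commands.
with_address() { printf '%s\n' "$1" | sed '/^#/ s/[0-9][0-9]*//'; }
anchored()     { printf '%s\n' "$1" | sed 's/^#[0-9][0-9]*//'; }

addr1=$(with_address "$line1")   # first digit run removed, leading # kept
addr2=$(with_address "$line2")   # first digit run removed, leading # kept
anch1=$(anchored "$line1")       # no match, so the line is unchanged
anch2=$(anchored "$line2")       # the # and the digits after it are removed

printf '%s\n' "$addr1" "$addr2" "$anch1" "$anch2"
```

The addressed form keeps the # in both cases, while the anchored form either leaves the line alone or strips the # together with the digits.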

2

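The first command will match and change a line such as:

#abc99

The second will not, because the digits do not immediately follow the #.

Also, on lines where the second command does match, it removes the initial # along with the digits.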
