1

I have a log/text file which is full of highlighted ^B lines when opened with less. Usually I see ^A and I have the following sed command to replace ^A with |. I found it a while ago and do not know how it works.

sed 's/\x1/\|/g'

This seems to be the first time I have seen ^B and the sed command does not work for it.

How can I modify my sed command to handle ^A, ^B or any other possible combination?

  • ^A is printed for ASCII code 1, so \x1 matches it. ^B is printed for ASCII code 2, so you'd need \x02. Commented Oct 8, 2024 at 16:02
  • To find the octal value of any control character, use sed -n 'l0' file; such characters are shown as \nnn. For example, printf '\x01foo\x02bar\x03baz' | sed -n l0 outputs \001foo\002bar\003baz$. To replace the first control character, printf '\x01foo\x02bar\x03baz' | sed 's/\o001/X/' outputs Xfoobarbaz. Commented Oct 9, 2024 at 13:50

2 Answers

3

^B is \x2, so you can just change your \x1 to \x2 to fix it. More generally, you can use "man ascii" to see the entire ASCII table and find out that ^B (hex 02) is hex 40 less than B (hex 42). Similarly, ^[ (ESC, hex 1B) is hex 40 less than [ (hex 5B).
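As a rough illustration of that "hex 40 less" rule, here is a small sketch. The function name caret_to_hex is made up for this example and is not a standard tool; it relies on the POSIX printf behavior where a leading quote in the argument yields the character's numeric code.

```shell
# caret_to_hex: hypothetical helper, not part of any standard toolset.
# Given the character shown after the caret (e.g. B for ^B), print the
# two-digit hex code of the corresponding control character.
caret_to_hex() {
    # "'X" makes printf output the numeric code of X (66 for B);
    # 64 is hex 40, the offset between a character and its ^ form.
    printf '%02x\n' "$(( $(printf '%d' "'$1") - 64 ))"
}

caret_to_hex A     # prints 01
caret_to_hex B     # prints 02
caret_to_hex '['   # prints 1b (ESC)
```

The printed value is what goes after \x in the sed expression, e.g. \x02 for ^B.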

To handle a range of characters, use brackets. For example, sed 's/[\x1-\x1f]/|/g' will replace every control character from hex 01 through hex 1F with a vertical bar. Note that this range includes tab (hex 09).
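A quick way to try the bracket range on sample input is sketched below. It assumes GNU sed, since \xHH escapes are a GNU extension, and it sets LC_ALL=C as a precaution so the range is compared by byte value; the function names replace_ctrl and replace_ctrl_keep_tab are only illustrative.

```shell
# Illustrative wrappers (hypothetical names); assumes GNU sed for \xHH.
# Replace every byte from hex 01 to hex 1f with a vertical bar.
replace_ctrl() {
    LC_ALL=C sed 's/[\x01-\x1f]/|/g'
}

# Variant that leaves tab (hex 09) alone by splitting the range around it.
replace_ctrl_keep_tab() {
    LC_ALL=C sed 's/[\x01-\x08\x0a-\x1f]/|/g'
}

# printf octal escapes are portable: \001 is ^A, \002 is ^B, \033 is ESC.
printf '\001foo\002bar\033baz\n' | replace_ctrl    # prints |foo|bar|baz
printf 'a\tb\n' | replace_ctrl                     # prints a|b
printf 'a\tb\001c\n' | replace_ctrl_keep_tab       # tab kept, ^A replaced
```

Whether tabs should be replaced depends on the log format, so pick the variant that fits your data.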


1

As others have already mentioned, \x1 is the hex escape for the ASCII character Control-A, which most tools display as ^A. Rather than having to deal with [ranges of] specific escape sequences, consider using a POSIX character class:

sed 's/[[:cntrl:]]/|/g'

to just change all control-characters in each input line to |s so you don't have to worry about which control characters are present or what their associated ASCII escape values are.

For example:

$ printf '\x01foo\x02bar\x03\n' | cat -A
^Afoo^Bbar^C$

$ printf '\x01foo\x02bar\x03\n' | sed 's/[[:cntrl:]]/|/g' | cat -A
|foo|bar|$

See the POSIX regexp spec for more info on bracket expressions, [...], and the character class [:cntrl:].

You don't need the \ before | btw, there's nothing special about | in sed replacement text, nor even in a Basic Regular Expression as sed uses by default.
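For instance, a minimal check of both points, assuming GNU sed (for the \x01 escape) and using hypothetical function names:

```shell
# Hypothetical helper names; \x01 assumes GNU sed.
# An unescaped | in the replacement is inserted literally.
ctrl_a_to_bar() {
    sed 's/\x01/|/g'
}

# An unescaped | in a Basic Regular Expression matches a literal bar.
bar_to_comma() {
    sed 's/|/,/g'
}

printf '\001foo\001bar\n' | ctrl_a_to_bar   # prints |foo|bar
printf 'a|b|c\n' | bar_to_comma             # prints a,b,c
```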
