
How do I print the remainder of a string (not just the columns without the delimiter) after the nth delimiter?

I have a text file with a bunch of registry keys, similar to:

hku\test\user\software\microsoft\windows\currentversion\runonce\delete cached update binary

I want to print everything after the 3rd \ character, so the output I am looking for is

software\microsoft\windows\currentversion\runonce\delete cached update binary

I know how to print out specific columns with awk, but is there any simple way using bash to specify a delimiter to split the string at, instead of using the delimiter to print columns?


6 Answers

12

Pipe through cut -d \\ -f 4-.

echo 'hku\test\user\software\microsoft\windows\currentversion\runonce\delete cached update binary' | cut -d \\ -f 4-

Yields:

software\microsoft\windows\currentversion\runonce\delete cached update binary

Note the double \\: an unquoted single \ is the shell's escape character.
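As a rough sketch, the field number can be computed from a count instead of being typed by hand. The variable names n, line, rest and nodelim are illustrative and not part of the answer above. By default cut passes lines that contain no delimiter through unchanged; the -s option suppresses them.

```shell
# n is the number of leading backslash-separated fields to drop (hypothetical name)
n=3
line='hku\test\user\software\microsoft\windows\currentversion\runonce\delete cached update binary'

# Fields are numbered from 1, so the remainder starts at field n+1.
rest=$(printf '%s\n' "$line" | cut -d '\' -f "$((n + 1))-")
printf '%s\n' "$rest"

# A line without any backslash would normally be printed unchanged;
# with -s it is dropped instead, so nodelim ends up empty.
nodelim=$(printf '%s\n' 'no delimiter here' | cut -s -d '\' -f "$((n + 1))-")
```

Quoting the backslash as '\' here follows the suggestion in the comment below; writing it as \\ without quotes should behave the same.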

  • Alternatively, you can quote the \ to avoid escaping: cut -d '\' -f 4- Commented May 19, 2021 at 16:09
6

With sed:

sed -E 's/^([^\]*[\]){3}//' infile

or the same in awk:

awk '{ sub(/([^\\]*[\\]){3}/, "") }1' infile

The group (...) must match exactly 3 times. Inside it, [^\]*[\] matches zero or more characters other than a backslash (and, as a special case, other than a newline), followed by a single backslash.
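A small sketch of the same sed expression with the repeat count taken from a variable may make the behaviour easier to see. The names n, line, rest and short are assumptions made for this example. It also shows what happens when a line has fewer than n backslashes.

```shell
n=3
line='hku\test\user\software\microsoft'

# {n} repeats the group "non-backslashes followed by one backslash" exactly n times.
rest=$(printf '%s\n' "$line" | sed -E 's/^([^\]*[\]){'"$n"'}//')
printf '%s\n' "$rest"

# With fewer than n backslashes the pattern cannot match,
# so the substitution does nothing and the line is printed as it was.
short=$(printf '%s\n' 'hku\test' | sed -E 's/^([^\]*[\]){'"$n"'}//')
printf '%s\n' "$short"
```

The single-quoted script is closed around "$n" so that the shell expands only the count and leaves the backslashes in the regex untouched.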


And the shell (POSIX sh/bash/Korn/zsh) solution:

$ str='hku\test\user\software\microsoft\windows\currentversion\runonce\delete cached update binary'
$ for i in $(seq 3); do str="${str#*\\}"; done
$ printf '%s\n' "$str"

The ${parameter#word} syntax is a parameter expansion that removes the shortest prefix matching the pattern word from the value of parameter.
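To illustrate why the shortest-prefix form is the one needed here, the following sketch contrasts # with ##, and then repeats the loop with a plain counter, which avoids depending on seq (a utility that POSIX does not specify). The variable names are made up for this example.

```shell
str='hku\test\user\software\microsoft'

# One # removes the shortest prefix matching the pattern *\ (here "hku\").
shortest=${str#*\\}

# Two ## remove the longest matching prefix, leaving only the last component.
longest=${str##*\\}

# Apply the shortest-prefix removal n times using shell arithmetic only.
n=3
i=0
rest=$str
while [ "$i" -lt "$n" ]; do
  rest=${rest#*\\}
  i=$((i + 1))
done

printf '%s\n' "$shortest" "$longest" "$rest"
```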

0
3

Using awk:

awk 'BEGIN{FS=OFS="\\"; }{for(i=4;i<NF;i++) printf "%s", $i OFS; print $NF }' input

Because we want to print everything after the 3rd \ character, the field separator FS and the output field separator OFS are both set to \. It is written as FS="\\" because a single \ is an escape character inside an awk string. With \ as the field separator, a for loop prints fields 4 through NF-1, each followed by OFS, and the final print $NF outputs the last field followed by a newline.

Or like this:

awk 'BEGIN{FS=OFS="\\"; }{for(i=4;i<=NF;i++) printf "%s", $i (i==NF?ORS:OFS) }' input

This is the same approach, but a ternary operator chooses what follows each field: the loop prints OFS after $i for every field except the last, and ORS (a newline by default) after the last one.
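The loop can also take its starting field from a variable passed with -v. This is only a sketch under that assumption; n, line and rest are names chosen for the example. One caveat: a record with n or fewer fields prints nothing at all, not even a newline, because the loop body never runs.

```shell
line='hku\test\user\software\microsoft'

# Start at field n+1; print OFS between fields and ORS after the last one.
rest=$(printf '%s\n' "$line" |
  awk -v n=3 'BEGIN { FS = OFS = "\\" }
    { for (i = n + 1; i <= NF; i++) printf "%s", $i (i == NF ? ORS : OFS) }')

printf '%s\n' "$rest"
```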

Another method:

awk 'BEGIN{OFS="\\"} { n=split($0,arr,OFS); $0=""; for (i=4; i<=n; ++i) $(i-3)=arr[i]; print }' input

Here the built-in split() function splits $0 on OFS into the array arr and returns the number of elements, n. Setting $0="" empties the record, and the for loop then assigns $(i-3)=arr[i], so on the first iteration $1 becomes arr[4], because $(4-3) is $1. When the loop completes, awk has rebuilt $0 so that it starts at the fourth field of the original record, and print outputs this new $0.
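The method relies on awk rebuilding $0 with OFS whenever a field is assigned. A minimal sketch of that behaviour on its own, with made-up sample data and variable names:

```shell
# Assigning any field (even to itself) makes awk rejoin the fields with OFS.
rebuilt=$(printf '%s\n' 'a\b\c' |
  awk 'BEGIN { FS = "\\"; OFS = "/" } { $1 = $1; print }')

# After $0 = "" the record has no fields; assigning $1 and $2 creates them
# and the new record is those two fields joined by OFS.
fresh=$(printf '%s\n' 'a\b\c' |
  awk 'BEGIN { OFS = "\\" } { $0 = ""; $1 = "x"; $2 = "y"; print }')

printf '%s\n' "$rebuilt" "$fresh"
```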

0
2
## input variables
n=3
s='hku\test\user\software\microsoft\windows\currentversion\runonce\delete cached update binary'

sed

Change the n-th backslash to a newline, a character known not to be present in the line, then strip everything up to and including that newline.

printf '%s\n' "$s" |
sed -e '
  s/\\/\n/'"$n"'
  s/.*\n//
' -
software\microsoft\windows\currentversion\runonce\delete cached update binary
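Writing \n in the replacement part of s is, as far as I know, not specified by POSIX, although GNU sed accepts it. A possibly more portable sketch of the same trick uses a backslash followed by a real line break in the replacement. The shorter sample string and the variable rest are assumptions for this example.

```shell
n=3
s='hku\test\user\software\microsoft'

# The replacement is a backslash followed by an actual line break, which sed
# reads as a literal newline; the trailing $n selects the n-th match.
# The second expression deletes everything up to and including that newline.
rest=$(printf '%s\n' "$s" | sed -e 's/\\/\
/'"$n" -e 's/.*\n//')

printf '%s\n' "$rest"
```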

linux command line

Convert to one field per line, chop off the first n fields, and then join the rest back together.

printf '%s\n' "$s" | tr '\\' '\n' | tail -n+"$((n+1))" | paste -sd '\\' -
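To make the three stages visible, here is the same pipeline split into steps on a short made-up string. The intermediate variable names are illustrative. Since every backslash becomes a line break, this approach suits a single record; several input lines would be merged into one.

```shell
n=3
s='a\b\c\d\e'

# Stage 1: one field per line (a, b, c, d, e).
fields=$(printf '%s\n' "$s" | tr '\\' '\n')

# Stage 2: keep everything from line n+1 onward (d, e).
kept=$(printf '%s\n' "$fields" | tail -n +"$((n + 1))")

# Stage 3: join the remaining lines with a backslash again.
rest=$(printf '%s\n' "$kept" | paste -sd '\\' -)

printf '%s\n' "$rest"
```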

bash builtins

set -f;IFS=\\;
declare -a a=( $s )
printf '%s\n' "${a[*]:$n}"

awk

printf '%s\n' "$s" |
awk -F '\\' -v n="$n" '
NF>n {
  for (i=p=1; i<=n; i++) 
    p += 1+length($i)
  $0 = substr($0,p)
}1' -

printf '%s\n' "$s" |
perl -pals -F'/\\/,$_,$n+1' -e '
  $_=$F[-1];
' -- -n="$n"  -

python3 -c 'import sys
p,(s,n) = -1,sys.argv[1:]
for i in range(1+int(n)):
  p = 1+s.find("\\",p)
print(s[p:])
' "$s" "$n"
1

Just to not miss perl:

perl -e '@a=split /\\/, $ARGV[0]; print(join("\\", splice @a, 3), "\n")' "$str"

Where str holds the path. It is quoted so that the spaces in the path do not split it into several arguments.

Or, without the trailing newline:

perl -e '@a=split /\\/, $ARGV[0]; print join "\\", splice @a, 3' "$str"
0
awk -F"\\" '{ OFS="\\"; $1=$2=$3=""; sub(/\\*/,""); print }' filename

output

software\microsoft\windows\currentversion\runonce\delete cached update binary
2
  • Since the solution here is to hard-code the first N fields by setting them to an empty string in the awk script, you might want to mention that in case the next person wants to use this with something like N=50. Commented May 20, 2021 at 9:10
  • Brevity is acceptable, but fuller explanations are better. Commented May 20, 2021 at 15:20
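
Following up on the first comment, one way the hard-coded $1=$2=$3="" might be generalized is to blank the fields in a loop and then cut off the n separators left at the front of the record. This is a sketch, not the answerer's code; n, line and rest are assumed names.

```shell
line='hku\test\user\software\microsoft'

# Blank the first n fields; the rebuilt record then starts with exactly
# n backslashes, which substr() skips by starting at character n+1.
rest=$(printf '%s\n' "$line" |
  awk -v n=3 'BEGIN { FS = OFS = "\\" }
    { for (i = 1; i <= n; i++) $i = ""; print substr($0, n + 1) }')

printf '%s\n' "$rest"
```

Using substr() rather than a regex that strips all leading backslashes keeps an empty field right after the n-th separator intact.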
