1

I'm trying to replace - and : with ?, but only in the middle (second) section of each line, where sections are delimited by ___ (three underscores).

input:

aaa___bb-bb:bbb___cc-cc:ccc
d-d___d-ddd:d-d___e-e:e

output:

aaa___bb?bb?bbb___cc-cc:ccc
d-d___d?ddd?d?d___e-e:e

I tried the sed command below, but it only replaces the last occurrence of - or : in the middle section:

echo "aaa___bb-bb:bbb___cc-cc:ccc
d-d___d-ddd:d-d___e-e:e" | sed "s|\(___[^_]*\)[-:]\([^_]*___\)|\1?\2|g"

Output:

aaa___bb-bb?bbb___cc-cc:ccc
d-d___d-ddd:d?d___e-e:e

I'm not restricted to sed; awk, tr, etc. are fine too.

4
  • Is your input coming from a shell variable? A file? A stream? Commented Jun 9, 2015 at 1:05
  • Does that matter? For my script, it's in a pipe (output of a grep). Commented Jun 9, 2015 at 1:09
  • It matters a bit: for my existing answer, instead of <<<"$in", you'd then want to use < <(grep ...). Commented Jun 9, 2015 at 1:09
  • (BTW, I'd argue that folks using external utilities unnecessarily is a big part of why shell scripts have their reputation as slow, e.g. among the systemd folks. Sure, bash is a particularly slow shell, but there's still an orders-of-magnitude difference in performance between doing a task internally to the shell and shelling out for it, at least when operating on small amounts of data. For a shell that processes large amounts of data efficiently, consider ksh93: the real David Korn one, not the clones.) Commented Jun 9, 2015 at 1:17

3 Answers

5

Try:

awk -F"___" '{gsub(/[-:]/,"?",$2)}1' OFS="___"
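As a rough illustration of how this can be run on the question's sample input: the sketch below wraps the same awk program in a function (the name replace_middle_awk is made up for this example) and passes OFS with -v rather than as a trailing assignment, which should behave the same. Assigning to $2 via gsub makes awk rebuild the line from its fields using OFS, which is why OFS has to be set to ___ as well.

```shell
# Hypothetical wrapper name; the awk program itself is the one from the answer.
replace_middle_awk() {
    # -F splits each line on ___ ; gsub edits only field 2;
    # the trailing 1 prints the line, rebuilt with OFS between fields.
    awk -F'___' -v OFS='___' '{ gsub(/[-:]/, "?", $2) } 1'
}

printf '%s\n' 'aaa___bb-bb:bbb___cc-cc:ccc' 'd-d___d-ddd:d-d___e-e:e' |
    replace_middle_awk
```

On the two sample lines this should print the output requested in the question.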

3 Comments

I like this one-liner; however, looking at the performance, @Charles's solution consistently took about half the time (28ms vs 11ms) to accomplish the task.
@Sungam, ...incidentally, awk will be much faster with large enough input files (processing lots of lines); it's the constant startup time that hurts it here.
(However, with only two lines of input, I'd actually expect a much larger difference than 2x; it'd be interesting to look at the test harness used for collecting those numbers and figure out where its time is being spent).
3

In pure native bash, with no external utilities:

in='aaa___bb-bb:bbb___cc-cc:ccc
d-d___d-ddd:d-d___e-e:e'

while IFS= read -r line; do
   first=${line%%___*}
   last=${line##*___}
   middle=${line#*___}; middle=${middle%___*}
   printf '%s\n' "${first}___${middle//[-:]/?}___${last}"
done <<<"$in"
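Since the question's input arrives through a pipe rather than in a variable, one possible adaptation is to wrap the same loop in a function that reads standard input. This is only a sketch: the function name replace_middle is invented here, it needs bash (for the ${var//pattern/replacement} expansion), and it assumes every line has exactly three ___-delimited sections.

```shell
# Hypothetical helper: same parameter expansions as above, reading from stdin.
replace_middle() {
    local line first last middle
    while IFS= read -r line; do
        first=${line%%___*}                         # text before the first ___
        last=${line##*___}                          # text after the last ___
        middle=${line#*___}; middle=${middle%___*}  # what is left in between
        printf '%s\n' "${first}___${middle//[-:]/?}___${last}"
    done
}

printf '%s\n' 'aaa___bb-bb:bbb___cc-cc:ccc' 'd-d___d-ddd:d-d___e-e:e' | replace_middle
```

With the function defined, either `grep ... | replace_middle` or `replace_middle < <(grep ...)` should work; the second form keeps the loop in the current shell, which only matters if the loop needs to set variables that are used afterwards.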

1 Comment

A better optimization still would perhaps be to replace the grep which the OP says the input comes from with a tool which does both the pattern matching and the substitutions in the same process. My bet is on Awk.
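A minimal sketch of that idea, assuming the grep pattern can be expressed as an awk (extended) regular expression: the function name filter_and_replace and the pattern passed to it are placeholders, since the thread does not say what the real grep matches. Note that awk applies escape processing to values passed with -v, so patterns containing backslashes may need extra care.

```shell
# Hypothetical helper: select lines matching a pattern and rewrite the middle
# section in the same awk process, instead of grep piped into a second tool.
filter_and_replace() {
    # $1: regular expression standing in for the unknown grep pattern
    awk -F'___' -v OFS='___' -v pat="$1" '
        $0 ~ pat { gsub(/[-:]/, "?", $2); print }
    '
}

# Only the first sample line contains "ccc", so only that line is printed.
printf '%s\n' 'aaa___bb-bb:bbb___cc-cc:ccc' 'd-d___d-ddd:d-d___e-e:e' |
    filter_and_replace 'ccc'
```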
1

You were close with your sed solution:

sed -r ':a; s/(___.*)[-:](.*___)/\1?\2/; ta' file

Conditional branching was the only thing you were missing.
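Some background on why the loop helps: in the original attempt a single match spans the whole middle section, and the g flag resumes scanning after the end of that match, so only one character per line can be replaced. The t command instead jumps back to the label whenever a substitution succeeded, so the s command is retried until no - or : remains between the two delimiters. The sketch below is a variant that should also run on BSD sed, under two assumptions: -E is used in place of the GNU-specific -r, and the label is given in its own -e expression. The function name replace_middle_sed is invented for the example.

```shell
# Hypothetical wrapper: the same substitute-and-branch loop, written portably.
replace_middle_sed() {
    # :a  defines a label
    # s   replaces one - or : that lies between the two ___ delimiters
    # ta  jumps back to :a if the s command made a replacement
    sed -E -e ':a' -e 's/(___.*)[-:](.*___)/\1?\2/' -e 'ta'
}

printf '%s\n' 'aaa___bb-bb:bbb___cc-cc:ccc' 'd-d___d-ddd:d-d___e-e:e' |
    replace_middle_sed
```

Like the original, this relies on each line having exactly two ___ delimiters; with more sections, every section between the first and last delimiter would be rewritten.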
