Replace a character in a section of a string in Bash

Question

I'm trying to replace - and : with ? only in the middle (second) section of each line, where sections are delimited by ___ (three underscores).

input:

aaa___bb-bb:bbb___cc-cc:ccc
d-d___d-ddd:d-d___e-e:e

output:

aaa___bb?bb?bbb___cc-cc:ccc
d-d___d?ddd?d?d___e-e:e

I tried the sed command below, but it only replaces the last occurrence of - or : in the middle section:

echo "aaa___bb-bb:bbb___cc-cc:ccc
d-d___d-ddd:d-d___e-e:e" | sed "s|\(___[^_]*\)[-:]\([^_]*___\)|\1?\2|g"

Output:

aaa___bb-bb?bbb___cc-cc:ccc
d-d___d-ddd:d?d___e-e:e

I'm not restricted to only use sed. awk, tr, etc are fine too.

Is your input coming from a shell variable? A file? A stream?
– Charles Duffy, Commented Jun 9, 2015 at 1:05
Does that matter? For my script, it's in a pipe (output of a grep).
– Sungam, Commented Jun 9, 2015 at 1:09
It matters a bit -- for my existing answer, instead of <<<"$in", you'd then want to use < <(grep ...).
– Charles Duffy, Commented Jun 9, 2015 at 1:09
(BTW, I'd argue that folks using external utilities unnecessarily is a big part of why shell scripts have their reputation as slow, i.e. among the systemd folks. Sure, bash is a particularly slow shell, but it's still an orders-of-magnitude difference in performance between doing a task internally to the shell and shelling out for it, at least when operating on small amounts of data. For a shell that processes large amounts of data efficiently, consider ksh93 -- the real David Korn one, not the clones.)
– Charles Duffy, Commented Jun 9, 2015 at 1:17

Ell · Accepted Answer · 2015-06-09 01:10:29Z

5

Try:

awk -F"___" '{gsub(/[-:]/,"?",$2)}1' OFS="___"
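To unpack the one-liner: -F"___" splits each line on the three-underscore delimiter, gsub rewrites every - or : in the second field only, and because changing a field makes awk rebuild the line using OFS, setting OFS to ___ puts the delimiter back. The trailing 1 is an always-true pattern whose default action prints the line. Below is a minimal sketch of the same command wrapped in a function that reads from a pipe; the name replace_middle_awk is made up for illustration, and it assumes every line has the three sections shown in the question.

```shell
# Hypothetical wrapper (the name is illustrative): reads lines on stdin and
# replaces '-' and ':' with '?' in the second ___-delimited field only.
replace_middle_awk() {
  awk -F'___' -v OFS='___' '{ gsub(/[-:]/, "?", $2) } 1'
}

# Sample input from the question, fed through a pipe.
printf '%s\n' 'aaa___bb-bb:bbb___cc-cc:ccc' 'd-d___d-ddd:d-d___e-e:e' |
  replace_middle_awk
```

Passing OFS with -v instead of as a trailing argument is only a stylistic choice here; both forms set it before any input is read.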

edited Jun 9, 2015 at 1:10

answered Jun 9, 2015 at 1:10


3 Comments

Sungam Over a year ago

I like this one-liner. However, looking at the performance, @Charles's solution consistently took about half the time (28ms vs 11ms) to accomplish the task.

Charles Duffy Over a year ago

@Sungam, ...incidentally, awk will be much faster with large enough input files (processing lots of lines); it's the constant startup time that hurts it here.

Charles Duffy Over a year ago

(However, with only two lines of input, I'd actually expect a much larger difference than 2x; it'd be interesting to look at the test harness used for collecting those numbers and figure out where its time is being spent).

Charles Duffy · Accepted Answer · 2015-06-09 01:08:27Z

3

In pure native bash, with no external utilities:

in='aaa___bb-bb:bbb___cc-cc:ccc
d-d___d-ddd:d-d___e-e:e'

while IFS= read -r line; do
   first=${line%%___*}
   last=${line##*___}
   middle=${line#*___}; middle=${middle%___*}
   printf '%s\n' "${first}___${middle//[-:]/?}___${last}"
done <<<"$in"
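Since the input in the question arrives from a pipe rather than a variable, the same loop can be wrapped in a function that reads standard input. This is only a sketch: the function name replace_middle is invented here, it needs bash (for the ${var//pattern/replacement} expansion), and it assumes every line contains the ___ delimiter at least twice.

```shell
# Sketch of the loop above as a stdin filter (bash-only).
# The name replace_middle is hypothetical.
replace_middle() {
  local line first middle last
  # The `|| [ -n "$line" ]` part also handles a final line with no newline.
  while IFS= read -r line || [ -n "$line" ]; do
    first=${line%%___*}                         # text before the first ___
    last=${line##*___}                          # text after the last ___
    middle=${line#*___}; middle=${middle%___*}  # what lies between them
    printf '%s\n' "${first}___${middle//[-:]/?}___${last}"
  done
}

# Used at the end of a pipeline:
printf '%s\n' 'aaa___bb-bb:bbb___cc-cc:ccc' 'd-d___d-ddd:d-d___e-e:e' | replace_middle
```

If the loop must run in the current shell (for example, to set variables that outlive it), feed it with process substitution instead, as in replace_middle < <(grep ...).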

answered Jun 9, 2015 at 1:08


1 Comment

tripleee Over a year ago

A better optimization still would perhaps be to replace the grep which the OP says the input comes from with a tool which does both the pattern matching and the substitutions in the same process. My bet is on Awk.
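As a rough sketch of that idea, awk can select lines with a pattern and rewrite the middle field in the same process. The pattern /^d/ below is purely a placeholder, since the question does not show the actual grep expression, and the function name is likewise made up. Note that awk patterns use extended regular expressions, so a basic-regex grep pattern may need adjusting.

```shell
# Hypothetical example: /^d/ stands in for whatever the original grep matched.
filter_and_replace() {
  # Only lines matching the pattern are rewritten and printed.
  awk -F'___' -v OFS='___' '/^d/ { gsub(/[-:]/, "?", $2); print }'
}

printf '%s\n' 'aaa___bb-bb:bbb___cc-cc:ccc' 'd-d___d-ddd:d-d___e-e:e' |
  filter_and_replace
```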

josifoski · Accepted Answer · 2015-06-09 05:42:34Z

1

You were close with your sed solution:

sed -r ':a; s/(___.*)[-:](.*___)/\1?\2/; ta' file

Conditional branching is all you needed.
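The original attempt changed only one character per line because s///g never rescans text it has already matched: a single match spans the whole ___...___ block, so there is nothing left for a second match, and the greedy [^_]* pushes that one replacement to the last - or :. The loop reruns the substitution until it stops matching; t jumps back to the label a only when the preceding s made a replacement. Here is a sketch of a more portable spelling, assuming a sed that accepts -E (recent GNU and BSD versions do) and splitting the script into -e pieces so the label is not followed by a semicolon. The function name is illustrative only.

```shell
# Same loop as the sed command above, written with -E and separate -e expressions.
replace_middle_sed() {
  sed -E -e ':a' -e 's/(___.*)[-:](.*___)/\1?\2/' -e 'ta'
}

printf '%s\n' 'aaa___bb-bb:bbb___cc-cc:ccc' 'd-d___d-ddd:d-d___e-e:e' |
  replace_middle_sed
```

Like the original, this relies on each line having exactly two ___ delimiters; with more sections, every section between the first and last delimiter would be rewritten.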

answered Jun 9, 2015 at 5:42

