How can I properly use the 'sed' s command in this particular scenario

Question

I have a names.txt file where each line is of the form:

xxxxxx   random_string_of_characters    2015

where xxxxxx is a 6-digit number and random_string_of_characters can be anything. I want to use the substitute command to replace everything between xxxxxx and 2015 in each line (the whitespace and the random_string_of_characters) with a single space, so that each line looks like this:

xxxxxx 2015

So, what would be the best way to accomplish this?

@jasonwryan - I'd use $NF instead of $3, just to be sure, since "random_string_of_characters can be anything"
– don_crissti, Commented Apr 29, 2015 at 1:18
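
To illustrate the point of this comment: the comment it replies to is not shown in this thread, so the `$3` form below is an assumption about what was suggested. The sample line and the variable names are made up. A minimal sketch of why `$NF` (the last field) is safer than `$3` when the middle string contains whitespace:

```shell
# Hypothetical sample line: the "random string" itself contains spaces.
line='123456   foo bar baz    2015'

# $3 is the third whitespace-separated field, which here is part of the random string.
third=$(printf '%s\n' "$line" | awk '{print $1, $3}')

# $NF is always the last field, regardless of how many fields precede it.
last=$(printf '%s\n' "$line" | awk '{print $1, $NF}')

printf '%s\n' "$third" "$last"
```

With this input the first command prints `123456 bar` and the second prints `123456 2015`.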

Eric Renouf · Accepted Answer · 2015-04-29 01:06:52Z

2

You could do

sed -i -e 's/[[:space:]]\+.\+2015$/ 2015/' names.txt

The -i option saves the result back into the same file. Drop the -i if you just want to print to stdout, which you could redirect into another file.

It will match one or more whitespace characters followed by anything up to 2015 at the end of the line, then replace that whole match with " 2015".
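
As a quick way to try the expression without touching names.txt, it can be run on a pipe instead. This is only a sketch: the two sample lines and the variable name `out` are made up for illustration, and `\+` in a basic regular expression is a GNU sed extension.

```shell
# Hypothetical sample input: one space-separated line, one tab-separated line.
out=$(printf '123456   foo bar_baz    2015\n654321\tx_y_z\t2015\n' |
  sed -e 's/[[:space:]]\+.\+2015$/ 2015/')

# Show the transformed lines.
printf '%s\n' "$out"
```

Both lines come out as the six-digit number, a single space, and 2015, because `[[:space:]]` matches tabs as well as spaces.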

Another possibility would be to do

sed -e 's/^\([[:digit:]]\{6\}\).\+\([[:digit:]]\{4\}\)$/\1 \2/' names.txt

This will match 6 digits at the start of the line and 4 at the end, and replace the line with those two matches separated by a space. It will leave any other lines unchanged.
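
A small sketch of that behaviour, again on made-up input (the sample lines and the variable name `out` are assumptions): a line ending in a different 4-digit year is still rewritten, while a line that does not start with 6 digits passes through untouched.

```shell
# Hypothetical sample input: two data lines and one line that does not fit the pattern.
out=$(printf '%s\n' \
  '123456   foo bar    2015' \
  '222222 baz 1999' \
  'not a data line' |
  sed -e 's/^\([[:digit:]]\{6\}\).\+\([[:digit:]]\{4\}\)$/\1 \2/')

# Show the result, one line per input line.
printf '%s\n' "$out"
```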

edited Apr 29, 2015 at 1:06

answered Apr 29, 2015 at 0:33

Since you're using 2015 in the RHS, there's no point using it in the LHS too, so sed 's/[[:blank:]].*/ 2015/' would do. If the last four digits aren't always the same, you could use sed 's/[[:blank:]].*[[:blank:]]/ /'
– don_crissti, Commented Apr 29, 2015 at 1:06

Provided the 2015 isn't crucial to matching some lines, I like both of your solutions.
– Eric Renouf, Commented Apr 29, 2015 at 1:09

Well, I assume it isn't, as per the OP: "where each line is of the form..."
– don_crissti, Commented Apr 29, 2015 at 1:13
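
The difference between the two expressions suggested in these comments can be seen on input where the year varies. This is a sketch on made-up data; the sample lines and the variable names are assumptions.

```shell
# Hypothetical sample input: the second line ends in a different year.
sample=$(printf '123456   foo bar    2015\n222222 baz   1999\n')

# First expression: everything from the first blank onward becomes " 2015".
fixed_year=$(printf '%s\n' "$sample" | sed 's/[[:blank:]].*/ 2015/')

# Second expression: everything from the first blank through the last blank
# becomes a single space, so whatever follows the last blank is kept.
kept_year=$(printf '%s\n' "$sample" | sed 's/[[:blank:]].*[[:blank:]]/ /')

printf '%s\n' "$fixed_year" "$kept_year"
```

The first expression rewrites the second line to end in 2015, while the second one preserves 1999.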

hildred · Accepted Answer · 2015-04-29 01:09:21Z

0

My preference is to anchor to the start of the line to reduce backtracking, and I like extended regexps, so I would use

sed -re 's/^([0-9]{6}) .*( 2015)$/\1\2/' names.txt

What this does:

-r extended regexp (GNU spelling; -E is the more portable equivalent)
-e expression to follow
s substitute
^ beginning of line
( start subpattern
[0-9] a digit
{6} previous occurs exactly six times
) end subpattern
(space) a literal space
. any single character
* previous occurs zero or more times
$ end of line
\1 first subpattern
\2 second subpattern

answered Apr 29, 2015 at 1:09

If the empty space is a tab, this won't work. Just remove the spaces from the expression; e.g. sed -E 's/^([0-9]{6}).*(2015)$/\1 \2/' would work in all cases.
– don_crissti, Commented Apr 29, 2015 at 1:37
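
A sketch of the tab problem on made-up input (the sample lines and variable names are assumptions, and `-E` is used for both commands in place of the answer's `-r`): the expression with literal spaces leaves a tab-separated line untouched, while the one without them rewrites both lines.

```shell
# Hypothetical sample input: first line uses spaces, second line uses tabs.
sample=$(printf '123456   foo bar    2015\n654321\tfoo bar\t2015\n')

# Expression from the answer: requires literal spaces around the middle part.
with_spaces=$(printf '%s\n' "$sample" | sed -E 's/^([0-9]{6}) .*( 2015)$/\1\2/')

# Expression from the comment: no literal spaces, so tabs are matched by .* as well.
without_spaces=$(printf '%s\n' "$sample" | sed -E 's/^([0-9]{6}).*(2015)$/\1 \2/')

printf '%s\n' "$with_spaces" "$without_spaces"
```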
