2

I have a names.txt file where each line is of the form:

xxxxxx   random_string_of_characters    2015

where xxxxxx is a 6-digit number, and random_string_of_characters can be anything. I want to use sed's substitute command to replace all the whitespace and the random_string_of_characters between xxxxxx and 2015 on each line, so that each line looks like this:

xxxxxx 2015

So, what would be the best way to accomplish this?

3
  • 2
    If you are not wedded to sed: awk '{print $1,$3}'... Commented Apr 29, 2015 at 1:09
  • 1
    @jasonwryan - I'd use $NF instead of $3 just to be sure as "random_string_of_characters can be anything" Commented Apr 29, 2015 at 1:18
  • @don_crissti Good point. Commented Apr 29, 2015 at 2:19
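
To illustrate the difference between $3 and $NF raised in the comments above, here is a small sketch. The sample line is made up for the example, and it assumes the random string may itself contain spaces:

```shell
# Hypothetical sample line: the "random string" contains spaces,
# so awk splits the line into more than three fields.
line='123456   foo bar baz    2015'

# $3 is the third whitespace-separated field, which here is part of the random string.
third=$(printf '%s\n' "$line" | awk '{print $1, $3}')

# $NF is always the last field, regardless of how many fields there are.
last=$(printf '%s\n' "$line" | awk '{print $1, $NF}')

printf '%s\n' "$third" "$last"
```

With this input the $3 version prints 123456 bar, while the $NF version prints 123456 2015.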

2 Answers

2

You could run

sed -i -e 's/[[:space:]]\+.\+2015$/ 2015/' names.txt

to save the result back into the same file. Drop the -i if you just want to print to stdout, which you could then redirect into another file.

It will match one or more whitespace characters followed by anything up to 2015 at the end of the line, then replace that whole match with " 2015".
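
A quick way to try this substitution without touching names.txt is to feed it a couple of sample lines on stdin. The sample lines below are invented for the sketch, and note that \+ in a basic regular expression is a GNU sed extension, so this assumes GNU sed:

```shell
# Hypothetical sample input: one space-separated line, one tab-separated line.
sample=$(printf '123456   foo bar baz    2015\n654321\tqux\t2015\n')

# Same expression as above, without -i, reading from stdin.
# Assumes GNU sed, since \+ is not part of POSIX basic regular expressions.
result=$(printf '%s\n' "$sample" | sed -e 's/[[:space:]]\+.\+2015$/ 2015/')

printf '%s\n' "$result"
```

Both lines should come out as the 6-digit number, a single space, and 2015.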

Another possibility would be to do

sed -e 's/^\([[:digit:]]\{6\}\).\+\([[:digit:]]\{4\}\)$/\1 \2/'  names.txt

This matches 6 digits at the start of the line and 4 at the end, and prints those two matches with a single space between them. Lines that do not match are left unchanged.
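
As a rough check of that behaviour, the sketch below runs the same expression over three made-up lines: one in the expected form, one that does not match at all, and one ending in a different year. It again assumes GNU sed because of the \+ operator:

```shell
# Hypothetical sample input.
sample=$(printf '123456   abc   2015\nnot a record\n654321 xyz 1999\n')

# Keep the leading 6 digits and the trailing 4 digits, joined by one space.
result=$(printf '%s\n' "$sample" |
  sed -e 's/^\([[:digit:]]\{6\}\).\+\([[:digit:]]\{4\}\)$/\1 \2/')

printf '%s\n' "$result"
```

The middle line passes through untouched, and the last line keeps its own year (1999) because this version does not hard-code 2015.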

3
  • 1
    Since you're using 2015 in the RHS there's no point using it in the LHS too so sed 's/[[:blank:]].*/ 2015/' would do. If last four digits aren't always the same you could use sed 's/[[:blank:]].*[[:blank:]]/ /' Commented Apr 29, 2015 at 1:06
  • Provided the 2015 isn't crucial to matching some lines I like both of your solutions. Commented Apr 29, 2015 at 1:09
  • 1
    Well, I assume it isn't, as per the OP: "where each line is of the form..." Commented Apr 29, 2015 at 1:13
0

My preference is to anchor to the start of the line to reduce backtracking, and I like extended regular expressions, so I would use

sed -re 's/^([0-9]{6}) .*( 2015)$/\1\2/' names.txt

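What this does:

-r extended regexp (a GNU sed option; -E is the more portable spelling)
-e expression to follow
s substitute
^ beginning of line
( start of a subpattern
[0-9] a digit
{6} previous item occurs exactly six times
) end of a subpattern
(a literal space) matches exactly one space character
. any single character
* previous item occurs zero or more times
$ end of line
\1 first subpattern (the six digits)
\2 second subpattern (the space followed by 2015)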
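
The sketch below tries this expression on two made-up lines, one separated by spaces and one by tabs, to show how the literal spaces in the pattern affect matching. It uses -E instead of -r, which GNU sed also accepts, and the variable names are only for illustration:

```shell
# Hypothetical sample lines: same content, different separators.
spaces='123456   some random text    2015'
tabs=$(printf '123456\tsome random text\t2015')

# The expression from this answer, which requires literal spaces.
strict_spaces=$(printf '%s\n' "$spaces" | sed -E 's/^([0-9]{6}) .*( 2015)$/\1\2/')
strict_tabs=$(printf '%s\n' "$tabs" | sed -E 's/^([0-9]{6}) .*( 2015)$/\1\2/')

# A variant with no literal spaces in the pattern; the space is added in the replacement.
loose_tabs=$(printf '%s\n' "$tabs" | sed -E 's/^([0-9]{6}).*(2015)$/\1 \2/')

printf '%s\n' "$strict_spaces" "$strict_tabs" "$loose_tabs"
```

The space-separated line is rewritten to 123456 2015, the tab-separated line is left unchanged by the strict pattern, and the variant without literal spaces handles the tab-separated line as well.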
1
  • 1
    If empty space = tab this won't work; just remove the spaces from the expression, e.g. sed -E 's/^([0-9]{6}).*(2015)$/\1 \2/' would work in all cases Commented Apr 29, 2015 at 1:37
