I have a folder containing multiple files in this format:

1    Hello1    World1    Example1
2    Hello2    World2    Example2
...

The delimiter is a tab (\t).

I want to remove all leading/trailing spaces from each column if any exist.

For example:

1    Hello1\s    World1    \sExample1

Here \s represents a single space character. The expected output would be:

1    Hello1    World1    Example1

I don't want to remove every space, because a value may itself contain a space, for example Hel lo.

Also, I want to edit the current file rather than create a new one.

  • \s means "space/blank character, or tab, or newline, or carriage return, or formfeed, or vertical tab". That's not what you're trying to match so you shouldn't use \s in your example. Just say "space character" instead of "spaces" and use <space> or similar in the example to represent it to be less ambiguous. Commented Aug 25, 2020 at 12:15

2 Answers

1

Using GNU sed, we can strip any spaces adjacent to a tab, as shown:

$ sed -Ei -e 's/[ ]*\t[ ]*/\t/g' file
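
As written, that substitution only touches spaces next to a tab, so spaces at the very start or end of a line are left in place. A minimal sketch that also covers those two cases, assuming GNU sed (for -E and for \t inside the pattern); the function name trim_fields_sed is made up for illustration:

```shell
# Hypothetical helper; assumes GNU sed.
trim_fields_sed() {
  # 1st command: drop spaces at the start or end of the line.
  # 2nd command: drop spaces on either side of every tab.
  sed -E 's/^ +| +$//g; s/ *\t */\t/g' "$@"
}

# cat -Evt makes tabs (^I) and line ends ($) visible.
printf ' 1\tHello1 \tWorld1\t Example1 \n' | trim_fields_sed | cat -Evt
# 1^IHello1^IWorld1^IExample1$
```

Adding -i to the sed call would apply the same script to a file in place.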

With awk, we iterate over the fields and trim each one:

$ awk -F '\t' -v OFS='\t' '
{
  for (i=1; i<=NF; ++i) {
    gsub(/^[ ]+|[ ]+$/, "", $i)
  }
}1
' file > foo && mv foo file 
  • Thank you! :) @Rakesh Commented Aug 23, 2020 at 22:37
  • Blank characters are literal, so there's no need to put them in a bracket expression. Also, the sed script wouldn't remove spaces at the start of the first field or the end of the last field. Commented Aug 25, 2020 at 12:09
0

You just need to set FS to allow for blanks around each tab, remove leading/trailing blanks from the record as a whole, and then assign to a field so that awk rebuilds the record with every FS replaced by OFS:

awk -F' *\t *' -v OFS='\t' '{gsub(/^ +| +$/,""); $1=$1} 1' file

Using modified input to show leading/trailing blanks also being handled and cat -Evt to make tabs, etc. visible:

$ cat -Evt file
 1^IHello1 ^IWorld1^I Example1 $
2^IHello2^IWorld2^IExample2$

$ awk -F' *\t *' -v OFS='\t' '{gsub(/^ +| +$/,""); $1=$1} 1' file | cat -Evt
1^IHello1^IWorld1^IExample1$
2^IHello2^IWorld2^IExample2$
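
The same command leaves spaces inside a value alone, which matters for values such as Hel lo from the question. A small sketch to check that, wrapping the one-liner in a function (trim_fields is a made-up name) and feeding it a made-up input line:

```shell
# Hypothetical wrapper around the awk one-liner above.
trim_fields() {
  awk -F' *\t *' -v OFS='\t' '{gsub(/^ +| +$/,""); $1=$1} 1' "$@"
}

# The space inside "Hel lo" is not next to a tab or a line edge, so it stays.
printf ' 1\tHel lo \t World1\tExample1 \n' | trim_fields | cat -Evt
# 1^IHel lo^IWorld1^IExample1$
```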

You also said "I wish to edit the current file and not create a new one". You can't, unless you use ed or get creative with your coding (e.g. in awk you could read all the input into an array, modify it, close the file you were reading, and then write the modified array contents back to that original file). The tools that claim to do "inplace editing" using -i type arguments actually create a new file to write the results to and then overwrite the original file with that new one, just as if you had manually written cmd file > tmp && mv tmp file.
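
Since the question mentions a folder of files, the cmd file > tmp && mv tmp file pattern can be wrapped in a loop. This is only a sketch: the function name trim_dir is hypothetical, it assumes mktemp is available, and because the temp file replaces the original, the original's permissions and ownership are not carried over.

```shell
# Hypothetical helper: trims every regular file directly inside a directory.
trim_dir() {
  dir=$1
  for f in "$dir"/*; do
    [ -f "$f" ] || continue        # skip subdirectories and unmatched globs
    tmp=$(mktemp) || return 1      # temp file that receives the result
    if awk -F' *\t *' -v OFS='\t' '{gsub(/^ +| +$/,""); $1=$1} 1' "$f" > "$tmp"
    then
      mv "$tmp" "$f"               # replace the original only if awk succeeded
    else
      rm -f "$tmp"
    fi
  done
}

# Usage (the path is a placeholder):
#   trim_dir ./data
```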

Having said all of that, if you want the pseudo in-place editing I just described then, just as you can use sed -i ... with GNU sed, with GNU awk you can do:

awk -i inplace -F' *\t *' -v OFS='\t' '{gsub(/^ +| +$/,""); $1=$1} 1' file

And if you truly want editing without a temp file, then, provided your file is small enough to fit in memory, you can do the following with any awk:

$ cat -Evt file
 1^IHello1 ^IWorld1^I Example1 $
2^IHello2^IWorld2^IExample2$

$ cat tst.awk
BEGIN {
    FS = " *\t *"
    OFS = "\t"
}
{
    gsub(/^ +| +$/,"")
    $1 = $1
    recs[NR] = $0
}
END {
    close(FILENAME)
    for (i=1; i<=NR; i++) {
        print recs[i] > FILENAME
    }
}

$ awk -f tst.awk file

$ cat -Evt file
1^IHello1^IWorld1^IExample1$
2^IHello2^IWorld2^IExample2$

The close(FILENAME) is probably not necessary, as I expect the input file will already be closed on entry to the END section, but it won't hurt.
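
One way to see that this approach rewrites the existing file rather than swapping in a new one is to compare the file's inode number before and after. A rough sketch, assuming mktemp and ls -i are available; trim_in_memory is a made-up wrapper around the script above:

```shell
# Hypothetical wrapper around the in-memory awk script shown above.
trim_in_memory() {
  awk '
    BEGIN { FS = " *\t *"; OFS = "\t" }
    { gsub(/^ +| +$/, ""); $1 = $1; recs[NR] = $0 }
    END {
      close(FILENAME)
      for (i = 1; i <= NR; i++) print recs[i] > FILENAME
    }
  ' "$1"
}

f=$(mktemp)
printf ' 1\tHello1 \tWorld1\t Example1 \n' > "$f"
before=$(ls -i "$f" | awk '{print $1}')   # inode number before the rewrite
trim_in_memory "$f"
after=$(ls -i "$f" | awk '{print $1}')    # inode number after the rewrite
[ "$before" = "$after" ] && echo "same inode"
rm -f "$f"
```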
