I have the following line in a text file:

abc|45|"Do not replace | in this"|0.23

I want a way to remove the | character only from text inside double quotes, resulting in:

abc|45|"Do not replace in this"|0.23

I have a large number of files and lines that need this replacement. Is there any way I could do it with a shell script?

4 Answers

4

New answer (2022) using Miller to first remove all pipe symbols from the 3rd field of the header-less CSV input, and then collapse all whitespace. The quoting is retained from the original.

$ mlr --csv --fs pipe -N --quote-original put '$3 = collapse_whitespace(gsub($3,"[|]",""))' file
abc|45|"Do not replace in this"|0.23

The same thing, but loop over all fields and try to modify all that are strings:

$ mlr --csv --fs pipe -N --quote-original put 'for (k,v in $*) { is_string(v) { $[k] = collapse_whitespace(gsub(v,"[|]","")) } }' file
abc|45|"Do not replace in this"|0.23

Applying this to individual files with in-place editing would be done using

mlr -I --csv ... *.csv

... after making sure that those files are properly backed up.


Old answer (2019):

Using csvformat from CSVKit, with sed:

$ csvformat -d '|' file | sed 's/| //' | csvformat -D '|'
abc|45|Do not replace in this|0.23

The first call to csvformat changes the CSV delimiter from | to comma. The pipe in the text (and the space after it) can then be removed with a simple call to sed. We then call csvformat again to change the delimiters back to |.

Note that the double quotes are not used in the final output. This is because they are no longer needed. They were never part of the actual data in the first place, but were only needed to delimit that field due to the pipe used in it (the original data was a properly quoted CSV file).

If you want the fields quoted in the output, use -U1 with the final call to csvformat. This quotes all fields.

2

You can use a simple sed substitution. The first group captures text that starts with " and contains no embedded ", up to the occurrence of |; the second group captures from after the | to the closing ". Printing only the two groups drops the | that sat between them:

sed 's/\("[^"]*\).* |\([^"]*"\)/\1\2/g'
1
  • The space is not part of the OP description. It works for the specific example used, but not required in general. Commented Nov 30, 2019 at 1:37
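For input where the | is not necessarily followed or preceded by a space, a minimal sketch with awk is one alternative. It is not taken from any of the answers here: the function name strip_quoted_pipes is made up for illustration, and the sketch assumes that quotes are balanced on every line, that fields contain no escaped "" pairs, and that no quoted field spans several lines. It also squeezes runs of spaces inside quoted text so the result matches the output asked for in the question.

```shell
# Hypothetical helper: delete every | that sits inside a double-quoted field.
# Assumes balanced quotes per line and no escaped "" pairs inside fields.
strip_quoted_pipes() {
    awk 'BEGIN { FS = OFS = "\"" }
         {
             # Splitting on " puts the quoted text in the even-numbered fields.
             for (i = 2; i <= NF; i += 2) {
                 gsub(/[|]/, "", $i)    # drop pipes inside the quotes
                 gsub(/  +/, " ", $i)   # squeeze the double space left behind
             }
             print
         }' "$@"
}

printf '%s\n' 'abc|45|"Do not replace | in this"|0.23' | strip_quoted_pipes
```

With no arguments the function reads standard input; file names given as arguments are passed straight to awk, and the result goes to standard output rather than back into the files.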
0

Ruby has a nice CSV library, so this can be a one-liner:

ruby -rcsv -e 'CSV.filter(col_sep: "|") {|row| row.each {|field| field.gsub!(/\| /, "")}}' file
0

Using Perl (sorry for the obfuscated code):

perl -pe 's/".*?"/ $& =~ tr[|][]dr /ge' file

Explanation:

  • perl -pe proc - apply proc to all the lines
  • s/RE/ f($&) /ge - substitutes RE by the result of f(matching string)
  • tr[|][]dr - translates | by nothing (=deletes)
