Shell : How to replace character only inside double quoted text in a file?

Question

I have the following line in a text file:

abc|45|"Do not replace | in this"|0.23

I want to remove the | character only where it appears inside double-quoted text, resulting in:

abc|45|"Do not replace in this"|0.23

I have a large number of files and lines that need this replacement. Is there any way I could do it with a shell script?

Kusalananda · Accepted Answer · 2022-10-16 20:01:37Z

New answer (2022): use Miller to first remove all pipe symbols from the third field of the header-less CSV input, and then collapse the remaining whitespace. The quoting is retained from the original.

$ mlr --csv --fs pipe -N --quote-original put '$3 = collapse_whitespace(gsub($3,"[|]",""))' file
abc|45|"Do not replace in this"|0.23

The same thing, but loop over all fields and try to modify all that are strings:

$ mlr --csv --fs pipe -N --quote-original put 'for (k,v in $*) { is_string(v) { $[k] = collapse_whitespace(gsub(v,"[|]","")) } }' file
abc|45|"Do not replace in this"|0.23

Applying this to individual files with in-place editing would be done using

mlr -I --csv ... *.csv

... after making sure that those files are properly backed up.
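
The backup step mentioned above could be sketched as follows. This is only an illustration, not part of the original answer: the helper name backup_files and the .bak suffix are assumptions.

```shell
# Hypothetical helper: copy each regular file to FILE.bak, keeping
# timestamps and permissions, and stop at the first failed copy.
backup_files() {
    for f in "$@"; do
        [ -f "$f" ] || continue      # skip unmatched globs and directories
        cp -p -- "$f" "$f.bak" || return 1
    done
}

# Usage, with the Miller options left elided as in the answer:
#   backup_files *.csv && mlr -I --csv ... *.csv
```

Chaining with && means the in-place edit only runs if every copy succeeded.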

Old answer (2019):

Using csvformat from CSVKit, with sed:

$ csvformat -d '|' file | sed 's/| //' | csvformat -D '|'
abc|45|Do not replace in this|0.23

The first call to csvformat changes the CSV delimiter from | to comma. The pipe in the text (and the space after it) can then be removed with a simple call to sed. We then call csvformat again to change the delimiters back to |.

Note that the double quotes are not used in the final output. This is because they are no longer needed. They were not ever part of the actual data in the first place, but only needed to delimit that field due to the pipe used in it (the original data was a properly quoted CSV file).

Should you want the fields quoted in the output, use -U1 with the final call to csvformat. This quotes all fields.
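
For illustration, the pipeline with that option added might look like the sketch below. It requires CSVKit to be installed, and the function name remove_pipe_quote_all is made up for this example.

```shell
# Hypothetical wrapper around the pipeline above; requires csvkit.
# -U 1 asks the final csvformat to quote every field, numeric or not.
remove_pipe_quote_all() {
    csvformat -d '|' "$1" | sed 's/| //' | csvformat -D '|' -U 1
}

# Usage:
#   remove_pipe_quote_all file
```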

Inian · Accepted Answer · 2019-11-29 06:34:57Z

You can use a simple sed substitution. The first group captures a string that starts with " and contains no embedded ", up to the occurrence of a space followed by |. The second group captures everything from there to the closing ". Printing only the two captured groups drops the | character, since neither group contains it.

sed 's/\("[^"]*\).* |\([^"]*"\)/\1\2/g'

Comment (user232326, 2019-11-30 01:37:51 +00:00): The space is not part of the OP's description. Matching it works for the specific example used, but it is not required in general.
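
Following up on that comment, here is a minimal sketch that does not depend on a space next to the pipe and handles several pipes per quoted field. It is not part of the answer above: it swaps sed for awk, and the function name strip_quoted_pipes is made up for illustration. It assumes that quotes are balanced on every line and that no field contains an escaped quote or an embedded newline. Dropping the spaces that follow each pipe is also an assumption, made so that the output matches the example in the question.

```shell
# Hypothetical helper: delete every | (and any spaces right after it)
# inside double-quoted fields only.
strip_quoted_pipes() {
    awk 'BEGIN { FS = OFS = "\"" }
         {
             # With " as the separator, even-numbered fields are the quoted ones.
             for (i = 2; i <= NF; i += 2)
                 gsub(/[|] */, "", $i)
             print
         }' "$@"
}

printf '%s\n' 'abc|45|"Do not replace | in this"|0.23' | strip_quoted_pipes
# prints: abc|45|"Do not replace in this"|0.23
```

For data with escaped quotes or multi-line fields, a real CSV parser such as the tools in the other answers is the safer choice.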

glenn jackman · Accepted Answer · 2019-11-29 14:33:41Z

Ruby has a nice CSV library, so this can be a one-liner:

ruby -rcsv -e 'CSV.filter(col_sep: "|") {|row| row.each {|field| field.gsub!(/\| /, "")}}' file

JJoao · Accepted Answer · 2020-01-09 17:02:32Z

Using Perl (sorry: obfuscated code)

perl -pe 's/".*?"/ $& =~ tr[|][]dr /ge'     file

Explanation:

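perl -pe proc - applies proc to every line and prints the result
s/RE/ f($&) /ge - substitutes each match of RE with the result of f(matched string)
tr[|][]dr - translates | to nothing: d deletes it, r returns the result instead of modifying the string in place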
