In a previous question, "string having doublequotes in between apart from the enclosing quotes", @BernieReiter asked a follow-up: he wanted to take CSV entries such as the following:

$ cat test.csv
17,"abc","Testurteil "sehr gut"","08/15"
99,"xyz","Testurteil "vernichtend"","4711"

And convert them so that the double quotes embedded inside a field ("...") are replaced with single quotes ('...'), while the quotes enclosing each field are left alone.

The results should look like this:

17,"abc","Testurteil 'sehr gut'","08/15"
99,"xyz","Testurteil 'vernichtend'","4711"

@BernieReiter also asked how to adapt the Perl solution that @StephaneChazelas provided in that question:

$ perl -pi.back -le 's/"(?:[^"]|"(?=[^,]))*"|[^",]*/($r=$&)=~
  s@(^"|"$|\\.)|"@$1||"\\\""@ge;$r/ge' file.csv

So how would one modify Stephane's solution?

  • @BernieReiter - see this question and answer regarding the edit that you attempted to make to Stephane's answer on this Q&A: unix.stackexchange.com/questions/88366/… Commented Aug 28, 2013 at 2:49

1 Answer

The following modification to @Stephane's solution appears to provide what @BernieReiter was looking for:

$ perl -pi.back -le 's/"(?:[^"]|"(?=[^,]))*"|[^",]*/($r=$&)=~
  s@(^"|"$|\\.)|"@$1||"'\''"@ge;$r/ge' test.csv

The key thing to notice in the original Perl solution is this inner substitution:

s@(^"|"$|\\.)|"@$1||"\\\""@ge

Specifically this piece of code:

"\\\""

That's a double-quoted Perl string containing \\\", which Perl evaluates to \" (an escaped backslash followed by an escaped double quote). It's the piece of @Stephane's original solution that substitutes \" for every embedded double quote. It's what's taking this:

"Testurteil "sehr gut""

and turning it into this:

"Testurteil \"sehr gut\""

So all that's required is to swap the contents between those double quotes (\\\") for a construct that yields a single quote:

"'\''"

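NOTE: The whole Perl program is wrapped in single quotes for the shell, and the shell cannot escape a single quote inside single quotes. The '\'' sequence therefore closes the single-quoted string, adds an escaped literal single quote, and opens a new single-quoted string, so what Perl ends up seeing is "'".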

Final solution

$ perl -pi.back -le 's/"(?:[^"]|"(?=[^,]))*"|[^",]*/($r=$&)=~
  s@(^"|"$|\\.)|"@$1||"'\''"@ge;$r/ge' file.csv

Example

Running this edits test.csv in place (the -i.back switch keeps the original as test.csv.back) and transforms it as originally specified.

$ perl -pi.back -le 's/"(?:[^"]|"(?=[^,]))*"|[^",]*/($r=$&)=~
  s@(^"|"$|\\.)|"@$1||"'\''"@ge;$r/ge' test.csv

Results:

$ more test.csv
17,"abc","Testurteil 'sehr gut'","08/15"
99,"xyz","Testurteil 'vernichtend'","4711"
  • That regex is in desperate need of /x with comments. Commented Aug 28, 2013 at 2:57
  • I think that this does deserve a bit more of an explanation of what is happening with the shell quoting. In the shell, you cannot escape a single quote inside of single quotes. So what you are doing is ending the quotes, escaping a single quote, then starting a new single-quoted string. Commented Aug 28, 2013 at 2:59
  • @jordanm - tell me about it. I wanted to leave it as it was, by and large, so that anyone who came across either of these Q&As would find them similar enough to make sense. It would probably be worthwhile if someone wrote up how to break one of Stephane's Perl solutions up so we could refer to it in the future. Commented Aug 28, 2013 at 3:00
