I want to remove extra spaces, add spaces where they are missing, and capitalize the first letter of each word that follows a special character, using R.

string <- "apple,banana, cat, doll and donkey;     fish,goat"

I want the output to be:

Apple, Banana, Cat, Doll and donkey; Fish, Goat

I tried

gsub("(^.|,.|;.)", "\\U\\1", string, perl=T, useBytes = F)

It didn't work. Please help.

  • You need to allow for whitespace: gsub("(^.|[,;]\\s*.)", "\\U\\1", string, perl=TRUE) (commented Dec 7, 2015 at 14:21)

1 Answer

You can use

string <- "apple,banana, cat, doll and donkey;     fish,goat"
trimws(gsub("(^|\\p{P})\\s*(.)", "\\1 \\U\\2", string, perl = TRUE))
## => [1] "Apple, Banana, Cat, Doll and donkey; Fish, Goat"

The PCRE regex matches:

  • (^|\\p{P}) - (Group 1) the start of the string or any punctuation character
  • \\s* - 0 or more whitespace characters
  • (.) - (Group 2) any character but a newline
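
To see what the pattern actually consumes, one option is to list its matches with regmatches and gregexpr (both base R). This is only an illustrative sketch; the variable names pattern and m are not from the original answer.

```r
string <- "apple,banana, cat, doll and donkey;     fish,goat"

# Same pattern as in the answer, stored in a variable for reuse
pattern <- "(^|\\p{P})\\s*(.)"

# gregexpr() locates every match; regmatches() extracts the matched text
m <- regmatches(string, gregexpr(pattern, string, perl = TRUE))[[1]]
print(m)
```

Each match spans the optional punctuation, any whitespace after it, and the first character of the next word, which is why the replacement can rebuild that whole stretch with exactly one space.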

The replacement:

  • \\1 - a backreference to Group 1, which puts the matched punctuation (if any) back
  • (a literal space) - inserts a space between the punctuation and the next character, or at the start of the string
  • \\U\\2 - turns the Group 2 character uppercase

Finally, trimws removes the leading space that the regex adds at the start of the string.
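
One caveat worth checking: \\p{P} matches any punctuation, including apostrophes and hyphens, so a word such as "it's" would also be split and capitalized. The sketch below shows the intermediate result before trimws and a narrower variant limited to commas and semicolons. The character class [,;] and the helper name fix_spacing are assumptions made for illustration, not part of the original answer.

```r
string <- "apple,banana, cat, doll and donkey;     fish,goat"

# Without trimws() the result still starts with the space inserted at "^"
raw <- gsub("(^|\\p{P})\\s*(.)", "\\1 \\U\\2", string, perl = TRUE)
substr(raw, 1, 1) == " "
## => [1] TRUE

# Hypothetical helper: only commas and semicolons are treated as separators
fix_spacing <- function(x) {
  trimws(gsub("(^|[,;])\\s*(.)", "\\1 \\U\\2", x, perl = TRUE))
}

fix_spacing(string)
## => [1] "Apple, Banana, Cat, Doll and donkey; Fish, Goat"
fix_spacing("it's fine,ok")
## => [1] "It's fine, Ok"
```

If every punctuation mark really should trigger capitalization, the original \\p{P} pattern is the one to keep.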
