I'm working with semicolon-separated strings like:
a="My name is Laura;My name is Martin;My name is Carl;"
I want to remove everything after name in each segment, so that the result is:
a="My name;My name;My name;"
Using gsub:
gsub("(?<=name)[^;]*", "", a, perl = TRUE)
# [1] "My name;My name;My name;"
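To see what the lookbehind buys you, here is a small sketch comparing it with two plain patterns. The variable names without_lookbehind and with_backref are only illustrative and do not come from the answer above.

```r
a <- "My name is Laura;My name is Martin;My name is Carl;"

# Without the lookbehind, "name" is part of the match and gets deleted too
without_lookbehind <- gsub("name[^;]*", "", a)
without_lookbehind
# [1] "My ;My ;My ;"

# A capture group plus a backreference keeps "name" without perl = TRUE
with_backref <- gsub("(name)[^;]*", "\\1", a)
with_backref
# [1] "My name;My name;My name;"
```

The backreference version should give the same result as the lookbehind one for this input; which one reads better is mostly a matter of taste.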
Alternatively, we can use gregexpr and `regmatches<-` for this:
gre <- gregexpr("(?<=name)[^;]*", a, perl = TRUE)
regmatches(a, gre)
# [[1]]
# [1] " is Laura" " is Martin" " is Carl"
regmatches(a, gre) <- ""
a
# [1] "My name;My name;My name;"
Walk-through:
- (?<=name) is a lookbehind that does not "consume" the name string; it only uses it as a starting point. This is a Perl extension, hence perl=TRUE.
- (?<=name)[^;]* finds all non-; text after name.
- regmatches(a, gre) returns the matched values; we only need to run this line to validate the text that will be removed.
- `regmatches<-` is the replacement form of regmatches, used on the LHS of an assignment operator (not all functions have this form); we use it to replace the matched portions.
It worked like this:
gsub("My name[^;]*","My name",a)
My name, I'd say this isn't optimal; look at my solution or r2evans' edit.
Solution:
With gsub and perl=TRUE:
> gsub("(?<=name).*", "", unlist(strsplit(a, ";")), perl=TRUE)
[1] "My name" "My name" "My name"
Explanation:
This detects name with a lookbehind and removes everything (.*) after it. Before that, strsplit splits the string on ; into a list, and unlist turns that list into a character vector, so gsub is applied to each piece separately.
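The same steps can be spelled out one at a time to inspect the intermediate values. This is just a sketch of the one-liner above; the names parts_list, parts and trimmed are made up for illustration.

```r
a <- "My name is Laura;My name is Martin;My name is Carl;"

# strsplit returns a list with one element per input string;
# the trailing ";" does not produce an empty last piece
parts_list <- strsplit(a, ";")

# unlist flattens that list into a plain character vector
parts <- unlist(parts_list)
parts
# [1] "My name is Laura"  "My name is Martin" "My name is Carl"

# each piece has no ";" left, so ".*" can safely run to the end of the piece
trimmed <- gsub("(?<=name).*", "", parts, perl = TRUE)
trimmed
# [1] "My name" "My name" "My name"
```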
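To get a single string back, paste the pieces together with collapse (note that the trailing ; is not restored):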
> paste(gsub("(?<=name).*", "", unlist(strsplit(a, ";")), perl=TRUE), collapse=";")
[1] "My name;My name;My name"
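As a quick check, the two approaches in this thread can be compared directly. This is a sketch, and the names direct and rebuilt are only for illustration; the point is that the split-and-paste result differs from the single gsub only by the trailing ;.

```r
a <- "My name is Laura;My name is Martin;My name is Carl;"

# single gsub on the whole string, stopping at each ";"
direct <- gsub("(?<=name)[^;]*", "", a, perl = TRUE)

# split on ";", trim each piece, then glue the pieces back together
rebuilt <- paste(
  gsub("(?<=name).*", "", unlist(strsplit(a, ";")), perl = TRUE),
  collapse = ";"
)

identical(direct, rebuilt)
# [1] FALSE
identical(direct, paste0(rebuilt, ";"))
# [1] TRUE
```

If the trailing ; matters, the single gsub call is probably the simpler choice, since it never takes the string apart.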