0

I have a data frame df containing NAs and strings, and two matrices, match and value, with the same number of rows and columns. match contains every string that can occur in df.

I would like to replace the strings in df with those in value: if a string in df equals an entry of match, it should be replaced by the entry of value at the same position.

I believe the first step is to create a new data frame holding the position in match of each element of df:

df1 <- which(df %in% match) #nothing valuable...

Apologies for the lack of code on my side.


df <- as.data.frame(matrix(c("ab","bc",NA,"aa",NA,NA,"de","aa",NA,"bc","ab","ab"),ncol = 4))
match <- matrix(c("ab","bc","de","aa"),nrow = 2)
value <- matrix(c("Good","Bad","Average","Stop"),nrow = 2) 

# desired result
output <- as.data.frame(matrix(c("Good","Bad",NA,"Stop",NA,NA,"Average","Stop",NA,"Bad","Good","Good"),ncol = 4))
1
  • Or using plyr: matrix(mapvalues(unlist(df), c(match), c(value)), dim(df))  Commented Mar 10, 2017 at 8:32

3 Answers

2

This should also work:

> m <- apply(df, 2, function(x) match(x, match))
> df2 <- as.data.frame(matrix(value[m], ncol = ncol(df), nrow = nrow(df)))
> df2
    V1   V2      V3   V4
1 Good Stop Average  Bad
2  Bad <NA>    Stop Good
3 <NA> <NA>    <NA> Good
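
A minimal sketch of why this works, using the question's data. The lookup matrix is called keys here (a name chosen for this example, standing in for the question's match), and stringsAsFactors = FALSE is set so the result does not depend on the R version. The as.vector(m) call is an added precaution, not part of the answer: when an index matrix has as many columns as the indexed array has dimensions (which would happen with a two-column df), R reads each row as a row/column pair rather than as linear positions.

```r
# Data from the question; `keys` is a hypothetical name for the question's `match` matrix
df <- as.data.frame(
  matrix(c("ab","bc",NA,"aa",NA,NA,"de","aa",NA,"bc","ab","ab"), ncol = 4),
  stringsAsFactors = FALSE
)
keys  <- matrix(c("ab","bc","de","aa"), nrow = 2)
value <- matrix(c("Good","Bad","Average","Stop"), nrow = 2)

# Linear (column-major) position of each cell of df within keys; NA where nothing matches
m <- apply(df, 2, function(x) match(x, keys))
print(m)

# keys and value have the same shape, so the same positions select the replacements.
# as.vector() forces plain linear indexing even if df happened to have two columns.
df2 <- as.data.frame(
  matrix(value[as.vector(m)], nrow = nrow(df), ncol = ncol(df)),
  stringsAsFactors = FALSE
)
print(df2)
```

Cells that are NA in df stay NA, because match() returns NA for them and indexing with NA yields NA.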

1

We can unlist the data frame, match its elements against m1, and use the resulting indices to pull the corresponding entries from value.

df[] <- value[match(unlist(df), m1)]
df

#    V1   V2      V3   V4
#1 Good Stop Average  Bad
#2  Bad <NA>    Stop Good
#3 <NA> <NA>    <NA> Good

Note: match has been renamed to m1.

4 Comments

Simple and clear. I would like to add a nomatch so that unmatched strings are kept, but df[] <- value[match(unlist(df), m1, nomatch = unlist(df))] does not keep the original strings. Any ideas? Thanks.
@Chrisftw I didn't get you. Keep the string? Do you mean an empty string instead of the NAs? What output are you aiming for?
Your answer is correct. In case some strings in df do not match any string in match, I would like to add a nomatch argument that keeps the unmatched string.
@Chrisftw That won't be straightforward with nomatch. I think you'll need something like v1 <- match(unlist(df), m1); df[] <- ifelse(is.na(v1) & !is.na(unlist(df)), unlist(df), value[v1]) in that case.
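
The suggestion in the last comment can be sketched as follows. The documentation of match() says nomatch is coerced to integer, so it cannot carry the original strings; a vectorised ifelse on the match index can do that instead. The "zz" entry is not in the question's data and is added here only to have an unmatched string, and stringsAsFactors = FALSE is assumed so that ifelse sees character values rather than factor codes.

```r
# Question's data with one extra, unmatched string ("zz") added for illustration
df <- as.data.frame(
  matrix(c("ab","zz",NA,"aa",NA,NA,"de","aa",NA,"bc","ab","ab"), ncol = 4),
  stringsAsFactors = FALSE
)
m1    <- matrix(c("ab","bc","de","aa"), nrow = 2)
value <- matrix(c("Good","Bad","Average","Stop"), nrow = 2)

x   <- unlist(df, use.names = FALSE)  # all cells as one character vector, column by column
idx <- match(x, m1)                   # NA for unmatched strings and for NA cells

# Where there is no match, fall back to the original cell (which may itself be NA)
df[] <- ifelse(is.na(idx), x, value[idx])
print(df)
```

With this input, "zz" is left as it is, NA cells remain NA, and every matched string is replaced.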
1

We can use lapply with match.

df[] <- lapply(df, function(x) value[match(x, match)])
df
#   V1   V2      V3   V4
#1 Good Stop Average  Bad
#2  Bad <NA>    Stop Good
#3 <NA> <NA>    <NA> Good
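
If the recoding is needed in more than one place, the lapply approach could be wrapped in a small function. This is only a sketch: recode_df, keys and values are names made up for this example, and the checks simply encode the question's statement that both matrices have the same shape. Using a name other than match for the lookup also avoids confusion with base::match.

```r
# Hypothetical helper: recode every column of `data` via parallel `keys` and `values`
recode_df <- function(data, keys, values) {
  # The lookup only makes sense if keys and values line up element by element
  stopifnot(length(keys) == length(values),
            identical(dim(keys), dim(values)))
  # Assigning into data[] keeps the data frame structure (column and row names)
  data[] <- lapply(data, function(x) values[match(x, keys)])
  data
}

# Data from the question
df <- as.data.frame(
  matrix(c("ab","bc",NA,"aa",NA,NA,"de","aa",NA,"bc","ab","ab"), ncol = 4),
  stringsAsFactors = FALSE
)
keys  <- matrix(c("ab","bc","de","aa"), nrow = 2)
value <- matrix(c("Good","Bad","Average","Stop"), nrow = 2)

out <- recode_df(df, keys, value)  # df itself is left unchanged
print(out)
```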
