I am trying to create a new df, call it df3, out of two other datasets:

df1 = data.frame("String" = c("a", "b", "c"), "Title" = c("A", "B", "C"), "Date" = c("2020-01-01", "2020-01-02", "2020-01-03"))

and:

df2 = data.frame("String" = c("a", "x", "y"), "Title" = c("ABCDEF", "XYZ", "YZ"), "Date" = c("2020-01-03", "2020-01-20", "2020-01-30"))

The conditions for the observations that should be matched, and form a new dataset, are: df1$String %in% df2$String, grepl(df1$Title, df2$Title) == TRUE, and df1$Date < df2$Date.

What is the best way to do this kind of merging? I have tried to create an indicator along the lines of:

df1$indicator = ifelse(df1$String %in% df2$String & grepl(df1$Title, df2$Title) & df1$Date < df2$Date, 1, 0)

or

df1$indicator = ifelse(df1$String %in% df2$String & grepl(df1$Title, df2$Title[df1$String %in% df2$String]) & df1$Date < df2$Date[df1$String %in% df2$String], 1, 0)

to then use for merging, but I've been getting "longer object length is not a multiple of shorter object length" and "argument 'pattern' has length > 1 and only the first element will be used" warnings.

  • Check out the {fuzzyjoin} package. Commented Dec 14, 2022 at 18:13

1 Answer

One way: use a cross join, then filter the result. Note that grepl is vectorized over x but not over pattern, so I use mapply to apply it pair by pair.

df1 = data.frame("String" = c("a", "b", "c"), "Title" = c("A", "B", "C"), "Date" = c("2020-01-01", "2020-01-02", "2020-01-03"))
df2 = data.frame("String" = c("a", "x", "y"), "Title" = c("ABCDEF", "XYZ", "YZ"), "Date" = c("2020-01-03", "2020-01-20", "2020-01-30"))


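# by = NULL makes merge() return the cross join (every df1 row paired with every df2 row)
merge(df1, df2, by = NULL, suffixes = c(".x", ".y")) |>
  subset(String.x %in% String.y
         & mapply(grepl, Title.x, Title.y)  # row-wise: is Title.x found in Title.y?
         & Date.x < Date.y)                 # ISO-format date strings sort chronologically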
#>   String.x Title.x     Date.x String.y Title.y     Date.y
#> 1        a       A 2020-01-01        a  ABCDEF 2020-01-03
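
A small variation on the above, offered as a sketch rather than a drop-in replacement. It assumes the Title values should be matched as literal substrings (hence fixed = TRUE, so regex metacharacters in a title are not interpreted) and that the dates are better compared as Date objects than as strings. The helper title_in and the intermediate name pairs are made up for illustration; it also assumes R >= 4.0, where data.frame() keeps strings as character.

```r
df1 <- data.frame(String = c("a", "b", "c"),
                  Title  = c("A", "B", "C"),
                  Date   = c("2020-01-01", "2020-01-02", "2020-01-03"))
df2 <- data.frame(String = c("a", "x", "y"),
                  Title  = c("ABCDEF", "XYZ", "YZ"),
                  Date   = c("2020-01-03", "2020-01-20", "2020-01-30"))

# grepl() is vectorized over x but uses only the first pattern, which is
# where the "argument 'pattern' has length > 1" warning comes from.
# mapply() calls grepl() once per (pattern, x) pair instead.
# title_in is a hypothetical helper name, not part of any package.
title_in <- function(pattern, x) {
  unname(mapply(grepl, pattern, x, MoreArgs = list(fixed = TRUE)))
}

# Cross join: every row of df1 paired with every row of df2.
pairs <- merge(df1, df2, by = NULL, suffixes = c(".x", ".y"))

# Keep only the pairs that satisfy all three conditions from the question.
df3 <- subset(pairs,
              String.x %in% String.y
              & title_in(Title.x, Title.y)
              & as.Date(Date.x) < as.Date(Date.y))
df3
```

With the example data this keeps the same single pair as above. Note that String.x %in% String.y only checks membership in the whole String.y column, as the question states it; if the intent is that String must agree within each pair, String.x == String.y would be the row-wise condition, but that is an interpretation rather than something the question says.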