5

I am trying to find elements in a character vector that match two words in no particular order, not just any single one of them, using the stringr::str_subset function. In other words, I'm looking for the intersection, not the union of the two words.

I tried using the "or" (|) operator, but this matches either one of the two words and returns too many results. I also tried passing a character vector with the two words as the pattern argument. This just gives the warning that "longer object length is not a multiple of shorter object length" and only returns the values that match the second of the two words.

library(stringr)

character_vector <- c("abc ghi jkl mno def", "pqr abc def", "abc jkl pqr")
pattern <- c("def", "pqr")

# Attempt: pass both words as a vector pattern (does not give the intersection)
str_subset(character_vector, pattern)

I'm looking for the pattern that will return only the second element of the character vector, i.e. "pqr abc def".

4 Answers

6

One option is str_detect. Loop over the elements of 'pattern' to get one logical vector per word, combine those vectors with & so that only elements of 'character_vector' matching every word stay TRUE, then use the resulting logical vector to extract the elements from 'character_vector'.

library(tidyverse)
map(pattern, str_detect, string = character_vector) %>%
    reduce(`&`) %>% 
    magrittr::extract(character_vector, .)
#[1] "pqr abc def"

Or using str_subset

map(pattern, str_subset, string = character_vector) %>%
    reduce(intersect)
#[1] "pqr abc def"
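
To see why the first pipeline works, it can help to look at the intermediate values. This is only a sketch that reuses the question's data; the names `hits` and `both` are made up for illustration.

```r
library(stringr)
library(purrr)

character_vector <- c("abc ghi jkl mno def", "pqr abc def", "abc jkl pqr")
pattern <- c("def", "pqr")

# One logical vector per word: which elements contain it?
hits <- map(pattern, str_detect, string = character_vector)
# hits[[1]] is TRUE TRUE FALSE   ("def")
# hits[[2]] is FALSE TRUE TRUE   ("pqr")

# Element-wise AND across the list: only elements matching every word stay TRUE
both <- reduce(hits, `&`)

character_vector[both]
#[1] "pqr abc def"
```

The str_subset variant does the same thing on the matched strings instead of on logical vectors, with intersect playing the role of &.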

1 Comment

The second solution is really elegant. This is the one I used. Thanks!
3

Since you are looking for the intersection, you can use the function intersect() and spell out the two patterns you are looking for:

pattern_1 <- 'pqr'

pattern_2 <- 'def'

intersect( str_subset(character_vector, pattern_1), str_subset(character_vector, pattern_2) )
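
One thing to keep in mind, since the question talks about words: the patterns are regular expressions and match substrings. If whole words matter, wrapping each pattern in `\\b` word boundaries is one possible approach. A small sketch, where the vector `x` is a made-up example and not from the question:

```r
library(stringr)

# Hypothetical data: "undefined" contains "def" as a substring
x <- c("pqr abc def", "pqr undefined", "abc jkl pqr")

# Plain substrings: "undefined" also counts as containing "def"
plain <- intersect(str_subset(x, "pqr"), str_subset(x, "def"))
plain
#[1] "pqr abc def"   "pqr undefined"

# With \\b word boundaries only the standalone word "def" matches
whole <- intersect(str_subset(x, "\\bpqr\\b"), str_subset(x, "\\bdef\\b"))
whole
#[1] "pqr abc def"
```

Note also that intersect() drops duplicates, so an element that appears several times in the input shows up only once in the result.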

2 Comments

I like the simplicity! Thanks
Don't forget to accept the answer and to give an up vote to all useful answers.
2

You can do this in base R without a loop, using a single regular expression built from lookaheads. The code looks like this:

character_vector[grepl(paste0("(?=.*",pattern,")",collapse = ""), character_vector, perl = TRUE)]

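paste0 builds one lookahead per word and concatenates them, giving "(?=.*def)(?=.*pqr)". Each (?=.*word) only succeeds if the word occurs somewhere in the string, so the whole regex matches only when every word is present, in any order. grepl returns a logical vector marking the elements that satisfy the regex, and that vector is used to subset character_vector. perl = TRUE switches to Perl-compatible regular expressions, which is required because R's default regex engine does not support lookaheads.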
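
Broken into steps, the one-liner can be sketched as follows. It uses base R only and the question's data; the names `regex` and `keep` are just for illustration.

```r
character_vector <- c("abc ghi jkl mno def", "pqr abc def", "abc jkl pqr")
pattern <- c("def", "pqr")

# Build one lookahead per word and glue them into a single regex
regex <- paste0("(?=.*", pattern, ")", collapse = "")
regex
#[1] "(?=.*def)(?=.*pqr)"

# grepl() returns one TRUE/FALSE per element; perl = TRUE is needed for (?= ... )
keep <- grepl(regex, character_vector, perl = TRUE)
keep
#[1] FALSE  TRUE FALSE

character_vector[keep]
#[1] "pqr abc def"
```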

1 Comment

Interesting! What does the perl argument do here?
0

Will this work?

character_vector %>% purrr::reduce(pattern, str_subset, .init = . )

[1] "pqr abc def"
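
For what it's worth, here is how that fold appears to unroll on the question's data. Because `.` is passed explicitly as `.init`, magrittr does not also insert it as the first argument, so the call is equivalent to the last line below. The names `step1` and `step2` are only for illustration.

```r
library(stringr)
library(purrr)

character_vector <- c("abc ghi jkl mno def", "pqr abc def", "abc jkl pqr")
pattern <- c("def", "pqr")

# Step 1: start from the full vector and keep elements containing the first word
step1 <- str_subset(character_vector, pattern[1])
step1
#[1] "abc ghi jkl mno def" "pqr abc def"

# Step 2: filter the survivors by the second word
step2 <- str_subset(step1, pattern[2])
step2
#[1] "pqr abc def"

# The same narrowing written as a fold over the patterns
folded <- reduce(pattern, str_subset, .init = character_vector)
```

Each pass only searches the strings that survived the previous one, so no separate intersection step is needed.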
