I have a data frame, which is basically the Hello list of passengers, with columns of Hello(string), Hello (1st, 2nd or 3rd), Hello, Hello (female or male) and Hello(0 or 1).

Basically, I want to extract the unmarried women from my data frame.

I only want to extract the rows whose Name contains "Miss". I can't use the == operator because it only matches when the whole string is identical. Any help would be appreciated. Thank you all.

I have tried "Hello but that did not work.

```
 $ Hello    : Hello w/,..: 22 25 26 27 24 31 45 46 50 54 ...
 $ Hello  : Hello w/ 3 levels "1st","2nd","3rd": 1 1 1 1 1 1 1 1 1 1 ...
 $ Hello     : He
 - attr(*, "na.action")= 'omit' Hello int  13 14 15 30 33 36 41 46 47 53 ...
  ..- attr(*, "names")= chr  "13" "14" "15" "30" ...
```

  • Please share a sample of your data with dput. See here for details on how to format your question. Commented May 31, 2019 at 16:39
  • Please look at my updated post :) Commented Jun 1, 2019 at 5:00
  • Please use dput not str. Commented Jun 1, 2019 at 6:27
  • I have over 1000 rows in my dataframe :( Commented Jun 1, 2019 at 12:24
  • Please read the link in my first comment. TL;DR: Use dput(head(my_data, 12)), for instance. Commented Jun 1, 2019 at 12:30

1 Answer

We can use filter with str_detect to match the substring "Miss" in the 'Name' column. The \\b at the beginning and end of the pattern marks a word boundary, so "Miss" is matched only as a whole word and not as part of a longer word.

library(tidyverse)
thetitanic %>%
     filter(str_detect(Name, "\\bMiss\\b"))
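
If loading the tidyverse is not an option, the same word-boundary idea can probably be expressed in base R with grepl. The following is only a minimal sketch: the data frame `passengers` and the names in it are made up for illustration and are not the asker's data.

```r
# Hypothetical sample data; these rows are invented for illustration only
passengers <- data.frame(
  Name = c("Allen, Miss Elisabeth",
           "Allison, Mrs Hudson",
           "Missouri, Mr John",
           "Andrews, Miss Kornelia"),
  Sex = c("female", "female", "male", "female"),
  stringsAsFactors = FALSE
)

# grepl() returns TRUE for every Name that contains "Miss" as a whole word;
# perl = TRUE selects PCRE, where \b is a word boundary
is_miss <- grepl("\\bMiss\\b", passengers$Name, perl = TRUE)

# Keep only the matching rows
unmarried <- passengers[is_miss, ]
print(unmarried)
```

Because of the word boundaries, the made-up "Missouri" row is not selected, whereas a plain "Miss" pattern would have matched it.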