I'm trying to extract strings that follow the same pattern from the text of
The Tragedy of Romeo and Juliet by William Shakespeare:
library(readr)
txt <- read_file('http://www.gutenberg.org/cache/epub/1112/pg1112.txt')
Text example:
Scene I.\r\nVerona. A public place.\r\n\r\nEnter Sampson and Gregory (with swords and bucklers) of the house\r\nof Capulet.
...
Scene II.\r\nA Street.\r\n\r\nEnter Capulet, County Paris, and [Servant] -the Clown.\r\n\r\n\r\n Cap.
I want to extract the line that follows each scene heading:
Verona. A public place.
A Street.
I tried with
library(stringr)
str_extract(txt, "Scene\\s[IV]+\\.\\s\\s\\b[A-Z]+\\b")
It didn't work.
Thank you in advance for your advice.
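
A possible explanation and a sketch of one approach, not a definitive fix: `\\b[A-Z]+\\b` only matches a word written entirely in capitals, so it cannot match "Verona", and `str_extract()` returns at most the first match in a string. The sketch below assumes every scene heading in the full text has the same layout as the excerpt (`Scene`, a Roman numeral, a period, then a line break). `sample_txt` and `locations` are hypothetical names, and `sample_txt` stands in for `txt`. `[IVX]` is an assumed widening of `[IV]` so that numerals such as IX would also match.

```r
library(stringr)

# Hypothetical stand-in for `txt`, built from the excerpt in the question.
sample_txt <- paste0(
  "Scene I.\r\nVerona. A public place.\r\n\r\n",
  "Enter Sampson and Gregory (with swords and bucklers) of the house\r\n",
  "of Capulet.\r\n\r\n",
  "Scene II.\r\nA Street.\r\n\r\n",
  "Enter Capulet, County Paris, and [Servant] -the Clown.\r\n\r\n\r\n Cap."
)

# Match the heading, skip the line break, and capture the rest of the next line.
# The parentheses form capture group 1: everything up to the next \r or \n.
pattern <- "Scene\\s[IVX]+\\.\\r?\\n([^\\r\\n]+)"

# str_match_all() returns one matrix per input string:
# column 1 is the full match, column 2 is capture group 1.
locations <- str_match_all(sample_txt, pattern)[[1]][, 2]

print(locations)
# [1] "Verona. A public place." "A Street."
```

To try it on the downloaded play, replace `sample_txt` with `txt`. If the result is empty, the headings in the file may be formatted differently from the excerpt (for example different capitalisation or line endings), and the pattern would need adjusting.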