I am trying to extract some URLs from a web page using the cURL command. Initially, I run cURL as below.
curl www.website.com/
Now, the page contains links to some other websites that I am interested in extracting. So I pipe the cURL output through grep as below.
curl www.website.com/ | grep "<a href=" > new1.txt
It extracts all lines that have <a href= in them. But I am only interested in lines which start with <a href= and end with title=.
How can I modify the grep command?
grep "<a href=.*title="
This matches any line in which <a href= is followed later by title=, but this can get complicated when parsing HTML.
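One reason it gets complicated is that .* is greedy and can run across several tags on the same line. A minimal sketch of a tighter variant follows, assuming a grep that supports -o (GNU and BSD grep do; it is not required by POSIX). The sample HTML and the example.com URLs are made up for illustration and stand in for the real curl output.

```shell
# Hypothetical sample standing in for the output of: curl -s www.website.com/
html='<a href="http://example.com/a" title="First">First</a>
<a href="http://example.com/b">No title</a>
<p>See <a href="http://example.com/c" title="Third">third</a></p>'

# -o prints only the matched text; [^>]* keeps the match inside a single tag,
# so an <a href= without a title= in the same tag is skipped.
matches=$(printf '%s\n' "$html" | grep -o '<a href=[^>]*title=')
printf '%s\n' "$matches"

# Reduce each match to the URL inside the double quotes.
urls=$(printf '%s\n' "$matches" | sed 's/^<a href="\([^"]*\)".*/\1/')
printf '%s\n' "$urls"
```

With real input, replace the printf on the first pipeline with curl -s and the page address. This still assumes href comes first in the tag, uses double quotes, and sits on one line with title=; pages that break those assumptions are better handled with a proper HTML parser.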