I'm building a tool that scrapes links from multiple URLs. I want to store the scraped links and then test each one for its status. I'm expecting to have to test a lot of links, roughly 60,000, so the problem I have is deciding how to store the links to test.
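For context, by "test for status" I mean something like this minimal standard-library sketch (`check_status` is just a name I've picked for illustration):

```python
import urllib.request
import urllib.error

def check_status(url, timeout=10):
    """Return the HTTP status code for url; a HEAD request keeps it cheap."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        return e.code  # e.g. 404, 500: the server responded, just not 2xx
    except urllib.error.URLError:
        return None  # DNS failure, timeout, connection refused, etc.
```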
What I'm thinking of doing is creating one text file per URL I'll be scraping. That works out to about 40 text files, since the URLs I'm scraping are all really the same URL, just regionalised.
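Roughly what I have in mind, with hypothetical region codes, a placeholder base URL, and `scrape_links()` standing in for the scraper itself:

```python
import os

# Hypothetical values; the real tool has ~40 region codes.
REGIONS = ["uk", "us", "de", "fr"]
BASE_URL = "https://example.com/{region}/"  # placeholder for the real site

def scrape_links(url):
    """Stand-in for the actual scraper; returns the links found at url."""
    return []

def save_all_regions(out_dir="scraped"):
    os.makedirs(out_dir, exist_ok=True)
    for region in REGIONS:
        # One text file per regionalised URL, one link per line.
        path = os.path.join(out_dir, region + ".txt")
        with open(path, "w", encoding="utf-8") as f:
            for link in scrape_links(BASE_URL.format(region=region)):
                f.write(link + "\n")
```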
- Would creating lots of text files cause performance issues?
- Would I be best off collecting the URLs in an array and writing the whole array to the text file at once, or should I write each URL to the file as I go (both approaches are sketched after this list)? Or is there a better way entirely?
- Is there a better method than storing in text files? I don't really want to use a database, but if there's a good case for one (something like the SQLite sketch below) I could be convinced.
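To make the second question concrete, these are the two approaches I mean (the function names are mine, just for illustration):

```python
def write_as_you_go(path, links):
    # Option A: one write per URL, as each link is scraped.
    with open(path, "w", encoding="utf-8") as f:
        for link in links:
            f.write(link + "\n")

def write_from_array(path, links):
    # Option B: collect everything in a list first, then write in one go.
    buffered = list(links)
    with open(path, "w", encoding="utf-8") as f:
        if buffered:
            f.write("\n".join(buffered) + "\n")
```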
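And if a database really is the better option, I'd picture something file-based like SQLite rather than a full server. A minimal sketch, assuming a hypothetical `links` table:

```python
import sqlite3

def store_links(db_path, region, links):
    # Hypothetical schema: one row per scraped link, tagged with its region;
    # the status column gets filled in later by the link checker.
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS links (
               url    TEXT NOT NULL,
               region TEXT NOT NULL,
               status INTEGER
           )"""
    )
    conn.executemany(
        "INSERT INTO links (url, region) VALUES (?, ?)",
        [(link, region) for link in links],
    )
    conn.commit()
    conn.close()
```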
