I'm creating a web crawler. I read the HTML of a page and stored it in a string, then found all of the anchor tags in that HTML and stored them in an ArrayList called anchorTags. I now need to get rid of the "a href=" part of each string in the list. I wrote the following code to do this, but for some reason I am getting an out-of-bounds exception. Please note that I need to do this using only loops and ArrayLists:

    ArrayList<String> parsedLinks = new ArrayList<String>();
    String storeHTML = "";

    for (int i = 0; i < anchorTags.size(); i++) {
        String anchorTag = anchorTags.get(i);
        int hrefIndex = anchorTag.indexOf("a href=");

        if (hrefIndex > -1) {
            int beginQuote = anchorTag.indexOf("\"", hrefIndex);
            int EndQuote = anchorTag.indexOf("\"", beginQuote +1);

            if (EndQuote > beginQuote) {
                storeHTML.substring(beginQuote +1, EndQuote);
            }
        }
    }
    parsedLinks.add(storeHTML);
    System.out.println(parsedLinks);
    return parsedLinks;
}
  • "I am getting an outofbounds exception" The exception will tell you exactly what is going wrong. Assuming you've looked at it and are still stumped, don't you think it would be important to post the exception to help others help you? Commented Apr 16, 2014 at 0:32
  • The exception is: Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 66 at java.lang.String.substring(Unknown Source) at WebCrawler.WebCrawler.linkParser(WebCrawler.java:127) at WebCrawler.WebCrawler.main(WebCrawler.java:28) Commented Apr 16, 2014 at 0:34
  • Good start: where are those line numbers in your code? Commented Apr 16, 2014 at 0:35
  • 127 is where I created the substring. 28 is where I reference the returned value in my main method Commented Apr 16, 2014 at 0:36

1 Answer

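Shouldn't

storeHTML.substring(beginQuote +1, EndQuote);

be

storeHTML = anchorTag.substring(beginQuote +1, EndQuote);

? The indexes are computed from anchorTag, but substring is being called on storeHTML, which is still the empty string, so they are out of range. The result also has to be assigned, because substring returns a new String and does not modify the one it is called on.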
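To see why the original line throws, here is a small sketch. The class name SubstringDemo, the helper substringThrows, and the sample anchor tag are made up for illustration and are not from the question. It applies the same indexes first to an empty string and then to the string they were computed from.

```java
class SubstringDemo {
    /** Returns true if s.substring(begin, end) throws StringIndexOutOfBoundsException. */
    static boolean substringThrows(String s, int begin, int end) {
        try {
            s.substring(begin, end);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample input; the real anchorTags contents were not posted.
        String anchorTag = "<a href=\"http://example.com\">Example</a>";
        String storeHTML = "";

        int beginQuote = anchorTag.indexOf("\"");
        int endQuote = anchorTag.indexOf("\"", beginQuote + 1);

        // The indexes belong to anchorTag, so they are out of range for the empty storeHTML.
        System.out.println(substringThrows(storeHTML, beginQuote + 1, endQuote)); // prints true

        // On the string they came from, the same indexes are valid.
        System.out.println(anchorTag.substring(beginQuote + 1, endQuote)); // prints http://example.com
    }
}
```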

3 Comments

Yes! I see what I did wrong there. Thank you! This prints out my original strings, but the result keeps the anchor tags and doesn't take anything off.
You need to show some sample input, expected output and actual output.
I didn't add it to the array in the loop! Thank you for your help!
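Putting both fixes from this thread together (calling substring on anchorTag, and adding to the list inside the loop), the method might look roughly like the sketch below. The method name linkParser is taken from the stack trace, but its signature, the wrapper class LinkParserSketch, and the sample tags in main are assumptions, since the full original method was not posted.

```java
import java.util.ArrayList;
import java.util.Arrays;

class LinkParserSketch {
    // Assumed signature: takes the list of anchor tags and returns the extracted href values.
    static ArrayList<String> linkParser(ArrayList<String> anchorTags) {
        ArrayList<String> parsedLinks = new ArrayList<String>();

        for (int i = 0; i < anchorTags.size(); i++) {
            String anchorTag = anchorTags.get(i);
            int hrefIndex = anchorTag.indexOf("a href=");

            if (hrefIndex > -1) {
                int beginQuote = anchorTag.indexOf("\"", hrefIndex);
                int endQuote = anchorTag.indexOf("\"", beginQuote + 1);

                // Only extract when both quotes were found, in order.
                if (beginQuote > -1 && endQuote > beginQuote) {
                    // substring is called on anchorTag, and the result is added
                    // inside the loop so that every tag contributes one link.
                    parsedLinks.add(anchorTag.substring(beginQuote + 1, endQuote));
                }
            }
        }
        return parsedLinks;
    }

    public static void main(String[] args) {
        // Hypothetical sample tags for illustration only.
        ArrayList<String> anchorTags = new ArrayList<String>(Arrays.asList(
                "<a href=\"http://example.com\">Example</a>",
                "<a href=\"/about\">About</a>"));
        System.out.println(linkParser(anchorTags)); // prints [http://example.com, /about]
    }
}
```

This only handles href values wrapped in double quotes that come directly after "a href="; tags with single quotes or other attributes before href would be skipped.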
