I need to extract all tags and words, in the order they appear, from an HTML file. Here is an example of the file:

one two thre

What I want as output is an array or a list that looks like this:

{"", "one", "two", "thre", ""}

I know there are tools such as JTidy or Apache Tika, but they extract only the text (or only the tags) from a document. What should I do?

1 Answer

Use the jsoup library for this. It parses the HTML into a tree of element and text nodes, which you can walk in document order and collect tags and words as you go.
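
For simple, well-formed markup, the token list the question asks for can also be sketched with the standard library alone. The following is a regex-based illustration, not jsoup and not a real HTML parser: the class name `TagWordTokenizer`, the method `tokenize`, and the sample input are all assumptions made for this example, since the question's original markup was not preserved. It will misbehave on comments, scripts, or attribute values containing `>`, which is where a proper parser such as jsoup is the safer choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagWordTokenizer {
    // A token is either a whole tag ("<...>") or a run of characters
    // containing neither whitespace nor '<' (treated here as a "word").
    private static final Pattern TOKEN = Pattern.compile("<[^>]*>|[^<\\s]+");

    // Returns tags and words in the order they occur in the input.
    static List<String> tokenize(String html) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(html);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Hypothetical input, chosen only to illustrate the idea.
        System.out.println(tokenize("<p>one <b>two</b> thre</p>"));
        // prints [<p>, one, <b>, two, </b>, thre, </p>]
    }
}
```

Because the pattern tries the tag alternative first, a tag is always emitted as one token, and whitespace between words is simply skipped.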
