
I am trying to parse XML from a string. Special characters appear inside the title tags, so I escape them first. This is what I did:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Demo {

        public static void main(String[] args) throws Exception {
            String data = "<title> \"sad\" <<dd> ><\n   </title>";
            // Groups 3 and 4 together capture the title text from the first special character on.
            String pattern = "(<title>)(.+?)([<>'\"&])(.+?)(\n   </title>)";
            Matcher m = Pattern.compile(pattern).matcher(data);
            while (m.find()) {
                String bugString = m.group(3) + m.group(4);
                // Escape '&' first so the entities produced below are not escaped a second time.
                String fixed = bugString.replace("&", "&amp;");
                fixed = fixed.replace("<", "&lt;");
                fixed = fixed.replace(">", "&gt;");
                fixed = fixed.replace("'", "&apos;");
                fixed = fixed.replace("\"", "&quot;");
                data = data.replace(bugString, fixed);
            }
            System.out.println(data);
        }

    }

But it looks a little ugly. How can I improve it without using an additional library?

  • This is a question for codereview.stackexchange.com. However, I'd use JAXB to parse XML; it's a standard library now. Commented Apr 14, 2014 at 17:40

1 Answer


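If you can influence how the string is produced, you could put the title tag's text inside a CDATA section. Inside a CDATA section you do not have to encode the special XML characters; the only sequence that cannot appear there is ]]>, which ends the section.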

CDATA sections are explained, for example, here: http://en.m.wikipedia.org/wiki/CDATA

So your title could look like this:

    <title> <![CDATA[ here comes my special title with "/<>  ]]> </title>
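
If the string is built in Java, the wrapping could be done roughly as in the sketch below. It uses only classes that ship with the JDK. The names CdataDemo, wrapInCdata and rootText are made up for illustration and are not part of any library. The sketch assumes the title text is available on its own before the XML is assembled, and it splits any literal ]]> in the text across two CDATA sections so it cannot end the section early.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class CdataDemo {

    // Hypothetical helper: wraps raw text in a CDATA section.
    // A literal "]]>" would close the section early, so it is split
    // across two adjacent CDATA sections.
    static String wrapInCdata(String text) {
        return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    // Hypothetical helper: parses the XML with the JDK's built-in DOM parser
    // and returns the text content of the root element.
    static String rootText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String rawTitle = " \"sad\" <<dd> >< ";
        String xml = "<title>" + wrapInCdata(rawTitle) + "</title>";
        System.out.println(xml);
        // Check that the parser hands back exactly the text that went in.
        System.out.println(rootText(xml).equals(rawTitle));
    }
}
```

The second line of output is the round-trip check, which should print true. This only helps when you control how the XML is generated; if the broken string arrives from elsewhere, the title text still has to be located and escaped as in the question.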