I have the following XML and I am trying to parse it.

<employee>
    <personal>
        <id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>
        <name>Lareina</name>
        <age>50</age>
    </personal>
    <contact>
        <dept>Fusce</dept>
        <manager>CB9A0BB76</manager>
    </contact>
</employee>

However, I am not able to do so. My code is below. It does work for the simpler, flat XML in the commented-out "xmlString" line (uncomment it to try).

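import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.List;

// JDOM 2 package names are assumed here; with JDOM 1.x use org.jdom instead of org.jdom2.
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.input.SAXBuilder;

public class XMLReader {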
    public static void main(String[] args) throws JDOMException, IOException {

        //String xmlString = "<employee >\n <firstname xml:space=\"preserve\" >John</firstname>\n <lastname>Watson</lastname>\n <age>30</age>\n <email>[email protected]</email>\n</employee>";
        String xmlString = "<employee>\n" + 
                "       <personal><id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>\n" + 
                "       <name>Lareina</name>\n" + 
                "       <age>50</age>\n" + 
                "       </personal><contact><dept>Fusce</dept>\n" + 
                "       <manager>B55E6DA8-76BD-A3C8-2DDF-686CB9A0BB76</manager></contact>\n" + 
                "   </employee>";
        System.out.println(xmlString);


        SAXBuilder builder = new SAXBuilder();
        Reader in = new StringReader(xmlString);

        Document doc = builder.build(in);
        Element root = doc.getRootElement();
        List children = root.getChildren();
        //System.out.println(children);
        String value = "";
        for (int i = 0; i < children.size(); i++) {
            Element dataNode = (Element) children.get(i);
            // Element dataNode = (Element) dataNodes.get(j);
            value += ", " + dataNode.getText().trim();
            System.out.println(dataNode.getName() + " : " + dataNode.getText());

            //context.write(new Text(rowKey.toString()), new Text(node.getName().trim() + " " + node.getText().trim()));
        }
        //System.out.println(in);
    }
}
  • I un-commented the code and it works fine for me. Commented Sep 26, 2013 at 22:28
  • @SotiriosDelimanolis: Which code? It works fine with the commented-out "xmlString" once uncommented, but does it work with the XML I posted? Commented Sep 26, 2013 at 22:29
  • Instead of parsing the XML manually, use JAXB or a similar POJO-XML marshaling library. It only takes a few lines of code to effortlessly convert between your Java objects and XML. Commented Sep 26, 2013 at 23:19

1 Answer

Your two XML strings have different structures. The first is

<employee>
    <firstname xml:space="preserve">John</firstname>
    <lastname>Watson</lastname>
    <age>30</age>
    <email>[email protected]</email>
</employee>

This has four children, each of which contains text, so it prints

firstname : John
lastname : Watson
age : 30
email : [email protected]

And the second is

<employee>
    <personal>
        <id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>
        <name>Lareina</name>
        <age>50</age>
    </personal>
    <contact>
        <dept>Fusce</dept>
        <manager>B55E6DA8-76BD-A3C8-2DDF-686CB9A0BB76</manager>
    </contact>
</employee>

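In this one, the root has only two children, personal and contact. Neither has text of its own: getText() returns only the text directly inside an element, which here is just the whitespace between the nested tags. So you get output like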

personal : 



contact : 

This is the expected output.
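
To print the nested values instead of the two empty containers, the walk has to recurse into the child elements. Here is a minimal sketch of that idea. It assumes the JDK's built-in DOM API (org.w3c.dom) in place of JDOM so that it runs without extra dependencies, and the names LeafWalker and leafLines are hypothetical. With JDOM, the same recursion could be built on Element.getChildren() and Element.getText().

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper; uses the JDK DOM API, not JDOM.
class LeafWalker {

    // Parses the XML and returns one "name : text" line per leaf element.
    static List<String> leafLines(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> lines = new ArrayList<>();
        collect(doc.getDocumentElement(), lines);
        return lines;
    }

    // Recurses into child elements; an element without child elements is a leaf.
    private static void collect(Element element, List<String> lines) {
        boolean hasChildElement = false;
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                hasChildElement = true;
                collect((Element) n, lines);
            }
        }
        if (!hasChildElement) {
            lines.add(element.getTagName() + " : " + element.getTextContent().trim());
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employee>\n"
                + "  <personal><id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>\n"
                + "  <name>Lareina</name>\n"
                + "  <age>50</age>\n"
                + "  </personal><contact><dept>Fusce</dept>\n"
                + "  <manager>B55E6DA8-76BD-A3C8-2DDF-686CB9A0BB76</manager></contact>\n"
                + "</employee>";
        leafLines(xml).forEach(System.out::println);
    }
}
```

Run against the second XML string, main prints one line each for id, name, age, dept, and manager, in document order.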

  • So I guess the question is: is there a way to get "<id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id> <name>Lareina</name> <age>50</age>" as the value for personal?
  • No, this is not HTML, and there is no 'inner-xml' capability. After parsing there is only the element tree. Each node contains sub-nodes, which can be elements or text (and some other types like attributes and processing instructions). If you need to represent a subtree as serialized XML (i.e. the string you showed), you must serialize it yourself.
  • Of course. The Element class has a getChild(name) method. You can just call getChild("personal") on the root and iterate over its child elements. I suggest you use XPath to query the XML.
  • @Jim Did I misunderstand the question? You can very easily get the elements within <personal>.
  • You can get each node individually, but there is nothing built in that will produce a string like the one the OP requested in his first comment. To make that string he would have to serialize the subtree.
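
The "serialize the subtree" step from the comments could look roughly like the sketch below. It is written against the JDK's DOM and javax.xml.transform APIs rather than JDOM so that it runs as-is, and the names InnerXml and innerXml are hypothetical. In JDOM, XMLOutputter.outputString(Element) would probably play the role of the Transformer here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper; uses JDK APIs only, not JDOM.
class InnerXml {

    // Finds the first <childName> descendant of the root, serializes each of
    // its child elements, and joins the pieces with a single space.
    static String innerXml(String xml, String childName) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Node target = doc.getDocumentElement().getElementsByTagName(childName).item(0);
        if (target == null) {
            return ""; // no such element
        }

        // Identity transform: copies a DOM node to text without the <?xml ...?> header.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringBuilder out = new StringBuilder();
        for (Node n = target.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) { // skip whitespace-only text nodes
                StringWriter writer = new StringWriter();
                transformer.transform(new DOMSource(n), new StreamResult(writer));
                if (out.length() > 0) {
                    out.append(' ');
                }
                out.append(writer);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employee><personal><id>2D61EC47-0F56-5A33-6057-54DB0ABBDBF0</id>\n"
                + "<name>Lareina</name>\n<age>50</age>\n</personal></employee>";
        System.out.println(innerXml(xml, "personal"));
    }
}
```

The result is built by re-serializing the parsed tree, so it is equivalent XML rather than a byte-for-byte copy of the input (an empty element, for example, may come back in a different form).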