I'm writing a Unix shell script that needs to pretty-print XML files, but there are portions of them that I must not touch: the files contain embedded Apache Jelly scripts, which have to stay exactly as they are. So I need to convert this:
<proc source="customer"><scriptParam value="_user"/><scriptText><jelly:script>
<jelly:log level="info">
this text needs
to keep its indent level
and this is none of my business
</jelly:log>
<!-- get date -->
<sql:query var="rs"><![CDATA[
select sysdate
from dual
]]></sql:query>
</jelly:script>
</scriptText></proc>
into this:
<proc source="customer">
<scriptParam value="_user"/>
<scriptText>
<jelly:script>
<jelly:log level="info">
this text needs
to keep its indent level
and this is none of my business
</jelly:log>
<!-- get date -->
<sql:query var="rs"><![CDATA[
select sysdate
from dual
]]></sql:query>
</jelly:script>
</scriptText>
</proc>
Notice that the only change to the jelly:script element is a newline
before it.
I couldn't find any option in xmllint or xmlstarlet to ignore a
certain element. Is there any tool that can help me achieve this? I'm on
Linux, if it matters.
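One workaround that can be called from a shell script is to swap each jelly:script block for a placeholder, pretty-print what is left, and then paste the original text back. The following is only a minimal sketch under several assumptions: Python 3.9+ is available (for `ET.indent`), jelly:script elements are never nested, the literal text `</jelly:script>` never appears inside a CDATA section, and the placeholder element name `__protected__` (made up for this example) does not occur in the real files.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: blocks are not nested, so a non-greedy match finds each one.
PROTECTED = re.compile(r"<jelly:script\b.*?</jelly:script>", re.DOTALL)
# ElementTree writes empty elements as '<name attr="v" />'.
PLACEHOLDER = re.compile(r'<__protected__ id="(\d+)" ?/>')


def pretty_print_except_jelly(xml_text, space="  "):
    """Indent xml_text, leaving every jelly:script block byte-for-byte intact."""
    saved = []

    def stash(match):
        # Remember the verbatim block and leave a hypothetical placeholder.
        saved.append(match.group(0))
        return f'<__protected__ id="{len(saved) - 1}"/>'

    stripped = PROTECTED.sub(stash, xml_text)

    # insert_comments=True keeps comments that sit outside the protected blocks.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    root = ET.fromstring(stripped, parser=parser)
    ET.indent(root, space=space)
    pretty = ET.tostring(root, encoding="unicode")

    # A function replacement avoids backslash processing in the saved text.
    return PLACEHOLDER.sub(lambda m: saved[int(m.group(1))], pretty)
```

With this approach the opening `<jelly:script>` tag picks up the indentation of the placeholder, and everything after it up to `</jelly:script>` is copied back unchanged. Note that ElementTree rewrites `<scriptParam value="_user"/>` as `<scriptParam value="_user" />` and does not preserve an XML declaration, so this may need adjusting if those details matter.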
xmlstarlet, xmllint, and probably most XML parser based tools. Otherwise I would have suggested xmlstarlet ed.