Through an API I get an XML file, which I'm trying to parse with org.w3c.dom and XPath. One part of the XML file holds content that should be rendered as HTML:
<Para>Since 2001, state and local health departments in the US have accelerated efforts to prepare for bioterrorism and other high-impact public health emergencies. These activities have been spurred by federal funding and guidance from the US Centers for Disease Control and Prevention (CDC) and the Health Resources and Services Administration (HRSA)
<CitationRef CitationID="B1">1</CitationRef>
<CitationRef CitationID="B2">2</CitationRef> . Over time, the emphasis of this guidance has expanded from bioterrorism to include "terrorism and non-terrorism events, including infectious disease, environmental and occupational related emergencies"
<CitationRef CitationID="B4">4</CitationRef> as well as pandemic influenza.
</Para>
This should become something like:
<p>Since 2001, state and local health departments in the US have accelerated efforts to prepare for bioterrorism and other high-impact public health emergencies. These activities have been spurred by federal funding and guidance from the US Centers for Disease Control and Prevention (CDC) and the Health Resources and Services Administration (HRSA)
<a href="link/B1">1</a>
<a href="link/B2">2</a> . Over time, the emphasis of this guidance has expanded from bioterrorism to include "terrorism and non-terrorism events, including infectious disease, environmental and occupational related emergencies"
<a href="link/B4">4</a> as well as pandemic influenza.
</p>
Any suggestions on how I can accomplish this? The main difficulty is finding the tags and replacing them while keeping their position within the surrounding text.
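For reference, here is a minimal sketch of one possible approach, not a definitive solution: rename the elements in place with the DOM Level 3 method Document.renameNode, so the surrounding text nodes never move. It assumes a non-namespace-aware parser (the DocumentBuilderFactory default) and that the link target is simply "link/" plus the CitationID value, as in the desired output above. The class name ParaToHtml and the helpers convert and snapshot are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class ParaToHtml {

    // Parses the XML, renames <CitationRef> to <a> and <Para> to <p>,
    // and returns the serialized result without an XML declaration.
    static String convert(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        for (Element ref : snapshot(doc.getElementsByTagName("CitationRef"))) {
            // Read the ID before renaming; the href format is an assumption.
            String id = ref.getAttribute("CitationID");
            // renameNode may return a new node, so keep using its result.
            Element a = (Element) doc.renameNode(ref, null, "a");
            a.removeAttribute("CitationID");
            a.setAttribute("href", "link/" + id);
        }
        for (Element para : snapshot(doc.getElementsByTagName("Para"))) {
            doc.renameNode(para, null, "p");
        }

        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // getElementsByTagName returns a live list that changes as elements
    // are renamed, so copy it into a plain list before modifying anything.
    static List<Element> snapshot(NodeList live) {
        List<Element> copy = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            copy.add((Element) live.item(i));
        }
        return copy;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<Para>HRSA <CitationRef CitationID=\"B1\">1</CitationRef>"
                + " as well as pandemic influenza.</Para>";
        System.out.println(convert(xml));
    }
}
```

Because each element is renamed where it sits, the text before and after every citation keeps its position. If the elements were selected with XPath instead, the same renaming loop should apply to the nodes of the returned NodeList.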