I have a large XML structure and am interested in fragments like the ones below. I need to extract only the img tags, and the value of their src attribute, when they are inside a coral-card. My approach so far was to use a regex to capture each enclosing coral-card block, and then a second regex within each block to reach the img tag and its src value.

var regex = /<coral\-card ((.|[\r\n])*?)<\/coral\-card>/g;

Once I have the coral-card fragments shown below, is there a way to get the img tag and its src attribute value without another regex? I would rather not use regex past this point, since I think this should be possible with a jQuery or plain JavaScript function.

<coral-card variant="condensed" data-timeline="true" stacked>
    <coral-card-asset>
        <img src="/content/dam/collections/3/3qtVFsGwnDVKpZ6H_SaM/lightbox.folderthumbnail.jpg?width=240&height=240">
    </coral-card-asset>
 </coral-card>

<coral-card variant="semi-condensed" data-timeline="true" stacked>
    <coral-card-asset>
        <img src="/content/dam/collections/3/3qtVFsGwnDVKpZ6H_SaM/small.folderthumbnail.jpg?width=240&height=240">
    </coral-card-asset>
 </coral-card>
2 Answers

DOMParser (from the xmldom package) and the xpath package make parsing XML very easy. You can do something like:

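const DOMParser = require('xmldom').DOMParser;
const xpath = require('xpath');

// xmlString is a placeholder for your XML content.
const parser = new DOMParser();
const doc = parser.parseFromString(xmlString, 'text/xml');
const root = doc.documentElement;
// '//coral-card' matches at any depth; use an exact path if you know it.
const coralCards = xpath.select('//coral-card', root);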

See xpath docs for all of the ways you can extract nodes out of an xml blob.

This is exactly why the core DOM specification was created:

// Find all the <coral-card> elements:
var elements = document.getElementsByTagName("coral-card");

// Loop through them:
for (var i = 0; i < elements.length; ++i) {
  // Extract whatever you need:
  console.log(elements[i].getAttribute("variant"));
  // Skip cards that have no <img> descendant instead of throwing:
  var img = elements[i].querySelector("img");
  if (img) {
    // img.src is the resolved URL; use img.getAttribute("src") for the raw value.
    console.log(img.src);
  }
}

<coral-card variant="condensed" data-timeline="true" stacked>
    <coral-card-asset>
        <img src="/content/dam/collections/3/3qtVFsGwnDVKpZ6H_SaM/lightbox.folderthumbnail.jpg?width=240&height=240">
    </coral-card-asset>
 </coral-card>

<coral-card variant="semi-condensed" data-timeline="true" stacked>
    <coral-card-asset>
        <img src="/content/dam/collections/3/3qtVFsGwnDVKpZ6H_SaM/small.folderthumbnail.jpg?width=240&height=240">
    </coral-card-asset>
 </coral-card>

2 Comments

Thanks. The coral-card markup is part of a very large HTML response, say htmlResponse. How would document.getElementsByTagName work in that case? Should I first convert the HTML response string to a DOM using parseHTML?
@Geek Yes. Once it's parsed from a string, you can use the DOM API to traverse it and extract whatever you want.
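
To make the exchange above concrete, here is a minimal sketch, assuming a browser environment where the standard DOMParser is available. The names htmlResponse, parseHtml and cardImageSources are hypothetical (not from the original posts), and the sample markup is shortened for illustration.

```javascript
// Sketch only: htmlResponse and both helper names are hypothetical.

// Parse an HTML string into a detached Document (browser-only API).
function parseHtml(htmlResponse) {
  return new DOMParser().parseFromString(htmlResponse, "text/html");
}

// Collect the raw src attribute of every <img> nested inside a <coral-card>.
// `root` can be a Document or an Element; anything with querySelectorAll works.
function cardImageSources(root) {
  var images = root.querySelectorAll("coral-card img");
  return Array.from(images, function (img) {
    return img.getAttribute("src");
  });
}

// Browser usage, guarded so the script also loads where DOMParser is absent:
if (typeof DOMParser !== "undefined") {
  var htmlResponse =
    '<coral-card variant="condensed"><coral-card-asset>' +
    '<img src="/a.jpg"></coral-card-asset></coral-card>' +
    '<img src="/outside.jpg">';
  console.log(cardImageSources(parseHtml(htmlResponse)));
}
```

The descendant selector "coral-card img" does the filtering in one step, so images outside a coral-card are never returned. Reading getAttribute("src") keeps the path as it was written in the markup, whereas img.src would give the resolved absolute URL.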
