Extract data in XML file with bash

Question

I need help extracting a string from an XML file like this:

<line>
<Start_Time>2016-May-18 17.06.17.504</Start_Time>
<Domain>pciereg062</Domain>
<Injected_tags>
 before xml started ; AUTOMATIC-REPRODUCTION-stopped on barrier ;
</Injected_tags>
</line>

<line>
<Start_Time>2016-May-18 17.08.53.585</Start_Time>
<Domain>adv191</Domain>
<Injected_tags>port-num-0 ; port-num-0 actual-FW-14.16.0234 ;
</Injected_tags>
</line>

I want to extract the domain name of every record whose Injected_tags element (which always comes after Domain) contains the string "stopped on barrier".

Is there a simple bash utility to do this (grep, awk, sed)?

From the example above, the output should be pciereg062 and not adv191.

See: Using xmlstarlet, how do I change the value of an element
– Cyrus, Commented May 22, 2016 at 21:35
While this is a quick and dirty solution that may well shoot you in the foot, it should work as long as the structure of your XML input stays the same: grep -B 2 'stopped on barrier' input.xml | grep -Po '(?<=<Domain>).*(?=</Domain>)'. You really should look into an XML parser, as Cyrus suggested.
– Pavel Gurkov, Commented May 22, 2016 at 21:40

Ed Morton · Accepted Answer · 2016-05-22 21:41:49Z


With GNU awk for multi-char RS:

$ awk -v RS='</[^>]+>' -F'[<>]' '{m[$2]=$3} $2=="Injected_tags" && /stopped on barrier/{print m["Domain"]}' file
pciereg062
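
In the one-liner above, RS='</[^>]+>' ends each record at a closing tag and -F'[<>]' splits on angle brackets, so for a record such as <Domain>pciereg062 the tag name lands in $2 and its text in $3. If GNU awk is not available, the same idea can be sketched line by line with a small state machine. This is only a sketch under assumptions: the file name sample.xml and the variable names domain and in_tags are made up for illustration, and it expects <Domain>...</Domain> to sit on a single line, as it does in the question's sample.

```shell
#!/bin/sh
# Sample input copied from the question; sample.xml is a hypothetical file name.
cat > sample.xml <<'EOF'
<line>
<Start_Time>2016-May-18 17.06.17.504</Start_Time>
<Domain>pciereg062</Domain>
<Injected_tags>
 before xml started ; AUTOMATIC-REPRODUCTION-stopped on barrier ;
</Injected_tags>
</line>

<line>
<Start_Time>2016-May-18 17.08.53.585</Start_Time>
<Domain>adv191</Domain>
<Injected_tags>port-num-0 ; port-num-0 actual-FW-14.16.0234 ;
</Injected_tags>
</line>
EOF

result=$(awk '
  # Remember the most recent Domain value (assumes open and close tag on one line).
  /<Domain>/ {
      domain = $0
      sub(/.*<Domain>/, "", domain)
      sub(/<\/Domain>.*/, "", domain)
  }
  # Track whether the current line is inside an Injected_tags element.
  /<Injected_tags>/ { in_tags = 1 }
  # Print the remembered domain once per matching Injected_tags element.
  in_tags && /stopped on barrier/ { print domain; in_tags = 0 }
  /<\/Injected_tags>/ { in_tags = 0 }
' sample.xml)

echo "$result"
```

Like the one-liner, this relies on Domain appearing before Injected_tags within each <line> block. It is still text matching rather than XML parsing, so a real parser remains the safer choice if the layout can change.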
