I am trying to replicate a workplace problem. I have an XML file like the one below:
[~]$ less -N sample.xml
1 <SOURCE BUSINESSNAME ="" NAME ="TABLE1" FOO="ABCD"..... >
2 <SOURCEFIELD BUSINESSNAME ="" NAME ="COL_XYZ" />
3 <SOURCEFIELD BUSINESSNAME ="" NAME ="COL_ABCD" />
4 ...
5 ...
6 </SOURCE>
7 <SOURCE BUSINESSNAME ="" NAME ="TABLE2" ....... >
8 <SOURCEFIELD BUSINESSNAME ="" NAME ="COL_ABCD" />
9 <SOURCEFIELD BUSINESSNAME ="" NAME ="COL_XYZABC" />
10 ...
11 ...
12 </SOURCE>
13 <SOURCE BUSINESSNAME ="" NAME ="TABLE3" .... >
14 <SOURCEFIELD BUSINESSNAME ="" NAME ="COL_PQR" />
15 <SOURCEFIELD BUSINESSNAME ="" NAME ="COL_ABCD" />
16 ...
17 ...
18 </SOURCE>
Now I want the value of the NAME attribute of every SOURCE element in which at least one SOURCEFIELD NAME contains XYZ.
In the example above I need TABLE1, because line 2 contains COL_XYZ, and also TABLE2, because line 9 contains COL_XYZABC.
My idea was to first reduce the file to rows 1, 2, 7, 9 and 13, and then run grep -B1 XYZ | grep -w SOURCE on that to keep only rows 1 and 7.
Expected Output:
TABLE1
TABLE2
What I tried so far
- Doing a grep on SOURCE does not work, because every row contains it at least once.
- Doing an egrep -w "SOURCE|XYZ" does not work, because with -w a name such as XYZABC does not satisfy the condition, and I need it to match.
Could someone please suggest something I can try to get the desired result? I am using Linux 2.6.18-371.el5.
TABLE3 doesn't have a matching XYZ name (in lines 14 and 15), so I don't need TABLE3 in the output. The answer by RobertL worked like a charm. I also came up with another answer, which is slower but works. Thanks for giving it a thought anyway. Cheers.
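
For reference, here is a minimal awk sketch of one way to produce the expected output. It is not necessarily the method used in RobertL's answer. It assumes that every SOURCE and SOURCEFIELD tag sits on its own line, as in the sample, and that the attribute is always written as NAME ="..." with a space before NAME. The function name tables_with_field and the trimmed-down sample data are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the NAME of every SOURCE block that contains
# at least one SOURCEFIELD whose NAME includes the given substring.
# Usage: tables_with_field SUBSTRING FILE
tables_with_field() {
    awk -v pat="$1" '
        # Return the value of the NAME attribute on a line, or "" if absent.
        # The leading space in the regex keeps BUSINESSNAME from matching.
        function name_attr(line,    s) {
            if (!match(line, / NAME *="[^"]*"/)) return ""
            s = substr(line, RSTART, RLENGTH)
            sub(/^ NAME *="/, "", s)
            sub(/"$/, "", s)
            return s
        }
        # Opening SOURCE tag: remember the table name, reset the printed flag.
        /<SOURCE / {
            table = name_attr($0)
            printed = 0
        }
        # SOURCEFIELD tag: print the table once if the field name contains pat.
        /<SOURCEFIELD / {
            if (!printed && index(name_attr($0), pat) > 0) {
                print table
                printed = 1
            }
        }
    ' "$2"
}

# Trimmed-down copy of the sample (elided attributes and rows left out).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<SOURCE BUSINESSNAME ="" NAME ="TABLE1" FOO="ABCD" >
<SOURCEFIELD BUSINESSNAME ="" NAME ="COL_XYZ" />
<SOURCEFIELD BUSINESSNAME ="" NAME ="COL_ABCD" />
</SOURCE>
<SOURCE BUSINESSNAME ="" NAME ="TABLE2" >
<SOURCEFIELD BUSINESSNAME ="" NAME ="COL_ABCD" />
<SOURCEFIELD BUSINESSNAME ="" NAME ="COL_XYZABC" />
</SOURCE>
<SOURCE BUSINESSNAME ="" NAME ="TABLE3" >
<SOURCEFIELD BUSINESSNAME ="" NAME ="COL_PQR" />
<SOURCEFIELD BUSINESSNAME ="" NAME ="COL_ABCD" />
</SOURCE>
EOF

tables_with_field XYZ "$sample"   # prints TABLE1 and TABLE2, one per line
rm -f "$sample"
```

Matching with index() treats XYZ as a plain substring, so COL_XYZ and COL_XYZABC both count, which is what grep -w refuses to do. If the real file wraps tags across several lines, a proper XML tool would be a safer choice than line-based matching.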