In Linux, how can we use the grep command to print the content inside this tag?
<errorPayload>XXXXXXXX</errorPayload>
I tried grep -Po '<errorPayload>' abc.log, but it only prints <errorPayload>.
Don't use grep to parse XML or HTML.
Instead, use a proper parser:
xidel -s -e '//errorPayload/text()' file
XXXXXXXX
xmllint and xmlstarlet can work as well, though in fewer cases (their HTML support is weak):
xmlstarlet sel -t -v '//errorPayload/text()' file
xmllint --xpath '//errorPayload/text()' file
It only prints <errorPayload> because that's what you told it to do by using the -o (--only-matching) option. From the man page, that means "Print only the matched (non-empty) parts of a matching line..."
If you want to see just the content of the tag, you need to create a regular expression that matches only the content, but not the start/end tag.
This should do it:
grep -Po '(?<=<errorPayload>).*(?=</errorPayload>)' abc.log
Given your sample input in abc.log, this produces:
XXXXXXXX
The expression (?<=<errorPayload>) is a "positive look-behind assertion": it means that the given pattern needs to match before our target expression, but is not considered part of the "matched content". The expression (?=</errorPayload>) is a "positive look-ahead assertion", which does the same thing but for a following pattern.
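To make the effect of the two assertions visible, here is a minimal sketch that runs the same pattern with and without them. The input line and the variable names (line, with_tags, content_only) are made up for illustration, and it assumes a GNU grep built with -P (PCRE) support.

```shell
# Hypothetical log line, used only for this demo.
line='prefix <errorPayload>XXXXXXXX</errorPayload> suffix'

# Without look-arounds, -o prints the whole match, tags included.
with_tags=$(printf '%s\n' "$line" | grep -Po '<errorPayload>.*</errorPayload>')

# With look-arounds, the tags must be present but are not part of the match.
content_only=$(printf '%s\n' "$line" | grep -Po '(?<=<errorPayload>).*(?=</errorPayload>)')

printf 'with tags:    %s\n' "$with_tags"
printf 'content only: %s\n' "$content_only"
```

In both cases the surrounding "prefix" and "suffix" text is dropped by -o; only the second command also drops the tags.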
See e.g. this article for more details about look-ahead and look-behind assertions.
Caveat: grep is a rotten tool for parsing XML. The above will work as long as the XML formatting in your log files is consistent.
If you have multiple <errorPayload>...</errorPayload> pairs on the same line, this will print the string found between the first opening <errorPayload> and the last closing </errorPayload>.
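The same-line problem comes from .* being greedy. One possible workaround, not part of the command above, is the lazy quantifier .*?, which stops at the nearest closing tag. The sketch below compares the two on a made-up input line; the variable names are illustrative, and it again assumes GNU grep with -P support.

```shell
# Hypothetical one-line input containing two tags.
line='<errorPayload>AAAA</errorPayload> <errorPayload>BBBB</errorPayload>'

# Greedy .* runs from the first opening tag to the last closing tag.
greedy=$(printf '%s\n' "$line" | grep -Po '(?<=<errorPayload>).*(?=</errorPayload>)')

# Lazy .*? stops at the nearest closing tag, so -o reports each payload on its own line.
lazy=$(printf '%s\n' "$line" | grep -Po '(?<=<errorPayload>).*?(?=</errorPayload>)')

printf 'greedy: %s\n' "$greedy"
printf 'lazy:\n%s\n' "$lazy"
```

This still does not handle nested tags or payloads that span several lines, so the caveat about grep and XML continues to apply.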
Can XXXXXXXX contain < or >? Can XXXXXXXX be multi-line? What if you have <errorPayload>AAAA <errorPayload>AAAA </errorPayload> </errorPayload>? Is that possible? Why do you need to do this with grep? Are you open to other tools? Can there be many errorPayload tags in the file or just one? Many on the same line or just one?