sed/awk are really about regular expressions. check this answer on stackoverflow for why parsing HTML/XML with regular expressions is a bad idea.

for XML you really need to build a DOM of the document and then find your information in that tree. there are cmdline tools like xmlstarlet that allow you to get information out of XML documents.

but do not try using sed/awk to parse XML
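as a rough sketch of what such a DOM-aware query could look like: the following assumes xmlstarlet is installed, and it uses a made-up sample document (the element names `config`, `server`, `name` and `port` are hypothetical, not taken from the question). the value is selected by its position in the XML tree via an XPath expression, so the line layout of the file does not matter.

```shell
#!/bin/sh
# work in a scratch directory so nothing existing gets overwritten
tmp=$(mktemp -d)

# hypothetical sample document; the element names are invented for illustration
cat > "$tmp/sample.xml" <<'EOF'
<?xml version="1.0"?>
<config>
  <server>
    <name>alpha</name>
    <port>8080</port>
  </server>
</config>
EOF

# select the text of <port> by its path in the tree, not by its line number
if command -v xmlstarlet >/dev/null 2>&1; then
    port=$(xmlstarlet sel -t -v '/config/server/port' "$tmp/sample.xml")
else
    # assumption: on some systems the binary may be installed under another name
    port=""
    echo "xmlstarlet is not installed" >&2
fi

printf 'port: %s\n' "$port"
```

the same XPath keeps working if the generator later reindents the file or puts everything on a single line, which is exactly what a line- or regex-based approach cannot promise.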

PS: of course, you might be able to create a simple regular expression that can extract the information needed on the files you happen to encounter in real life. e.g. the following will print the 5th line of the document, which (in your example) holds the relevant information.

# stupid and naive approach:
sed '5!d' MyXML.xml

but this makes an assumption about the line layout of the file, which has nothing to do with XML. it might work for a very specific generator of the given file, but it is not guaranteed to work with any XML file that follows the same structure (and structured data is what XML is all about).
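to illustrate the point, here is a small sketch with two made-up files (the element names are hypothetical). both hold the same XML content and differ only in whitespace, yet the line-based sed command only finds something in one of them.

```shell
#!/bin/sh
# work in a scratch directory so nothing existing gets overwritten
tmp=$(mktemp -d)

# hypothetical document, pretty-printed: the value happens to sit on line 5
cat > "$tmp/pretty.xml" <<'EOF'
<?xml version="1.0"?>
<config>
  <server>
    <name>alpha</name>
    <port>8080</port>
  </server>
</config>
EOF

# the same XML content written compactly: there is no line 5 at all
cat > "$tmp/compact.xml" <<'EOF'
<?xml version="1.0"?>
<config><server><name>alpha</name><port>8080</port></server></config>
EOF

# the naive approach: keep only line 5 of each file
pretty_line=$(sed '5!d' "$tmp/pretty.xml")
compact_line=$(sed '5!d' "$tmp/compact.xml")

printf 'pretty:  [%s]\n' "$pretty_line"
printf 'compact: [%s]\n' "$compact_line"
```

for the pretty-printed file the command returns the `<port>` line (including its indentation), for the compact one it returns nothing, although an XML parser would treat both documents as carrying the same data.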

umläute