4

I have the following XML data:

<info>
    <data>
        <name>my_image</name>
        <value>site.com/image1.img</value>
    </data>
    <data>
        <name>my_link</name>
        <value>site.com/p_info</value>
    </data>
</info>

I want to replace site.com with siteimag.com in the value of every my_image entry (only for entries whose name is my_image).

So the result will be:

<info>
    <data>
        <name>my_image</name>
        <value>siteimag.com/image1.img</value>
    </data>
    <data>
        <name>my_link</name>
        <value>site.com/p_info</value>
    </data>
</info>

How can this be done with sed?

Thanks.

1
  • Parsing XML with sed is a bad idea. Commented Oct 16, 2017 at 10:31

3 Answers

5
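sed '/my_image/{n;s/site\.com/siteimag.com/}' file

Brief explanation:

  • /my_image/: match each line that contains "my_image".
  • Once a line matches:
    • n prints the current line and reads the next line into the pattern space.
    • s/site\.com/siteimag.com/ performs the substitution on that next line. The dot is escaped so that it matches a literal "." rather than any character.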
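As a quick check, the command can be run against the sample from the question. This is only a minimal sketch: the temporary file and the result variable are illustrative names, and the ; before } is added so the script should also be accepted by BSD sed.

```shell
# Recreate the sample document from the question in a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<info>
    <data>
        <name>my_image</name>
        <value>site.com/image1.img</value>
    </data>
    <data>
        <name>my_link</name>
        <value>site.com/p_info</value>
    </data>
</info>
EOF

# On each line matching my_image: print it, load the next line (n),
# and apply the substitution to that next line only.
result=$(sed '/my_image/{n;s/site\.com/siteimag.com/;}' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Only the value that follows my_image is rewritten; the my_link value keeps site.com.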

5

The right way is to use an XML parser such as xmlstarlet:

xmlstarlet ed -u "//data/name[text()='my_image']/following-sibling::value" \
-v "$(sed 's/site/&imag/' <(xmlstarlet sel -t -v "//data/name[text()='my_image']/following-sibling::value" file.xml))" file.xml

The output:

<?xml version="1.0"?>
<info>
  <data>
    <name>my_image</name>
    <value>siteimag.com/image1.img</value>
  </data>
  <data>
    <name>my_link</name>
    <value>site.com/p_info</value>
  </data>
</info>


2

The same XML document can be written and formatted in many different ways, so updating it with a plain-text tool like sed is fragile and easy to break.

For example, just adding an empty line between name and value breaks the naïve sed script above:

<data>
    <name>my_image</name>

    <value>site.com/image1.img</value>
</data>

It also cannot follow the XML structure: it replaces site.com on any line that comes right after a line containing my_image, even if that match is inside a comment:

<data>
    <name>my_link</name> <!-- Used to be "my_image" -->
    <value>site.com/p_info</value>
</data>
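Both failure modes can be reproduced directly. The sketch below assumes a hypothetical helper named naive that wraps the line-based sed command, and feeds it the two fragments shown above.

```shell
# Hypothetical helper wrapping the line-based approach (dot escaped,
# ';' before '}' so BSD sed should accept it as well).
naive() { sed '/my_image/{n;s/site\.com/siteimag.com/;}'; }

# Case 1: a blank line between <name> and <value>. The n command loads the
# blank line, so the substitution never reaches the <value> line.
blank=$(printf '%s\n' \
    '<data>' \
    '    <name>my_image</name>' \
    '' \
    '    <value>site.com/image1.img</value>' \
    '</data>' | naive)

# Case 2: "my_image" appears only in a comment on the my_link entry,
# yet the line after it is rewritten anyway.
comment=$(printf '%s\n' \
    '<data>' \
    '    <name>my_link</name> <!-- Used to be "my_image" -->' \
    '    <value>site.com/p_info</value>' \
    '</data>' | naive)

printf '%s\n\n%s\n' "$blank" "$comment"
```

In the first case the image URL is left untouched (a missed replacement); in the second the my_link value is changed to siteimag.com/p_info (a false replacement).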

There are literally hundreds of other ways to modify the source XML without changing its meaning that would make the script fail or produce false replacements.

To avoid that, use an XML-aware tool such as xmlstarlet to do the replacement:

xmlstarlet ed \
-u "//data/name[text()='my_image']/../value" \
-x "concat(substring-before(.,'site.com'),'siteimag.com',substring-after(.,'site.com'))" \
data.xml
