
I have an XML file similar to the following:

<?xml version="1.0" encoding="UTF-8"?>
<csw:GetRecordByIdResponse xmlns:csw="http://www.opengis.net/cat/csw/2.0.2">
  <xmlns:gmi="http://sdi.eurac.edu/metadata/iso19139-2/schema/gmi" xmlns:gmd="http://www.isotc211.org/2005/gmd" xmlns:gco="http://www.isotc211.org/2005/gco" xmlns:gml="http://www.opengis.net/gml" xmlns:geonet="http://www.fao.org/geonetwork" gco:isoType="gmd:MD_Metadata">
    <gmd:onLine>
                  <gmd:CI_OnlineResource>
                    <gmd:linkage>
                      <gmd:URL>http://server.test.it/geoserver/test_product/wms?SERVICE=WMS&amp;TIME=2018-06-14T10:59:00Z&amp;</gmd:URL>
                    </gmd:linkage>
                    <gmd:protocol>
                      <gco:CharacterString>OGC:WMS-1.1.1-http-get-map</gco:CharacterString>
                    </gmd:protocol>
                    <gmd:name>
                      <gco:CharacterString>test_product:test_product</gco:CharacterString>
                    </gmd:name>
                    <gmd:description>
                      <gco:CharacterString>test_product:test_product</gco:CharacterString>
                    </gmd:description>
                  </gmd:CI_OnlineResource>
    </gmd:onLine>
</csw>

I would like to replace the content of the gmd:URL tag with the following:

http://server.test.it/geoserver/test_product/wms?SERVICE=WMS&version=1.1.0&request=GetMap&layers=test_product:test_product&styles=&bbox=140442.2309,3739661.3694,1330442.2309,2564661.3694&width=768&height=576&srs=EPSG:32632&format=application/openlayers&TIME=2018-06-14T10:59:00Z&amp;

I tried to do this with the sed command in bash:

correct_url='http://server.test.it/geoserver/test_product/wms?SERVICE=WMS&amp;version=1.1.0&amp;request=GetMap&amp;layers=test_product:test_product&amp;styles=&amp;bbox=140442.2309,3739661.3694,1330442.2309,2564661.3694&amp;width=768&amp;height=576&amp;srs=EPSG:32632&amp;format=application/openlayers&amp;TIME=2018-06-14T10:59:00Z&amp;'
sed -i 's/<gmd:URL>\(.*\)<\/gmd:URL>/<gmd:URL>'"${correct_url}"'<\/gmd:URL>/' xml_file.xml

It gives me an error:

sed: -e expression #1, char 52: unknown option to `s'

Could you please tell me what I'm doing wrong?

UPDATE:

Following the suggestion of @rubystallion, I tried to escape all the special characters:

correct_url='http://server.test.it/geoserver/test_product/wms?SERVICE=WMS&amp;version=1.1.0&amp;request=GetMap&amp;layers=test_product:test_product&amp;styles=&amp;bbox=140442.2309,3739661.3694,1330442.2309,2564661.3694&amp;width=768&amp;height=576&amp;srs=EPSG:32632&amp;format=application/openlayers&amp;TIME=2018-06-14T10:59:00Z&amp;'
correct_url_escaped="${correct_url//\//\\\/}"
correct_url_escaped="${correct_url_escaped//&/\\&}"
correct_url_escaped="${correct_url_escaped/\?/\?}"
correct_url_escaped="${correct_url_escaped/\?/\?}"
correct_url_escaped="${correct_url_escaped//\;/\;}"
correct_url_escaped="${correct_url_escaped//\=/\=}"

sed -i 's/<gmd:URL>\(.*\)<\/gmd:URL>/<gmd:URL>'"${correct_url_escaped}"'<\/gmd:URL>/' xml_file.xml

But I'm still getting an error:

sed: -e expression #1, char 47: unknown option to `s'

Am I still missing something?

  • Don't use sed to modify XML; instead, use an XML-aware tool. Commented Jul 18, 2018 at 6:20
  • Your XML is not valid: xmllint returns many errors of the form namespace error : Namespace prefix gmd on ... is not defined. Commented Jul 18, 2018 at 6:23
  • @choroba I added the namespaces; I had forgotten to include them. Commented Jul 18, 2018 at 6:31
  • Don't parse XML/HTML with regex. I suggest using an XML/HTML parser (xmlstarlet, xmllint ...). Commented Jul 18, 2018 at 6:38
  • @Cyrus Something like xmlstarlet ed -u "//gmd:url" -v $correct_url xml_file.xml? Commented Jul 18, 2018 at 6:46

2 Answers


Your URL has special characters in it, and you are substituting the URL into the executed command. If you place an echo in front of your sed command line, you'll see what is actually executed, which clearly isn't going to be a valid sed command.
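
As a rough illustration of that echo trick, the sketch below assembles the sed expression the same way the question does and prints it instead of running it. The short demo_url is a made-up stand-in for the real URL (an assumption, chosen only because it contains both / and &).

```shell
#!/usr/bin/env bash
# Hypothetical short URL standing in for the real one; it contains '/' and '&'.
demo_url='http://host/wms?A=1&amp;B=2'

# Build the sed expression exactly as the question does, but only print it.
expr='s/<gmd:URL>\(.*\)<\/gmd:URL>/<gmd:URL>'"${demo_url}"'<\/gmd:URL>/'
printf '%s\n' "$expr"
```

In the printed expression, the first / of http:// is unescaped, so sed takes it as the end of the replacement and tries to read the rest as flags, which is where the unknown option to `s' message comes from.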

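You need to escape the URL, or just not place it directly into your sed command. You can achieve the latter with GNU sed's e flag: after a successful substitution, sed runs the resulting pattern space as a shell command and replaces it with that command's output. In the command below, url is passed to sed as an environment variable, and the single quotes keep bash from expanding $url, so the shell started by the e flag expands it instead. Like this: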

url="http://x:[email protected]/foo?a=b&c=d" sed -r -i 's/(\s*)<gmd:URL>(.*)<\/gmd:URL>/echo "\1<gmd:URL>$url<\/gmd:URL>"/e' xml_file.xml

Note, you should be cautious about using the e flag; because you are executing something there are potential security issues.

Also, please heed the generally good advice about using an XML editing tool to edit XML (for simple one-off jobs like this, IMO it's fine to use sed if it's the quickest way to get it done ...).


As the commenters have mentioned, you can write more maintainable scripts and avoid errors by using XML-aware tools, but let me show you why your code doesn't work:

Bash substitutes variables in strings with their contents before executing commands, so / will be parsed as a delimiter by sed and & will be parsed as the whole match in the substitution string. If you escape special characters correctly, then your command will work as intended:

correct_url='http://server.test.it/geoserver/test_product/wms?SERVICE=WMS&amp;version=1.1.0&amp;request=GetMap&amp;layers=test_product:test_product&amp;styles=&amp;bbox=140442.2309,3739661.3694,1330442.2309,2564661.3694&amp;width=768&amp;height=576&amp;srs=EPSG:32632&amp;format=application/openlayers&amp;TIME=2018-06-14T10:59:00Z&amp;'
correct_url_escaped="${correct_url//\//\\\/}"
correct_url_escaped="${correct_url_escaped//&/\\&}"

token='http://server.test.it/geoserver/test_product/wms?SERVICE=WMS&amp;TIME=2018-06-14T10:59:00Z&amp;'

sed -i 's/<gmd:URL>\(.*\)<\/gmd:URL>/<gmd:URL>'"${correct_url_escaped}"'<\/gmd:URL>/' xml_file.xml
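
To see the effect of the escaping in isolation, here is a minimal sketch on a single line of input rather than the real file. The names demo_url and sample are hypothetical, and the escaping is done with a small sed call instead of bash substitution; that is a swapped-in variant of the same idea, not the code above.

```shell
#!/usr/bin/env bash
# Hypothetical short URL and one-line sample; neither comes from the real file.
demo_url='http://host/wms?A=1&amp;B=2'
sample='<gmd:URL>http://host/old</gmd:URL>'

# Unescaped '&' re-inserts the matched text ('|' as delimiter sidesteps the '/' clash).
broken=$(printf '%s\n' "$sample" | sed 's|<gmd:URL>.*</gmd:URL>|<gmd:URL>'"$demo_url"'</gmd:URL>|')

# Prefix every '/' and '&' with a backslash so sed treats them literally.
escaped=$(printf '%s\n' "$demo_url" | sed 's/[\/&]/\\&/g')
fixed=$(printf '%s\n' "$sample" | sed 's/<gmd:URL>.*<\/gmd:URL>/<gmd:URL>'"$escaped"'<\/gmd:URL>/')

printf '%s\n' "$broken" "$fixed"
```

The first printed line has the old element pasted into the middle of the URL where the & was; the second contains the URL unchanged.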

Also, next time please make sure that the commands in your question run exactly as posted. You forgot to put quotes around the variables.

2 Comments

Hi @rubystallion, unfortunately I still get the same error: sed: -e expression #1, char 47: unknown option to `s'. P.S. I added the quotes. Thanks!
You don't have to escape the question mark, because you're inserting the URL into the replacement part of the substitution, where question marks don't have a special meaning. If you copy the code I gave verbatim into a file script.sh in the same directory as your XML file and then run bash script.sh, it should work. To escape special characters, you have to use backslashes, which is what I did in the second and third line using bash substitution.
