I need to replace an attribute value in an XML file with a fixed value whenever the existing value contains a partial match, using sed on Linux.

Example:

Input value:

<Table tableName="Data" primaryKey="PGPU_ID=1234" DBaction="A" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.02,6.03,6.04">

The requirement is to replace the version value with just 6.02 whenever "6.02" appears in the version value. So the output would be:

<Table tableName="Data" primaryKey="PGPU_ID=1234" DBaction="A" version="6.02">

Note: tableName="Data" is a fixed value, but PGPU_ID, DBaction and version can differ. So whenever tableName="Data" and the version list contains 6.02, the sed command should replace the version with just 6.02, keeping the other values exactly the same.

  • Welcome to the site. For structured data like XML and JSON, it is better to use dedicated parsing tools like xmlstarlet or jq instead of line-oriented tools like sed or awk, because the same data can end up formatted in different ways. Do you really have to use sed? Commented Jul 9, 2021 at 8:39
  • Thanks for your response! No, there is no requirement to use sed, but I was wondering: in order to use a parsing tool, do I need to install something on UNIX? If not, then it's totally fine. Commented Jul 9, 2021 at 8:44
  • What operating system are you using? If UNIX, which one? And are you really using Unix or did you mean Linux? Commented Jul 9, 2021 at 9:13
  • I am really sorry. this is Linux. Linux vlheemsdv03 4.12.14-122.12-default #1 SMP Thu Dec 19 12:19:34 UTC 2019 (6c5578e) x86_64 x86_64 x86_64 GNU/Linux Commented Jul 9, 2021 at 9:42
  • Is it SUSE Linux Enterprise Server? Commented Jul 9, 2021 at 22:33

3 Answers

Using xmlstarlet:

xmlstarlet ed -u '//Table/@version[ ../@tableName = "Data" and contains(.,"6.02") ]' -v '6.02' file.xml

This finds the version attributes of all Table nodes and selects those that contain the substring 6.02 and belong to a Table node whose tableName attribute has the value Data. These are updated to be only the string 6.02.

The result is written to standard output where you may redirect it into a new file, or you may use xmlstarlet ed --inplace -u ... to edit the document in-place (use with care).
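As a rough sketch of that workflow, the script below writes the edited document to a new file and diffs it against the original before anything is replaced. The sample document, the Tables root element and the file names are made up for illustration (the fragment in the question is not well-formed on its own), and the script only runs the edit if xmlstarlet is installed.

```shell
#!/bin/sh
# Hypothetical scratch directory and sample file, not taken from the question.
workdir=$(mktemp -d)

cat > "$workdir/sample.xml" <<'EOF'
<Tables>
  <Table tableName="Data" primaryKey="PGPU_ID=1234" DBaction="A" version="6.01,6.02,6.03"/>
  <Table tableName="Other" primaryKey="PGPU_ID=1234" DBaction="A" version="6.01,6.02,6.03"/>
  <Table tableName="Data" primaryKey="PGPU_ID=4284" DBaction="B" version="6.01,6.05"/>
</Tables>
EOF

if command -v xmlstarlet >/dev/null 2>&1; then
    # Same expression as above; the result goes to a new file, not in-place.
    xmlstarlet ed \
        -u '//Table/@version[ ../@tableName = "Data" and contains(.,"6.02") ]' \
        -v '6.02' "$workdir/sample.xml" > "$workdir/sample.new.xml"
    # Review the changes before replacing the original (diff exits 1 on differences).
    diff "$workdir/sample.xml" "$workdir/sample.new.xml" || true
else
    echo 'xmlstarlet is not installed; skipping the edit' >&2
fi
```

Note that contains() is a plain substring test, so a value such as 16.02 or 6.021 would also be selected. If that matters for your data, the expression would need a stricter test.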

  • I am getting an error: -ksh: xmlstarlet: not found [No such file or directory]. Also, is it possible to search the version field only for the table Data? The 6.02 version can exist for other tables as well as "Data", so those versions would get updated too. Commented Jul 9, 2021 at 9:39
  • @DebajitDutta You will need to install xmlstarlet if it's not already installed. Use your package manager. I will update the answer to also check the tableName attribute ASAP. Commented Jul 9, 2021 at 9:44
  • @DebajitDutta Then go through whatever protocol you need to go through to introduce xmlstarlet on your production servers. I would recommend installing it on the production server over adding broken home-grown XML parsing code that may potentially compromise your production system. Commented Jul 9, 2021 at 10:01
  • @DebajitDutta I can't underline that enough: Doing XML parsing using a line-oriented tool like sed (or awk, by default) is brittle and difficult to get right, but easy to do quick-n-dirty hacky edits with. Using a dedicated XML parser is safer. Commented Jul 9, 2021 at 10:07
  • @DebajitDutta You will unfortunately not get a sed command that does XML parsing from me. Commented Jul 9, 2021 at 12:02

Disclaimer: Never automatically change XML files with sed unless you are really sure what you are doing. It's usually easy to invent examples where the script will fail. In this case, oldversion="..." would match, for example.

sed 's/version="\([^",]*,\)*6\.02\(,[^",]*\)*"/version="6.02"/g'

You don't want the replacement if there is a version number like 16.02 or 6.021, which makes the pattern a little ugly. [^",]*, matches any string without a comma or double quote, followed by a comma, so \([^",]*,\)* matches zero or more of those fields. This way we make sure there is no digit or anything else directly before the 6.02. The same applies to the fields following the version number.
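A quick way to check those edge cases is to run the expression on a few hand-made lines. The sketch below wraps it in a hypothetical fix_version function. It escapes the dot so that 6.02 is matched literally, and it adds a /tableName="Data"/ address, which is an assumption that the attribute sits on the same line as version. The sample lines are made up.

```shell
#!/bin/sh
# Hypothetical wrapper: only lines containing tableName="Data" are edited.
fix_version() {
    sed '/tableName="Data"/s/version="\([^",]*,\)*6\.02\(,[^",]*\)*"/version="6.02"/g'
}

# 6.02 in the middle, first, last and alone: each line should end up with version="6.02".
printf '%s\n' \
    '<Table tableName="Data" DBaction="A" version="6.01,6.02,6.03">' \
    '<Table tableName="Data" DBaction="B" version="6.02,6.03">' \
    '<Table tableName="Data" DBaction="C" version="6.01,6.02">' \
    '<Table tableName="Data" DBaction="D" version="6.02">' | fix_version

# Near misses and a different table: these lines should pass through unchanged.
printf '%s\n' \
    '<Table tableName="Data" DBaction="E" version="6.01,16.02,6.03">' \
    '<Table tableName="Data" DBaction="F" version="6.01,6.021,6.03">' \
    '<Table tableName="Other" DBaction="G" version="6.01,6.02,6.03">' | fix_version
```

This only exercises the pattern on single-line input. It says nothing about attributes split across lines or other formatting freedoms that XML allows.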

  • OK, so are you saying I can use this command? Commented Jul 9, 2021 at 11:49
  • @DebajitDutta They're saying that it will do what you say you want to do given the exact data in the question (which, may I remind you, isn't even a well-formed XML fragment). Whether or not it will actually work in a production environment, under all circumstances, forever, is up to you to consider. Commented Jul 9, 2021 at 12:04
  • Yes, as @Kusalananda wrote, you can use it at your own risk. While the awk answer below will easily fail and damage your files, this one will almost certainly do the job. But XML still has some degrees of freedom that can make the command fail, if the tool that creates the file uses this freedom. If you really don't have any other option, try it carefully. Commented Jul 12, 2021 at 7:28

Will awk work?

awk '$2 ~ /tableName="Data"/ && $5 ~ /[",]6\.02[",]/ { $5 = "version=\"6.02\">" } 1'
  • Input
<Table tableName="Data" primaryKey="PGPU_ID=1234" DBaction="A" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.02,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=4284" DBaction="B" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.05,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=8827" DBaction="C" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.02,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=1234" DBaction="D" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.06,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=4284" DBaction="E" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.05,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=8827" DBaction="F" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.09,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=1234" DBaction="G" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.02,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=4284" DBaction="H" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.05,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=8827" DBaction="I" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.04,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=1234" DBaction="J" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.02,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=4284" DBaction="K" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.05,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=8827" DBaction="L" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.02,6.03,6.04">
  • Output
<Table tableName="Data" primaryKey="PGPU_ID=1234" DBaction="A" version="6.02">
<Table tableName="Data" primaryKey="PGPU_ID=4284" DBaction="B" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.05,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=8827" DBaction="C" version="6.02">
<Table tableName="Data" primaryKey="PGPU_ID=1234" DBaction="D" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.06,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=4284" DBaction="E" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.05,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=8827" DBaction="F" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.09,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=1234" DBaction="G" version="6.02">
<Table tableName="Data" primaryKey="PGPU_ID=4284" DBaction="H" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.05,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=8827" DBaction="I" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.04,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=1234" DBaction="J" version="6.02">
<Table tableName="Data" primaryKey="PGPU_ID=4284" DBaction="K" version="14.1,20.4,4.30,4.40,5.00,5.30,5.40,5.41,6.00,6.01,6.05,6.03,6.04">
<Table tableName="Data" primaryKey="PGPU_ID=8827" DBaction="L" version="6.02">
  • This requires that the attributes are ordered in a particular way, with no newlines between them. It also assumes that the string ,6.02, does not occur in the fifth field on any line other than the lines that we'd like to change. This will break if the Table node has any data after it, on the same line, that needs to be retained (or updated, if there are more Table nodes on the same line). Commented Jul 9, 2021 at 9:27
  • I updated it to include the original data that would not be altered if 6.02 does not appear in the version list. Commented Jul 9, 2021 at 9:59
  • What about if 6.02 is the first or the last version number in the list? Commented Jul 12, 2021 at 7:22
