0

I have a string that contains a fair bit of XML; it is actually the XML that describes a Word document (document.xml). I simply want to replace part of the string with an empty string, effectively removing it. This sounds straightforward, but I'm not getting the result I expect.

Here is what some of the XML looks like; this is just the first 10 lines:

<w:body xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
    <w:p w:rsidR="00CB3A3E" w:rsidP="00257CF7" w:rsidRDefault="008C1E91">
        <w:pPr>
            <w:pStyle w:val="Heading-Title" />
        </w:pPr>
        <w:r>
            <w:t>References</w:t>
        </w:r>
    </w:p>
    <w:sdt> 

As I said, this is in a string. I am simply trying to replace <w:t>References</w:t> with an empty string, like so:

//xmlBody is the string that is holding the xml
xmlBody.Replace("<w:t>References</w:t>", " ");

This is not working, the string is unaltered when I do this. What am I doing wrong? Any advice would be appreciated, thanks much!

    Please try to use proper XML objects to manipulate XML. First, you will not produce invalid XML this way, and second, you'll avoid asking "how to parse/search XML with regular expressions" when you find that <w:t> could be on a separate line from its text or something similar. Commented Aug 8, 2012 at 19:34

5 Answers

3
xmlBody = xmlBody.Replace("<w:t>References</w:t>", "");

The Replace method doesn't change the source string; it returns a new string with the changes made. In fact, C# strings cannot be changed at all: the documentation describes them as immutable.
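To make the difference concrete, here is a minimal sketch comparing a discarded result with an assigned one. The class name `ReplaceDemo`, the helper `RemoveReferencesRun`, and the shortened XML string are made up for illustration; only `String.Replace` comes from the question.

```csharp
using System;

public static class ReplaceDemo
{
    // Hypothetical helper: returns a copy of the input without the
    // <w:t>References</w:t> run. The input string itself is never modified.
    public static string RemoveReferencesRun(string xmlBody)
    {
        return xmlBody.Replace("<w:t>References</w:t>", "");
    }

    public static void Main()
    {
        // Shortened stand-in for the real document.xml content.
        string xmlBody = "<w:r><w:t>References</w:t></w:r>";

        // Result discarded: xmlBody still holds the original text.
        xmlBody.Replace("<w:t>References</w:t>", "");
        Console.WriteLine(xmlBody); // <w:r><w:t>References</w:t></w:r>

        // Result assigned back: xmlBody now refers to the new string.
        xmlBody = RemoveReferencesRun(xmlBody);
        Console.WriteLine(xmlBody); // <w:r></w:r>
    }
}
```

Note that the sketch passes "" rather than " " as the replacement, since the question asks for an empty string.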

3

In C#, a string is immutable: once created, it cannot be changed. Replace returns a new string instance, so you need to assign its return value:

xmlBody = xmlBody.Replace("<w:t>References</w:t>", " ");

As a side note, manipulating XML with plain string replacement or regular expressions is not considered good practice, because it is too fragile. Consider using XDocument or a similar XML API to remove the elements you are after.
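A rough sketch of the XML-API approach using LINQ to XML follows. It assumes the string is well-formed XML with a single root element that declares the `w` namespace, as in the excerpt from the question; the class name `XmlRemoveDemo`, the method `RemoveTextRuns`, and the sample input are hypothetical.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XmlRemoveDemo
{
    // WordprocessingML namespace, copied from the xmlns:w declaration in the question.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: removes every <w:t> element whose text equals `text`
    // and returns the resulting XML as a string.
    public static string RemoveTextRuns(string xmlBody, string text)
    {
        // Assumes xmlBody is a complete, well-formed element; Parse throws otherwise.
        XElement body = XElement.Parse(xmlBody);

        body.Descendants(W + "t")
            .Where(t => t.Value == text)
            .Remove(); // Extensions.Remove detaches each matched node from its parent

        return body.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        string xmlBody =
            "<w:body xmlns:w=\"http://schemas.openxmlformats.org/wordprocessingml/2006/main\">" +
            "<w:p><w:r><w:t>References</w:t></w:r></w:p></w:body>";

        Console.WriteLine(RemoveTextRuns(xmlBody, "References"));
    }
}
```

Because the match is made on parsed elements rather than on raw text, it keeps working when the tag and its text sit on separate lines or when <w:t> carries attributes.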

2

String.Replace returns a new string; it doesn't change the original string.

Try:

xmlBody = xmlBody.Replace("<w:t>References</w:t>", " ");

1

Replace isn't an in-place replacement; it returns a new string with the replacement made.

0

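Replace everything between <w:t> tags with an empty string:

// Requires: using System.Text.RegularExpressions;
// [\s\S]*? lazily matches any characters, including newlines.
xmlBody = Regex.Replace(xmlBody,
                        @"<w:t>[\s\S]*?</w:t>",
                        "<w:t></w:t>");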
