1

I just wrote a console application to replace a certain string in a large number of UTF-8 encoded files. I need to cover about 20 different cases of this string, so I reduced my code snippet to the necessary parts. The code looks like this:

foreach (String file in allFIles)
{
    string text = "";
    using (StreamReader r = new StreamReader(file))
    {
        text = r.ReadToEnd();
    }

    if (text.Contains(Case1))
    {
        string textCase1 = "";
        using (StreamReader rCase1Reader = new StreamReader(file))
        {
            textCase1 = rCase1Reader.ReadToEnd().Replace(Case1, Case1Const);
        }
        using (StreamWriter wCase1 = new StreamWriter(file, false, Encoding.UTF8))
        {
            wCase1.Write(textCase1);
        }

        UsedFIles.Add(file);
    }
}

My problem is that if a file contains the string I want to replace, for example "partnumber: 58", and also a very similar string such as "partnumber: 585", my code replaces the matching part of the similar string as well. Is there a way I can avoid this behavior?

8
  • 6
    Use regular expressions. Paste some example of files. Commented Jan 13, 2016 at 10:38
  • what does the whole string look like? Commented Jan 13, 2016 at 10:39
  • 1
    You need to read the next character. Determine if it's a delimiter and then decide if you want to do the replacement. Commented Jan 13, 2016 at 10:40
  • 1
    Why do you read the text twice? For me, File.ReadAllText and File.WriteAllText seem a lot simpler. Commented Jan 13, 2016 at 10:44
  • 2
    When you correctly find "partnumber: 58", what does the next character(s) look like? If that is a recognisable delimiter (space, semicolon, ...), add that to both search and replace. Commented Jan 13, 2016 at 10:52

3 Answers

1

Read the whole file, find the string you're interested in, and then check the character right after it, assuming the text continues past the match. If the match is at the very end of the text, replace it directly.

    // Requires: using System; using System.IO;
    // IsDelimiter(char) is a helper you supply (e.g. true for whitespace or punctuation).
    foreach (string file in allFIles)
    {
        string text = File.ReadAllText(file);

        int x = text.IndexOf(Case1, StringComparison.Ordinal);
        while (x > -1)
        {
            int end = x + Case1.Length;

            // Replace only when the match ends the text or is followed by a delimiter.
            if (end == text.Length || IsDelimiter(text[end]))
            {
                text = text.Substring(0, x) + Case1Const + text.Substring(end);
                x += Case1Const.Length; // continue after the inserted replacement
            }
            else
            {
                x = end; // skip the partial match, e.g. the "58" inside "585"
            }

            x = text.IndexOf(Case1, x, StringComparison.Ordinal);
        }

        File.WriteAllText(file, text);
    }
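
The delimiter check described in this answer can also be sketched as a self-contained helper. This is only an illustration under assumptions: the class name `DelimitedReplace`, the method `ReplaceWhole`, and the rule that any character other than a letter or digit counts as a delimiter are all hypothetical and not part of the original answer, so adjust `IsDelimiter` to whatever actually follows a part number in your files.

```csharp
using System;
using System.Text;

public static class DelimitedReplace
{
    // Assumed delimiter rule: anything that is not a letter or digit.
    public static bool IsDelimiter(char c) => !char.IsLetterOrDigit(c);

    // Replaces `search` with `replacement` only where the match ends the text
    // or is immediately followed by a delimiter character.
    public static string ReplaceWhole(string text, string search, string replacement)
    {
        if (string.IsNullOrEmpty(search)) return text;

        var sb = new StringBuilder();
        int pos = 0; // start of the text not yet copied to the output
        int x = text.IndexOf(search, StringComparison.Ordinal);
        while (x > -1)
        {
            int end = x + search.Length;
            sb.Append(text, pos, x - pos); // copy everything before the match

            if (end == text.Length || IsDelimiter(text[end]))
                sb.Append(replacement);    // whole match: replace it
            else
                sb.Append(search);         // prefix of a longer token: keep it

            pos = end;
            x = text.IndexOf(search, pos, StringComparison.Ordinal);
        }
        sb.Append(text, pos, text.Length - pos); // copy the tail
        return sb.ToString();
    }
}
```

With this sketch, `DelimitedReplace.ReplaceWhole(text, Case1, Case1Const)` could stand in for the inner loop, and building the result in a `StringBuilder` avoids re-copying the whole text on every replacement.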

0

Use Regex

new Regex(@"partnumber: 58$");

6 Comments

Why not use string.Replace? As James Barrass comments, if there's a common delimiter then it's probably more efficient to use that.
That regex will match both 58 and 585. The question specifically wants to avoid that.
That Regex is wrong. First, it will match both; second, you've added a double slash: regexr.com/3cik8. Even with the double slash removed it doesn't work: regexr.com/3cikb
Still wrong: ^ is the beginning of a string, you mean $.
Assuming the file contains "partnumber: 58 - An interesting item, partnumber: 585 - A boring item", or anything else after the part number, this is still wrong.
0

You could try:

var files = new[] { "File1.txt", "File2.txt", "File3.txt" };
// Where the key is the regex pattern to match and the value is the string to replace it with.
var patterns = new Dictionary<string, string>()
{
    { @"partnumber: \d\d", "FooBar" },
};

foreach(var file in files)
{
    string str = File.ReadAllText(file);
    foreach (var pattern in patterns)
        str = Regex.Replace(str, pattern.Key, pattern.Value);
    File.WriteAllText(file, str);
}

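This example uses regular expressions (regex). The pattern partnumber: \d\d matches the text 'partnumber: ' followed by two digits. On its own it also matches the first two digits of a longer number such as 585, so append a negative lookahead, partnumber: \d\d(?!\d), if longer numbers must be left alone. Regex is very powerful and lets you describe very specific cases, so you can extend the dictionary with one entry per pattern.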
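
As a hedged sketch of how a regex can target one specific part number without touching longer ones, the helper below escapes the literal search text and adds a negative lookahead for a following digit. The names `PartNumberRegex` and `ReplaceExact` are hypothetical, and the assumption that "a longer number" is the only near-match to avoid comes from the question's example, not from the answer.

```csharp
using System.Text.RegularExpressions;

public static class PartNumberRegex
{
    // Replaces `search` only where it is NOT immediately followed by a digit,
    // so "partnumber: 58" is replaced but "partnumber: 585" is left alone.
    public static string ReplaceExact(string text, string search, string replacement)
    {
        // Regex.Escape keeps any special characters in the search text literal;
        // (?!\d) is a negative lookahead that rejects a following digit.
        string pattern = Regex.Escape(search) + @"(?!\d)";

        // "$" is special in replacement strings, so double it to insert it literally.
        return Regex.Replace(text, pattern, replacement.Replace("$", "$$"));
    }
}
```

In the loop above, this could be called as `str = PartNumberRegex.ReplaceExact(str, "partnumber: 58", "FooBar");`, with one call per case instead of one dictionary entry per hand-written pattern.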
