I have a RegexRule class (RegexRule.cs) with the following properties:

    public string Expression { get; set; }
    public string FirstOpen { get; set; }
    public string FirstClose { get; set; }
    public string SecondOpen { get; set; }
    public string SecondClose { get; set; }

Expression holds a regular expression, and it is always expected to capture two groups.

The other four properties are the prefixes and suffixes for those two groups, so that the result is:

FirstOpen + Group[1] + FirstClose and SecondOpen + Group[2] + SecondClose

Anyway, I have a List<RegexRule> Rules; that holds a list of RegexRule objects.

The Predicament

My goal is to loop through each of those rules (RegexRule r), run its expression (r.Expression) on a particularly long string, and, wherever the two expected groups are found, wrap each group in its prefix and suffix in the way shown above. Again:

r.FirstOpen + Group[1] + r.FirstClose and r.SecondOpen + Group[2] + r.SecondClose

I've tried many different ways, but one thing I know is that str.Replace in a loop isn't going to work, because it applies the prefixes and suffixes over and over, once for every occurrence the expression matches.
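
To make the compounding concrete, here is a minimal sketch. The class name, method name, sample input and bracket affixes are all made up for illustration; only the loop shape (one str.Replace call per match) comes from the attempt described here.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CompoundingDemo
{
    // Calls string.Replace once per match, wrapping group 1 each time.
    public static string WrapWithReplace(string input, string pattern, string open, string close)
    {
        string str = input;
        foreach (Match m in Regex.Matches(input, pattern))
        {
            string value = m.Groups[1].Value;
            // Replace touches EVERY occurrence of value, including ones already wrapped.
            str = str.Replace(value, open + value + close);
        }
        return str;
    }

    public static void Main()
    {
        // "foo" is captured by two matches, so it gets wrapped twice.
        Console.WriteLine(WrapWithReplace("foo.a(); foo.b();", @"(\w+)\.(\w+)\(\);", "[", "]"));
        // prints: [[foo]].a(); [[foo]].b();
    }
}
```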

So how else can this be achieved?

Thank you.

Edit

This is what I've currently got:

foreach (RegexRule r in RegexRules.ToList())
{ 
    Regex rx = new Regex(r.Expression); 
    MatchCollection mc = rx.Matches(str); 
    foreach (Match m in mc) 
    { 
         MessageBox.Show("replacing");
         str = str.Replace(m.Groups[1].Value, r.OpenBBOne + m.Groups[1].Value + r.CloseBBOne);
    }
}

Edit 2 - Specifics

Users will create their own Regex configurations in a .config file, and it will be in this format:

reg {(\w+).(\w+)\(\);} = [("prefix1","suffix1"),("prefix2","suffix2")];


reg - Standard keyword for defining a new RegexRule
{(\w+).(\w+)\(\);} - Their regular expression, wrapped in braces (CONDITION: the expression must always return 2 groups in its matches)
[("prefix1","suffix1"),("prefix2","suffix2")] - Two pairs in the form [("",""),("","")], which hold the prefix and suffix for each of the two groups
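
For reference, one line in this format could be read roughly as follows. This is only a sketch under assumptions the format above does not spell out: one rule per line, no double quotes inside a prefix or suffix, and the last closing brace on the line ends the expression. ParsedRule and RuleConfigParser are hypothetical names, not existing code.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical holder for one parsed line; mirrors the RegexRule properties.
public class ParsedRule
{
    public string Expression { get; set; }
    public string FirstOpen { get; set; }
    public string FirstClose { get; set; }
    public string SecondOpen { get; set; }
    public string SecondClose { get; set; }
}

public static class RuleConfigParser
{
    // reg {<expression>} = [("open1","close1"),("open2","close2")];
    private static readonly Regex LineFormat = new Regex(
        @"^reg\s*\{(?<expr>.*)\}\s*=\s*\[\s*\(""(?<o1>[^""]*)""\s*,\s*""(?<c1>[^""]*)""\)" +
        @"\s*,\s*\(""(?<o2>[^""]*)""\s*,\s*""(?<c2>[^""]*)""\)\s*\]\s*;\s*$");

    // Returns null when the line does not follow the assumed format.
    public static ParsedRule Parse(string line)
    {
        Match m = LineFormat.Match(line.Trim());
        if (!m.Success)
        {
            return null;
        }
        return new ParsedRule
        {
            Expression = m.Groups["expr"].Value,
            FirstOpen = m.Groups["o1"].Value,
            FirstClose = m.Groups["c1"].Value,
            SecondOpen = m.Groups["o2"].Value,
            SecondClose = m.Groups["c2"].Value
        };
    }

    public static void Main()
    {
        ParsedRule rule = Parse(
            @"reg {(\w+).(\w+)\(\);} = [(""prefix1"",""suffix1""),(""prefix2"",""suffix2"")];");
        Console.WriteLine(rule.Expression + " | " + rule.FirstOpen + " | " + rule.SecondClose);
    }
}
```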

Example

If we applied the above configuration to this string:

Lorem ipsum foo.bar(); dolor sit bar.foo(); amit consecteteur...

The regex would capture foo.bar(); as the first match: foo is match[1] group[1], and bar is match[1] group[2], according to the regular expression.

The same goes for bar.foo();, where bar is match[2] group[1], and foo is match[2] group[2].

I hope this makes sense...
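
The grouping described above can be checked with a short sketch like this one. GroupDemo and Describe are names invented for the illustration; the pattern and the input are the ones from the example, and the matches are numbered from 1 to follow the wording used here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Lists group 1 and group 2 of every match, numbering matches from 1.
    public static List<string> Describe(string input, string pattern)
    {
        var lines = new List<string>();
        int index = 1;
        foreach (Match m in Regex.Matches(input, pattern))
        {
            lines.Add("match[" + index + "] group[1]=" + m.Groups[1].Value
                      + " group[2]=" + m.Groups[2].Value);
            index++;
        }
        return lines;
    }

    public static void Main()
    {
        string input = "Lorem ipsum foo.bar(); dolor sit bar.foo(); amit consecteteur...";
        foreach (string line in Describe(input, @"(\w+).(\w+)\(\);"))
        {
            Console.WriteLine(line);
        }
        // prints:
        // match[1] group[1]=foo group[2]=bar
        // match[2] group[1]=bar group[2]=foo
    }
}
```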

  • This really feels like you're trying to force a design pattern that doesn't fit the actual need. You might want to back up a step and try to look at what you're doing from a different angle. My $.02 Commented Aug 9, 2016 at 14:54
  • @JeremyHolovacs I don't know... how else could it be done? Commented Aug 9, 2016 at 14:57
  • @BarryD, suppose you've got ([0-9]).([a-z]) as a regex and 1aa1 as an input string. Do you want both 1s to be replaced, or only the first one (i.e. the one matching the regex)? Commented Aug 9, 2016 at 15:00
  • @BarryD. I'm not sure, I don't know what need you're trying to satisfy with this approach. But usually when I see people having problems with highly flexible code (like regex stuff) and trying to wrap structure around it, it's been a case of going down the wrong path on design. The problem is usually that you're trying to do something you shouldn't need to do. As for what that is in this case, I don't have anything of value to offer. Just be careful that your technical implementation is actually solving a business problem and not a problem of your own making. Commented Aug 9, 2016 at 15:03
  • Is there some reason why your List<RegexRule> cannot have unique values? If it was able to have unique values, would that not fix your problem, and you'd be able to use string.Replace()? Can you provide a sample of the input you use and the incorrect output you get? Commented Aug 9, 2016 at 15:06

1 Answer

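As per our discussion, I think this might be a solution for you. It has to do with the first comment I made. It processes each distinct captured value only once, using .Distinct(), so that you don't end up compounding the prefixes and suffixes. Note that Distinct() has to run on the captured strings rather than on the Match objects: Match does not override Equals, so two matches with identical text would still count as different.

foreach (RegexRule r in RegexRules.ToList())
{
    Regex rx = new Regex(r.Expression);
    MatchCollection mc = rx.Matches(str);
    // Requires "using System.Linq;". OfType<Match>() exposes the collection to LINQ.
    foreach (string value in mc.OfType<Match>().Select(m => m.Groups[1].Value).Distinct())
    {
        MessageBox.Show("replacing");
        str = str.Replace(value, r.OpenBBOne + value + r.CloseBBOne);
    }
}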

If you can't use LINQ for some reason, you can do basically the same thing yourself by building a List<string> of the captured values and only adding the ones that aren't in the list yet.

foreach (RegexRule r in RegexRules.ToList())
{
    Regex rx = new Regex(r.Expression);
    MatchCollection mc = rx.Matches(str);

    // Collect each distinct group-1 value once. str.Replace below wraps every
    // occurrence, so repeating a value would nest the prefix and suffix.
    List<string> values = new List<string>();
    foreach (Match m in mc)
    {
        if (!values.Contains(m.Groups[1].Value))
        {
            values.Add(m.Groups[1].Value);
        }
    }

    foreach (string value in values)
    {
        MessageBox.Show("replacing");
        str = str.Replace(value, r.OpenBBOne + value + r.CloseBBOne);
    }
}
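
A different technique, not the one used above, is to let Regex.Replace rebuild each match through a MatchEvaluator, so that only the text inside a match is touched and nothing is wrapped twice within one rule. The sketch below is a rough illustration under assumptions: the RegexRule class is a stand-in built from the property names in the question, RuleApplier is a made-up name, and the code assumes group 1 ends before group 2 starts, as in the sample expression.

```csharp
using System;
using System.Text.RegularExpressions;

// Stand-in for the asker's class, using the property names from the question.
public class RegexRule
{
    public string Expression { get; set; }
    public string FirstOpen { get; set; }
    public string FirstClose { get; set; }
    public string SecondOpen { get; set; }
    public string SecondClose { get; set; }
}

public static class RuleApplier
{
    // Wraps group 1 and group 2 of every match, leaving the rest of the match intact.
    // Assumption: group 1 ends before group 2 begins (no nesting or reordering).
    public static string Apply(string input, RegexRule rule)
    {
        return Regex.Replace(input, rule.Expression, m =>
        {
            Group g1 = m.Groups[1];
            Group g2 = m.Groups[2];
            string text = m.Value;

            // Group positions are absolute, so shift them to be relative to the match.
            int g1Start = g1.Index - m.Index;
            int g1End = g1Start + g1.Length;
            int g2Start = g2.Index - m.Index;
            int g2End = g2Start + g2.Length;

            return text.Substring(0, g1Start)
                 + rule.FirstOpen + g1.Value + rule.FirstClose
                 + text.Substring(g1End, g2Start - g1End)
                 + rule.SecondOpen + g2.Value + rule.SecondClose
                 + text.Substring(g2End);
        });
    }

    public static void Main()
    {
        var rule = new RegexRule
        {
            Expression = @"(\w+).(\w+)\(\);",
            FirstOpen = "[", FirstClose = "]",
            SecondOpen = "{", SecondClose = "}"
        };
        Console.WriteLine(Apply("Lorem ipsum foo.bar(); dolor sit bar.foo(); amit", rule));
        // prints: Lorem ipsum [foo].{bar}(); dolor sit [bar].{foo}(); amit
    }
}
```

With several rules, Apply would be called once per rule on the running result. Later rules then see the prefixes and suffixes inserted by earlier ones, so the order of the rules can matter.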

2 Comments

This was not only difficult to explain, but I genuinely could not get past it, and you presented the only solution that I have, without exception. And it worked. Thank you very much sir!! But: mc.Distinct(); won't work because it's a MatchCollection, thank you for the alternative :)
Ah, yeah, you're right. I forgot you have to define the type so you can get the enumerator. I've edited my answer so it should work for you with LINQ.
