2

I am very new to regex and I am not sure what is going on with this one. A friend gave me this pattern to solve my issue, but somehow it is not working.

string: department_name:womens AND item_type_keyword:base-layer-underwear

reg-ex: (department_name:([\\w-]+))?(item_type_keyword:([\\w-]+))?

desired output: array OR group

  • 1st element should be: department_name:womens
  • 2nd should be: womens
  • 3rd: item_type_keyword:base-layer-underwear
  • 4th: base-layer-underwear

    Strings can contain department_name or item_type_keyword; neither is mandatory, and they can appear in any order.

C# Code

Regex regex = new Regex(@"(department_name:([\w-]+))?(item_type_keyword:([\w-]+))?");
Match match = regex.Match(query);
if (match.Success)
    if (!String.IsNullOrEmpty(match.Groups[4].ToString()))
        d1.ItemType = match.Groups[4].ToString();

This C# code only returns a string array with 3 elements:

1: department_name:womens
2: department_name:womens
3: womens

Somehow it is duplicating the 1st and 2nd elements, and I don't know why. It also does not return the other elements that I expect.

Can someone help me, please?

When I test the regex online, it looks fine to me:

http://fiddle.re/crvw1

Thanks

  • Because your regex doesn't match the ` AND ` in between. Commented Sep 26, 2014 at 9:53

5 Answers

3

You can use something like this to get the output you have in your question:

string txt = "department_name:womens AND item_type_keyword:base-layer-underwear";
var reg = new Regex(@"(?:department_name|item_type_keyword):([\w-]+)", RegexOptions.IgnoreCase);
var ms = reg.Matches(txt);
ArrayList results = new ArrayList();
foreach (Match match in ms)
{
    results.Add(match.Groups[0].Value);
    results.Add(match.Groups[1].Value);
}

// results is your final array containing all results
foreach (string elem in results)
{
    Console.WriteLine(elem);
}

Prints:

department_name:womens
womens
item_type_keyword:base-layer-underwear
base-layer-underwear

match.Groups[0].Value gives the part that matched the pattern, while match.Groups[1].Value will give the part captured in the pattern.

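In your original expression, the overall match (Groups[0]) and the first capture group (Groups[1]) both cover the same text, which is why department_name:womens appears twice. The item_type_keyword part is never reached because nothing in the pattern consumes the ` AND ` in between.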
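To see where the duplicate comes from, here is a minimal sketch that dumps every group of the original pattern for the string in the question. The `GroupDump` class and `Describe` method are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class GroupDump
{
    // Hypothetical helper: one line per group with its index,
    // whether it took part in the match, and the text it captured.
    public static string[] Describe(string pattern, string input)
    {
        Match match = Regex.Match(input, pattern);
        var lines = new string[match.Groups.Count];
        for (int i = 0; i < match.Groups.Count; i++)
        {
            Group g = match.Groups[i];
            lines[i] = $"{i}: success={g.Success} value='{g.Value}'";
        }
        return lines;
    }

    static void Main()
    {
        string query = "department_name:womens AND item_type_keyword:base-layer-underwear";
        string pattern = @"(department_name:([\w-]+))?(item_type_keyword:([\w-]+))?";
        foreach (string line in Describe(pattern, query))
            Console.WriteLine(line);
        // 0: success=True value='department_name:womens'
        // 1: success=True value='department_name:womens'
        // 2: success=True value='womens'
        // 3: success=False value=''
        // 4: success=False value=''
    }
}
```

Group 0 is always the whole match, so it repeats group 1 here; groups 3 and 4 exist but never participate, because the match stops at the space before `AND`.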

Once you get the different elements, you should be able to put them in an array/list for further processing. (Added this part in edit)

The loop lets you iterate over every match, which you cannot really do with if and .Match() (that is better suited to a single match). Using .Matches() here means that neither the order of the matches nor their number matters.

ideone demo


(?:
  department_name     # Match department_name
|                     # Or
  item_type_keyword   # Match item_type_keyword
)
:
([\w-]+)              # Capture \w and - characters

1 Comment

@patel.milanb Ok, I put them all in an ArrayList for you. See the edit. I also used the System.Collections to be able to use an ArrayList by the way.
2

It's better to use the alternation (or logical OR) operator | because we don't know the order of the input string.

(department_name:([\w-]+))|(item_type_keyword:([\w-]+))

DEMO

String input = @"department_name:womens AND item_type_keyword:base-layer-underwear";
Regex rgx = new Regex(@"(?:(department_name:([\w-]+))|(item_type_keyword:([\w-]+)))");
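foreach (Match m in rgx.Matches(input))
{
    // Each match fills either groups 1-2 or groups 3-4;
    // the other pair is empty and prints blank lines.
    Console.WriteLine(m.Groups[1].Value);
    Console.WriteLine(m.Groups[2].Value);
    Console.WriteLine(m.Groups[3].Value);
    Console.WriteLine(m.Groups[4].Value);
}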

IDEONE
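Since the order of the keys is unknown, one possible way to consume the matches is to collect them by key instead of by group number. The following is only a sketch: it swaps the numbered groups above for named groups (`key` and `value`), and the `QueryParser` class and `Parse` method are hypothetical names, not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class QueryParser
{
    // Hypothetical helper: maps each recognised key to its value,
    // regardless of the order in which the keys appear.
    public static Dictionary<string, string> Parse(string query)
    {
        var rgx = new Regex(@"(?<key>department_name|item_type_keyword):(?<value>[\w-]+)");
        var fields = new Dictionary<string, string>();
        foreach (Match m in rgx.Matches(query))
        {
            // A later occurrence of the same key overwrites an earlier one.
            fields[m.Groups["key"].Value] = m.Groups["value"].Value;
        }
        return fields;
    }

    static void Main()
    {
        var fields = Parse("item_type_keyword:base-layer-underwear AND department_name:womens");
        if (fields.TryGetValue("item_type_keyword", out string itemType))
            Console.WriteLine(itemType);   // base-layer-underwear
        if (fields.TryGetValue("department_name", out string department))
            Console.WriteLine(department); // womens
    }
}
```

A missing key simply does not appear in the dictionary, so `TryGetValue` can stand in for the `String.IsNullOrEmpty` check from the question.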


2

Another idea using a lookahead for capturing and getting all groups in one match:

^(?!$)(?=.*(department_name:([\w-]+))|)(?=.*(item_type_keyword:([\w-]+))|)

as a .NET String

"^(?!$)(?=.*(department_name:([\\w-]+))|)(?=.*(item_type_keyword:([\\w-]+))|)"

test at regexplanet (click on .NET); test at regex101.com

(For multiline input, add the m multiline modifier before the anchor: "(?m)^...")
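As a rough sketch of how this pattern could be used from C#, the snippet below runs it against a few inputs and reads groups 1 to 4 from the single match. The `LookaheadDemo` class, the `Capture` method and the sample inputs other than the question's string are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class LookaheadDemo
{
    static readonly Regex Rgx = new Regex(
        @"^(?!$)(?=.*(department_name:([\w-]+))|)(?=.*(item_type_keyword:([\w-]+))|)");

    // Hypothetical helper: returns groups 1-4 of the single match.
    // Groups that did not participate come back as empty strings.
    public static string[] Capture(string input)
    {
        Match m = Rgx.Match(input);
        var result = new string[4];
        for (int i = 0; i < 4; i++)
            result[i] = m.Groups[i + 1].Value;
        return result;
    }

    static void Main()
    {
        string[] inputs =
        {
            "department_name:womens AND item_type_keyword:base-layer-underwear",
            "item_type_keyword:base-layer-underwear AND department_name:womens",
            "item_type_keyword:base-layer-underwear",
        };
        foreach (string input in inputs)
            Console.WriteLine(string.Join(" | ", Capture(input)));
    }
}
```

The first two inputs give the same four values because each lookahead scans the whole string from the start; in the third, groups 1 and 2 are empty. Note that the match itself is zero-length, so only the groups carry information.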

3 Comments

This is comprehensive, though I didn't get what exactly | is doing there.
@vks the |) is just a kind of cheat to make the match optional.
Awesome. Nice hack to get all the matches in one. Learnt a new one, thanks.
1

If you are splitting on AND, OR, etc., then you can use:

(department_name:(.*?)) AND (item_type_keyword:(.*?)$)

  • 1: department_name:womens
  • 2: womens
  • 3: item_type_keyword:base-layer-underwear
  • 4: base-layer-underwear

6 Comments

Maybe better to use [\w-]+ instead of the lazy dot .*? after item_type_keyword.
Why does it fail here: regex101.com/r/lS5tT3/51? Read the question; the strings won't be in a fixed order.
You're right @AvinashRaj. Well, I already upvoted your solution and Jerry's :)
In case a special character is present in department_name or item_type_keyword, [\w-] will not match it.
@menaka Yes, but otherwise the lazy dot might match nothing (see regex101 for what I mean), and the OP used [\w-]+, so that seems not to be the problem imho.
0
(?=(department_name:\w+)).*?:([\w-]+)|(?=(item_type_keyword:.*)$).*?:([\w-]+)

Try this. It uses a lookahead to capture, then backtracks and captures again. See the demo.

http://regex101.com/r/lS5tT3/52

2 Comments

Why are you using lookaheads?
@Unihedron just to be failsafe.
