I have the string Sample Text <[email protected]> and the string [email protected], and I'm trying to match the preceding text ("Sample Text" in this example), if it exists, and the email without the "<" and ">" characters. There may be whitespace before and after. At first I used Regex.Split with the expression @"\s*(.*)<(.*@.*)>\s*", but it gave me 4 strings instead of 2: the 2 strings that I wanted were correct, but it also returned empty strings. Now I'm trying Regex.Matches with the expression @"\s*(.*)(?: <)?(.*@.*)(?:>)?\s*", and it finds 3 matches: 2 are again the correct ones and the other is the input string itself. It doesn't work for the second string at all. How do I fix this?
4 Answers
This could be done without regex. Take a look at the MailAddress class; it can be used to parse strings like the ones in your example:
var mailAddress = new MailAddress("Sample Text <[email protected]>");
Here the mailAddress.Address property will contain the value [email protected], and mailAddress.DisplayName will contain the value Sample Text.
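A minimal sketch of how both input shapes from the question might be handled with MailAddress. The helper name ParseWithMailAddress and the address user@example.com are placeholders invented for this example; the sketch assumes a recent C# version with top-level statements.

```csharp
using System;
using System.Net.Mail;

// Hypothetical helper for this sketch: returns the display name and address,
// or null when MailAddress rejects the input.
static (string Name, string Address)? ParseWithMailAddress(string input)
{
    try
    {
        // Trim first, since the question says whitespace may surround the text.
        var mailAddress = new MailAddress(input.Trim());
        return (mailAddress.DisplayName, mailAddress.Address);
    }
    catch (Exception ex) when (ex is FormatException || ex is ArgumentException)
    {
        // FormatException: not a recognizable address; ArgumentException: empty input.
        return null;
    }
}

// "user@example.com" is a placeholder address, not taken from the question.
foreach (string input in new[] { "Sample Text <user@example.com>", "  user@example.com  " })
{
    var parsed = ParseWithMailAddress(input);
    Console.WriteLine(parsed is null
        ? $"Could not parse '{input}'"
        : $"Name='{parsed.Value.Name}', Address='{parsed.Value.Address}'");
}
```

When the input has no display name, DisplayName comes back as an empty string, so the same code path covers the bare-address case.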
1 Comment
Remarks section on the page he linked. It has a slightly complex implementation, but I am guessing that "Display Name <[email protected]>" is a semi-standard format. msdn.microsoft.com/en-us/library/…
Based on your test cases, this regex may work:
(.*)\s?\<(.*)\>
This will give you two results: (1) the preceding text and (2) the text contained within the <> brackets.
If you care about ensuring the email is valid, you may wish to look at a more thorough email regex, but I am guessing you are trying to match a string that has come from an email or mail server, so that may not be a problem.
Also, it's worth grabbing a regex building program such as Expresso, or using one of the many online tools, to help build your regex.
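As a rough illustration, the bracket pattern from this answer could be applied with Regex.Match as below. The address user@example.com is a placeholder chosen for the sketch. The Trim calls are an addition: because the first (.*) is greedy, it can keep the space before the "<".

```csharp
using System;
using System.Text.RegularExpressions;

// The pattern from the answer, unchanged.
var bracketPattern = new Regex(@"(.*)\s?\<(.*)\>");

// "user@example.com" is a placeholder address chosen for this sketch.
Match bracketMatch = bracketPattern.Match("Sample Text <user@example.com>");
if (bracketMatch.Success)
{
    // Groups[0] is the whole match; the two captures start at index 1.
    string name = bracketMatch.Groups[1].Value.Trim();  // greedy (.*) may keep a trailing space
    string email = bracketMatch.Groups[2].Value.Trim();
    Console.WriteLine($"Name='{name}', Email='{email}'");
}

// A bare address contains no angle brackets, so this pattern does not match it.
Match bareMatch = bracketPattern.Match("user@example.com");
Console.WriteLine($"Bare address matched: {bareMatch.Success}");
```

The second check shows why this pattern alone does not cover the question's second string; that case needs a separate fallback.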
4 Comments
Each match returned by Regex.Matches always holds the full match as its first group (index 0), so just ignore it and use the second and third.
To match the second type of string (only an email), you had better try to match the first type and, if it is not found, match the second using a single email regex.
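A sketch of that two-step approach, under stated assumptions: both patterns and the helper name ParseContact are illustrative and not from the thread, they are deliberately loose and do not validate addresses, and user@example.com is a placeholder.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: tries "Name <email>" first, then falls back to a bare email.
static (string Name, string Email)? ParseContact(string input)
{
    // Assumed pattern for the first type: optional text, then an address in angle brackets.
    Match bracketed = Regex.Match(input, @"^\s*(.*?)\s*<([^<>\s]+@[^<>\s]+)>\s*$");
    if (bracketed.Success)
    {
        // Groups[0] is the full match, so the useful captures are groups 1 and 2.
        return (bracketed.Groups[1].Value, bracketed.Groups[2].Value);
    }

    // Assumed pattern for the second type: a single whitespace-free token containing '@'.
    Match bare = Regex.Match(input, @"^\s*(\S+@\S+)\s*$");
    if (bare.Success)
    {
        return ("", bare.Groups[1].Value);
    }

    return null; // neither shape was found
}

foreach (string input in new[] { "Sample Text <user@example.com>", "  user@example.com  " })
{
    var contact = ParseContact(input);
    Console.WriteLine(contact is null
        ? $"No match for '{input}'"
        : $"Name='{contact.Value.Name}', Email='{contact.Value.Email}'");
}
```

Using Regex.Match rather than Regex.Matches keeps the result to a single Match whose Groups collection can be indexed directly.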
Comments
Try this one:
\s*(.*?)(?: <)?(\S*@.*)(?:>)?\s*
I changed yours only a bit:
I added ? to the first group to make it a lazy match.
I changed the part before the @ into \S, which means anything but whitespace.
You can see it online here on Rubular
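For a quick check in .NET, the pattern could be run as in the sketch below. The helper name ExtractParts and the address user@example.com are placeholders for this example. One caveat that follows from the pattern itself: the second group ends in a greedy .*, so it may also capture a trailing ">" or trailing whitespace, which this sketch strips afterwards.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper that applies the pattern from this answer and tidies the captures.
static (string Name, string Email) ExtractParts(string input)
{
    Match m = Regex.Match(input, @"\s*(.*?)(?: <)?(\S*@.*)(?:>)?\s*");
    string name = m.Groups[1].Value.Trim();
    // The greedy .* in group 2 can swallow a closing '>' and trailing spaces.
    string email = m.Groups[2].Value.Trim().TrimEnd('>');
    return (name, email);
}

// "user@example.com" is a placeholder address chosen for this sketch.
foreach (string input in new[] { "Sample Text <user@example.com>", "user@example.com " })
{
    var parts = ExtractParts(input);
    Console.WriteLine($"Name='{parts.Name}', Email='{parts.Email}'");
}
```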
MailAddress class (if, for example, you need to do a search to find the e-mail addresses), you could write your regex very loosely and parse/clean up the strings after the fact (using MailAddress and/or calls to string.Split and string.Trim). Trying to make the regex both search for/validate the proper format as well as clean up the strings might make your regex more complicated than it needs to be.
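One way the "loose regex, clean up afterwards" idea might look in practice. The helper name FindAddresses, the loose pattern \S+@\S+, the set of trimmed characters, and the sample addresses are all assumptions made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Mail;
using System.Text.RegularExpressions;

// Hypothetical helper: grab anything that looks roughly like an address,
// then let string.Trim and MailAddress do the cleanup and checking.
static List<string> FindAddresses(string text)
{
    var found = new List<string>();
    foreach (Match candidate in Regex.Matches(text, @"\S+@\S+"))
    {
        // Strip brackets and punctuation that the loose pattern picked up.
        string cleaned = candidate.Value.Trim('<', '>', ',', ';', '.', '(', ')', '"');
        try
        {
            found.Add(new MailAddress(cleaned).Address);
        }
        catch (FormatException)
        {
            // Not a usable address after cleanup; skip it.
        }
    }
    return found;
}

// The addresses below are placeholders chosen for this sketch.
string sample = "Contacts: Sample Text <user@example.com>, other@example.org.";
foreach (string address in FindAddresses(sample))
{
    Console.WriteLine(address);
}
```

The regex here only locates candidates; deciding whether a candidate is acceptable is left to MailAddress, which keeps the pattern short.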