Simple regex not working in C#

Question

Why is

/news/article-title.html

not being caught by the regex

^/news/[^(archives)].+.html

?

I'm trying to match articles that do NOT have "archives" in the filename, but do start with "/news/".

Thanks!

[] defines one character from a character class. [^(archives)] translates to "one character that is not one of these: archives()".
– Tim, Commented Mar 11, 2011 at 16:39

Kobi · Accepted Answer · 2011-03-11 16:47:10Z

6

You should use a negative lookahead. Character classes only work for a single character. Also, don't forget to escape the dot.

If "archives" cannot be at the beginning:

^/news/(?!archives).+\.html

If "archives" cannot appear anywhere:

^/news/((?!archives).)+\.html
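
To see how the two patterns differ, here is a minimal sketch using System.Text.RegularExpressions. The class name NewsPatterns and the sample paths other than the one from the question are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original question.
public static class NewsPatterns
{
    // "archives" may not come directly after /news/.
    public static readonly Regex NotAtStart =
        new Regex(@"^/news/(?!archives).+\.html");

    // "archives" may not start at any position between /news/ and .html.
    public static readonly Regex NotAnywhere =
        new Regex(@"^/news/((?!archives).)+\.html");

    public static void Main()
    {
        string[] paths =
        {
            "/news/article-title.html",  // both patterns match
            "/news/archives-2010.html",  // neither pattern matches
            "/news/old-archives.html",   // only NotAtStart matches
        };

        foreach (string path in paths)
        {
            Console.WriteLine(
                $"{path}: start={NotAtStart.IsMatch(path)}, anywhere={NotAnywhere.IsMatch(path)}");
        }
    }
}
```

The second pattern repeats the lookahead before every character it consumes, which is why it also rejects "archives" in the middle of the filename.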

More tips:

Disallow "archives" only as a whole word: (?!archives\b).+ or (?!archives-).+
Make sure \.html is at the end (it may appear more than once): \.html(?=$|[?&])
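
A rough sketch of how these two tips behave when combined with the /news/ prefix. The class name NewsTips and all sample paths are assumptions for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of the original answer.
public static class NewsTips
{
    // Reject "archives" only when it is a whole word right after /news/.
    public static readonly Regex WholeWord =
        new Regex(@"^/news/(?!archives\b).+\.html");

    // Require .html to be followed by end of input, '?' or '&'.
    public static readonly Regex HtmlAtEnd =
        new Regex(@"^/news/(?!archives).+\.html(?=$|[?&])");

    public static void Main()
    {
        Console.WriteLine(WholeWord.IsMatch("/news/archives.html"));      // False
        Console.WriteLine(WholeWord.IsMatch("/news/archivesque.html"));   // True
        Console.WriteLine(HtmlAtEnd.IsMatch("/news/story.html"));         // True
        Console.WriteLine(HtmlAtEnd.IsMatch("/news/story.html?page=2"));  // True
        Console.WriteLine(HtmlAtEnd.IsMatch("/news/story.html.bak"));     // False
    }
}
```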

answered Mar 11, 2011 at 16:37

Kobi


1 Comment

Why be so specific about unspecified forms?

unholysampler · Accepted Answer · 2011-03-11 16:39:30Z

1

You can't use a negated character class to exclude an entire string.

[^(archives)]

This is interpreted as a character that is not one of the following: (, a, r, c, h, i, v, e, s or ).
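
A small sketch that shows the effect with the pattern from the question. The class name CharClassDemo and the second and third sample paths are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, not part of the original question.
public static class CharClassDemo
{
    // The pattern from the question, unchanged.
    public static readonly Regex Original =
        new Regex(@"^/news/[^(archives)].+.html");

    public static void Main()
    {
        // 'a' (the first letter of "article") is in the excluded set, so no match.
        Console.WriteLine(Original.IsMatch("/news/article-title.html")); // False

        // 't' is not in the excluded set, so this matches.
        Console.WriteLine(Original.IsMatch("/news/title.html"));         // True

        // 'o' is not in the excluded set, so this matches even though
        // the filename contains "archives".
        Console.WriteLine(Original.IsMatch("/news/old-archives.html"));  // True
    }
}
```

So the class only tests the single character after /news/, which explains both the missed article and the archives pages that slip through.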

answered Mar 11, 2011 at 16:39

unholysampler
