Regex where substring isn't found in pattern in specific location

Question

I am trying to build a (multiline) pattern for a linter, to catch cases where:

There is a Text( declaration
That is not followed by .appFont( declaration...
...before the next occurrence of } (end of function) or Text( (another Text declaration...)

After many hours on regex101 (and consulting gpt...) I got these 2:

Text\([\s\S]*?\)[\s\S]*?(?!\.appFont)

This just catches the part that's before the .appFont, but I want the entire catch to fail if .appFont is found...

Text\([\s\S]*?\)[\s]*?(?!appFont)[\s\S]\}

This just catches everything, ignoring appFont being in the string entirely...

In the following example, only the 2nd case should be captured:

Text("blah") 
  .appFont(.body)
}

Text("blah") 
}

Text(
  "blah"
)
.appFont(.blah)
}

I tried to read about negative lookahead, but I think I am still using it wrong, or maybe adding [\s\S] somehow causes it to be ignored?
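
For anyone who wants to reproduce the first attempt outside regex101, here is a minimal sketch. It assumes Python's `re` module, which may behave differently from the linter's regex engine, and the names `ATTEMPT` and `snippet` are only illustrative.

```python
import re

# First attempt from the question, unchanged.
ATTEMPT = re.compile(r"Text\([\s\S]*?\)[\s\S]*?(?!\.appFont)")

# First example from the question (the one that should NOT be caught).
snippet = 'Text("blah")\n  .appFont(.body)\n}'
match = ATTEMPT.search(snippet)

# The lazy [\s\S]*? after ")" is allowed to match nothing, so the lookahead
# is only tested at the position right after ")". The next character there
# is a newline, not ".appFont", so the assertion passes and the match ends.
print(repr(match.group(0)))  # 'Text("blah")'
```

So the lookahead is evaluated, but only at one position, where `.appFont` does not start yet.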

A lot depends on the tool. A lot of tools evaluate regexes one line at a time, so there's no way to write a regex that can check the next line. (Consider that a regex would end up having to check the entire file and handle deeply nested functions, etc.)
– JDB, Commented Apr 17, 2023 at 13:30
Have you tried working with negation? Something like this demo. Depends on exact requirements.
– bobble bubble, Commented Apr 17, 2023 at 13:32
It is also worth trying this variant if the .appFont can occur later on but before the next }.
– bobble bubble, Commented Apr 17, 2023 at 13:45
I didn't know about negation being a thing! Your last variant looks like it should do it, but it mistakenly captures when Text and appFont are on the same line, like so: Text("blah").appFont(.foo). I can't tell why that happens from the regex you wrote?
– Aviel Gross, Commented Apr 17, 2023 at 14:02
@AvielGross It was my slip to put the lookahead behind [^}]; it should be before it, like in this update.
– bobble bubble, Commented Apr 17, 2023 at 14:18

bobble bubble · Accepted Answer · 2023-04-17 16:45:35Z

Using a negated character class together with a negative lookahead.

Text\([^)]*\)(?:(?!\.appFont)[^}])*}

regex	explanation
`Text`	match the substring
`\([^)]*\)`	match `(` followed by any amount of non-`)` characters (negated class) up to the next closing `)`
`(?:(?!\.appFont)[^}])*}`	`(?:` non-capturing group `)` repeated `*` any number of times, containing `(?!\.appFont)`, a neg. lookahead that checks before each non-`}` character that the substring `\.appFont` is not ahead; on success each such character is consumed, up to the closing `}`
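
As a quick check of the pattern above, here is a minimal sketch that runs it against the three snippets from the question. It assumes Python's `re` module (the linter's engine may differ), and the names `TEMPERED`, `SNIPPETS` and `flagged` are only illustrative.

```python
import re

# First pattern from the answer: the lookahead is re-checked
# before every non-"}" character that gets consumed.
TEMPERED = re.compile(r"Text\([^)]*\)(?:(?!\.appFont)[^}])*}")

# The three examples from the question, in order.
SNIPPETS = [
    'Text("blah")\n  .appFont(.body)\n}',
    'Text("blah")\n}',
    'Text(\n  "blah"\n)\n.appFont(.blah)\n}',
]

# Collect the 1-based index of every snippet the pattern flags.
flagged = [i for i, s in enumerate(SNIPPETS, start=1) if TEMPERED.search(s)]
print(flagged)  # [2]
```

Only the second snippet is flagged. In the other two, the repeated group stops at `.appFont` and the closing `}` can no longer be reached.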

Or alternatively use the lookahead assertion just once after closing ).

Text\([^)]*\)(?![^}]*?\.appFont)[^}]*}

Another demo at regex101 - Might even be a bit more efficient here.

regex	explanation
`Text`	match the substring
`\([^)]*\)`	match `(` followed by any amount of non-`)` up to the next closing `)`
`(?![^}]*?\.appFont)`	neg. lookahead (condition): look if `[^}]*?\.appFont` is not ahead, where `[^}]*?` lazily matches any amount of non-`}` up to the substring `\.appFont`
`[^}]*}`	if the condition succeeded (it's not ahead) consume any amount of non-`}` up to `}`
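
The second pattern can be tried the same way. The sketch below assumes Python's `re` module rather than the linter's engine, and `SINGLE_CHECK`, `SOURCE` and `flagged_lines` are hypothetical names. It joins the question's examples into one string, adds the same-line case from the comments, and reports the line on which each match starts.

```python
import re

# Second pattern from the answer: one lookahead right after the closing ")".
SINGLE_CHECK = re.compile(r"Text\([^)]*\)(?![^}]*?\.appFont)[^}]*}")

SOURCE = "\n".join([
    'Text("blah")',
    '  .appFont(.body)',
    '}',
    '',
    'Text("blah")',
    '}',
    '',
    'Text(',
    '  "blah"',
    ')',
    '.appFont(.blah)',
    '}',
    '',
    'Text("blah").appFont(.foo)',
    '}',
])


def flagged_lines(source):
    """Return the 1-based line number on which each match starts."""
    return [
        source.count("\n", 0, m.start()) + 1
        for m in SINGLE_CHECK.finditer(source)
    ]


print(flagged_lines(SOURCE))  # [5]
```

Only the `Text(` on line 5 is reported. The same-line `Text("blah").appFont(.foo)` is skipped because the lookahead sees `.appFont` immediately after `)`.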

Amazing! Thank you! So why does the 2nd option have the negated character class [^}] twice?
@AvielGross Welcome! A lookahead is an assertion triggered at certain positions. In the second solution the lookahead check (?![^}]*?\.appFont) is done once, after the closing parenthesis ). If the condition succeeds, the matching [^}]*} proceeds. In the first solution the lookahead is contained in a repetition and fired at each position (which can be costly, depending on input).
ok that makes sense now thank you so much for the help and extra details!! I would vote twice if I could :)
