80

In the regex below, \s denotes a whitespace character. I imagine the regex parser going through the string, seeing \, and knowing that the next character is special.

But this is not the case as double escapes are required.

Why is this?

var res = new RegExp('(\\s|^)' + foo).test(moo);

Is there a concrete example of how a single escape could be mis-interpreted as something else?

3
  • 1
    Remember, it's not that JavaScript or the RegExp constructor needs clarification; it's the compiler (or parser). Commented Jul 25, 2013 at 16:04
  • 5
    To add to the already-correct answers: note that if you write a RegExp literal in JavaScript, you don't need to escape the backslash, as you would expect: /(\s|^)/ Commented Jul 25, 2013 at 16:05
  • Related: stackoverflow.com/a/37329801/1225328. Commented Jul 30, 2018 at 9:00

5 Answers

68

You are constructing the regular expression by passing a string to the RegExp constructor.

\ is an escape character in string literals.

The \ is consumed by the string literal parsing…

const foo = "foo";
// \s is not a recognized string escape sequence, so the backslash is simply dropped.
const string = '(\s|^)' + foo;
console.log(string); // (s|^)foo

… so the data you pass to the RegExp compiler is a plain s and not \s.

You need to escape the \ (that is, write \\) so that it is passed along as data instead of acting as an escape character itself.
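To make this concrete for the code in the question, here is a minimal sketch. The values of foo and moo are hypothetical, since the question does not say what they hold.

```javascript
// Hypothetical values; the question does not give foo and moo.
const foo = 'bar';
const moo = 'a bar';

// Single backslash: the string literal drops it, so the pattern is (s|^)bar.
const broken = new RegExp('(\s|^)' + foo);
// Double backslash: the string holds (\s|^)bar, which is what the regex engine needs.
const fixed = new RegExp('(\\s|^)' + foo);

console.log(broken.source);       // (s|^)bar
console.log(fixed.source);        // (\s|^)bar
console.log(broken.test(moo));    // false: "bar" follows a space, not an "s"
console.log(fixed.test(moo));     // true
console.log(broken.test('sbar')); // true: the broken pattern matches a literal "s"
```

The last line is the kind of silent mis-match the question asks about: no error is thrown, the pattern just means something else.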


1 Comment

That applies to both regular string literals and template string literals.
24

Inside the code where you're creating a string, the backslash is a JavaScript escape character first, which means escape sequences like \t, \n, \", etc. will be translated into their JavaScript counterparts (tab, newline, quote, etc.), and those will be made part of the string. A double backslash represents a single backslash in the actual string itself, so if you want a backslash in the string, you escape it first.

So when you generate a string by saying var someString = '(\\s|^)', what you're really doing is creating an actual string with the value (\s|^).
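A quick way to check this is to compare the two strings directly. This is only an illustrative sketch; the variable names are made up for the example.

```javascript
const singleEscape = '(\s|^)';  // backslash consumed by the string literal
const doubleEscape = '(\\s|^)'; // escaped backslash survives as one character

console.log(singleEscape, singleEscape.length); // (s|^) 5
console.log(doubleEscape, doubleEscape.length); // (\s|^) 6

// The doubled form produces the same pattern as the regex literal.
console.log(new RegExp(doubleEscape).source === /(\s|^)/.source); // true
```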


11

The RegExp constructor needs a string containing \s, which in JavaScript can be produced using the literal "\\s".

Here's a live example to illustrate why "\s" is not enough:

alert("One backslash:          \s\nDouble backslashes: \\s");

Note how an extra \ before \s changes the output.
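With \s the backslash is merely dropped. A sketch of a case where a single escape turns into something else entirely is \b, which is the backspace character in a string literal but a word boundary in a regex. The sample text here is just an assumption for the demo.

```javascript
const text = 'foo';

// In a string literal, \b is the backspace character (U+0008).
console.log('\b'.charCodeAt(0)); // 8

// So this pattern looks for backspace, "foo", backspace.
const wrong = new RegExp('\bfoo\b');
// Escaping the backslash passes \b through to the regex as a word boundary.
const right = new RegExp('\\bfoo\\b');

console.log(wrong.test(text));       // false
console.log(right.test(text));       // true
console.log(wrong.test('\bfoo\b'));  // true: it only matches actual backspaces
```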


9

As has been said, inside a string literal a backslash starts an escape sequence rather than standing for a literal backslash character. The RegExp constructor, however, often needs literal backslash characters in the string passed to it, so in most cases the code should use \\s to put a literal backslash in front of the s.

A problem is that double-escaping metacharacters is tedious. There is one way to pass a string to new RegExp without having to double escape them: use the String.raw template tag, an ES6 feature, which allows you to write a string that will be parsed by the interpreter verbatim, without any parsing of escape sequences. For example:

console.log('\\'.length);           // length 1: an escaped backslash
console.log(`\\`.length);           // length 1: an escaped backslash
console.log(String.raw`\\`.length); // length 2: no escaping in String.raw!

So, if you wish to keep your code readable, and you have many backslashes, you may use String.raw to type only one backslash, when the pattern requires a backslash:

const sentence = 'foo bar baz';
const regex = new RegExp(String.raw`\bfoo\sbar\sbaz\b`);
console.log(regex.test(sentence));

But there's a better option. Generally, there's not much good reason to use new RegExp unless you need to dynamically create a regular expression from existing variables. Otherwise, you should use regex literals instead, which do not require double-escaping of metacharacters, and do not require writing out String.raw to keep the pattern readable:

const sentence = 'foo bar baz';
const regex = /\bfoo\sbar\sbaz\b/;
console.log(regex.test(sentence));

Best to only use new RegExp when the pattern must be created on-the-fly, like in the following snippet:

const sentence = 'foo bar baz';
const wordToFind = 'foo'; // from user input

const regex = new RegExp(String.raw`\b${wordToFind}\b`);
console.log(regex.test(sentence));
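One caveat when the pattern is built from user input: String.raw only deals with backslashes in the source code, not with regex metacharacters inside the interpolated value. A minimal sketch of one way to handle that follows; escapeRegExp is a hypothetical helper written for this example, not a built-in, and the inputs are made up.

```javascript
// Hypothetical helper: backslash-escape characters that are special in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const wordToFind = 'a.c'; // pretend this came from user input

// Unescaped, the dot matches any character.
const loose = new RegExp(String.raw`\b${wordToFind}\b`);
// Escaped, the dot only matches a literal dot.
const strict = new RegExp(String.raw`\b${escapeRegExp(wordToFind)}\b`);

console.log(loose.test('abc'));  // true
console.log(strict.test('abc')); // false
console.log(strict.test('a.c')); // true
```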

3 Comments

If there is a / in your regular expression, you should also use new RegExp, not just in the on-the-fly case.
Why, there's no need? Just like other special characters, just escape it with a backslash. /foo\/bar/ will match foo/bar
As you wrote earlier, double escaping metacharacters is tedious. In this case, new RegExp is the only way if you want your code to work with any regular expression without the need for additional escaping.
7

\ is used in strings to escape special characters. If you want a backslash in your string (e.g. for the \ in \s), you have to escape it with another backslash. So \ becomes \\.

EDIT: Even had to do it here, because \\ in my answer turned to \.

