Regex to match string containing two names in any order

Question

I need a logical AND in regex.

Something like

jack AND james

should match the following strings:

'hi jack here is james'
'hi james here is jack'

@AndersonGreen, the question was prematurely locked. The answers are severely lacking, as those solutions are not viable: most regex flavors don't recognize lookaround and mode quantifier. I believe quantifiers existed at the point the question was asked.
– XPMai, Commented Jun 1, 2020 at 10:51

Géry Ogam · Accepted Answer · 2021-11-28 21:55:43Z

You can do checks using positive lookaheads. Here is a summary from the indispensable regular-expressions.info:

Lookahead and lookbehind, collectively called “lookaround”, are zero-length assertions...lookaround actually matches characters, but then gives up the match, returning only the result: match or no match. That is why they are called “assertions”. They do not consume characters in the string, but only assert whether a match is possible or not.

It then goes on to explain that positive lookaheads are used to assert that what follows matches a certain expression without taking up characters in that matching expression.

So here is an expression using two consecutive positive lookaheads to assert that the phrase matches jack and james in either order:

^(?=.*\bjack\b)(?=.*\bjames\b).*$

Test it.

The expressions in parentheses starting with ?= are the positive lookaheads. I'll break down the pattern:

^ asserts the start of the expression to be matched.
(?=.*\bjack\b) is the first positive lookahead saying that what follows must match .*\bjack\b.
.* means any character zero or more times.
\b means any word boundary, i.e. a position between a word character and a non-word character (next to white space, at the start of the expression, at the end of the expression, etc.).
jack is literally those four characters in a row (the same for james in the next positive lookahead).
$ asserts the end of the expression to be matched.

So the first lookahead says "what follows (and is not itself a lookahead or lookbehind) must be an expression that starts with zero or more of any characters followed by a word boundary and then jack and another word boundary," and the second lookahead says "what follows must be an expression that starts with zero or more of any characters followed by a word boundary and then james and another word boundary." After the two lookaheads is .* which simply matches any characters zero or more times and $ which matches the end of the expression.

"start with anything then jack or james then end with anything" satisfies the first lookahead because there are a number of characters then the word jack, and it satisfies the second lookahead because there are a number of characters (which just so happens to include jack, but that is not necessary to satisfy the second lookahead) then the word james. Neither lookahead asserts the end of the expression, so the .* that follows can go beyond what satisfies the lookaheads, such as "then end with anything".

I think you get the idea, but just to be absolutely clear, here it is with jack and james reversed, i.e. "start with anything then james or jack then end with anything"; it satisfies the first lookahead because there are a number of characters then the word james, and it satisfies the second lookahead because there are a number of characters (which just so happens to include james, but that is not necessary to satisfy the second lookahead) then the word jack. As before, neither lookahead asserts the end of the expression, so the .* that follows can go beyond what satisfies the lookaheads, such as "then end with anything".
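To make the walkthrough above concrete, here is a minimal JavaScript sketch that runs the two-lookahead pattern against a few strings. The sample strings beyond the two from the question are my own illustrations, not part of the original answer.

```javascript
// Two lookaheads anchored at the start; each one scans the string independently.
const bothNames = /^(?=.*\bjack\b)(?=.*\bjames\b).*$/;

const samples = [
  "hi jack here is james",    // both present, jack first
  "hi james here is jack",    // both present, james first
  "hi jack here is jim",      // james is missing
  "hi jackson here is james", // "jackson" is not the word "jack"
];

const results = samples.map((s) => bothNames.test(s));
console.log(results); // [ true, true, false, false ]
```

The order of the lookaheads in the pattern does not matter, because neither consumes any characters.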

This approach has the advantage that you can easily specify multiple conditions.

^(?=.*\bjack\b)(?=.*\bjames\b)(?=.*\bjason\b)(?=.*\bjules\b).*$
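Since each extra condition is just one more lookahead, the pattern can be generated from a list of words. The following is only a sketch in JavaScript; `escapeRegExp` and `allWordsPattern` are hypothetical helper names introduced for illustration.

```javascript
// Escape regex metacharacters so each word is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build ^(?=.*\bw1\b)(?=.*\bw2\b)....*$ from a list of words (hypothetical helper).
function allWordsPattern(words) {
  const lookaheads = words
    .map((w) => `(?=.*\\b${escapeRegExp(w)}\\b)`)
    .join("");
  return new RegExp(`^${lookaheads}.*$`);
}

const fourNames = allWordsPattern(["jack", "james", "jason", "jules"]);
console.log(fourNames.source);
console.log(fourNames.test("jules, jason, james and jack")); // true
console.log(fourNames.test("jules, jason and james only"));  // false
```

The pattern grows linearly with the number of words, one lookahead per word.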

vim syntax: ^\(.*\<jack\>\)\@=\(.*\<james\>\)\@=.*$ or \v^(.*<jack>)@=(.*<james>)@=.*$
Does anyone know why this would break (in JavaScript at least) when I try to search for strings starting with '#'? ^(?=.*\b#friday\b)(?=.*\b#tgif\b).*$ fails to match blah #tgif blah #friday blah but ^(?=.*\bfriday\b)(?=.*\btgif\b).*$ works fine.
This isn't working for me, as demoed here: regex101.com/r/xI9qT0/1
@TonyH, for JavaScript you can remove the last $ symbol from the pattern or remove the newline character from the test string; other languages (Python, PHP) on this website work perfectly. You can also remove .*$ from the end: the regexp will still match the test string, but without selecting the whole test string as the match.
Adding (?i) can also make it case insensitive. ^(?i)(?=.*\bjack\b)(?=.*\bjames\b).*$
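On the #friday question in the comments above: \b only matches between a word character (letter, digit or underscore) and a non-word character. Both the space and # are non-word characters, so there is no boundary in front of #tgif. The sketch below shows this in JavaScript, together with one possible workaround that uses whitespace lookarounds instead of \b. The workaround is my assumption, not something proposed in the thread, and it needs an engine with lookbehind support.

```javascript
const text = "blah #tgif blah #friday blah";

// \b needs a word character on exactly one side; " #" has none on either side.
const withWordBoundary = /^(?=.*\b#friday\b)(?=.*\b#tgif\b).*$/;

// Possible workaround: the tag must not touch a non-space character on either side.
const withSpaceBoundary =
  /^(?=.*(?<!\S)#friday(?!\S))(?=.*(?<!\S)#tgif(?!\S)).*$/;

console.log(withWordBoundary.test(text));  // false
console.log(withSpaceBoundary.test(text)); // true
```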

icyrock.com · Accepted Answer · 2010-12-08 16:16:13Z

181

Try:

james.*jack

If you want both at the same time, then or them:

james.*jack|jack.*james
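The alternation above spells out each ordering by hand. As a rough sketch of how that scales, the JavaScript below generates the alternation for any list of words; `permutations` and `anyOrderPattern` are hypothetical helpers, and the words are assumed to contain no regex metacharacters.

```javascript
// Return every ordering of the input array (n! orderings for n items).
function permutations(items) {
  if (items.length <= 1) return [items];
  return items.flatMap((item, i) => {
    const rest = [...items.slice(0, i), ...items.slice(i + 1)];
    return permutations(rest).map((perm) => [item, ...perm]);
  });
}

// Join each ordering with ".*" and the orderings with "|" (hypothetical helper).
function anyOrderPattern(words) {
  return new RegExp(permutations(words).map((p) => p.join(".*")).join("|"));
}

const two = anyOrderPattern(["james", "jack"]);
const three = anyOrderPattern(["jack", "james", "jason"]);

console.log(two.source);                             // james.*jack|jack.*james
console.log(three.source.split("|").length);         // 6
console.log(three.test("jason met james and jack")); // true
```

With three words there are already six alternatives, which is why this form is mostly practical for two.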

answered Dec 8, 2010 at 16:16

icyrock.com


8 Comments

Yogurt The Wise Over a year ago

The accepted answer worked. This also worked perfectly for me, for searching code in Visual Studio 'find results'.

Kumar Manish Over a year ago

This one works for me and is much more concise & easy to understand than the accepted answer!

WileCau Over a year ago

I needed a solution that only had two names to match, so this answer is more concise for that case. But the accepted answer becomes more concise beyond 2 since the number of "or"s increases factorially. For 3 names there would be 6 "or"s, 4 names would be 24 "or"s, etc.

Jekis Over a year ago

I would recommend making it lazy: james.*?jack|jack.*?james. This will help on large texts.

Gershom Maes Over a year ago

Note this will also match such names as "jacky" and "jameson"


Aryeh Beitz · Accepted Answer · 2018-12-27 12:20:11Z

63

Explanation of the command that I am going to write:

. means any single character (a digit, for example, can come in place of .).

* means zero or more occurrences of the thing written just before it.

| means 'or'.

So,

james.*jack

would search for james, then any number of characters until jack comes.

Since you want either jack.*james or james.*jack,

hence the command:

jack.*james|james.*jack

edited Dec 27, 2018 at 12:20

Aryeh Beitz


answered Jun 2, 2016 at 11:34

Shubham Sharma


4 Comments

WoJ Over a year ago

As a side note - you could also have edited @icyrock's answer (which is the same as yours, just 6 years earlier), your explanation is very useful on its own.

jgritten Over a year ago

Thank you for this answer. I do, however, feel the need to point out that in VSCode search, your answer jack.*james | james.*jack will take the spaces around the '|' (or) symbol into consideration during the search. jack.*james|james.*jack works and doesn't look for the spaces.

Chris Strickland Over a year ago

Don't you need 2000 rep for the edit privilege?

Amine KOUIS Over a year ago

The second one also matches "jacky" and "jameson"; check this demo.
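The VSCode comment above is about spaces around | being searched literally. The same thing happens in a JavaScript regex literal, which this small sketch uses as a stand-in; VSCode's own search engine is not exercised here, so treat the parallel as an assumption.

```javascript
const tight = /jack.*james|james.*jack/;
// Here the spaces belong to the alternatives: "jack.*james " and " james.*jack".
const spaced = /jack.*james | james.*jack/;

const line = "hi jack here is james";

console.log(tight.test(line));  // true
console.log(spaced.test(line)); // false: no space after "james", no "jack" after " james"
```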

vsync · Accepted Answer · 2020-11-21 15:00:05Z

40

It's short and sweet

(?=.*jack)(?=.*james)

Test Cases:

[
  "xxx james xxx jack xxx", // true
  "jack xxx james ",        // true
  "jack xxx jam ",          // false
  "  jam and jack",         // false
  "jack",                   // false
  "james",                  // false
]
.forEach(s => console.log(/(?=.*james)(?=.*jack)/.test(s)))

edited Nov 21, 2020 at 15:00

vsync


answered Jan 29, 2019 at 8:59

Shivam Agrawal


1 Comment

user8560167 Over a year ago

Could you say how it works? A lookahead needs a word before it, and there is nothing. In this case, for element(?=.*jack) the result will be element, but for (?=.*jack) there will be no result. Also tried on an example string here: regex101.com
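Regarding the comment above: a lookahead does not need anything in front of it. With nothing before the lookaheads, the engine tries them at position 0, both succeed, and the overall match is an empty string at that position. A minimal JavaScript sketch of that behaviour:

```javascript
const re = /(?=.*jack)(?=.*james)/;

const hit = re.exec("xxx james xxx jack xxx");
console.log(hit.index);     // 0  (the match starts at the beginning)
console.log(hit[0].length); // 0  (nothing is consumed, the match is empty)

const miss = re.exec("jack xxx jam ");
console.log(miss);          // null (no position has both names ahead of it)
```

So the pattern is fine for a yes/no test, but it returns no text; add .* after the lookaheads if the matched text is needed.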

codaddict · Accepted Answer · 2010-12-08 16:16:25Z

9

You can do:

\bjack\b.*\bjames\b|\bjames\b.*\bjack\b
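A short JavaScript sketch of what the \b anchors change compared with the plain james.*jack|jack.*james alternation. The test strings are my own examples.

```javascript
const loose = /james.*jack|jack.*james/;
const bounded = /\bjack\b.*\bjames\b|\bjames\b.*\bjack\b/;

const cases = [
  "hi jack here is james", // both whole words
  "jacky met jameson",     // both only as prefixes of longer words
  "jack met jameson",      // one whole word, one prefix
];

const table = cases.map((s) => [loose.test(s), bounded.test(s)]);
console.log(table); // [ [ true, true ], [ true, false ], [ true, false ] ]
```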

answered Dec 8, 2010 at 16:16

codaddict


Comments

Emma Marcier · Accepted Answer · 2020-08-16 15:25:25Z

The expression in this answer does that for one jack and one james in any order.

Here, we'd explore other scenarios.

METHOD 1: One `jack` and One `james`

In case two jack or two james should not be allowed, so that only one jack and one james is valid, we can design an expression similar to:

^(?!.*\bjack\b.*\bjack\b)(?!.*\bjames\b.*\bjames\b)(?=.*\bjames\b)(?=.*\bjack\b).*$

Here, we would exclude those instances using these statements:

(?!.*\bjack\b.*\bjack\b)

and,

(?!.*\bjames\b.*\bjames\b)

RegEx Demo 1

We can also simplify that to:

^(?!.*\bjack\b.*\bjack\b|.*\bjames\b.*\bjames\b)(?=.*\bjames\b|.*\bjack\b).*$

RegEx Demo 2

If you wish to simplify/update/explore the expression, it's explained in the top right panel of regex101.com. You can watch the matching steps or modify them in this debugger link, if you're interested. The debugger demonstrates how a regex engine might consume some sample input strings step by step and perform the matching process.

RegEx Circuit

jex.im visualizes regular expressions:

Test

const regex = /^(?!.*\bjack\b.*\bjack\b|.*\bjames\b.*\bjames\b)(?=.*\bjames\b|.*\bjack\b).*$/gm;
const str = `hi jack here is james
hi james here is jack
hi james jack here is jack james
hi jack james here is james jack
hi jack jack here is jack james
hi james james here is james jack
hi jack jack jack here is james
`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    
    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}

METHOD 2: One `jack` and One `james` in a specific order

The expression can also be designed for first a james and then a jack, similar to the following one:

^(?!.*\bjack\b.*\bjack\b|.*\bjames\b.*\bjames\b)(?=.*\bjames\b.*\bjack\b).*$

RegEx Demo 3

and vice versa:

^(?!.*\bjack\b.*\bjack\b|.*\bjames\b.*\bjames\b)(?=.*\bjack\b.*\bjames\b).*$

RegEx Demo 4

Great explanation. It would be even better if your method 1 could match both 'james' AND 'jack' in any order. Testing it, I found that your regex expression matches single 'james' or 'jack'
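The comment above can be reproduced: in the simplified pattern the positive lookahead contains an alternation, so it is satisfied by either name. The sketch below compares it with one possible correction that keeps a separate positive lookahead per name; the corrected form is my suggestion, not the answer author's.

```javascript
// Simplified pattern from the answer: (?=A|B) passes when either name is present.
const simplified =
  /^(?!.*\bjack\b.*\bjack\b|.*\bjames\b.*\bjames\b)(?=.*\bjames\b|.*\bjack\b).*$/;

// Possible correction: one positive lookahead per required name.
const corrected =
  /^(?!.*\bjack\b.*\bjack\b|.*\bjames\b.*\bjames\b)(?=.*\bjames\b)(?=.*\bjack\b).*$/;

const lines = [
  "hi jack here is james",           // exactly one of each
  "hi james alone",                  // james only
  "hi jack jack here is jack james", // jack repeated
];

const outcome = lines.map((l) => [simplified.test(l), corrected.test(l)]);
console.log(outcome); // [ [ true, true ], [ true, false ], [ false, false ] ]
```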

bobble bubble · Accepted Answer · 2022-10-24 13:43:44Z

7

No need for two lookaheads, one substring can be normally matched.

^(?=.*?\bjack\b).*?\bjames\b.*

See this demo at regex101

Lookarounds are zero-length assertions (conditions). The lookahead here checks at ^ start if jack occurs later in the string and on success matches up to james and .* the rest (could be removed). Lazy dot is used before words (enclosed in \b word boundaries). Use the i-flag for ignoring case.

edited Oct 24, 2022 at 13:43

answered Oct 24, 2022 at 11:14

bobble bubble


2 Comments

RavinderSingh13 Over a year ago

Very Good answer, thanks for sharing. One question: do we need .* after last \b or will that work without it also?

bobble bubble Over a year ago

@RavinderSingh13 Thank you for your comment, good point! For just validating the .* in the end is indeed useless, it's just needed if the full match is wanted.
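To illustrate the exchange above, this JavaScript sketch runs the single-lookahead pattern with and without the trailing .* and prints the matched text. The sample strings are illustrative.

```javascript
const withTail = /^(?=.*?\bjack\b).*?\bjames\b.*/;
const withoutTail = /^(?=.*?\bjack\b).*?\bjames\b/;

const s = "hi jack here is james ok";

console.log(withTail.exec(s)[0]);    // "hi jack here is james ok"
console.log(withoutTail.exec(s)[0]); // "hi jack here is james"

// For plain validation the two behave the same.
console.log(withoutTail.test("hi james here is jack")); // true
console.log(withoutTail.test("hi james only"));         // false
```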

XPMai · Accepted Answer · 2020-08-28 11:16:26Z

5

You can make use of regex's quantifier feature since lookaround may not be supported all the time.

(\bjames\b){1,}.*(\bjack\b){1,}|(\bjack\b){1,}.*(\bjames\b){1,}

edited Aug 28, 2020 at 11:16

answered Jun 1, 2020 at 10:35

XPMai


3 Comments

captain_majid Over a year ago

Why no one tries this, 0 voted answers might be the best, thanks mate.

XPMai Over a year ago

@captain_majid, I apologize. After intense research and based on false positives data, I realized my original answer was wrong. I've fixed the regex code. This correct regex will work perfectly as expected.

captain_majid Over a year ago

Your 1st example worked fine with me, and strangely even a simpler one like that worked also: \b(word1|word2|word3|word4|etc)\b I've tested it here: rubular.com/r/Pgn2d6dXXXHoh7
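A quick JavaScript check of the patterns discussed in this answer and its comments. As far as I can tell, {1,} around a word-bounded name behaves like the name on its own, while the simpler \b(jack|james)\b pattern is an OR: it matches when either name is present. The names in the last pattern are substituted for the word1..word4 placeholders of the comment.

```javascript
const quantified =
  /(\bjames\b){1,}.*(\bjack\b){1,}|(\bjack\b){1,}.*(\bjames\b){1,}/;
const plain = /\bjames\b.*\bjack\b|\bjack\b.*\bjames\b/;
const anyOf = /\b(jack|james)\b/; // alternation inside one group means "either"

const inputs = [
  "hi jack here is james",
  "hi james here is jack",
  "hi jack only",
  "jacky and jameson",
];

const rows = inputs.map((s) => [quantified.test(s), plain.test(s), anyOf.test(s)]);
console.log(rows);
// [ [ true, true, true ], [ true, true, true ],
//   [ false, false, true ], [ false, false, false ] ]
```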

Firstrock · Accepted Answer · 2024-01-25 12:23:08Z

Vim has a branch operator \& that is useful when searching for a line containing a set of words, in any order. Moreover, extending the set of required words is trivial.

For example,

/.*jack\&.*james

will match a line containing jack and james, in any order.

Likewise,

/\<\(\w*jack\&\w*james\)

will match a variable name containing jack and james, in any order.

See this answer for more information on usage. I am not aware of any other regex flavor that implements branching; the operator is not even documented in the Regular Expression Wikipedia entry.

Daniel Tripp · Accepted Answer · 2023-11-29 05:48:48Z

All of the answers so far work for finding a match, but they don't all work for highlighting that match. For example: if you want to use grep's "--only-matching" or "--color" options, then there's only one kind of answer (so far) that will work: james.*jack|jack.*james. I'll call this the permutations technique, and the other ones the lookaround technique and the vim branch technique.

The lookaround technique won't do any highlighting at all, because it will always match a zero-length string, because - by definition - that's what lookarounds do. That is, for this input text:

hi jack here is james
hi james here is jack

a (perl) regex of (?=.*jack)(?=.*james) won't highlight anything. You can test this by running this command in most any unix shell:

printf 'hi jack here is james\nhi james here is jack\n' | grep --color --perl '(?=.*jack)(?=.*james)'

Some of the answers here add .* to the beginning and the end. That will highlight something - the whole line - but that doesn't help if our goal is to highlight the words we're looking for, and what's in between those words, and nothing more.

The vim branch technique (AKA \&) will highlight something that might look useful at a glance, but it's probably not what you want. For the same input text, a vim search for /.*james\&.*jack will highlight hi jack and hi james here is jack. To test this from the shell, run this:

printf 'hi jack here is james\nhi james here is jack\n' | vim -R - '+/.*james\&.*jack'

Only the permutations technique will highlight the most useful things: jack here is james and james here is jack. To test this from the shell:

printf 'hi jack here is james\nhi james here is jack\n' | grep --color --perl 'james.*jack|jack.*james'

All of what I've written here assumes that you want a technique that will generalize to three or more words.
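The highlighting difference comes down to which text each pattern actually matches. As a rough JavaScript illustration (grep and vim are not involved here, so this only mirrors the lookaround and permutations cases):

```javascript
const line = "hi jack here is james";

const lookaround = /(?=.*jack)(?=.*james)/;
const wholeLine = /^(?=.*jack)(?=.*james).*$/;
const permutation = /james.*jack|jack.*james/;

// Text that a highlighter would mark for each pattern.
const matchedText = (re) => line.match(re)[0];

console.log(JSON.stringify(matchedText(lookaround)));  // ""
console.log(JSON.stringify(matchedText(wholeLine)));   // "hi jack here is james"
console.log(JSON.stringify(matchedText(permutation))); // "jack here is james"
```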

mickmackusa · Accepted Answer · 2025-06-27 06:37:13Z

I looked at the other solutions and thought they were a little unnecessarily long, complex, and wordy. Basically you want a regex that will match

"firstword ANYTHING secondword"

And that same regex will match

"secondword ANYTHING firstword"

So I did a slight variation on this solution which worked for me

jack.*james|james.*jack

Only instead of using one regular expression, I ran two and did an OR on the results

    #!/usr/bin/perl -w
    
    #match two words in any order
    
    my @testStrings;
    push(@testStrings, "firstword secondword");
    push(@testStrings, "secondword firstword");
    push(@testStrings, "firstword some filler in the middle secondword");
    push(@testStrings, "secondword match either word coming first firstword");
    push(@testStrings, "filler in beginning firstword some filler in the middle secondword filler in the end");
    push(@testStrings, "filler in beginning secondword some filler in the middle firstword filler in the end");
    push(@testStrings, "doh is it matching anything?");
    push(@testStrings, "firstword alone");
    push(@testStrings, "secondword alone");
    
    for (@testStrings){
      my $matched = $_ =~ /firstword.*secondword/;
      my $matchedReverse = $_ =~ /secondword.*firstword/;
      #/(?:firstword.*secondword)|(?:secondword.*firstword)/  as a single regex also works
      print "string: $_\n";
      if($matched || $matchedReverse){
        print "regex: Matched\n";
      } else{
        print "regex: Did not match\n";
      }
      print "\n";
    }

Output looks like this

perl forwardAndBackwardRegex.pl
string: firstword secondword
regex: Matched

string: secondword firstword
regex: Matched

string: firstword some filler in the middle secondword
regex: Matched

string: secondword match either word coming first firstword
regex: Matched

string: filler in beginning firstword some filler in the middle secondword filler in the end
regex: Matched

string: filler in beginning secondword some filler in the middle firstword filler in the end
regex: Matched

string: doh is it matching anything?
regex: Did not match

string: firstword alone
regex: Did not match

string: secondword alone
regex: Did not match

I do this same sort of technique in bash when I am searching for a filename with two words in any order. I will just run two greps. Something like this

    ls | grep -i firstword | grep -i secondword

It will match as long as both words are present.
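The chained greps amount to running one simple test per word and requiring all of them to pass. A minimal JavaScript sketch of the same idea follows; the file names are made up for illustration.

```javascript
// One case-insensitive pattern per required word, like one `grep -i` per word.
const required = [/firstword/i, /secondword/i];

// Hypothetical file names standing in for the output of `ls`.
const names = [
  "firstword_secondword.txt",
  "SecondWord-FirstWord.log",
  "firstword_only.txt",
];

// Keep a name only if every pattern matches it, in any order.
const hits = names.filter((name) => required.every((re) => re.test(name)));
console.log(hits); // [ 'firstword_secondword.txt', 'SecondWord-FirstWord.log' ]
```

Adding a third word is just one more entry in the list, with no change to the individual patterns.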
