2

I have this regex which is used to validate phone numbers.

^\\(?(\\d{2,3})\\)?[-(). ]?(\\d{2,3})[-(). ]?(\\d{4})$

(Yes, I know it is not perfect, but I don't really care.) I am just using it to replace phone numbers with another string, say ###, to remove sensitive information, so false positives are fine.

It works when the string I am searching is only a phone number. This works:

String PHONE_PATTERN = "^\\(?(\\d{2,3})\\)?[-(). ]?(\\d{2,3})[-(). ]?(\\d{4})$";
String phone = "123-123-1234";
System.out.println(phone.replaceAll(PHONE_PATTERN, "###")); // prints '###'

But with surrounding text it does not work:

String PHONE_PATTERN = "^\\(?(\\d{2,3})\\)?[-(). ]?(\\d{2,3})[-(). ]?(\\d{4})$";
String phone = "some other text 123-123-1234";
System.out.println(phone.replaceAll(PHONE_PATTERN, "###")); // prints 'some other text 123-123-1234'

By "does not work", I mean the text is printed unchanged.

What do I need to change in my regex so that the second example prints:

some other text ###
  • Holy mother of regex, Batman. Commented Jun 19, 2014 at 19:11
  • How did you end up with ^ and $ automatically when building your regex? When I write a regex, I have to actively think about whether either of them should be there, like "Do I only need to match this at the beginning of strings/lines?" Commented Jun 19, 2014 at 20:01
  • @ADTC I'm a complete noob when it comes to regex. I was just working off what I had found at this site: zparacha.com/… Commented Jun 19, 2014 at 23:48

3 Answers

6

Remove the ^ and $ from the beginning and end of your expression. Those characters match the beginning and end of a String, but you don't want the phone number to be the only content of the String, so you should remove them.
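
As a minimal sketch of that change, the snippet below uses the question's pattern with only the leading ^ and trailing $ dropped. The class name PhoneMasker and the method mask are hypothetical names chosen for illustration; they do not appear in the question.

```java
import java.util.regex.Pattern;

class PhoneMasker {
    // The question's pattern with the leading ^ and trailing $ removed,
    // so it can match anywhere inside a longer string.
    private static final Pattern PHONE = Pattern.compile(
            "\\(?(\\d{2,3})\\)?[-(). ]?(\\d{2,3})[-(). ]?(\\d{4})");

    // Replaces every substring that looks like a phone number with "###".
    static String mask(String text) {
        return PHONE.matcher(text).replaceAll("###");
    }

    public static void main(String[] args) {
        System.out.println(mask("123-123-1234"));                 // prints: ###
        System.out.println(mask("some other text 123-123-1234")); // prints: some other text ###
    }
}
```

Compiling the pattern once with Pattern.compile avoids recompiling it on every call, which String.replaceAll would otherwise do.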

5

Instead of anchors ^ and $ use \b (word boundary):

String PHONE_PATTERN = "\\b\\(?(\\d{2,3})\\)?[-(). ]?(\\d{2,3})[-(). ]?(\\d{4})\\b";
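
To see what the word boundaries change, here is a small comparison sketch. It assumes the same core pattern as the question; the class BoundaryDemo and its two methods are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

class BoundaryDemo {
    // Core of the question's pattern, without any anchors.
    private static final String CORE =
            "\\(?(\\d{2,3})\\)?[-(). ]?(\\d{2,3})[-(). ]?(\\d{4})";

    private static final Pattern UNANCHORED = Pattern.compile(CORE);
    private static final Pattern BOUNDED = Pattern.compile("\\b" + CORE + "\\b");

    // Masks matches found anywhere, including inside longer digit runs.
    static String maskUnanchored(String text) {
        return UNANCHORED.matcher(text).replaceAll("###");
    }

    // Masks only matches that start and end on a word boundary.
    static String maskBounded(String text) {
        return BOUNDED.matcher(text).replaceAll("###");
    }

    public static void main(String[] args) {
        System.out.println(maskBounded("some other text 123-123-1234")); // prints: some other text ###
        System.out.println(maskUnanchored("order 1234567890123"));       // prints: order ###123
        System.out.println(maskBounded("order 1234567890123"));          // prints: order 1234567890123
    }
}
```

One caveat with this variant: \b cannot match between a space (or the start of the string) and an opening parenthesis, so for an input such as "(123) 123-1234" the match starts at the first digit and the result is "(###". Since false positives are acceptable in the question, this may or may not matter.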

4

You need to remove the beginning-of-string anchor ^ and the end-of-string anchor $. With both in place, the pattern only matches if it spans the entire string, from the first character to the last.

  • The ^ stipulates the pattern must match the substring starting with the first character in the string.
  • The $ stipulates the pattern must match the substring ending with the last character in the string.

If you want to search for a pattern that is at one end or the other, that is when you need to use anchors.
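
For illustration, a sketch of the one-ended case: keeping only the trailing $ means a phone number is replaced only when it sits at the very end of the input. The class EndAnchoredDemo and the method maskTrailing are hypothetical names, not part of the question.

```java
import java.util.regex.Pattern;

class EndAnchoredDemo {
    // The question's pattern with ^ removed but $ kept:
    // a match must end where the input ends.
    private static final Pattern PHONE_AT_END = Pattern.compile(
            "\\(?(\\d{2,3})\\)?[-(). ]?(\\d{2,3})[-(). ]?(\\d{4})$");

    // Masks a phone number only if it is the last thing in the text.
    static String maskTrailing(String text) {
        return PHONE_AT_END.matcher(text).replaceAll("###");
    }

    public static void main(String[] args) {
        // Only the second number is at the end, so only it is replaced.
        System.out.println(maskTrailing("call 123-123-1234 or 555-555-5555")); // prints: call 123-123-1234 or ###
        // The number is followed by more text, so nothing is replaced.
        System.out.println(maskTrailing("555-555-5555 is my number"));         // prints: 555-555-5555 is my number
    }
}
```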
