Duplicate: What is the best regular expression for validating email addresses?

I know this is a common question, but I still can't seem to find a great regular expression to use when validating email addresses.

I don't really have time to go read the spec and write my own. What have y'all used before, and has it worked well? I don't really care about 100% matching the spec, but the closer the better.

2 Answers

14

Here's a function that I use. It does a little more than just run the email address through a regex, but so far it is the most complete solution that I found:

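// Returns true if $email looks like a valid address. Unless $skipDNS is true,
// the domain must also have an MX or A record in DNS.
function validEmail($email, $skipDNS = false)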
{
   $isValid = true;
   // split at the last "@", since a quoted local part may itself contain "@"
   $atIndex = strrpos($email, "@");
   if ($atIndex === false)
   {
      $isValid = false;
   }
   else
   {
      $domain = substr($email, $atIndex+1);
      $local = substr($email, 0, $atIndex);
      $localLen = strlen($local);
      $domainLen = strlen($domain);
      if ($localLen < 1 || $localLen > 64)
      {
         // local part length exceeded
         $isValid = false;
      }
      else if ($domainLen < 1 || $domainLen > 255)
      {
         // domain part length exceeded
         $isValid = false;
      }
      else if ($local[0] == '.' || $local[$localLen-1] == '.')
      {
         // local part starts or ends with '.'
         $isValid = false;
      }
      else if (preg_match('/\\.\\./', $local))
      {
         // local part has two consecutive dots
         $isValid = false;
      }
      else if (!preg_match('/^[A-Za-z0-9\\-\\.]+$/', $domain))
      {
         // character not valid in domain part
         $isValid = false;
      }
      else if (preg_match('/\\.\\./', $domain))
      {
         // domain part has two consecutive dots
         $isValid = false;
      }
      else if (!preg_match('/^(\\\\.|[A-Za-z0-9!#%&`_=\\/$\'*+?^{}|~.-])+$/', str_replace("\\\\","",$local)))
      {
         // character not valid in local part unless 
         // local part is quoted
         if (!preg_match('/^"(\\\\"|[^"])+"$/', str_replace("\\\\","",$local)))
         {
            $isValid = false;
         }
      }
      
      if(!$skipDNS)
      {
          if ($isValid && !(checkdnsrr($domain,"MX") || checkdnsrr($domain,"A")))
          {
             // domain not found in DNS
             $isValid = false;
          }
      }
   }
   return $isValid;
}

The function has an optional $skipDNS argument that can be set to TRUE if you don't want to look up DNS records for the host. Otherwise the function checks that the domain has an MX or A record, i.e. that the address provided plausibly maps to a real mail server.
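The DNS step can be looked at in isolation. This is only a sketch of the same MX-or-A check; domainResolves and its $lookup parameter are hypothetical names added here so the lookup can be swapped for a stub when testing without network access. By default it calls PHP's checkdnsrr(), as the function above does.

```php
<?php
// Hypothetical helper mirroring the DNS check in validEmail().
// $lookup is an assumed injection point; it defaults to PHP's checkdnsrr().
function domainResolves(string $domain, ?callable $lookup = null): bool
{
    $lookup = $lookup ?? 'checkdnsrr';

    // Prefer an MX record, but fall back to an A record,
    // since mail can be delivered to a host that has no MX.
    return $lookup($domain, 'MX') || $lookup($domain, 'A');
}

// Stub resolver for offline use: pretends only example.com has an A record.
$stub = function (string $host, string $type): bool {
    return $host === 'example.com' && $type === 'A';
};

var_dump(domainResolves('example.com', $stub));
var_dump(domainResolves('no-such-host.invalid', $stub));
```

Keep in mind that a successful lookup only shows the domain exists, not that the mailbox does.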

It's worth noting that most regex-based validation techniques will accept most e-mail addresses, but they will likely let some carefully crafted invalid addresses through or, worse, reject some obscure but valid ones. For more information you may want to check out the Internet Message Format RFC (RFC 5322), which describes the allowed format for e-mail addresses.
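For comparison, PHP also ships a built-in syntax check, filter_var() with FILTER_VALIDATE_EMAIL. This is not what the function above uses; the sketch below is only a rough baseline, and looksLikeEmail is a hypothetical wrapper name.

```php
<?php
// Hypothetical wrapper around PHP's built-in validator.
// filter_var() returns the address on success and false on failure.
function looksLikeEmail(string $email): bool
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

// Illustrative inputs only.
foreach (['user@example.com', 'user+tag@example.com', 'no-at-sign', 'user@@example.com'] as $candidate) {
    printf("%-25s %s\n", $candidate, looksLikeEmail($candidate) ? 'accepted' : 'rejected');
}
```

Like any pattern-based check, this looks at syntax only and does no DNS lookup, so it would have to be combined with something like the $skipDNS step if that matters to you.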

2 Comments

Checking the DNS records... nice approach!
This is actually also somewhat vulnerable to ReDoS (regular expression denial of service). See a related exploit here: github.com/johnhenry/valid-email/issues/4
13
^([0-9a-zA-Z]([-\.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$

This is an awesome tool to help write and check expressions. Not sure if you have it, but hopefully it's helpful.

Expresso
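To get a feel for what the pattern above accepts, here is a small sketch that runs it against a few short addresses. The "/" delimiters, the matchesAnswerPattern helper, and the sample addresses are assumptions added for illustration; only the pattern itself comes from the answer. Inputs are kept short on purpose.

```php
<?php
// Pattern copied from the answer above; only the "/" delimiters are added.
const ANSWER_PATTERN = '/^([0-9a-zA-Z]([-\.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$/';

// Hypothetical helper: true when the pattern accepts $email.
function matchesAnswerPattern(string $email): bool
{
    return preg_match(ANSWER_PATTERN, $email) === 1;
}

// Illustrative addresses on reserved example domains.
$samples = [
    'user@example.com',
    'first.last@mail.example.org',
    'user+tag@example.com',   // "+" is not in the local-part character set
    'user@x.example.com',     // each domain label must be at least two characters
];

foreach ($samples as $email) {
    printf("%-30s %s\n", $email, matchesAnswerPattern($email) ? 'accepted' : 'rejected');
}
```

The last two addresses are syntactically fine in practice but are rejected, which is the kind of false negative a hand-written pattern tends to produce.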

3 Comments

This regex is a sure-fire way to induce catastrophic backtracking, as evidenced by this follow-up question.
Awesome tool doesn't warn about catastrophic backtracking. Not sure if RegexBuddy does, but it's my tool of choice.
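One practical consequence in PHP: when PCRE gives up on a match (for example because it hit its backtracking limit), preg_match() returns false rather than 0, and code written as !preg_match(...) cannot tell the two apart. The sketch below separates "no match" from "the engine failed"; safeMatch is a hypothetical helper name, and this only detects the failure, it does not make a backtracking-prone pattern safe.

```php
<?php
// Hypothetical helper: true on match, false on no match,
// null when preg_match() itself failed (see preg_last_error()).
function safeMatch(string $pattern, string $subject): ?bool
{
    $result = preg_match($pattern, $subject);

    if ($result === false) {
        // preg_last_error() reports the cause, e.g. PREG_BACKTRACK_LIMIT_ERROR
        // when the pattern backtracked too much on this subject.
        return null;
    }

    return $result === 1;
}

var_dump(safeMatch('/^[a-z]+$/', 'abc'));   // matches
var_dump(safeMatch('/^[a-z]+$/', '123'));   // does not match
var_dump(safeMatch('/^.$/u', "\xff"));      // invalid UTF-8 makes preg_match() fail
```

A caller can then decide to reject the input outright whenever the result is null instead of silently treating it as "did not match".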
