loading...

There's only one way to validate an email address

Jerod Santo on June 18, 2019

The only thing that you can reliably do to validate an email address is to send it an email. YOU SEND IT AN EMAIL! That's the only way you can do i... [Read Full]
markdown guide
 

Your regular expression is invalid; it's not good enough.

Technically there's a fully RFC822 compliant email regex, though it's "a bit" unwieldy ๐Ÿ˜‰:

(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:
\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(
?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ 
\t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\0
31]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\
](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+
(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:
(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)
?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\
r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[
 \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)
?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t]
)*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[
 \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*
)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)
*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+
|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r
\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:
\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t
]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031
]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](
?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?
:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?
:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?
:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?
[ \t]))*"(?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] 
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|
\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>
@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"
(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?
:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[
\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-
\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(
?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;
:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([
^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\"
.\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\
]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\
[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\
r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] 
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]
|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \0
00-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\
.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,
;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?
:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[
^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]
]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)(?:,\s*(
?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(
?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[
\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t
])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t
])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?
:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|
\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:
[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\
]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)
?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["
()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)
?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>
@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[
 \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,
;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:
\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[
"()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])
*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])
+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\
.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(
?:\r\n)?[ \t])*))*)?;\s*)
 
 

Yeah... I'm not gonna touch that with a 10 foot pole ๐Ÿ˜‰

 

Tried to put it in at debuggex but it couldn't validate test@test.com, guessing it's one of those that also includes other information? Side-note the graph ended up so big that I couldn't see even half of it when fully zoomed out.

 

Works for me. Make sure you select PCRE as regex type and add the 'x' flag, otherwise it won't work. The site is definitely struggling with this monster though ๐Ÿ˜‚

 

In fact -- man, I don't know what came over me; I was actually even talked into copy-pasting one off of a gist!

As soon as I read this, I knew I was in trouble haha. I remember that conversation. Good thing you've had it fixed for almost a year now!

 

I totally agree with the problems in regex and how sending an email is best way to validate an email. To add to the bot scenario in the ending I've faced situation where after I click the link in email I was faced with a recaptcha which seems good unless bots learn to solve those too

 

โ€ฆrecaptcha which seems good unless bots learn to solve those too

2017

 

Interesting article. However, what do you actually propose that a valid email address is:

  • An emailadress that works? Then this check is great.
  • Detecting it's not a bot? I don't really see this working. Especially not on long term.
  • Detecting it's not fraud/scam? Does not work. Scammers can steal credentials of their victims or what happens more often, just let the victims confirm their email.

If it's just about checking if a user did not mistyped his of her email. You're best off using a well tested browser implementation or just keeping the input loosely validated. The last thing you want is loosing conversion by someone that can not enter a valid email. At the end, it's the users responsibility to enter their correct emailadress.

If you really want to be sure and do not want to depend on a browser implementation or a strict validation process. You could even ask their email twice. Then the chance that the user mistyped will be very, very low.

 

what do you actually propose that a valid email address is

To me, valid means it's an email address in control of a human who entered it in earnest. You could layer on bot defense (via recaptcha, etc) either on the email form itself or on the confirmation page linked to in the email. But in my opinion, that's a separate concern.

You're best off using a well tested browser implementation or just keeping the input loosely validated.

I'm all for using <input type=email> and letting browsers do their thing. But that's more for UX than it is for me as the site owner.

 

If you really want to be sure and do not want to depend on a browser implementation or a strict validation process. You could even ask their email twice. Then the chance that the user mistyped will be very, very low.

Though, if you do that, you probably want to disable the ability to paste from cut-buffer into the form. Otherwise, most people will just copy-paystah and you can end up with the wrong string twice.

Remember: there's people out there constantly trying to build a better idiot.

 

Would a regex catch either of the last two cases?

 

No, pretty impossible using regex. There are tools like Siftscience though.

 

What are people's thoughts on the registration flow where the user puts in an email first, has to click the verification link, then sets username/password/finish signing up?

Any pros/cons over getting the profile info and then verifying the account?

 

That's pretty much what we do on changelog.com except we don't bother with a password.

There is a way you can join and give us all your info up front, but most people just give us their email and we go from there. Once they click the confirm link they are directed to their profile where they can fill out the rest of the info if they like.

Works pretty well, in my experience.

 

Used to be, you could validate an email address by connecting to the SMTP server listed in the address's MX record, and then do a VRFY. But, also thanks to spammers, you can almost never do that any more.

 

The good ole' days! ๐Ÿ‘ด

 

I agree that you shouldn't use a regex, but PHP in particular has the filter_var() function, which is a far better option. There are a few edge cases that are validated incorrectly, but it's generally fairly reliable.

However, sending an activation email is probably prudent in most cases since just because an email address is valid doesn't mean it actually exists.

 

However, sending an activation email is probably prudent in most cases since just because an email address is valid doesn't mean it actually exists.

Or is owned by the person submitting the addressโ€ฆ :p

 
 

Yep. There are similar challenges with telephone numbers too ("surely not!" I hear you cry... uh-huh: en.wikipedia.org/wiki/National_con...), and usually our clients want to confirm it's not made up, thus, we adopt similar solutions, such as sending an SMS message, or making an actual call.

 

Thus the recent FCC warnings about dialing back unknown "local" numbers. Thus, "be careful about calling back".

A couple weeks ago, I received a call that, on my cell, very obviously came from the 20 country-code. However, on my VOIP line's handset - and its simplified display - the number simply appeared as a ten-digit 202NNNNNNN (the 202 area-code is local to me) number.

 

When I validate an email I simply regex for alphanumeric, @ and a . following the @. As you say, it's impossible to truly validate an email address without sending an email, but it's worth making sure it's a potentially valid email address.

Edit: In fact I think my regex actually works like: [capture-group-1}@[capture-group-2].[capture-group-3]

 

Until bots start clicking on emails. Then we're gonna have a whole new issue... But so far I don't think there are bots that will

create a fake email address
sign up for your thing, and then
access that email address and click on the link

It's not my proudest moment, but I once had to write a bot that did exactly that, as well as solving CAPTCHAs in order to post ads on classifieds sites. It's not that hard to do, and there are a bunch of services out there that will provide you with rotating IP adresses, rotating e-mail adresses that all forward to your server, and even captcha solving.

So never assume that because you send a validation e-mail it's a human that registered the account and clicked the validation link.

I completely agree with the rest of the article though !

 

So never assume that because you send a validation e-mail it's a human that registered the account and clicked the validation link.

Thanks, you've managed to ruin my day! ๐Ÿ˜ฑ

 

Agreed. To understand how complex matching an email address is, the specification for them is RFC5322. But the HTML5 email input box validation doesn't follow it, and its spec has this to say:

This requirement is a willful violation of RFC 5322, which defines a syntax for e-mail addresses that is simultaneously too strict (before the "@" character), too vague (after the "@" character), and too lax (allowing comments, whitespace characters, and quoted strings in manners unfamiliar to most users) to be of practical use here.

 

I never really understood why someone would want to write an email address validator when there is a test that is so much easier and always 100% accurate. The proof of the pudding is in the eating....

 

What about "10MinutesMail" services and others like that ?

 

In Japan many people use emails that do not follow the standard. For example .name..@example.com

Such emails were given out by one of the major phone providers if I remember correctly.

 

Yet another edge case to break our regexen! ๐Ÿ˜ญ

 

I've always been curious, how do services like kickbox.com validate email addresses?

code of conduct - report abuse