DEV Community

Properly validating e-mail addresses

tux0r on August 30, 2018

If you are a developer of "web applications", you probably have written (or copy-pasted) some code which tries to validate an entered e-mail addres...
 
Isaac Lyman

^.+@.+\..+$ is the regex I use. It requires something before an @, something between that @ and a following dot, and something after that dot.

Does it accept some invalid email addresses? Probably. I ask users to verify their email address anyway; worst case scenario, they enter an invalid address and the email bounces. That one is on them.

Does it deny some valid email addresses? Maybe, but in my opinion if your email address is so weird it doesn't have an @ and a . in it, then you already know you're a signup error waiting to happen.

For a little while, I was hearing complaints from users who accidentally typed a space while entering their email address. So I added a validator to make sure the email address has no spaces. I know this denies even more valid email addresses, but my goal isn't to match the RFC perfectly, it's to allow the maximum number of users to sign up with the minimum number of problems. This prioritizes a large number of users who make mistakes over a tiny number of users who do not, which may not be fair but makes sense as a business decision.
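A minimal Python sketch of the two checks described in this comment: the permissive pattern plus the no-spaces rule. The function name `looks_like_email` is made up for illustration, and treating "no spaces" as "no whitespace of any kind" is an assumption; this is not the commenter's actual code.

```python
import re

# The permissive pattern quoted above: something, "@", something, ".", something.
PERMISSIVE = re.compile(r"^.+@.+\..+$")


def looks_like_email(address: str) -> bool:
    """Cheap plausibility check; the confirmation e-mail does the real validation."""
    # Reject any whitespace, which catches accidentally typed spaces.
    if any(ch.isspace() for ch in address):
        return False
    return PERMISSIVE.match(address) is not None


print(looks_like_email("jane@example.com"))   # True
print(looks_like_email("jane @example.com"))  # False
print(looks_like_email("jane@localhost"))     # False
```

The check is deliberately loose: it only filters obvious typos and leaves the real decision to the verification e-mail.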

Drew Knab

+1

I'm not really here to impress cranks who have garbage-fire emails because the spec allows it.

Matteo Rigon • Edited

I once read a similar article. Yes, the RFC is not implemented correctly anywhere (besides your library now, I guess 😉), but a user with an unusual email address will have more serious trouble than registering on your site, since basically nothing on the web will allow that address to be used (or created in the first place).

tux0r

That problem will fade as more web developers integrate my library! ;-)

Mihail Malo

Wait, wait, wait... your library?
I'm calling the 🚓

tux0r

Why?

Mihail Malo

Because you have been so successful in framing your advertising post as an informational post that I took the bait.
Fortunately, police don't care about truth or justice, they just want to inflict some damage. This is the rare case when they're just what the doctor ordered.

tux0r

I'm not advertising, I'm explaining. There is a lot of advertising on DEV. I'm not a company. I don't care how many people know my software. I don't sell anything to anyone.

Please complain to actual advertisers instead.

Mihail Malo

Alright, I feel you, I feel you.

Martin G. Brown • Edited

Whether Unicode is allowed in the name part depends on whether the SMTP servers involved support the SMTPUTF8 extension. In my experience, they mostly don't. AWS's SES and SendGrid don't, for example.

tux0r

That's not the problem of the validator though...

Martin G. Brown

Well it is if you are letting through addresses your own SMTP server doesn't support.
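To make the point of this exchange concrete, here is a rough Python sketch of how an application might decide whether an address needs SMTPUTF8 and whether its outgoing server advertises it. The function names are hypothetical; `smtplib.SMTP.ehlo` and `has_extn` are standard-library calls, but the host you pass in and the policy you build on top are assumptions about your own setup.

```python
import smtplib


def needs_smtputf8(address: str) -> bool:
    """True if the local part contains non-ASCII characters."""
    local_part, _, _domain = address.rpartition("@")
    return not local_part.isascii()


def server_supports_smtputf8(host: str, port: int = 25) -> bool:
    """Ask the server (via EHLO) whether it advertises SMTPUTF8. Needs network access."""
    with smtplib.SMTP(host, port, timeout=10) as smtp:
        smtp.ehlo()
        return smtp.has_extn("smtputf8")


print(needs_smtputf8("jane@example.com"))    # False
print(needs_smtputf8("jürgen@example.com"))  # True
```

A signup form could accept an address that is valid under the RFC but still refuse it when `needs_smtputf8` is true and the configured server lacks the extension.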

Mihail Malo • Edited

"user registers with an IP address for a domain"

Oh, such fun! I definitely want this happening to me and my apps!
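For apps that would rather not accept such addresses, a small Python sketch of detecting a bracketed IP literal in the domain part follows. The name `is_address_literal` is hypothetical, and the sketch assumes the bracket syntax from RFC 5321 (`[192.0.2.1]` or `[IPv6:2001:db8::1]`); it does not validate the rest of the address.

```python
import ipaddress


def is_address_literal(address: str) -> bool:
    """True if the domain part is a bracketed IP literal such as [192.0.2.1]."""
    _, _, domain = address.rpartition("@")
    if not (domain.startswith("[") and domain.endswith("]")):
        return False
    literal = domain[1:-1]
    try:
        # RFC 5321 tags IPv6 literals with an "IPv6:" prefix.
        if literal.lower().startswith("ipv6:"):
            ipaddress.IPv6Address(literal[5:])
        else:
            ipaddress.IPv4Address(literal)
    except ValueError:
        return False
    return True


print(is_address_literal("user@[192.0.2.1]"))  # True
print(is_address_literal("user@example.com"))  # False
```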

And no, we shouldn't be internationalizing domains, we should be de-nationalizing people.

/en/ shouldn't just be the default aliased to /, it should be the only language of the web.

tux0r

Or German, which is the most-spoken language in Europe (and soon, when the only English-speaking countries leave the EU, even more relevant here)...

Mihail Malo

Then you should have said Chinese.
I'm talking about the language (and charset) of legacy systems, and the language of programming language keywords.

Rob Janssen • Edited

How does it hold up to, say, these tests?

code.iamcal.com/php/rfc822/tests/

You can find the tests in easy format here: github.com/dominicsayers/isemail/b...

tux0r

I am not sure. I have not tried any of those. If you find something is missing, please submit a proper bug report (or even a fix).

From looking through that page, it seems to follow RFC 822, which has been declared obsolete.

Adrian B.G.

"which I have developed to solve this very problem once and for all."

You cannot do that as long as the internet evolves.

tux0r

I will update the library as soon as a new address standard is established.

Jason Spradlin

For the rest of your life?

tux0r

Probably.

tux0r

I have tried some of the "perfectly working" mechanisms. Even if your language has one, it will most likely not cover corner cases. (I admit that I have not tried every single one.)

 
tux0r

I have just tested a local address with an emoji. PHP does not accept that.

Pert Soomann

It seems the PHP implementation is perfect, but it checks against an older RFC.

tux0r

"That 99.99% Works"

Not good enough.

 
tux0r

So the PHP implementation becomes increasingly less usable as more and more Unicode domains are registered.
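One possible workaround when a validator only accepts ASCII domains is to convert the domain to its Punycode form first. The Python sketch below assumes that approach; the function name `domain_to_ascii` is hypothetical, and Python's built-in `idna` codec follows the older IDNA 2003 rules, so treat it as an illustration and not as a complete IDN implementation.

```python
def domain_to_ascii(address: str) -> str:
    """Return the address with its domain converted to ASCII (Punycode) form."""
    local_part, _, domain = address.rpartition("@")
    # Python's built-in "idna" codec implements the older IDNA 2003 rules.
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local_part}@{ascii_domain}"


print(domain_to_ascii("info@bücher.example"))  # info@xn--bcher-kva.example
```

This only touches the domain; a non-ASCII local part is left unchanged and still depends on what the validator and mail server accept.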

tux0r • Edited

Hazardous, but rule-compliant. libvldmail has a compiler flag for that, so you could make it reject them if needed.

Pert Soomann

Yeah, beyond the purely technical ability to validate an email address against the rules, it's a weird one on so many levels:

theguardian.com/technology/2017/ap...

I can see pros and cons whichever way you go, though having a dedicated library does help with faster updates compared with built-in functions that might take years to reach their next release.