Let’s talk about names!
See how my name’s right at the top of this thing? Names are generally one of the first things we ask for and one of the first things we offer in conversation as part of getting to know someone.
Names are also a pretty personal thing. It’s one thing to dislike your own name for your own reasons, but having someone else mangle your name can range from annoying to genuinely hurtful. So, it’s important to think about how we validate names as we work to create more inclusive technology.
So names are pretty important information. However, names can take a lot of forms, and it's hard to account for all of them. If we want to build validation into our forms, how do we validate names? Can we?
What are things that you’ve seen forms assume about how a name should be formatted?
- Everyone has one first name and one last name.
- Everyone has one first name and two last names!
- A name has a minimum of three characters. (You'd be surprised how often this comes up. Hilariously, some of these forms accept "Ho!" but not "Ho".)
- A given name comes first, and a family name comes last.
- Names are composed of letters.
- Names are composed of letters from the Latin alphabet, or can be losslessly translated into letters from the Latin alphabet.
- Names don't change, or only change at a few very specific points in time.
- Everyone has one canonical full name.
These are all actually wrong! As it turns out, basically all assumptions you can make about names are wrong.
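To make that concrete, here's a minimal sketch of what a couple of these assumptions look like once they end up in code. The function names and the pattern are made up for illustration and aren't taken from any real form; the sample names are just examples.

```python
import re

# Hypothetical checks of the kind many forms apply. Illustration only,
# not a recommendation.
LATIN_LETTERS_ONLY = re.compile(r"[A-Za-z]+")


def passes_min_length(name: str, minimum: int = 3) -> bool:
    """Length-only rule: anything with at least `minimum` characters passes."""
    return len(name) >= minimum


def passes_latin_letters(name: str) -> bool:
    """Alphabet rule: passes only if every character is an unaccented A-Z letter."""
    return LATIN_LETTERS_ONLY.fullmatch(name) is not None


# Print each sample with the result of both checks.
for sample in ["Wookey", "Ho", "Ho!", "Graham-Cumming", "José"]:
    print(sample, passes_min_length(sample), passes_latin_letters(sample))
```

Run as written, the length rule turns away "Ho" while letting "Ho!" through, and the letters rule turns away both "Graham-Cumming" and "José". None of those rejections tells you anything about whether the input is somebody's name.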
What people in one place take for granted about a "name" does not hold in most other places, and often not even within their own country.
While the USA is extremely diverse, there still tends to be a lot of expectation that names conform to a certain "Americanized" standard.
This goes back to way before web forms. Immigrants did, and still do, often adopt more "American" names in hopes that assimilating will lead to more opportunity. Even today, people with names that don't sound European often find themselves marked as perpetual foreigners, even when they were born here, and their parents were born here, and so on.
These expectations, when extended to web forms, are both really obnoxious and a real source of problems. I was actually looking into pitching a different talk to another conference earlier this year, but it turned out their form required names to be a minimum of three characters. To their credit, they were very gracious and fixed the problem after I let them know, but it's possible other people just gave up. (Hilariously, I have occasionally gotten around forms set up like that by adding an exclamation point.)
That's pretty much it. It's very difficult to define a name as anything other than this very basic, very vague thing, without any incorrect premises.
Sorry, there's no good way to validate them!
So, okay. Names are nigh-impossible to pin down, but you want to make sure you have data that's useful and serves the intended purpose. What can you do about that?
When we're gathering name information via a web form, the point of validation is not to figure out what is and isn't a name; the point is to get the information we need to use elsewhere. Sometimes we just want to be able to greet a user in emails or on their dashboard, sometimes we need it to respond to a personal communication, sometimes we need to know what to put on a packing label, and so on.
Do you really need to identify a user by a name? If the information isn't used in any way other than cosmetically, and you have other ways of identifying a user uniquely, such as email, you might be able to just drop it.
If there's never any reason for you to need only the given name or names, or only the surname or surnames, you can replace first name/last name with a single full name field. That way people can enter their name in a way that feels right to them.
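If you go with a single full name field, the check behind it can be very small. Here's a rough sketch, assuming all you want to confirm is that something was entered; `clean_full_name` is a hypothetical helper, not part of any framework.

```python
from typing import Optional


def clean_full_name(raw: str) -> Optional[str]:
    """Return the name as entered, minus surrounding whitespace.

    Returns None only when nothing but whitespace was entered.
    There is deliberately no minimum length, no alphabet check,
    and no splitting into "first" and "last".
    """
    trimmed = raw.strip()
    return trimmed or None
```

Whether the field is allowed to be empty at all is a separate decision. The point of the sketch is that the function has no opinion about what the name looks like.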
Do you only need part of a name, like just the personal name, or just the surname(s)? If so, consider collecting only that data, and labeling it in a way that makes sense to users from any cultural context, such as "How should we address you?" or "What's your full formal title?"
If a user knows you want a name for how to address them in emails, or for addressing a petition to a Congressperson, or to mail them a package, that's useful context that will help the user enter information that's helpful to both of you.
If you know, for example, that some of your users have multiple names that they go by (e.g., some people adopt an "English" given name on immigrating, or to do business with English-speakers, but retain a different legal name), your form might need to ask for both of those names if you anticipate situations where both are needed.
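One way to picture this is to store names by purpose rather than by position. The sketch below is only an assumption about how that could look; `NameRecord` and its field names are hypothetical, and which fields you need depends entirely on what you do with the data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NameRecord:
    # Hypothetical fields, each tied to a purpose the form explains to the user.
    full_name: str                    # the name as the person writes it
    address_as: Optional[str] = None  # answer to "How should we address you?"
    legal_name: Optional[str] = None  # asked for only if a legal document needs it

    def greeting_name(self) -> str:
        """Name for emails and dashboards; falls back to the full name."""
        return self.address_as or self.full_name

    def name_for_legal_documents(self) -> str:
        """Name for legal paperwork; falls back to the full name."""
        return self.legal_name or self.full_name
```

Someone who goes by one name day to day and a different one on legal documents can fill in both, and someone with a single name, like Wookey, fills in one field and is done.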
Some people have multiple given names, multiple surnames, or additional name parts that signify gender, generation, marital status or religious identity, among other things.
The world is also not uniform on whether the given name or the family name is written first. For example, many people with Spanish names coming to the United States encounter the issue that only their second family name gets read as their "real" surname, whereas in most Spanish-speaking countries the first one is generally the emphasized one.
Twitter, for example, only allows a 16-character display name, which is not much! While it means they don't have to adjust their placement of items in their UI, it means that a large number of people can't enter their whole name on Twitter.
Some systems are old, and have limited ability to be changed to accommodate larger character sets, spaces, punctuation, etc. due to the size or sensitivity of the data. That's understandable. I think we're all familiar with airline systems, for example, which stubbornly refuse to process hyphens and tend to squeeze out spaces.
If you can't process a name as-entered in your system, don't position it as a problem with the user's name.
John Graham-Cumming wrote in a 2010 blog post about encountering this issue: "What they actually meant is: our web site will not accept that hyphen in your last name. But do they say that? No, of course not. They decide to shove in my face the claim that there's something wrong with my name." If it's their name, it can't be invalid.
The best thing you can do is admit the system limitations in your validation messages (Graham-Cumming suggests "Our system is unable to process last names that contain non-letters, please replace them with spaces", for example) and continue to work on any improvements you can make so that more names can be correctly processed.
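As a sketch of what that might look like in code, assume a hypothetical legacy backend that can only store unaccented Latin letters and spaces in a last name. The pattern and function name here are invented for illustration; the message wording is the one Graham-Cumming suggests.

```python
import re
from typing import Optional

# Assumption: a hypothetical legacy backend that can only store
# unaccented Latin letters and spaces in a last name.
LEGACY_STORABLE = re.compile(r"[A-Za-z ]+")

# Wording suggested by John Graham-Cumming. It describes the system's
# limits and never calls the name itself invalid.
LIMITATION_MESSAGE = (
    "Our system is unable to process last names that contain "
    "non-letters, please replace them with spaces"
)


def legacy_storage_problem(last_name: str) -> Optional[str]:
    """Return a message about the system's limits, or None if storable."""
    if LEGACY_STORABLE.fullmatch(last_name):
        return None
    return LIMITATION_MESSAGE
```

With this check, "Graham-Cumming" gets the message about the system, and "Graham Cumming" goes through. It's still a workaround, not a fix, but at least the blame lands on the software.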
- "Regular Expression for Validating Names and Surnames," Stack Overflow
- "Your Last Name Contains Invalid Characters," John Graham-Cumming
- "Representing People's Names in Dublin Core," Andrew Waugh
- "Wookey - is that it?" Wookey
- "Falsehoods Programmers Believe About Names," Patrick McKenzie