Discussion on: Let's stop using [a-zA-Z]+

lionel-rowe • Edited on

Great post! But unfortunately, this still fails for some languages. For example, Burmese:

const burmese = 'မြန်မာဘာသာ'

;/^\p{Letter}+$/u.test(burmese) // false

Or even certain normalized representations of "Lüdenscheid":

// visually the same, but NFD splits "ü" into two code points: "u" plus a combining diaeresis (U+0308)
const town = 'Lüdenscheid'.normalize('NFD')

;/^\p{Letter}+$/u.test(town) // false
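
To see why the decomposed form fails, it can help to label each code point by category. This is only a rough sketch: `classify` is a hypothetical helper name, not something from the post, and it only distinguishes Letter, Mark, and everything else.

```javascript
// Hypothetical helper (an assumption for illustration): label each code point
// of a string as Letter, Mark, or Other using Unicode property escapes.
function classify(str) {
  return Array.from(str, (ch) => {
    const hex = 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0')
    const kind = /\p{Letter}/u.test(ch)
      ? 'Letter'
      : /\p{Mark}/u.test(ch)
        ? 'Mark'
        : 'Other'
    return `${hex} ${kind}`
  })
}

// \u00FC is the precomposed "ü"; NFD decomposes it into "u" + U+0308
const decomposed = 'L\u00FCdenscheid'.normalize('NFD')
const firstThree = classify(decomposed).slice(0, 3)
// ['U+004C Letter', 'U+0075 Letter', 'U+0308 Mark']
```

The combining diaeresis is a Mark rather than a Letter, so a pattern that only allows \p{Letter} rejects the whole string.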

You can get around this by also allowing the Mark category:

const regex = /^[\p{Letter}\p{Mark}]+$/u

regex.test(burmese) // true
regex.test(town) // true
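
If you want to reuse this, it could be wrapped in a small helper. The sketch below assumes a hypothetical function name, `isWord`, and uses the same Letter-plus-Mark pattern; treat it as one possible shape rather than a complete validator.

```javascript
// Same pattern as above: one or more code points that are Letters or Marks.
const lettersAndMarks = /^[\p{Letter}\p{Mark}]+$/u

// Hypothetical helper name, chosen only for this example.
function isWord(input) {
  return lettersAndMarks.test(input)
}

isWord('မြန်မာဘာသာ') // true
isWord('Lüdenscheid'.normalize('NFD')) // true
isWord('Lüdenscheid'.normalize('NFC')) // true
isWord('abc123') // false: digits are neither Letter nor Mark
isWord('') // false: the + quantifier needs at least one code point
```

One thing to keep in mind: because Marks are allowed anywhere, a string made up only of combining marks (for example '\u0308') also passes.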
 
Till Sanders Author • Edited on

Good remark! I will update the post to include this 👍