Discussion on: Let's stop using [a-zA-Z]+


Great post! Unfortunately, \p{Letter}+ on its own still fails for some languages. For example, Burmese:

const burmese = 'မြန်မာဘာသာ'

;/^\p{Letter}+$/u.test(burmese) // false

Or even certain normalized representations of "Lüdenscheid":

// visually the same, but NFD splits "ü" into two code points: "u" plus a combining diaeresis (U+0308)
const town = 'Lüdenscheid'.normalize('NFD')

;/^\p{Letter}+$/u.test(town) // false
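To see which characters trip up the pattern, it can help to list the code points that are not in the Letter category. The snippet below is only a rough sketch; nonLetterCodePoints is a hypothetical helper name, not something from the post:

```javascript
// Hypothetical helper (not from the post): returns the code points
// in `text` that do not match \p{Letter}, formatted as U+XXXX.
function nonLetterCodePoints(text) {
  // Spreading a string iterates by code point, not by UTF-16 unit
  return [...text]
    .filter((ch) => !/^\p{Letter}$/u.test(ch))
    .map((ch) => 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'))
}

// "u" followed by the combining diaeresis, i.e. the NFD form of "ü"
nonLetterCodePoints('Lu\u0308denscheid') // ['U+0308']
```

Running the same helper on the Burmese string should list its combining signs in the same way.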

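Both fail because combining characters, such as Burmese vowel signs and the combining diaeresis, belong to the Unicode Mark category rather than Letter. You can get around this by also allowing the Mark category: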

const regex = /^[\p{Letter}\p{Mark}]+$/u

regex.test(burmese) // true
regex.test(town) // true
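One might wonder whether normalizing the input to NFC would be enough instead. As far as I can tell it only helps when a precomposed character exists, as the sketch below suggests (the variable names are mine, chosen for illustration):

```javascript
// Same patterns as above, under illustrative names
const lettersOnly = /^\p{Letter}+$/u
const lettersAndMarks = /^[\p{Letter}\p{Mark}]+$/u

// Decomposed form: "u" followed by U+0308 (combining diaeresis)
const decomposed = 'Lu\u0308denscheid'
// NFC recombines the pair into the single code point U+00FC ("ü")
const composed = decomposed.normalize('NFC')

lettersOnly.test(decomposed) // false
lettersOnly.test(composed) // true

// The combining signs in this Burmese word have no precomposed forms,
// so NFC leaves them in place and the Mark category is still needed
const burmeseNfc = 'မြန်မာဘာသာ'.normalize('NFC')

lettersOnly.test(burmeseNfc) // false
lettersAndMarks.test(burmeseNfc) // true
```

So NFC normalization can complement the Mark-aware pattern, but it does not replace it.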

Till Sanders • Mar 9 '21 • Edited

Good remark! I will update the post to include this 👍