My point is that the terminology around what a "character" is has gotten so confusing that we should just stick to well-defined terms like "code point" and "grapheme". "Character" is sometimes confused with one or other of those (or something else entirely) and so I don't think it's a good name for a data type.
If you want to loop over "characters" in a string, you should loop over code points (which are composed of between 1-4 bytes). But why should someone ever want to loop over the individual bytes of a code point? This functionality could be provided, but not at the expense of clarity.
For further actions, you may consider blocking this person and/or reporting abuse
We're a place where coders share, stay up-to-date and grow their careers.
It is, UTF-8 can carry up to 4 bytes of information.
My point is that the terminology around what a "character" is has gotten so confusing that we should just stick to well-defined terms like "code point" and "grapheme". "Character" is sometimes confused with one or other of those (or something else entirely) and so I don't think it's a good name for a data type.
If you want to loop over "characters" in a string, you should loop over code points (which are composed of between 1-4 bytes). But why should someone ever want to loop over the individual bytes of a code point? This functionality could be provided, but not at the expense of clarity.