DEV Community

Vee Satayamas

The advantages of various types of strings in Rust

New Rustaceans often ask why Rust has so many string types. I don't know the original intention. Anyway, having several types has some advantages, as follows:

  1. It keeps the language core small. The core language has only str. String, OsString, and CString live in the standard library, which is not part of the language core.

  2. The programmer can control performance. For example, &str is a borrowed view of UTF-8 bytes that already exist somewhere, so creating or passing one does not allocate. String is a wrapper around Vec<u8> that owns its buffer, so creating or growing one pays for heap allocation.

  3. OsString and CString improve interoperability with other programming languages and operating systems. Rust can manipulate C-style strings directly without converting them to native Rust strings. The same goes for OsString and strings that come from the operating system.
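As a rough illustration of points 2 and 3, here is a minimal sketch that touches each type once. The function names (first_word, shout, make_c_string, c_byte_len, file_name_of) are made up for this example; only the standard library types and methods are real.

```rust
use std::ffi::{CStr, CString, OsString};
use std::path::Path;

/// Borrowing: returns a view into the input, so nothing is allocated.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

/// Owning: builds a new String, which allocates a heap buffer.
fn shout(text: &str) -> String {
    let mut out = text.to_uppercase();
    out.push('!');
    out
}

/// C interop: CString owns a nul-terminated buffer.
/// Returns None if the text contains an interior 0 byte.
fn make_c_string(text: &str) -> Option<CString> {
    CString::new(text).ok()
}

/// C interop: &CStr borrows a nul-terminated buffer without copying it.
/// The length here excludes the trailing nul.
fn c_byte_len(text: &CStr) -> usize {
    text.to_bytes().len()
}

/// OS interop: file names are OsStr/OsString because the operating
/// system does not promise that they are valid UTF-8.
fn file_name_of(path: &str) -> Option<OsString> {
    Path::new(path).file_name().map(|name| name.to_os_string())
}
```

Only shout, make_c_string, and file_name_of return owned values, so those are the calls where an allocation shows up in the signature.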

String conversion has a performance penalty. Also, the right conversion is sometimes not obvious, because some systems don't use Unicode or any other standard encoding.
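A small sketch of why conversion is not always obvious: each conversion below can fail or lose information, and the return type says so. The wrapper function names are hypothetical; the standard library calls inside them are real.

```rust
use std::borrow::Cow;
use std::ffi::{CString, NulError, OsStr};
use std::str::Utf8Error;

/// Bytes to &str: checks the whole slice and fails on invalid UTF-8.
fn bytes_to_str(bytes: &[u8]) -> Result<&str, Utf8Error> {
    std::str::from_utf8(bytes)
}

/// &str to CString: copies the text and fails on an interior 0 byte.
fn str_to_c(text: &str) -> Result<CString, NulError> {
    CString::new(text)
}

/// OsStr to &str: returns None if the OS string is not valid Unicode.
fn os_to_str(raw: &OsStr) -> Option<&str> {
    raw.to_str()
}

/// OsStr to text, lossy: never fails, but may replace invalid
/// sequences with U+FFFD and allocate a new String to do so.
fn os_to_text_lossy(raw: &OsStr) -> Cow<'_, str> {
    raw.to_string_lossy()
}
```

The checked versions make the caller decide what to do with bad input, while the lossy version trades correctness for convenience.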

The alternative to having many types is treating everything as a byte slice. However, type checking doesn't work well with too few types, because the compiler cannot tell one kind of string from another.

Discussion (3)

DrsEnsor

Most of the time you probably want CString instead of String for handling user input. Unlike String, a CString cannot contain interior 0 bytes ("nul characters"): CString::new returns an error if the input has one.

I'm still looking for a way to remove all non-printable chars from a String (UTF-8).

Vee Satayamas Author

Is a non-printable character a control character?

DrsEnsor

I think what I mean is non-graphic characters, which consist of the Unicode "Other" categories (Cc, Cf, Cs, Co, Cn) and the line and paragraph separators (Zl, Zp).

ref: Unicode General Category
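
For what it's worth, here is a partial sketch of that filter using only the standard library. It assumes that covering Cc, Zl, and Zp is enough: char::is_control matches category Cc, and Zl and Zp each contain a single code point (U+2028 and U+2029). It does not cover Cf, Co, or Cn, which would need Unicode data tables that std does not expose, and Cs (surrogates) cannot occur in a Rust char at all. The names is_non_graphic and strip_non_graphic are made up for this example.

```rust
/// Partial check: true for Cc (control), Zl (U+2028), and Zp (U+2029).
/// Assumption: Cf, Co, and Cn are deliberately NOT handled here.
fn is_non_graphic(c: char) -> bool {
    c.is_control() || c == '\u{2028}' || c == '\u{2029}'
}

/// Returns a copy of the text without the characters matched above.
fn strip_non_graphic(text: &str) -> String {
    text.chars().filter(|c| !is_non_graphic(*c)).collect()
}
```

Note that tab and newline are control characters too, so this removes them as well; add an exception in is_non_graphic if they should stay.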