DEV Community

Cover image for Escaping characters in C# strings
Carl-Hugo Marcotte
Carl-Hugo Marcotte

Posted on • Originally published at forevolve.com

Escaping characters in C# strings

In this article, we look at escaping characters in C# strings.
But what is escaping, you may wonder?
That's how we write special characters or characters that would otherwise be impossible to include in a string, like ".

This article is part of a learn programming series where you need no prior knowledge of programming.
If you want to learn how to program and want to learn it using .NET/C#, this is the right place.
I suggest reading the whole series in order, starting with Creating your first .NET/C# program, but that's not mandatory.

This article is part of a sub-series, starting with Introduction to string concatenation.
It is not mandatory to read all articles in order, but I strongly recommend it, especially if you are a beginner.
If you are already reading the whole series in order, please discard this word of advice.

Escaping characters

As mentioned in the introduction, the action of escaping characters means you write an "escape sequence" to represent specific characters.
An escape sequence starts with a \. It is followed by a character or a series of numbers representing the Unicode character code.

Here are a few examples:

using System;

var quotes = "Wrap \"a few words\" with quotes.";
var newLine = "Add a\nnew line.";
var tab = "\tAdd a tab at the begining of the string.";
var e = "This is the letter e: \u0065";

Console.WriteLine(quotes);
Console.WriteLine(newLine);
Console.WriteLine(tab);
Console.WriteLine(e);
Enter fullscreen mode Exit fullscreen mode

When we run the previous code, we obtain the following result:

Code sample's console output

If we take a closer look:

  • We can write \" to insert a " character inside a string, which would be impossible otherwise.
  • We can use the special \n to insert a new line (see below for more special characters).
  • We can use the special \t to insert a horizontal tab (see below for more special characters).
  • We can use the Unicode representation of a character, prefixed by \u, to insert that character into a string (UTF-16 or UTF-32).

In my opinion, the most helpful special characters are:

Escape sequence Character name Example
\' Single quote Not related to string but to char, like
var q = '\''; (not covered yet).
\" Double quote See preceding example.
\\ Backslash var s = "c:\\folder\\some-file.ext";
\n New line See preceding example.
\r Carriage return Read the OS specific line break section
\t Horizontal tab See preceding example.
\u0000 Unicode escape sequence (UTF-16) To write the character e: \u0065
\U00000000 Unicode escape sequence (UTF-32) To write the character e: \U00000065
\x0[0][0][0] Unicode escape sequence (UTF-16) with variable length To write the character e: \x65

See String Escape Sequences for some more characters.

Next, we explore line break.

Platforms-specific line break

In case you did not know, Windows line breaks are different from Unix-based platforms.
Windows uses a sequence of both carriage return and new line (\r\n) while Unix uses only a new line (\n).
Windows is pretty forgiving and will most likely understand \n.
Nevertheless, in a cross-platform app targeting both Windows and Unix (e.g., Linux and Mac), do you really want to leave line breaks to chance?
To save the day, we can use the Environment.NewLine property instead, which adapts to the current OS.

Here is an example (same output as the newLine variable of the preceding example):

using System;
Console.WriteLine($"Add a{Environment.NewLine}new line."); // We leveraged interpolation here
Enter fullscreen mode Exit fullscreen mode

If we have many line breaks, this can become tedious.
Hopefully, we can simplify this by leveraging the string.Format method we cover next.

string.Format

We can use the string.Format method to format a string.
It is similar to interpolation, but instead of inserting the variables directly, we need to define numerical tokens, like {0}, {1}, {2}, and so forth.
Then, we pass arguments that match those tokens in the correct order to get the final result.

Many other .NET methods allow such a construct, including Console.WriteLine.
Here is an example of both string.Format and Console.WriteLine:

using System;
var newLine = string.Format("Add a{0}new line.", Environment.NewLine);
Console.WriteLine(newLine);
Console.WriteLine("Add a{0}new line.", Environment.NewLine);
Enter fullscreen mode Exit fullscreen mode

When running the code, both Console.WriteLine output the same result.
The advantage here is writing {0} every time we need a line break.
Of course, we could use interpolation with {Environment.NewLine} directly, but it makes the string longer.
Even worst, we could use concatenation like "Add a" + Environment.NewLine + "new line."; I find this to be the hardest code to read, making the string even longer.

But how does that work?
In the background, the framework replaces the numerical tokens with the specified arguments.

There are more to string.Format, including the possibility to format your values, but that's getting more and more out of the scope of our main subject.
Next, we look at escaping characters in interpolated strings.

Interpolation-specific escaping

In an interpolated string, we use the characters { and } to wrap the element to insert, like a variable.
What if we want to write those characters?
It is as easy as doubling them.
For {, we write {{.
For }, we write }}.

Here is an example:

var name = "Darth Vader";
var greetings = $"{{Hello {name}}}";
Console.WriteLine(greetings);
Enter fullscreen mode Exit fullscreen mode

As simple as that, the preceding code will output {Hello Darth Vader}.
Next, let's have a look at the verbatim identifier (a.k.a. multiline strings).

The verbatim identifier

The verbatim identifier (@) allows us to write multiline strings, but it's not all.
It also disables the escaping of characters, which can be very handy when writing file paths (amongst other things).
In other words, a backslash is just a backslash, not a special character that starts an escape sequence.
For example, it is more convenient to write var path = @"c:\folder\some-file.ext"; than var path = "c:\\folder\\some-file.ext"; (simple vs double backslash).

Since \[character] does not work, how can we escape "?
Like with interpolation, we only have to double it, like var v = @"Some ""quoted words"" here.";

The verbatim identifier can also be used to escape reserved keywords, like if, for, public, etc.
For example, if you'd like to create a variable named public, you can't write var public = "some value"; because this won't compile.
However, you could prefix your variable name with @ to make it work, like this: var @public = "some value";.
Using this makes your code a little harder to read, but that's a trick that can come in very handy sometimes.

More info: one place where it was handy was to create class attributes in an old ASP.NET MVC version because class is a reserved keyword: we had to escape it using the verbatim identifier.

Reference: if you want to know more about reserved keywords, please visit the C# Keywords page in the official documentation.

Next, it's your turn to try it out.

Exercise

In this exercise, you will write a program that writes code.

Unfortunately, I was not able to recreate the exercise on this platform, so please look at the exercise on the original post on my blog. I'm sorry for the inconvenience.

Conclusion

In this article, we explored the string type a little more.
We discovered that the backslash character (\) is used to escape characters or to create special characters like tabs (\t) and line breaks (\r and \n).
We then learned about the Environment.NewLine property that gives us the correct platforms-specific line break.
We also peaked at string.Format to format strings using numerical tokens instead of interpolation.

Afterward, we explored some edge cases to escape { and } in interpolated strings and " in multiline strings.
The solution was to double the characters; a.k.a. write "" to obtain ".
We saw that in @"..." strings, the escape character (\) does not work, which is an advantage that can become a disadvantage.
We finally learned to use the verbatim identifier (@) to bypass the reserved keywords limitation and use it as an escape character for identifiers.

All in all, that's a lot of new content that concludes our introduction to strings mini-sub-series.

Next step

It is now time to move to the next article: Introduction to boolean algebra which is coming soon. Stay tuned by following me on dev.to, Twitter, or other places you can find me.
You can look at my blog contact page for more info.

Top comments (0)