Don't use String.ToLower() in C# when comparing strings

#dotnet #performance #csharp #programming

When comparing strings, often the casing should be ignored in the comparison. Most of the times, this is required because every user writes text different. I do not really care when there is “Apple” or “apple” in a dataset. To me, this is the same. On the other side, “A” and “a” are different characters for computers.

To overcome this problem, a working and very wide-spread approach is to first lowercase (or uppercase) all letters and then compare them. Most of us done it, me too - but no longer. Although there is a drawback you should keep in mind when it comes to performance.

I’ve created a litte benchmark program an I’ll evaluate the results afterwards. First I’ll explain the benchmark setup.

Benchmark setup

I’ve evaluated three types of string equality comparison.

Strings which are equal (except casing)
Strings which are not equal, differing in the last character
Strings which are not equal, differing in the first character

These three test-cases were applied to string with lengths of 10, 100 and 1000 characters

My benchmark code looks like the below code snippet, where StringToCheck and VerifyString are generated.

[Benchmark]
public void StringComparison_Equals()
{
    var result = StringToCheck.Equals(VerifyString, StringComparison.CurrentCultureIgnoreCase);
}

[Benchmark]
public void StringComparison_EqualityOperator()
{
    var result = StringToCheck.ToLower() == VerifyString.ToLower();
}

Benchmark results

I've used BenchmarkDotNet to run my test and enable the MemoryDiagnoser to also anaylse memory usage/allocations.

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19044.1526 (21H2)
Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R), 1 CPU, 8 logical and 4 physical cores
.NET SDK=6.0.101
  [Host]     : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT
  DefaultJob : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT


|                                    Method | Length |        Mean |     Error |    StdDev |      Median |  Gen 0 | Allocated |
|------------------------------------------ |------- |------------:|----------:|----------:|------------:|-------:|----------:|
|                   StringComparison_Equals |     10 |    55.42 ns |  1.766 ns |  5.180 ns |    55.51 ns |      - |         - |
|           StringComparison_Equals_b_Start |     10 |    46.70 ns |  0.973 ns |  1.965 ns |    46.41 ns |      - |         - |
|             StringComparison_Equals_b_End |     10 |    81.11 ns |  1.815 ns |  5.236 ns |    80.42 ns |      - |         - |
|         StringComparison_EqualityOperator |     10 |    79.07 ns |  2.444 ns |  7.168 ns |    77.24 ns | 0.0229 |      96 B |
| StringComparison_EqualityOperator_b_Start |     10 |    80.76 ns |  2.589 ns |  7.551 ns |    79.09 ns | 0.0229 |      96 B |
|   StringComparison_EqualityOperator_b_End |     10 |    86.48 ns |  2.763 ns |  8.016 ns |    85.40 ns | 0.0229 |      96 B |
|                   StringComparison_Equals |    100 |   131.51 ns |  2.654 ns |  7.485 ns |   128.91 ns |      - |         - |
|           StringComparison_Equals_b_Start |    100 |    51.06 ns |  1.427 ns |  4.162 ns |    49.94 ns |      - |         - |
|             StringComparison_Equals_b_End |    100 |   410.15 ns | 10.802 ns | 31.849 ns |   403.00 ns |      - |         - |
|         StringComparison_EqualityOperator |    100 |   246.99 ns |  4.642 ns |  4.342 ns |   246.59 ns | 0.1070 |     448 B |
| StringComparison_EqualityOperator_b_Start |    100 |   243.33 ns |  2.462 ns |  2.056 ns |   243.46 ns | 0.1070 |     448 B |
|   StringComparison_EqualityOperator_b_End |    100 |   260.33 ns |  5.702 ns | 16.176 ns |   253.08 ns | 0.1070 |     448 B |
|                   StringComparison_Equals |   1000 |   878.83 ns |  7.949 ns |  7.047 ns |   878.38 ns |      - |         - |
|           StringComparison_Equals_b_Start |   1000 |    46.50 ns |  0.944 ns |  0.883 ns |    46.13 ns |      - |         - |
|             StringComparison_Equals_b_End |   1000 | 3,291.64 ns | 27.859 ns | 26.060 ns | 3,283.96 ns |      - |         - |
|         StringComparison_EqualityOperator |   1000 | 2,001.56 ns | 31.459 ns | 27.888 ns | 1,993.78 ns | 0.9670 |   4,048 B |
| StringComparison_EqualityOperator_b_Start |   1000 | 1,966.92 ns | 32.515 ns | 28.824 ns | 1,967.16 ns | 0.9670 |   4,048 B |
|   StringComparison_EqualityOperator_b_End |   1000 | 2,002.98 ns | 39.681 ns | 30.980 ns | 1,997.75 ns | 0.9670 |   4,048 B |

As you can seee in the above table, the Equals method is sometimes faster, but does not allocate any memory.

The biggest difference is, when the strings differ in the first character. It takes about the same time to compare string with 10, 100 and 1000 if they differ at the start. In contrast to that, with EqualityOperator this cannot be seen.

For the version with EqualityOperator it takes the same time, no matter where the difference is. This is not true for the Equals, this gets faster, the earlier the difference is. This is probably only because of the ToLower() in the first place.

In addition, memory is allocated when using EqualityOperator with ToLower(). That's because of a new instance with all lowercased letters has to be created. See: String.ToLower Method - Microsoft Docs

Additional memory allocations are bad because the garbage collector needs to clean them up again, which will cost valuable CPU time and therefore slows down the application

Summary

Don't use ToLower() when comparing strings (neither use ToUpper()). This also applies to related string operations like StartsWith(), EndsWith(), Contains() and so on. I also interpret this as a code smell, because the available language features are not used. In addition, some IDE's even detect this smell.

As always, the code can be found at my GitHub. There you can also find the benchmark report in CSV format.

Top comments (9)

David Kröll • Mar 25 '22

Yes of course, that's why I've written Don't use ... when ...

Eric Conklin • Mar 17 '22

So what do you suggest? Invariant culture ignore case?

David Kröll • Mar 17 '22

Depends on your use-case, but never seen a difference in these two. Actually I don't know the difference.

SmartmanApps • Dec 2 '24

Interesting. Related tip I got a while ago from DotnetDave, StartsWith() and EndsWith() are both faster than Contains()

Ezekiel Juel • Mar 17 '22

Do you have a similar trick for switch statements to ignore case?

David Kröll • Mar 17 '22

I'm not sure if this works. I'm not that familiar with how switch statements actually compare the things.
In fact, during lowering (a step in compilation) the switch is transformed to an nested if, in some cases to a binary search tree to optimize for efficiency.

SmartmanApps • Dec 2 '24

From what I've seen switch only works with actual strings - an example of something that DOES work is nameof(thing), but that's because the result is a string - so I always just use ToLower() as the argument I'm passing into a switch. The other option is if you KNOW where the caps might be, just create a fall-through for it...

case "ThisThing":
case "thisthing":
dostuff;
break;