I’ve been using C# and .NET for over a decade now 🙄. I’ve worked with developers professionally in private organizations and out "in the open" within the open-source world.
I thought I would compile some of the most common ".NET blunders" that I personally have seen time and time again.
And just to make it clear: these are all things I had to learn and still have to keep up to date on too!
#1: Async/Await…
I'll start by saying that I see C# `async` misunderstood and misused all the time. I've seen very senior, experienced, skilled and knowledgeable C# developers do it. I've seen myself do it too!
Based on my experience, I'd say that C#'s asynchronous tooling and syntax do not lead developers into the "pit of success."
The official docs say:
C# has a language-level asynchronous programming model, which allows for easily writing asynchronous code without having to juggle callbacks or conform to a library that supports asynchrony.
Sure.
There are great resources on this topic 1, 2.
I think the fact that these exist highlights that it's not so straightforward.
Take a look at the comments on this recent HackerNews thread about C# async/await.
In any event, here’s a quick-fire list of specific questions I’ve often had from other developers when digging into C# asynchronous code. I’ll try to address each one very briefly.
`async void`: doesn't that just support asynchronous code so the caller can "fire-and-forget"?
No. It’s really bad. Never do it.
There are all sorts of issues. An exception in an `async void` method can crash your entire application, for example.
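If a method needs to run asynchronously, a minimal alternative is to return a `Task` instead, so exceptions stay observable by the caller (a sketch only; `DoWorkAsync` is a made-up name):

```csharp
// async Task instead of async void: any exception is captured in the
// returned Task rather than being re-thrown onto the thread pool.
public async Task DoWorkAsync()
{
    await Task.Delay(100);
    // ... real work here ...
}

// Caller: await it (or at least store the Task and observe it later).
await DoWorkAsync();
```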
Can't I just put a synchronous call inside `Task.Run` to make it asynchronous?
```csharp
var myVar = await Task.Run(async () =>
{
    doSynchronousThing();
    return Task.CompletedTask;
});
```
No. It doesn’t.
This doesn’t do any asynchronous work and could force the use of an extra thread depending on your context, system load, etc.
Note: This code will usually trigger a warning in your IDE because there's no `await` used inside the async delegate. But you can do this, and I keep seeing it.
Can't I use `Task.Run` to make really performant parallelized code?
No.
Again, asynchronous code is not the same as parallelized code.
A bit of a tangent maybe, but related… I’ve seen the following code many times:
```csharp
var myTasks = new List<Task>();

myTasks.Add(Task.Run(async () => {
    doSynchronousThing1();
}));

myTasks.Add(Task.Run(async () => {
    doSynchronousThing2();
}));

await Task.WhenAll(myTasks);
```
There are a few issues here:
- Notice that inside the delegate for each `Task.Run` we aren't returning `Task.CompletedTask`. This makes that delegate's signature `async void` (see the point above on `async void`).
- This has the same issue as the point above about using `Task.Run` to force asynchronicity (i.e. it doesn't).
In the end, this code doesn't do any asynchronous work and can cause some thread context switching (i.e. there's unnecessary overhead).
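For contrast, here's a minimal sketch of genuinely asynchronous, concurrent work, assuming the two operations have real async versions (the `DoThing1Async`/`DoThing2Async` names are made up):

```csharp
// No Task.Run needed: the methods return Tasks themselves, so nothing
// has to occupy a thread-pool thread while the operations are in flight.
var task1 = DoThing1Async();
var task2 = DoThing2Async();

await Task.WhenAll(task1, task2);
```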
If I don't `await` a `Task`, doesn't that make my method way more performant?
```csharp
public Task DoAsynchronousThing()
{
    var myTask = someOtherAsyncStuff();
    return myTask;
}
```
To keep this simple – just use `async` on all your asynchronous methods. You benefit more than trying to be smart with a feature that developers keep getting wrong.
David Fowler’s Async Guidance:
There are benefits to using the async/await keyword instead of directly returning the Task…
Note: Depending on where/how you do this, you might get some IDE warnings. Heed the warnings.
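A classic example of how eliding async/await goes wrong (adapted from that guidance; the connection string field and query helper here are made up) is returning a `Task` from inside a `using` block:

```csharp
// BUG: the connection is disposed as soon as the method returns,
// which is typically before the query has actually finished.
public Task<int> GetCountAsync()
{
    using var connection = new SqlConnection(_connectionString);
    return QueryCountAsync(connection);
}

// With async/await, the method doesn't complete until the query does,
// so the connection stays alive for the whole operation.
public async Task<int> GetCountSafeAsync()
{
    using var connection = new SqlConnection(_connectionString);
    return await QueryCountAsync(connection);
}
```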
Isn't `Task.Factory.StartNew` the same as `Task.Run`?
Create a new empty .NET console application using `dotnet new console -o my-test`. Replace `Program.cs` with the following:
```csharp
using System.Diagnostics;

var watch = new Stopwatch();
watch.Start();

var tasks = new List<Task>();
for (var i = 0; i < 5; i++)
{
    tasks.Add(Task.Factory.StartNew(async () =>
    {
        await DoDelay();
    }));
}

await Task.WhenAll(tasks);

Console.WriteLine($"The program is closing... it took {watch.ElapsedMilliseconds} ms to run!!");

async Task DoDelay()
{
    await Task.Delay(10000);
}
```
How long will this program take to execute: more than 10 seconds? Try it.
Spoiler: it finishes in milliseconds. Did you expect that? `Task.Factory.StartNew` doesn't understand async delegates: it returns a `Task<Task>`, and the outer task completes as soon as the delegate hits its first `await`, so `Task.WhenAll` never actually waits for the delays.
Here are a couple of articles addressing this in detail (1, 2).
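If you actually want this program to take the full ten seconds, a minimal fix is to either use `Task.Run` (which understands async delegates) or unwrap the `Task<Task>` that `StartNew` hands back:

```csharp
// Option 1: Task.Run returns the inner Task directly.
tasks.Add(Task.Run(async () => await DoDelay()));

// Option 2: Unwrap() turns the Task<Task> from StartNew into a Task that
// completes only when the inner delay has finished.
tasks.Add(Task.Factory.StartNew(async () => await DoDelay()).Unwrap());
```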
#2: Parallel Processing While Handling HTTP Requests
Here’s a silly example of an HTTP endpoint to show what I’m getting at:
```csharp
public IActionResult PostUpdateUserName(string userName)
{
    int existingNameFound = 0;
    var existingNames = GetExistingUserNames();

    Parallel.ForEach(existingNames, name =>
    {
        if (name == userName)
        {
            Interlocked.Increment(ref existingNameFound);
        }
    });

    // Do something with existingNameFound > 0
    return Ok();
}
```
Using an asynchronous model to process HTTP requests is not the same as parallel processing.
With asynchronous code, you are trying to make your .NET threads do as little work as possible. For web applications specifically, this keeps your applications responsive, capable of serving many requests concurrently, and therefore able to scale well.
Parallel code is the opposite: it uses and keeps hold of threads in order to do CPU-intensive work like heavy calculations. This is usually done with the `Parallel` static class.
I’ve seen parallel code in web applications being performed during an HTTP request. Usually, the code performing these heavy calculations wasn’t inside MVC controllers but farther down the chain of execution.
This is bad because parallel processing hogs threads that otherwise should be used sparingly to respond to HTTP requests.
If you need to use parallel processing, don't do it during an HTTP request: it reduces the scalability and throughput of your web server overall, and it can increase memory pressure on your web process/server too.
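If you genuinely have CPU-heavy work that's triggered by a request, one common pattern is to queue it and process it outside the request path. Here's a rough sketch using a `Channel` and a hosted background service; the `WorkItem` type and `DoHeavyCalculation` method are placeholders:

```csharp
using System.Threading.Channels;
using Microsoft.Extensions.Hosting;

public record WorkItem(int Id);

// Registered as a singleton: controllers enqueue work, the worker drains it.
public class WorkQueue
{
    private readonly Channel<WorkItem> _channel = Channel.CreateUnbounded<WorkItem>();

    public ValueTask EnqueueAsync(WorkItem item) => _channel.Writer.WriteAsync(item);

    public IAsyncEnumerable<WorkItem> ReadAllAsync(CancellationToken ct) =>
        _channel.Reader.ReadAllAsync(ct);
}

// Runs outside the HTTP request path, so heavy CPU work here doesn't
// hog the threads that should be serving requests.
public class WorkProcessor : BackgroundService
{
    private readonly WorkQueue _queue;

    public WorkProcessor(WorkQueue queue) => _queue = queue;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        await foreach (var item in _queue.ReadAllAsync(stoppingToken))
        {
            DoHeavyCalculation(item);
        }
    }

    // Placeholder for whatever CPU-bound work you actually need to do.
    private static void DoHeavyCalculation(WorkItem item) { }
}
```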
#3: Trying To Self-Manage Database Connections
Have you ever seen code where an `SqlConnection` is passed around as an argument and then used like this?
```csharp
SomeData DoDatabaseThing(SqlConnection connection)
{
    if (connection.State != ConnectionState.Open)
    {
        connection.Open();
    }

    // Do a bunch of stuff, calling more methods
    // and passing the connection as an argument...

    if (connection.State != ConnectionState.Closed)
    {
        connection.Close();
    }

    // Return data.
}
```
Some say that being defensive and making sure you manage your database connections this way is "good". In my experience, this type of code is an easy source of bugs.
One time, an application I worked on was using this pattern. I had mentioned a few times that this kind of code could lead to issues if we weren't careful…
One day, we found that after a number of hours our web server would crash. We had to manually restart the web server in production every couple of hours… Eventually, I found the source:
The developer in question introduced database code that caused C# code to throw an exception – before trying to close the connection. That exception was then silently caught up the stack! 💣
In other words: the connection was never closed, which caused a memory/connection leak.
The more you try to be "smart" with database connections, the more chances you have to mess things up.
There are 2 things that come to mind on this topic:

- Always put your `SqlConnection` inside of a `using` block (see the sketch below). It will be closed automatically once it's out of scope.
- Under the covers, `SqlConnection` is backed by a pool of connections. That pool is already managed for you, so you're not gaining anything by trying to manage things yourself.
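A minimal sketch of the simpler approach (the connection string field and query helper are placeholders):

```csharp
SomeData DoDatabaseThing()
{
    // Dispose (and therefore Close) always runs, even if an exception is
    // thrown, and "closing" just returns the connection to the pool.
    using var connection = new SqlConnection(_connectionString);
    connection.Open();

    return ReadSomeData(connection);
}
```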
#4: HttpClient
`HttpClient` is a class that does what you think it does. However, this class is not as straightforward as it could (should?) be.
The normal thinking in C# is that any class which implements `IDisposable` should be used within a `using` block. Since `HttpClient` is disposable, we should be able to create (and dispose) a new instance any time we need one, right?
If you check out the current example of using `HttpClient` in the official documentation, you'll see this comment:
`HttpClient` is intended to be instantiated once per application, rather than per-use. See Remarks.
In the "Remarks" section, there’s a link to this article with more information about how this class works.
I've seen systems that had weird network issues and socket exceptions because of misusing `HttpClient`.
The general advice with using `HttpClient` is to use IHttpClientFactory to "create" instances for you.
```csharp
public class MyClass
{
    private readonly IHttpClientFactory _factory;

    public MyClass(IHttpClientFactory factory)
    {
        _factory = factory;
    }

    public async Task DoRequest()
    {
        using var client = _factory.CreateClient();
        // Use `client` now to perform HTTP requests.
    }
}
```
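For this to work, the factory needs to be registered at startup, which in ASP.NET Core is a one-liner:

```csharp
// In Program.cs: registers IHttpClientFactory and its pooled message handlers.
builder.Services.AddHttpClient();
```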
#5: Calling ToList() Unnecessarily
I’ve seen this one a lot. In most situations it’s not a big deal and the side-effects are not felt.
This can happen for many reasons. For example, a developer doesn't realize that they've already called `ToList()` from some other method acting on the same collection.
```csharp
var users = await GetUsers();

Thing1(users);
Thing2(users);

private void Thing1(IEnumerable<User> users)
{
    var list = users.ToList();
    // Do some stuff with list.
}

private void Thing2(IEnumerable<User> users)
{
    var list = users.ToList();
    // Do some stuff with list.
}
```
Again, this is usually not an issue… until it is.
When working with larger collections, performance-sensitive applications or hot paths can be negatively affected.
Calling `ToList()` causes extra memory allocations. With large collections, or when creating a large number of collections overall (or both), this can lead to:
- Using more memory than necessary
- Allocating memory is CPU intensive – so it uses extra CPU resources
- Additional GC pressure – which can cause large slowdowns and pauses
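One simple guard is to materialize the collection once, up front, and make the signatures say so (this mirrors the example above):

```csharp
// Materialize exactly once...
List<User> users = (await GetUsers()).ToList();

Thing1(users);
Thing2(users);

// ...and accept an already-materialized list so there's no reason to copy it again.
private void Thing1(IReadOnlyList<User> users)
{
    // Work with users directly; no extra ToList() allocation.
}
```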
One example from my experience was improving the performance of an application that processed/aggregated millions of data points as quickly as possible. By finding a couple of places where I could remove redundant calls to `ToList()`, I decreased processing time by 10%.
In this case, a run could take 6 days, so that one change alone saved over half a day.
Again, this isn’t necessarily about reducing memory usage – it’s about conserving CPU cycles used to allocate new memory and how often the .NET garbage collector has to run.
If you want HTTP requests that are blazing fast too, then this is one thing to keep an eye out for!
#6: Ignoring Database Fundamentals Because The ORM Will "Take Care Of It"
ORMs are great. They can make the 90% of database usage that typical application code needs straightforward and easy. 90% of the database migrations you'll need can be autogenerated too!
But then you have some aspect of your system that needs something special:
- A reporting query that used to be fast but that now takes down the database when running with that 1 massive customer
- That batch insert/update code path that used to be fine but is now abysmally slow
- A set of dashboard queries that customers have been complaining about
- That new full-text search feature customers have been asking for
What do you do? The more you know about databases – the better.
There’s a place for simplicity, which ORMs are great at. But some situations call for more nuanced solutions.
Here are some areas where it might help you to have some extra knowledge:
- Indexing (see the example after this list):
  - Which tables? Which columns? Clustered vs. non-clustered? B-tree vs. GIN? Etc.
- Beyond basic SQL operators:
  - `MERGE`
  - `CROSS APPLY`
  - CTEs
- PostgreSQL HOT updates
- Full-text search
- Query performance analysis
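As one small, concrete example, adding an index by hand from an EF Core migration (table and column names invented) is the kind of thing the ORM won't always do for you:

```csharp
using Microsoft.EntityFrameworkCore.Migrations;

public partial class AddUserEmailIndex : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        // A non-clustered index to speed up a lookup that's gotten slow.
        migrationBuilder.CreateIndex(
            name: "IX_Users_Email",
            table: "Users",
            column: "Email");
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.DropIndex(name: "IX_Users_Email", table: "Users");
    }
}
```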
#7: Long-running Or Overnight Jobs That Keep Running Out Of Memory!
I've seen this a number of times. It's not complicated, but it's a mistake usually made when the dataset in production is small. Then, over time, the data grows larger and larger and bad data access patterns begin to cause problems.
Imagine an overnight processing job that grabs all the records from a database table and processes them all in memory.
With 10 items to process, this might be fine. But with 10,000 items you’ll start getting memory issues.
My advice: Paginate when dealing with a collection of items where you aren’t sure that it’s going to remain the same size.
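A rough sketch of what that looks like with keyset pagination (EF Core style; the `db.Records` set and `Process` method are placeholders):

```csharp
// ToListAsync() requires Microsoft.EntityFrameworkCore.
const int pageSize = 500;
var lastId = 0;

while (true)
{
    // Only ever hold one page of records in memory at a time.
    var page = await db.Records
        .Where(r => r.Id > lastId)
        .OrderBy(r => r.Id)
        .Take(pageSize)
        .ToListAsync();

    if (page.Count == 0)
    {
        break;
    }

    foreach (var record in page)
    {
        Process(record);
    }

    lastId = page[^1].Id;
}
```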
#8: Creating A Ton Of Different Separate Projects For One Business/Domain Context
I’ve written about this a few times (1, 2, 3). I’m not the biggest fan of clean architecture. I think many of the principles found in the typical presentation of it are solid (no pun intended!), but I tend to disagree with the application of those principles.
In short, we software developers like to complicate things because it’s "more SOLID" or "more X".
For me, simplicity (usually) wins.
Do you need 5 different .NET projects like `Web`, `Core`, `Infrastructure`, `DataAccess`, `Shared`, etc.?
What if, instead, we structured our folders and projects around the capabilities of our business/system?
Incoming heresy: Maybe it’s okay to put database related code inside the same folder as business logic code? They are still in separate files. But now you don’t need to context switch to find your domain code vs. database code. You can still use interfaces. No one is stopping you.
For my more nuanced thoughts, check out ".NET Architecture: How To Structure Your Solutions."
#9: Allowing Unlimited Network Requests To Your APIs
If you build any HTTP API, it will be abused. Sooner or later. Even by those within the same organization!
It could be a rogue/poorly written script or even a malicious attacker.
Not protecting your API with rate limits leaves the door open for unnecessary outages.
Again, don’t trust that consumers will not abuse your API.
For modern .NET developers, there are newer built-in tools to easily add rate limiting to your APIs. Use them!
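For example, in ASP.NET Core 7 and later the built-in rate limiting middleware takes just a few lines to turn on (the limits below are arbitrary):

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

// At most 100 requests per 10-second window for the "fixed" policy.
builder.Services.AddRateLimiter(options =>
    options.AddFixedWindowLimiter("fixed", limiterOptions =>
    {
        limiterOptions.PermitLimit = 100;
        limiterOptions.Window = TimeSpan.FromSeconds(10);
    }));

var app = builder.Build();

app.UseRateLimiter();

app.MapGet("/", () => "Hello!").RequireRateLimiting("fixed");

app.Run();
```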
#10: Using PostAsync Instead Of PostAsJsonAsync
The newer `PostAsJsonAsync` extension method on `HttpClient` should be your "go to" for making typical HTTP POST requests.
Behind the scenes, this method uses some fancy memory pooling, etc. to keep its memory allocations minimal.
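A minimal before/after, assuming a hypothetical `user` object being posted:

```csharp
// Before: serialize by hand (System.Text.Json), allocating an intermediate
// string and a StringContent.
var json = JsonSerializer.Serialize(user);
var response = await client.PostAsync(
    "api/users", new StringContent(json, Encoding.UTF8, "application/json"));

// After: PostAsJsonAsync (System.Net.Http.Json) serializes straight into the
// request content with far fewer intermediate allocations.
var response2 = await client.PostAsJsonAsync("api/users", user);
```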
A much better performance profile can mean things like using a smaller cloud instance size – which reduces cost and complaints from the DevOps people 🤣. If you are building a sharable library, your code is also less of a performance burden on the consuming code.
All-in-all, it’s a quick win!
#11: Building Proprietary "Frameworks" And Wrappers Around Built-in .NET Functionality
Anytime I see the name "framework" inside source code that’s not part of .NET I raise an eyebrow.
I’ve seen so many custom frameworks: custom UI frameworks, custom HTTP routing, custom HTML renderers, custom data access frameworks, etc.
Internal frameworks that make working with HTTP requests, routing, data binders, etc. "easier" are bound to cause awkward situations:
- The framework wasn’t supposed to be used everywhere. But now it is.
- The framework overrides built-in .NET functionality, leading to a net restriction of available functionality instead of a net increase.
- New (to the org) developers have to (re)learn all this stuff
- There’s usually no documentation. If there is, it’s probably terrible.
- What if the 1 person who wrote most of the framework leaves the company?
In my experience, most things that deviate from standard .NET functionality that's existed for years are (usually) a can of worms. Or a Pandora's box. Something like that.
#12: Using Stored Procedures Everywhere
Back in the day (like over 10 years ago…) stored procedures were considered by many as a silver bullet.
They are more secure than parameterized SQL statements from application code, more performant, easier to understand, easier to test, etc.
So they said.
Today, I think that the consensus (and truth?) is that none of these points are true:
- Parameterized SQL statements are just as safe.
- 99% of queries you need to do are just as performant.
- Stored procedures are harder to (source) version.
- The size and scope of stored procedures (in my experience) have the tendency to grow out of control quickly. This is because they are difficult to modularize and test.
- Parameterized SQL statements can be tested as part of your .NET test suite if you’ve organized things well. No need to run separate database tests.
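For reference, "parameterized SQL from application code" just means something like this (table and column names invented):

```csharp
using var command = new SqlCommand(
    "SELECT Id, UserName FROM Users WHERE UserName = @userName", connection);

// The value travels as a parameter and is never concatenated into the SQL
// text, which is what makes it just as safe as a stored procedure parameter.
command.Parameters.Add(new SqlParameter("@userName", userName));

using var reader = await command.ExecuteReaderAsync();
```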
This is a great article on the topic.
#13: Dogmatically Never Using Stored Procedures
With all things in our industry, it’s easy to be dogmatic. This article is somewhat dogmatic 🤔.
Specifically with stored procedures, they do have a place.
In my mind, their main benefit is when you have some bulk-ish operations that you need to perform. Sending multiple SQL statements over-the-wire (like hundreds) can be slow. Especially if the database is not on the same machine as your .NET application (this is probably the case for most readers).
The latency between your application and database is usually negligible. But when you're doing bulk-ish stuff, it can become a bottleneck.
Stored procedures have the benefit of storing the SQL statements on the database. You can send over the data you want to work on once, and then let the database do the bulk of the work. Instead of doing a loop in your application code, you put the loop in the stored procedure.
You can send all your data via 1 database call, or in chunks. Either way, this can be a great use case for using a stored procedure. I’ve seen massive improvements to critical operations with this approach, when used sparingly.
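For what it's worth, the .NET side of that kind of call is small. Here's a sketch that sends a batch of IDs in a single round trip via a table-valued parameter (the procedure name and SQL type are invented):

```csharp
// Pack the batch into a DataTable that matches a user-defined table type.
var table = new DataTable();
table.Columns.Add("Id", typeof(int));
foreach (var id in idsToProcess)
{
    table.Rows.Add(id);
}

using var command = new SqlCommand("dbo.ProcessOrders", connection)
{
    CommandType = CommandType.StoredProcedure
};

// One round trip carries the whole batch; the loop lives inside the procedure.
var parameter = command.Parameters.AddWithValue("@OrderIds", table);
parameter.SqlDbType = SqlDbType.Structured;
parameter.TypeName = "dbo.OrderIdList";

await command.ExecuteNonQueryAsync();
```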
#14: Microservices
"Microservice all the things" sounds great.
The idea of decoupling teams and having high-performing, modular and isolated services sounds great. But there can be huge costs to over-leaning into microservices.
If you work in an organization that's leaned heavily into microservices, you'll often find issues that surface as questions like:
- "How the heck do I integrate with team X?"
- "There are 5 versions of service X. Which one should I use since they are all being deployed to production?"
- "I’ve found 5 different ways that our organization solves [insert common software design problem]? What should I do?"
- "Which repository is that in? Which team owns that? The guy who wrote this isn’t around anymore. Does anyone know how this thing works?"
- "Our UI needs data from teams X, Y and Z. Which services and APIs should I use? Oh… team Y has 5 different services that I need to stitch together? Does anyone understand the data points for any of these? [10 meetings later]: So I guess there is a new service that combines these data points that no one else knew about."
All of these questions / musings are signs of costs: You need really good coordination, documentation, planning, etc. to pull it off well.
And so… all the things I said about internal "frameworks" I think applies to microservices:
- "It wasn’t supposed to be used everywhere"
- "This service doesn’t do functionality F that the previous version of the service did…"
- New developers have a lot of extra stuff to learn just to get started
- Terrible or no documentation
For more of my thoughts, I’d suggest checking out ".NET Architecture: How To Structure Your Solutions" and "What Is A Microservice Architecture?"
#15: Building Your Own Background Job Scheduler
This is just a plug for my open-source project 🤣.
But, there are costs to using open-source projects that I find developers rarely consider when they add a new library to their projects:
- No guarantee of support
- Using more libraries expands your potential security exposure
- A poorly abstracted library can force you to introduce hacks or poor design patterns to fit it into your code
Conclusion
We’re all still learning. There’s always more to learn. But, it’s okay to not learn something new every single day.
More important than learning everyday, I think, is to learn from our mistakes.
That is all.