Async programming is a style of programming that allows you to increase performance when handling events independent of the main program flow. For example, some common tasks that benefit from being asynchronous include:
- handling user interface events
- communicating over a network
- reading / writing to a secondary storage device
What's the problem?
As programmers, we are used to working with synchronous code. Synchronous functions are easy to understand - we call them, they do some work, and they return a value.
There are situations where this leads to very bad performance. Imagine we want to get the contents of every page given by a list of URLs.
public List<string> GetAllPages(List<string> urls)
{
    var pages = new List<string>();
    foreach (var url in urls)
    {
        pages.Add(GetPage(url));
    }
    return pages;
}
This will be very slow for large lists of URLs, as the foreach loop has to wait for each page request to finish before starting the next one. We can therefore say that each call to GetPage is blocking: it blocks the entire thread until the result is returned.
How can we fix it?
Async programming helps fix this issue by separating the function call and the function result. For example, we can start all the requests, and then wait for the results to arrive. Pseudocode might look something like this:
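def GetAllPages(urls):
    pages = list()
    # start every request without waiting for a response
    for url in urls:
        StartRequest(url)
    # then collect the responses in order
    for url in urls:
        pages.Append(WaitForResponse(url))
    return pages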
This is rather clunky, as you have to define a start and an end function for each operation you want to make asynchronous. Thankfully, C# has a nice syntax to make this easier.
How does it work in C#?
In C#, asynchronous operations are represented using the Task<T> type. The "start" function returns a Task<T> object, from which the result can be obtained. If the function GetPage() was written using the Task-based Asynchronous Pattern (TAP), you could write the program as follows:
public List<string> GetAllPages(List<string> urls)
{
    var tasks = new List<Task<string>>();
    foreach (var url in urls)
    {
        // starts the asynchronous operation
        var task = GetPage(url);
        tasks.Add(task);
    }
    var pages = new List<string>();
    foreach (var task in tasks)
    {
        // blocks the thread to wait for the result
        pages.Add(task.Result);
    }
    return pages;
}
Whilst this works, it isn't ideal. Although the individual calls to GetPage aren't blocking the function, any external code that calls GetAllPages will be blocked until all the requests have finished.
To combat this, we can write the function in an asynchronous style. This usually takes three steps:
- Add the async keyword
- Change the return type to Task<T>
- Replace synchronous waits (Task.Wait(), Task.Result) with the await keyword
The previous function would now look as follows:
public async Task<List<string>> GetAllPages(List<string> urls)
{
    var tasks = new List<Task<string>>();
    foreach (var url in urls)
    {
        // starts the asynchronous operation
        var task = GetPage(url);
        tasks.Add(task);
    }
    var pages = new List<string>();
    foreach (var task in tasks)
    {
        // doesn't block the thread
        pages.Add(await task);
    }
    return pages;
}
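As a variation on the loop above, the framework's Task.WhenAll method can wait for a whole collection of tasks at once. The sketch below is only an illustration: the GetPage shown here is a hypothetical stand-in that uses Task.Delay to simulate a request instead of touching the network, and the PageFetcher class name is made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class PageFetcher
{
    // Hypothetical stand-in for GetPage: waits briefly instead of
    // making a real network request, then returns fake page content.
    public static async Task<string> GetPage(string url)
    {
        await Task.Delay(100);
        return $"<html>{url}</html>";
    }

    public static async Task<List<string>> GetAllPages(List<string> urls)
    {
        // Select starts one operation per URL; ToList forces them all to start now.
        List<Task<string>> tasks = urls.Select(url => GetPage(url)).ToList();

        // WhenAll completes once every task has completed.
        // The results come back in the same order as the input tasks.
        string[] pages = await Task.WhenAll(tasks);
        return pages.ToList();
    }
}
```

Whether this reads better than the explicit foreach loop is a matter of taste; both versions start every request before waiting for any of them.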
await vs .Result
When you call .Result, the calling thread stays blocked until the task completes, even though it is doing no useful work while it waits.
On the other hand, using await frees up the thread to allow other tasks to run. For example, this means that a server can use only 4 system threads to handle 100 clients (as opposed to the 100 system threads in a naive approach).
Therefore, await should be used in place of .Result whenever possible.
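To make the server example a little more concrete, here is a rough sketch in which 100 simulated clients are handled concurrently. Everything in it is an assumption made for illustration: HandleClient and ClientDemo are hypothetical names, and Task.Delay stands in for time spent waiting on I/O.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ClientDemo
{
    // Hypothetical client handler: the delay stands in for waiting on I/O.
    public static async Task<int> HandleClient(int id)
    {
        // No thread is held while this delay is pending.
        await Task.Delay(200);
        return id;
    }

    public static async Task<int> HandleAllClients(int clientCount)
    {
        // Start one handler per client; none of them waits for the others.
        var tasks = Enumerable.Range(0, clientCount)
                              .Select(id => HandleClient(id))
                              .ToList();

        // Task.WhenAll completes once every handler has finished.
        int[] ids = await Task.WhenAll(tasks);
        return ids.Length;
    }
}
```

Calling HandleAllClients(100) should take roughly as long as a single delay rather than 100 delays back to back, because the waits overlap instead of each one occupying a thread.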
Note: A function must be marked async for the await keyword to be used.
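As a small illustration of that note, the sketch below consumes the same task from an async caller and from a synchronous one. CountPages is a hypothetical stand-in for an operation like GetAllPages, and the Callers class name is invented for the example.

```csharp
using System;
using System.Threading.Tasks;

public static class Callers
{
    // Hypothetical async operation standing in for GetAllPages.
    public static async Task<int> CountPages()
    {
        await Task.Delay(50);
        return 3;
    }

    // An async caller can await the task and receive the int directly.
    public static async Task<string> AsyncCaller()
    {
        int count = await CountPages();
        return $"fetched {count} pages";
    }

    // A synchronous caller cannot use await, so it blocks on .Result.
    public static string SyncCaller()
    {
        int count = CountPages().Result;
        return $"fetched {count} pages";
    }
}
```

Both callers end up with the same value. The synchronous version holds its thread while it waits, and blocking like this can deadlock in environments that have a synchronization context (UI frameworks, for example), so it is best kept to places where await really isn't available.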
What's the catch?
Async programming can be very useful in certain situations. As a rule of thumb, it only increases performance when the program is IO-bound. You shouldn't use async functions to do CPU-bound calculations, as doing so:
- provides almost no useful functionality
- makes the code less readable
- might decrease performance
One exception to this rule is the Task.Run() function, which allows CPU-bound work to be performed on a background thread.
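A minimal sketch of that idea, assuming a made-up CPU-bound calculation (counting primes by trial division) and hypothetical names PrimeCounter, CountPrimes and CountPrimesInBackground:

```csharp
using System;
using System.Threading.Tasks;

public static class PrimeCounter
{
    // CPU-bound work: a plain synchronous method, no async needed.
    public static int CountPrimes(int limit)
    {
        int count = 0;
        for (int n = 2; n <= limit; n++)
        {
            bool isPrime = true;
            for (int d = 2; d * d <= n; d++)
            {
                if (n % d == 0) { isPrime = false; break; }
            }
            if (isPrime) count++;
        }
        return count;
    }

    // Task.Run queues the calculation on a thread-pool thread, so the
    // caller's thread stays free while the calculation runs.
    public static Task<int> CountPrimesInBackground(int limit)
    {
        return Task.Run(() => CountPrimes(limit));
    }
}
```

An async caller could then write `int primes = await PrimeCounter.CountPrimesInBackground(1000000);`, which keeps something like a UI thread responsive while the calculation runs. The work itself is not any faster; it has only been moved to another thread.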
Footnote
Disclaimer - I'm a (mostly) self-taught programmer, and I use my blog to share things that I've learnt on my journey to becoming a better developer. Because of this, I apologise in advance for any inaccuracies I might have made - criticism and corrections are welcome!
Top comments (5)
I'm mostly self-taught as well, and I'll admit I haven't dealt much with IO-bound operations and, by consequence, async code, but I found the explanation of how this works in C# easy to understand.
The only question I would pose is how do you then work with the result of the function.
Are you referring to the initial Task object that the function returns? Or the result when the Task object is awaited?
Given the async function you wrote, how would I then work with its result?
I've heard some things about how Task in C# is monadic, so I assume that you never leave the async state and just chain function execution, kinda like JavaScript's .then()
The GetAllPages function returns a task, which can either be awaited (if called by an async function) or synchronously waited using .Result (if called by a synchronous function).
I’m not familiar with the term “monadic”, however I think you are correct that async generates some boilerplate code. You write it as if the code is a single function block, but behind the scenes it automatically generates exit and entry points before and after each await. So yes, I think it is similar to JavaScript’s .then() functionality.
In a very naive term that will probably get me hunted by the FP Inquisition, I would risk saying that a monad is a "context box".
Examples are things like Task, where you chain them with await (or .then()), or sometimes you get to take the thing out of the box.
Another famous example is IO: since those operations can fail, the result of such a function gets wrapped in IO so that it can just be called in the middle of any old function without basically forcing you to wrap the enclosing function in IO