
Discussion on: The newly announced future of .NET - unifying all the things

James Turner Author

practically every popular language and platform (perhaps except for Go and Rust) currently outsource anything performance-critical to C

I might not be 100% aware of everything behind the scenes in C# but I don't believe that is actually the case for C#/.NET (at least it hasn't been for a while).

The Roslyn compiler for C# is self-hosted, meaning that it is actually written in C#. You also have libraries like ImageSharp which is a C# image processing library. Both of these things require fast operation with low memory overhead.

I was looking up information to back up why C# is fast, and there are many articles trading blows: "C++ is faster", "C# is faster" or "Java is faster" (kidding on that last one). Because it was such a mixed bag of tests, opinions and explanations, I thought I would instead explain from another angle.

Regardless of your programming language, you can write slow code. In C, C++, C#, etc., you can also write very performant code. Static compilers can make certain optimisations ahead of time, and the results are naturally very fast because they compile down to such a low level. JIT compilers can make optimisations on the fly, and the results can be finely tuned for the specific machine they run on.

I don't see a reason why someone couldn't write a fast low-level binary database protocol or a neural network in C# and be as fast and as memory efficient as their needs actually are.

Eventually the language won't be the bottleneck anymore but the device it is running on (CPU vs GPU etc).

Rasmus Schultz

Since you mention ImageSharp, take a look at this article:

devblogs.microsoft.com/dotnet/net-...

There is a radical performance difference between ImageSharp and something like SkiaSharp, which outsources to C++.

People outsource this kind of work to C++ for good reasons - and languages like C# offer run-time interop with C and C++ binaries for some of the same reasons.
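That run-time interop is exposed in C# through P/Invoke. A minimal sketch of calling into a native C library, assuming a Linux host with glibc (the library name is platform-specific and would need adjusting elsewhere; nothing here is from the thread itself):

```csharp
using System;
using System.Runtime.InteropServices;

class NativeInterop
{
    // P/Invoke declaration: resolves "cos" from the C math library at run time.
    // "libm.so.6" is the glibc name on Linux; on Windows or macOS a different
    // library name would be needed.
    [DllImport("libm.so.6", EntryPoint = "cos")]
    static extern double NativeCos(double x);

    static void Main()
    {
        // The call crosses the managed/native boundary on every invocation,
        // which is why interop has a per-call cost.
        Console.WriteLine(NativeCos(0.0));
    }
}
```

The marshalling cost per call is one reason libraries outsource whole operations (decode this image) rather than small ones (multiply these two numbers).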

My main point isn't so much whether bytecode VM languages need to outsource, but the fact that they do. Languages like C#, JavaScript, PHP, Ruby, Python and so on all have a bunch of run-time APIs that were written in C or C++, most of it inaccessible to the developers who code in those languages.

C# isn't much different in that regard. While, yes, the C# compiler itself was written in C#, the CLR and all the low-level types etc. are C code, mostly inaccessible to developers who work in C#.

You could argue that CLR isn't part of the C# language - but C# isn't really complete without the ability to run it, and I'd argue the same for any language that isn't fully bootstrapped and self-contained.

From what I've seen, VM-based languages (with JIT and garbage-collection and many other performance factors) aren't really suitable for very low-level stuff, which always gets outsourced.

Something common like loading/saving/resizing images is an excellent example of something that practically always gets outsourced - and whenever anybody shows up claiming they've built a "fast" native alternative, these typically turn out to be anywhere from 2 to 10 times slower, usually with much higher memory footprint.

From my point of view, languages like C#, JavaScript and PHP are all broadly in the same performance category: they're fast enough. But not fast enough to stand alone without outsourcing some of the heavy lifting to another language.

And yeah, maybe that's just the cost of high level languages. 🤷‍♂️

James Turner Author • Edited

That article you've linked to is an interesting read, and you're right about the CLR. I just had a look at the CoreCLR project on GitHub and it is roughly two-thirds C# and one-third C++ (the C++ appears to be the JIT and the GC).

With the article and the ImageSharp comparison, I am curious how much faster it would be now, 2 years on, with newer versions of .NET Core (there have been significant improvements) as well as any algorithmic updates that may have occurred - might see if I can run the test suite on my laptop. (EDIT: See bottom of comment for test results)

That C# isn't (in various tests and applications) as fast as C++ doesn't mean it always is or has to be that way. After all, it isn't C++ code that executes on the processor; it's machine code. If C# and C++ generated the same machine code, they should arguably be identical in performance.

C# usually can't generate the same machine code due in part to overhead in protecting us from ourselves as well as things like GC etc. C# though does have the ability to work with pointers and operate in "unsafe" manners which may very well generate the exact same code.

All that said, I do expect though that it is naturally easier to write fast code in C++ than C# for that exact reason - you are always working with pointers and the very lowest level of value manipulation. The average C# developer likely would never use the various unsafe methods in C# to get close to the same level of access.
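To make the unsafe/pointer point concrete, here is a minimal sketch of C#'s pointer access (the array-sum example is illustrative, not from the thread; it needs AllowUnsafeBlocks enabled in the project file):

```csharp
using System;

class UnsafeSum
{
    // Sums an array through a raw pointer, bypassing the usual bounds checks.
    // Requires <AllowUnsafeBlocks>true</AllowUnsafeBlocks> in the .csproj.
    static unsafe int Sum(int[] values)
    {
        int total = 0;
        fixed (int* p = values)          // pin the array so the GC can't move it
        {
            for (int i = 0; i < values.Length; i++)
                total += *(p + i);       // direct pointer arithmetic, as in C
        }
        return total;
    }

    static void Main()
    {
        Console.WriteLine(Sum(new[] { 1, 2, 3, 4 })); // prints 10
    }
}
```

This is essentially the same machine-level access pattern a C++ loop would have, which is the point being made: the capability exists, it just isn't the default idiom for most C# developers.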

Just an aside, it seems like the tables of data and the graphs displayed in that article you linked to don't actually match each other. 🤷‍♂️


EDIT: I did run the tests from the article you linked me as they had it all configured up on GitHub. Short answer: Skia is still faster but the results are interesting.

Note 1: I disabled the tests of the other image libraries as I was only curious about ImageSharp.
Note 2: If you want to replicate it on your machine, you will need to change the NuGet package referenced for ImageSharp, as the referenced version is no longer available - I used the latest beta instead. This also means one line in the tests needs to be altered because a property no longer exists; you'll see it when you compile.

BenchmarkDotNet=v0.11.5, OS=Windows 10.0.17134.706 (1803/April2018Update/Redstone4)
Intel Core i7-6700HQ CPU 2.60GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
Frequency=2531249 Hz, Resolution=395.0619 ns, Timer=TSC
.NET Core SDK=2.2.101
  [Host]            : .NET Core 2.2.0 (CoreCLR 4.6.27110.04, CoreFX 4.6.27110.04), 64bit RyuJIT
  .Net Core 2.2 CLI : .NET Core 2.2.0 (CoreCLR 4.6.27110.04, CoreFX 4.6.27110.04), 64bit RyuJIT

Job=.Net Core 2.2 CLI  Jit=RyuJit  Platform=X64  
Toolchain=.NET Core 2.2  IterationCount=5  WarmupCount=5  

|                                Method |     Mean |      Error |    StdDev | Ratio | RatioSD |     Gen 0 | Gen 1 | Gen 2 |  Allocated |
|-------------------------------------- |---------:|-----------:|----------:|------:|--------:|----------:|------:|------:|-----------:|
|   'System.Drawing Load, Resize, Save' | 594.9 ms | 113.475 ms | 29.469 ms |  1.00 |    0.00 |         - |     - |     - |   79.96 KB |
|       'ImageSharp Load, Resize, Save' | 318.8 ms |  10.496 ms |  2.726 ms |  0.54 |    0.02 |         - |     - |     - |  1337.7 KB |
| 'SkiaSharp Canvas Load, Resize, Save' | 273.5 ms |  10.264 ms |  2.665 ms |  0.46 |    0.02 | 1000.0000 |     - |     - | 4001.79 KB |
| 'SkiaSharp Bitmap Load, Resize, Save' | 269.1 ms |   6.619 ms |  1.719 ms |  0.45 |    0.02 | 1000.0000 |     - |     - | 3995.29 KB |

So SkiaSharp actually uses ~3x the memory of ImageSharp for roughly a 16% reduction in runtime (318.8 ms vs ~270 ms).

James Jackson-South

The performance differences between the two libraries boil down to the performance of our jpeg decoder. Until .NET Core 3.0 the equivalent hardware intrinsics APIs simply haven't been available to C# to allow equivalent performance but that is all about to change.

devblogs.microsoft.com/dotnet/hard...

The ImageSharp decoder currently has to perform its IDCT (Inverse Discrete Cosine Transform) operations using the generic Vector<T> struct and friends from System.Numerics.Vectors. We have to move into single-precision floating point to do so, and then alter our results to make them less correct and more in line with the equivalent integer hardware-intrinsic operations found in libjpeg-turbo (which is what Skia and others use to decode jpegs). This, of course, takes more time, and we want to do better.
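For readers unfamiliar with Vector<T>, here is a minimal sketch of the kind of lane-wise processing being described - an element-wise multiply, not ImageSharp's actual IDCT code:

```csharp
using System;
using System.Numerics;

class VectorDemo
{
    // Multiplies two float arrays element-wise, Vector<float>.Count lanes
    // at a time. The lane count is hardware-dependent: typically 8 on an
    // AVX2-capable x64 CPU, 4 with only SSE.
    static void Multiply(float[] a, float[] b, float[] result)
    {
        int i = 0;
        int width = Vector<float>.Count;
        for (; i <= a.Length - width; i += width)
        {
            var va = new Vector<float>(a, i);
            var vb = new Vector<float>(b, i);
            (va * vb).CopyTo(result, i);   // one SIMD multiply per chunk
        }
        for (; i < a.Length; i++)          // scalar tail for the leftovers
            result[i] = a[i] * b[i];
    }
}
```

Vector<T> abstracts over whatever SIMD width the hardware offers, which is portable but coarser-grained than the per-instruction control the .NET Core 3.0 hardware intrinsics provide.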

With .NET Core 3.0 and beyond we will finally be able to use the very same approach other libraries use and achieve equivalent performance, with the added bonus of a less cryptic API and an easier installation story.

Incidentally, our Resize algorithms not only match Skia's performance but also do a better job, yielding higher-quality output and handling edge cases that trip up other graphics APIs.

It's perfectly possible to write very high-performance code in C# - many great APIs already exist to help (Span<T>, Memory<T>, etc.) and there is a strong focus at Microsoft currently on further improving the performance story.
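As a small illustration of why Span<T> matters for this kind of work (my own toy example, not ImageSharp code): it lets you slice through a buffer without allocating copies, so the GC never gets involved in the hot path.

```csharp
using System;

class SpanDemo
{
    // Parses comma-separated integers without allocating substrings:
    // each Slice is a view over the original characters, not a copy,
    // so the loop performs zero heap allocations.
    static int SumCsv(ReadOnlySpan<char> csv)
    {
        int total = 0;
        while (!csv.IsEmpty)
        {
            int comma = csv.IndexOf(',');
            ReadOnlySpan<char> field = comma < 0 ? csv : csv.Slice(0, comma);
            total += int.Parse(field);   // int.Parse accepts spans since .NET Core 2.1
            csv = comma < 0 ? ReadOnlySpan<char>.Empty : csv.Slice(comma + 1);
        }
        return total;
    }

    static void Main() => Console.WriteLine(SumCsv("1,2,3")); // prints 6
}
```

The equivalent string.Split version would allocate an array plus one string per field - exactly the kind of overhead that separates "fast enough" from native-competitive.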

James Turner Author

I'm writing an article about maximising performance in .NET which starts off with some simple things but moves on to things like using Span<T>, unsafe operations, or even the dense matrix you helped explain to me on Twitter.

When I saw that blog post about hardware intrinsics, that got me even more excited to share the big performance gains that can be had.

There definitely hasn't been a better time to be working with .NET!
