Pattern matching is great, makes life easier, but it comes with a cost. Most of the time the cost is negligible. But it is better to know when not to use it.
Lets assume following simple case,
Pattern Matching
public abstract class Shape {
}
public class Circle: Shape {
public double Radius;
}
public class Rectangle: Shape {
public double Width;
public double Height;
}
We want to calculate area, by using pattern matching, so here is the code.
public double Pattern(Shape shape) {
switch(shape) {
case Circle circle:
return Math.PI * circle.Radius * circle.Radius;
case Rectangle rectangle:
return rectangle.Width * rectangle.Height;
}
throw new NotSupportedException();
}
Now lets analyze JIT ASM generated for above code,
C.Pattern(Shape)
L0000: push rsi
L0001: sub rsp, 0x20
L0005: vzeroupper
L0008: mov rsi, rdx
L000b: mov rdx, rsi
L000e: mov rcx, 0x7ffb9945d250
L0018: call 0x00007ffbef626620
L001d: test rax, rax
L0020: jne short L0049
L0022: mov rdx, rsi
L0025: mov rcx, 0x7ffb9945d3c0
L002f: call 0x00007ffbef626620
L0034: test rax, rax
L0037: je short L0064
L0039: vmovsd xmm0, [rax+8]
L003e: vmulsd xmm0, xmm0, [rax+0x10]
L0043: add rsp, 0x20
L0047: pop rsi
L0048: ret
L0049: vmovsd xmm0, [rax+8]
L004e: vmovaps xmm1, xmm0
L0052: vmulsd xmm1, xmm1, [C.Pattern(Shape)]
L005a: vmulsd xmm0, xmm0, xmm1
L005e: add rsp, 0x20
L0062: pop rsi
L0063: ret
L0064: mov rcx, 0x7ffb8fbddfd8
L006e: call 0x00007ffbef6278f0
L0073: mov rsi, rax
L0076: mov rcx, rsi
L0079: call System.NotSupportedException..ctor()
L007e: mov rcx, rsi
L0081: call 0x00007ffbef5eb3a0
L0086: int3
Notice lines, L0018: call 0x00007ffbef626620
and L002f: call 0x00007ffbef626620
, both calls are to check whether the object is of instance requested. I was also surprised to see this as pattern matching can also match an interface/base class. So it is not simple comparison. Every pattern matching case requires separate CALL
instruction, which is a considerable overhead on CPU.
So, in simple words, the last case will always require all instance of
CALL
before hitting the case. You can certainly move the most used cases on the top to make life easier.
Plain old OOPS
public abstract class Shape {
public abstract double Area();
}
public class Circle: Shape {
public double Radius;
public override double Area()
{
return Math.PI * Radius * Radius;
}
}
public class Rectangle: Shape {
public double Width;
public double Height;
public override double Area() {
return Width * Height;
}
}
Now lets call Area
method and see what JIT ASM generates.
public double Area(Shape shape) {
return shape.Area();
}
C.Area(Shape)
L0000: mov rcx, rdx
L0003: mov rax, [rdx]
L0006: mov rax, [rax+0x40]
L000a: mov rax, [rax+0x20]
L000e: jmp rax
It is simplest single call, it loads and address and jumps to it and returns from the method. The speed is achieved by avoiding instance of checks.
Complete sample
Can performance of Pattern matching improved?
Not really, high level languages usually store too much type information in order to provide improved pattern matching. But underlying cost of every single comparison is costly.
In Processor, any type of processor, be it ARM, X64 or any, single numeric comparison is much caster than hierarchical or sequential types. Basically type matching is a multi step comparison.
This is also the reason V8 team decided to store Enum flag to detect type in JavaScript inbuilt type instead of pattern matching.
Branchless
OOPS will also make most of your code branchless, which makes CPU eagerly load next set of instruction to execute instead of missing branch cache and slowing the execution.
More about branchless coding
Top comments (0)