Description
I found a couple of places where the JIT seems to emit unnecessary movzx/movsxd instructions when doing an unchecked conversion from char to byte and then widening the result to IntPtr. It's basically just doing the same zero extension twice.
Consider this simple example (sharplab.io sample here):
public static unsafe IntPtr Convert(char c)
{
    return unchecked((IntPtr)(void*)(byte)c);
}
I get the following:
; V00 arg0 [V00,T00] ( 3, 3 ) ushort -> rcx
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [rsp+0x00] "OutgoingArgSpace"
;* V02 tmp1 [V02 ] ( 0, 0 ) long -> zero-ref "Inlining Arg"
; V03 tmp2 [V03,T01] ( 2, 4 ) long -> rax "NewObj constructor temp"
;* V04 tmp3 [V04 ] ( 0, 0 ) long -> zero-ref "Inlining Arg"
;
; Lcl frame size = 0
G_M51760_IG01:
;; bbWeight=1 PerfScore 0.00
G_M51760_IG02:
0FB7C1 movzx rax, cx
0FB6C0 movzx rax, al
;; bbWeight=1 PerfScore 0.50
G_M51760_IG03:
C3 ret
;; bbWeight=1 PerfScore 1.00
; Total bytes of code 7, prolog size 0, PerfScore 2.20
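Only one of those two movzx instructions should actually be needed, right? The result only depends on the low 8 bits of the input, so a single movzx rax, cl would give the same value.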
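As a quick sanity check of the claim that one zero extension is enough, here is a minimal sketch that brute-forces every possible char and confirms that truncating to byte and widening again always yields just the low 8 bits of the input. The `ConversionCheck` class and `LowByteIsEnough` method are hypothetical names used for illustration only, and the sketch widens to `long` rather than `IntPtr` so that it does not need an unsafe context.

```csharp
using System;

public static class ConversionCheck
{
    // Returns true if, for every possible char value, the unchecked
    // char -> byte -> long conversion equals the low 8 bits of the input.
    public static bool LowByteIsEnough()
    {
        for (int i = char.MinValue; i <= char.MaxValue; i++)
        {
            char c = (char)i;

            // Same shape as the conversion in the sample above:
            // truncate to 8 bits, then zero-extend to 64 bits.
            long widened = unchecked((long)(byte)c);

            if (widened != (i & 0xFF))
            {
                return false;
            }
        }

        return true;
    }
}
```

This only checks the semantics of the C# conversion. It says nothing about which instructions a particular JIT version emits.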
Or, this other related case (sharplab.io sample here):
// Assume this has 256 items in it (one per possible byte value)
private static ReadOnlySpan<byte> SomeTable => new byte[] { 1, 2, 3 };

[MethodImpl(MethodImplOptions.AggressiveOptimization)]
public static byte Lookup(char c)
{
    ref byte r0 = ref MemoryMarshal.GetReference(SomeTable);
    return Unsafe.Add(ref r0, unchecked((byte)c));
}
Gives the following:
; V00 arg0 [V00,T00] ( 3, 3 ) ushort -> rcx
;# V01 OutArgs [V01 ] ( 1, 1 ) lclBlk ( 0) [rsp+0x00] "OutgoingArgSpace"
;* V02 tmp1 [V02 ] ( 0, 0 ) struct (16) zero-ref "struct address for call/obj"
;* V03 tmp2 [V03 ] ( 0, 0 ) struct (16) zero-ref "NewObj constructor temp"
;* V04 tmp3 [V04 ] ( 0, 0 ) struct ( 8) zero-ref "NewObj constructor temp"
;* V05 tmp4 [V05 ] ( 0, 0 ) struct (16) zero-ref ld-addr-op "Inlining Arg"
;* V06 tmp5 [V06 ] ( 0, 0 ) byref -> zero-ref "Inlining Arg"
;* V07 tmp6 [V07 ] ( 0, 0 ) int -> zero-ref "Inlining Arg"
;* V08 tmp7 [V08 ] ( 0, 0 ) byref -> zero-ref V02._pointer(offs=0x00) P-INDEP "field V02._pointer (fldOffset=0x0)"
;* V09 tmp8 [V09 ] ( 0, 0 ) int -> zero-ref V02._length(offs=0x08) P-INDEP "field V02._length (fldOffset=0x8)"
;* V10 tmp9 [V10 ] ( 0, 0 ) byref -> zero-ref V03._pointer(offs=0x00) P-INDEP "field V03._pointer (fldOffset=0x0)"
;* V11 tmp10 [V11 ] ( 0, 0 ) int -> zero-ref V03._length(offs=0x08) P-INDEP "field V03._length (fldOffset=0x8)"
;* V12 tmp11 [V12 ] ( 0, 0 ) byref -> zero-ref V04._value(offs=0x00) P-INDEP "field V04._value (fldOffset=0x0)"
;* V13 tmp12 [V13 ] ( 0, 0 ) byref -> zero-ref V05._pointer(offs=0x00) P-INDEP "field V05._pointer (fldOffset=0x0)"
;* V14 tmp13 [V14 ] ( 0, 0 ) int -> zero-ref V05._length(offs=0x08) P-INDEP "field V05._length (fldOffset=0x8)"
;
; Lcl frame size = 0
G_M14832_IG01:
;; bbWeight=1 PerfScore 0.00
G_M14832_IG02:
0FB7C1 movzx rax, cx
0FB6C0 movzx rax, al
4863C0 movsxd rax, eax
48BA7CB76ABD22020000 mov rdx, 0x222BD6AB77C
0FB60410 movzx rax, byte ptr [rax+rdx]
;; bbWeight=1 PerfScore 3.00
G_M14832_IG03:
C3 ret
;; bbWeight=1 PerfScore 1.00
; Total bytes of code 24, prolog size 0, PerfScore 6.40
Here we have both the unnecessary movzx and an extra movsxd rax, eax (presumably because the Unsafe.Add overload being picked is the int one?). I think the JIT should be able to remove that as well: the value has just been zero-extended from a byte, so it can never be negative, and it is converted to a native integer right after that, so the sign extension isn't necessary in this situation.
Thankfully the movsxd can be worked around by manually casting to IntPtr. I couldn't remove that duplicate movzx, though.
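For reference, the IntPtr workaround looks roughly like the sketch below. It assumes a hypothetical 256-entry `Table` array as a stand-in for the real lookup table, and the `LookupWorkaround` class name is made up for this example. The only relevant change is the explicit `(IntPtr)` cast on the index, which selects the `Unsafe.Add<T>(ref T, IntPtr)` overload instead of the `int` one.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class LookupWorkaround
{
    // Hypothetical stand-in table with one entry per possible byte value.
    private static readonly byte[] Table = CreateTable();

    private static byte[] CreateTable()
    {
        var table = new byte[256];
        for (int i = 0; i < table.Length; i++)
        {
            table[i] = (byte)(255 - i);
        }
        return table;
    }

    public static byte Lookup(char c)
    {
        ref byte r0 = ref MemoryMarshal.GetReference(Table.AsSpan());

        // Casting the index to IntPtr picks the native-integer overload of
        // Unsafe.Add, so no int -> native int sign extension is requested.
        return Unsafe.Add(ref r0, (IntPtr)unchecked((byte)c));
    }
}
```

Whether the movsxd actually disappears depends on the JIT version, so it is worth checking the disassembly again after making the change.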
Configuration
- Tested on both .NET 5 Preview 6 with disasmo, and on sharplab.io on .NET Core 3.x
- Locally I have Windows 10 Pro x64 19041.x
- x64
Regression?
From what I can see on sharplab.io, this happens on .NET Core 3.x too, and even way back on .NET Framework x64.
category:cq
theme:basic-cq
skill-level:intermediate
cost:medium