Recently I was reading about how virtual method calls are implemented in C++: a pointer to the vtable (an array of function pointers) is stored at the start of each object, and each entry in the vtable points to the class's own implementation of a virtual method. It is an interesting design, and it made me curious about how the same thing is implemented in C#.
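To make the C++ mechanism concrete, here is a minimal hand-rolled sketch of a vtable. It is not what a compiler literally emits; the names `VTable`, `vptr`, `kPersonVTable`, `kChineseVTable` and `say_hello` are invented for illustration, and the two types mirror the Person and Chinese classes used later in this article.

```cpp
#include <cassert>
#include <string>

// Hypothetical model: all names below are illustrative, not compiler output.
struct Person;
using SayHelloFn = std::string (*)(const Person*);

// The "vtable": an array of function pointers, one entry per virtual method.
struct VTable {
    SayHelloFn slots[1];
};

// The object stores a pointer to its type's vtable as its first member.
struct Person {
    const VTable* vptr;
};

// One implementation per type.
std::string person_say_hello(const Person*)  { return "sayhello"; }
std::string chinese_say_hello(const Person*) { return "chinese"; }

// One vtable per type, shared by all instances of that type.
const VTable kPersonVTable  = {{ person_say_hello }};
const VTable kChineseVTable = {{ chinese_say_hello }};

// A "virtual call": load the vtable pointer, index the slot, call through it.
std::string say_hello(const Person* p) {
    assert(p != nullptr && p->vptr != nullptr);
    return p->vptr->slots[0](p);
}
```

An object built as `Person derived{&kChineseVTable};` dispatches to `chinese_say_hello` even though the caller only sees a `Person*`, which is the behavior the rest of this article looks for in C#.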
1: How polymorphism works in C#
1. A simple C# example
For the convenience of explanation, I define a Person class and a Chinese class. The detailed code is as follows:
    using System;

    namespace ConsoleApp1
    {
        internal class Program
        {
            static void Main(string[] args)
            {
                // The static type is Person, the runtime type is Chinese.
                Person person = new Chinese();
                person.SayHello();
                Console.ReadLine();
            }
        }

        public class Person
        {
            public virtual void SayHello()
            {
                Console.WriteLine("sayhello");
            }
        }

        public class Chinese : Person
        {
            public override void SayHello()
            {
                Console.WriteLine("chinese");
            }
        }
    }
2. Assembly code analysis
Next, use WinDbg to set a breakpoint at person.SayHello() and look at the disassembly:
    D:\net6\ConsoleApplication2\ConsoleApp1\Program.cs @ 9:
    05cf21b3 b93c5dce05      mov     ecx,5CE5D3Ch (MT: ConsoleApp1.Chinese)
    05cf21b8 e8030f89fa      call    005830c0 (JitHelp: CORINFO_HELP_NEWSFAST)
    05cf21bd 8945f4          mov     dword ptr [ebp-0Ch],eax
    05cf21c0 8b4df4          mov     ecx,dword ptr [ebp-0Ch]
    05cf21c3 e820fbffff      call    05cf1ce8 (ConsoleApp1.Chinese..ctor(), mdToken: 0600000A)
    05cf21c8 8b4df4          mov     ecx,dword ptr [ebp-0Ch]
    05cf21cb 894df8          mov     dword ptr [ebp-8],ecx

    D:\net6\ConsoleApplication2\ConsoleApp1\Program.cs @ 11:
    >>> 05cf21ce 8b4df8      mov     ecx,dword ptr [ebp-8]
    05cf21d1 8b45f8          mov     eax,dword ptr [ebp-8]
    05cf21d4 8b00            mov     eax,dword ptr [eax]
    05cf21d6 8b4028          mov     eax,dword ptr [eax+28h]
    05cf21d9 ff5010          call    dword ptr [eax+10h]
    05cf21dc 90              nop
From the assembly code, the logic is very clear, and the general steps are as follows:
- mov eax,dword ptr [ebp-8]
This loads the address of the person object on the heap from the stack slot at ebp-8. If you don't believe it, you can check with !do 027ea88c.
    0:000> dp ebp-8 L1
    0057f300  027ea88c
    0:000> !do 027ea88c
    Name:        ConsoleApp1.Chinese
    MethodTable: 05ce5d3c
    EEClass:     05cd3380
    Size:        12(0xc) bytes
    File:        D:\net6\ConsoleApplication2\ConsoleApp1\bin\x86\Debug\net6.0\ConsoleApp1.dll
    Fields:
    None
- mov eax,dword ptr [eax]
If you know how an instance is laid out on the heap, you know that its first word stores the MethodTable pointer, which we can verify with !dumpmt 05ce5d3c.
    0:000> dp 027ea88c L1
    027ea88c  05ce5d3c
    0:000> !dumpmt 05ce5d3c
    EEClass:         05cd3380
    Module:          05addb14
    Name:            ConsoleApp1.Chinese
    mdToken:         02000007
    File:            D:\net6\ConsoleApplication2\ConsoleApp1\bin\x86\Debug\net6.0\ConsoleApp1.dll
    BaseSize:        0xc
    ComponentSize:   0x0
    DynamicStatics:  false
    ContainsPointers false
    Slots in VTable: 6
    Number of IFaces in IFaceMap: 0
- mov eax,dword ptr [eax+28h]
What does this mean? If you know CoreCLR, you know that a method table is represented by the C++ class MethodTable, so this instruction reads the field at offset 0x28 of that class. Which field is that? First, dump the MethodTable structure with dt.
    0:000> dt 05ce5d3c MethodTable
    coreclr!MethodTable
       =7ad96bc8 s_pMethodDataCache : 0x00639ec8 MethodDataCache
       =7ad96bc4 s_fUseParentMethodData : 0n1
       =7ad96bcc s_fUseMethodDataCache : 0n1
       +0x000 m_dwFlags        : 0xc
       +0x004 m_BaseSize       : 0x74088
       +0x008 m_wFlags2        : 5
       +0x00a m_wToken         : 0
       +0x00c m_wNumVirtuals   : 0x5ccc
       +0x00e m_wNumInterfaces : 0x5ce
       +0x010 m_pParentMethodTable : IndirectPointer<MethodTable *>
       +0x014 m_pLoaderModule  : PlainPointer<Module *>
       +0x018 m_pWriteableData : PlainPointer<MethodTableWriteableData *>
       +0x01c m_pEEClass       : PlainPointer<EEClass *>
       +0x01c m_pCanonMT       : PlainPointer<unsigned long>
       +0x020 m_pPerInstInfo   : PlainPointer<PlainPointer<Dictionary *> *>
       +0x020 m_ElementTypeHnd : 0
       +0x020 m_pMultipurposeSlot1 : 0
       +0x024 m_pInterfaceMap  : PlainPointer<InterfaceInfo_t *>
       +0x024 m_pMultipurposeSlot2 : 0x5ce5d68
       =7ad04c78 c_DispatchMapSlotOffsets : [0]  " $ (System.Private.CoreLib.dll"
       =7ad04c70 c_NonVirtualSlotsOffsets : [0]  " $ ($((, $ (System.Private.CoreLib.dll"
       =7ad04c60 c_ModuleOverrideOffsets : [0]  " $ ($((,$((,(,,0 $ ($((, $ (System.Private.CoreLib.dll"
       =7ad12838 c_OptionalMembersStartOffsets : [0]  "(((((((,(((,(,,0(((,(,,0(,,0,004"
From the MethodTable layout, eax+28h reads 0x5ce5d68, the value dt shows next to m_pMultipurposeSlot2. That value is a pointer to the start of the virtual method slots. Verifying this is also simple: list all the methods with !dumpmt -md 05ce5d3c, then run dp 05ce5d3c and check that the method entry addresses follow at 0x5ce5d68.
    0:000> !dumpmt -md 05ce5d3c
    EEClass:         05cd3380
    Module:          05addb14
    Name:            ConsoleApp1.Chinese
    mdToken:         02000007
    File:            D:\net6\ConsoleApplication2\ConsoleApp1\bin\x86\Debug\net6.0\ConsoleApp1.dll
    BaseSize:        0xc
    ComponentSize:   0x0
    DynamicStatics:  false
    ContainsPointers false
    Slots in VTable: 6
    Number of IFaces in IFaceMap: 0
    --------------------------------------
    MethodDesc Table
       Entry MethodDe    JIT Name
    02610028 02605568   NONE System.Object.Finalize()
    02610030 02605574   NONE System.Object.ToString()
    02610038 02605580   NONE System.Object.Equals(System.Object)
    02610050 026055ac   NONE System.Object.GetHashCode()
    05CF1CE0 05ce5d24   NONE ConsoleApp1.Chinese.SayHello()
    05CF1CE8 05ce5d30    JIT ConsoleApp1.Chinese..ctor()
    0:000> dp 05ce5d3c L10
    05ce5d3c  00000200 0000000c 00074088 00000005
    05ce5d4c  05ce5ccc 05addb14 05ce5d7c 05cd3380
    05ce5d5c  05cf1ce8 00000000 05ce5d68 02610028
    05ce5d6c  02610030 02610038 02610050 05cf1ce0
Take a closer look at the output. The value 02610028 that follows 05ce5d68 is the entry for System.Object.Finalize(), and 02610030 is the entry for System.Object.ToString().
- call dword ptr [eax+10h]
With that background, this instruction is easy to understand. Starting from the slot array at 0x5ce5d68 (the value shown for m_pMultipurposeSlot2), it reads the cell at offset 0x10, which is the fifth slot and the one that holds SayHello, and calls the address stored there.
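The four instructions walked through above can be modeled in a few lines of C++. This is only a sketch under simplifying assumptions: `Object`, `MethodTable`, `slot_chunk`, `call_virtual` and the helper functions are hypothetical names, the real MethodTable has many more fields, and every slot here takes just the `this` pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, heavily simplified model of the structures the JIT code walks.
struct Object;
using Slot = std::string (*)(Object*);

struct MethodTable {
    Slot* slot_chunk;   // models the cell read by "mov eax,[eax+28h]"
};

struct Object {
    MethodTable* mt;    // first word of the object, read by "mov eax,[eax]"
};

// Stand-ins for the five virtual slots listed by !dumpmt -md.
std::string obj_finalize(Object*)      { return "Finalize"; }
std::string obj_to_string(Object*)     { return "ToString"; }
std::string obj_equals(Object*)        { return "Equals"; }
std::string obj_get_hash_code(Object*) { return "GetHashCode"; }
std::string chinese_say_hello(Object*) { return "chinese"; }

Slot g_chinese_slots[] = {
    obj_finalize, obj_to_string, obj_equals, obj_get_hash_code, chinese_say_hello
};
MethodTable g_chinese_mt = { g_chinese_slots };

// In the x86 dump each slot is 4 bytes, so byte offset 10h is slot index 4.
constexpr std::size_t kPointerSize  = 4;
constexpr std::size_t kSayHelloSlot = 0x10 / kPointerSize;

std::string call_virtual(Object* obj, std::size_t slot_index) {
    assert(obj != nullptr);
    MethodTable* mt = obj->mt;        // mov eax,dword ptr [eax]
    Slot* chunk = mt->slot_chunk;     // mov eax,dword ptr [eax+28h]
    Slot target = chunk[slot_index];  // the cell at [eax+10h]
    return target(obj);               // call, with obj passed as "this" (ecx)
}
```

Calling `call_virtual(&person, kSayHelloSlot)` on an `Object` whose `mt` is `&g_chinese_mt` ends up in `chinese_say_hello`, matching the slot order in the dump.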
    0:000> !U 05cf1ce0
    Unmanaged code
    05cf1ce0 e88f9dde74      call    coreclr!PrecodeFixupThunk (7aadba74)
    05cf1ce5 5e              pop     esi
    05cf1ce6 0001            add     byte ptr [ecx],al
    05cf1ce8 e913050000      jmp     05cf2200
    05cf1ced 5f              pop     edi
    05cf1cee 0300            add     eax,dword ptr [eax]
    05cf1cf0 245d            and     al,5Dh
    05cf1cf2 ce              into
    05cf1cf3 0500000000      add     eax,0
    05cf1cf8 0000            add     byte ptr [eax],al
From the disassembly, this is still stub code, which means the method has not been JIT-compiled yet. Once compilation finishes, the Entry (05CF1CE0) in the row "05CF1CE0 05ce5d24 NONE ConsoleApp1.Chinese.SayHello()" is updated to match. This is easy to verify: let the program continue running (g), then run !dumpmt again.
    0:008> !dumpmt -md 05ce5d3c
    EEClass:         05cd3380
    Module:          05addb14
    Name:            ConsoleApp1.Chinese
    mdToken:         02000007
    File:            D:\net6\ConsoleApplication2\ConsoleApp1\bin\x86\Debug\net6.0\ConsoleApp1.dll
    BaseSize:        0xc
    ComponentSize:   0x0
    DynamicStatics:  false
    ContainsPointers false
    Slots in VTable: 6
    Number of IFaces in IFaceMap: 0
    --------------------------------------
    MethodDesc Table
       Entry MethodDe    JIT Name
    02610028 02605568   NONE System.Object.Finalize()
    02610030 02605574   NONE System.Object.ToString()
    02610038 02605580   NONE System.Object.Equals(System.Object)
    02610050 026055ac   NONE System.Object.GetHashCode()
    05CF2270 05ce5d24    JIT ConsoleApp1.Chinese.SayHello()
    05CF1CE8 05ce5d30    JIT ConsoleApp1.Chinese..ctor()
    0:008> dp 05ce5d3c L10
    05ce5d3c  00000200 0000000c 00074088 00000005
    05ce5d4c  05ce5ccc 05addb14 05ce5d7c 05cd3380
    05ce5d5c  05cf1ce8 00000000 05ce5d68 02610028
    05ce5d6c  02610030 02610038 02610050 05cf2270
At this point, you can see that the entry has changed from 05cf1ce0 to 05cf2270, which is the JIT-compiled method code. Use !U to disassemble it.
    0:008> !U 05cf2270
    Normal JIT generated code
    ConsoleApp1.Chinese.SayHello()
    ilAddr is 05E720D5 pImport is 008F6E88
    Begin 05CF2270, size 27

    D:\net6\ConsoleApplication2\ConsoleApp1\Program.cs @ 28:
    >>> 05cf2270 55          push    ebp
    05cf2271 8bec            mov     ebp,esp
    05cf2273 50              push    eax
    05cf2274 894dfc          mov     dword ptr [ebp-4],ecx
    05cf2277 833d74dcad0500  cmp     dword ptr ds:[5ADDC74h],0
    05cf227e 7405            je      05cf2285
    05cf2280 e8cb2bf174      call    coreclr!JIT_DbgIsJustMyCode (7ac04e50)
    05cf2285 90              nop

    D:\net6\ConsoleApplication2\ConsoleApp1\Program.cs @ 29:
    05cf2286 8b0d74207e04    mov     ecx,dword ptr ds:[47E2074h] ("chinese")
    05cf228c e8dffbffff      call    05cf1e70
    05cf2291 90              nop

    D:\net6\ConsoleApplication2\ConsoleApp1\Program.cs @ 30:
    05cf2292 90              nop
    05cf2293 8be5            mov     esp,ebp
    05cf2295 5d              pop     ebp
    05cf2296 c3              ret
Finally, this is the ConsoleApp1.Chinese.SayHello() method.
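The change of the SayHello entry from a stub address to the JIT-compiled address can be illustrated with a small C++ sketch of slot back-patching. It is a simplified model, not CoreCLR's actual precode machinery: `g_say_hello_slot`, `say_hello_stub`, `say_hello_jitted` and `g_compile_count` are hypothetical names, and "compilation" is reduced to incrementing a counter.

```cpp
#include <cassert>
#include <string>

using Code = std::string (*)();

// Stands in for the JIT-generated body of Chinese.SayHello().
std::string say_hello_jitted() { return "chinese"; }

std::string say_hello_stub();  // forward declaration so the slot can point at it

// The vtable slot for SayHello. It initially points at the stub.
Code g_say_hello_slot = say_hello_stub;
int g_compile_count = 0;       // how many times "compilation" ran

// The stub runs on the first call only: it "compiles" the method,
// patches the slot with the real code address, then runs the method.
std::string say_hello_stub() {
    ++g_compile_count;                    // pretend the JIT ran here
    g_say_hello_slot = say_hello_jitted;  // back-patch the slot
    return say_hello_jitted();
}

// Callers always go through the slot, exactly like "call dword ptr [eax+10h]".
std::string call_say_hello() {
    assert(g_say_hello_slot != nullptr);
    return g_say_hello_slot();
}
```

After the first `call_say_hello()`, the slot holds `say_hello_jitted`, so later calls skip the stub. This mirrors the slot value moving from 05cf1ce0 to 05cf2270 in the two dumps.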
3. Summary
In essence, CoreCLR is itself written in C++, so it cannot avoid using virtual tables to implement polymorphism; its version is just a little more complicated. I hope this article is helpful to you.