CLRIE appears to be incompatible with 1 or more MS Exchange services (services crash) #399

Open
sanikolov opened this issue May 13, 2021 · 8 comments


@sanikolov

OS: Windows Server 2016 or 2019
Microsoft Exchange Server version: 2016 or 2019, e.g. Cumulative Update 9
Process that I profiled: MSExchangeHMHost.exe, but other MSExchange*.exe services are crashing too
.NET Framework version: 4.0.30319, but I don't think this matters

No other profilers are loaded, as the first line of the output below shows: CLRIE is the only profiler loaded, and it is not configured to instrument anything. The mere presence of the CLRIE profiler is enough to cause a crash.

No instrumentation method configs found to load in process 18028l
(466c.2090): Break instruction exception - code 80000003 (first chance)
*** WARNING: Unable to verify checksum for C:\Program Files\Microsoft CLR Instrumentation Engine\1.0.36\Instrumentation64\MicrosoftInstrumentationEngine_x64.dll
MicrosoftInstrumentationEngine_x64!GetInstrumentationEngineLogger+0x458e8:
00007fff`eba1e298 cc              int     3

Here is an interesting observation: if you set MicrosoftInstrumentationEngine_DebugWait to 1 and manage to attach windbg.exe to the service quickly, then once you tell windbg to (g)o the service appears to run as expected, with no crashes.
Please take a look. I'd imagine that Azure users may want to gather metrics for MS Exchange.

@wiktork
Member

wiktork commented May 13, 2021

cc @WilliamXieMSFT I believe this should be addressed by #386 and should be fixed with the latest release.

@sanikolov
Author

Nice, let me try the latest release and confirm or refute your guess.

@sanikolov
Author

sanikolov commented May 13, 2021

Unfortunately, version 1.0.39 does not solve the MS Exchange issues.
With DebugWait=0 the service crashes; with DebugWait=1 I can verify that all variables and paths are as expected, and the service does not crash.

00007fff`cfcc0000 00007fff`cfe20000   MicrosoftInstrumentationEngine_x64 C (export symbols)       C:\Program Files\Microsoft CLR Instrumentation Engine\1.0.39\Instrumentation64\MicrosoftInstrumentationEngine_x64.dll
00007fff`d76b0000 00007fff`d777c000   Microsoft_Office_Datacenter_Monitoring_ActiveMonitoring_Recovery_ni   (deferred)             
00007fff`de3c0000 00007fff`def75000   Microsoft_Exchange_Common_ComponentConfig_Transport_ni   (deferred)             
00007fff`def80000 00007fff`df719000   Microsoft_Exchange_Common_Directory_DirectoryVariantConfig_ni   (deferred)             
00007fff`df720000 00007fff`dfab9000   Microsoft_Exchange_Rpc_ni   (deferred)             
00007fff`dffb0000 00007fff`e0010000   Microsoft_Practices_ObjectBuilder2_ni   (deferred)             
00007fff`e0080000 00007fff`e097c000   Microsoft_Exchange_Data_ni   (deferred)             
00007fff`efbf0000 00007fff`efddd000   Microsoft_CSharp_ni   (deferred)             
00007fff`f7200000 00007fff`f7265000   ManagedAvailabilityCrimsonMsg_ni   (deferred)             
00007fff`f8200000 00007fff`f824a000   InstrumentationEngine_ProfilerProxy_x64   (deferred)

Again, none of our DLLs are loaded; this is a test to see how the instrumentation engine fares on its own.
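
To double-check which engine build a process actually loaded, a module list like the one above can be scanned mechanically. The following is a minimal Python sketch, not part of CLRIE: `engine_modules` is a hypothetical helper, and it assumes the list has the shape shown above (start address, end address, module name, then optional status and path).

```python
import re

# One module-list line: start address, end address, module name, remainder.
LM_LINE = re.compile(
    r"^(?P<start>[0-9a-f`]+)\s+(?P<end>[0-9a-f`]+)\s+(?P<name>\S+)\s*(?P<rest>.*)$",
    re.IGNORECASE,
)
# A dotted version used as a directory name, e.g. "\1.0.39\".
VERSION_IN_PATH = re.compile(r"\\(\d+\.\d+\.\d+)\\")


def engine_modules(lm_output):
    """Return the instrumentation-engine modules found in a module list.

    Hypothetical helper: it only looks for "InstrumentationEngine" in the
    module name and for a version directory in the image path, if any.
    """
    found = []
    for line in lm_output.splitlines():
        match = LM_LINE.match(line.strip())
        if not match or "InstrumentationEngine" not in match.group("name"):
            continue
        # Addresses are printed with a backtick separator; drop it first.
        start = int(match.group("start").replace("`", ""), 16)
        end = int(match.group("end").replace("`", ""), 16)
        version = VERSION_IN_PATH.search(match.group("rest"))
        found.append(
            {
                "name": match.group("name"),
                "size": end - start,
                "version": version.group(1) if version else None,
            }
        )
    return found
```

Fed the list above, this would report both the engine DLL and the profiler proxy, with a version only for the module whose path is shown.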

@WilliamXieMSFT
Member

Hi @sanikolov, would you be able to provide us ([email protected]) a dump of the crash and also any log files (Errors|Dumps)? This sounds like a tricky bug if debugging causes it to go away.

@sanikolov
Author

Indeed, it is tricky to debug.
I had time to get deeper into this bug today and tried a couple of things, with no luck.
First, I tried to capture a full dump with DebugDiag, but my rule failed to collect a single dump. See images below.
Second, I enabled Application Verifier and gflags so that windbg would start and attach to the service automatically. That never happened, no matter which checkboxes I clicked. See the screenshots below.

These are the variables under Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\MSExchangeHM, subkey Environment:

COR_ENABLE_PROFILING=1
COR_PROFILER={21419204-CA6F-464E-BC1D-E4506B9D333F}
COR_PROFILER_PATH_64=C:\Program Files\Microsoft CLR Instrumentation Engine\Proxy\v1\InstrumentationEngine.ProfilerProxy_x64.dll
MicrosoftInstrumentationEngine_DisableCodeSignatureValidation=1
MicrosoftInstrumentationEngine_LogLevel=None
MicrosoftInstrumentationEngine_DebugWait=0
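
For anyone reproducing this, here is a minimal Python sketch for sanity-checking settings like the ones above. It rests on an assumption: that `Environment` is stored as a multi-string (REG_MULTI_SZ) registry value holding `NAME=VALUE` entries, which is the usual form for a per-service environment block. `parse_environment`, `profiler_summary` and `read_service_environment` are hypothetical helper names, not part of CLRIE.

```python
def parse_environment(entries):
    """Turn ["NAME=VALUE", ...] entries into a dict (hypothetical helper)."""
    env = {}
    for entry in entries:
        name, sep, value = entry.partition("=")
        if sep and name:
            env[name.strip()] = value.strip()
    return env


def profiler_summary(env):
    """Collect the profiler-related settings discussed in this issue."""
    prefix = "MicrosoftInstrumentationEngine_"
    return {
        "profiling_enabled": env.get("COR_ENABLE_PROFILING") == "1",
        "profiler_clsid": env.get("COR_PROFILER"),
        "profiler_path_64": env.get("COR_PROFILER_PATH_64"),
        # Engine settings with the common prefix stripped, e.g. "DebugWait".
        "engine_settings": {
            name[len(prefix):]: value
            for name, value in env.items()
            if name.startswith(prefix)
        },
    }


def read_service_environment(service):
    """Read a service's Environment value (Windows only).

    Assumes the value is REG_MULTI_SZ, so QueryValueEx returns a list of
    strings; needs sufficient rights to read the service key.
    """
    import winreg  # imported lazily so the helpers above work on any OS

    key_path = "SYSTEM\\CurrentControlSet\\Services\\" + service
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
        entries, _value_type = winreg.QueryValueEx(key, "Environment")
    return parse_environment(entries)
```

On the affected server, `profiler_summary(read_service_environment("MSExchangeHM"))` would then show at a glance whether the service sees the values listed above.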

[4 images]

It is really weird that none of the above actions resulted in any progress.

@sanikolov
Author

sanikolov commented May 21, 2021

By the way, Process Explorer shows werfault.exe popping up at the time of the crash, as is customary.
So in my opinion we're dealing with a crash, not a silent exit.

@WilliamXieMSFT
Member

Hi @sanikolov, would it be possible to run procdump to generate a dump file? Also, would you mind setting MicrosoftInstrumentationEngine_LogLevel=Errors|Dumps and MicrosoftInstrumentationEngine_FileLogPath to some folder on disk ([path]\ with backslash) so we can persist the logs for all processes that get profiled?
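
A minimal sketch of that logging change in Python, assuming the service environment is handled as a list of `NAME=VALUE` strings. `with_file_logging` is a hypothetical helper, and writing the result back to the registry is deliberately left out.

```python
def with_file_logging(entries, log_dir, level="Errors|Dumps"):
    """Return a copy of the entries with the two logging variables set.

    Hypothetical helper: any existing LogLevel / FileLogPath entries are
    replaced, and every other entry is kept in its original order.
    """
    prefix = "MicrosoftInstrumentationEngine_"
    managed = {(prefix + "LogLevel").lower(), (prefix + "FileLogPath").lower()}
    # Windows environment variable names are case-insensitive.
    kept = [e for e in entries if e.partition("=")[0].lower() not in managed]
    # The log folder should end with a backslash, as requested above.
    if not log_dir.endswith("\\"):
        log_dir += "\\"
    return kept + [
        prefix + "LogLevel=" + level,
        prefix + "FileLogPath=" + log_dir,
    ]
```

For example, `with_file_logging(current_entries, "C:\\tmp\\logs")` would yield the list to store for the service, where `current_entries` stands for whatever the service has configured today.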

@sanikolov
Author

I tried your suggestions. Nothing came of it.
First, I registered ProcDump as a system-wide catch-all for crashing processes:

C:\tmp\Procdump>procdump64.exe -i c:\tmp\crashes -ma

ProcDump v10.0 - Sysinternals process dump utility
Copyright (C) 2009-2020 Mark Russinovich and Andrew Richards
Sysinternals - www.sysinternals.com

Set to:
  HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug
    (REG_SZ) Auto     = 1
    (REG_SZ) Debugger = "C:\tmp\Procdump\procdump64.exe" -accepteula -ma -j "c:\tmp\crashes" %ld %ld %p

ProcDump is now set as the Just-in-time (AeDebug) debugger.

Nothing was generated in folder C:\tmp\crashes\ during several service restarts, each of which crashed.
Subsequently I changed MicrosoftInstrumentationEngine_LogLevel to Errors and later to Dumps, while setting MicrosoftInstrumentationEngine_FileLogPath to a valid location (which I have done tens of times in the past few months).
I restarted the service a couple more times.
No logs were generated.
This bug may require special skill or debug environment to resolve, such as a JTAG or something sophisticated like that.
At this point I am gonna have to leave the bug in your capable hands.
