Through some extensive testing I've discovered a fundamental bug in either Exchange 2010, ESXi, or both.
The Issue: My VMs freeze completely at random intervals, usually after running for a few hours, and never recover. No amount of debugging or pattern hunting on the Windows side has revealed a culprit, and I have not found a single process or activity that correlates with the failure. From the OS perspective, the HW just freezes. From the ESXi perspective, the VM goes to maximum CPU utilization and medium memory utilization, and stays there permanently. VMware Tools loses its connection and the VMs have to be manually reset.
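For anyone who wants to catch this automatically instead of finding a dead VM hours later, here is a rough sketch of how the signature above (CPU pinned at maximum and staying there) could be flagged. It assumes you are already collecting per-VM CPU utilization percentages from somewhere; the function name, the 95% threshold, and the 10-sample window are my own illustrative choices, not anything from VMware.

```python
from typing import Sequence


def looks_frozen(cpu_samples: Sequence[float],
                 threshold: float = 95.0,
                 min_consecutive: int = 10) -> bool:
    """Return True if the most recent samples are all pinned at or above threshold.

    cpu_samples: CPU utilization percentages for one VM, oldest first.
    threshold, min_consecutive: hypothetical tuning values, adjust to taste.
    """
    if len(cpu_samples) < min_consecutive:
        # Not enough history to say anything yet.
        return False
    # Only the most recent window matters: a frozen VM never comes back down.
    recent = cpu_samples[-min_consecutive:]
    return all(sample >= threshold for sample in recent)


if __name__ == "__main__":
    history = [20.0, 35.0] + [100.0] * 10
    print(looks_frozen(history))
```

A normal busy spike drops back below the threshold within the window and is ignored; only a VM that stays pinned for the whole window gets flagged.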
My platform is fairly simple:
Please note that numerous other VMs running Server 2008 R2 have no problems at all, only the ones running Exchange 2010 have problems.
Troubleshooting steps, none of which made any apparent difference:
The workaround: I loaded up Server 2008 SP2 (non-R2) and everything works perfectly. No freezes, and in fact no problems of any kind. If anything it feels faster to me in terms of GUI consoles, responsiveness, etc. I even got Forefront and DAGs to work just fine on this platform.
What does this mean? I think it means that we have positively identified a serious bug that both VMware and Microsoft should be taking seriously. If Microsoft caused this to promote their Hyper-V product, then that's a definite misstep. If VMware knows this is a problem and isn't officially acknowledging it yet, then that is also a big misstep. We have positively identified that on some platforms, Exchange 2010 on Server 2008 R2 simply will not work properly. There is NO existing change, variable or patch that helps (vCPU, memory, video, network, storage, etc). So far I've only heard reports from individuals running this on HP DL380 G5 systems. That might point to a culprit, but it is hard to say, because this HW is one of the most popular platforms on earth. I have confirmed that the issue is identical on Intel 5160 and 5460 processors alike.
So to summarize: EWE = URFM (ESXi 4.0.1 + Windows Server 2008 R2 + Exchange 2010 = Unusable Randomly Freezing Machines)