VM crashes, is unstable, or fails to start on a server with 1 TB or more of memory

Answer ID 4562   |    Updated 10/11/2017 12:55 PM

A VM may crash, become unstable, or fail to start when the server has 1 TB or more of system memory. For example, when Citrix XenDesktop is used with a Windows 7 guest OS, a blue screen crash may occur.

This issue affects only systems with supported GPUs based on the Maxwell architecture: Tesla M6, Tesla M10, and Tesla M60.

Failure symptoms can vary. The nvidia-smi output may look similar to the following:

Fri Sep 15 09:13:43 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 384.73                 Driver Version: 384.73                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla M60           On   | 00000000:05:00.0 Off |                  Off |
| N/A   35C    P8    24W / 150W |    526MiB /  8191MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla M60           On   | 00000000:06:00.0 Off |                  Off |
| N/A   30C    P8    23W / 150W |     13MiB /  8191MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     71728  M+C+G   TestImage                                    508MiB |
+-----------------------------------------------------------------------------+
GPU 00000000:05:00.0: Detected Critical Xid Error
GPU 00000000:05:00.0: Detected Critical Xid Error
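
As a rough illustration, captured output can be scanned for the critical Xid message shown above. This is a minimal sketch, not an NVIDIA tool: the function name find_critical_xid_gpus is hypothetical, and the pattern assumes the message wording is exactly as it appears in this article, which may differ between driver versions.

```python
import re
from typing import List

# Assumption: the message text matches the sample in this article exactly.
XID_LINE = re.compile(
    r"^GPU\s+(?P<bus_id>[0-9A-Fa-f:.]+):\s+Detected Critical Xid Error$"
)


def find_critical_xid_gpus(output: str) -> List[str]:
    """Return bus IDs that report a critical Xid error, deduplicated, in first-seen order."""
    bus_ids: List[str] = []
    for line in output.splitlines():
        match = XID_LINE.match(line.strip())
        if match and match.group("bus_id") not in bus_ids:
            bus_ids.append(match.group("bus_id"))
    return bus_ids


# The two trailing lines from the sample output in this article.
sample = """\
GPU 00000000:05:00.0: Detected Critical Xid Error
GPU 00000000:05:00.0: Detected Critical Xid Error
"""

print(find_critical_xid_gpus(sample))
```

An empty result does not rule out this issue, since the failure symptoms can vary.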

The solution is to limit the amount of system memory on the server to less than 1 TB.
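
To check whether a Linux-based host falls in the affected range, the reported memory total can be compared against the limit. The sketch below is illustrative only and rests on several assumptions: "1 TB" is taken to mean 2**40 bytes, the helper names are hypothetical, and the input is text in the /proc/meminfo format. MemTotal is typically somewhat lower than the installed RAM, and on a hypervisor the management domain may see only part of the host's memory, so the figure reported by the hypervisor's own tools is the one that matters.

```python
ONE_TIB = 1 << 40  # Assumption: "1 TB" in this article means 2**40 bytes.


def mem_total_bytes(meminfo_text: str) -> int:
    """Extract MemTotal from /proc/meminfo-style text and return it in bytes."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line format: "MemTotal:  <value> kB", where kB means 1024 bytes.
            return int(line.split()[1]) * 1024
    raise ValueError("MemTotal entry not found")


def at_or_above_limit(total_bytes: int, limit: int = ONE_TIB) -> bool:
    """True if the memory total is in the range this article describes as affected."""
    return total_bytes >= limit


# Made-up sample values for illustration: 1 TiB and 512 GiB, in kB.
affected = "MemTotal:       1073741824 kB\nMemFree:        1000000 kB\n"
unaffected = "MemTotal:        536870912 kB\n"

print(at_or_above_limit(mem_total_bytes(affected)))
print(at_or_above_limit(mem_total_bytes(unaffected)))
```

On a real Linux host, the text passed to mem_total_bytes would be the contents of /proc/meminfo.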
