FAQs (Frequently Asked Questions) from VMware-NVIDIA Community Webinar 18th May 2016 on vGPU Time-slicing
Wed, 18th May 2016 (8AM PST, 11AM EST, 4PM UK) - NVIDIA / VMware Webinar Series: Unveiling the Impact of Time Slicing with NVIDIA GRID
This is a summary of the questions arising from the interactive chat Q&A of a webinar presented by Erik Bornhorst from NVIDIA's GRID Performance Engineering Group and Tony Paikeday from VMware, for the VMware User Group Community in May 2016. We have checked and enhanced the answers, but readers should be aware that answers may change as products evolve, so all information should be verified against the current product documentation. We aim to release these FAQs for the benefit of those who attended the webinars and also the wider community. Users with follow-on questions are encouraged to post on the NVIDIA GRID Forums, where GRID support, product and engineering staff answer questions: https://gridforums.nvidia.com/ .
Q: Will this webinar be available for replay later?
A: Yes it will - all our past VMware webinars are on the community group site https://communities.vmware.com/community/vmtn/vmware-nvidia-direct-access-program
We also have started publishing FAQs from the webinars on our Knowledge base, for example: http://nvidia.custhelp.com/app/answers/detail/a_id/4112/. You can also click on the "Answers" tab and search for information, known issues, how-to-guides and videos for GRID and vGPU.
Q: I'm talking to a customer tomorrow about how to implement vGPUs in the customer's environment. Where can I get more information?
A: If you have questions or need pointers, our forums are a great resource for pure NVIDIA questions: http://gridforums.nvidia.com is where you'll find a lot of NVIDIA staff and community gurus.
Customers often find it useful to review the case studies containing details of others' real world problems and use of GRID as a solution, see: https://virtuallyvisual.wordpress.com/useful-links/general-citrix-case-studies/
The main GRID webpage also carries a great deal of detailed information under the "resources" tab: http://www.nvidia.com/object/grid-enterprise-resources.html
For those new to virtualization they may prefer to use a local NVIDIA partner with experience of GRID technologies: http://www.nvidia.com/object/partner-locator.html
Q: Does this same time-slicing technology occur on GRID K1/K2 cards? Or just the new Tesla M60/M6/M10?
A: Yes, the information on time-slicing is relevant to all generations of GRID boards, including the K1/K2 - they share the same architectural model. You can re-watch the webinar, posted at https://communities.vmware.com/community/vmtn/vmware-nvidia-direct-access-program, for an overview of the time-slicing technology.
Q: We are integrating M6 MXM with Dell M series blades (M630) and have interest from both Citrix and VMware shops in high-end VDI using GRID. Where can I find out more about Dell blade solutions?
A: At the time of the webinar (this may change in the future), Dell does not ship a blade server that takes M6 cards directly. Partner Amulet Hotkey (AHK) does; customers can contact AHK directly to find out more. Customers using virtualization stacks such as Citrix/VMware are advised to always check the server support provided by those vendors via their HCLs (Hardware Compatibility Lists); advice on how to do this can be found here:
Q: Where can I ask more questions?
A: For VMware-related questions regarding NVIDIA GRID, our joint forum is the best place: https://communities.vmware.com/community/vmtn/vmware-nvidia-direct-access-program. For NVIDIA GRID-specific issues, the NVIDIA GRID forums (https://gridforums.nvidia.com/default/board/119/nvidia-grid-forums/) are a good place to ask. Or join another of our frequent webinars and ask your questions live!
Q: I am a bit confused about how to actually configure this time slicing in a View environment. Are there any specific instructions?
A: Time-slicing is a built-in technology; the user does not need to do anything, and it happens automatically. Long-term, we may add controls to prioritize certain VMs if there is sufficient customer demand. By explaining the automated technology, we hope to give customers greater insight when doing their own evaluations. If you have further questions, please visit the NVIDIA GRID Forums and our engineers can talk you through them.
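To build intuition for what the automatic scheduler does, the round-robin-with-skip behaviour can be sketched as a toy simulation. This is purely illustrative - the function name and work model below are invented for this sketch and are not NVIDIA's actual hardware scheduler:

```python
def schedule(pending_work, num_slices):
    """Toy round-robin scheduler: each time slice goes to the next vGPU
    in turn, but a vGPU with no pending work is skipped, so busy vGPUs
    can pick up the spare cycles."""
    order = []
    start = 0
    n = len(pending_work)
    for _ in range(num_slices):
        for step in range(n):
            cand = (start + step) % n
            if pending_work[cand] > 0:   # this vGPU has queued work
                pending_work[cand] -= 1  # it consumes this slice
                order.append(cand)
                start = cand + 1         # resume round-robin after it
                break
        else:
            order.append(None)           # every vGPU idle this slice
    return order

# vGPU 0 has 3 units of queued work, vGPU 1 has 1, vGPU 2 is idle:
print(schedule([3, 1, 0], 4))  # [0, 1, 0, 0]
```

Note how vGPU 0 receives 3 of the 4 slices even though its guaranteed share on a 3-vGPU pGPU would be roughly a third: idle vGPUs are skipped and busy ones absorb the spare cycles, which is the consolidation effect the webinar graphs illustrate.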
Q: We have 11 VMs per host using the K140Q profile on K1 cards. All hosts have 2 physical cards installed, and the NVIDIA hard limit per card is 16. Is it safe to remove a card? How does time-slicing come into play, and would one card be enough?
A: The K140Q profile puts up to 4 vGPUs on each of the K1's 4 pGPUs (physical GPUs). Each vGPU is guaranteed a 25% time slice; if a VM has no work queued, the scheduler skips to the next vGPU. At the moment, with 11 vGPUs over 8 pGPUs (2 x K1 boards), you have 1 or 2 users per pGPU if they are spread evenly. This means each vGPU has all the cores of a pGPU available 50% or 100% of the time.
You could put all your users on 3 pGPUs with 3 or 4 users per pGPU, so each vGPU would be guaranteed 25-33% of a pGPU. If your users are using the GPU heavily, this will reduce the amount each gets; but, as Erik showed, many applications are very bursty in their GPU usage, so in practice the scheduler skips to the next vGPU a great deal and consolidating on fewer pGPUs can work well. Basically, you'll have to try with your users and their work patterns.
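The worst-case arithmetic above (25% guaranteed with 4 vGPUs sharing a pGPU, 50-100% with 1-2) can be written as a small calculation. The helper below is hypothetical, written for this FAQ, and assumes vGPUs are spread as evenly as possible:

```python
import math

def guaranteed_share(num_vgpus, num_pgpus, max_per_pgpu=4):
    """Worst-case guaranteed fraction of a pGPU per vGPU, assuming
    vGPUs are spread as evenly as possible across the physical GPUs.
    max_per_pgpu=4 matches the K140Q profile limit."""
    if num_vgpus > num_pgpus * max_per_pgpu:
        raise ValueError("profile does not allow this many vGPUs")
    densest_pgpu = math.ceil(num_vgpus / num_pgpus)  # most vGPUs on any one pGPU
    return 1.0 / densest_pgpu

# 11 vGPUs over 8 pGPUs (2 x K1): at most 2 share a pGPU -> 50% guaranteed
print(guaranteed_share(11, 8))  # 0.5
# Consolidated onto 3 pGPUs: up to 4 share a pGPU -> 25% guaranteed
print(guaranteed_share(11, 3))  # 0.25
```

Remember this is only the guaranteed floor; with bursty workloads the time-slicer hands idle slices to busy vGPUs, so observed performance is usually better.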
Q: Are these graphs something we'd be able to reproduce in our own environments, or are these "doctored" charts, purely for illustrative purposes?
A: The graphs in the webinar show real data collected by GRID performance engineering team and yes they should be reproducible. We have taken note to include more details on how users can do this when we present future graphs and data.
Q: What were the tools used for sizing and performance testing shown in the webinar?
A: The Performance Engineering Team used Windows Performance Recorder and Windows Performance Analyzer to demonstrate the impact of time-slicing the 3D Engine when executing real user workloads. Windows Performance Analyzer allowed us to drill down to very short sample intervals to show that GPU cycles could be available even when users run GPU-"heavy" tasks like panning, zooming or rotating. Unfortunately, Windows Performance Recorder/Analyzer can't be used for sizing or performance testing due to limitations (DirectX only) and complexity (logging just a couple of minutes requires gigabytes of data).
Q: Were the benchmarks shown in the webinar performed using PCoIP or Blast Extreme/Horizon 7?
A: The GRID performance engineering team has benchmarked both protocols, but the results in the presentation were mostly server-side and concerned the applications' sporadic use of the GPU, which is independent of the protocol.
Q: During the webinar you announced a new GRID board, the M10, where can I find details?
A: The M10 is a 4-GPU card supporting 64 users per card, so up to 128 vGPUs in a host with two cards. The press release about the M10 GRID card is available here: http://www.marketwired.com/press-release/nvidia-grid-delivers-100-graphics-accelerated-virtual-desktops-per-server-nasdaq-nvda-2126274.htm.
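The per-host figure follows from simple density arithmetic. The sketch below just restates the numbers in the answer; the two-cards-per-host assumption is ours, chosen to match the 128-vGPU figure:

```python
# Density arithmetic for the M10 (numbers from the answer above)
gpus_per_card = 4
users_per_card = 64

users_per_pgpu = users_per_card // gpus_per_card  # 16 vGPUs per physical GPU
cards_per_host = 2                                # assumed dual-card server
users_per_host = users_per_card * cards_per_host  # the 128 vGPUs quoted above

print(users_per_pgpu, users_per_host)  # 16 128
```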
Q: When using the K1, each VDI user holds/locks up the GRID's GPU memory and does not release the memory as long as the VDI is ONLINE whether they are logged in or not. I don't understand how time-slicing will be able to release this dedicated memory for another user to use.
A: Yes, the framebuffer (FB) is dedicated (it belongs to just one VDI user - totally secure); it is the processing cores, not the memory, that are shared via time-slicing. It's like having dedicated RAM for a VM whilst sharing CPU resource via vCPUs. Time-slicing is performed on the GPU in hardware, not software, and as such is a unique, optimized NVIDIA technology that allows scalability and consolidation in VDI/XenApp etc.
Q: What does the performance look like for video-editing workstations with special effects, etc.? Can VDI deliver the performance required?
A: It really depends on the users and the application. Video editing and special effects can be demanding on rendering and also audio/video sync, so the limitations and bandwidth demands of the VMware remote protocol may need to be considered. We have a lot of folks using Autodesk Mudbox and Maya on VDI. VDI also supports Wacom/USB haptic devices on various platforms/end-clients (but again, check your needs with VMware/Citrix).
Q: I have a suggestion for a future webinar: nuts and bolts of measuring/monitoring, what to measure, where to measure (in guest, in host, at endpoint), and tools for each. It seems like there's a lot of what not to do out in the community, but not much on the what to do and how to do it.
A: That is a very good idea and one we have noted and are looking into. Meanwhile, there may be more information available than you have found. Resources already available include:
· Knowledge base articles such as this one on measuring frame buffer (http://nvidia.custhelp.com/app/answers/detail/a_id/4108/) or this one on measuring CPU/vCPU contention for XenServer (http://nvidia.custhelp.com/app/answers/detail/a_id/4117/).
· You can always ask questions about monitoring on our forums, there is even a dedicated monitoring area: https://gridforums.nvidia.com/default/board/131/monitoring-assessment-tools/
· Our engineering team publishes informal YouTube videos guiding users through monitoring: https://www.youtube.com/user/JasonShootsPeople/videos
· And of course we do regular webinars with many partners.
Please also remember the GPU is only one resource; CPU, IOPS and RAM all interact and limit the scalability of a system. Please do look at the information available from the virtualization vendors and communities too about monitoring GPU in a virtualized context.
Q: Have NVIDIA been testing with Login VSI GFXMax?
A: Login VSI recently announced a new graphics testing framework for VDI based on real CAD/3D applications. We certainly do test with a lot of the applications Login VSI has included in its graphics workload (it's a collection of real workloads), although we do not explicitly use this benchmark in our own testing. Our own testing systems are somewhat constrained in adopting new frameworks by the need to continuously regression-test existing products against historical baselines. As a new framework in beta it is one we are evaluating, and we work closely with Login VSI to collaborate on interoperable technologies.
Q: Is the M10 following the same pricing model as the M60?
A: The M10 will also follow a software-licensed model. The M10 card is primarily aimed at business and mainstream VDI applications (as was the K1) and aligns with vApps and vPC licensing. The M60 is better suited to 3D/CAD users, where vWks licensing is more likely to be needed. The M10 has a higher user density per card (64 users); as such, whilst the licensing model is the same, the M10 will have a very different value and price point to the M60, better suited to mainstream business VDI and apps (e.g. XenApp).
Details of the M10 announcement can be found here: http://nvidianews.nvidia.com/news/nvidia-grid-delivers-100-graphics-accelerated-virtual-desktops-per-server
Software licensing provides supported users with direct 24/7 support from NVIDIA enterprise support.