PVS (and MCS) are amazing – but very often misunderstood – technologies. I have been in the reluctant camp with MCS (Machine Creation Services), mostly because of the huge impact I have seen PVS (Provisioning Services) have on organizations' workflows and performance. For many years, the issue I have found in the field is that people tend to focus entirely on storage performance with these technologies.
I won't keep you in suspense: storage has a minimal impact compared to the more modern ability to cache in RAM. The OS's RAM is always… ALWAYS faster than your storage can ever be. So while I do hold that hyperconverged infrastructure does wonders for improving MCS (which has suffered from poor read speeds and very poor deployment speeds since the beginning), nothing improves performance like even a small amount of RAM caching.
But MCSIO (the caching in MCS) is unstable, you say… Only if you don't give it enough RAM to work with and don't properly defragment your system. As I have found in recent cases, you'll get better results spending on RAM than on storage… but let's not get too far ahead.
This is a teaser article for Chapter 4 of the Citrix Hero Program.
While the program is now closed, the good news is that you can get access to over 12 hours of video lessons, including this one, by enrolling at https://ctx.academy
Or get the book! This topic is fully expanded in Chapter 4 of Be a Citrix Hero.
Problem 1 – VDA Performance Lags
Problem Description
User experience suffers, especially later in the day. The problem seems worst for Machine Creation Services (MCS) managed Server VDA hosted desktops that stay open all day long, though MCS Desktop VDAs show similar symptoms. While not as heavily impacted, a similarly configured Provisioning Services (PVS) target device exhibited the same symptoms. Users experience system pauses without corresponding CPU utilization spikes. While OS optimization helps slightly, the overall user experience seems to suffer regardless of how many programs are open. Workspace Environment Manager has helped with memory and CPU issues but has not solved the overall slowdowns.
Troubleshooting Notes
The Administrators note that the problems do seem to increase as more sessions become active, but cannot explain why the same number of sessions at an earlier point in the day does not experience the slowdowns, so they suspect the number of sessions may not be the primary factor.
Storage Engineers are concerned because IOPS do increase toward the end of the day, but they remain well below what the system is rated to perform. A recent move to an all-flash array has not had a noticeable impact, so they have encouraged looking more closely at the programs themselves or have suggested it is a 'Citrix problem'. Network Engineers report no issues or notable differences between the beginning and end of the workday.
Solutions to Test
This problem tends to occur because the Cache in RAM with Overflow to Disk feature either has not been configured or has not been given adequate RAM. So… if you are experiencing these slowdowns, increase the RAM cache and see if that helps (a quick way to check whether your cache is adequate follows the list below).
Additional Considerations:
- Defragment the master image prior to deployment
- Ensure Antivirus software does not have scheduled scans for non-persistent MCS or PVS-provided VDAs
- Properly configure Citrix Optimizer (see Chapter 1 of the Become a Citrix Hero ebook – it's free!)
- Use Citrix Workspace Environment Manager to control CPU and Memory resource priorities and to free up RAM, especially on Server OS VDAs
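One rough way to check whether your RAM cache is large enough is to watch the overflow file on the write-cache disk: steady growth during the day suggests the cache is spilling to disk and is undersized. Below is a minimal Python sketch you could run on a VDA; the file path is an assumption (PVS typically writes vdiskdif.vhdx, and MCSIO typically writes mcsdif.vhdx, to the write-cache disk), so adjust it for your environment.

```python
import os
import time

# Hypothetical path -- PVS overflow is typically vdiskdif.vhdx and MCSIO is
# typically mcsdif.vhdx on the write-cache disk. Adjust for your environment.
OVERFLOW_FILE = r"D:\vdiskdif.vhdx"
INTERVAL_SECONDS = 300  # sample every 5 minutes

def size_mb(path: str) -> float:
    """Return the current file size in MB (0 if it does not exist yet)."""
    try:
        return os.path.getsize(path) / (1024 * 1024)
    except OSError:
        return 0.0

previous = size_mb(OVERFLOW_FILE)
print(f"Starting overflow size: {previous:.0f} MB")

while True:
    time.sleep(INTERVAL_SECONDS)
    current = size_mb(OVERFLOW_FILE)
    delta = current - previous
    # Steady growth here means the RAM cache is overflowing to disk --
    # a sign the cache is undersized for the workload.
    print(f"Overflow: {current:.0f} MB (change: {delta:+.0f} MB)")
    previous = current
```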
Want to go deeper? Access over 12 hours of video lessons in the Citrix Hero Program!
Problem 2 – PVS Target Devices (VDAs) lock up randomly or are slow at times
Problem Description
Even when PVS target devices are configured properly (enough cache, optimizations, etc.), users report apparent lock-ups that clear after a few seconds. Programs are sometimes slow to load, especially early in the morning. The problems are often much worse after new vDisk versions are deployed.
Troubleshooting Notes
Monitoring the VDAs indicates no unusual CPU activity – in fact, at the times users note issues, Administrators may see little to no CPU activity. Event logs do not indicate any related OS configuration issues.
The Citrix team confirmed that caching, optimization, and the recommendations above have been configured properly. Further troubleshooting revealed that many VDAs are reporting high retries.
Network Engineers note there are no packet loss or utilization issues on the network (note: be careful here; I have seen would-be Citrix Heroes refuse to accept this explanation and blame the network, only to be embarrassed later by the real cause…)
Solutions to Test
One of the most common causes of retries is actually the PVS server being unable to service read requests quickly enough, most typically because of inadequate RAM on the PVS server. When vDisks are read, Windows Server automatically caches the data. This reduces reads from disk because the non-persistent VMs are always reading the same data. However, because the cache is FIFO (First In, First Out), if the volume of reads exceeds the cache, the PVS server must read from its vDisk store. Even with very fast storage, seek times and transfer delays add up.
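To see why a cache that is even slightly too small hurts so much, here is a toy FIFO cache simulation in Python. This is a sketch to illustrate the idea, not how the Windows cache is actually implemented: once the working set exceeds the cache size, hit rates fall roughly in proportion, and every miss becomes a read from the vDisk store.

```python
from collections import OrderedDict
import random

def fifo_hit_rate(cache_blocks: int, working_set_blocks: int,
                  reads: int = 100_000) -> float:
    """Simulate random reads over a working set through a FIFO cache."""
    cache = OrderedDict()
    hits = 0
    for _ in range(reads):
        block = random.randrange(working_set_blocks)
        if block in cache:
            hits += 1  # served from RAM
        else:
            if len(cache) >= cache_blocks:
                cache.popitem(last=False)  # FIFO: evict the oldest entry
            cache[block] = True  # miss: this read hit the vDisk store
    return hits / reads

# Working set of 10,000 blocks; watch the hit rate as the cache shrinks.
for cache_size in (12_000, 10_000, 8_000, 5_000):
    rate = fifo_hit_rate(cache_size, 10_000)
    print(f"cache={cache_size:>6} blocks -> hit rate {rate:.0%}")
```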
Citrix has guidance on how to get started with proper RAM sizing for PVS (see below, plus the quick worked example that follows). A common misconception is that faster storage would serve better – in fact, some customers move to physical hardware or dedicated flash-based or RAM-based storage only to find the improvement minimal. RAM is a far cheaper solution. We'll discuss this in far more detail in the main lesson.
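As a rough illustration, the Citrix sizing guidance linked later in this lesson boils down to a base amount of RAM for the OS plus a per-vDisk allowance for Windows read caching. The constants below are the commonly cited starting points from that guidance; treat them as placeholders to validate against your own monitoring, not exact requirements.

```python
# Rule-of-thumb starting points (assumed from the Citrix guidance linked
# below): a base for Windows Server plus an allowance per actively
# streamed vDisk. Validate with Perfmon before settling on a number.
BASE_OS_GB = 2            # Windows Server itself
PER_SERVER_VDISK_GB = 4   # each actively served Server OS vDisk
PER_DESKTOP_VDISK_GB = 2  # each actively served Desktop OS vDisk

def pvs_server_ram_gb(server_vdisks: int, desktop_vdisks: int) -> int:
    """Suggested starting RAM for a PVS server streaming these vDisks."""
    return (BASE_OS_GB
            + server_vdisks * PER_SERVER_VDISK_GB
            + desktop_vdisks * PER_DESKTOP_VDISK_GB)

# Example: a PVS server streaming 2 Server OS and 3 Desktop OS vDisks
print(f"Suggested starting RAM: {pvs_server_ram_gb(2, 3)} GB")  # -> 16 GB
```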
Resolving PVS and MCS Performance Issues
So – how do you resolve these issues?
Here's a quick summary of what we cover in the rest of this lesson and some helpful links for you.
Problem 1 – VDA
- Recommended starting points for MCSIO-based RAM cache
  - Server OS ~ 2 to 6 GB (or much more)
  - Desktop OS ~ 512 MB
- Version Considerations (soon: MCSIO is dead, long live PVS)
- Consider: https://www.citrix.com/blogs/2014/04/18/turbo-charging-your-iops-with-the-new-pvs-cache-in-ram-with-disk-overflow-feature-part-one/ and https://www.citrix.com/blogs/2014/07/07/turbo-charging-your-iops-with-the-new-pvs-cache-in-ram-with-disk-overflow-feature-part-two/
- Consider also – use cases like Chrome can seriously impact your caching
- Recommended starting points for the PVS Cache in RAM with Overflow to Hard Disk feature
  - Server OS ~ 2 to 4 GB
  - Desktop OS ~ 256 to 512 MB
- How to determine whether you have configured enough RAM for the cache in VDAs
- Why Defragmentation matters
  - PVS vDisk defragment trick: https://support.citrix.com/article/CTX229864
  - Why defragmentation matters on PVS and MCS: https://www.citrix.com/blogs/2015/01/19/size-matters-pvs-ram-cache-overflow-sizing/
Problem 2 – PVS Server
- Recommended starting points for PVS (Server) RAM configuration: https://www.citrix.com/blogs/2013/07/03/pvs-internals-2-how-to-properly-size-your-memory/
- Use Perfmon to identify whether vDisk reads are coming from disk instead of RAM – make sure your Cache Read Hits % stays above 80% (see the sketch after this list)
- Antivirus scans can kill your performance… even/especially scans on the PVS server
- Quick Tip: When you update a vDisk, first boot a single VM, log on, and launch all of the programs users normally launch; then start/restart the other VMs
- Versioning helps with the caching problem, believe it or not!
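If you want to keep an eye on that read-hit counter over time, one low-tech approach is to log it and flag dips. The sketch below assumes you have exported a counter from the Windows "Cache" object to CSV first, for example with `typeperf "\Cache\Copy Read Hits %" -si 60 -sc 60 -f CSV -o cache.csv` ("Copy Read Hits %" is one counter that tracks read hits; the exact counter and file layout in your environment may differ, so treat the names here as assumptions).

```python
import csv

THRESHOLD = 80.0        # per the guidance above: stay above 80% read hits
LOG_FILE = "cache.csv"  # CSV exported by typeperf (assumed layout below)

# typeperf CSV: the first column is the timestamp and subsequent columns are
# counter values; the first row is a header naming each counter.
with open(LOG_FILE, newline="") as f:
    reader = csv.reader(f)
    header = next(reader)
    for row in reader:
        timestamp, value = row[0], row[1]
        try:
            hits = float(value)
        except ValueError:
            continue  # typeperf writes blanks for missed samples
        if hits < THRESHOLD:
            # Reads are falling through to the vDisk store -- consider
            # adding RAM to the PVS server.
            print(f"{timestamp}: read hits at {hits:.1f}% "
                  f"(below {THRESHOLD}%)")
```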
Want to go deeper? Get the full Be a Citrix Hero Book!
Available at booksellers and Amazon, sure – but save when you buy direct from D.J.!