
PVS and MCS Performance: Memory Matters

PVS (and MCS) are amazing, but very often misunderstood, technologies. I have been in the reluctant camp with MCS (Machine Creation Services), mostly because of the huge impact I have seen PVS (Provisioning Services) have on organizations’ workflows and performance. For many years, the issue I have found in the field is that people tend to focus entirely on storage performance with these technologies.

I won’t keep you in suspense: storage has a minimal impact compared to the more modern ability to cache in RAM. The RAM of the OS is always… ALWAYS faster than your storage can ever be. So while I do hold that hyperconverged infrastructure does wonders for improving MCS (which has, since the beginning, suffered from poor read speeds and very poor deployment speeds), nothing improves performance like even a small amount of RAM caching.

But MCSIO (the caching in MCS) is unstable, you say… Only if you don’t give it enough to work with and don’t properly defragment your system. As I’ve found in recent cases, you’ll get better results spending on RAM than on storage… but let’s not jump too far ahead.

This is a teaser article for Chapter 4 of the Citrix Hero Program.
The chapter given to the current subscribers is 19 pages, nearly 7000 words of content. What you see here is more of the “WHY” not all of the how. But I didn’t want to leave you completely out in the cold. So I’ve included some links and other content that I hope will be helpful to you.

So what is the Citrix Hero Program?

Each month we tackle a new leading practice and dive in deep to make sure you understand it.
We have a live Q&A and video lessons to supplement the full chapter.
Members in good standing will have access to key past articles. This means that if you subscribe next month, you’ll get access to this full article right away!

Enrollment will re-open in April – I encourage you to try it out then to lock in a special price!

Problem 1 – VDA Performance Lags

Problem Description

User experience suffers, especially later in the day. The problem seems worst on hosted desktops running on Server VDAs managed by Machine Creation Services that stay open all day long, though MCS Desktop VDAs show similar symptoms. While the impact was not as high, a similarly configured Provisioning Services (PVS) target device exhibited similar symptoms. Users experience system pauses without corresponding CPU utilization spikes. While OS optimization helps slightly, the overall user experience seems to suffer regardless of how many programs are open. Workspace Environment Manager has helped with memory and CPU issues but has not solved the overall slowdowns.

Troubleshooting Notes

The Administrators note that the problems do seem to increase as more sessions become active, but they cannot explain why the same number of sessions earlier in the day does not experience the slowdowns, so they suspect that the number of sessions may not be the primary factor.

Storage Engineers are concerned because IOPS do increase toward the end of the day, but the numbers are still well below what the system is rated to perform. A recent move to an all-flash array has not had a noticeable impact, so they have encouraged looking more closely at the programs themselves, or suggested that it is a ‘Citrix problem’. Network Engineers report no issues or notable differences between the beginning and end of the workday.

Solutions to Test

This problem tends to occur because the Cache in RAM with Failover to Disk feature either has not been configured at all or has not been given adequate RAM. So… increase the RAM cache if you are experiencing these slowdowns and see if that helps.
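To put rough numbers on that, here is a quick sketch of how much extra host RAM the cache feature consumes across a host’s VDAs. The figures in the example are illustrative assumptions, not Citrix guidance; the right per-VM cache size depends entirely on your workload, so test.

```python
# Illustrative sizing arithmetic (assumed figures, not Citrix guidance):
# estimate the extra host RAM needed when enabling Cache in RAM with
# Failover to Disk on every VDA on a host.

def extra_host_ram_gb(vms_per_host: int, cache_per_vm_gb: float) -> float:
    """RAM the cache feature adds on top of normal VM allocations."""
    return vms_per_host * cache_per_vm_gb

# Example: 8 Server OS VDAs per host, each with a 3 GB RAM cache
# (a commonly used Server OS starting range) -> 24 GB of extra host RAM.
print(extra_host_ram_gb(8, 3))  # 24.0
```

The point of doing the arithmetic up front is that the cache RAM has to come from somewhere: budget it into host sizing, or the hypervisor will be forced to swap and you trade one slowdown for another.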

Additional Considerations:

  • Defragment the master image prior to deployment
  • Ensure antivirus software does not have scheduled scans on non-persistent MCS or PVS-provisioned VDAs
  • Properly configure Citrix Optimizer (see Chapter 1 of the Become a Citrix Hero ebook – it’s free!)
  • Use Citrix Workspace Environment Manager to control CPU and Memory resource priorities and to free up RAM, especially on Server OS VDAs

Want to go deeper? Every month we take problems just like this and dive deep into the solutions in the Citrix Hero Program. Enrollment opens April 15th.

Problem 2 – PVS Target Device (VDAs) lock up randomly or are slow at times

Problem Description

When PVS Target devices are configured properly (enough cache, optimizations, etc.), users report apparent lock-ups that clear after a few seconds. Programs are sometimes slow to load, especially early in the morning. The problems are often much worse after new vDisk versions are deployed.

Troubleshooting Notes

Monitoring the VDAs indicates no unusual CPU activity; in fact, at the times users note issues, Administrators may see little to no CPU activity. Event logs do not indicate any related OS configuration issues.

The Citrix team confirmed that caching, optimization, and the above recommendations have been configured properly. Further troubleshooting revealed that many VDAs are reporting high retries.

Network Engineers note there are no packet loss issues or utilization issues on the network (note- be careful here; I have seen would-be Citrix Heroes not accept this explanation and blame the Network, only to later be embarrassed by the real cause…)

Solutions to Test

One of the most common causes of retries is actually the PVS server not being able to meet read requests rapidly enough. Most typically this is caused by inadequate RAM on the PVS server. When vDisks are read, Windows Server automatically caches the data. This reduces reads from disk, because the non-persistent VMs are always reading the same data. However, because the cache is FIFO (First In, First Out), if the volume of reads exceeds the cache, the PVS server must go back to its vDisk store. Even with very fast storage, seek times and transfer rates can cause delays.

Citrix has guidance on how to get started with proper RAM sizing for PVS (see below). A common misconception is that faster storage would be better; in fact, some customers use physical hardware or dedicated flash-based or RAM-based storage, only to find that the improvement is minimal. RAM is a far cheaper solution. We’ll discuss this in far more detail in the main lesson.
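As a starting point, the commonly cited heuristic goes something like the sketch below: a base allocation for the OS, per-vDisk headroom (more for Server OS vDisks, which are read harder), plus a safety buffer. Treat the constants as assumptions and check the linked Citrix article for the authoritative version.

```python
# Sketch of a commonly cited PVS server RAM heuristic. The per-vDisk
# headroom figures (4 GB for Server OS, 2 GB for Desktop OS) and the
# 15% buffer are assumptions taken from community guidance; verify
# against the Citrix sizing article linked in this post.

def pvs_server_ram_gb(server_os_vdisks: int, desktop_os_vdisks: int,
                      base_gb: int = 2, buffer: float = 0.15) -> float:
    # Server OS vDisks serve many sessions each, so they get more headroom.
    cache = server_os_vdisks * 4 + desktop_os_vdisks * 2
    return round((base_gb + cache) * (1 + buffer), 1)

# Example: 2 Server OS vDisks and 3 Desktop OS vDisks in the store.
print(pvs_server_ram_gb(2, 3))  # 18.4
```

The takeaway is the shape of the formula, not the exact constants: RAM requirements scale with the number of actively streamed vDisks, so retiring old vDisk versions is as much a sizing lever as adding memory.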

Want to go deeper? Every month we take problems just like this and dive deep into the solutions in the Citrix Hero Program. Enrollment opens April 15th.

Resolving PVS and MCS Performance Issues

So, how do you resolve these issues?

Here’s a quick summary of what we cover in the rest of this lesson and some helpful links for you.

Problem 1 – VDA

Problem 2 – PVS Server

  • Recommended starting points for PVS (Server) RAM configuration: https://www.citrix.com/blogs/2013/07/03/pvs-internals-2-how-to-properly-size-your-memory/
  • Use Perfmon to identify whether vDisk reads are coming from disk instead of RAM – make sure your Cache Read Hits % stays above 80%
  • Antivirus scans can kill your performance… even/especially scans on the PVS server
  • Quick Tip: When you update a vDisk first boot a single VM, logon and launch all the programs normally launched; then start/restart other VMs
  • Versioning helps with the caching problem, believe it or not!
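If you want to automate the Perfmon check above, a throwaway sketch like this can flag the dips. It assumes you have already exported the counter samples (e.g. via `typeperf` CSV output) into a list of floats; the exact counter name varies by Windows version, so confirm it in Perfmon first.

```python
# Throwaway sketch: flag samples where the cache read-hit percentage
# dips below the 80% target mentioned above. Assumes the Perfmon
# counter samples have already been exported to a list of floats
# (e.g. from typeperf CSV output); counter names vary by OS version.

def low_hit_samples(hit_pcts, threshold=80.0):
    """Return (index, value) pairs for samples below the threshold."""
    return [(i, v) for i, v in enumerate(hit_pcts) if v < threshold]

samples = [97.2, 95.8, 61.4, 99.0, 74.9]   # hypothetical counter values
print(low_hit_samples(samples))  # [(2, 61.4), (4, 74.9)]
```

Sustained dips below the threshold usually line up with retry spikes on the targets, which is exactly the correlation you want evidence of before asking for more PVS server RAM.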

Want to go deeper? Every month we take problems just like this and dive deep into the solutions in the Citrix Hero Program. Enrollment opens April 15th.

Citrix’s New Best Practice items – DJ’s Highlights

Gotta boost the signal on this. My friend Nick Rintalan from Citrix Consulting has put together a new ‘best practice’ (or ‘leading practices’ for the lawyers) update that I feel is important for people to see!

Nick Rintalan, Lead Architect and Best Practice Guru at Citrix Consulting

New Best Practice(s)?

Here are some of the highlights of the article, sorted here by what I feel is most important for you to read:

  • PVS and Memory Buffers. Yes, yes, for the love of all that is holy, yes. I haven’t yet deployed or validated the Write Cache features now in MCS, but I can tell you from experience that XenApp with 2-4 GB of RAM cache with failover to disk has been giving roughly 20-30% faster logons and an overall better experience for most of my customers.
  • Protocols (as in HDX). One of my primary frustrations for quite some time now is that Citrix XenDesktop ships by default with a protocol that delivers a good experience on the LAN but tends to be problematic at distance. H.264 is great for video, but frankly I hate it everywhere else. I think it almost singlehandedly ruined things for Citrix, since PCoIP can perform better than this hog (my opinion). Thinwire, and even the legacy encoder, actually deliver on the promises and deserve investigation in nearly every single use case I see. So I agree with Nick: use the policy templates included with 7.6 U3 and above (including LTSR) as a starting point. Odds are good you won’t be disappointed. When I say ‘use’ here, what I mean is: remember that you can apply these codecs per user, per connection, or even per delivery group, meaning filters are your friend! It is perfectly acceptable to have multiple codecs running for various use cases. One size nearly NEVER fits all, so test these out!
  • vSphere Cluster Sizing. Number 3 on my list right now. You need a dedicated resource cluster for enterprise workloads, but honestly, for XenApp workloads, consider more hosts per cluster. You should be using bigger VMs anyway, so the number of managed VMs stays about the same, just with more computing power. CCS is seeing 24+ hosts per cluster work just fine for XenApp. For XenDesktop with more than 5,000 VMs, I will add that a dedicated vCenter may save you a lot of pain… my opinion, and of course… you guessed it. TEST!
  • XenApp CPU Over-Subscription. Seriously. The “1.5x” thing needed an update so I’m glad to see some clarification here. In all things- I still encourage practical testing instead of just implementing something because “Citrix said to.”
  • PVS Ports and Threads. Those of you who know me know I bang this drum a lot- so here’s some backup for what I’m saying. The defaults are not good enough. Good design is still required!
  • Farm Design. You’re probably like me, coming along kicking and screaming from XenApp 6.5, which most would agree has been the “Windows XP” of the Citrix world. The 7.x line just hasn’t been that good yet, and I still feel 7.9 doesn’t have true feature parity… but as Nick describes, they are getting there. As always… TEST, TEST, and then TEST some more before you implement zones with FMA!!!!
  • XenMobile Optimizations. I guess we have to talk about it. XenMobile is here to stay, so best to not take the ‘out of the box’ experience there either.

Read More

READ: “New” Citrix Best Practices 2.0 | Citrix Blogs

Give it a read and let me know your thoughts… but most importantly- don’t forget to share this!

Citrix PVS or MCS? The Debate Rages On!

PVS (Provisioning Services) or MCS (Machine Creation Services, aka linked clones)? This is a long-standing debate that I’m hoping to have the time to address after Citrix Synergy. But I did appreciate this breakdown since I continue getting this question all the time: Should I choose PVS or MCS for my deployment?

Well, in our debate-obsessed culture (US Elections, Batman vs Superman, Captain America vs Iron Man… the list is endless), this one is heating up. In some ways, it’s like having to choose the lesser of multiple evils…

But in all honesty- how do you make the decision?

Well- of course it depends- but one thing you may want to consider first from Dan Feller’s recent blog- which bottleneck will you be experiencing?

PVS vs MCS – Part 2: Scalability | Ask the Architect

In a nutshell- if your storage is awesome (super fast with good deduplication capability) but your network may not be… MCS is an easy win.

If you plan to deploy to the cloud- MCS is an easy win.

If you need it deployed quick for a POC- MCS is an easy win.

But…

Considering the real network consumption to boot a VM is less than 300 MB, and that PVS makes diskless or near-diskless configurations possible…

PVS is still my reference standard, even for smaller environments. Here’s why:

  • PVS has a proven track record and the ability to deploy a single image to multiple hypervisor pools. MCS struggles because it requires copies of the master VM on each storage repository. While this has gotten better, PVS is still epic in this regard.
  • Networks have evolved and are barely ever a bottleneck that makes PVS struggle. Even a single 1 Gbit connection can boot and maintain several hundred target VMs. Given that most enterprise environments operate at more than 10 Gbit today, and that the load can be spread across multiple PVS servers… this factor barely exists any longer.
  • PVS has always reduced IOPS requirements overall, but in the past 4 years has seriously jumped forward because of two things:
    • Your .vhdx file is read and cached into RAM, so subsequent reads for target device requests come from RAM. This means you can scale nearly endlessly with virtually no IOPS impact from reading the base vDisk.
    • Write Cache in RAM with Failover to Hard Disk, while perhaps the longest feature name ever, is one of the single most epic bits of Citrix technology deployed in the past 10 years. Write IOPS, which used to be almost 90% of the overall IOPS required for PVS targets, are now greatly reduced because the writes are cached in RAM and in some cases don’t even hit the disk at all!
  • PVS makes a pod-based architecture viable, lowering downtime significantly. With the right design, you can have an entire rack of servers go offline and your users won’t even know. You can design in ways that allow you to mix storage and hypervisor pools that MCS has trouble maintaining. So when I say it scales “better”, I am rarely talking about quantity but operational quality. Of course, it all depends on good design, but if you want to hear more about that, I’d love to discuss it!
  • PVS prevents the SAN battle. Nearly every time I go into a XenDesktop deployment, the team managing the SAN storms into the conference room with a unified front, ready to say ‘no.’ But when I tell them we may not need them at all (local storage really is possible for PVS targets), or that our IOPS will be less than 1 per user… their shoulders drop, they smile, and they tell me to have a nice day. And I do. Because they said “yes”- because I’ve made their life easier.
  • PVS can track versioning and roll back images with far more speed and efficiency than any other technology I have ever seen.
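The network claim above is easy to sanity-check with back-of-the-envelope arithmetic, using the ~300 MB-per-boot figure from this post. The sketch deliberately ignores protocol overhead and the fact that real boots overlap and are staggered, so it is a worst-case serial estimate.

```python
# Back-of-the-envelope check of the claims above: how long does a boot
# storm take if each target streams ~300 MB and the PVS server has a
# single 1 Gbit/s uplink? (Ignores protocol overhead and the fact that
# boots overlap and are usually staggered over time.)

def boot_storm_minutes(targets: int, mb_per_boot: float = 300,
                       link_gbit: float = 1.0) -> float:
    total_bits = targets * mb_per_boot * 8 * 1e6   # MB -> bits
    return round(total_bits / (link_gbit * 1e9) / 60, 1)

# 300 targets x 300 MB over a 1 Gbit/s link -> ~12 minutes of streaming.
print(boot_storm_minutes(300))  # 12.0
```

Even this pessimistic serial estimate fits inside a normal maintenance window, which is why the network argument against PVS rarely holds up on modern infrastructure.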

Now, does PVS have a learning curve? Absolutely, and I think that’s the other thing that needs to be discussed further. I continue to see bad practices out there… but first I want to hear from you: What experiences have you had with MCS and PVS, and what are your thoughts? What kind of questions do you have? Comment below!

But if you want my advice in most cases, subject to a whole bucket of ‘it depends’ here it is:

  1. POC with MCS
  2. Small deployments and cloud-based deployments with MCS
  3. Go to Production with PVS
  4. Put on your gloves and get ready for a fight

Good luck! Share this with your colleagues – I’d love to hear more from people before I start the Citrix Imaging topic in a few weeks!

-DJ
