vSphere DRS/pDRS and vROps Workload Optimization
Updated: Apr 27, 2022
How are vSphere DRS and pDRS different from vROps Workload Optimization? This has been a common question since Workload Optimization was introduced back in vROps 6.7.
Let's start with the basics: what is DRS? Distributed Resource Scheduler (DRS) is a vSphere feature designed to give VMs the resources they need. DRS moves VMs between ESXi Hosts within a Cluster so that they aren't experiencing resource contention. The move itself is called a vMotion.
Before vSphere 7.0, DRS used a Cluster Imbalance metric to keep workloads balanced within a Cluster. The Cluster Imbalance metric is derived from the standard deviation of load across ESXi Hosts in the Cluster. If Cluster Imbalance is within a known threshold, the Cluster is considered balanced and VMs remain in place. If not, at least one ESXi Host in that Cluster has significantly more workload than the others, so DRS vMotions VMs to another ESXi Host in that Cluster in an effort to get them the resources they need.
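To make the Cluster Imbalance idea concrete, here is a minimal Python sketch. It is not the actual DRS algorithm: the function names, the normalized load values, and the 0.1 threshold are all assumptions chosen for illustration. It only shows the general shape of "standard deviation of host load compared against a threshold".

```python
from statistics import pstdev

def cluster_imbalance(host_loads):
    """Population standard deviation of per-host load (each load is 0.0-1.0)."""
    return pstdev(host_loads)

def is_balanced(host_loads, threshold=0.1):
    # threshold is a hypothetical value, not a real DRS setting
    return cluster_imbalance(host_loads) <= threshold

# Three hosts with similar load: treated as balanced, VMs stay in place.
print(is_balanced([0.50, 0.55, 0.45]))

# One host carries far more load than the others: a candidate for vMotion.
print(is_balanced([0.90, 0.30, 0.30]))
```

In this toy model, a tighter threshold would flag smaller differences between hosts as imbalance and therefore trigger more moves.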
In vSphere 7.0, DRS no longer uses Cluster Imbalance to vMotion VMs, but rather a metric called VM Happiness (surfaced in the vSphere Client as the VM DRS Score). If VMs are happy, they stay where they are. If VMs are unhappy, they get vMotioned to a different ESXi Host, so that each VM gets the resources it needs.
It should also be noted that pre-vSphere 7.0, DRS ran on a 5-minute schedule, whereas in vSphere 7.0 it now runs on a 1-minute schedule. VMware conducted a really nice performance study here: https://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/performance/drs-vsphere7-perf.pdf
There is another vSphere feature called Storage DRS, which applies to VMFS and NFS Datastores within a Datastore Cluster. Based on capacity and I/O, Storage DRS will move VM disk files (.vmdk) between Datastores to improve performance. The move itself is called a Storage vMotion.
Back to DRS, there are four automation levels available:
Disabled - DRS is disabled. VMs won't be moved nor will recommendations be made.
Manual - VM placement and migration recommendations are displayed, but do not run until you manually apply them.
Partially Automated - Initial VM placement is performed automatically. Migration recommendations are displayed, but do not run automatically.
Fully Automated - VM placement and migration recommendations run automatically.
Along with these automation levels, you'll also be asked for a migration threshold. This allows you to specify which recommendations are made and subsequently applied, either manually or automatically, based on your selection above. Think of migration thresholds as a dial: turned down to 1 (most conservative) for the fewest recommendations/moves, or turned up to 5 (most aggressive) for the most recommendations/moves. These recommendations are made based on performance metrics consumed from vCenter.
Most Conservative - makes/applies only priority 1 recommendations, i.e., recommendations necessary to satisfy cluster constraints like affinity rules or host maintenance.
Conservative - makes/applies priority 1 and 2 recommendations, i.e., those that will significantly improve the cluster load balance.
Default - makes/applies priority 1, 2, and 3 recommendations, i.e., those that will improve the cluster load balance.
Aggressive - makes/applies priority 1, 2, 3, and 4 recommendations, i.e., those that will only moderately improve cluster load balance.
Most Aggressive - makes/applies all recommendations, even those that will only slightly improve the cluster load balance.
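The relationship between the migration threshold and recommendation priority can be sketched as a simple filter. This is an illustrative Python model only: the dictionary keys, the recommendation structure, and the function name are hypothetical, and priority 5 is assumed to be the lowest priority covered by "all recommendations".

```python
# Highest-numbered priority that each migration threshold will make/apply.
MAX_PRIORITY = {
    "most_conservative": 1,
    "conservative": 2,
    "default": 3,
    "aggressive": 4,
    "most_aggressive": 5,
}

def applicable(recommendations, threshold):
    """Keep only recommendations at or above the threshold's priority cutoff."""
    limit = MAX_PRIORITY[threshold]
    return [r for r in recommendations if r["priority"] <= limit]

# Hypothetical recommendations; priority 1 is the most important.
recs = [
    {"vm": "vm-a", "priority": 1},  # e.g. host entering maintenance mode
    {"vm": "vm-b", "priority": 3},
    {"vm": "vm-c", "priority": 5},  # only a slight improvement
]

print([r["vm"] for r in applicable(recs, "most_conservative")])
print([r["vm"] for r in applicable(recs, "default")])
print([r["vm"] for r in applicable(recs, "most_aggressive")])
```

Turning the dial up never removes a recommendation; it only admits lower-priority ones on top of those already allowed.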
I'd like to highlight DRS affinity rules here because they are similar to vROps Workload Optimization Business Intent, but different. DRS affinity rules allow you to control the placement of VMs on specific ESXi Hosts within a Cluster.
VM-Host Affinity Rule: Used to specify affinity (or anti-affinity) between a group of virtual machines and a group of hosts. An affinity rule specifies that a group of VMs can or must run on a certain group of ESXi Hosts in that Cluster. An anti-affinity rule specifies that a group of VMs can't run on a certain group of ESXi Hosts in that Cluster.
VM-VM Affinity Rule: Used to specify affinity (or anti-affinity) between individual VMs. A rule specifying affinity causes DRS to try to keep the specified VMs together on the same ESXi Host. An anti-affinity rule tells DRS to try to keep the specified VMs separated. The reasoning here might be that when a problem occurs with one ESXi Host, you don't lose both VMs.
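As a rough illustration of what a VM-VM rule checks, the Python sketch below tests a placement (a mapping of VM name to ESXi Host name) against an affinity or anti-affinity rule. The function names, VM names, and host names are hypothetical; this is not how DRS evaluates rules internally, just the logic the rules describe.

```python
def violates_affinity(placement, rule_vms):
    """Affinity: all VMs in the rule should share a single ESXi Host."""
    return len({placement[vm] for vm in rule_vms}) > 1

def violates_anti_affinity(placement, rule_vms):
    """Anti-affinity: no two VMs in the rule should share an ESXi Host."""
    hosts = [placement[vm] for vm in rule_vms]
    return len(hosts) != len(set(hosts))

# Hypothetical placement: two domain controllers and an app/db pair.
placement = {
    "dc-01": "esxi-01",
    "dc-02": "esxi-01",
    "app-01": "esxi-02",
    "db-01": "esxi-02",
}

# Both domain controllers sit on esxi-01, so losing that host loses both.
print(violates_anti_affinity(placement, ["dc-01", "dc-02"]))

# The app and db VMs are together on esxi-02, as an affinity rule would want.
print(violates_affinity(placement, ["app-01", "db-01"]))
```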
The next generation of DRS is predictive DRS (pDRS), introduced in vSphere 6.5. pDRS is similar to DRS, but recommends and vMotions VMs based on predicted behavior determined by vROps. pDRS needs to be enabled in both vSphere and vROps.
This is not intended to be an exhaustive look into DRS/pDRS; there are literally books written on these topics. Two people who know these topics better than anyone in the world, Frank Denneman and Duncan Epping, are active bloggers. To explore DRS/pDRS in more detail, start with their blogs:
Frank Denneman - https://frankdenneman.nl
Duncan Epping - http://www.yellow-bricks.com
Now, let's talk about vROps Workload Optimization, which is the optimization engine in vROps that applies to all VMs in your vROps environment, on all ESXi Hosts, across all Clusters. This is different from DRS/pDRS in that vROps Workload Optimization will move workloads across Clusters, whereas DRS/pDRS vMotions VMs within Clusters.
You configure vROps Workload Optimization by adjusting Operational Intent and Business Intent.
Operational Intent allows you to tell vROps how and where to move VMs across Clusters. There are three options available.
Balance - will spread workloads evenly over the available resources, but may move workloads more often. This is good for more stable populations.
Moderate - will minimize workload contention, but will not attempt to move workloads to achieve better balance or consolidation.
Consolidate - will place workloads into as few clusters as possible, but allows for less responsive capacity. This is good for populations with steady demand, and may reduce licensing and power costs.
You are also given the option to specify Cluster Headroom, anywhere from 0% to 50%.
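Cluster Headroom acts as a capacity buffer that is held back from placement decisions. The arithmetic can be sketched in Python as follows; the function name and the capacity figures are hypothetical, and the sketch assumes headroom is applied as a simple percentage of total cluster capacity.

```python
def usable_capacity(total_capacity, headroom_pct):
    """Capacity left for workloads after reserving the headroom buffer."""
    if not 0 <= headroom_pct <= 50:
        raise ValueError("Cluster Headroom must be between 0% and 50%")
    return total_capacity * (100 - headroom_pct) / 100

# A hypothetical cluster with 1000 GHz of CPU and 20% headroom
# leaves 800 GHz that workloads are allowed to consume.
print(usable_capacity(1000, 20))
```

A larger headroom leaves more room for growth or demand spikes inside the Cluster, at the cost of using less of the hardware.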
Business Intent allows you to tell the vROps Workload Optimization engine where to place VMs based on business considerations, things like Operating Systems, Licensing, Tiers, Networks, and more. There are two methods available: Clusters and Hosts.
Cluster based business intent ensures VMs are placed on Clusters based on configured settings (multiple tags can be used and prioritized). For example, Oracle VMs will only be placed on Oracle Clusters.
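The Oracle example can be modeled as a tag match between a VM and the available Clusters. This Python sketch is only an illustration of the idea: the function name, cluster names, and tag values are hypothetical, and it models a single tag per Cluster rather than the prioritized multi-tag matching that vROps supports.

```python
def eligible_clusters(vm_tag, cluster_tags):
    """Return the clusters whose tag matches the VM's tag, sorted by name."""
    return sorted(name for name, tag in cluster_tags.items() if tag == vm_tag)

# Hypothetical clusters, each tagged with the workload type it is licensed for.
cluster_tags = {
    "cluster-01": "oracle",
    "cluster-02": "general",
    "cluster-03": "oracle",
}

# An Oracle VM may only land on the Oracle-tagged clusters.
print(eligible_clusters("oracle", cluster_tags))
```

Restricting the candidate set this way is what keeps licensed workloads from drifting onto unlicensed Clusters while the optimizer balances or consolidates.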
Host based business intent is also available, ensuring VMs are placed only on certain ESXi Hosts. Only one category can be used at a time.
Host based business intent would allow you to tie MSSQL VMs to MSSQL ESXi Hosts, for example. Please note that vROps Workload Optimization Host Based Business Intent supersedes all of the DRS/pDRS affinity rules described above.
The two approaches complement each other: vSphere DRS/pDRS balances VMs across ESXi Hosts within each Cluster, while vROps Workload Optimization balances workloads across Clusters. Enabling both gives users the best performance in their Software Defined Datacenter.