VMware vROps Alerts and Symptoms
Let's deconstruct vROps Alerts and explore their most important building block, the Symptom. First, what is an Alert made of? (I use vROps 8.1 for all examples in this blog.)
Alert - this is where you give the Alert its name, description, the base object type, impact, criticality, type and subtype, wait and cancel cycles.
Symptom - this is what triggers the Alert. There are four types of Symptoms: metric/property, message event, fault event, and metric event. You can define these Symptoms on self, parent, child, descendant, or ancestor. We will explore these four Symptom types in more detail later.
Recommendations - the alert payload can include a Recommendation for remediation based on industry standards, KBs, blogs, etc.
Policies - as part of the Alert definition you tell vROps which Policy you want to enable it in.
Notifications - an alert can have a Notification associated with it. In vROps 8.1 there are six Notification Plugins out of the box.
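To make the wait and cancel cycles from the Alert settings more concrete, here is a minimal Python sketch. It assumes that a wait cycle means the Symptom must stay triggered for that many consecutive collection cycles before the Alert fires, and that a cancel cycle means the Symptom must stay clear for that many consecutive cycles before the Alert is cancelled. The class and function names (AlertDefinition, alert_states) are hypothetical and are not the vROps schema or API; this is a simplified model, not how vROps implements it internally.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AlertDefinition:
    """Hypothetical model of the Alert components listed above (not the vROps schema)."""
    name: str
    base_object_type: str
    criticality: str
    wait_cycles: int = 1      # consecutive triggered cycles before the Alert fires
    cancel_cycles: int = 1    # consecutive clear cycles before the Alert is cancelled
    symptoms: List[str] = field(default_factory=list)
    recommendations: List[str] = field(default_factory=list)


def alert_states(definition: AlertDefinition, symptom_per_cycle: List[bool]) -> List[bool]:
    """Return whether the Alert is active after each collection cycle."""
    active = False
    streak_true = 0
    streak_false = 0
    states = []
    for triggered in symptom_per_cycle:
        # Track how long the Symptom has been continuously triggered or clear.
        if triggered:
            streak_true += 1
            streak_false = 0
        else:
            streak_false += 1
            streak_true = 0
        if not active and streak_true >= definition.wait_cycles:
            active = True
        elif active and streak_false >= definition.cancel_cycles:
            active = False
        states.append(active)
    return states


cpu_alert = AlertDefinition(
    name="VM CPU utilization is high",
    base_object_type="Virtual Machine",
    criticality="Warning",
    wait_cycles=2,
    cancel_cycles=2,
    symptoms=["CPU Utilization above threshold"],
)
history = [True, False, True, True, True, False, True, False, False]
print(alert_states(cpu_alert, history))
```

With both cycles set to 2, a single triggered or clear cycle does not change the Alert state in this model, which is the point of the two settings: they filter out short spikes and short recoveries.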
The triggering mechanism of the Alert is the Symptom, which can be broken down into four different types (https://docs.vmware.com/en/vRealize-Operations-Manager/8.1/com.vmware.vcom.core.doc/GUID-06380281-4B99-4E4B-9D4E-574E5D0A9194.html):
Metric / Property Symptom - these are based on metrics/properties against objects vROps has collected. A metric example would be CPU Utilization for a VM. A property example would be Power State of a Cisco switch port.
Message Event Symptom - these are based on events received as messages by vROps from an externally monitored system through that system's REST API. Message event examples would include events received by vROps from vCenter or Cisco UCS Manager.
Fault Event - these are Symptoms based on events that monitored systems publish. Faults are intended to signify events in the monitored systems that affect the availability of objects in your environment. A fault event example would be an ESXi Host memory device error.
Metric Event - these are Symptoms based on events sent from a monitored system where the selected metric violates a threshold. The external system manages the threshold, not vROps. A metric event example would be a drive capacity threshold event sent from Microsoft SCOM.
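The practical difference between the four Symptom types is who decides that something is wrong. The Python sketch below is one way to picture it, under the assumption that vROps compares a collected value against a condition for metric/property Symptoms, while for the three event-based types it only matches an event the monitored system has already raised. All names here (SymptomKind, metric_symptom_triggered, event_symptom_triggered, and the "kind" key on the event dictionary) are made up for illustration and are not part of any vROps API.

```python
import operator
from enum import Enum


class SymptomKind(Enum):
    METRIC_PROPERTY = "metric/property"
    MESSAGE_EVENT = "message event"
    FAULT_EVENT = "fault event"
    METRIC_EVENT = "metric event"


# Comparison operators a metric/property condition might use (illustrative set).
_OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
}


def metric_symptom_triggered(value, op, threshold):
    """Metric/property Symptom: the condition is evaluated on collected data."""
    return _OPERATORS[op](value, threshold)


def event_symptom_triggered(kind, event):
    """Event-based Symptom: the monitored system already raised the event,
    so this only checks that the received event is of the expected kind."""
    if kind is SymptomKind.METRIC_PROPERTY:
        raise ValueError("metric/property Symptoms are evaluated on collected data")
    return event.get("kind") == kind.value


# Metric example: CPU Utilization for a VM collected at 92 percent.
print(metric_symptom_triggered(92.0, ">", 90.0))
# Property example: Power State of a switch port.
print(metric_symptom_triggered("Powered Off", "==", "Powered Off"))
# Metric event example: a threshold event sent by an external system.
scom_event = {"kind": "metric event", "source": "Microsoft SCOM"}
print(event_symptom_triggered(SymptomKind.METRIC_EVENT, scom_event))
```

Note that the metric event case carries no threshold on the receiving side at all, which mirrors the point above that the external system manages the threshold, not vROps.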