Dynatrace provides out-of-the-box availability monitoring of OS services.
You can monitor hosts in full-stack monitoring mode or use lightweight monitoring modes. For more information, see Infrastructure and Discovery monitoring modes.
Depending on your monitoring requirements, you can choose between basic and advanced alerting of OS services. The Discovery mode allows only basic alerting, while the Full-Stack and Infrastructure monitoring modes also allow advanced alerting.
With the service status property in Smartscape, you can query the current status of the service.
fetch `dt.entity.os:service` | fieldsAdd status
If the alert is enabled, events and problems are created when a service status changes, such as when a service goes from running to failed. For more details, refer to Host availability.
With a service availability metric, you get more detailed information about your service on a per-minute basis. This allows you to monitor the status of the service and create more tailored alerts, such as only alerting if the service has been in a failed state for more than 10 minutes.
timeseries count(dt.osservice.availability),by:{dt.osservice.display_name, dt.osservice.status} | filter dt.osservice.display_name=="apache2"
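The "failed for more than 10 minutes" condition can be sketched outside Dynatrace as a check over per-minute status samples. This is a minimal illustration only: the function name, the one-sample-per-minute input list, and the `failed` status string are assumptions made for the example, not part of any Dynatrace API.

```python
from typing import Sequence


def failed_longer_than(statuses: Sequence[str], minutes: int,
                       failed_status: str = "failed") -> bool:
    """Return True if the newest samples show an unbroken failed run
    longer than `minutes`.

    `statuses` holds one status per minute, oldest first (assumed layout).
    """
    run = 0
    # Walk backwards from the newest sample and count the unbroken failed run.
    for status in reversed(statuses):
        if status != failed_status:
            break
        run += 1
    return run > minutes
```

A service that failed briefly and recovered does not trigger the check, because the run is counted only from the most recent sample backwards.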
To monitor an OS service, perform the following steps.
In Dynatrace, go to OS services monitoring for the level you are configuring. The level defines priority: settings at the host level override settings at the host-group level, and settings at the host-group level override settings at the environment level.
Host group: select the host group you want to configure. The Host group property is not displayed when the selected host doesn't belong to any host group. Select the <group name> link, where <group name> is the name of the host group that you want to configure.
Environment: go to Settings > Collect and capture > Infrastructure > OS > OS services monitoring.
Based on the service state and the rules, the service monitoring policy defines how Dynatrace monitors your service. By default, Dynatrace comes with the Auto-start Windows OS Services and Auto-start Linux OS Services policies for auto-started Windows and Linux services with failed status.
Note that the default limit of OS Service entities is 100,000 per cluster.
In larger environments with many hosts, we recommend creating precise rules that match only the important services for your infrastructure. Creating rules that are too general (for example, matching all services on thousands of hosts) may result in reaching the limit (entity explosion), leading to the disappearance of OS services from Dynatrace.
Also, the Auto-start Windows OS Services and Auto-start Linux OS Services policies can be used as a starting point for further refining the policies.
The order of service monitoring policies is important. Policies higher on the list are evaluated before those in lower positions, until one of them is fulfilled. This lets you create selective alerts or monitoring with a minimal number of policies. For example, if you want to monitor all auto-started services and not just those created by Microsoft, you need to add a policy with disabled alerting and/or monitoring that will verify if the manufacturer is Microsoft.
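One way to read the ordering rule is as first-match evaluation: the list is walked from the top and the first fulfilled policy decides how the service is handled. The sketch below illustrates that reading only. The `Policy` class, the service dictionary keys, and the policy names are hypothetical, and the first-match behavior itself is an assumption rather than confirmed product behavior.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Policy:
    name: str
    matches: Callable[[dict], bool]  # hypothetical stand-in for the policy's rules
    alert: bool
    monitor: bool


def first_fulfilled(policies: Sequence[Policy], service: dict) -> Optional[Policy]:
    """Return the first policy, top to bottom, whose rules the service fulfills."""
    for policy in policies:
        if policy.matches(service):
            return policy
    return None


# A policy with alerting and monitoring disabled sits above the general one,
# so services from the matching manufacturer never reach the general policy.
policies = [
    Policy("Microsoft services (no alerting)",
           lambda s: "Microsoft" in s.get("manufacturer", ""),
           alert=False, monitor=False),
    Policy("All auto-started services",
           lambda s: s.get("startup_type") == "auto",
           alert=True, monitor=True),
]
```

Under this reading, swapping the two policies would change the outcome, which is why the position in the list matters.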
On the OS services monitoring page for the level you are configuring, select Add policy and define the policy, which is a collection of rules, based on your OS.
System: select your operating system.
Rule name: enter the name that will be displayed in the Summary field.
Monitor: decide whether you want to monitor service availability using the OS service availability (builtin:osservice.availability) metric. If available, the metric sends the service status every 10 seconds. The status is carried by the Service status (dt.osservice.status) dimension.
Note that the metric consumes data points. For more information, see Metrics powered by Grail.
Alert: decide whether you want alerting for your policy.
Alert if service is not installed (OneAgent version 1.257+): decide whether you want to receive alerts about OS services that are not installed on the host.
Service status: set the service status for which an alert should be triggered.
You can use logic operations to monitor the service status. For example, $eq(running) monitors the running service state.
Available logic operations:
$not($eq(paused)) – Matches services that are in a state different from paused.
$or($eq(paused),$eq(running)) – Matches services that are in either the paused or the running state.
These are the service statuses you can monitor. Use one of the following values as a parameter for this condition:
running
stopped
start_pending
stop_pending
continue_pending
pause_pending
paused
Alerting delay (optional, OneAgent version 1.257+): the number of 10-second measurement cycles that a service must remain in the configured state before an event is generated.
Next, you need to select which services you want to monitor based on service properties.
Select Add rule.
Rule scope (optional): select either OS Service or Host. By default, the OS Service option is selected.
If you selected Host:
Custom metadata used for matching (OneAgent version 1.277+):
Key specifies the metadata key you want to match.
Condition in which you can define a string that:
$match(ver*_1.2.?) – Matches string with wildcards. Use * for any number of characters (including zero) and ? for exactly one character.
$contains(production) – Matches if production appears anywhere in the host metadata value.
$eq(production) – Matches if production matches the host metadata value exactly.
$prefix(production) – Matches if production matches the prefix of the host metadata value.
$suffix(production) – Matches if production matches the suffix of the host metadata value.
Available logic operations:
$not($eq(production)) – Matches if the host metadata value is different from production.
$and($prefix(production),$suffix(main)) – Matches if host metadata value starts with production and ends with main.
$or($prefix(production),$suffix(main)) – Matches if host metadata value starts with production or ends with main.
When including special characters such as brackets ( and ) within your matching expressions, escape these characters with a tilde ~. For example, to match a metadata value that includes brackets, like my(amazing)property, you would write $eq(my~(amazing~)property).
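The wildcard semantics described for $match (* for any number of characters including zero, ? for exactly one character) can be illustrated by translating the pattern into a regular expression. This is a sketch of the described semantics only, not OneAgent's implementation, and the function name is made up for the example.

```python
import re


def wildcard_match(pattern: str, value: str) -> bool:
    """Check `value` against a pattern where * is any run of characters
    (including none) and ? is exactly one character."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            # Every other character, including '.', is matched literally.
            parts.append(re.escape(ch))
    regex = re.compile("".join(parts), re.DOTALL)
    # The whole value must match, not just a substring of it.
    return regex.fullmatch(value) is not None
```

With the pattern from the list above, `wildcard_match("ver*_1.2.?", "version_1.2.3")` holds, while a value that ends right after `1.2.` does not match because ? requires one more character.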
If you selected OS Service, proceed according to your operating system.
Service property used for matching:
A monitoring rule may consist of multiple detection rules. All detection rules must be satisfied for the OS Service to match, as a logical AND operation is applied across all specified conditions.
With these properties, we define the services to be monitored based on:
Display name visible to a system user
Path to the service binary
Manufacturer of the service
Service name representing the name or ID under which OS service is recognized
Condition in which you can define a string that:
Starts with a value: use the $prefix qualifier, for example $prefix(ss).
Ends with a value: use the $suffix qualifier, for example $suffix(hd).
Equals a value: use the $eq qualifier, for example $eq(sshd).
Contains a value: use the $contains qualifier, for example $contains(ssh).
Matches a wildcard pattern: use the $match qualifier, for example $match(ip?tables*), where * matches any number of characters (including zero) and ? matches exactly one character.
Available logic operations:
$not($eq(sshd)) – Matches if the service's property value is different from sshd.
$and($prefix(ss),$suffix(hd)) – Matches if the service's property value starts with ss and ends with hd.
$or($prefix(ss),$suffix(hd)) – Matches if the service's property value starts with ss or ends with hd.
When including special characters such as brackets ( and ) within your matching expressions, escape these characters with a tilde ~. For example, to match a property value that includes brackets, like my(amazing)property, you would write $eq(my~(amazing~)property).
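To make the interplay of qualifiers, logic operations, and tilde escaping concrete, here is a small evaluator for the expression syntax described above. It is a sketch based only on the descriptions in this section, not Dynatrace's parser; the function names are hypothetical, and $match is left out to keep the example short.

```python
# Leaf qualifiers: each compares the service property value with a literal.
LEAF_OPS = {
    "eq": lambda value, arg: value == arg,
    "prefix": lambda value, arg: value.startswith(arg),
    "suffix": lambda value, arg: value.endswith(arg),
    "contains": lambda value, arg: arg in value,
}


def evaluate(expression: str, value: str) -> bool:
    """Evaluate an expression such as $and($prefix(ss),$suffix(hd)) for `value`."""
    result, pos = _parse(expression, 0, value)
    if pos != len(expression):
        raise ValueError(f"unexpected text at position {pos}")
    return result


def _parse(text: str, pos: int, value: str):
    if pos >= len(text) or text[pos] != "$":
        raise ValueError(f"expected '$' at position {pos}")
    open_paren = text.find("(", pos)
    if open_paren == -1:
        raise ValueError("missing '('")
    name = text[pos + 1:open_paren]
    pos = open_paren + 1
    if name in LEAF_OPS:
        literal, pos = _read_literal(text, pos)
        return LEAF_OPS[name](value, literal), pos
    if name not in ("not", "and", "or"):
        raise ValueError(f"unknown operation ${name}")
    results = []
    while True:  # comma-separated nested expressions
        result, pos = _parse(text, pos, value)
        results.append(result)
        if pos < len(text) and text[pos] == ",":
            pos += 1
            continue
        break
    if pos >= len(text) or text[pos] != ")":
        raise ValueError(f"expected ')' at position {pos}")
    pos += 1
    if name == "not":
        if len(results) != 1:
            raise ValueError("$not takes exactly one argument")
        return (not results[0]), pos
    return (all(results) if name == "and" else any(results)), pos


def _read_literal(text: str, pos: int):
    """Read up to the closing bracket; a tilde makes the next character literal."""
    chars = []
    while pos < len(text):
        ch = text[pos]
        if ch == "~" and pos + 1 < len(text):
            chars.append(text[pos + 1])
            pos += 2
        elif ch == ")":
            return "".join(chars), pos + 1
        else:
            chars.append(ch)
            pos += 1
    raise ValueError("unterminated literal")
```

For example, `evaluate("$eq(my~(amazing~)property)", "my(amazing)property")` returns True, because the escaped brackets are read as part of the literal rather than as the end of the qualifier.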
With this property, we define the services to be monitored based on their startup type.
Condition in which you can define a string that:
Equals a value: use the $eq qualifier, for example $eq(manual).
Available logic operations:
$not($eq(auto)) – Matches services with startup type different from Automatic.
$or($eq(auto),$eq(manual)) – Matches if service's startup type is either Automatic or Manual.
Use one of the following values as a parameter for this condition:
manual for Manual - the service starts only if needed or if you invoke something to start the service.
manual_trigger for Manual (Trigger Start) - the service starts along with the startup of another service.
auto for Automatic - the service starts automatically.
auto_delay for Automatic (Delayed Start) - the service startup is delayed until the system has finished booting.
auto_trigger for Automatic (Trigger Start) - the service starts automatically on startup and may be started or stopped due to certain operating system events.
auto_delay_trigger for Automatic (Delayed Start, Trigger Start)
disabled for Disabled
OneAgent version 1.247+
Dynatrace version 1.247+
optional
Select Add property to specify a custom key-value property for the policy.
For example, a property with a Key set to custom.message and Value set to The {dt.osservice.name} is with status {dt.osservice.status} (including placeholders {dt.osservice.name} and {dt.osservice.status}) will extract the OS service name and status values once the rule is triggered. If the placeholder substitution fails, both the key and the value will be unavailable.
For OneAgent version 1.255+, the {dt.osservice.display_name} placeholder is available.
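The placeholder behavior can be pictured with a short sketch: every {name} in the value is replaced from the attributes of the triggering service, and if any placeholder cannot be resolved the whole property is dropped. The function name is hypothetical, and treating an unknown placeholder as the failure case is an assumption; the text above does not spell out what makes a substitution fail.

```python
import re
from typing import Mapping, Optional, Tuple

PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def render_property(key: str, template: str,
                    attributes: Mapping[str, str]) -> Optional[Tuple[str, str]]:
    """Return (key, rendered value), or None if a placeholder cannot be resolved."""
    missing = [name for name in PLACEHOLDER.findall(template)
               if name not in attributes]
    if missing:
        # Substitution failed: neither the key nor the value is made available.
        return None
    value = PLACEHOLDER.sub(lambda m: attributes[m.group(1)], template)
    return key, value
```

With the attributes `dt.osservice.name = sshd` and `dt.osservice.status = failed`, the example property above renders to `The sshd is with status failed`.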
Additionally, you can use specific dt flags in the Key field to tailor the behavior of problem notifications and Davis AI analysis:
dt.event.title: Customizes the title of the problem.
dt.event.description: Provides a detailed description for the problem.
dt.event.allow_davis_merge: Controls Davis AI's decision to merge events based on your settings.
Select Save changes.
To manage the OS services
In Dynatrace, go to OS services monitoring for the level you are configuring.
Host group: select the host group you want to configure. The Host group property is not displayed when the selected host doesn't belong to any host group. Select the <group name> link, where <group name> is the name of the host group that you want to configure.
Environment: go to Settings > Monitoring > OS services monitoring.
The OS services you monitor are displayed in a table under the Add policy button.
Dynatrace version 1.243+
OneAgent version 1.243+
The Host overview page contains the OS services analysis section listing the OS services for which any policy (with active alerting or monitoring) is fulfilled.
For more information, see OS services analysis.
You can use the Settings API to configure your service availability monitoring at scale.
Use builtin:os-services-monitoring as the schemaId.
Based on the builtin:os-services-monitoring schema, create your configuration object.
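As a rough sketch of what creating the configuration object can look like, the snippet below builds (but does not send) a request to the Settings API objects endpoint. The base URL, token, scope, and the contents of `value` are placeholders chosen for the example; the real field names of `value` must be taken from the builtin:os-services-monitoring schema and are not guessed here.

```python
import json
import urllib.request

SCHEMA_ID = "builtin:os-services-monitoring"


def build_create_request(base_url: str, api_token: str,
                         scope: str, value: dict) -> urllib.request.Request:
    """Build a POST request that would create one settings object.

    `value` must follow the schema; its structure is not defined in this sketch.
    """
    body = [{"schemaId": SCHEMA_ID, "scope": scope, "value": value}]
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v2/settings/objects",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Api-Token {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would send it. Building the request separately makes it easy to inspect the payload, or to generate many objects for different scopes, before anything is changed in the environment.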