In today's continuous delivery approach, even with extensive QA processes, some errors are only discovered through real-life infrastructure usage as close to production as possible. Dynatrace allows R&D teams to easily react to bugs or problems by integrating a Jira ticket creation workflow to resolve team dependencies.
This guide presents a straightforward method for using Dynatrace to clarify ambiguous responsibilities and interdependencies. In this use case, you create an internal tool that identifies bugs from logs, then groups, triages, and distributes them to a bug tracker. Such an internal tool allows you to detect, triage, and prioritize bugs, and to automatically assign tickets to the right teams.
This article is intended for development team leaders, development teams, and anyone who is responsible for triaging and assigning bugs to development teams.
Your company operates considerable infrastructure. You use Dynatrace to monitor both the production and pre-production segments of this infrastructure. Dynatrace monitors thousands of hosts and ingests multiple TB of logs per day.
These logs contain valuable information about errors that occurred on the systems that might need to be addressed by the various development teams.
With many teams, distributed systems, interconnected components, and large volumes of log data, it is challenging to clarify responsibilities and interdependencies.
Teams meet for a daily bug meeting and review newly discovered problems to ensure that the ownership of each item is clearly defined.
The log volume makes manual analysis impossible, and the large number of teams adds the complexity of routing the right log lines to the right teams.
Make sure all of these are true before you start:
You have the storage:logs:read permission. For instructions, see Assign permissions in Grail. To access permissions, go to the Settings menu in the upper-right corner of the Workflows app and select Authorization settings.
Dynatrace Automated bug triaging and ticketing consists of three stages:
Complete the following steps to create your internal tool for bug triaging and distribution.
Ingest log and event data into Dynatrace using OneAgent or the Dynatrace API.
Using your backend service, aggregate and group the log data and combine it into business events.
Filter log events by severity (critical and high) and timestamp (newest events).
Create specific DQL queries to fetch the logs and build a customized overview.
The following query examples filter log data by a specific loglevel value in OneAgent logs (based on a specific log.source) and by specific loglevel values for specific process groups (based on dt.entity.process_group):
fetch logs, timeframe:"2023-10-04T00:00:00Z/2023-10-04T04:00:00Z"
| filter startsWith(log.source, "/var/log/dynatrace/oneagent/java/")
| filter loglevel == "WARN" OR loglevel == "SEVERE"

fetch logs, timeframe:"2023-10-01T09:03:18Z/2023-10-02T09:03:18Z", scanLimitGBytes: -1
| filter dt.entity.process_group == "PROCESS_GROUP-9C77F43CE98EAA17" OR
    dt.entity.process_group == "PROCESS_GROUP-a3444fa7ad1c3d41" OR
    dt.entity.process_group == "PROCESS_GROUP-e50850ceefbd9c7f" OR
    dt.entity.process_group == "PROCESS_GROUP-689574f4e137425c" OR
    dt.entity.process_group == "PROCESS_GROUP-37925f9c37724091"
| filter (loglevel == "WARN" OR
    loglevel == "WARNING" OR
    loglevel == "SEVERE" OR
    loglevel == "ERROR" OR
    loglevel == "CRITICAL" OR
    loglevel == "ALERT")
Group log messages.
Group similar log messages together to avoid duplicates and, based on tags or group metadata, identify teams that are likely the correct owners of the bug. Save the metadata, such as timestamp, thread name, class name, and log level.
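The grouping step can be sketched in a few lines of Python. This is a minimal illustration, not the actual backend: the record fields (timestamp, thread, class_name, loglevel, team) and the rule that the most frequent team tag is the likely owner are assumptions made for the example.

```python
from collections import Counter, defaultdict


def group_records(records, key_fn):
    """Group log records by key_fn(record) and summarize each group."""
    buckets = defaultdict(list)
    for record in records:
        buckets[key_fn(record)].append(record)

    groups = []
    for group_key, members in buckets.items():
        # Keep the metadata of the earliest record in the group.
        first = min(members, key=lambda r: r["timestamp"])
        # Assumption: each record may carry a hypothetical "team" tag; the
        # most frequent tag in the group is proposed as the likely owner.
        owners = Counter(r["team"] for r in members if r.get("team"))
        groups.append({
            "key": group_key,
            "count": len(members),
            "first_seen": first["timestamp"],
            "thread": first.get("thread"),
            "class_name": first.get("class_name"),
            "loglevel": first.get("loglevel"),
            "likely_owner": owners.most_common(1)[0][0] if owners else None,
        })
    return groups
```

The key function is passed in on purpose, so the same grouping loop works whether records are keyed by class name and log level or by a derived grouping message.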
Process the log message.
For log message processing, split the message into tokens and categorize each as a mandatory word (conventionally prefixed with ##), a word, a separator, or a non-relevant word (dynamic log parts such as IDs or timestamps). The grouping message of a log is then built from classname (if present), plus the first four relevant words, plus all remaining mandatory words.
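The tokenization and grouping-message rules can be sketched as follows. This is one possible reading, not the actual implementation. It assumes that any token containing a digit is a dynamic part, that mandatory words count toward the first four relevant words, and that the ## prefix is dropped from the output.

```python
import re

# Mandatory word (##name), plain word, separator run, or a lone '#'.
TOKEN_RE = re.compile(r"##\w+|\w+|[^\w#]+|#")


def tokenize(message):
    """Return (kind, text) pairs for every token in the message."""
    tokens = []
    for tok in TOKEN_RE.findall(message):
        if tok.startswith("##"):
            tokens.append(("mandatory", tok[2:]))
        elif not re.match(r"\w", tok):
            tokens.append(("separator", tok))
        elif any(ch.isdigit() for ch in tok):
            # Assumption: digits mark dynamic parts such as IDs or timestamps.
            tokens.append(("dynamic", tok))
        else:
            tokens.append(("word", tok))
    return tokens


def grouping_message(message, class_name=None, max_words=4):
    """Class name, first relevant words, then any later mandatory words."""
    relevant = [(kind, text) for kind, text in tokenize(message)
                if kind in ("word", "mandatory")]
    head = [text for _, text in relevant[:max_words]]
    tail = [text for kind, text in relevant[max_words:] if kind == "mandatory"]
    parts = ([class_name] if class_name else []) + head + tail
    return " ".join(parts)
```

Under these assumptions, two messages that differ only in host numbers, retry counts, or timestamps produce the same grouping message and fall into the same group.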
To summarize, the log records are grouped by:
Create a workflow based on the business event to automatically create and assign Jira tickets.
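The mapping from a business event to a Jira issue can be illustrated with a small Python sketch. It only builds the request body in the shape used by the Jira REST API v2 issue-creation call (fields.project.key, fields.summary, fields.description, fields.issuetype.name) and sends nothing. The event field names, the team-to-project mapping, and the fallback project are hypothetical.

```python
# Hypothetical mapping from team tag to Jira project key.
TEAM_PROJECTS = {"checkout": "CHK", "payments": "PAY"}
# Hypothetical project for groups without a known owner.
FALLBACK_PROJECT = "TRIAGE"
# Assumed conservative cap on the summary length.
MAX_SUMMARY = 250


def build_jira_issue(event):
    """Build a Jira v2-style issue body from a grouped-log business event."""
    project = TEAM_PROJECTS.get(event.get("likely_owner"), FALLBACK_PROJECT)
    summary = f"[{event['loglevel']}] {event['group_key']}"[:MAX_SUMMARY]
    description = (
        f"Grouped log message: {event['group_key']}\n"
        f"Occurrences: {event['count']}\n"
        f"First seen: {event['first_seen']}"
    )
    return {
        "fields": {
            "project": {"key": project},
            "summary": summary,
            "description": description,
            "issuetype": {"name": "Bug"},
        }
    }
```

Routing unowned groups to a fallback project keeps them visible for the daily bug meeting instead of silently dropping them.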
Create a Dynatrace application to consume and present grouped log data.
Using Dynatrace AppEngine, create a dashboard application that consumes the grouped log data and, based on this preliminary triaging and vetting, creates Jira issues in the correct projects, assigned to the right teams.
Connect the dashboard application to the backend (via EdgeConnect) and execute API calls. Dynatrace applications can also interact with on-premises systems that run in your corporate network.
The dashboard application analyzes error or warning logs that indicate bugs in the code or a system configuration that needs to be improved (for example, low CPU or low disk space). Its overview can show the grouped logs, for which tickets and comments can be created automatically or manually.
By implementing a dashboard application with Dynatrace and including it in a daily meeting, teams can effectively detect, triage, and prioritize bugs, and automatically assign tickets to the right teams, eliminating dependencies between teams. Teams have an overview of the components and services they are interested in, which allows them to focus and to respond more quickly to problems before they reach production and customers.
All teams have the same view of newly discovered bugs and can claim ownership in a self-service fashion. For all bugs, responsibilities and ownership can be clarified in a daily sync of less than half an hour. Without this common view, discussions would be dispersed and could easily take hours.
This results in a higher-quality product, eliminates manual correlation, and ultimately reduces the effort required for customer support.