Set log alert
In this use case, you need to set an alert based on the occurrence of log events. See how you can extract data from logs, create a processing rule, build an alert by forming a log event, and check if your alert captures logs that meet predefined criteria.
Scenario
In this scenario, you need to set alerts that notify you every time NGINX logs with errors and refused connections are captured. You also need to extract additional information from the log content, including the error number, client IP address, and HTTP request that resulted in an error. Then you need to add this information to your alert's description.
This process has the following steps:
- Build and run a DQL query to retrieve logs that trigger an alert.
- Create a log processing rule to extract the additional information from the log content.
- Build an alert by creating a log event.
- Check if there are logs that match your alert.
Before you begin
You need to determine fields and conditions for your DQL query, log processing rule, and log query:
- Logs with the ERROR log level, and
- Logs with the Connection refused phrase found in the log content, and
- NGINX technology
- Additional fields:
  - Error number
  - IP address
  - HTTP request
Build and run a DQL query
To build and run your query
-
Go to Logs or Logs & Events (latest Dynatrace).
-
On the Logs and events page, turn on Advanced mode.
-
Select Copy for the code sample below.
fetch logs
| filter matchesValue(process.technology, "nginx")
| filter matchesValue(loglevel, "ERROR")
| filter matchesPhrase(content, "Connection refused")
| fields timestamp, content, process.technology
| parse content, "LD '[error] ' INT:error_number '#' INT LD 'Connection refused)' LD 'client:' SPACE? IPADDR:client_ip LD 'request:' SPACE? DQS:http_request"
| sort timestamp desc
-
Paste the query into the query edit box and select Run query.
This query:
- Retrieves NGINX logs with the ERROR log level and the Connection refused phrase found in the log content
- Parses the error number, client IP address, and HTTP request into separate columns
Sample search results
| timestamp | content | process.technology | client_ip | error_number | http_request |
|---|---|---|---|---|---|
| 2023-04-13 10:53:56 | 2023/04/13 08:53:56 [error] 32#32: *100507 connect() failed (111: Connection refused) while connecting to upstream, client: 123.45.67.890, server: , request: "GET /cart HTTP/1.1", upstream: "http://01.222.3.44:55/cart", host: "HOST-1", referrer: "https://HOST-1/cart", dt.trace_id: 1abc2, dt.span_id: d123e, dt.trace_sampled: true | ["Docker","Nginx"] | 123.45.67.890 | 32 | GET /cart HTTP/1.1 |
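To see what the parse pattern in the query pulls out of a log line, it can help to reproduce the extraction outside Dynatrace. The following is a minimal sketch in Python that approximates the pattern with a regular expression. It is not how Dynatrace evaluates the pattern; the regular expression, the function name extract_fields, and the shortened sample line are assumptions made for illustration.

```python
import re

# Rough regex counterpart of the parse pattern from the query.
# This is an approximation for illustration, not the Dynatrace pattern engine.
PATTERN = re.compile(
    r'\[error\] (?P<error_number>\d+)#\d+'        # LD '[error] ' INT:error_number '#' INT
    r'.*?Connection refused\)'                    # LD 'Connection refused)'
    r'.*?client:\s?(?P<client_ip>[\d.]+)'         # LD 'client:' SPACE? IPADDR:client_ip
    r'.*?request:\s?"(?P<http_request>[^"]*)"'    # LD 'request:' SPACE? DQS:http_request
)

# Shortened version of the sample log content shown in the results above.
SAMPLE = (
    '2023/04/13 08:53:56 [error] 32#32: *100507 connect() failed '
    '(111: Connection refused) while connecting to upstream, '
    'client: 123.45.67.890, server: , request: "GET /cart HTTP/1.1", '
    'upstream: "http://01.222.3.44:55/cart", host: "HOST-1"'
)


def extract_fields(content):
    """Return the three extracted fields as a dict, or None if the line does not match."""
    match = PATTERN.search(content)
    if match is None:
        return None
    fields = match.groupdict()
    fields["error_number"] = int(fields["error_number"])
    return fields


print(extract_fields(SAMPLE))
```

Unlike the IPADDR matcher, the character class used here does not validate the address, so treat the sketch only as a way to reason about which parts of the line end up in which column.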
Define and test a processing rule
Define the rule
To define a processing rule
-
Select a row in the Search results from the query above to display a side panel with details about the selected row.
-
Select Create processing rule to add a rule.
-
Define a Rule name. For this example, copy and paste the following name:
NGINX connection refused: extract error, client IP, request
-
Define a Matcher query.
Your matcher query is a one-line query that contains conditions set using matcher-specific functions. You can take your matcher directly from the filter commands in your DQL query. If there are multiple filter expressions in your query, you need to join them with the and operator to maintain the one-line structure. For this example, copy and paste the following matcher:
matchesValue(loglevel, "ERROR") and matchesPhrase(content, "Connection refused") and matchesValue(process.technology, "nginx")
-
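Define a Processor definition. For this example, copy and paste the following processor definition. The extracted field names (error.number, client.ip, http.request) match the placeholders used later in the alert description:
PARSE(content, "LD '[error] ' INT:error.number '#' INT LD 'Connection refused' LD 'client:' SPACE? IPADDR:client.ip LD 'request:' SPACE? DQS:http.request")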
Before you save your rule, test it.
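The matcher is a conjunction: a log record is processed only if all three conditions hold. As a rough mental model, the logic can be sketched in Python as shown below. The helper names matches_value and matches_phrase are hypothetical stand-ins, and their behavior (case-insensitive comparison, any-element matching for list values) is a simplifying assumption rather than a specification of the Dynatrace functions.

```python
def matches_value(value, expected):
    """Simplified stand-in: case-insensitive equality; for lists, any element may match."""
    values = value if isinstance(value, list) else [value]
    return any(str(v).lower() == expected.lower() for v in values if v is not None)


def matches_phrase(text, phrase):
    """Simplified stand-in: case-insensitive substring search."""
    return text is not None and phrase.lower() in text.lower()


def matcher(record):
    """All three conditions are joined with 'and', as in the one-line matcher."""
    return (
        matches_value(record.get("loglevel"), "ERROR")
        and matches_phrase(record.get("content"), "Connection refused")
        and matches_value(record.get("process.technology"), "nginx")
    )


record = {
    "loglevel": "ERROR",
    "content": "connect() failed (111: Connection refused) while connecting to upstream",
    "process.technology": ["Docker", "Nginx"],
}
print(matcher(record))
```

Changing any single field so that its condition fails (for example, setting loglevel to INFO) makes the whole matcher evaluate to false, which is why every filter expression from the query has to be carried over.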
Test the rule
In the Rule testing section:
-
Select Copy below and paste the sample log into Paste log/JSON sample.
{"content":"2023/04/13 08:53:56 [error] 32#32: *100507 connect() failed (111: Connection refused) while connecting to upstream, client: 123.45.67.890, server: , request: \"GET /cart HTTP/1.1\", upstream: \"http://01.222.3.44:55/cart\", host: \"HOST-1\", referrer: \"https://HOST-1/cart\", dt.trace_id: 1abc2, dt.span_id: d123e, dt.trace_sampled: true","process.technology": "nginx","loglevel": "ERROR"}
-
Select Test the rule and examine Test result.
Let's test the rule again, but this time for an error with the error number (error.number) 32.
-
Select Copy below and paste the sample log into Paste log/JSON sample.
{"content": "2023/04/13 08:53:56 [error] 32#32: *100507 connect() failed (111: Connection refused) while connecting to upstream, client: 123.45.67.890, server: , request: \"GET /cart HTTP/1.1\", upstream: \"http://01.222.3.44:55/cart\", host: \"HOST-1\", referrer: \"https://HOST-1/cart\", dt.trace_id: 1abc2, dt.span_id: d123e, dt.trace_sampled: true","timestamp": "2023-04-17T14:20:20.222000000 +0000","process.technology": "nginx","loglevel": "ERROR","error.number": "32","client.ip": "123.45.67.89","http.request": "GET /cart HTTP/1.1"}
-
Select Test the rule and examine Test result.
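If you want to test the rule with your own log lines, the quotation marks inside the log content must be escaped in the JSON sample, as in the samples above. A small Python sketch such as the following can produce a correctly escaped sample from a raw line. The function name build_sample is hypothetical, and the field set simply mirrors the first sample.

```python
import json

# Raw log line as it would appear in the NGINX error log (taken from the sample above).
RAW_CONTENT = (
    '2023/04/13 08:53:56 [error] 32#32: *100507 connect() failed '
    '(111: Connection refused) while connecting to upstream, '
    'client: 123.45.67.890, server: , request: "GET /cart HTTP/1.1", '
    'upstream: "http://01.222.3.44:55/cart", host: "HOST-1", '
    'referrer: "https://HOST-1/cart", dt.trace_id: 1abc2, '
    'dt.span_id: d123e, dt.trace_sampled: true'
)


def build_sample(content, technology="nginx", loglevel="ERROR"):
    """Serialize a log record to JSON; json.dumps escapes the inner quotes."""
    record = {
        "content": content,
        "process.technology": technology,
        "loglevel": loglevel,
    }
    return json.dumps(record)


# Paste the printed line into the sample box of the rule testing section.
print(build_sample(RAW_CONTENT))
```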
Create a log event
To set up an alert, create a log event.
-
Go to Settings and select Log Monitoring > Events extraction.
-
Select Add log event.
-
Enter a Summary. For this example, you can copy the text below and paste it into Summary.
Alert on Nginx connection refused
-
Enter a Log query. For this example, you can copy the query below and paste it into Log query.
matchesValue(loglevel, "ERROR") and matchesPhrase(content, "Connection refused") and matchesValue(process.technology, "nginx")
Optionally, you can add more conditions. For example, to include only logs for which the processing rule extracted an error number, copy and paste the following matcher:
matchesValue(loglevel, "ERROR") and matchesPhrase(content, "Connection refused") and matchesValue(process.technology, "nginx") and isNotNull(error.number)
-
In the Event template section, enter a Title and Description, and set Event type.
- Add the Title
For this example, you can copy the text below and paste it into Title.
Nginx Connection Refused
- Add a Description
You can add placeholders to the description that will be filled with the values from the entire log record.
For this example, you can copy the text below and paste it into Description. It has placeholders for {error.number}, {http.request}, and {client.ip}.
Connection refused with error number: {error.number} on request: {http.request} from client: {client.ip}
- Set Event type to Custom alert.
-
Select Save changes.
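To preview how the description might read once the placeholders are filled, you can mimic the substitution locally. The following Python sketch is only an illustration: the function fill_template is hypothetical, and leaving unknown placeholders untouched is an assumption, not documented Dynatrace behavior.

```python
import re

# Matches {field.name} style placeholders.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def fill_template(template, record):
    """Replace each {key} with the matching value from the log record."""
    def replace(match):
        key = match.group(1)
        # Assumption: a placeholder without a matching field is left as-is.
        return str(record.get(key, match.group(0)))
    return PLACEHOLDER.sub(replace, template)


TEMPLATE = (
    "Connection refused with error number: {error.number} "
    "on request: {http.request} from client: {client.ip}"
)

# Field values taken from the second test sample above.
RECORD = {
    "error.number": "32",
    "client.ip": "123.45.67.89",
    "http.request": "GET /cart HTTP/1.1",
}

print(fill_template(TEMPLATE, RECORD))
```

A regular expression is used here instead of str.format because the field names contain dots, which str.format would interpret as attribute access.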
Check logs that match your alert
If your logs fulfill the criteria set, a Problem is created.
To find the Problem
-
Go to Problems.
-
Find the problem you defined during custom log event creation. For this example, you can copy the title below and paste it into Filter by.
Nginx Connection Refused
-
Select Analyze logs.
-
On the Logs and events Simple Mode page, select Run query.
In Results, you can see the errors that were captured according to the criteria that you set.
Sample results:
| timestamp | status | content |
|---|---|---|
| 2023-04-13 14:10:08 | ERROR | 2023/04/13 08:53:56 [error] 32#32: *100507 connect() failed (111: Connection refused) while connecting to upstream, client: 123.45.67.890, server: , request: "GET /cart HTTP/1.1", upstream: "http://01.222.3.44:55/cart", host: "HOST-1", referrer: "https://HOST-1/cart", dt.trace_id: 1abc2, dt.span_id: d123e, dt.trace_sampled: true |
| 2023-04-13 14:06:30 | ERROR | 2023/04/13 08:53:56 [error] 32#32: *100507 connect() failed (111: Connection refused) while connecting to upstream, client: 123.45.67.890, server: , request: "GET /cart HTTP/1.1", upstream: "http://01.222.3.44:55/cart", host: "HOST-1", referrer: "https://HOST-1/cart", dt.trace_id: 1abc2, dt.span_id: d123e, dt.trace_sampled: true |
| 2023-04-13 13:57:59 | ERROR | 2023/04/13 08:53:56 [error] 32#32: *100507 connect() failed (111: Connection refused) while connecting to upstream, client: 123.45.67.890, server: , request: "GET /cart HTTP/1.1", upstream: "http://01.222.3.44:55/cart", host: "HOST-1", referrer: "https://HOST-1/cart", dt.trace_id: 1abc2, dt.span_id: d123e, dt.trace_sampled: true |