Set log alert

In this use case, you need to set an alert based on the occurrence of log events. See how you can extract data from logs, create a processing rule, build an alert by forming a log event, and check if your alert captures logs that meet predefined criteria.

Scenario

In this scenario, you need to set alerts that notify you every time NGINX logs with errors and refused connections are captured. You also need to extract additional information from the log content (the error number, the client IP address, and the HTTP request that resulted in the error) and add this information to your alert's description.

This process has the following steps:

  1. Build and run a DQL query to retrieve logs that trigger an alert.
  2. Create a log processing rule to extract the additional information from the log content.
  3. Build an alert by creating a log event.
  4. Check if there are logs that match your alert.

Before you begin

You need to determine fields and conditions for your DQL query, log processing rule, and log query:

  • Conditions that a log must meet (all three apply):
    • The log level is ERROR
    • The log content contains the phrase Connection refused
    • The technology is NGINX
  • Additional fields to extract:
    • Error number
    • Client IP address
    • HTTP request

Build and run a DQL query

To build and run your query

  1. Go to Logs or Logs & Events (latest Dynatrace).

  2. On the Logs and events page, turn on Advanced mode.

  3. Copy the code sample below.

    fetch logs
    | filter matchesValue(process.technology, "nginx")
    | filter matchesValue(loglevel, "ERROR")
    | filter matchesPhrase(content, "Connection refused")
    | fields timestamp, content, process.technology
    | parse content, "LD '[error] ' INT:error_number '#' INT LD 'Connection refused)' LD 'client:' SPACE? IPADDR:client_ip LD 'request:' SPACE? DQS:http_request"
    | sort timestamp desc
  4. Paste the query into the query edit box and select Run query.

This query:

  • Retrieves NGINX logs with the ERROR log level and the Connection refused phrase found in the log content
  • Parses the error number, client IP address, and HTTP request from the log content into separate columns
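To make the parse step easier to follow, here is a minimal Python sketch of what the pattern extracts. It is a rough regular-expression approximation of the DPL pattern, not Dynatrace's parser; the function name parse_content and the permissive IP expression are assumptions made for illustration.

```python
import re

# Rough regex equivalent of the DPL pattern in the query above.
# This is an approximation for illustration, not Dynatrace's parser.
NGINX_REFUSED = re.compile(
    r'\[error\] (?P<error_number>\d+)#\d+'           # '[error] ' INT:error_number '#' INT
    r'.*?Connection refused\)'                       # LD 'Connection refused)'
    r'.*?client:\s?(?P<client_ip>[0-9.]+)'           # LD 'client:' SPACE? IPADDR:client_ip
    r'.*?request:\s?"(?P<http_request>[^"]*)"'       # LD 'request:' SPACE? DQS:http_request
)


def parse_content(content):
    """Return the three extracted fields, or None when the line does not match."""
    match = NGINX_REFUSED.search(content)
    if match is None:
        return None
    fields = match.groupdict()
    fields["error_number"] = int(fields["error_number"])
    return fields


# Sample log content taken from this page (the IP address is an anonymized
# placeholder, so the sketch accepts any run of digits and dots).
sample = (
    '2023/04/13 08:53:56 [error] 32#32: *100507 connect() failed '
    '(111: Connection refused) while connecting to upstream, '
    'client: 123.45.67.890, server: , request: "GET /cart HTTP/1.1", '
    'upstream: "http://01.222.3.44:55/cart", host: "HOST-1"'
)

print(parse_content(sample))
# {'error_number': 32, 'client_ip': '123.45.67.890', 'http_request': 'GET /cart HTTP/1.1'}
```

Each LD (line data) token in the pattern corresponds to a lazy skip in the regular expression, which is why text between the literal anchors does not need to be described.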

Sample search results

  • timestamp: 2023-04-13 10:53:56
  • content: 2023/04/13 08:53:56 [error] 32#32: *100507 connect() failed (111: Connection refused) while connecting to upstream, client: 123.45.67.890, server: , request: "GET /cart HTTP/1.1", upstream: "http://01.222.3.44:55/cart", host: "HOST-1", referrer: "https://HOST-1/cart", dt.trace_id: 1abc2, dt.span_id: d123e, dt.trace_sampled: true
  • process.technology: ["Docker","Nginx"]
  • client_ip: 123.45.67.890
  • error_number: 32
  • http_request: GET /cart HTTP/1.1

Define and test a processing rule

Define the rule

To define a processing rule

  1. Select a row in the Search results from the query above to display a side panel with details about the selected row.

  2. Select Create processing rule to add a rule.

  3. Define a Rule name. For this example, copy and paste the following name.

    NGINX connection refused: extract error, client IP, request
  4. Define a Matcher query.

    Your matcher query is a one-line query that contains conditions set using matcher-specific functions. Your matcher can be extracted directly from the filter fields in your DQL query.

    If there are multiple filter expressions in your query, you need to join them with the and operator to maintain the one-line structure.

    For this example, copy and paste the following matcher:

    matchesValue(loglevel, "ERROR") and matchesPhrase(content, "Connection refused") and matchesValue(process.technology, "nginx")
  5. Define a Processor definition. For this example, copy and paste the following processor definition. The exported field names (error.number, client.ip, and http.request) match the placeholders used later in the alert description.

    PARSE(content, "LD '[error] ' INT:error.number '#' INT LD 'Connection refused' LD 'client:' SPACE? IPADDR:client.ip LD 'request:' SPACE? DQS:http.request")

Before you save your rule, test it.
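The way the matcher combines its three conditions can be sketched in Python as a single boolean expression. The helper names below are hypothetical stand-ins for the matcher functions, and the sketch assumes case-insensitive comparison and any-element matching for array fields; wildcards and other matcher features are left out.

```python
def _values(field):
    """Return a field's value(s) as a list of lowercase strings."""
    if field is None:
        return []
    items = field if isinstance(field, list) else [field]
    return [str(item).lower() for item in items]


def matches_value(record, key, expected):
    """Simplified stand-in for matchesValue: whole-value comparison (assumed case-insensitive)."""
    return expected.lower() in _values(record.get(key))


def matches_phrase(record, key, phrase):
    """Simplified stand-in for matchesPhrase: phrase containment (assumed case-insensitive)."""
    return any(phrase.lower() in value for value in _values(record.get(key)))


def matcher(record):
    """All three conditions joined with 'and', mirroring the one-line matcher."""
    return (
        matches_value(record, "loglevel", "ERROR")
        and matches_phrase(record, "content", "Connection refused")
        and matches_value(record, "process.technology", "nginx")
    )


record = {
    "loglevel": "ERROR",
    "content": "connect() failed (111: Connection refused) while connecting to upstream",
    "process.technology": ["Docker", "Nginx"],
}
print(matcher(record))  # True
```

Because the conditions are joined with and, a record that fails any one of them (for example, a different log level) is not processed by the rule.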

Test the rule

In the Rule testing section:

  1. Copy the sample log below and paste it into Paste log/JSON sample.

    {
    "content":"2023/04/13 08:53:56 [error] 32#32: *100507 connect() failed (111: Connection refused) while connecting to upstream, client: 123.45.67.890, server: , request: \"GET /cart HTTP/1.1\", upstream: \"http://01.222.3.44:55/cart\", host: \"HOST-1\", referrer: \"https://HOST-1/cart\", dt.trace_id: 1abc2, dt.span_id: d123e, dt.trace_sampled: true",
    "process.technology": "nginx",
    "loglevel": "ERROR"
    }
  2. Select Test the rule and examine Test result.

Now test the rule again, this time with a sample that includes the error number (error.number) 32.

  1. Copy the sample log below and paste it into Paste log/JSON sample.

    {
    "content": "2023/04/13 08:53:56 [error] 32#32: *100507 connect() failed (111: Connection refused) while connecting to upstream, client: 123.45.67.890, server: , request: \"GET /cart HTTP/1.1\", upstream: \"http://01.222.3.44:55/cart\", host: \"HOST-1\", referrer: \"https://HOST-1/cart\", dt.trace_id: 1abc2, dt.span_id: d123e, dt.trace_sampled: true",
    "timestamp": "2023-04-17T14:20:20.222000000 +0000",
    "process.technology": "nginx",
    "loglevel": "ERROR",
    "error.number": "32",
    "client.ip": "123.45.67.89",
    "http.request": "GET /cart HTTP/1.1"
    }
  2. Select Test the rule and examine Test result.
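When you examine a record, it helps to know which fields the alert will need later. The following Python sketch checks a JSON record for the three fields used by the alert description in this example. The function name missing_fields is hypothetical, and the check runs outside Dynatrace on a copy of the JSON.

```python
import json

# Fields that the alert description in this example relies on.
REQUIRED_FIELDS = ("error.number", "client.ip", "http.request")


def missing_fields(record_json, required=REQUIRED_FIELDS):
    """Return the required field names that are absent or empty in the record."""
    record = json.loads(record_json)
    return [name for name in required if record.get(name) in (None, "")]


# Field values taken from the second sample on this page (content omitted for brevity).
sample = """
{
  "process.technology": "nginx",
  "loglevel": "ERROR",
  "error.number": "32",
  "client.ip": "123.45.67.89",
  "http.request": "GET /cart HTTP/1.1"
}
"""

print(missing_fields(sample))                   # []
print(missing_fields('{"loglevel": "ERROR"}'))  # ['error.number', 'client.ip', 'http.request']
```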

Create a log event

To set up an alert, create a log event.

  1. Go to Settings and select Log Monitoring > Events extraction.

  2. Select Add log event.

  3. Enter a Summary. For this example, you can copy the text below and paste it into Summary.

    Alert on Nginx connection refused
  4. Enter a Log query. For this example, you can copy the query below and paste it into Log query.

    matchesValue(loglevel, "ERROR") and matchesPhrase(content, "Connection refused") and matchesValue(process.technology, "nginx")

    Optionally, you can add more conditions. For example, to include only logs for which an error number was extracted, copy and paste the following matcher:

    matchesValue(loglevel, "ERROR") and matchesPhrase(content, "Connection refused") and matchesValue(process.technology, "nginx") and isNotNull(error.number)
  5. In the Event template section, enter a Title and Description, and set Event type.

    • Add the Title
      For this example, you can copy the title below and paste it into Title.
      Nginx Connection Refused
    • Add a Description
      You can add placeholders to the description; each placeholder is filled with the value of the matching field from the log record.
      For this example, you can copy the text below and paste it into Description. It has placeholders for {error.number}, {http.request}, and {client.ip}.
      Connection refused with error number: {error.number} on request: {http.request} from client: {client.ip}
    • Set Event type to Custom alert.
  6. Select Save changes.
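To illustrate how the placeholders in the description relate to fields of the log record, here is a minimal Python sketch of the substitution. It is not Dynatrace's implementation: the function name render_description is hypothetical, and leaving an unknown placeholder unchanged is an assumption made for this sketch.

```python
import re

# A placeholder is a field name in curly braces, for example {error.number}.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def render_description(template, record):
    """Replace each {field} with the record's value; keep unknown placeholders as-is."""
    def substitute(match):
        name = match.group(1)
        return str(record[name]) if name in record else match.group(0)
    return PLACEHOLDER.sub(substitute, template)


template = (
    "Connection refused with error number: {error.number} "
    "on request: {http.request} from client: {client.ip}"
)
record = {
    "error.number": "32",
    "client.ip": "123.45.67.89",
    "http.request": "GET /cart HTTP/1.1",
}

print(render_description(template, record))
# Connection refused with error number: 32 on request: GET /cart HTTP/1.1 from client: 123.45.67.89
```

The sketch shows why the field names produced by the processing rule need to match the placeholder names exactly: a placeholder with no matching field has nothing to be filled with.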

Check logs that match your alert

If incoming logs meet the criteria you set, a Problem is created.

To find the Problem

  1. Go to Problems.

  2. Find the problem you defined during custom log event creation. For this example, you can copy the title below and paste it into Filter by.

    Nginx Connection Refused
  3. Select Analyze logs.

  4. On the Logs and events Simple Mode page, select Run query.

    In Results, you can see the errors that were captured according to the criteria that you set.

    Sample results:

    • timestamp: 2023-04-13 14:10:08
      status: ERROR
      content: 2023/04/13 08:53:56 [error] 32#32: *100507 connect() failed (111: Connection refused) while connecting to upstream, client: 123.45.67.890, server: , request: "GET /cart HTTP/1.1", upstream: "http://01.222.3.44:55/cart", host: "HOST-1", referrer: "https://HOST-1/cart", dt.trace_id: 1abc2, dt.span_id: d123e, dt.trace_sampled: true

    • timestamp: 2023-04-13 14:06:30
      status: ERROR
      content: 2023/04/13 08:53:56 [error] 32#32: *100507 connect() failed (111: Connection refused) while connecting to upstream, client: 123.45.67.890, server: , request: "GET /cart HTTP/1.1", upstream: "http://01.222.3.44:55/cart", host: "HOST-1", referrer: "https://HOST-1/cart", dt.trace_id: 1abc2, dt.span_id: d123e, dt.trace_sampled: true

    • timestamp: 2023-04-13 13:57:59
      status: ERROR
      content: 2023/04/13 08:53:56 [error] 32#32: *100507 connect() failed (111: Connection refused) while connecting to upstream, client: 123.45.67.890, server: , request: "GET /cart HTTP/1.1", upstream: "http://01.222.3.44:55/cart", host: "HOST-1", referrer: "https://HOST-1/cart", dt.trace_id: 1abc2, dt.span_id: d123e, dt.trace_sampled: true