OpenPipeline processing examples

This article focuses on data processing scenarios and provides standalone examples of how to configure OpenPipeline processors to achieve specific results.

Configure a new processor

To configure a new processor in OpenPipeline

  1. Go to OpenPipeline and find the pipeline for the record source (or create a new pipeline).
  2. Select the stage.
  3. Select Add Processor and choose the processor.
  4. Configure the processor by entering the required fields. Note that required fields vary based on the processor and are indicated in the user interface.
  5. Save the pipeline.

Examples

The following examples show how to configure the processors for specific scenarios.

Fix unrecognized timestamp and loglevel based on a matched log source

A stored event from an application (myLogSource) in the log viewer is missing a proper timestamp and loglevel. You can retrieve this information from the source and parse it to achieve the following:

  • Transform the unrecognized timestamp to a log event timestamp.
  • Show a loglevel for the log.
  • Extract the thread name from the log line into a new attribute (thread.name).
  1. Find the matching condition.

    1. Go to Logs and events and turn on Advanced mode.

    2. Enter the following DQL query to filter log data from the log source. Make sure to replace myLogSource with your log source.

      fetch logs
      | filter matchesValue(log.source, "myLogSource")
    3. Run the query and, when you're satisfied with the filter result, copy the matchesValue() function.

      matchesValue(log.source, "myLogSource")
  2. Go to OpenPipeline > Logs, switch to the Pipelines tab, and select (or create) the pipeline for the log ingest source.

  3. Configure a DQL processor in the Processing stage as follows.

    Matching condition

    The matchesValue() function that you copied.

    Sample data

    {
    "content":"April 24, 2022 09:59:52 [myPool-thread-1] INFO Lorem ipsum dolor sit amet",
    "status":"NONE",
    "timestamp":"1650889391528",
    "log.source":"myLogSource",
    "loglevel":"NONE"
    }

    DQL processor definition

    parse content, "TIMESTAMP('MMMMM d, yyyy HH:mm:ss'):timestamp ' [' LD:'thread.name' '] ' UPPER:loglevel
    // Parses out the timestamp, thread name, and log level.
    // `TIMESTAMP` looks for the specific datetime format. The matched value is set as the existing timestamp log attribute.
    // `LD` matches any chars between literals `' ['` and `'] '`.
    // `UPPER` matches uppercase letters.
    // The remaining part of the content is not matched.
  4. Save the pipeline.

Conclusion

The processed log record is displayed with proper values for the timestamp and loglevel attributes, along with the extracted thread.name attribute. Once new data is ingested, the processed records have the timestamp, the loglevel, and the thread name as separate attributes. You can visualize the new format, for example, in a notebook.

Before

{
"content":"April 24, 2022 09:59:52 [myPool-thread-1] INFO Lorem ipsum dolor sit amet",
"status":"NONE",
"timestamp":"1650889391528",
"log.source":"myLogSource",
"loglevel":"NONE"
}

After

{
"results":
[
{
"matched": true,
"record": {
"loglevel": "INFO",
"log.source": "myLogSource",
"thread.name": "myPool-thread-1",
"content": "April 24, 2022 09:59:52 [myPool-thread-1] INFO Lorem ipsum dolor sit amet",
"timestamp": "2022-04-24T09:59:52.000000000Z",
"status": "NONE"
}
}
]
}

Parse a field containing JSON as a raw string

A record has a field content (String) containing JSON input from which you want to parse out information. You can process specific fields, nested fields, or all fields, and either treat them as plain text or bring them to the top level without knowing the schema of the JSON.

Depending on the fields that you want to parse out, configure a DQL processor in the Processing stage with a suitable DQL processor definition.
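
For example, a minimal sketch that brings all keys of the embedded JSON to the top level. It assumes the DPL JSON matcher and the DQL fieldsFlatten command are available, and json_content is a placeholder field name.

parse content, "JSON:json_content" // Parses the raw JSON string in content into a structured field
| fieldsFlatten json_content // Turns the nested keys into individual top-level attributes
| fieldsRemove json_content // Removes the temporary structured field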

Parse out attributes with different formats

Applications log the user ID with different schemes (user ID=, userId=, userId: , user ID =). You can parse out attributes with different formats via a single pattern expression that uses the optional modifier (?) and Alternative Groups.

To extract the user identifier as a standalone log attribute, configure a DQL processor in the Processing stage with the following DQL processor definition.

parse content, "
LD // Matches any text within a single line
('user'| 'User') // Matches specified literals
SPACE? // Matches optional punctuation
('id'|'Id'|'ID')
SPACE?
PUNCT?
SPACE?
INT:my.user.id"
Conclusion

With a single definition, you've extracted the user identifier from different log schemes and applied a standardized format that can be used in further stages.

Before

03/22 08:52:51 INFO user ID=1234567 Call = 0319 Result = 0
03/22 08:52:51 INFO UserId = 1234567 Call = 0319 Result = 0
03/22 08:52:51 INFO user id=1234567 Call = 0319 Result = 0
03/22 08:52:51 INFO User ID: 1234567 Call = 0319 Result = 0
03/22 08:52:51 INFO userid: 1234567 Call = 0319 Result = 0

After

"my.user.id":"1234567"

Use specialized DPL matchers

A JSON file contains information that you want to parse out into new dedicated fields, based on its format. You can use Dynatrace Pattern Language (DPL) matchers for easier pattern building.

To use DPL matchers to identify and create new dedicated fields for the timestamp, loglevel, IP address, endpoint, and response code from the JSON file content, configure a DQL processor in the Processing stage with the following definition.

parse content, "ISO8601:timestamp SPACE UPPER:loglevel SPACE IPADDR:ip SPACE DQS:request SPACE INTEGER:code"
Conclusion

You created new fields for the timestamp, loglevel, IP address, endpoint, and response code, based on the format used in your JSON file.

Before

{
"content": "2022-05-11T13:23:45Z INFO 192.168.33.1 GET /api/v2/logs/ingest HTTP/1.0 200"
}

After

{
"request": "GET /api/v2/logs/ingest HTTP/1.0",
"code": 200,
"loglevel": "INFO",
"ip": "192.168.33.1",
"timestamp": "2022-05-11T13:23:45.000000000Z",
"content": "2022-05-11T13:23:45Z INFO 192.168.33.1 GET /api/v2/logs/ingest HTTP/1.0 200"
}

Perform basic math on attributes

You can parse out specific values from a JSON file, perform calculations, and format the results by leveraging DQL functions and operators.

Configure a DQL processor in the Processing stage with the following definition.

parse content, "LD 'total: ' INT:total '; failed: ' INT:failed" // Parses `total` and `failed` field values.
| fieldsAdd failed.percentage = 100.0 * failed / total // Calculates the failure percentage and stores it in a new attribute (`failed.percentage`).
| fieldsRemove total, failed // Removes the temporary fields that are no longer needed.
Conclusion

You calculated the failure percentage based on the JSON content and created a new dedicated field.

Before

{
"content": "Lorem ipsum total: 1000; failed: 255",
}

After

{
"content": "Lorem ipsum total: 1000; failed: 255",
"failed.percentage": 25.5
}

Add new attributes

You can add attributes that have static or dynamic values by leveraging different processors, with and without DQL queries.

To add attributes

  • Configure an Add fields processor in the Processing stage by providing the field names and values. This processor doesn't leverage DQL queries.

  • Configure a DQL processor in the Processing stage by entering a definition that contains the fieldsAdd command, such as the sketch below.
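
    A minimal sketch, assuming hypothetical attribute names (deployment.environment and log.summary are placeholders, not part of this article):

    fieldsAdd deployment.environment = "production" // Adds an attribute with a static value
    | fieldsAdd log.summary = concat(loglevel, ": ", log.source) // Adds an attribute with a dynamic value derived from existing fields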

Remove attributes

You can remove attributes by leveraging different processors, with and without DQL queries.

To remove specific fields

  • Configure a Remove fields processor in the Processing stage by providing the field names. This processor doesn't leverage DQL queries.

  • Configure a DQL processor in the Processing stage by entering a definition that contains the fieldsRemove command, such as the following example:

    fieldsRemove redundant.attribute
Conclusion

The redundant.attribute field is removed from the record, while the remaining fields are kept.

Before

{
"redundant.attribute": "value",
"content": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla ac neque nisi. Nunc accumsan sollicitudin lacus."
}

After

{
"content": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nulla ac neque nisi. Nunc accumsan sollicitudin lacus."
}

Rename attributes

You can rename attributes by leveraging different processors, with and without DQL queries.

To rename an attribute of a matching record

  • Configure a Rename fields processor in the Processing stage by providing the fields that you want to rename and their new names. This processor doesn't leverage DQL queries.

  • Configure a DQL processor in the Processing stage by entering a definition that contains the fieldsRename command, such as the following example:

    fieldsRename better_name = field // Renames the field `field` to `better_name`
Conclusion

Before

{
"content": {"field": "Lorem ipsum"}
}

After

"content": {"better_name": "Lorem ipsum"}

Drop records

You can drop ingested records at different stages by leveraging different processors.

To drop an ingested record

  • Before it's processed, configure a Drop record processor in the Processing stage by providing a matcher query.
  • After it's processed, configure a No storage assignment processor in the Storage stage by providing a matcher query.
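
In both cases, the matcher is a DQL matching condition. A minimal sketch, assuming you want to drop all debug records (the loglevel value is only an illustration):

matchesValue(loglevel, "DEBUG")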
Conclusion

The matching records won't be stored in Grail.

Mask data

You can mask parts of an attribute by leveraging replacePattern in combination with other DQL functions.

In this scenario you want to mask part of an IP address. Configure a DQL processor in the Processing stage with one of the following definitions, depending on the part that you want to mask.
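
For orientation, a minimal sketch that masks the first three octets of an IPv4 address and keeps the last one. It is not necessarily one of the definitions referenced above, and it assumes that replacePattern takes the source field, a DPL pattern, and a replacement string.

fieldsAdd content = replacePattern(content, "INT '.' INT '.' INT '.'", "x.x.x.") // Replaces the network part of IPv4-like sequences in content with a mask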