Telemetry data can often include sensitive information (such as PII), which may need to be redacted for security and regulatory reasons. While this can be implemented on the application side, it is typically best handled centrally using gateways such as the Collector. This enables single-point management of redaction rules across all your applications and services, without the need to update your code each time a new redaction rule is required.
This page shows sample Collector configurations for the redaction of specific sensitive data (for example, credit card numbers or email addresses) which may appear in telemetry data and which should be masked/redacted before leaving your network.
The following examples make use of these two Collector processors:

- the transform processor
- the redaction processor
While the following examples use both processors to mask data, each processor has its own distinct purpose. The redaction processor is straightforward: it takes a list of values and completely redacts any data that matches them. The transform processor, on the other hand, is more versatile, and its purpose goes beyond mere data redaction.

For data redaction, either processor can typically be used, and you should choose the one best suited to your use case. For example, for full data redaction, the redaction processor may be easier to use, whereas partial data redaction can only be achieved with the transform processor. In addition, the transform processor can also act on data in the body of logs, whereas the redaction processor only has access to attributes.
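To illustrate that last difference, the following sketch uses the transform processor to partially mask e-mail addresses inside the log body, which the redaction processor cannot reach. The pattern and the masking placeholder are illustrative assumptions, not part of the examples on this page; adjust them to your needs.

```yaml
transform:
  error_mode: ignore
  log_statements:
    - context: log
      statements:
        # partially mask e-mail addresses anywhere in the log body,
        # keeping the domain visible (example pattern, adjust as needed)
        - replace_pattern(body, "[a-zA-Z0-9._%+-]+@", "<masked-user>@")
```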
This YAML document is a basic Collector configuration skeleton, containing the general components (that is, receivers, exporters, and the pipeline definition).
```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  PLACEHOLDER-FOR-PROCESSOR-CONFIGURATIONS

exporters:
  otlphttp:
    endpoint: ${env:DT_ENDPOINT}
    headers:
      "Authorization": "Api-Token ${env:DT_API_TOKEN}"

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [PLACEHOLDER-FOR-PROCESSOR-REFERENCES]
      exporters: [otlphttp]
    metrics:
      receivers: [otlp]
      processors: [PLACEHOLDER-FOR-PROCESSOR-REFERENCES]
      exporters: [otlphttp]
    logs:
      receivers: [otlp]
      processors: [PLACEHOLDER-FOR-PROCESSOR-REFERENCES]
      exporters: [otlphttp]
```
Make sure to replace the placeholder values in the document with the respective configurations:

- `PLACEHOLDER-FOR-PROCESSOR-CONFIGURATIONS`: the relevant processor configuration
- `PLACEHOLDER-FOR-PROCESSOR-REFERENCES`: references to the applicable processor objects for the individual signal types

Using the transform processor, we mask the attribute `client.address` with the `set` statement.
```yaml
transform:
  error_mode: ignore
  trace_statements:
    - context: span
      statements: &filter-statements
        # this will not only mask end user client IP addresses,
        # but also the address of a server acting as a client
        # when establishing a connection to another server
        - set(attributes["client.address"], "<masked-ac-ot-clientip>")
  metric_statements:
    - context: datapoint
      statements: *filter-statements
  log_statements:
    - context: log
      statements: *filter-statements
```
Using the transform processor, we mask the attribute `user.email` with the `set` statement.
```yaml
transform:
  error_mode: ignore
  trace_statements:
    - context: span
      statements: &filter-statements
        - set(attributes["user.email"], "<masked-ac-ot-email>")
  metric_statements:
    - context: datapoint
      statements: *filter-statements
  log_statements:
    - context: log
      statements: *filter-statements
```
Using the redaction processor, we use the regular expression `dt0[a-z]0[1-9]\.[A-Za-z0-9]{24}\.([A-Za-z0-9]{64})` to mask all occurrences of Dynatrace API tokens in our telemetry data.
```yaml
redaction:
  allow_all_keys: true
  blocked_values:
    - dt0[a-z]0[1-9]\.[A-Za-z0-9]{24}\.([A-Za-z0-9]{64})
  summary: info
```
Using the transform processor, we mask the attributes `user.id`, `user.name`, and `user.full_name` with the `set` statement.
```yaml
transform:
  error_mode: ignore
  trace_statements:
    - context: span
      statements: &filter-statements
        - set(attributes["user.id"], "<masked-ac-ot-userid>")
        - set(attributes["user.name"], "<masked-ac-ot-username>")
        - set(attributes["user.full_name"], "<masked-ac-ot-userfullname>")
  metric_statements:
    - context: datapoint
      statements: *filter-statements
  log_statements:
    - context: log
      statements: *filter-statements
```
Using the transform processor, we configure three `replace_all_patterns` statements to mask any occurrences of credit card numbers, hiding everything but the last four digits.
```yaml
transform:
  error_mode: ignore
  trace_statements:
    - context: span
      statements: &filter-statements
        - replace_all_patterns(attributes, "value", "^3\\s*[47](\\s*[0-9]){9}((\\s*[0-9]){4})$", "<masked-ac-ot-pcard$$2>")
        - replace_all_patterns(attributes, "value", "^(5[1-5]([0-9]){2}|222[1-9]|22[3-9]\\d|2[3-6]\\d{2}|27[0-1]\\d|2720)(\\s*[0-9]){8}\\s*([0-9]{4})$", "<masked-ac-ot-pcard$$4>")
        - replace_all_patterns(attributes, "value", "^4(\\s*[0-9]){8,14}\\s*(([0-9]\\s*){4})$", "<masked-ac-ot-pcard$$2>")
  metric_statements:
    - context: datapoint
      statements: *filter-statements
  log_statements:
    - context: log
      statements: *filter-statements
```
Using the redaction processor, we use the regular expression `^[A-Z]{2}[0-9]{2}(\\s*[A-Z0-9]){8,30}$` to mask all IBAN occurrences in our telemetry data.
```yaml
redaction:
  allow_all_keys: true
  blocked_values:
    - "^[A-Z]{2}[0-9]{2}(\\s*[A-Z0-9]){8,30}$"
  summary: info
```
Validate your settings to avoid any configuration issues. Depending on your Collector distribution, the built-in `validate` command (for example, `otelcol validate --config=<your-config-file>`) can check the configuration without starting the Collector.
For our configuration, we use the following components.
Under `receivers`, we specify the standard `otlp` receiver as the active receiver component for our Collector instance.
Under `processors`, we place the configuration for the relevant processor instances.
Under `exporters`, we specify the default `otlphttp` exporter and configure it with our Dynatrace API URL and the required authentication token. For this purpose, we set the following two environment variables and reference them in the configuration values for `endpoint` and `Authorization`.
- `DT_ENDPOINT` contains the base URL of the Dynatrace API endpoint (for example, `https://{your-environment-id}.live.dynatrace.com/api/v2/otlp`)
- `DT_API_TOKEN` contains the API token

Under `service`, we finally assemble all the configured objects into pipelines for the individual telemetry signals (traces, metrics, and logs) and have the Collector instance run the configured tasks.
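Before starting the Collector, the two environment variables referenced by the exporter have to be set. A minimal sketch, with placeholder values that you must replace with your own environment ID and token:

```shell
# Set the environment variables referenced by the otlphttp exporter.
# Placeholder values -- substitute your own environment ID and API token.
export DT_ENDPOINT="https://{your-environment-id}.live.dynatrace.com/api/v2/otlp"
export DT_API_TOKEN="<your-api-token>"
```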