
PII & Secrets Masking

Automatic redaction of API keys, tokens, passwords, emails, SSNs, and credit card numbers before spans leave the process. Zero-tolerance policy enforced by SpanProcessor, fuzz-tested with cargo fuzz.

CoreSDK enforces a zero-tolerance policy on PII and secrets in telemetry. Redaction runs inside a SpanProcessor, not a SpanExporter, so sensitive values are scrubbed before they enter the export queue. Values matched by the ruleset never reach your collector, log sink, or external vendor in raw form.

What gets redacted

The default ruleset covers the following categories out of the box:

| Category | Examples | Detection method |
| --- | --- | --- |
| API keys | sk-..., Bearer ..., Authorization headers | Pattern prefix + aho-corasick keyword scan |
| Passwords | password=, passwd=, pwd= in query strings or JSON | aho-corasick keyword scan |
| Tokens | JWTs (eyJ...), OAuth tokens, session cookies | Regex (base64url structure) |
| Email addresses | user@example.com | RFC 5322 regex |
| US Social Security Numbers | 123-45-6789 | Regex |
| Credit card numbers | 13–19 digit sequences with Luhn check | Regex + Luhn verification |
| Private keys | PEM blocks (-----BEGIN PRIVATE KEY-----) | Regex |

Redacted values are replaced with [REDACTED]. The span attribute key is preserved so you can see which field was sensitive without exposing its value.
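
The "Regex + Luhn verification" row can be illustrated with a minimal sketch. The pattern and the function names below (CARD_CANDIDATE, luhn_valid, redact_cards) are assumptions made for this example and are not CoreSDK's actual ruleset; the idea is only that a digit run is redacted when it both looks like a card number and passes the Luhn checksum.

```python
import re

# Hypothetical candidate pattern: 13 to 19 digits, optionally separated
# by single spaces or hyphens. Not the SDK's real regex.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk right to left, doubling every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_cards(value: str) -> str:
    """Replace Luhn-valid card-like digit runs with [REDACTED]."""
    def _sub(match: re.Match) -> str:
        digits = re.sub(r"[ -]", "", match.group())
        return "[REDACTED]" if luhn_valid(digits) else match.group()

    return CARD_CANDIDATE.sub(_sub, value)
```

The Luhn step is what keeps ordinary long numbers, such as order IDs, from being redacted just because they have the right length.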

Why SpanProcessor, not SpanExporter

The masking processor implements opentelemetry_sdk::trace::SpanProcessor and hooks into on_end. This fires after your handler sets span attributes but before the batch exporter flushes to the collector. A SpanExporter-based approach would allow raw values to sit in the export queue; the SpanProcessor approach eliminates that window entirely.
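
The ordering argument can be shown with a toy model. This is deliberately not the OpenTelemetry API: ToyMaskingProcessor, SENSITIVE_KEYS, and the plain deque standing in for the export queue are all hypothetical, and the sketch only illustrates that scrubbing happens before anything is enqueued.

```python
from collections import deque

REDACTED = "[REDACTED]"
SENSITIVE_KEYS = {"password", "authorization"}  # hypothetical, for illustration


class ToyMaskingProcessor:
    """Toy stand-in for a masking span processor (not the OpenTelemetry API)."""

    def __init__(self, export_queue: deque):
        self.export_queue = export_queue

    def on_end(self, attributes: dict) -> None:
        # Scrub first, then enqueue: the queue only ever holds masked copies.
        masked = {
            key: REDACTED if key in SENSITIVE_KEYS else value
            for key, value in attributes.items()
        }
        self.export_queue.append(masked)


queue: deque = deque()
processor = ToyMaskingProcessor(queue)
processor.on_end({"password": "s3cr3t", "http.method": "POST"})
```

In this model an exporter draining the queue has no raw value to leak, which is the property the section describes.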

Detection internals

Two complementary engines run on every span attribute value:

aho-corasick keyword scan: a compiled multi-pattern automaton built at SDK startup from the keyword list. Scan time is linear in the length of the attribute value, regardless of how many patterns are active. Used for keyword-prefix matching (e.g., any attribute whose value starts with "Bearer ", or any JSON key named password).

Regex engine: used for structural patterns (email, SSN, credit card) that require positional matching. Each regex is pre-compiled and anchored to avoid catastrophic backtracking.

Both engines operate on raw attribute string values after JSON serialization, so structured log fields are covered as well as free-form strings.
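
A rough sketch of the two-pass idea, under stated assumptions: the Python standard library has no Aho-Corasick automaton, so a plain prefix check stands in for the keyword scan, and the two regexes are simplified illustrations rather than the SDK's patterns. The name mask_value_sketch is hypothetical and is not the processor's mask_value method.

```python
import re

REDACTED = "[REDACTED]"

# Stand-in for the keyword automaton: a tuple of credential prefixes.
KEYWORD_PREFIXES = ("Bearer ", "sk-")

# Simplified structural patterns, for illustration only.
STRUCTURAL_PATTERNS = [
    re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),  # email
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),                           # US SSN
]


def mask_value_sketch(value: str) -> str:
    # Pass 1: keyword scan. A value that starts with a known credential
    # prefix is replaced as a whole.
    if value.startswith(KEYWORD_PREFIXES):
        return REDACTED
    # Pass 2: structural patterns. Only the matching substring is replaced.
    for pattern in STRUCTURAL_PATTERNS:
        value = pattern.sub(REDACTED, value)
    return value
```

Running the cheap keyword pass first means values that are obviously credentials never reach the regex pass at all.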

Python: registering the masking processor

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from coresdk.tracing.processor import PIIMaskingSpanProcessor

provider = TracerProvider()
provider.add_span_processor(PIIMaskingSpanProcessor())
trace.set_tracer_provider(provider)

The PIIMaskingSpanProcessor is registered directly on the OTel TracerProvider. It intercepts every span before export.

Masking individual values

from coresdk.tracing.processor import PIIMaskingSpanProcessor

processor = PIIMaskingSpanProcessor()

# Mask a single value
safe = processor.mask_value("user@example.com")
# → "[REDACTED]"

# Mask a dict of span attributes
safe_attrs = processor.mask_attributes({
    "http.request.body": '{"email":"user@example.com","password":"s3cr3t"}',
    "user.id": "usr_123",
})
# → {"http.request.body": '{"email":"[REDACTED]","password":"[REDACTED]"}', "user.id": "usr_123"}

Rust: registering the masking processor

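use coresdk_engine::masking::PIIMaskingSpanProcessor;
use opentelemetry_sdk::trace::TracerProvider;

fn main() {
    // Register the masking processor so spans are scrubbed before export.
    let provider = TracerProvider::builder()
        .with_span_processor(PIIMaskingSpanProcessor::default())
        .build();

    // Install the provider globally, mirroring the Python example.
    opentelemetry::global::set_tracer_provider(provider);
}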

Structured JSON masking

When a span attribute contains a JSON string, the masking processor deserializes it, walks every key-value pair, and redacts matching values before re-serializing. This covers request/response body logging.

# Attribute value before masking:
# {"user": "alice", "password": "s3cr3t", "email": "alice@example.com"}

# Attribute value after masking:
# {"user": "alice", "password": "[REDACTED]", "email": "[REDACTED]"}

JSON masking is depth-limited to 16 levels to prevent stack overflow on adversarial input.
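
The deserialize, walk, re-serialize flow with a depth cap might look like the sketch below. Everything here is an assumption for illustration: the names mask_json_attribute and _mask_node, the key list, the simplified email regex, and in particular the choice to redact whatever lies past the depth limit, which the documentation does not specify.

```python
import json
import re

REDACTED = "[REDACTED]"
MAX_DEPTH = 16
SENSITIVE_KEYS = {"password", "passwd", "pwd"}  # hypothetical key list
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def _mask_node(node, depth: int):
    if depth > MAX_DEPTH:
        # Assumed behaviour: redact anything nested deeper than the limit
        # instead of recursing further.
        return REDACTED
    if isinstance(node, dict):
        return {
            key: REDACTED if key.lower() in SENSITIVE_KEYS
            else _mask_node(value, depth + 1)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [_mask_node(item, depth + 1) for item in node]
    if isinstance(node, str):
        return EMAIL.sub(REDACTED, node)
    return node


def mask_json_attribute(raw: str) -> str:
    """Mask a JSON-encoded attribute value; return non-JSON input unchanged."""
    try:
        parsed = json.loads(raw)
    except RecursionError:
        return REDACTED  # pathological nesting: fail closed
    except ValueError:
        return raw  # not JSON; left to the string-level detectors
    return json.dumps(_mask_node(parsed, depth=1))
```

Bounding the recursion explicitly keeps the cost of adversarial input predictable, and failing closed past the limit avoids exporting values that were never inspected.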

Testing: assert_no_pii

Every SDK ships an assert_no_pii helper for use in test suites. It inspects all spans produced during a test and raises if any attribute value contains an unredacted sensitive value.

from coresdk.testing._mock import assert_no_pii, MockSDK

async def test_login_does_not_leak_password():
    mock = MockSDK()

    await login_handler(mock, password="hunter2")

    # Raises AssertionError if any span attribute contains "hunter2"
    # or matches any built-in PII pattern
    assert_no_pii(mock.spans())

assert_no_pii checks both the pattern ruleset and a list of literal values you supply. Use it on every endpoint that touches user input.
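
To make the two kinds of check concrete, here is a minimal sketch of what such a helper could do. It is not the SDK's implementation: assert_no_pii_sketch, its literals parameter, the representation of spans as plain attribute dicts, and the two simplified patterns are all assumptions for this example.

```python
import re

# Simplified patterns, for illustration only.
PII_PATTERNS = [
    re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),  # email
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),                           # US SSN
]


def assert_no_pii_sketch(spans, literals=()):
    """Raise AssertionError if any attribute value leaks a literal or matches a pattern.

    spans: iterable of attribute dicts (a stand-in for real span objects).
    literals: exact strings that must never appear, e.g. a test password.
    """
    for index, attributes in enumerate(spans):
        for key, value in attributes.items():
            text = str(value)
            for literal in literals:
                if literal in text:
                    raise AssertionError(f"span {index}: literal leaked in {key!r}")
            for pattern in PII_PATTERNS:
                if pattern.search(text):
                    raise AssertionError(f"span {index}: PII pattern matched in {key!r}")
```

The literal list matters because a test password such as "hunter2" has no recognizable structure, so pattern matching alone would not catch it.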

Rust equivalent

use coresdk_engine::testing::assert_no_pii;
use opentelemetry_sdk::testing::trace::TestSpanExporter;

#[tokio::test]
async fn login_endpoint_does_not_leak_password() {
    let exporter = TestSpanExporter::default();
    // ... configure provider with exporter ...

    login_handler(request_with_password("hunter2")).await;

    assert_no_pii(exporter.spans());
}

Fuzz testing

The masking processor is fuzz-tested with cargo fuzz. The fuzz corpus covers:

  • Arbitrary UTF-8 strings, including null bytes and overlong sequences
  • Deeply nested JSON with adversarial key names
  • Strings designed to trigger catastrophic backtracking in naive regex engines
  • Partial PII values split across JSON field boundaries

Run the fuzz target locally:

# From the repo root:
cargo fuzz run masking_processor -- -max_total_time=60

CI runs the fuzz corpus in replay mode on every pull request. New crash inputs are committed to the corpus automatically.

Compliance references

| Standard | Requirement addressed |
| --- | --- |
| GDPR Art. 25 | Data protection by design and by default: masking at the instrumentation layer prevents accidental export |
| CCPA | Personal information must not be disclosed to third parties without consent: redaction prevents telemetry vendors from receiving PII |
| ISO 27018 | Controls for protection of PII in public cloud computing: covers secrets and tokens in log pipelines |

Next steps

  • Observability — how spans flow from SpanProcessor to your OTel collector
  • TLS & mTLS — HMAC keys distributed over mTLS only, never in config or environment variables
  • Error Handling — RFC 9457 errors never include raw PII in the detail field
