PII & Secrets Masking
Automatic redaction of API keys, tokens, passwords, emails, SSNs, and credit card numbers before spans leave the process. Zero-tolerance policy enforced by SpanProcessor, fuzz-tested with cargo fuzz.
CoreSDK enforces a zero-tolerance policy on PII and secrets in telemetry. Redaction runs inside a SpanProcessor, not a SpanExporter — sensitive values are scrubbed before they enter the export queue. Nothing reaches your collector, log sink, or external vendor with raw credentials or personal data.
What gets redacted
The default ruleset covers the following categories out of the box:
| Category | Examples | Detection method |
|---|---|---|
| API keys | sk-..., Bearer ..., Authorization headers | Pattern prefix + aho-corasick keyword scan |
| Passwords | password=, passwd=, pwd= in query strings or JSON | aho-corasick keyword scan |
| Tokens | JWTs (eyJ...), OAuth tokens, session cookies | Regex (base64url structure) |
| Email addresses | user@example.com | RFC 5322 regex |
| US Social Security Numbers | 123-45-6789 | Regex |
| Credit card numbers | 13–19 digit sequences with Luhn check | Regex + Luhn verification |
| Private keys | PEM blocks (-----BEGIN PRIVATE KEY-----) | Regex |
Redacted values are replaced with [REDACTED]. The span attribute key is preserved so you can see which field was sensitive without exposing its value.
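The credit card row in the table above pairs a regex with a Luhn check. The following is a minimal sketch of that combination, not CoreSDK's actual implementation: the names CARD_CANDIDATE, luhn_valid, and redact_cards are hypothetical, and the candidate regex is an assumption about what a 13 to 19 digit pattern might look like.

```python
import re

# Assumed candidate pattern: 13-19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk right to left, doubling every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_cards(text: str) -> str:
    """Replace card-like digit runs that pass the Luhn check with [REDACTED]."""
    def _sub(match):
        digits = re.sub(r"[ -]", "", match.group(0))
        return "[REDACTED]" if luhn_valid(digits) else match.group(0)

    return CARD_CANDIDATE.sub(_sub, text)
```

The Luhn step is what keeps ordinary long numbers (order IDs, timestamps) from being redacted just because they have the right length.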
Why SpanProcessor, not SpanExporter
The masking processor implements opentelemetry_sdk::trace::SpanProcessor and hooks into on_end. This fires after your handler sets span attributes but before the batch exporter flushes to the collector. A SpanExporter-based approach would allow raw values to sit in the export queue; the SpanProcessor approach eliminates that window entirely.
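The ordering argument can be illustrated with a toy model. This sketch does not use the OpenTelemetry interfaces: ToyMaskingProcessor and its export_queue are hypothetical, and it masks by attribute key only to keep the example short, whereas the real processor scans values.

```python
from collections import deque

REDACTED = "[REDACTED]"


class ToyMaskingProcessor:
    """Toy stand-in for a span processor; not the OpenTelemetry interface."""

    def __init__(self, sensitive_keys):
        self.sensitive_keys = set(sensitive_keys)
        self.export_queue = deque()  # what a batch exporter would later drain

    def on_end(self, span_attributes: dict) -> None:
        # Scrub first, enqueue second: the queue never holds raw values.
        safe = {
            key: REDACTED if key in self.sensitive_keys else value
            for key, value in span_attributes.items()
        }
        self.export_queue.append(safe)


processor = ToyMaskingProcessor(sensitive_keys={"http.request.header.authorization"})
processor.on_end({
    "http.request.header.authorization": "Bearer abc123",
    "http.route": "/login",
})
```

Because masking happens inside on_end, anything that later reads export_queue only ever sees the scrubbed copy.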
Detection internals
Two complementary engines run on every span attribute value:
- aho-corasick keyword scan: a compiled multi-pattern automaton built at SDK startup from the keyword list. Matching time is linear in the length of the attribute value, regardless of how many patterns are active. Used for keyword-prefix matching (e.g., any attribute whose value starts with Bearer followed by a space, or any JSON key named password).
- Regex engine: used for structural patterns (email, SSN, credit card) that require positional matching. Each regex is pre-compiled and anchored to avoid catastrophic backtracking.
Both engines operate on raw attribute string values after JSON serialization, so structured log fields are covered as well as free-form strings.
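A rough sketch of the two-pass idea, using only the Python standard library. This is an illustration under assumptions, not the SDK's code: the keyword pass is approximated with a single regex alternation rather than a real Aho-Corasick automaton, the email pattern is a simplification rather than full RFC 5322, and mask_text, KEYWORDS, and STRUCTURAL are hypothetical names.

```python
import re

REDACTED = "[REDACTED]"

# Stand-in for the keyword automaton: one alternation over a hypothetical keyword list.
KEYWORDS = ["password=", "passwd=", "pwd="]
KEYWORD_SCAN = re.compile(
    "(" + "|".join(re.escape(k) for k in KEYWORDS) + r")([^&\s]+)",
    re.IGNORECASE,
)

# Structural patterns, compiled once at import time.
STRUCTURAL = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),             # US SSN shape
    re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),  # simplified email
]


def mask_text(value: str) -> str:
    # Pass 1: keep the keyword, redact the value that follows it.
    value = KEYWORD_SCAN.sub(lambda m: m.group(1) + REDACTED, value)
    # Pass 2: redact anything with a sensitive structure.
    for pattern in STRUCTURAL:
        value = pattern.sub(REDACTED, value)
    return value
```

Running the keyword pass first means a secret that follows a known keyword is caught even when it has no recognizable structure of its own.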
Python: registering the masking processor
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from coresdk.tracing.processor import PIIMaskingSpanProcessor
provider = TracerProvider()
provider.add_span_processor(PIIMaskingSpanProcessor())
trace.set_tracer_provider(provider)
The PIIMaskingSpanProcessor is registered directly on the OTel TracerProvider. It intercepts every span before export.
Masking individual values
from coresdk.tracing.processor import PIIMaskingSpanProcessor
processor = PIIMaskingSpanProcessor()
# Mask a single value
safe = processor.mask_value("user@example.com")
# → "[REDACTED]"
# Mask a dict of span attributes
safe_attrs = processor.mask_attributes({
    "http.request.body": '{"email":"user@example.com","password":"s3cr3t"}',
    "user.id": "usr_123",
})
# → {"http.request.body": '{"email":"[REDACTED]","password":"[REDACTED]"}', "user.id": "usr_123"}
Rust: registering the masking processor
use coresdk_engine::{Engine, EngineConfig};
use coresdk_engine::masking::PIIMaskingSpanProcessor;
use opentelemetry_sdk::trace::TracerProvider;
let provider = TracerProvider::builder()
    .with_span_processor(PIIMaskingSpanProcessor::default())
    .build();
Structured JSON masking
When a span attribute contains a JSON string, the masking processor deserializes it, walks every key-value pair, and redacts matching values before re-serializing. This covers request/response body logging.
# Attribute value before masking:
# {"user": "alice", "password": "s3cr3t", "email": "alice@example.com"}
# Attribute value after masking:
# {"user": "alice", "password": "[REDACTED]", "email": "[REDACTED]"}
JSON masking is depth-limited to 16 levels to prevent stack overflow on adversarial input.
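The deserialize, walk, re-serialize flow with a depth limit might look like the sketch below. It is a simplified illustration: SENSITIVE_KEYS, _walk, and mask_json_attribute are hypothetical names, matching is by key name only, and redacting the whole subtree once the limit is exceeded is an assumption about fail-closed behavior, not documented SDK behavior.

```python
import json

REDACTED = "[REDACTED]"
MAX_DEPTH = 16  # the depth limit stated above
SENSITIVE_KEYS = {"password", "passwd", "pwd", "email"}  # hypothetical key list


def _walk(node, depth):
    if depth > MAX_DEPTH:
        # Assumption: fail closed by redacting everything past the limit.
        return REDACTED
    if isinstance(node, dict):
        return {
            key: REDACTED if key.lower() in SENSITIVE_KEYS else _walk(value, depth + 1)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [_walk(item, depth + 1) for item in node]
    return node  # scalars pass through unchanged in this sketch


def mask_json_attribute(raw: str) -> str:
    try:
        parsed = json.loads(raw)
    except ValueError:
        return raw  # not JSON; leave it to the string-level scanners
    except RecursionError:
        return REDACTED  # nesting too deep even to parse; fail closed
    return json.dumps(_walk(parsed, 1))
```

Bounding the recursion explicitly keeps the walker's stack use predictable no matter how deeply an attacker nests the payload.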
Testing: assert_no_pii
Every SDK ships an assert_no_pii helper for use in test suites. It inspects all spans produced during a test and raises if any attribute value contains an unredacted sensitive value.
from coresdk.testing._mock import assert_no_pii, MockSDK
async def test_login_does_not_leak_password():
    mock = MockSDK()
    await login_handler(mock, password="hunter2")
    # Raises AssertionError if any span attribute contains "hunter2"
    # or matches any built-in PII pattern
    assert_no_pii(mock.spans())
assert_no_pii checks both the pattern ruleset and a list of literal values you supply. Use it on every endpoint that touches user input.
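To make the two kinds of check concrete, here is a minimal sketch of a helper in the same spirit. It is not the SDK's assert_no_pii: check_no_pii, its literals parameter, the PII_PATTERNS list, and the assumption that each span is a plain dict of attributes are all hypothetical.

```python
import re

# Hypothetical, simplified pattern ruleset.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),             # US SSN shape
    re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),  # simplified email
]


def check_no_pii(spans, literals=()):
    """Raise AssertionError if any attribute value leaks a literal or matches a pattern.

    `spans` is assumed to be an iterable of dicts mapping attribute keys to values.
    """
    for index, attributes in enumerate(spans):
        for key, value in attributes.items():
            text = str(value)
            # Literal check: exact secrets the test author knows were sent.
            for literal in literals:
                if literal in text:
                    raise AssertionError(f"span {index}: {key!r} contains a supplied literal")
            # Pattern check: anything that looks like PII, known or not.
            for pattern in PII_PATTERNS:
                if pattern.search(text):
                    raise AssertionError(f"span {index}: {key!r} matches {pattern.pattern}")
```

The literal check catches secrets with no recognizable shape (such as a short password), while the pattern check catches PII the test author did not think to list.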
Rust equivalent
use coresdk_engine::testing::assert_no_pii;
use opentelemetry_sdk::testing::trace::TestSpanExporter;
#[tokio::test]
async fn login_endpoint_does_not_leak_password() {
    let exporter = TestSpanExporter::default();
    // ... configure provider with exporter ...
    login_handler(request_with_password("hunter2")).await;
    assert_no_pii(exporter.spans());
}
Fuzz testing
The masking processor is fuzz-tested with cargo fuzz. The fuzz corpus covers:
- Arbitrary and malformed UTF-8 input, including null bytes and overlong sequences
- Deeply nested JSON with adversarial key names
- Strings designed to trigger catastrophic backtracking in naive regex engines
- Partial PII values split across JSON field boundaries
Run the fuzz target locally:
# From the repo root:
cargo fuzz run masking_processor -- -max_total_time=60
CI runs the fuzz corpus in replay mode on every pull request. New crash inputs are committed to the corpus automatically.
Compliance references
| Standard | Requirement addressed |
|---|---|
| GDPR Art. 25 | Data protection by design and by default — masking at the instrumentation layer prevents accidental export |
| CCPA | Personal information must not be disclosed to third parties without consent — redaction prevents telemetry vendors from receiving PII |
| ISO 27018 | Controls for protection of PII in public cloud computing — covers secrets and tokens in log pipelines |
Next steps
- Observability: how spans flow from SpanProcessor to your OTel collector
- TLS & mTLS: HMAC keys distributed over mTLS only, never in config or environment variables
- Error Handling: RFC 9457 errors never include raw PII in the detail field