LLM Trace Export
Send a minimal, privacy-safe diagnostic payload to an LLM for root-cause analysis — no variable values, no PII, local-only by default.
Phase note: ships in Phase 2.
CoreSDK can export a minimal diagnostic payload to a local or external LLM endpoint to assist with root-cause analysis. The payload is deliberately narrow: it contains the RFC 9457 error detail, a call chain of module/file/line/intent per span, and recovery hints. It does not contain variable values, request bodies, or any data that could carry PII.
Variable inspection stays in the local trace viewer (core trace tail). LLM export is a separate,
opt-in pipeline.
Privacy model
| What is included | What is excluded |
|---|---|
| RFC 9457 ProblemDetail (type URI, title, status, detail string) | Variable values at any stack frame |
| Call chain: module, file, line number, span intent | Request headers and bodies |
| Span durations and error flags | Secret values |
| Recovery hints attached by CoreSDK middleware | User IDs and tenant IDs |
| Service name and SDK version | Any attribute that matches the PII masking ruleset |
The LLM export pipeline is a separate SpanProcessor that runs after PII masking. Even if a
variable value somehow reached the span as an attribute, it would be redacted by the masking
processor before the LLM processor sees it.
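As a sketch of that ordering guarantee, the following illustrates how a masking step applied before payload construction keeps attribute values out of the export even if they were recorded. The helper names (`mask_pii`, `build_llm_payload`) are invented for illustration, not the real CoreSDK API:

```python
# Illustrative only: mask_pii and build_llm_payload are hypothetical names,
# standing in for the PII masking processor and the LLM export processor.
MASK_PATTERNS = ("token", "password", "user_id", "tenant_id")

def mask_pii(attributes: dict) -> dict:
    """Redact any attribute whose key matches the PII ruleset."""
    return {
        k: "[REDACTED]" if any(p in k.lower() for p in MASK_PATTERNS) else v
        for k, v in attributes.items()
    }

def build_llm_payload(span: dict) -> dict:
    """Keep only structural metadata: no attribute values at all."""
    return {
        "module": span["module"],
        "file": span["file"],
        "line": span["line"],
        "intent": span["intent"],
    }

span = {
    "module": "orders.payment",
    "file": "src/orders/payment.py",
    "line": 12,
    "intent": "validate payment token before charging",
    "attributes": {"payment_token": "tok_123", "region": "eu-west-1"},
}
span["attributes"] = mask_pii(span["attributes"])  # masking runs first
payload = build_llm_payload(span)                  # LLM processor drops attributes entirely
```

Note the two layers of defense: even if the masking step missed a key, the payload builder never copies attributes in the first place.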
Configuration
LLM export is local-only by default. The payload is written to a local socket consumed by the
core CLI and never leaves the host unless you configure an external endpoint.
To send to an external LLM API, set CORESDK_LLM_ENDPOINT explicitly. This is an opt-in step and
requires deliberate configuration — it will not happen automatically.
```yaml
# coresdk-sidecar.yaml
llm_export:
  enabled: true
  endpoint: ""                        # empty = local only; set to opt in to external LLM
  # endpoint: "https://api.openai.com/v1/chat/completions"
  customer_egress_policy: "acme/llm"  # Rego policy that gates export decisions
  max_payload_bytes: 8192             # truncate call chain if needed
```

Environment variables:
| Variable | Default | Description |
|---|---|---|
| CORESDK_LLM_EXPORT | false | Enable LLM export pipeline |
| CORESDK_LLM_ENDPOINT | — | External LLM API endpoint. Unset means local only. |
| CORESDK_LLM_EGRESS_POLICY | — | Rego policy path that must return allow = true before export |
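The max_payload_bytes setting implies a truncation step. A minimal sketch of how such a step could trim the call chain to fit, dropping the frames furthest from the error first (`truncate_call_chain` is illustrative, not the real SDK function):

```python
import json

def truncate_call_chain(payload: dict, max_bytes: int = 8192) -> dict:
    """Drop the oldest call-chain frames until the serialised payload fits."""
    while len(json.dumps(payload).encode()) > max_bytes and payload["call_chain"]:
        payload["call_chain"].pop(0)  # keep the frames closest to the error
    return payload

# A deep call chain that would exceed a small budget:
frame = {"module": "m", "file": "f.py", "line": 1, "intent": "x" * 100}
payload = {"error": {"status": 504}, "call_chain": [dict(frame) for _ in range(100)]}
trimmed = truncate_call_chain(payload, max_bytes=2048)
```

Dropping from the front of the chain is one plausible choice; the frames nearest the failing span are usually the most useful for root-cause analysis.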
Annotating spans with intent
Use the @trace decorator (Python) or the trace! macro (Rust) to attach a human-readable
intent string to a span. This is the primary signal the LLM uses to understand what each frame
was trying to do.
```python
from coresdk.tracing.decorator import trace

@trace(intent="validate payment token before charging")
async def validate_payment(token: str, amount: int) -> bool:
    # intent appears in the LLM payload call chain
    # the value of `token` is never exported
    return await payment_gateway.verify(token, amount)

@trace(intent="persist order to database")
async def save_order(order: Order) -> str:
    result = await db.insert("orders", order.dict())
    return result.id
```

```go
import "github.com/coresdk/sdk/telemetry"

func ValidatePayment(ctx context.Context, token string, amount int) (bool, error) {
    ctx, span := telemetry.Start(ctx, "validate_payment",
        telemetry.Intent("validate payment token before charging"),
        // Do NOT add token as an attribute — it stays out of all export paths
    )
    defer span.End()
    return paymentGateway.Verify(ctx, token, amount)
}

func SaveOrder(ctx context.Context, order Order) (string, error) {
    ctx, span := telemetry.Start(ctx, "save_order",
        telemetry.Intent("persist order to database"),
    )
    defer span.End()
    result, err := db.Insert(ctx, "orders", order)
    if err != nil {
        span.RecordError(err)
        return "", err
    }
    return result.ID, nil
}
```

```rust
use coresdk_engine::telemetry::{trace, RecoveryHint};

#[trace(intent = "validate payment token before charging")]
async fn validate_payment(token: &str, amount: u64) -> Result<bool, PaymentError> {
    // intent appears in the LLM payload call chain
    // `token` is never captured as a span attribute
    payment_gateway::verify(token, amount).await
}

#[trace(
    intent = "persist order to database",
    on_error = RecoveryHint::RetryWithBackoff,
)]
async fn save_order(order: &Order) -> Result<String, DbError> {
    db::insert("orders", order).await
}
```

Recovery hints
Recovery hints are short, actionable strings attached to error spans. They appear in the LLM payload alongside the RFC 9457 error detail to give the LLM context about what to suggest.
```rust
use coresdk_engine::telemetry::RecoveryHint;

// Built-in hints
RecoveryHint::RetryWithBackoff
RecoveryHint::CheckDownstreamHealth
RecoveryHint::ValidateInputSchema
RecoveryHint::RotateCredentials

// Custom hint
RecoveryHint::Custom("ensure the tenant ID header is present on all requests")
```

In Python and Go, pass a recovery_hint keyword argument to @trace or telemetry.Start:
```python
from coresdk.tracing.decorator import trace

@trace(intent="call downstream inventory service", recovery_hint="check inventory service health")
async def check_inventory(sku: str) -> int:
    return await inventory.get_stock(sku)
```

What the payload looks like
When an error span is exported, the LLM receives a JSON object structured as follows. Note the absence of variable values — only structural metadata is present.
```json
{
  "sdk_version": "1.2.0",
  "service": "orders-service",
  "error": {
    "type": "https://coresdk.dev/errors/upstream-timeout",
    "title": "Upstream Timeout",
    "status": 504,
    "detail": "inventory service did not respond within 2000ms"
  },
  "call_chain": [
    {
      "module": "orders.handler",
      "file": "src/orders/handler.py",
      "line": 42,
      "intent": "handle incoming order creation request",
      "duration_ms": 2105,
      "error": false
    },
    {
      "module": "orders.inventory",
      "file": "src/orders/inventory.py",
      "line": 17,
      "intent": "call downstream inventory service",
      "duration_ms": 2003,
      "error": true,
      "recovery_hint": "check inventory service health"
    }
  ],
  "egress_policy": "acme/llm",
  "egress_decision": "allow"
}
```

Customer egress policy
If your organisation requires approval logic before diagnostic data leaves the host, configure a
Rego policy path under llm_export.customer_egress_policy. The policy receives the same payload
structure shown above (minus the egress_decision field) as input and must return allow.
```rego
package acme.llm

import future.keywords.if

default allow = false

# Only export LLM diagnostics for errors — not slow-but-successful spans
allow if {
    input.error.status >= 500
    not contains(input.service, "payment")  # payment service exports nothing externally
}
```

If allow is false, the payload is discarded and a local log entry notes the egress denial. The
running service is not affected.
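Because the policy input mirrors the exported payload structure, the decision logic is straightforward to unit-test outside OPA. A Python mirror of the example policy above, useful for checking the policy's intent in CI (illustrative only; production evaluation happens in the Rego engine):

```python
def egress_allowed(payload: dict) -> bool:
    # Mirrors the example Rego policy: only 5xx errors are exportable,
    # and nothing from a payment service ever is.
    return payload["error"]["status"] >= 500 and "payment" not in payload["service"]

# Sample inputs shaped like the LLM payload shown earlier:
allowed = {"service": "orders-service", "error": {"status": 504}}
denied = {"service": "payment-api", "error": {"status": 504}}
client_error = {"service": "orders-service", "error": {"status": 400}}
```

Keeping a test like this alongside the Rego file guards against the two drifting apart when either is edited.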
Local trace viewer vs LLM export
The local trace viewer (core trace tail) shows full span data including attribute values. LLM
export is a separate pipeline with a much narrower payload. They are independent — enabling LLM
export does not change what the local viewer shows, and disabling LLM export does not affect local
visibility.
Next steps
- OpenTelemetry — full OTEL configuration and span attributes
- PII masking — how variable values are masked before any export
- Authorization & Policy — writing Rego egress policies