OTel GenAI
Kalibra auto-detects OTel GenAI traces — any JSONL with gen_ai.* attributes. Validated with opentelemetry-instrumentation-openai-v2. Compatible with any exporter that preserves standard gen_ai.* span attributes. No field mapping needed — just export your spans and compare.
Try the interactive tutorial — uses pre-recorded traces, no API key needed.
The workflow
1. Export traces as JSONL
Any exporter that writes OTel spans works, as long as your instrumentor emits gen_ai.* attributes (not openinference.*). opentelemetry-instrumentation-openai-v2, for example, does this natively.
Export from your OTel collector or tracing backend as one span per line:
import json
# spans = your_collector.export() # however your platform exports
with open("traces.jsonl", "w") as f:
    for span in spans:
        f.write(json.dumps(span, default=str) + "\n")
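For reference, one exported line might look like the sketch below. The gen_ai.* attribute names follow the OTel GenAI semantic conventions; the envelope fields (`context`, `parent_id`, timestamps) vary by exporter, and the values here are illustrative:

```python
import json

# A minimal sketch of a single span as one JSONL line. Envelope fields
# vary by exporter; attribute names follow the OTel GenAI conventions.
span = {
    "name": "chat gpt-4o",
    "context": {"trace_id": "0xabc123", "span_id": "0xdef456"},
    "parent_id": None,
    "start_time": "2024-01-01T00:00:00",
    "end_time": "2024-01-01T00:00:02",
    "attributes": {
        "gen_ai.operation.name": "chat",
        "gen_ai.request.model": "gpt-4o",
        "gen_ai.usage.input_tokens": 120,
        "gen_ai.usage.output_tokens": 48,
        "gen_ai.response.finish_reasons": ["stop"],
    },
}
print(json.dumps(span))  # one line of traces.jsonl
```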
2. Compare with Kalibra
from kalibra.loader import load_traces
from kalibra.engine import compare
from kalibra.renderers import render
baseline = load_traces("before.jsonl")
current = load_traces("after.jsonl")
result = compare(baseline, current, require=["regressions <= 2"])
print(render(result, "terminal", verbose=True))
That's it. Kalibra detects the gen_ai.* attributes automatically and handles the rest.
Single file with both populations?
If your OTel collector streams all traces into one JSONL file, use where clauses in the YAML config to split them dynamically by any root span attribute — variant, git_sha, branch, etc.
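A config along these lines could split one file by a root span attribute. This is a sketch: the `where` expression syntax is described above, but the surrounding keys (`baseline`, `current`, `path`) are assumptions — check the Kalibra config reference for the exact schema:

```yaml
# Hypothetical sketch: two populations from one JSONL file,
# split by a "variant" root span attribute.
baseline:
  path: traces.jsonl
  where: "variant == 'control'"
current:
  path: traces.jsonl
  where: "variant == 'treatment'"
```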
What happens under the hood
OTel GenAI exports are flat arrays of spans — one span per line, each carrying a context.trace_id. Kalibra:
- Groups spans into traces by `trace_id`
- Builds the span tree using `parent_id` relationships
- Extracts tokens from `gen_ai.usage.input_tokens` and `gen_ai.usage.output_tokens`
- Maps finish reasons to outcomes — `gen_ai.response.finish_reasons: ["stop"]` → success, `["length"]` → failure (truncated). Also maps `max_tokens`, `content_filter`, `safety`, and `recitation` to failure
- Reports cost as N/A — the OTel GenAI spec has no cost attribute. If your platform adds a vendor-specific cost field, map it via `fields.cost` in your config
- Counts steps as leaf spans, not total spans
- Computes duration as wall-clock time (`max(end) - min(start)`)
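The grouping and duration steps above can be sketched as follows. This is a simplified illustration of the logic, not Kalibra's actual implementation; the span shape (a `context.trace_id` plus ISO-format timestamps) is assumed:

```python
from collections import defaultdict
from datetime import datetime

def group_and_time(spans):
    """Group flat spans by trace_id, then compute wall-clock duration
    per trace as max(end) - min(start), in seconds."""
    traces = defaultdict(list)
    for span in spans:
        traces[span["context"]["trace_id"]].append(span)
    durations = {}
    for trace_id, members in traces.items():
        starts = [datetime.fromisoformat(s["start_time"]) for s in members]
        ends = [datetime.fromisoformat(s["end_time"]) for s in members]
        durations[trace_id] = (max(ends) - min(starts)).total_seconds()
    return traces, durations

# Two overlapping spans in one trace: wall-clock is 3s, not the 1s + 2s sum.
spans = [
    {"context": {"trace_id": "t1"}, "start_time": "2024-01-01T00:00:00",
     "end_time": "2024-01-01T00:00:01"},
    {"context": {"trace_id": "t1"}, "start_time": "2024-01-01T00:00:01",
     "end_time": "2024-01-01T00:00:03"},
]
traces, durations = group_and_time(spans)
print(durations["t1"])  # 3.0
```

Note the wall-clock definition: it spans from the earliest start to the latest end, so concurrent child spans are not double-counted.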
Platform compatibility
Kalibra works with any exporter that preserves gen_ai.* span attributes in JSONL. Platforms adopting the OTel GenAI semantic conventions include:
| Platform | gen_ai.* support | Notes |
|---|---|---|
| opentelemetry-instrumentation-openai-v2 | Validated | What the tutorial traces use |
| PydanticAI / Logfire | Expected | Built on OTel |
| Langfuse | Expected | May include custom cost attributes — map via `fields.cost` |
| Datadog LLM Observability | Expected | Uses gen_ai.* conventions |
| OpenLLMetry (Traceloop) | Expected | OTel-based auto-instrumentation |
Note
"Expected" means the platform uses OTel GenAI conventions based on their documentation, but Kalibra has not been tested against their specific JSONL export format. If you encounter issues, please open an issue.
What's detected
| Format | Auto-detected | Notes |
|---|---|---|
| OTel GenAI JSONL (`gen_ai.*` attributes) | Yes | Any exporter that preserves `gen_ai.*` attributes |
| Explicit format selection | `--trace-format otel-genai` | Use when auto-detection fails |
| Flat Kalibra JSONL (one trace per line) | Yes | Fallback format, no `gen_ai.*` needed |
Differences from OpenInference
| Aspect | OTel GenAI | OpenInference |
|---|---|---|
| Token attributes | `gen_ai.usage.input_tokens` | `llm.token_count.prompt` |
| Cost attribute | None (spec doesn't define it) | `llm.cost.total` |
| Finish reason | `gen_ai.response.finish_reasons` (array) | Nested in `output.value` JSON |
| Span classification | `gen_ai.operation.name` | `openinference.span.kind` |
| Typical tree depth | 2 levels (agent → operation) | Arbitrary (CHAIN → LLM/TOOL → ...) |