OTel GenAI

Kalibra auto-detects OTel GenAI traces — any JSONL with gen_ai.* attributes. Validated with opentelemetry-instrumentation-openai-v2. Compatible with any exporter that preserves standard gen_ai.* span attributes. No field mapping needed — just export your spans and compare.

Try the interactive tutorial — uses pre-recorded traces, no API key needed.

The workflow

Your OTel GenAI exporter → JSONL → Kalibra (compare)

1. Export traces as JSONL

Any exporter that writes OTel spans with gen_ai.* attributes works. The key requirement is that your instrumentor emits gen_ai.* attributes (not openinference.*). For example, opentelemetry-instrumentation-openai-v2 does this natively.

Export from your OTel collector or tracing backend as one span per line:

import json

# spans = your_collector.export()  # however your platform exports
with open("traces.jsonl", "w") as f:
    for span in spans:
        f.write(json.dumps(span, default=str) + "\n")

2. Compare with Kalibra

CLIPythonYAML config

kalibra compare --baseline before.jsonl --current after.jsonl -v

from kalibra.loader import load_traces
from kalibra.engine import compare
from kalibra.renderers import render

baseline = load_traces("before.jsonl")
current = load_traces("after.jsonl")

result = compare(baseline, current, require=["regressions <= 2"])
print(render(result, "terminal", verbose=True))

# kalibra.yml
sources:
  baseline:
    path: traces.jsonl
    where:
      - variant == baseline
  current:
    path: traces.jsonl
    where:
      - variant == current

fields:
  task_id: task.id

require:
  - token_delta_pct <= -10
  - regressions <= 2

kalibra compare --config kalibra.yml -v

That's it. Kalibra detects the gen_ai.* attributes automatically and handles the rest.

Single file with both populations?

If your OTel collector streams all traces into one JSONL file, use where clauses in the YAML config to split them dynamically by any root span attribute — variant, git_sha, branch, etc.

What happens under the hood

OTel GenAI exports are flat arrays of spans — one span per line, each carrying a context.trace_id. Kalibra:

Groups spans into traces by trace_id
Builds the span tree using parent_id relationships
Extracts tokens from gen_ai.usage.input_tokens and gen_ai.usage.output_tokens
Maps finish reasons to outcomes — gen_ai.response.finish_reasons: ["stop"] → success, ["length"] → failure (truncated). Also maps max_tokens, content_filter, safety, and recitation to failure
Reports cost as N/A — the OTel GenAI spec has no cost attribute. If your platform adds a vendor-specific cost field, map it via fields.cost in your config
Counts steps as leaf spans, not total spans
Computes duration as wall-clock time (max(end) - min(start))

Platform compatibility

Kalibra works with any exporter that preserves gen_ai.* span attributes in JSONL. Platforms adopting the OTel GenAI semantic conventions include:

Platform	Expected `gen_ai.*` support	Notes
`opentelemetry-instrumentation-openai-v2`	Validated	What the tutorial traces use
PydanticAI / Logfire	Expected	Built on OTel
Langfuse	Expected	May include custom cost attributes — map via `fields.cost`
Datadog LLM Observability	Expected	Uses `gen_ai.*` conventions
OpenLLMetry (Traceloop)	Expected	OTel-based auto-instrumentation

Note

"Expected" means the platform uses OTel GenAI conventions based on their documentation, but Kalibra has not been tested against their specific JSONL export format. If you encounter issues, please open an issue.

What's detected

Format	Auto-detected	Notes
OTel GenAI JSONL (`gen_ai.*` attributes)	Yes	Any exporter that preserves `gen_ai.*` attributes
Explicit format selection	`--trace-format otel-genai`	Use when auto-detection fails
Flat Kalibra JSONL (one trace per line)	Yes	Fallback format, no `gen_ai.*` needed

Differences from OpenInference

Aspect	OTel GenAI	OpenInference
Token attributes	`gen_ai.usage.input_tokens`	`llm.token_count.prompt`
Cost attribute	None (spec doesn't define it)	`llm.cost.total`
Finish reason	`gen_ai.response.finish_reasons` (array)	Nested in `output.value` JSON
Span classification	`gen_ai.operation.name`	`openinference.span.kind`
Typical tree depth	2 levels (agent → operation)	Arbitrary (CHAIN → LLM/TOOL → ...)