Features

Baseline inventory of what otel-a2a-relay ships today. Update the relevant section when a feature is added, removed, or reshaped. Last full sweep: 2026-05-08.

Core relay

Exercise: coily exec test-core.

  • A2A JSON-RPC 2.0 HTTP server: message/send, message/stream, tasks/get, tasks/cancel.
  • AgentCard discovery at /.well-known/agent.json; /peers aggregation.
  • Star-topology enforcement (only the orchestrator targets peers); violations return -32010.
  • Peer registry from OTEL_A2A_RELAY_PEERS; tasks are synthesized when no peers are set.
  • Thread-safe in-memory task store with a submitted/working/completed/failed/canceled state machine.
  • W3C traceparent propagation end-to-end across hops.
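
The star-topology rule above can be sketched as a guard that either lets a hop through or returns a JSON-RPC 2.0 error object. This is a minimal illustration, not the relay's actual code: the function name `check_star_topology`, the role string `"orchestrator"`, and the error message text are assumptions. Only the `-32010` code and the rule that only the orchestrator targets peers come from the list above.

```python
from typing import Optional, Union

# Error code named in the feature list for topology violations.
TOPOLOGY_VIOLATION = -32010


def check_star_topology(
    sender: str,
    target: str,
    orchestrator: str = "orchestrator",  # hypothetical role name
    request_id: Optional[Union[int, str]] = None,
) -> Optional[dict]:
    """Return None when the hop is allowed, else a JSON-RPC 2.0 error response."""
    # Only the orchestrator may send a request to a peer.
    if sender == orchestrator:
        return None
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {
            "code": TOPOLOGY_VIOLATION,
            # Message wording is illustrative only.
            "message": f"{sender} may not target {target}; only {orchestrator} targets peers",
        },
    }


allowed = check_star_topology("orchestrator", "worker-a", request_id=1)
rejected = check_star_topology("worker-g", "worker-a", request_id=2)
```

Returning the error object instead of raising keeps the guard easy to call from an HTTP handler, which can serialize the dict directly as the response body.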

Telemetry emission

  • Backend-agnostic OTLP/HTTP exporter via OTEL_EXPORTER_OTLP_ENDPOINT (Phoenix, Tempo, Honeycomb, Datadog).
  • OpenInference-compatible span attributes; session propagation via using_session().
  • Conventions: openinference.span.kind, agent.role, agent.specialization, o2r.relay.failure_class, graph.node.{id,parent_id}.
  • Per-span payloads carry task state, state-change events, stream chunks, and input/output.
  • Reusable span-assertion library and an in-memory span store for fixtures.
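
As a rough illustration of the attribute conventions listed above, the helper below builds a flat attribute dictionary for one span. The helper name `relay_span_attributes`, its parameters, and the sample values are hypothetical. The attribute keys are the ones named in the list, and `"AGENT"` is one of the span kinds defined by OpenInference.

```python
from typing import Dict, Optional


def relay_span_attributes(
    role: str,
    specialization: str,
    node_id: str,
    parent_id: Optional[str] = None,
    failure_class: Optional[str] = None,
) -> Dict[str, str]:
    """Build span attributes using the keys from the conventions list."""
    attrs = {
        "openinference.span.kind": "AGENT",
        "agent.role": role,
        "agent.specialization": specialization,
        "graph.node.id": node_id,
    }
    # Optional keys are left out when unset, because OpenTelemetry
    # attribute values may not be None.
    if parent_id is not None:
        attrs["graph.node.parent_id"] = parent_id
    if failure_class is not None:
        attrs["o2r.relay.failure_class"] = failure_class
    return attrs


# Sample values below are made up for the example.
root = relay_span_attributes("orchestrator", "planning", node_id="n0")
child = relay_span_attributes("worker", "imagery", node_id="n1", parent_id="n0")
```

A dictionary like this could be passed as the `attributes` argument when starting a span with the OpenTelemetry Python API.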

Arize Phoenix integration

Exercise: coily exec test-arize-phoenix.

  • o2r-harness posts a worked-example trace and prints validation steps.
  • o2r-phoenix-bootstrap idempotently provisions annotation configs + datasets via REST.
  • o2r-view reduces session spans to a readable per-hop log.
  • Deterministic GIF rendering of session topologies (Pillow, viz extra); the same session.id always produces byte-identical output.
  • Real-run hero GIF from an Agent Channel; a scrub gate keeps any IP address or tailnet hostname out of labels.
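
One way a scrub gate like the one described above could work is to scan each label for anything that parses as an IP address or looks like a tailnet hostname. The sketch below is an assumption about the approach, not the project's implementation. The names `find_leaks` and `assert_label_clean` are hypothetical, and treating `*.ts.net` as the tailnet suffix is an assumption based on Tailscale's MagicDNS naming.

```python
import ipaddress
import re
from typing import List

# Assumption: tailnet hosts are recognizable by a ".ts.net" suffix.
_TAILNET_HOST = re.compile(r"\b[\w-]+(?:\.[\w-]+)*\.ts\.net\b", re.IGNORECASE)
# Candidate runs of characters that could form an IPv4 or IPv6 literal.
_IP_CANDIDATE = re.compile(r"[0-9A-Fa-f:.]+")


def find_leaks(label: str) -> List[str]:
    """Return every tailnet hostname or IP literal found in a label."""
    leaks = [m.group(0) for m in _TAILNET_HOST.finditer(label)]
    for token in _IP_CANDIDATE.findall(label):
        candidate = token.strip(".:")
        try:
            ipaddress.ip_address(candidate)
        except ValueError:
            continue  # not an IP literal
        leaks.append(candidate)
    return leaks


def assert_label_clean(label: str) -> str:
    """Raise if the label would leak a host or address; otherwise return it."""
    leaks = find_leaks(label)
    if leaks:
        raise ValueError(f"label contains network identifiers: {leaks}")
    return label
```

Delegating the address check to `ipaddress.ip_address` avoids maintaining a hand-written IPv4/IPv6 pattern. The regular expression only proposes candidates.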

Agent Channel coordination layer

Exercise: coily exec test-channels. Spec: channels-protocol.md.

  • otel-a2a-relay-channels: FastAPI router + Postgres schema + Pydantic models.
  • 8 routes under /agent-channel (create, list, onboarding, spec, state, event log, append, close).
  • 4-character dictatable channel IDs; append-only event log (spec/state/status/comms/log).
  • One OTel span per event (agent-channel.event.{kind}), so channels share the A2A trace view.
  • Backend-agnostic make_router(...) (caller-injected pool, auth, base_url).
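
The ID and span-naming bullets above can be illustrated with two small helpers. This is a sketch under stated assumptions: the alphabet (uppercase letters and digits with the look-alikes `0`, `O`, `1`, `I`, `L` removed) is a guess at what "dictatable" means, and the function names are hypothetical. The five event kinds and the `agent-channel.event.{kind}` pattern are taken from the list.

```python
import secrets

# Assumption: drop characters that are easily confused when read aloud or typed.
DICTATABLE_ALPHABET = "ABCDEFGHJKMNPQRSTUVWXYZ23456789"

# Event kinds listed for the append-only event log.
EVENT_KINDS = frozenset({"spec", "state", "status", "comms", "log"})


def new_channel_id(length: int = 4) -> str:
    """Generate a short random channel ID from the dictatable alphabet."""
    return "".join(secrets.choice(DICTATABLE_ALPHABET) for _ in range(length))


def event_span_name(kind: str) -> str:
    """Map an event kind to its span name, rejecting unknown kinds."""
    if kind not in EVENT_KINDS:
        raise ValueError(f"unknown event kind: {kind!r}")
    return f"agent-channel.event.{kind}"
```

With this alphabet, a 4-character ID has 31^4 = 923,521 possible values, so a real create route would still need a uniqueness check against existing channels.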

Tempo + Grafana integration

Exercise: coily exec test-tempo-grafana. Detail: ../tempo_grafana/README.md.

  • Dockerized Tempo 2.6.1 + Prometheus 2.55.1 + Grafana 11.3.1 with provisioned datasources.
  • Provisioned dashboards: o2r-overview and LUCA-flow.
  • o2r-tempo-harness and bootstrap_tempo() helper.
  • Span + service-graph metrics via Tempo's metrics_generator, keyed by agent.role / session.id / a2a.method.

LUCA-flow demo

Detail: ../examples/luca-flow/README.md.

  • Multi-agent choreography (orchestrator + planner + validator + deployer + eight workers) building the AURORA microsite from NASA imagery.
  • Real validation (HTML5, single <h1>, alt text, link resolution, word/image counts).
  • Intentional failures (worker-d crash, worker-g topology bypass -> -32010); frozen timestamps via LUCA_FREEZE_TIME.
  • Span shape: orchestrator.flow / orchestrator.step.<n> / orchestrator.acceptance.
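
Two of the validation checks named above (a single <h1> and alt text on every image) can be sketched with the standard library's HTML parser. This is not the demo's validator: the class name `PageAudit` and the function `audit_page` are hypothetical, and the HTML5 conformance, link resolution, and count checks are left out.

```python
from html.parser import HTMLParser
from typing import List


class PageAudit(HTMLParser):
    """Count <h1> elements and <img> tags that lack non-empty alt text."""

    def __init__(self) -> None:
        super().__init__()
        self.h1_count = 0
        self.image_count = 0
        self.images_missing_alt = 0

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <img ... />.
        if tag == "h1":
            self.h1_count += 1
        elif tag == "img":
            self.image_count += 1
            alt = dict(attrs).get("alt")
            if not (alt and alt.strip()):
                self.images_missing_alt += 1


def audit_page(html: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means pass."""
    parser = PageAudit()
    parser.feed(html)
    parser.close()
    problems = []
    if parser.h1_count != 1:
        problems.append(f"expected exactly one <h1>, found {parser.h1_count}")
    if parser.images_missing_alt:
        problems.append(f"{parser.images_missing_alt} image(s) missing alt text")
    return problems
```

Returning a list of problems rather than a boolean lets a caller attach each finding to the span that recorded the validation step.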

Deploy

Deploy artifacts live in-tree (core/chart/, core/Dockerfile); only the GitOps deployment state lives elsewhere. The deployment is relay-only and backend-agnostic: channels and the harnesses are not deployed.

See also

The cross-reference convention comes from coilysiren/agentic-os#59; this repo is the worked example.