GitHub - InftyAI/alphatrion: ⚒️ The open-source framework for LLM experiments and agent orchestration.

alphatrion

The open-source framework for LLM experiments and agent orchestration.

AlphaTrion is an open-source experiment tracking and agent orchestration framework for LLM application developers and AI engineers. It lets you orchestrate multi-agent workflows, track LLM experiments, manage artifacts, and gain deep observability into your GenAI applications, all through a Python API and a web dashboard. It is named after the oldest and wisest Transformer.

Key Features

🔬 Experiment Management - Hierarchical experiments and runs with smart checkpointing (save on best metrics, early stopping, target optimization)
📦 Artifact Registry - Version datasets and model checkpoints using OCI registries or S3, with native push/pull APIs
📊 Metrics & Observability - Built-in Prometheus metrics and distributed tracing (OpenTelemetry + ClickHouse) for LLM calls
🪝 Extensible Hooks - Pre/post-save hooks and post-run hooks for custom workflows
🎯 Modern Dashboard - Explore experiments, visualize metrics, and analyze traces through an intuitive web UI
🔌 Production-Ready - Async-first design, PostgreSQL metadata storage, and support for distributed workloads
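
The "smart checkpointing" behaviour listed above (save on best metrics, early stopping) can be illustrated with a small standalone sketch. This is not AlphaTrion's implementation or API: `BestMetricTracker`, its fields, and the `patience` rule are hypothetical names chosen only to show the general idea in plain Python.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BestMetricTracker:
    """Hypothetical helper for illustration; not part of AlphaTrion."""

    mode: str = "max"      # "max" for accuracy-like metrics, "min" for loss-like ones
    patience: int = 3      # evaluations without improvement before stopping
    best: Optional[float] = None
    stale: int = 0
    saved_steps: List[int] = field(default_factory=list)

    def update(self, step: int, value: float) -> bool:
        """Record a metric value; return True if the loop should stop early."""
        improved = (
            self.best is None
            or (self.mode == "max" and value > self.best)
            or (self.mode == "min" and value < self.best)
        )
        if improved:
            self.best = value
            self.stale = 0
            # A real system would write a checkpoint here.
            self.saved_steps.append(step)
        else:
            self.stale += 1
        return self.stale >= self.patience


tracker = BestMetricTracker(mode="max", patience=2)
stopped_at = None
for step, accuracy in enumerate([0.70, 0.82, 0.81, 0.80, 0.90]):
    if tracker.update(step, accuracy):
        stopped_at = step
        break
```

In this run, checkpoints would be saved at steps 0 and 1, and the loop stops at step 3 after two evaluations without improvement, so the later 0.90 is never seen. That trade-off is inherent to patience-based early stopping.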

Core Concepts

Organization - Top-level entity for grouping teams and users
Team - Collaborative workspace for organizing experiments and runs
User - Individual account with secure authentication and team memberships
Experiment - Logical grouping of runs with shared purpose, organized by labels
Run - Individual execution instance with configuration and metrics
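
One way to picture how these concepts nest is the sketch below. It is an assumption-laden illustration using plain dataclasses: the class fields and the `best_run` helper are hypothetical and do not reflect AlphaTrion's actual schema or Python API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Run:
    config: Dict[str, Any]                       # configuration used for this execution
    metrics: Dict[str, float] = field(default_factory=dict)


@dataclass
class Experiment:
    name: str
    labels: Dict[str, str] = field(default_factory=dict)   # used to organize experiments
    runs: List[Run] = field(default_factory=list)


@dataclass
class Team:
    name: str
    experiments: List[Experiment] = field(default_factory=list)


@dataclass
class User:
    email: str
    teams: List[str] = field(default_factory=list)         # team memberships by name


@dataclass
class Organization:
    name: str
    teams: List[Team] = field(default_factory=list)


def best_run(experiment: Experiment, metric: str) -> Run:
    """Return the run with the highest value for `metric` (hypothetical helper)."""
    return max(experiment.runs, key=lambda r: r.metrics[metric])


run_a = Run(config={"temperature": 0.2}, metrics={"accuracy": 0.91})
run_b = Run(config={"temperature": 0.8}, metrics={"accuracy": 0.95})
exp = Experiment(name="prompt-tuning", labels={"task": "qa"}, runs=[run_a, run_b])
team = Team(name="research", experiments=[exp])
org = Organization(name="acme", teams=[team])
user = User(email="dev@example.com", teams=["research"])
winner = best_run(exp, "accuracy")
```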

Quick Start

1. Installation

# From PyPI
pip install alphatrion

# Or from source
git clone https://github.com/inftyai/alphatrion.git && cd alphatrion
source start.sh

2. Setup

# Create your local environment file from the example
cp .env.example .env

# Start PostgreSQL, ClickHouse, and Registry
make up

# Wait for services to be ready, then run migrations
make migrate-all

# Initialize your organization, team, and user account
alphatrion init

Optional Tools:

pgAdmin: http://localhost:8081 (alphatrion@inftyai.com / alphatr1on)
Registry UI: http://localhost:80
Grafana: http://localhost:3000 (admin / admin) - LLM metrics dashboard
Prometheus: http://localhost:9090 - Metrics explorer

3. Run Your First Experiment

import asyncio

import alphatrion as alpha
from alphatrion.experiment import CraftExperiment

# Initialize with your user ID
alpha.init(user_id="<your_user_id>")

async def my_task():
    # Your code here; log whatever metrics the task produces
    await alpha.log_metrics({"accuracy": 0.95, "loss": 0.12})

async def main():
    # `async with` must run inside a coroutine when executed as a script
    async with CraftExperiment.start(name="my_experiment") as exp:
        run = exp.run(my_task)
        await exp.wait()

asyncio.run(main())

4. Launch Dashboard

# Start backend server (terminal 1)
alphatrion server

# Launch dashboard (terminal 2)
alphatrion dashboard

Access the dashboard at http://127.0.0.1:5173 and log in with your email and password to explore experiments, visualize metrics, and analyze traces.

5. Distributed Tracing

AlphaTrion provides decorators for instrumenting your code with OpenTelemetry distributed tracing:

@tracing.workflow() - Top-level orchestration
@tracing.agent() - Autonomous AI agents with decision-making
@tracing.task() - Reusable units of work
@tracing.tool() - Atomic leaf operations

All decorators automatically capture execution duration, status, span hierarchy, and context (run_id, experiment_id, team_id, org_id). LLM calls, database queries, and HTTP requests are auto-instrumented.
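
To make the captured fields concrete, here is a minimal standalone sketch of a decorator that records duration, status, and parent/child hierarchy. It uses only the standard library and is not how AlphaTrion or OpenTelemetry implement tracing; `span`, `SPANS`, `lookup`, and `answer` are hypothetical names, and the sketch handles synchronous functions only.

```python
import contextvars
import functools
import time
from typing import Callable, List

# Collected spans, in completion order. A real tracer would export these instead.
SPANS: List[dict] = []

# Name of the span currently executing, so nested calls can record their parent.
_current_span = contextvars.ContextVar("current_span", default=None)


def span(kind: str) -> Callable:
    """Build a decorator that records one span per call (illustrative only)."""

    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            parent = _current_span.get()
            token = _current_span.set(fn.__name__)
            start = time.perf_counter()
            status = "OK"
            try:
                return fn(*args, **kwargs)
            except Exception:
                status = "ERROR"
                raise
            finally:
                # Restore the parent span and record this one, even on failure.
                _current_span.reset(token)
                SPANS.append({
                    "name": fn.__name__,
                    "kind": kind,
                    "parent": parent,
                    "status": status,
                    "duration_s": time.perf_counter() - start,
                })

        return wrapper

    return decorator


@span("tool")
def lookup(query: str) -> str:
    return f"result for {query}"


@span("workflow")
def answer(question: str) -> str:
    return lookup(question).upper()


output = answer("weather")
```

After the call, `SPANS` holds the inner `lookup` span first (with `answer` as its parent) followed by the outer `answer` span, which mirrors the workflow-to-tool nesting the decorators above describe.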

Captured traces can be viewed in the dashboard.

6. Using Post-Run Hooks (Optional)

Automatically sync metadata and status after run completion.

import asyncio

from alphatrion.experiment import CraftExperiment
from alphatrion.run import PostRunHookFn

async def train_model():
    # Your training code; the returned dict is what the hooks consume
    return {
        "metadata": {"accuracy": 0.95, "loss": 0.05},
        "status": "COMPLETED",
    }

async def main():
    # `async with` must run inside a coroutine when executed as a script
    async with CraftExperiment.start("training") as exp:
        run = exp.run(
            train_model,
            post_run_hooks=[PostRunHookFn.sync_metadata, PostRunHookFn.sync_status],
        )
        await exp.wait()

asyncio.run(main())
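
The general post-run hook pattern can be sketched without AlphaTrion installed. The code below is a simplified assumption about what "sync metadata and status" could mean: the `record` dict, `run_with_hooks`, and these standalone `sync_metadata` / `sync_status` functions are hypothetical stand-ins, not the library's `PostRunHookFn` implementation.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

RunResult = Dict[str, Any]
Hook = Callable[[Dict[str, Any], RunResult], None]


def sync_metadata(record: Dict[str, Any], result: RunResult) -> None:
    """Copy the returned "metadata" dict onto the stored run record."""
    record["metadata"] = result.get("metadata", {})


def sync_status(record: Dict[str, Any], result: RunResult) -> None:
    """Copy the returned "status" string onto the stored run record."""
    record["status"] = result.get("status", "UNKNOWN")


async def run_with_hooks(
    task: Callable[[], Awaitable[RunResult]], hooks: List[Hook]
) -> Dict[str, Any]:
    """Await the task, then apply each hook to its result in order."""
    record: Dict[str, Any] = {"status": "RUNNING"}
    result = await task()
    for hook in hooks:  # hooks run only after the task has finished
        hook(record, result)
    return record


async def train_model() -> RunResult:
    return {"metadata": {"accuracy": 0.95, "loss": 0.05}, "status": "COMPLETED"}


record = asyncio.run(run_with_hooks(train_model, [sync_metadata, sync_status]))
```

Keeping each hook a small function of `(record, result)` means hooks can be combined or reordered freely, which is the main appeal of passing them as a list.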

7. Cleanup

make down

References

Architecture: Diagrams
Dashboard: Setup Guide | CLI Reference | Architecture
Development: Contributing Guide
Claude Code Integration: Hooks Setup

Contributing

We welcome contributions! Check out our development guide to get started.
