End-to-end workflow

This guide wires Dr.Gero into a production LLM loop.

1. Environment

```bash
source examples/env.sh
```

Set your token and IDs in examples/env.sh before sourcing it.
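A missing token usually surfaces later as a confusing API error, so it can help to check the environment right after sourcing it. The helper below is a minimal sketch and not part of the Dr.Gero examples; `require_env` and the variable name `DRGERO_API_TOKEN` are hypothetical, so substitute the names your copy of examples/env.sh actually exports.

```bash
# Hypothetical helper: fail early if any named variable is unset or empty.
require_env() {
  local missing=0
  local name value
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (DRGERO_API_TOKEN is a placeholder, not a name taken from env.sh):
# source examples/env.sh && require_env DRGERO_API_TOKEN
```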

2. Create a push leaderboard

```bash
bash examples/create-leaderboard-push.sh
```

Capture the returned leaderboard ID and export it:

```bash
export PUSH_LEADERBOARD_ID="..."
```
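If you prefer not to copy the ID by hand, the capture can be scripted. The sketch below assumes the create script prints a single JSON object with a top-level `id` field, which this guide does not confirm, and it relies on `python3` being installed. `json_field` is a hypothetical helper name.

```bash
# Assumption: the response is one JSON object with a top-level "id" field.
# json_field NAME reads JSON on stdin and prints the value of field NAME.
json_field() {
  python3 -c 'import json, sys; print(json.load(sys.stdin)[sys.argv[1]])' "$1"
}

# Usage, valid only if the script's stdout is exactly that JSON object:
# PUSH_LEADERBOARD_ID="$(bash examples/create-leaderboard-push.sh | json_field id)"
# export PUSH_LEADERBOARD_ID
```

Check the actual output of the script first; if the ID is nested or named differently, adjust the lookup accordingly.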

3. Push production rows

```bash
bash examples/push-dataset.sh
```

Send rows from your backend whenever you have an input/output pair, feedback signal, or trace event worth evaluating later.
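Production inputs and outputs often contain quotes and newlines, so building the row with a JSON encoder is safer than string interpolation. The sketch below only illustrates that escaping step. The field names `input` and `output` are assumptions, not the documented row schema; take the real field names from examples/push-dataset.sh. It relies on `python3` being installed.

```bash
# Hypothetical row shape: {"input": ..., "output": ...}.
# make_row INPUT OUTPUT prints one JSON object with both values escaped.
make_row() {
  python3 -c 'import json, sys; print(json.dumps({"input": sys.argv[1], "output": sys.argv[2]}))' "$1" "$2"
}

# Example:
# make_row 'He said "hi"' 'ok'
```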

4. Add models

Use the UI or API to add at least two candidate models. For OpenRouter:

```bash
bash examples/add-openrouter-model.sh
```

5. Run the leaderboard

```bash
bash examples/run-leaderboard.sh
```

Wait for completion, then inspect the Ranking and Run Logs tabs.
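In an automated pipeline, waiting for completion typically means polling. This guide does not describe how run status is exposed, so the sketch below covers only the generic retry loop: `poll_until` is a hypothetical helper that re-runs any command you supply until it succeeds or the attempt budget runs out.

```bash
# poll_until MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it exits 0; returns 1 after MAX_ATTEMPTS failures.
poll_until() {
  local max_attempts="$1"
  local delay="$2"
  local attempt=1
  shift 2
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example: check up to 60 times, 10 seconds apart. run_is_complete is a
# placeholder for your own status check, not a Dr.Gero command.
# poll_until 60 10 run_is_complete
```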

6. Call inference

Once a run completes and a winner is selected, run the inference example:

```bash
bash examples/inference.sh
```

Store response headers such as X-Dr.Gero-Trace-Id in your application logs.
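One way to capture the header from a shell client is to have curl write the response headers to a file with `-D` and then extract the value. The sketch below assumes you can add `-D` to the curl call that your inference request uses; `header_value` is a hypothetical helper. It matches the header name case-insensitively, because HTTP header names are not case-sensitive.

```bash
# header_value NAME FILE
# Prints the value of response header NAME from a header dump, such as the
# file written by `curl -D FILE`. Matching is case-insensitive.
header_value() {
  awk -v name="$1" '
    BEGIN { name = tolower(name) ":" }
    tolower(substr($0, 1, length(name))) == name {
      value = substr($0, length(name) + 1)
      # Trim leading blanks and the trailing CR that HTTP lines end with.
      gsub(/^[ \t]+|[ \t\r]+$/, "", value)
      print value
      exit
    }
  ' "$2"
}

# Example, after a request made with `curl -D headers.txt ...`:
# echo "trace_id=$(header_value X-Dr.Gero-Trace-Id headers.txt)" >> app.log
```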

7. Export traces

```bash
bash examples/traces.sh
```

Use traces for debugging, observability, dataset improvement, or future fine-tuning.

8. Iterate

  • Trigger leaderboard runs on a new model version, on new data, or on a cron schedule.
  • Push more production examples.
  • Fine-tune Dr.Gero models from the best datasets.
  • Add fine-tuned Dr.Gero models back into leaderboards.
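For the cron option, a small wrapper keeps the schedule entry short. The sketch below reuses the two example scripts from steps 1 and 5; `scheduled_run`, the wrapper script name, and the paths in the crontab comment are placeholders rather than anything shipped with Dr.Gero.

```bash
# scheduled_run REPO_DIR
# Loads the environment and starts one leaderboard run from REPO_DIR.
# The subshell keeps the directory change and exports out of the caller.
scheduled_run() {
  (
    cd "$1" || exit 1
    . examples/env.sh || exit 1
    bash examples/run-leaderboard.sh
  )
}

# Hypothetical crontab entry (daily at 03:00) for a wrapper script that
# defines scheduled_run and calls it with your checkout path:
# 0 3 * * * /path/to/scheduled-run.sh >> "$HOME/drgero-run.log" 2>&1
```

Running from a fixed directory matters because cron starts jobs in the user's home directory, not in the repository.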