End-to-end workflow
This guide wires Dr.Gero into a production LLM loop.
1. Environment
bash
source examples/env.sh
Edit examples/env.sh with your token and IDs before sourcing it, so the exported values are the ones your shell picks up.
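Later steps depend on the values exported by examples/env.sh, so it can help to fail fast when one is missing. The sketch below is a small Bash helper for that check. It does not assume any particular variable names; DRGERO_API_TOKEN in the usage note is a placeholder rather than a name taken from examples/env.sh.

```bash
#!/usr/bin/env bash
# Sketch: verify that required environment variables are set and non-empty.
# The caller passes the variable names; none are assumed here.
require_env() {
  local name missing=0
  for name in "$@"; do
    # ${!name:-} is Bash indirect expansion: the value of the variable
    # whose name is stored in $name, or empty if it is unset.
    if [ -z "${!name:-}" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A possible use after sourcing the file is `require_env DRGERO_API_TOKEN || exit 1`, with the placeholder replaced by the names your env.sh actually exports.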
2. Create a push leaderboard
bash
bash examples/create-leaderboard-push.sh
Capture the returned leaderboard ID and set:
bash
export PUSH_LEADERBOARD_ID="..."
3. Push production rows
bash
bash examples/push-dataset.sh
Send rows from your backend whenever you have an input/output pair, feedback signal, or trace event worth evaluating later.
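If a backend assembles rows in shell, the fiddly part is usually escaping free-form model text into valid JSON. The sketch below builds one row from an input/output pair using only Bash. The field names `input` and `output` are assumptions made for illustration; the real row schema is whatever examples/push-dataset.sh sends.

```bash
#!/usr/bin/env bash
# Sketch: build one JSON row from an input/output pair using only Bash.
# ASSUMPTION: the field names "input" and "output" are illustrative only.

# Escapes backslash, double quote, newline, carriage return, and tab.
# Other control characters are NOT handled by this sketch.
json_escape() {
  local s=$1 bs='\' dq='"'
  s=${s//"$bs"/"$bs$bs"}   # backslash first, so later escapes are not doubled
  s=${s//"$dq"/"$bs$dq"}
  s=${s//$'\n'/"${bs}n"}
  s=${s//$'\r'/"${bs}r"}
  s=${s//$'\t'/"${bs}t"}
  printf '%s' "$s"
}

# Print a single-line JSON object for one input/output pair.
build_row() {
  printf '{"input":"%s","output":"%s"}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}
```

For anything beyond simple text, a real JSON encoder (a JSON library in your backend language, or a tool such as jq) is the safer choice, since it covers every control character.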
4. Add models
Use the UI or API to add at least two candidate models. For OpenRouter:
bash
bash examples/add-openrouter-model.sh
5. Run the leaderboard
bash
bash examples/run-leaderboard.sh
Wait for completion, then inspect the Ranking and Run Logs tabs.
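When the wait is scripted rather than watched in the UI, a bounded polling loop keeps a stuck run from blocking a pipeline forever. The sketch below is a generic helper that retries any command until it succeeds or a timeout passes. How run status is actually checked is not described in this guide, so the `leaderboard_is_done` command in the usage note is hypothetical and must be supplied by you.

```bash
#!/usr/bin/env bash
# Sketch: retry a command until it succeeds or a timeout (in seconds) elapses.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_until() {
  local timeout=$1 interval=$2
  shift 2
  # SECONDS is a Bash built-in counter of seconds since the shell started.
  local deadline=$(( SECONDS + timeout ))
  while ! "$@"; do
    if (( SECONDS >= deadline )); then
      return 1   # timed out without the command succeeding
    fi
    sleep "$interval"
  done
  return 0
}
```

For example, `wait_until 600 10 leaderboard_is_done` would poll every 10 seconds for up to 10 minutes, where `leaderboard_is_done` is a function you write that exits 0 once the run has finished.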
6. Call inference
Once a run completes and a winner is selected:
bash
bash examples/inference.sh
Store response headers such as X-Dr.Gero-Trace-Id in your application logs.
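To log the trace ID from a shell client, one option is to save the response headers to a file and read the header back out. The sketch below assumes the headers were already written to a file, for example with curl's `-D`/`--dump-header` option. Whether examples/inference.sh does this is not stated here, so treat the file path as your own choice.

```bash
#!/usr/bin/env bash
# Sketch: print the value of one HTTP header from a saved header dump.
# Header names are matched case-insensitively, as HTTP requires.
# Usage: header_value HEADER_NAME HEADERS_FILE
header_value() {
  local name=$1 file=$2
  awk -v want="$name" '
    BEGIN { want = tolower(want) }
    {
      sub(/\r$/, "")               # drop the CR from CRLF line endings
      idx = index($0, ":")
      if (idx == 0) next           # status line or blank line
      key = tolower(substr($0, 1, idx - 1))
      if (key == want) {
        val = substr($0, idx + 1)
        sub(/^[ \t]+/, "", val)    # trim leading whitespace in the value
        print val
        exit
      }
    }
  ' "$file"
}
```

A call such as `trace_id=$(header_value 'X-Dr.Gero-Trace-Id' headers.txt)` then gives a value you can write next to the request in your application logs.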
7. Export traces
bash
bash examples/traces.sh
Use traces for debugging, observability, dataset improvement, or future fine-tuning.
8. Iterate
- Trigger leaderboard runs on a new model version, on new data, or on a cron schedule.
- Push more production examples.
- Fine-tune Dr.Gero models from the best datasets.
- Add fine-tuned Dr.Gero models back into leaderboards.
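For the cron case, one thing worth guarding against is a new run starting while the previous one is still in progress. The sketch below wraps a command in a simple directory-based lock. The lock path and the crontab line in the comment are placeholders for illustration, not part of the Dr.Gero examples.

```bash
#!/usr/bin/env bash
# Sketch: run a command only if no other run holds the lock.
# mkdir is atomic, so only one caller can create the lock directory.
# Usage: with_lock LOCK_DIR COMMAND [ARGS...]
with_lock() {
  local lock_dir=$1
  shift
  if ! mkdir "$lock_dir" 2>/dev/null; then
    echo "another run holds $lock_dir; skipping" >&2
    return 75   # EX_TEMPFAIL from sysexits.h: try again later
  fi
  local status=0
  "$@" || status=$?
  rmdir "$lock_dir"
  return "$status"
}

# Hypothetical crontab entry calling a wrapper script that uses with_lock
# (both paths are placeholders):
#   0 3 * * * /path/to/repo/run-scheduled.sh >> /path/to/cron.log 2>&1
```

Inside such a wrapper, the call could look like `with_lock /tmp/leaderboard-run.lock bash examples/run-leaderboard.sh`. If the process is killed mid-run the lock directory is left behind, so a production version would also need stale-lock cleanup.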