Appearance
First leaderboard
This guide creates a leaderboard in the UI, ranks two or more models, and calls the inference API.
1. Save provider integrations
Open Settings → Integrations.
- Add an OpenRouter token. This is required before creating leaderboards or models.
- Add a Hugging Face token if your dataset is private or gated.
2. Create the leaderboard
Open Leaderboards → Create Leaderboard.
- Enter a name.
- Enter a system prompt. A safe default is:
text
Answer the following task as clearly and concisely as possible.
Input:
{input}- Choose dataset mode:
- Get dataset for a Hugging Face
.jsonlURL. - Push dataset for a webhook-based dataset.
- Get dataset for a Hugging Face
- Choose an evaluation type: exact, judge, or human.
- Create the leaderboard.
3. Add models
Open the leaderboard and click Add Model.
- Use Auto-select models to let Dr.Gero choose OpenRouter models under cost/latency constraints.
- Use Manual model to add OpenRouter, Custom, Hugging Face, or Dr.Gero models.
You need at least two candidate models to run a ranking.
4. Run the leaderboard
Click Run. When it completes, the Ranking tab shows model scores, costs, latencies, and the selected winner.
5. Create an API token
Open Settings → Tokens and create a token with at least:
leaderboards:inferencefor inference.leaderboards:readfor traces.leaderboards:writeif you will push dataset rows or manage leaderboard resources.
Copy the token once; the UI will not show the secret again.
6. Call inference
bash
export API_BASE="https://dr-gero-frontend-99142474693.europe-west1.run.app"
export DRGERO_TOKEN="drgero_REPLACE_WITH_TOKEN_FROM_SETTINGS"
export LEADERBOARD_ID="b60fe691-06a3-4261-bec3-6080380dc72d"
curl -sS -X POST "$API_BASE/v1/leaderboard/$LEADERBOARD_ID/inference" \
-H "Authorization: Bearer $DRGERO_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"input": "What did the user ask?"
}'7. Inspect traces
bash
curl -sS "$API_BASE/v1/leaderboard/$LEADERBOARD_ID/traces?limit=100&format=json&source=inference" \
-H "Authorization: Bearer $DRGERO_TOKEN" | jq