First leaderboard

This guide walks through creating a leaderboard in the UI, ranking two or more models, and calling the inference API.

1. Save provider integrations

Open Settings → Integrations.

  1. Add an OpenRouter token. This is required before creating leaderboards or models.
  2. Add a Hugging Face token if your dataset is private or gated.

2. Create the leaderboard

Open Leaderboards → Create Leaderboard.

  1. Enter a name.
  2. Enter a system prompt. A safe default is:
```text
Answer the following task as clearly and concisely as possible.

Input:
{input}
```
  3. Choose dataset mode:
    • Get dataset for a Hugging Face .jsonl URL.
    • Push dataset for a webhook-based dataset.
  4. Choose an evaluation type: exact, judge, or human.
  5. Create the leaderboard.
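
To make the role of the `{input}` placeholder in the system prompt concrete, here is a minimal local sketch of how such a template could be filled in. It assumes the placeholder is replaced verbatim with the request's input text; the guide does not spell out the exact substitution rules, and the variable names are illustrative only.

```shell
# Illustration only: the substitution behaviour shown here is an assumption.
template='Answer the following task as clearly and concisely as possible.

Input:
{input}'
placeholder='{input}'
user_input='What is 2 + 2?'

# Refuse to continue if the template has no placeholder to fill.
case $template in
  *"$placeholder"*) ;;
  *) echo "template has no {input} placeholder" >&2; exit 1 ;;
esac

# Split the template around the first placeholder, then join with the input.
prefix=${template%%"$placeholder"*}
suffix=${template#*"$placeholder"}
rendered="${prefix}${user_input}${suffix}"

printf '%s\n' "$rendered"
```

Rendering the prompt locally like this is a quick way to check that the placeholder sits where you expect before saving the leaderboard.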

3. Add models

Open the leaderboard and click Add Model.

  • Use Auto-select models to let Dr.Gero choose OpenRouter models under cost/latency constraints.
  • Use Manual model to add OpenRouter, Custom, Hugging Face, or Dr.Gero models.

You need at least two candidate models to run a ranking.

4. Run the leaderboard

Click Run. When it completes, the Ranking tab shows model scores, costs, latencies, and the selected winner.

5. Create an API token

Open Settings → Tokens and create a token with at least:

  • leaderboards:inference for inference.
  • leaderboards:read for traces.
  • leaderboards:write if you will push dataset rows or manage leaderboard resources.

Copy the token as soon as it is created; the UI will not show the secret again.
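
The scope list above can be captured as a small lookup helper, which is handy in scripts that should fail early when a token is used for the wrong purpose. The operation labels (`inference`, `traces`, `push-rows`, `manage`) are hypothetical names chosen for this sketch; only the scope strings come from this guide.

```shell
# Map an operation label (hypothetical names) to the scope listed in this guide.
required_scope() {
  case $1 in
    inference)        echo "leaderboards:inference" ;;
    traces)           echo "leaderboards:read" ;;
    push-rows|manage) echo "leaderboards:write" ;;
    *) echo "unknown operation: $1" >&2; return 1 ;;
  esac
}

required_scope traces
```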

6. Call inference

```bash
export API_BASE="https://dr-gero-frontend-99142474693.europe-west1.run.app"
export DRGERO_TOKEN="drgero_REPLACE_WITH_TOKEN_FROM_SETTINGS"
export LEADERBOARD_ID="b60fe691-06a3-4261-bec3-6080380dc72d"

curl -sS -X POST "$API_BASE/v1/leaderboard/$LEADERBOARD_ID/inference" \
  -H "Authorization: Bearer $DRGERO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "input": "What did the user ask?"
  }'
```
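
In scripts it is usually worth checking the HTTP status instead of trusting the response body. The wrapper below is a sketch of one way to do that around the same request. The function name `drgero_infer` is hypothetical, the payload is read from a file to avoid shell quoting problems, and the assumption that any non-2xx status means failure is mine, not something this guide states.

```shell
# Sketch: POST a JSON payload file to the inference endpoint, fail on non-2xx.
# Assumes API_BASE, DRGERO_TOKEN, and LEADERBOARD_ID are exported as above.
# Note: POSIX sh has no local variables, so these names are global.
drgero_infer() {
  payload_file=$1
  response_file=$(mktemp) || return 1

  # -o sends the body to a temp file; -w prints only the status code to stdout.
  status=$(curl -sS -o "$response_file" -w '%{http_code}' \
    -X POST "$API_BASE/v1/leaderboard/$LEADERBOARD_ID/inference" \
    -H "Authorization: Bearer $DRGERO_TOKEN" \
    -H "Content-Type: application/json" \
    --data-binary "@$payload_file") || { rm -f "$response_file"; return 1; }

  # Always print the body so error details remain visible to the caller.
  cat "$response_file"
  rm -f "$response_file"

  case $status in
    2??) return 0 ;;
    *) echo "inference request failed with HTTP $status" >&2; return 1 ;;
  esac
}

# Usage (not executed here):
#   printf '%s' '{"input": "What did the user ask?"}' > payload.json
#   drgero_infer payload.json
```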

7. Inspect traces

```bash
curl -sS "$API_BASE/v1/leaderboard/$LEADERBOARD_ID/traces?limit=100&format=json&source=inference" \
  -H "Authorization: Bearer $DRGERO_TOKEN" | jq
```
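
If you query traces often, a small helper that builds the URL keeps the query parameters in one place. This sketch reuses only the parameters shown in the request above (`limit`, `format`, `source`). The function name `drgero_traces_url` is hypothetical, and whether the API accepts values other than those shown is an assumption to verify against the API reference.

```shell
# Build the traces URL from the exported API_BASE and LEADERBOARD_ID.
# Arguments: [limit, default 100] [source, default "inference"].
drgero_traces_url() {
  limit=${1:-100}
  source_filter=${2:-inference}
  printf '%s/v1/leaderboard/%s/traces?limit=%s&format=json&source=%s\n' \
    "$API_BASE" "$LEADERBOARD_ID" "$limit" "$source_filter"
}

# Usage (not executed here):
#   curl -sS "$(drgero_traces_url 10)" \
#     -H "Authorization: Bearer $DRGERO_TOKEN" | jq
```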