Skip to content

Leaderboards API

The resource API lets backend services manage leaderboards, models, runs, and schedules.

bash
export API_BASE="https://dr-gero-frontend-99142474693.europe-west1.run.app"
export DRGERO_TOKEN="drgero_REPLACE_WITH_TOKEN_FROM_SETTINGS"

List leaderboards

bash
curl -sS "$API_BASE/api/leaderboards?limit=50&offset=0" \
  -H "Authorization: Bearer $DRGERO_TOKEN" | jq

Requires leaderboards:read.

Create a GET-dataset leaderboard

bash
curl -sS -X POST "$API_BASE/api/leaderboards" \
  -H "Authorization: Bearer $DRGERO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Support QA",
    "system_prompt": "Answer the support question clearly.\n\nQuestion:\n{input}",
    "dataset_type": "GET",
    "dataset_url": "https://huggingface.co/datasets/acme/support-evals/resolve/main/eval.jsonl",
    "eval_type": "judge",
    "judge_provider": "OpenRouter",
    "judge_model": "openai/gpt-5.2"
  }' | jq

Requires leaderboards:write.

Required fields:

FieldRequiredNotes
nameYesLeaderboard/challenge name.
system_prompt or model_promptYesTask prompt. CamelCase aliases are accepted in several places.
dataset_typeNoGET by default. Use PUSH for webhook datasets.
dataset_urlYes for GETMust be a Hugging Face JSONL URL.
eval_typeNoexact, judge, or human.
judge_provider, judge_modelRequired for manual judge configUsed when judge auto-decide is disabled.

Create a PUSH-dataset leaderboard

bash
curl -sS -X POST "$API_BASE/api/leaderboards" \
  -H "Authorization: Bearer $DRGERO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Production Support Feedback",
    "system_prompt": "Answer the user request.\n\nInput:\n{input}",
    "dataset_type": "PUSH",
    "eval_type": "judge",
    "judge_provider": "OpenRouter",
    "judge_model": "openai/gpt-5.2",
    "auto_limit_size": true,
    "max_samples_to_gather": 1000,
    "daily_event_limit": 5000,
    "monthly_event_limit": 50000,
    "consolidate_every_events": 500,
    "consolidate_every_hours": 24,
    "dedupe": true
  }' | jq

Get leaderboard detail

bash
curl -sS "$API_BASE/api/leaderboards/$LEADERBOARD_ID" \
  -H "Authorization: Bearer $DRGERO_TOKEN" | jq

The response includes leaderboard metadata, challenge, candidate models, and recent runs.

Update a leaderboard

bash
curl -sS -X PATCH "$API_BASE/api/leaderboards/$LEADERBOARD_ID" \
  -H "Authorization: Bearer $DRGERO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Production support leaderboard",
    "chosen_model_strategy": "ranking_winner"
  }' | jq

Updatable fields include:

  • name, description, status
  • dataset_type, dataset_url, dataset_metadata, dataset_auto_labels
  • eval_type, eval_metadata
  • category, constraints, model_prompt, system_prompt
  • chosen_leaderboard_model_id, chosen_model_strategy
  • schedule

Schedule updates may require a paid entitlement.

Delete a leaderboard

bash
curl -sS -X DELETE "$API_BASE/api/leaderboards/$LEADERBOARD_ID" \
  -H "Authorization: Bearer $DRGERO_TOKEN" | jq

Free-plan workspaces may be prevented from deleting leaderboards.

Add a candidate model

bash
curl -sS -X POST "$API_BASE/api/leaderboards/$LEADERBOARD_ID/models" \
  -H "Authorization: Bearer $DRGERO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "GPT OSS 120B via OpenRouter",
    "platform": "OpenRouter",
    "model_url": "openai/gpt-oss-120b"
  }' | jq

Manual candidate fields:

FieldRequiredNotes
name or model_nameYesDisplay name.
platformNoOpenRouter, Custom, HuggingFace, or Dr.Gero. Defaults to Dr.Gero.
model_url or urlRequired except Dr.GeroOpenRouter model ID or endpoint URL.
model_idRequired for Dr.GeroID of a Dr.Gero model.
tokenNoModel-specific token if not using workspace integration. Returned redacted.
auth_typeNobearer, x-api-key, x-dr.gero-api-key, authorization, or custom-header.
auth_header_nameFor custom headerHeader name for custom auth.

Adding models may require a paid entitlement.

Auto-select candidate models

Auto-select is a UI-session endpoint, because it uses the signed-in workspace context and OpenRouter integration. It is useful for browser/admin automation rather than server-to-server API-token automation.

bash
curl -sS -X POST "$API_BASE/api/leaderboards/$LEADERBOARD_ID/models/auto-select" \
  -H "Authorization: Bearer $SUPABASE_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "number_of_models": 5,
    "limit_cost": true,
    "input_price_per_m_tokens": 0.5,
    "output_price_per_m_tokens": 1,
    "limit_latency": true,
    "latency_p95_seconds": 1,
    "latency_p99_seconds": 3,
    "only_open_source": false
  }' | jq

Body fields accept camelCase aliases such as numberOfModels, limitCost, inputPricePerMTokens, latencyP95Seconds, and onlyOpenSource.

List and remove candidate models

bash
curl -sS "$API_BASE/api/leaderboards/$LEADERBOARD_ID/models" \
  -H "Authorization: Bearer $DRGERO_TOKEN" | jq

curl -sS -X DELETE "$API_BASE/api/leaderboards/$LEADERBOARD_ID/models/$LEADERBOARD_MODEL_ID" \
  -H "Authorization: Bearer $DRGERO_TOKEN" | jq

Run a leaderboard

bash
curl -sS -X POST "$API_BASE/api/leaderboards/$LEADERBOARD_ID/run" \
  -H "Authorization: Bearer $DRGERO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model_ids": ["LEADERBOARD_MODEL_ID_1", "LEADERBOARD_MODEL_ID_2"],
    "run_source": "manual"
  }' | jq

Requires leaderboards:run.

Use /api/leaderboards/{leaderboard_id}/improve-dataset to force a dataset-improvement run.

Schedule JSON

Schedules are stored on the leaderboard with a JSON structure like:

json
{
  "version": 2,
  "triggers": {
    "model_version": { "enabled": true, "cadence": "WEEKLY" },
    "new_data": { "enabled": true, "check_cadence": "DAILY", "every_new_events": 500 },
    "cron": { "enabled": true, "preset": "CUSTOM", "expression": "0 6 * * 1" }
  },
  "dataset": {
    "mode": "LIMIT",
    "limit": { "auto": false, "rows": 5000, "algorithm": "LAST_N" }
  }
}

Save it with:

bash
curl -sS -X PATCH "$API_BASE/api/leaderboards/$LEADERBOARD_ID" \
  -H "Authorization: Bearer $DRGERO_TOKEN" \
  -H "Content-Type: application/json" \
  -d @schedule.json | jq