Leaderboards API
The resource API lets backend services manage leaderboards, models, runs, and schedules.
```bash
export API_BASE="https://dr-gero-frontend-99142474693.europe-west1.run.app"
export DRGERO_TOKEN="drgero_REPLACE_WITH_TOKEN_FROM_SETTINGS"
```

List leaderboards
```bash
curl -sS "$API_BASE/api/leaderboards?limit=50&offset=0" \
  -H "Authorization: Bearer $DRGERO_TOKEN" | jq
```

Requires leaderboards:read.
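The limit and offset query parameters suggest offset-based paging. Below is a minimal sketch of the offset arithmetic, assuming pages are numbered from zero and that offset counts rows rather than pages (the original does not say). The helper name leaderboards_page_url is hypothetical.

```bash
# Hypothetical helper: build the list URL for a zero-based page number.
# Assumes offset counts rows, so page N starts at row N * limit.
leaderboards_page_url() (
  base="$1"; limit="$2"; page="$3"
  offset=$(( limit * page ))
  printf '%s/api/leaderboards?limit=%s&offset=%s\n' "$base" "$limit" "$offset"
)

# Example: request the third page of 50 rows.
# curl -sS "$(leaderboards_page_url "$API_BASE" 50 2)" \
#   -H "Authorization: Bearer $DRGERO_TOKEN" | jq
```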
Create a GET-dataset leaderboard
```bash
curl -sS -X POST "$API_BASE/api/leaderboards" \
  -H "Authorization: Bearer $DRGERO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Support QA",
    "system_prompt": "Answer the support question clearly.\n\nQuestion:\n{input}",
    "dataset_type": "GET",
    "dataset_url": "https://huggingface.co/datasets/acme/support-evals/resolve/main/eval.jsonl",
    "eval_type": "judge",
    "judge_provider": "OpenRouter",
    "judge_model": "openai/gpt-5.2"
  }' | jq
```

Requires leaderboards:write.
Request fields:
| Field | Required | Notes |
|---|---|---|
| name | Yes | Leaderboard/challenge name. |
| system_prompt or model_prompt | Yes | Task prompt. CamelCase aliases are accepted in several places. |
| dataset_type | No | GET by default. Use PUSH for webhook datasets. |
| dataset_url | Yes for GET | Must be a Hugging Face JSONL URL. |
| eval_type | No | exact, judge, or human. |
| judge_provider, judge_model | Required for manual judge config | Used when judge auto-decide is disabled. |
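Because dataset_url must be a Hugging Face JSONL URL, a script can reject obviously wrong values before calling the API. The sketch below is only a guess at that rule (an https://huggingface.co/ prefix and a .jsonl suffix); the server's actual validation may differ, and is_hf_jsonl_url is a hypothetical helper name.

```bash
# Hypothetical pre-flight check for dataset_url on GET leaderboards.
# The pattern approximates "Hugging Face JSONL URL"; it is an assumption,
# not the server's real validation logic.
is_hf_jsonl_url() {
  case "$1" in
    https://huggingface.co/*.jsonl) return 0 ;;
    *) return 1 ;;
  esac
}

# Example:
# url="https://huggingface.co/datasets/acme/support-evals/resolve/main/eval.jsonl"
# is_hf_jsonl_url "$url" || echo "dataset_url does not look like a Hugging Face JSONL URL" >&2
```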
Create a PUSH-dataset leaderboard
```bash
curl -sS -X POST "$API_BASE/api/leaderboards" \
  -H "Authorization: Bearer $DRGERO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Production Support Feedback",
    "system_prompt": "Answer the user request.\n\nInput:\n{input}",
    "dataset_type": "PUSH",
    "eval_type": "judge",
    "judge_provider": "OpenRouter",
    "judge_model": "openai/gpt-5.2",
    "auto_limit_size": true,
    "max_samples_to_gather": 1000,
    "daily_event_limit": 5000,
    "monthly_event_limit": 50000,
    "consolidate_every_events": 500,
    "consolidate_every_hours": 24,
    "dedupe": true
  }' | jq
```

Get leaderboard detail
```bash
curl -sS "$API_BASE/api/leaderboards/$LEADERBOARD_ID" \
  -H "Authorization: Bearer $DRGERO_TOKEN" | jq
```

The response includes leaderboard metadata, challenge, candidate models, and recent runs.
Update a leaderboard
```bash
curl -sS -X PATCH "$API_BASE/api/leaderboards/$LEADERBOARD_ID" \
  -H "Authorization: Bearer $DRGERO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "description": "Production support leaderboard",
    "chosen_model_strategy": "ranking_winner"
  }' | jq
```

Updatable fields include:
- name, description, status
- dataset_type, dataset_url, dataset_metadata, dataset_auto_labels
- eval_type, eval_metadata
- category, constraints, model_prompt, system_prompt
- chosen_leaderboard_model_id, chosen_model_strategy
- schedule

Schedule updates may require a paid entitlement.
Delete a leaderboard
```bash
curl -sS -X DELETE "$API_BASE/api/leaderboards/$LEADERBOARD_ID" \
  -H "Authorization: Bearer $DRGERO_TOKEN" | jq
```

Free-plan workspaces may be prevented from deleting leaderboards.
Add a candidate model
```bash
curl -sS -X POST "$API_BASE/api/leaderboards/$LEADERBOARD_ID/models" \
  -H "Authorization: Bearer $DRGERO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "GPT OSS 120B via OpenRouter",
    "platform": "OpenRouter",
    "model_url": "openai/gpt-oss-120b"
  }' | jq
```

Manual candidate fields:
| Field | Required | Notes |
|---|---|---|
| name or model_name | Yes | Display name. |
| platform | No | OpenRouter, Custom, HuggingFace, or Dr.Gero. Defaults to Dr.Gero. |
| model_url or url | Required except Dr.Gero | OpenRouter model ID or endpoint URL. |
| model_id | Required for Dr.Gero | ID of a Dr.Gero model. |
| token | No | Model-specific token if not using workspace integration. Returned redacted. |
| auth_type | No | bearer, x-api-key, x-dr.gero-api-key, authorization, or custom-header. |
| auth_header_name | For custom header | Header name for custom auth. |
Adding models may require a paid entitlement.
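The table makes the required identifier depend on the platform: model_id for Dr.Gero, model_url for everything else. A minimal sketch of that rule as a pre-flight check follows. It covers only these two fields, and missing_candidate_field is a hypothetical helper name rather than part of the API.

```bash
# Hypothetical check mirroring the table above. Prints the name of the
# missing platform-dependent field, or nothing when the combination looks
# complete. Arguments: platform, model_url, model_id (pass "" when unused).
missing_candidate_field() (
  platform="${1:-Dr.Gero}"   # the table says platform defaults to Dr.Gero
  model_url="$2"
  model_id="$3"
  if [ "$platform" = "Dr.Gero" ]; then
    [ -n "$model_id" ] || echo "model_id"
  else
    [ -n "$model_url" ] || echo "model_url"
  fi
)

# Example:
# missing="$(missing_candidate_field OpenRouter "openai/gpt-oss-120b" "")"
# [ -z "$missing" ] || echo "missing field: $missing" >&2
```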
Auto-select candidate models
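Auto-select is a UI-session endpoint: it relies on the signed-in workspace context and the workspace's OpenRouter integration, which is why the request below sends a session access token ($SUPABASE_ACCESS_TOKEN) instead of $DRGERO_TOKEN. It suits browser or admin automation rather than server-to-server API-token automation.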
```bash
curl -sS -X POST "$API_BASE/api/leaderboards/$LEADERBOARD_ID/models/auto-select" \
  -H "Authorization: Bearer $SUPABASE_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "number_of_models": 5,
    "limit_cost": true,
    "input_price_per_m_tokens": 0.5,
    "output_price_per_m_tokens": 1,
    "limit_latency": true,
    "latency_p95_seconds": 1,
    "latency_p99_seconds": 3,
    "only_open_source": false
  }' | jq
```

Body fields accept camelCase aliases such as numberOfModels, limitCost, inputPricePerMTokens, latencyP95Seconds, and onlyOpenSource.
List and remove candidate models
```bash
curl -sS "$API_BASE/api/leaderboards/$LEADERBOARD_ID/models" \
  -H "Authorization: Bearer $DRGERO_TOKEN" | jq

curl -sS -X DELETE "$API_BASE/api/leaderboards/$LEADERBOARD_ID/models/$LEADERBOARD_MODEL_ID" \
  -H "Authorization: Bearer $DRGERO_TOKEN" | jq
```

Run a leaderboard
```bash
curl -sS -X POST "$API_BASE/api/leaderboards/$LEADERBOARD_ID/run" \
  -H "Authorization: Bearer $DRGERO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model_ids": ["LEADERBOARD_MODEL_ID_1", "LEADERBOARD_MODEL_ID_2"],
    "run_source": "manual"
  }' | jq
```

Requires leaderboards:run.
Use /api/leaderboards/{leaderboard_id}/improve-dataset to force a dataset-improvement run.
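When a run is triggered from a script or CI job, note that curl -sS exits with status 0 even if the server answers with an HTTP error. The sketch below adds curl's --fail flag so that a rejected request produces a non-zero exit status. The wrapper name run_leaderboard is hypothetical, and the request body is passed through unchanged.

```bash
# Hypothetical wrapper around the run endpoint shown above.
# --fail makes curl exit with status 22 on HTTP responses >= 400
# (the error body is then not printed).
run_leaderboard() (
  leaderboard_id="$1"
  body="$2"
  curl -sS --fail -X POST "$API_BASE/api/leaderboards/$leaderboard_id/run" \
    -H "Authorization: Bearer $DRGERO_TOKEN" \
    -H "Content-Type: application/json" \
    -d "$body"
)

# Example:
# body='{"model_ids": ["LEADERBOARD_MODEL_ID_1"], "run_source": "manual"}'
# run_leaderboard "$LEADERBOARD_ID" "$body" || echo "run request failed" >&2
```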
Schedule JSON
Schedules are stored on the leaderboard with a JSON structure like:
```json
{
  "version": 2,
  "triggers": {
    "model_version": { "enabled": true, "cadence": "WEEKLY" },
    "new_data": { "enabled": true, "check_cadence": "DAILY", "every_new_events": 500 },
    "cron": { "enabled": true, "preset": "CUSTOM", "expression": "0 6 * * 1" }
  },
  "dataset": {
    "mode": "LIMIT",
    "limit": { "auto": false, "rows": 5000, "algorithm": "LAST_N" }
  }
}
```

Save it with:
```bash
curl -sS -X PATCH "$API_BASE/api/leaderboards/$LEADERBOARD_ID" \
  -H "Authorization: Bearer $DRGERO_TOKEN" \
  -H "Content-Type: application/json" \
  -d @schedule.json | jq
```
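The text does not say whether schedule.json holds the bare schedule structure or nests it under the schedule key listed among the updatable fields. If the PATCH body needs the nested form (an assumption, not confirmed here), a small helper could produce it. Both wrap_schedule and the file names are hypothetical.

```bash
# Hypothetical helper: wrap a bare schedule structure in {"schedule": ...}.
# Assumes the input file already contains valid JSON; it is not validated.
wrap_schedule() {
  printf '{"schedule": '
  cat "$1"
  printf '}\n'
}

# Example (file names are placeholders):
# wrap_schedule schedule-body.json > schedule.json
```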