Custom model endpoints

Custom models let you add any HTTPS endpoint as a leaderboard candidate.

What Dr.Gero sends

For custom, Hugging Face, and Dr.Gero endpoints, Dr.Gero sends an HTTP POST request with a JSON body built from the rendered leaderboard prompt or messages.

A typical request looks like:

```json
{
  "messages": [
    {"role": "user", "content": "Answer the support question.\n\nInput:\nHow do I reset my password?"}
  ],
  "temperature": 0.2,
  "max_tokens": 300
}
```
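To check an endpoint before adding it to a leaderboard, it can help to replay a request of the same shape yourself. The following is a minimal sketch using only the Python standard library. The function name `build_probe_request`, the example URL, and the `Content-Type: application/json` header are assumptions for illustration; the payload fields mirror the example request above.

```python
import json
import urllib.request


def build_probe_request(url: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST shaped like the example request above."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.2,
        "max_tokens": 300,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        # Assumed header: the docs only state that the request is a JSON POST.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical URL; replace it with your own endpoint.
probe = build_probe_request(
    "https://example.com/v1/chat",
    "Answer the support question.\n\nInput:\nHow do I reset my password?",
)
```

Passing `probe` to `urllib.request.urlopen(probe, timeout=30)` sends the request; add whichever auth header your endpoint expects before doing so.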

Your endpoint should return either an OpenAI-compatible chat-completions response or another JSON or text response from which Dr.Gero can extract the output. An OpenAI-compatible response looks like:

```json
{
  "id": "cmpl_123",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Open Settings, then Security, and choose a new password."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 42, "completion_tokens": 17, "total_tokens": 59},
  "cost_usd": 0.00012
}
```

Including the usage and cost_usd fields helps Dr.Gero display accurate run and token budget metrics.
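One way to assemble a response of this shape on the server side is sketched below. The helper name `build_response`, the generated `id` format, and the flat per-1,000-token price used to compute `cost_usd` are illustrative assumptions, not part of Dr.Gero; substitute your own token counts and pricing.

```python
import uuid


def build_response(answer: str, prompt_tokens: int, completion_tokens: int,
                   usd_per_1k_tokens: float = 0.0) -> dict:
    """Return an OpenAI-compatible chat-completions body with usage and cost."""
    total_tokens = prompt_tokens + completion_tokens
    return {
        "id": f"cmpl_{uuid.uuid4().hex}",  # any unique string works here
        "object": "chat.completion",
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": answer},
                "finish_reason": "stop",
            }
        ],
        "usage": {
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            "total_tokens": total_tokens,
        },
        # Hypothetical pricing model: one flat rate for all tokens.
        "cost_usd": round(total_tokens / 1000 * usd_per_1k_tokens, 6),
    }
```

Serialize the returned dictionary with `json.dumps` and send it as the response body.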

Auth methods

When adding a custom model, choose one of:

| Method | Header sent |
| --- | --- |
| Bearer | Authorization: Bearer <token> |
| X-API-Key | X-API-Key: <token> |
| X-Dr.Gero-API-Key | X-Dr.Gero-API-Key: <token> |
| Authorization raw | Authorization: <token> |
| Custom header | <auth_header_name>: <token> |
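The mapping from method to header can be sketched as a small function, which is handy when reproducing Dr.Gero's requests in a test or checking incoming credentials. The function name `auth_headers` and the lowercase method keys (`"bearer"`, `"authorization-raw"`, and so on) are hypothetical identifiers chosen for this example; only the header names and value formats come from the table above.

```python
from typing import Dict, Optional


def auth_headers(method: str, token: str,
                 custom_header_name: Optional[str] = None) -> Dict[str, str]:
    """Return the single auth header described for the given method."""
    if method == "bearer":
        return {"Authorization": f"Bearer {token}"}
    if method == "x-api-key":
        return {"X-API-Key": token}
    if method == "x-dr.gero-api-key":
        return {"X-Dr.Gero-API-Key": token}
    if method == "authorization-raw":
        # The token is sent as-is, with no "Bearer " prefix.
        return {"Authorization": token}
    if method == "custom":
        if not custom_header_name:
            raise ValueError("custom auth needs a header name")
        return {custom_header_name: token}
    raise ValueError(f"unknown auth method: {method!r}")
```

For example, `auth_headers("custom", "s3cret", "X-Team-Token")` yields a single `X-Team-Token` header carrying the token.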

Reliability expectations

  • Respond with JSON when possible.
  • Return non-2xx status codes for true failures.
  • Keep latency predictable; leaderboard runs multiply latency by dataset rows and model count.
  • Avoid streaming responses; Dr.Gero runtime inference rejects stream: true.
  • Include stable model IDs in metadata when your endpoint fronts multiple models.
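A few of these expectations can be sketched in code. The example below assumes a framework-agnostic handler that returns a status code and a JSON-serializable body; the names `handle_request`, `estimate_run_seconds`, and `my-model-v1`, the choice of a 400 status for rejected requests, and the placement of the model ID in a top-level `model` field are all assumptions for illustration. The latency estimate further assumes that calls run one after another.

```python
import json
from typing import Tuple


def handle_request(raw_body: bytes, model_id: str = "my-model-v1") -> Tuple[int, dict]:
    """Validate a request and return (status_code, json_body)."""
    try:
        request = json.loads(raw_body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return 400, {"error": "body is not valid JSON"}
    if not isinstance(request, dict) or not isinstance(request.get("messages"), list):
        return 400, {"error": "missing 'messages' list"}
    if request.get("stream"):
        # Answer with a clear non-2xx error rather than starting a stream.
        return 400, {"error": "streaming is not supported"}
    answer = "..."  # placeholder: call your model here
    return 200, {
        "model": model_id,  # keep this stable across runs
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": answer},
                "finish_reason": "stop",
            }
        ],
    }


def estimate_run_seconds(rows: int, models: int, latency_s: float) -> float:
    """Rough wall-clock estimate, assuming strictly sequential calls."""
    return rows * models * latency_s
```

Under that sequential assumption, a 500-row dataset across 4 models at 1.5 seconds per call comes to 3,000 seconds, or 50 minutes, which is why steady per-request latency matters.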