API REFERENCE · MODELS

Models

List and inspect models available on Hoonify. The catalog mirrors the inference catalog UI but in machine-readable form.

List models

GEThttps://api.hoonify.dev/v1/models

Returns every model your org can call. Filter with ?pool=eu to scope to a specific pool, or ?family=Llama to filter by family.

Retrieve a model

GEThttps://api.hoonify.dev/v1/models/{model_id}

Same fields as the list endpoint, plus a deprecation notice when scheduled.

Response shape

json
{
  "object": "list",
  "data": [
    {
      "id": "llama-3.3-70b",
      "object": "model",
      "created": 1737936000,
      "owned_by": "meta-llama",
      "family": "Llama",
      "context_window": 131072,
      "quantizations": ["fp16", "fp8"],
      "pools": ["na", "eu", "apac"],
      "modalities": ["text-in", "text-out"],
      "rate_card_id": "rc_llama_33_70b"
    },
    {
      "id": "deepseek-v3",
      "object": "model",
      "created": 1734624000,
      "owned_by": "deepseek-ai",
      "family": "DeepSeek",
      "context_window": 65536,
      "quantizations": ["fp8"],
      "pools": ["na", "apac"],
      "modalities": ["text-in", "text-out"],
      "rate_card_id": "rc_deepseek_v3"
    }
    /* … */
  ]
}

Field reference

FieldDescription
idStable model identifier. Use this in model on completions/embeddings calls.
familyBrand family — Llama, Qwen, DeepSeek, Mistral, etc.
context_windowMax input + output tokens, in tokens.
quantizationsAvailable variants. Pass via the quantization field on chat completions.
poolsPools where the model has live capacity. Pin via X-Hoonify-Pool.
modalitiesI/O kinds — currently text-in + text-out for all models. Vision is on the roadmap.
rate_card_idPricing identifier. Cross-reference with the rate card.
deprecationObject with shutdown_at (RFC3339) and a replacement model ID. Null when not deprecated.

Catalog drift

New models go live without notice; deprecated models get a 90-day shutdown window and a model.deprecation_announced webhook event. Treat the API as the source of truth — don't hardcode a static model list.

Related: Quantization · Pools