API REFERENCE · MODELS

Models

List and inspect models available on Hoonify. The catalog mirrors the inference catalog UI but in machine-readable form.

List models

GEThttps://api.hoonify.dev/v1/models

Returns every model your org can call. Filter with ?pool=eu to scope to a specific pool, or ?family=Llama to filter by family.

Retrieve a model

GEThttps://api.hoonify.dev/v1/models/{model_id}

Same fields as the list endpoint, plus a deprecation notice when scheduled.

Response shape

json

{
  "object": "list",
  "data": [
    {
      "id": "llama-3.3-70b",
      "object": "model",
      "created": 1737936000,
      "owned_by": "meta-llama",
      "family": "Llama",
      "context_window": 131072,
      "quantizations": ["fp16", "fp8"],
      "pools": ["na", "eu", "apac"],
      "modalities": ["text-in", "text-out"],
      "rate_card_id": "rc_llama_33_70b"
    },
    {
      "id": "deepseek-v3",
      "object": "model",
      "created": 1734624000,
      "owned_by": "deepseek-ai",
      "family": "DeepSeek",
      "context_window": 65536,
      "quantizations": ["fp8"],
      "pools": ["na", "apac"],
      "modalities": ["text-in", "text-out"],
      "rate_card_id": "rc_deepseek_v3"
    }
    /* … */
  ]
}

Field reference

Field	Description
id	Stable model identifier. Use this in `model` on completions/embeddings calls.
family	Brand family — `Llama`, `Qwen`, `DeepSeek`, `Mistral`, etc.
context_window	Max input + output tokens, in tokens.
quantizations	Available variants. Pass via the `quantization` field on chat completions.
pools	Pools where the model has live capacity. Pin via `X-Hoonify-Pool`.
modalities	I/O kinds — currently `text-in` + `text-out` for all models. Vision is on the roadmap.
rate_card_id	Pricing identifier. Cross-reference with the rate card.
deprecation	Object with `shutdown_at` (RFC3339) and a `replacement` model ID. Null when not deprecated.

Catalog drift

New models go live without notice; deprecated models get a 90-day shutdown window and a model.deprecation_announced webhook event. Treat the API as the source of truth — don't hardcode a static model list.

Related: Quantization · Pools