Models
List and inspect models available on Hoonify. The catalog mirrors the inference catalog UI but in machine-readable form.
List models
GET
https://api.hoonify.dev/v1/models
Returns every model your org can call. Filter with ?pool=eu to scope to a specific pool, or ?family=Llama to filter by family.
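The query filters above can be composed with the standard library. This is a minimal sketch, not an official client: `list_models_url` is a hypothetical helper name, and it assumes `pool` and `family` may be combined in one request, which this page does not state explicitly.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.hoonify.dev/v1/models"


def list_models_url(pool=None, family=None):
    """Build the list-models URL with optional pool/family filters.

    Hypothetical helper; combining both filters is an assumption.
    """
    params = {}
    if pool is not None:
        params["pool"] = pool
    if family is not None:
        params["family"] = family
    # urlencode percent-escapes values; no "?" is appended when no filter is set.
    query = urlencode(params)
    return f"{BASE_URL}?{query}" if query else BASE_URL


print(list_models_url(pool="eu"))
print(list_models_url(family="Llama"))
```

Authentication is outside the scope of this page, so the sketch only builds the URL; attach whatever credentials your org uses when sending the request.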
Retrieve a model
GET
https://api.hoonify.dev/v1/models/{model_id}
Same fields as the list endpoint, plus a deprecation notice when scheduled.
Response shape
json
{
  "object": "list",
  "data": [
    {
      "id": "llama-3.3-70b",
      "object": "model",
      "created": 1737936000,
      "owned_by": "meta-llama",
      "family": "Llama",
      "context_window": 131072,
      "quantizations": ["fp16", "fp8"],
      "pools": ["na", "eu", "apac"],
      "modalities": ["text-in", "text-out"],
      "rate_card_id": "rc_llama_33_70b"
    },
    {
      "id": "deepseek-v3",
      "object": "model",
      "created": 1734624000,
      "owned_by": "deepseek-ai",
      "family": "DeepSeek",
      "context_window": 65536,
      "quantizations": ["fp8"],
      "pools": ["na", "apac"],
      "modalities": ["text-in", "text-out"],
      "rate_card_id": "rc_deepseek_v3"
    }
    /* … */
  ]
}
Field reference
| Field | Description |
|---|---|
| id | Stable model identifier. Pass it as the model field on completions and embeddings calls. |
| family | Brand family — Llama, Qwen, DeepSeek, Mistral, etc. |
| context_window | Maximum combined input and output length, in tokens. |
| quantizations | Available variants. Pass via the quantization field on chat completions. |
| pools | Pools where the model has live capacity. Pin via X-Hoonify-Pool. |
| modalities | I/O kinds — currently text-in + text-out for all models. Vision is on the roadmap. |
| rate_card_id | Pricing identifier. Cross-reference with the rate card. |
| deprecation | Object with shutdown_at (RFC3339) and a replacement model ID. Null when not deprecated. |
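To illustrate how these fields might be used together, here is a rough sketch that picks model IDs from a list response by pool, context window, and quantization. `pick_models` is a hypothetical helper, and the catalog literal is a trimmed copy of the sample response shown earlier, not live data.

```python
def pick_models(catalog, pool, min_context=0, quantization=None):
    """Return ids of models with capacity in `pool` that meet the other limits.

    Hypothetical helper operating on a parsed list-models response.
    """
    matches = []
    for model in catalog.get("data", []):
        if pool not in model.get("pools", []):
            continue
        if model.get("context_window", 0) < min_context:
            continue
        if quantization is not None and quantization not in model.get("quantizations", []):
            continue
        matches.append(model["id"])
    return matches


# Trimmed copy of the sample response from this page.
catalog = {
    "object": "list",
    "data": [
        {"id": "llama-3.3-70b", "context_window": 131072,
         "quantizations": ["fp16", "fp8"], "pools": ["na", "eu", "apac"]},
        {"id": "deepseek-v3", "context_window": 65536,
         "quantizations": ["fp8"], "pools": ["na", "apac"]},
    ],
}

print(pick_models(catalog, pool="eu"))
print(pick_models(catalog, pool="apac", quantization="fp8"))
```

Filtering client-side like this complements the ?pool= and ?family= query filters; it is useful when the constraint (for example a minimum context window) has no documented server-side filter.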
Catalog drift
New models go live without notice; deprecated models get a 90-day shutdown window and a model.deprecation_announced webhook event. Treat the API as the source of truth: don't hardcode a static model list.
Related: Quantization · Pools
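One way to act on the deprecation notice is to check it whenever a model record is fetched. The sketch below makes several assumptions: the key for the replacement model ID is taken to be `replacement` (this page names `shutdown_at` but not the other key), both helper names are hypothetical, and the sample record uses made-up IDs purely for illustration.

```python
from datetime import datetime, timezone


def days_until_shutdown(model, now=None):
    """Return whole days until shutdown, or None if the model is not deprecated."""
    deprecation = model.get("deprecation")
    if not deprecation:
        return None
    # Older Python versions reject a trailing "Z" in fromisoformat(), so normalize it.
    shutdown_at = datetime.fromisoformat(deprecation["shutdown_at"].replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return (shutdown_at - now).days


def resolve_model_id(model):
    """Prefer the replacement id when a deprecation notice names one.

    Assumes the replacement id lives under the key "replacement".
    """
    deprecation = model.get("deprecation")
    if deprecation and deprecation.get("replacement"):
        return deprecation["replacement"]
    return model["id"]


# Illustrative record only; ids and dates are invented for this example.
sample = {
    "id": "example-old-model",
    "deprecation": {
        "shutdown_at": "2025-03-02T00:00:00Z",
        "replacement": "example-new-model",
    },
}

print(days_until_shutdown(sample, now=datetime(2025, 1, 1, tzinfo=timezone.utc)))
print(resolve_model_id(sample))
```

Switching to the replacement as soon as a notice appears is only one possible policy; a team might instead log a warning and migrate at a time of its choosing within the shutdown window.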