Compute / instances
Provision and manage GPU instances on Hoonify-routed operators. Use this when you need raw compute — training, fine-tuning, batch inference — rather than the hosted chat completions API.
Create an instance
POST https://api.hoonify.dev/v1/instances
Request body
| Field | Type | Description |
|---|---|---|
| sku | string · required | SKU identifier from the GPU catalog. |
| gpus | integer · required | 1, 2, 4, or 8 — must match a configuration the SKU exposes. |
| pool | string | One of na, eu, or apac. Defaults to the nearest pool with capacity. |
| duration_hours | number | Reservation duration. Omit for on-demand (terminate manually). Hard cap at 168h (7d). |
| image | string | OCI image. Defaults to ghcr.io/hoonify/cuda-base:cu128. |
| ssh_keys | array | OpenSSH public keys to inject into the instance. |
| labels | object | Free-form key/value pairs returned on every read. Useful for cost attribution. |
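The constraints in the table above can be checked locally before a request is sent. The following is a minimal sketch, not part of the Hoonify API: the function name `validate_create_request` is hypothetical, the requirement that `duration_hours` be greater than zero is an assumption (the table only states the 168-hour cap), and the check that a GPU count matches what a given SKU exposes is left out because it needs the GPU catalog.

```python
from typing import Any

ALLOWED_GPU_COUNTS = {1, 2, 4, 8}
ALLOWED_POOLS = ("na", "eu", "apac")
MAX_DURATION_HOURS = 168  # hard cap from the field table (7 days)


def validate_create_request(body: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means these local checks passed."""
    problems: list[str] = []

    sku = body.get("sku")
    if not isinstance(sku, str) or not sku:
        problems.append("sku is required and must be a non-empty string")

    gpus = body.get("gpus")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(gpus, bool) or not isinstance(gpus, int) or gpus not in ALLOWED_GPU_COUNTS:
        problems.append("gpus is required and must be 1, 2, 4, or 8")

    pool = body.get("pool")
    if pool is not None and pool not in ALLOWED_POOLS:
        problems.append("pool must be one of na, eu, apac")

    duration = body.get("duration_hours")
    if duration is not None:
        if isinstance(duration, bool) or not isinstance(duration, (int, float)):
            problems.append("duration_hours must be a number")
        # Assumption: a reservation must be longer than zero hours.
        elif duration <= 0 or duration > MAX_DURATION_HOURS:
            problems.append("duration_hours must be greater than 0 and at most 168")

    ssh_keys = body.get("ssh_keys")
    if ssh_keys is not None and not (
        isinstance(ssh_keys, list) and all(isinstance(k, str) for k in ssh_keys)
    ):
        problems.append("ssh_keys must be an array of strings")

    labels = body.get("labels")
    if labels is not None and not isinstance(labels, dict):
        problems.append("labels must be an object")

    return problems
```

An empty result only means the body is well-formed against the table; the server may still reject it, for example when the SKU does not expose the requested GPU count.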
Example
```json
{
  "sku": "h200-141",
  "gpus": 8,
  "pool": "na",
  "duration_hours": 4,
  "image": "ghcr.io/your-org/training:cu128",
  "ssh_keys": ["ssh-ed25519 AAAAC3..."],
  "labels": { "team": "research", "exp": "scaling-laws-v3" }
}
```
List, retrieve, terminate
```shell
# List your instances (optionally filtered by status)
GET /v1/instances?status=ready,starting
# Inspect a single instance
GET /v1/instances/{id}
# Terminate
POST /v1/instances/{id}/terminate
```
Termination is graceful by default: Hoonify sends SIGTERM and waits up to 60 seconds before sending SIGKILL. Pass force=true to skip the grace period.
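The three calls above could be wrapped in small helpers. This is a sketch using only the Python standard library; it builds the requests without sending them. Several details are assumptions not confirmed by this page: the `Authorization: Bearer` header, passing `force=true` as a query parameter rather than in a request body, and all helper names.

```python
import urllib.parse
import urllib.request

BASE_URL = "https://api.hoonify.dev/v1"


def build_request(method, path, token, query=None):
    """Build (but do not send) a request against the instances API."""
    url = BASE_URL + path
    if query:
        # Keep commas literal so status=ready,starting reads as in the docs.
        url += "?" + urllib.parse.urlencode(query, safe=",")
    # Assumption: bearer-token authentication.
    headers = {"Authorization": f"Bearer {token}"}
    return urllib.request.Request(url, method=method, headers=headers)


def list_instances(token, statuses=None):
    """GET /v1/instances, optionally filtered by a list of statuses."""
    query = {"status": ",".join(statuses)} if statuses else None
    return build_request("GET", "/instances", token, query)


def get_instance(token, instance_id):
    """GET /v1/instances/{id}."""
    safe_id = urllib.parse.quote(instance_id, safe="")
    return build_request("GET", f"/instances/{safe_id}", token)


def terminate_instance(token, instance_id, force=False):
    """POST /v1/instances/{id}/terminate; force=True skips the grace period."""
    safe_id = urllib.parse.quote(instance_id, safe="")
    # Assumption: force is sent as a query parameter.
    query = {"force": "true"} if force else None
    return build_request("POST", f"/instances/{safe_id}/terminate", token, query)
```

To actually send one of these, pass it to `urllib.request.urlopen(req)` and decode the response body as JSON. Keeping construction separate from sending makes the helpers easy to check without network access.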
Lifecycle states
| Status | Meaning |
|---|---|
| provisioning | Operator selected, hardware allocating. Typical: 30–120s. |
| starting | Image pulling, container starting. Typical: 30–90s. |
| ready | SSH and Jupyter endpoints live. Billing started. |
| draining | Stop requested. Existing connections kept; no new ones. |
| terminated | Stopped. Final usage recorded. Endpoints offline. |
| failed | Provisioning or start failed. No charge. See failure_reason. |
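Where webhooks are not an option, a client can poll until the instance reaches ready. The sketch below assumes the instance object carries its state in a `status` field and, on failure, a `failure_reason` field as named in the table; the function name, the 300-second default timeout, and the 5-second interval are illustrative choices, not documented values. The `fetch_instance` callable is supplied by the caller, for example a function that performs GET /v1/instances/{id} and returns the parsed JSON.

```python
import time
from typing import Callable

# States from which an instance will never become ready.
DEAD_END_STATES = {"draining", "terminated", "failed"}


def wait_until_ready(
    fetch_instance: Callable[[], dict],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch_instance() until status is ready; raise on failure or timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        instance = fetch_instance()
        status = instance.get("status")
        if status == "ready":
            return instance
        if status in DEAD_END_STATES:
            reason = instance.get("failure_reason", "no reason given")
            raise RuntimeError(f"instance is {status}: {reason}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"instance still {status} after {timeout_s}s")
        sleep(interval_s)
```

The default timeout leaves some margin over the typical provisioning and starting times listed in the table (up to 120s plus up to 90s). The `sleep` parameter exists so the loop can be exercised in tests without waiting.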
Webhook events
Subscribe to instance.lifecycle via webhooks to avoid polling. Events fire on every status transition with the instance object attached.
Billing
Billing starts at the ready transition and ends when the instance leaves ready or draining. Failed provisions are not charged. Reservations are billed up-front for the full duration_hours; on-demand instances bill per second.
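The billing rules above reduce to a small calculation. The sketch below is illustrative only: `hourly_rate` is a hypothetical price for the whole instance (this page lists no prices), `billable_seconds` stands for the time spent in ready or draining, and no rounding is applied because the page does not say how partial seconds are handled.

```python
from typing import Optional


def estimate_charge(
    hourly_rate: float,
    duration_hours: Optional[float] = None,
    billable_seconds: float = 0.0,
    status: str = "terminated",
) -> float:
    """Estimate the charge for one instance under the rules described above."""
    if status == "failed":
        # Failed provisions are not charged.
        return 0.0
    if duration_hours is not None:
        # Reservation: billed up-front for the full duration, whatever the usage.
        return hourly_rate * duration_hours
    # On-demand: billed per second of billable time.
    return hourly_rate * billable_seconds / 3600.0
```

For example, at a hypothetical rate of 36.0 per hour, a 4-hour reservation comes to 144.0 even if the instance is terminated early, while an on-demand instance that was billable for 1800 seconds comes to 18.0.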
Related: Reservation vs on-demand · Webhooks