Compute / instances

Provision and manage GPU instances on Hoonify-routed operators. Use this API when you need raw compute (training, fine-tuning, batch inference) rather than the hosted chat completions API.

Create an instance

POST https://api.hoonify.dev/v1/instances

Request body

| Field | Type | Description |
| --- | --- | --- |
| sku | string · required | SKU identifier from the GPU catalog. |
| gpus | integer · required | 1, 2, 4, or 8; must match a configuration the SKU exposes. |
| pool | string | na · eu · apac. Defaults to nearest with capacity. |
| duration_hours | number | Reservation duration. Omit for on-demand (terminate manually). Hard cap at 168h (7d). |
| image | string | OCI image. Defaults to ghcr.io/hoonify/cuda-base:cu128. |
| ssh_keys | array | OpenSSH public keys to inject into the instance. |
| labels | object | Free-form key/value pairs returned on every read. Useful for cost attribution. |
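
The field constraints above can be checked client-side before sending a request. The following is a minimal sketch, not part of the Hoonify API: the function name validate_create_body is hypothetical, and the lower bound on duration_hours (greater than zero) is an assumption, since the table only states the 168h cap. It cannot check whether the chosen SKU actually exposes the requested GPU count; that needs the GPU catalog.

```python
from typing import Any

ALLOWED_GPU_COUNTS = {1, 2, 4, 8}
ALLOWED_POOLS = {"na", "eu", "apac"}
MAX_DURATION_HOURS = 168  # hard cap from the table: 168h (7d)


def validate_create_body(body: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the body looks plausible."""
    problems: list[str] = []

    sku = body.get("sku")
    if not isinstance(sku, str) or not sku:
        problems.append("sku is required and must be a non-empty string")

    gpus = body.get("gpus")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(gpus, bool) or not isinstance(gpus, int) or gpus not in ALLOWED_GPU_COUNTS:
        problems.append("gpus is required and must be 1, 2, 4, or 8")

    if "pool" in body:
        pool = body["pool"]
        if not isinstance(pool, str) or pool not in ALLOWED_POOLS:
            problems.append("pool must be one of na, eu, apac")

    if "duration_hours" in body:
        hours = body["duration_hours"]
        is_number = isinstance(hours, (int, float)) and not isinstance(hours, bool)
        # Assumption: a reservation must be longer than zero hours.
        if not is_number or not 0 < hours <= MAX_DURATION_HOURS:
            problems.append("duration_hours must be a number above 0 and at most 168")

    if "ssh_keys" in body:
        keys = body["ssh_keys"]
        if not isinstance(keys, list) or not all(isinstance(k, str) for k in keys):
            problems.append("ssh_keys must be an array of strings")

    if "labels" in body and not isinstance(body["labels"], dict):
        problems.append("labels must be an object")

    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.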

Example

json
{
  "sku": "h200-141",
  "gpus": 8,
  "pool": "na",
  "duration_hours": 4,
  "image": "ghcr.io/your-org/training:cu128",
  "ssh_keys": ["ssh-ed25519 AAAAC3..."],
  "labels": { "team": "research", "exp": "scaling-laws-v3" }
}
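
As an illustration, the example body could be sent with Python's standard library roughly as follows. This is a sketch under assumptions: the helper name build_create_request is hypothetical, and the bearer-token Authorization header is a guess, because this page does not state the authentication scheme. The request is only constructed here, not sent.

```python
import json
import urllib.request

API_BASE = "https://api.hoonify.dev/v1"


def build_create_request(body: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/instances request."""
    return urllib.request.Request(
        url=f"{API_BASE}/instances",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; the scheme is not documented on this page.
            "Authorization": f"Bearer {api_key}",
        },
    )


example_body = {
    "sku": "h200-141",
    "gpus": 8,
    "pool": "na",
    "duration_hours": 4,
    "image": "ghcr.io/your-org/training:cu128",
    "ssh_keys": ["ssh-ed25519 AAAAC3..."],
    "labels": {"team": "research", "exp": "scaling-laws-v3"},
}
request = build_create_request(example_body, api_key="YOUR_API_KEY")
# To actually send it: urllib.request.urlopen(request)
```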

List, retrieve, terminate

shell
# List your instances, filtered by one or more comma-separated statuses
GET /v1/instances?status=ready,starting

# Inspect a single instance
GET /v1/instances/{id}

# Terminate
POST /v1/instances/{id}/terminate

Termination is graceful by default: Hoonify sends SIGTERM and waits up to 60 seconds before sending SIGKILL. Pass force=true to skip the grace period.

Lifecycle states

| Status | Meaning |
| --- | --- |
| provisioning | Operator selected, hardware allocating. Typical: 30–120s. |
| starting | Image pulling, container starting. Typical: 30–90s. |
| ready | SSH and Jupyter endpoints live. Billing started. |
| draining | Stop requested. Existing connections kept; no new ones. |
| terminated | Stopped. Final usage recorded. Endpoints offline. |
| failed | Provisioning or start failed. No charge. See failure_reason. |
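
If you poll instead of using webhooks, the states above suggest a simple wait loop: keep polling while the instance is provisioning or starting, and stop on anything else. The sketch below is illustrative only. The name wait_until_settled, the 300-second default timeout, and the 5-second interval are assumptions, and fetch_status stands in for whatever call you use to read the instance's status.

```python
import time
from typing import Callable

# States that are expected to move forward on their own.
PENDING_STATES = {"provisioning", "starting"}


def wait_until_settled(
    fetch_status: Callable[[], str],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the instance leaves provisioning/starting, then return its status.

    The caller decides what to do with the result: "ready" means the endpoints
    are live; "failed" means no charge was incurred and failure_reason should
    be inspected.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status not in PENDING_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"instance still {status!r} after {timeout_s}s")
        sleep(interval_s)
```

Injecting sleep and clock keeps the loop testable without real waiting.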

Webhook events

Subscribe to instance.lifecycle via webhooks to avoid polling. Events fire on every status transition with the instance object attached.
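
A receiver for these events might look roughly like the sketch below. The payload shape is an assumption: this page says only that the instance object is attached, so the top-level "type" and "instance" keys, the "id" field, and the function name handle_webhook are all hypothetical. Check the Webhooks reference for the actual schema.

```python
import json
from typing import Optional


def handle_webhook(raw_body: bytes) -> Optional[str]:
    """Summarize an instance.lifecycle event, or return None for other events.

    Assumed payload shape (not confirmed by this page):
        {"type": "instance.lifecycle", "instance": {"id": "...", "status": "..."}}
    """
    event = json.loads(raw_body)
    if event.get("type") != "instance.lifecycle":
        return None
    instance = event.get("instance", {})
    return f"{instance.get('id')} -> {instance.get('status')}"
```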

Billing

Billing starts at the ready transition and ends when the instance leaves ready or draining. Failed provisions are not charged. Reservations are billed up-front for the full duration_hours; on-demand instances bill per second.
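
The two billing modes can be compared with a short calculation. This is a sketch under assumptions: the hourly rate is a made-up figure, the function name estimate_charge is hypothetical, and per-second billing is assumed to be the hourly rate prorated over 3600 seconds. This page does not state rates, rounding rules, or whether rates are per GPU or per instance.

```python
from typing import Optional


def estimate_charge(
    hourly_rate: float,
    *,
    duration_hours: Optional[float] = None,
    billable_seconds: float = 0.0,
) -> float:
    """Estimate the charge for one instance.

    Reservation (duration_hours set): billed up-front for the full duration,
    regardless of how long the instance actually ran.
    On-demand (duration_hours omitted): billed per second of billable time,
    meaning the time from the ready transition until billing ends.
    """
    if duration_hours is not None:
        return hourly_rate * duration_hours
    # Assumption: per-second price is the hourly rate divided by 3600.
    return hourly_rate * billable_seconds / 3600


# Hypothetical rate of 36.0 per hour for the whole instance.
reserved = estimate_charge(36.0, duration_hours=4)        # 144.0
on_demand = estimate_charge(36.0, billable_seconds=1800)  # 18.0
```

An instance that ends in failed never reaches ready, so its billable_seconds is zero and the on-demand estimate is 0.0.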

Related: Reservation vs on-demand · Webhooks