InferenceUse casesAboutPricingBlogDocs
Sign in

Get started

  • Overview
  • Quickstart
  • Authentication
  • Rate limits

API reference

  • Chat completions
  • Embeddings
  • Models
  • Compute / instances
  • Webhooks

Concepts

  • Pools
  • Rate card
  • Quantization
  • Reservation vs on-demand

SDKs

  • Python
  • TypeScript
  • OpenAI compatibility

Other

  • Changelog
  • Status
  • Migration from Vast.AI

DOCS

//Build against Hoonify in minutes.

Hoonify exposes an OpenAI-compatible API. If you can call openai.chat.completions.create(), you can use Hoonify — just swap the base URL and key.

Demo · search not wired

Quickstart

Make your first chat completion in two minutes. Includes curl, Python, and TypeScript.

Read

Chat completions API

OpenAI-compatible endpoint, with extensions for quantization and pool routing.

Read

Pools

How Hoonify routes your traffic to the right operator, transparently.

Read

OpenAI compatibility

Drop-in replacement for the OpenAI Python and TS SDKs. One base URL, same shapes.

Read

Authentication

API key formats, scopes, rotation policy, and how to revoke without downtime.

Read

Migrating from Vast.AI

Moving an existing workload over: account setup, key migration, and per-request differences.

Read

ENDPOINT

https://api.hoonify.dev/v1OpenAI-compatible

One base URL routes to the closest pool with capacity. Pin a pool with the X-Hoonify-Pool header (na / eu / apac) when you need data residency.

//Hoonify your hardware.

The full software stack for NeoCloud operators — bare-metal provisioning, Kubernetes, inference scheduling, billing, and the Howl customer portal.

TurbOS

Product

  • Catalog
  • Plans
  • Rates
  • Dashboard
  • Docs

For Operators

  • List your capacity
  • Operator stack
  • TurbOS provisioning
  • Howl portal
  • Inference scheduler
  • Billing

Company

  • About
  • Blog
  • Careers
  • Press
  • Contact
© 2026 Hoonify Labs · Demo prototype//Real capacity, real prices, real tokens.