FOR OPERATORS // OPERATOR STACK

The full software stack for NeoCloud operators.

Hoonify gives you everything from PXE boot to the customer portal, so your team can focus on capacity and power — not Kubernetes operators and tax filing.

TurbOS·Hoonify Kubernetes·Inference scheduler·Howl portal

THE STACK

Six layers. One control plane.

You bring the racks. We bring the rest.

Every layer is opinionated, opt-out, and works on day one. No integration tax, no homegrown billing pipeline, no second Kubernetes cluster you wish you hadn't built.

  • Per-second metering across compute and tokens
  • SOC 2 Type II + ISO 27001 inherited from the platform
  • Sub-100ms region routing on the demand side
  • Tenant isolation enforced at the kernel and network layer
  • Always-on observability — Grafana, Loki, Tempo on us
Request → Token
  1. Demand Aggregator
    Marketplace · compare · checkout
    Hoonify
  2. Howl portal
    Customer surface · API · usage
    Hoonify
  3. Hoonify scheduler
    Inference batching · GPU sharing
    Hoonify
  4. Hoonify Kubernetes
    Pods · namespaces · CNI · isolation
    Hoonify
  5. TurbOS
    PXE · drivers · BMC · imaging
    Hoonify
  6. Bare metal
    GPUs · NICs · NVMe · power
    Operator
Hoonify-managed (5)Operator-owned (1)

ARCHITECTURE

How a request becomes a token.

Customer hits the marketplace, Howl routes it to your region, the scheduler picks the right GPU, Kubernetes spawns the pod, TurbOS makes sure the metal is healthy. The whole loop closes in single-digit milliseconds for inference, single-digit minutes for compute.

Request lifecycle
Request · ControlTokens · Response
KEEPS METAL HEALTHYTOKENS · RESULTS · USAGE01DEMANDCustomerACME-AI02HOONIFYMarketplaceBROWSE · COMPARE03HOONIFYHowl portalAUTH · API · USAGE04HOONIFYSchedulerROUTE · BATCH · SHARE05HOONIFYKubernetesPOD · ISOLATION06OPERATORGPU nodeBARE METALHOONIFYTurbOSPXE · DRIVERS · BMC

LAYER BY LAYER

What each piece actually does.

BARE-METAL PROVISIONING

TurbOS

//Image bare metal in <90 seconds.

  • PXE boot + BMC orchestration across heterogeneous hardware
  • GPU drivers, CUDA, NCCL, RDMA pre-baked into operator-curated images
  • Hot-swap nodes without downtime; live migration on supported SKUs
  • First-boot to first-token in under 90 seconds

ORCHESTRATION

Hoonify Kubernetes

//K8s, but the GPU bits already work.

  • Multi-tenant namespaces with hardware isolation by default
  • GPU operator + topology-aware scheduling out of the box
  • Network policies for tenant isolation; CNI tuned for east-west bandwidth
  • Roll back faulty kernels without touching customer workloads

INFERENCE SCHEDULER

Hoonify Scheduler

//More tokens per watt.

  • Continuous batching + paged-attention serving (vLLM, SGLang, TGI)
  • Cross-tenant GPU sharing with MIG / MPS / time-slicing where supported
  • Auto-quantization pipeline (FP16 → FP8 → INT4) per latency tier
  • Spot reclamation in seconds — workloads checkpoint and resume

METERING & BILLING

Per-second billing

//Charge for actual GPU-seconds, not invoices.

  • Per-second metering on compute, storage, network, and tokens
  • Credits, invoiced postpaid, prepaid contracts — all in one ledger
  • Tax + compliance for US, EU, UK, JP, SG (handled, not your problem)
  • Daily payouts to operator wallet; weekly USD wire on request

CUSTOMER PORTAL

Howl

//Your renters get a real product.

  • Drop-in customer portal — co-branded, custom domain, your colors
  • Live usage, spend, jobs, API keys; no console of your own to maintain
  • OpenAI-compatible API surface for inference; ssh/k8s for compute
  • Status page, audit logs, support inbox — all included

DEMAND

Hoonify Demand Aggregator

//Real customers. Not just a checkout button.

  • Your capacity surfaces in front of every Hoonify customer in the network
  • Comparison page sets a real price floor — no race-to-zero, just clarity
  • Reserved + on-demand + spot routing handled per tenant policy
  • Demand forecasting per region so you can pre-warm capacity

ECONOMICS

Take rate that doesn't apologize for itself.

Flat 12% across both compute and tokens, billed against actual customer revenue — not your list price. No SaaS license, no per-GPU seat fee, no "enterprise tier" gating.

Take rate
12%
flat across compute + tokens
Payout
Daily
auto wire, weekly on request
Onboarding
<14 days
from contract to first listing
Time to first token
<90s
via TurbOS imaging

OPERATOR ECONOMICS · ILLUSTRATIVE

What 1,000 H100s cleared last quarter.

Single mid-size operator on the Hoonify network. Mix of on-demand and 1-month reserved. All numbers are fictional and meant to illustrate the unit economics — your power and capex aren't.

Effective utilization73%
Realized rate$1.62 / GPU-hr
Gross GPU-hours billed1.6M
Quarterly customer revenue$2.59M
Hoonify take rate (12%)−$311K
Net to operator$2.28M

READY TO LIST

Have the racks. Need the renters.

From contract to first listing, under fourteen days.

Tell us what you have and where it is. We'll handle the rest.

  • Hot-rack PoC in a weekend
  • Heterogeneous SKU support — H100, H200, MI300X, B200, 4090
  • Inherited compliance — SOC 2 Type II, ISO 27001
  • Live demand forecasting before you list