FOR OPERATORS // OPERATOR STACK

The full software stack for NeoCloud operators.

Hoonify gives you everything from PXE boot to the customer portal, so your team can focus on capacity and power — not Kubernetes operators and tax filing.

List your capacity Architecture overview

TurbOS·Hoonify Kubernetes·Inference scheduler·Howl portal

THE STACK

Six layers. One control plane.

You bring the racks. We bring the rest.

Every layer is opinionated, opt-out, and works on day one. No integration tax, no homegrown billing pipeline, no second Kubernetes cluster you wish you hadn't built.

Per-second metering across compute and tokens
SOC 2 Type II + ISO 27001 inherited from the platform
Sub-100ms region routing on the demand side
Tenant isolation enforced at the kernel and network layer
Always-on observability — Grafana, Loki, Tempo on us

Request → Token

Demand Aggregator
Marketplace · compare · checkout
Hoonify
Howl portal
Customer surface · API · usage
Hoonify
Hoonify scheduler
Inference batching · GPU sharing
Hoonify
Hoonify Kubernetes
Pods · namespaces · CNI · isolation
Hoonify
TurbOS
PXE · drivers · BMC · imaging
Hoonify
Bare metal
GPUs · NICs · NVMe · power
Operator

Hoonify-managed (5)Operator-owned (1)

ARCHITECTURE

How a request becomes a token.

Customer hits the marketplace, Howl routes it to your region, the scheduler picks the right GPU, Kubernetes spawns the pod, TurbOS makes sure the metal is healthy. The whole loop closes in single-digit milliseconds for inference, single-digit minutes for compute.

Request lifecycle

Request · ControlTokens · Response

LAYER BY LAYER

What each piece actually does.

BARE-METAL PROVISIONING

TurbOS

//Image bare metal in <90 seconds.

PXE boot + BMC orchestration across heterogeneous hardware
GPU drivers, CUDA, NCCL, RDMA pre-baked into operator-curated images
Hot-swap nodes without downtime; live migration on supported SKUs
First-boot to first-token in under 90 seconds

ORCHESTRATION

Hoonify Kubernetes

//K8s, but the GPU bits already work.

Multi-tenant namespaces with hardware isolation by default
GPU operator + topology-aware scheduling out of the box
Network policies for tenant isolation; CNI tuned for east-west bandwidth
Roll back faulty kernels without touching customer workloads

INFERENCE SCHEDULER

Hoonify Scheduler

//More tokens per watt.

Continuous batching + paged-attention serving (vLLM, SGLang, TGI)
Cross-tenant GPU sharing with MIG / MPS / time-slicing where supported
Auto-quantization pipeline (FP16 → FP8 → INT4) per latency tier
Spot reclamation in seconds — workloads checkpoint and resume

METERING & BILLING

Per-second billing

//Charge for actual GPU-seconds, not invoices.

Per-second metering on compute, storage, network, and tokens
Credits, invoiced postpaid, prepaid contracts — all in one ledger
Tax + compliance for US, EU, UK, JP, SG (handled, not your problem)
Daily payouts to operator wallet; weekly USD wire on request

CUSTOMER PORTAL

Howl

//Your renters get a real product.

Drop-in customer portal — co-branded, custom domain, your colors
Live usage, spend, jobs, API keys; no console of your own to maintain
OpenAI-compatible API surface for inference; ssh/k8s for compute
Status page, audit logs, support inbox — all included

DEMAND

Hoonify Demand Aggregator

//Real customers. Not just a checkout button.

Your capacity surfaces in front of every Hoonify customer in the network
Comparison page sets a real price floor — no race-to-zero, just clarity
Reserved + on-demand + spot routing handled per tenant policy
Demand forecasting per region so you can pre-warm capacity

ECONOMICS

Take rate that doesn't apologize for itself.

Flat 12% across both compute and tokens, billed against actual customer revenue — not your list price. No SaaS license, no per-GPU seat fee, no "enterprise tier" gating.

Take rate

12%

flat across compute + tokens

Payout

Daily

auto wire, weekly on request

Onboarding

<14 days

from contract to first listing

Time to first token

<90s

via TurbOS imaging

OPERATOR ECONOMICS · ILLUSTRATIVE

What 1,000 H100s cleared last quarter.

Single mid-size operator on the Hoonify network. Mix of on-demand and 1-month reserved. All numbers are fictional and meant to illustrate the unit economics — your power and capex aren't.

Effective utilization73%

Realized rate$1.62 / GPU-hr

Gross GPU-hours billed1.6M

Quarterly customer revenue$2.59M

Hoonify take rate (12%)−$311K

Net to operator$2.28M

READY TO LIST

Have the racks. Need the renters.

From contract to first listing, under fourteen days.

Tell us what you have and where it is. We'll handle the rest.

List your capacity See what customers see

Hot-rack PoC in a weekend
Heterogeneous SKU support — H100, H200, MI300X, B200, 4090
Inherited compliance — SOC 2 Type II, ISO 27001
Live demand forecasting before you list