RATES
One rate sheet. Set by Hoonify.
Hoonify sets a single rate per SKU — operators don't price-compete on the marketplace. Rates are benchmarked against Vast.AI public medians and the major hyperscalers' on-demand list prices.
Source: Vast.AI public bundles snapshot · 2026-04-27 · vast.ai · Hyperscaler list prices via published rate sheets
GPU compute
Per-GPU/hr on-demand. Bundles of 1× and 8× GPU available where capacity allows.
| GPU | Hoonify rate | Vast.AI median | Hyperscaler list | Savings vs hyperscaler | Capacity |
|---|---|---|---|---|---|
| H100 80GB SXM (Datacenter) · NVIDIA · 80 GB · 989 TFLOPS | $1.39 / GPU / hr | $1.55 | $12.29 AWS p5.48xlarge | ~89% | 724 GPUs |
| H100 80GB PCIe · NVIDIA · 80 GB · 756 TFLOPS | $1.55 / GPU / hr | $1.74 | $12.29 AWS p5.48xlarge | ~87% | 669 GPUs |
| H200 141GB (Datacenter) · NVIDIA · 141 GB · 989 TFLOPS | $2.09 / GPU / hr | $2.32 | $14.50 AWS p5e.48xlarge | ~86% | 647 GPUs |
| A100 80GB SXM · NVIDIA · 80 GB · 312 TFLOPS | $0.66 / GPU / hr | $0.73 | $4.60 AWS p4d.24xlarge | ~86% | 550 GPUs |
| A100 40GB · NVIDIA · 40 GB · 250 TFLOPS | $0.55 / GPU / hr | $0.73 | $3.90 AWS p4d.24xlarge | ~86% | 196 GPUs |
| L40S 48GB · NVIDIA · 48 GB · 362 TFLOPS | $0.50 / GPU / hr | $0.56 | $2.95 AWS g6e.12xlarge | ~83% | 200 GPUs |
| RTX 4090 24GB (Commodity) · NVIDIA · 24 GB · 165 TFLOPS | $0.29 / GPU / hr | $0.32 | n/a | n/a | 67 GPUs |
| MI300X 192GB (Datacenter) · AMD · 192 GB · 1,307 TFLOPS | $1.85 / GPU / hr | n/a | $7.99 Azure ND MI300X | ~77% | 520 GPUs |
| B200 180GB (Reserved only) · NVIDIA · 180 GB · 2,250 TFLOPS | $3.59 / GPU / hr | $4.00 | $16.50 AWS p6e (preview) | ~78% | 0 GPUs |
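
The percentage in the hyperscaler comparison column matches the discount of the Hoonify rate relative to the hyperscaler list price, rounded to the nearest whole percent. The sketch below reproduces that arithmetic and prices a multi-GPU bundle, assuming a bundle is billed at the per-GPU rate times the GPU count. The helper names `savings_pct` and `bundle_cost` are illustrative only and are not part of any Hoonify API.

```python
def savings_pct(hoonify_rate: float, reference_rate: float) -> int:
    """Percent saved versus a reference per-GPU-hour price, to the nearest integer."""
    return round((1 - hoonify_rate / reference_rate) * 100)


def bundle_cost(rate_per_gpu_hr: float, gpus: int, hours: float) -> float:
    """Total cost in dollars, assuming a bundle costs rate x GPU count x hours."""
    return round(rate_per_gpu_hr * gpus * hours, 2)


# H100 80GB SXM: $1.39 Hoonify rate vs $12.29 hyperscaler list price
h100_savings = savings_pct(1.39, 12.29)   # 89, matching the "~89%" in the table

# An 8x H100 SXM bundle running for 24 hours
h100_day = bundle_cost(1.39, gpus=8, hours=24)   # 266.88
```

The same function applied to the Vast.AI median ($1.55) gives roughly 10% for the H100 SXM, which is the gap behind the "cheaper than the marketplace floor" claim.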
Inference endpoints
Per-million-token. Hoonify pools quantized variants — operators pick the precision that hits the Hoonify-set per-token rate.
| Model | $ / 1M in | $ / 1M out | Quant | Replicas |
|---|---|---|---|---|
| Llama 3.1 8B · 8B · 128K ctx | $0.06 | $0.12 | FP8 | 1 |
| Llama 3.3 70B · 70B · 128K ctx | $0.18 | $0.54 | FP8, FP16 | 3 |
| Llama 4 Scout · MoE 109B · 1M ctx | $0.22 | $0.66 | FP16 | 1 |
| Qwen 2.5 72B · 72B · 128K ctx | $0.20 | $0.58 | FP8, FP16 | 2 |
| Qwen 3 32B · 32B · 128K ctx | $0.12 | $0.34 | INT4, FP8 | 2 |
| DeepSeek V3 · MoE 671B · 128K ctx | $0.45 | $1.10 | FP8 | 5 |
| Mistral Large 2 · 123B · 128K ctx | $0.30 | $0.90 | FP8 | 1 |
| Mistral Mixtral 8x22B · MoE 8x22B · 64K ctx | $0.25 | $0.80 | FP16 | 2 |
| Gemma 2 27B · 27B · 32K ctx | $0.10 | $0.30 | FP8, FP16 | 8 |
| Phi 4 · 14B · 16K ctx | $0.07 | $0.18 | FP16, INT4, FP8 | 3 |
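
Per-million-token pricing translates to a request or workload cost by weighting input and output tokens by their separate rates. The following is a minimal sketch of that calculation using three rows from the table. It assumes billing is strictly linear in token count with no minimums or rounding, which the rate sheet does not state. The `RATES` dictionary and `request_cost` function are hypothetical names for illustration.

```python
from decimal import Decimal

# (input $ per 1M tokens, output $ per 1M tokens), copied from the rate table
RATES = {
    "Llama 3.1 8B": (Decimal("0.06"), Decimal("0.12")),
    "Llama 3.3 70B": (Decimal("0.18"), Decimal("0.54")),
    "DeepSeek V3": (Decimal("0.45"), Decimal("1.10")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Dollar cost of a workload, assuming linear per-token billing."""
    rate_in, rate_out = RATES[model]
    return (rate_in * input_tokens + rate_out * output_tokens) / TOKENS_PER_UNIT


# 2M input tokens and 500K output tokens on Llama 3.3 70B:
# 2 x $0.18 + 0.5 x $0.54 = $0.63
cost = request_cost("Llama 3.3 70B", 2_000_000, 500_000)
```

`Decimal` is used instead of `float` so that sub-cent amounts add up exactly when many small requests are summed.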
MARKET POSITION
Cheaper than the marketplace floor. Cheaper than the hyperscaler list.
Real capacity, real prices, real tokens.
Rates refresh weekly against Vast.AI public bundles and the AWS / Azure / GCP on-demand sheets. We pass the savings to you. You don't haggle with operators.
- Snapshot: 2026-04-27
- SKUs priced: 10 GPUs · 10 models
- Rate stability: 14-day notice